Stat Mech Proteins - Theoretical and Computational

Download Report

Transcript Stat Mech Proteins - Theoretical and Computational

Statistical Mechanics
of Proteins
Ioan Kosztin
Department of Physics & Astronomy
University of Missouri - Columbia
 Equilibrium and non-equilibrium
properties of proteins
 Free diffusion of proteins
 Coherent motion in proteins:
temperature echoes
 Simulated cooling of proteins
Molecular Modeling
1. Model building
2. Molecular Dynamics Simulation
3. Analysis of the
• model
• results of the simulation
Collection of MD Data
• DCD trajectory file
 coordinates for each atom
 velocities for each atom
• Output file
 global energies
 temperature, pressure, …
Analysis of MD Data
1. Structural properties
2. Equilibrium properties
3. Non-equilibrium properties
Can be studied via both equilibrium and
non-equilibrium MD simulations
Equilibrium (Thermodynamic)
Properties
MD simulation
microscopic information
Phase space trajectory
[r(t),p(t)]
Statistical
Mechanics
macroscopic properties
Ensemble average over
probability density
()
Statistical Ensemble
Collection of large number of replicas (on a
macroscopic level) of the system
Each replica is characterized by the same
macroscopic parameters (e.g., NVT, NPT)
The microscopic state of each replica (at a
given time) is determined by  in phase space
Time vs Ensemble Average
(t) generates an ensemble with
(  d  lim d / t
For t,
t 
Ergodic Hypothesis: Time and Ensemble averages are
equivalent, i.e.,
 A(r, p)t   A()
T
Time average:
1
 A t   dt A[r(t ), p(t )]
T 0
Ensemble average:
 A   d ( ) A( )
Thermodynamic Properties
from MD Simulations
Thermodynamic (equilibrium) averages can be calculated
via time averaging of MD simulation time series
 A 
N
A(t )

N
1
i 1
Thermodynamic
average
i
MD simulation
time series
Finite simulation time
means incomplete sampling!
Common Statistical Ensembles
1. Microcanonical (N,V,E):
NVE ()  [ H ()  E ]
Newton’s eq. of motion
2. Canonical (N,V,T):
NVT ()  exp{[F  H ()] / kBT }
Langevin
dynamics
3. Isothermal-isobaric (N,p,T)
NPT ()  exp{[G  H ()] / kBT } Nose-Hoover
method
Different simulation protocols [(t) (t+t )] sample
different statistical ensembles
Examples of Thermodynamic
Observables
• Energies (kinetic, potential, internal,…)
• Temperature [equipartition theorem]
• Pressure [virial theorem]
Thermodynamic derivatives are related to mean
square fluctuations of thermodynamic quantities
•
•
•
•
Specific heat capacity Cv and CP
Thermal expansion coefficient P
Isothermal compressibility T
Thermal pressure coefficient V
Mean Energies
Total (internal) energy:
E
N
E (t )

N
1
i
i 1
Kinetic energy:
K
N
M

N
1
i 1 j 1
Potential energy:
U EK
p 2j (ti )
2m j
TOTAL
KINETIC
BOND
ANGLE
DIHED
IMPRP
ELECT
VDW
Note: You can conveniently use namdplot to graph the
time evolution of different energy terms (as well as
T, P, V) during simulation
Temperature
From the equipartition theorem
T
 pk H/pk   kBT
2
3Nk B
K
Instantaneous kinetic temperature
T 
2K
namdplot TEMP vs TS …
3NkB
Note: in the NVTP ensemble NN-Nc, with Nc=3
Pressure
 rk H/rk   kBT
From the virial theorem
PV  NkBT + W 
The virial is defined as
W
M
rf

3
1
j 1
with
j
j

w( r )

3
1
i , j i
ij
w(r )  r dv (r ) / dr
Instantaneous pressure function (not unique!)
P  kBT + W / V
pairwise
interaction
Thermodynamic Fluctuations (TF)
 A 
N
A(t )   A 

N
1
i 1
i
Mean Square Fluctuations (MSF)
A2   ( A   A)2    A2    A 2
According to Statistical Mechanics, the
probability distribution of thermodynamic
fluctuations is
 P  V  T  S 

 fluct  exp
2k BT


TF in NVT Ensemble
In MD simulations distinction must be made between properly
defined mechanical quantities (e.g., energy E, kinetic temperature T,
instantaneous pressure P ) and thermodynamic quantities, e.g., T, P, …
For example:
E   H   kBT CV 
But:
P   P   kBT / VT 
2
2
2
2
2
Other useful formulas:  K  
2
3N
2
( k BT ) 2
U   kBT (CV  3NkB / 2)
2
2
U P   kBT ( V  kB )
2
CV  (E / T )V
V  (P / T )V
How to Calculate CV ?
1. From definition
CV  (E / T )V
Perform multiple simulations to determine
E   E
as a function of T,
then calculate the derivative of E(T) with respect to T
2. From the MSF of the total energy E
CV  E  / kBT
2
with
2
E    E    E
2
2
2
Analysis of MD Data
1. Structural properties
2. Equilibrium properties
3. Non-equilibrium properties
Can be studied via both equilibrium
and/or non-equilibrium MD simulations
Non-equilibrium Properties
1. Transport properties
2. Spectral properties
Can be obtained from equilibrium MD simulations
by employing linear response theory
Linear Response Theory
External weak perturbation
Response function R(t)
Vext (t )
Equilibrium
state
Relaxation
(dissipation)
Non-equilibrium
state
Thermal fluctuations
autocorrelation function C(t)
Related
through the
Fluctuation
Dissipation
Theorem
(FDT)
Time Correlation Functions
CAB (t  t ' )   A(t ) B(t ' )   A(t  t ' ) B(0)
since eq is t independent !
A  B crosscorrelation function
autoA B

Correlation time:
c   dt C AA (t ) / C AA (0)
0
Estimates how long the “memory” of the system lasts
In many cases (but not always):
C(t )  C(0) exp(t / c )
Response Function
or generalized susceptibility
External perturbation:Vext (t )   A 
t
Response of the system: A(t )
Response function:
with
  1 / kBT
f ext (t )
  dt' R(t  t ' ) f ext (t ' )
0
R(t )  {A(t ), A}PB    t A(t ) A
Generalized susceptibility:( )
 R ( )   dt e
Rate of energy dissipation/absorption:
Q   A(t )
df
dt
1

i t
R (t )
0
  " () | f 0 |2 , f (t )  Re f 0 e i t
2
Fluctuation-Dissipation Theorem
Relates R(t) and C(t), namely:
" ()  (/ 2) C()
In the static limit (t  ):
C(0)   A2   kBT R(0)
Note: quantum corrections are important when kBT  
1
" ()   tanh( / 2) C()
Diffusion Coefficient

Generic transport coefficient:
   dt  t A(t )  t A(0)
0
Einstein relation:
2 t  [ A(t )  A(0)]2 
Example: self-diffusion coefficient
D
1

dt  v (t ) v (0)

3
0
6Dt   r (t )  r (0)
2
Free Diffusion (Brownian Motion) of Proteins
 in living organisms proteins exist and function
in a viscous environment, subject to
stochastic (random) thermal forces
 the motion of a globular protein in a viscous
aqueous solution is diffusive
 e.g., ubiquitin can be
modeled as a spherical
particle of radius
R~1.6nm and mass
M=6.4kDa=1.1x10-23 kg
2 R ~ 3.2 nm
Free Diffusion of Ubiquitin in Water
 ubiquitin in water is subject to two forces:
• friction (viscous drag) force:
Ff   v
friction (damping) coeff
  6 R
(Stokes law)
viscosity
• stochastic thermal (Langevin) force:
FL   (t )
 (t )  0
often modeled as a “Gaussian white noise”
 (t )  (0)  2 kBT  (t )
kB  1.38 1023 J K (Boltzmann constant ), T  temperature
The Dirac delta function
50

 t   
0
  0.01
  0.03
  0.05
40
for t  0
for t  0
30
20
10
-0.1
0
0.1
In practice, it can be approximated as:
1
 (t )   t   exp  t   , as   0
2
(t) describes =0 correlation time (“white noise”)
stochastic processes
Useful formulas:
t
f (t )   f (t ')  (t  t ') dt '
0
 (at )   (t ) / a
Equation of Motion and Solution
Newton’s 2nd law:
dv
Ma  Ff + FL  M
  v +  (t )
dt
Formal solution (using the variation of const. method):
1 t / t
t '/ 
v(t )  v0 e + e   (t ') e dt '
0
M
 t / t
 t /
x(t )  x0 + v0  e  + e   (t ') 1  e(t 't ) /  dt '
 t /
M

M

0
= velocity relaxation (persistence) time
The motion is stochastic and requires statistical description
formulated in terms of averages & probability distributions
Statistical Averages
 (t )  (0)  2 kBT  (t )
 (t )  0
Exponential relaxation of x and v with characteristic time 
t / 
v(t )  v0e
 0 as t  
x(t )  x0 + v0 1  e  t /   x0 + v0 as t  
1 e  
as



+ 2 Dt  3D + O  e 
v (t )  v(t ) +
2
x (t )  x(t )
2
 x(t ) 
2
2
D
t /
D
t 
t /
x  x
2
2
Diffusion coefficient:
(Einstein relation)
 2 Dt
as t  
D  kBT / 
Typical Numerical Estimates
example: ubiquitin - small globular protein
mass : M  8.6kDa  1.42 1023 kg , size : R 1.6 nm ,
density :   M V  103 kg m3 , temperature : T  310 K
Property
Theory
Simulation
vT  3k BT / M
29.6 ms
?
  Μ 
d  vT 

0.56 ps
?
0.16 A
?
25.4 pN  s m
?
1.6 1010 m 2 s
?
0.9 mPa  s
?
D
k BT


1
3
vT 
   6 R
2
Thermal and Friction Forces
 Friction force:
Ff   vT  4.510 N  4.5 nN
9
 Thermal force:
FT 
2 kBT

2

 vT
3
Ff  4.5 nN
For comparison, the corresponding gravitational force:
Fg  Mg ~1014 nN
Ff ~ FT
Diffusion can be Studied by MD Simulations!
ubiquitin in water
PDB entry:
1UBQ
solvate
total # of atoms: 7051 = 1231 (protein) + 5820 (water)
simulation conditions: NpT ensemble (T=310K, p=1atm),
periodic BC, full electrostatics, time-step 2fs (SHAKE)
simulation output: Cartesian coordinates and velocities of
all atoms saved at each time-step (10,000 frames = 40 ps)
in separate DCD files
How To: vel.dcd —> vel.dat
 namd2 produces velocity trajectory (DCD) file if in the
configuration file contains
velDCDfile
velDCDfreq
vel.dcd
1
;# your file name
;# write vel after 1 time-step
 load vel.dcd into VMD [e.g., mol load psf ubq.psf dcd vel.dcd]
note: run VMD in text mode, using the: -dispdev text option
 select only the protein (ubiquitin) with the VMD command
set ubq [atomselect top “protein”]
 source and run the following tcl procedure:
source v_com_traj.tcl
v_com_traj COM_vel.dat
 the file “COM_vel.dat” contains 4 columns:
time [fs], vx, vy and vz [m/s]
70
12.6188434361 -18.6121653643 -34.7150913537
note: an ASCII data file with the trajectory of the COM
coordinates can be obtained in a similar fashion
the v_COM_traj Tcl procedure
proc v_com_traj {filename {dt 2} {selection "protein"} {first_frame 0}
{frame_step 1} {mol top} args} {
set outfile [open $filename "w"]
set convFact 2035.4
set sel [atomselect $mol $selection frame 0]
set num_frames [molinfo $mol get numframes]
for {set frame $first_frame} {$frame < $num_frames} {incr frame
$frame_step} {
$sel frame $frame
set vcom [vecscale $convFact [measure center $sel weight mass]]
puts $outfile "$frame\t $vcom"
}
close $outfile
}
Goal: calculate D and

by fitting the theoretically calculated center of
mass (COM) velocity autocorrelation function to the
one obtained from the simulation
 theory:
Cvv (t )  v(t ) v(0)  v02 e  t /
k BT D
2
v0 

(equipartition theorem)
M

 simulation: consider only the x-component  vx  v 
replace ensemble average by time average
1 N i
Cvv (t )  Ci 
vn+i vn

N  i n1
t  ti  it , vn  v tn  , N  # of frames in vel.DCD
Velocity Autocorrelation Function
Vx(t) [m/s]
300
300
250
250
Cvv (t ) 200
200
150
150
100
100
50
50
0
40
20
0
20
40
0
0
2
4
6
time [ps]
8
100 400200
400
200
600 300
800 1000
time [ fs ]
10
  0.1 ps
D  kBT  
 vx2   3.3 1011 m2 s 1
Fit
300
250
200
150
100
50
Cvv (t ) 
100
200
D

e  t /
300
time [ fs ]
400
Probability distribution of

p(v)  2 v
2

1/ 2

exp v  v
2
  D exp   v 2  D 
with
v  vx, y, z
vx, y , z
2

Maxwell distribution of v
COM
COM velocity [m/s]
P(v)dv  p(vx ) p (v y ) p (vz )dvx dv y dvz
  D 
2 M 



  k BT 
3/ 2
3/ 2
exp   v 2  D  4 v 2 dv
2

Mv
v 2 exp  
 2k BT

 dv

What have we learned ?
soluble, globular proteins in aqueous solution at
physiological temperature execute free diffusion
(Brownian motion with typical parameter values:
Property
Theory
Simulation
vT  3k BT / M
29.6 ms
31.6 ms
  Μ 
d  vT 

0.56 ps
0.1 ps
0.16 A
0.03 A
25.4 pN  s m
141.6 pN  s m
D
k BT


1
3
vT 
   6 R
2
1.6  1010 m 2 s 0.3  1010 m 2 s
0.9 mPa  s
4.7 mPa  s
How about the motion of parts of
the protein ?
 parts of a protein (e.g., side groups, a group of
amino acids, secondary structure elements,
protein domains, …), besides the viscous,
thermal forces are also subject to a resultant
force from the rest of the protein
 for an effective degree of freedom x (reaction
coordinate) the equation of motion is
mx   x + f ( x) +  (t )
In the harmonic approximation f ( x)  kx
and we have a 1D Brownian oscillator