The Role of Entropy in Biomolecular Modelling

Biomolecular Modelling: Goals, Problems,
Perspectives
1. Goal
simulate/predict processes such as
1. polypeptide folding
2. biomolecular association
3. partitioning between solvents
4. membrane/micelle formation
i.e. thermodynamic equilibria governed by weak (nonbonded) forces
common characteristics:
- degrees of freedom: atomic (solute + solvent)
- equations of motion: classical dynamics (hamiltonian or force field)
- governing theory: statistical mechanics (entropy)
Processes: Thermodynamic Equilibria
Folding: folded/native <-> denatured
Micelle Formation: micelle <-> mixture
Complexation: bound <-> unbound
Partitioning: in membrane, in water, in mixtures
Definition of a model for molecular simulation
Every molecule consists of atoms that are very strongly bound to each other
MOLECULAR MODEL
- Degrees of freedom: atoms are the elementary particles
- Forces or interactions between atoms: Force Field = physico-chemical knowledge
- Methods to generate configurations of atoms: Newton
- Boundary conditions: system, temperature, pressure
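The slide names Newton's equations of motion as the way to generate configurations. A minimal sketch of that idea, assuming a velocity Verlet integrator and a one-dimensional harmonic force (both are illustrative choices that are not taken from the slides; the function and variable names are hypothetical):

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m * d2x/dt2 = force(x) for n_steps steps."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

# Illustrative system (assumption): unit mass on a unit-strength spring.
x_end, v_end = velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 1000)
energy_end = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
```

For this toy system the exact trajectory is x(t) = cos(t), so the result can be compared with cos(10), and the total energy should stay close to its initial value of 0.5.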
Biomolecular Modelling: Goals, Problems,
Perspectives
Four Problems
1. the force field problem
A very small (free) energy differences
B entropic effects
C size problem
2. the search problem
A the search problem alleviated
B the search problem aggravated
3. the ensemble problem
4. the experimental problem
A averaging
B insufficient accuracy
Perspectives
Four Problems
1. The Force Field Problem
A very small (free) energy differences (kBT = 2.5 kJ/mol)
resulting from summation over very many contributions (atoms i):
10^6 – 10^8 terms, each of which must be very accurate
B accounting for entropic effects
not only energy minima are of importance, but the whole range of
x-values with energies ~kBT must be included in the
force field parameter calibration
[Figure: energy E(x) versus coordinate x; one state may have higher energy but lower free energy than another]
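The statement that a state may have higher energy but lower free energy can be illustrated numerically. This is a minimal sketch under stated assumptions: two one-dimensional harmonic wells with made-up force constants and minima, a temperature of 300 K, and a configurational free energy F = -kBT ln ∫ exp(-E(x)/kBT) dx, of which only the difference between the two wells is meaningful. None of the numbers come from the slides.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def free_energy(energy, dx, temperature):
    """Configurational free energy -kBT ln(sum exp(-E/kBT) dx) of one well."""
    kt = KB * temperature
    return -kt * np.log(np.sum(np.exp(-energy / kt)) * dx)

x = np.linspace(-3.0, 7.0, 20001)
dx = x[1] - x[0]

# Hypothetical wells: a narrow deep one and a broad one that lies 3 kJ/mol higher.
narrow = 0.0 + 0.5 * 2000.0 * (x - 0.0) ** 2
broad = 3.0 + 0.5 * 20.0 * (x - 4.0) ** 2

f_narrow = free_energy(narrow, dx, 300.0)
f_broad = free_energy(broad, dx, 300.0)
print(f"narrow well: E_min = {narrow.min():.1f}, F = {f_narrow:.2f} kJ/mol")
print(f"broad well:  E_min = {broad.min():.1f}, F = {f_broad:.2f} kJ/mol")
```

The broad well has the higher energy minimum, yet its larger accessible range of x gives it the lower free energy, which is why a calibration that looks only at energy minima can rank states incorrectly.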
Four Problems
C size problem
the larger the system, the more accurate the individual energy
contributions (from atoms) must be to reach the same overall accuracy
Conclusion: calibrate the force field using thermodynamic data for small molecules
in the condensed phase
keep the force field physical + simple: transferable, computable
Example: GROMOS biomolecular force field
Choice of Model, Force Field, Sampling
3. Scoring Function, Energy Function, Force Field
- continuous or lattice
basis for force field or scoring function:
1. structural data
- large molecules: crystal structures, solution structures of proteins
2. thermodynamic data
- small molecules in condensed phase: heat of vaporization, density, partition coefficients, ε, D, η, etc.
3. theoretical data
- small molecules in gas phase: electrostatic potential and gradient, torsional-angle rotation profiles
Determination of Force Field Parameters
Calibration sets of small molecules
1. Non-polar molecules
2. Polar molecules
3. Ionic molecules
2. Polar Molecules
Calibration set: 28 compounds (Chris Oostenbrink)
ethers, alcohols, esters, ketones, acids, amines, amides, aromatics, sulfides, thiols
methanol, ethanol, 2-propanol, butanol, diethylether
Determination of Force Field Parameters
Calibration set: 28 compounds
ethylamine, 1-butylamine, ethyldiamine, diethylamine
acetone, 2-butanone, 3-pentanone
acetic acid
N-methylacetamide
Determination of Force Field Parameters
Calibration set: 28 compounds
methyl acetate, ethyl acetate, propyl acetate, butyl acetate
ethyl propanoate, ethyl butanoate
ethyl glycol dipropanoate, glycerol tripropanoate
[Structural formulas not reproduced]
Determination of Force Field Parameters
Calibration set: 28 compounds
dimethylsulfide
benzene
ethanethiol
ethylmethylsulfide
toluene
dimethylsulfide
Determination of Force Field Parameters
force field parameter set: 17 parameters
quantities to reproduce:
for all 28 compounds
- heat of vaporisation
- density (liquid)
for analogues of polar amino acid sidechains (14):
- free enthalpy of solvation:
in cyclohexane
in water
Heat of Vaporization for Pure Liquids
[Figure not reproduced; labelled compounds: ethyldiamine, 1-butylamine, ethylamine]
average absolute deviation: 1.9 kJ/mol
Density for Pure Liquids
[Figure not reproduced; labelled compounds: dimethylsulfide, ethanethiol, 2-propanol, acetone]
average absolute deviation: 4.0%
Free Energy of Solvation in Cyclohexane
amino acid analogues (polar)
[Figure not reproduced; labelled analogues: Ser, Cys, Thr, Lys, Met, Asn, His, Glu, Arg, Trp, Asp, Gln, Phe, Tyr]
average absolute deviation: 2.2 kJ/mol (53A5)
Free Energy of Solvation in Water
amino acid analogues (polar)
[Figure not reproduced; two data series: calibrated to ρ, ΔHvap of liquids, and calibrated to ΔGhydration; labelled analogues: Met, Cys, Lys, Glu, Gln, Asn, Phe, Asp, Ser, Thr, His, Trp, Arg, Tyr]
average absolute deviation: 10.3 kJ/mol (53A5)
average absolute deviation: 0.9 kJ/mol (53A6)
CHARMM: 4.4 kJ/mol
AMBER: 5.1 kJ/mol
OPLS: 3.1 kJ/mol
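The quality measure quoted on these slides is the average absolute deviation between calculated and experimental values. A minimal sketch of that measure follows; the function name and the three example values are hypothetical and are not data from the slides.

```python
def average_absolute_deviation(calculated, experimental):
    """Mean of |calculated - experimental| over all compounds."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(c - e) for c, e in zip(calculated, experimental))
    return total / len(calculated)

# Made-up solvation free energies in kJ/mol, for illustration only.
calc = [-20.0, -31.5, -4.0]
expt = [-21.0, -30.0, -5.5]
aad = average_absolute_deviation(calc, expt)
```

Because absolute values are averaged, errors of opposite sign do not cancel, which makes the measure stricter than a mean signed deviation.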
Stability of a 3_14 helix for a dodeca-beta-peptide in methanol
[Structural formula of the dodeca-beta-peptide not reproduced]
[Figure: backbone RMSD from a helical NMR model structure determined for the beta-peptide in methanol; simulations in methanol and in water]
Applications of Molecular Simulation in
(Bio)Chemistry and Physics
1. Types of Systems
- liquids, solutions, electrolytes, polymers
- proteins, DNA, RNA, sugars, other polymers, membranes
- crystals, glasses, zeolites, metals
- …
2. Types of Processes
- melting, adsorption, segregation, complex formation, protein folding, order-disorder transitions
- crystallisation, reactions, protein stabilisation, membrane permeation, membrane formation
- …
3. Types of Properties
- structural, mechanical, dynamical, thermodynamical, electric
- …
objectives
• characterisation of the populated microscopic states of
peptides by molecular dynamics of spontaneous reversible
folding in solution
• investigate the effect of
– thermodynamic conditions
– solvent environment
– amino acid composition, chain length
on the peptide folding behaviour
• characterisation of the unfolded state
types of peptides under investigation
• -peptides in methanol and/or water
• -peptides in water, DMSO or methanol
• aminoxy-peptides in water and chloroform
• carbohydrate containing peptides
[Structural formulas not reproduced; labels include A, B, C, E, F, G and the sequence SYINSDGTWT]
the β-world
• additional backbone carbon atom
→ four main chain torsional angles ω, φ, θ and ψ
• non-degradable peptide mimetics (resistant to several peptidases)
• stable secondary structures, tunable due to side-chain composition
(substitution at α- and/or β-carbon)
– helices
– β-sheet-like conformations, β-hairpins
• soluble in methanol, some also in water
[Structural formulas A and B not reproduced]
R.P. Cheng, S.H. Gellman, W.F. DeGrado, Chem. Rev. 2001, 101, 3219-3232
method
• Molecular Dynamics (MD) simulation using the GROMOS
biomolecular simulation program
• GROMOS 43A1 or 45A3 force field
• NPT using the weak coupling method to hold temperature &
pressure constant
• Periodic boundary conditions
• explicit solvent models
• starting structure: fully extended (unless stated otherwise)
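The weak coupling method mentioned above keeps the temperature near a target value by rescaling velocities a little at every step. A minimal sketch of the velocity scaling factor, assuming the usual weak-coupling form λ = sqrt(1 + (Δt/τ)(T0/T - 1)); the function name, the time step and the coupling time below are illustrative assumptions, not settings taken from the slides.

```python
import math

def weak_coupling_scaling_factor(t_current, t_target, dt, tau):
    """Factor by which all velocities are multiplied in one time step."""
    if t_current <= 0.0 or tau <= 0.0:
        raise ValueError("temperature and coupling time must be positive")
    return math.sqrt(1.0 + (dt / tau) * (t_target / t_current - 1.0))

# Illustrative values: 2 fs time step, 0.1 ps coupling time (times in ps).
too_cold = weak_coupling_scaling_factor(280.0, 300.0, 0.002, 0.1)
too_hot = weak_coupling_scaling_factor(320.0, 300.0, 0.002, 0.1)
on_target = weak_coupling_scaling_factor(300.0, 300.0, 0.002, 0.1)
```

A system that is too cold gets a factor slightly above 1 and one that is too hot a factor slightly below 1, so the temperature relaxes towards the target on the time scale τ instead of being fixed instantaneously.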
Effect of thermodynamic conditions & solvent
environment (pH, viscosity) on folding
equilibrium
• β-heptapeptide 1 adopts a 3_14-helix in methanol (and pyridine)
• MD simulation starting from NMR model structure (in explicit
methanol)
– at five different temperatures: 298 K, 320 K, 340 K, 350 K and 360 K
– at three different pressures: 1 atm, 1000 atm, 2000 atm
– at three different solvent viscosities
– at four different charge states
→ How is the folding/unfolding equilibrium affected?
(reference simulation at the melting temperature (340 K))
[Structural formulas (panels A and B, compound 1) not reproduced]
temperature dependence
[Figure not reproduced; regions labelled unfolded and folded]
folding equilibrium depends on temperature
pressure dependence
[Figure not reproduced; curves labelled 2000 atm, 1000 atm, 1 atm]
folding equilibrium depends on pressure
effect of solvent viscosities
scale masses of the solvent atoms (& adapt simulation time step accordingly)
normal
scaling factor: 0.1 → 1/3 η of MeOH
scaling factor: 0.01 → 1/10 η of MeOH
equilibrium must not and does not depend on solvent viscosity
folding rate does depend on solvent viscosity
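The viscosity fractions quoted for the two mass scaling factors are consistent with a square-root relation. A minimal sketch, assuming that scaling the solvent masses by a factor s rescales the solvent's time scale, and hence its viscosity, by sqrt(s), and that the time step is adapted by the same factor; this relation is an assumption inferred from the numbers on the slide, and the function names are hypothetical.

```python
import math

def viscosity_fraction(mass_scaling):
    """Assumed viscosity relative to the unscaled solvent: sqrt(s)."""
    return math.sqrt(mass_scaling)

def adapted_time_step(dt, mass_scaling):
    """Assumed time step after scaling solvent masses by mass_scaling."""
    return dt * math.sqrt(mass_scaling)

for s in (1.0, 0.1, 0.01):
    print(f"scaling {s}: viscosity x {viscosity_fraction(s):.3f}, "
          f"2 fs step -> {adapted_time_step(2.0, s):.2f} fs")
```

With these assumptions a scaling factor of 0.1 gives about 0.32 (roughly 1/3) and 0.01 gives 0.1 (1/10). Masses enter only the kinetic part of the Hamiltonian, which is why equilibrium populations are expected to be unaffected while rates change.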
effect of charge states (pH)
exp. conditions (acidic): NH3+….COOH
→ 3_14 helix dominant
basic: NH2….COO-
→ 3_14 helix NOT dominant
neutral: NH2….COOH
→ 3_14 helix barely present
all charges set to zero
→ 3_14 helix not present
most stable fold changes with pH
The unfolded state of peptides:
Much smaller than expected ! (?)
Unfolded structures: all different? how different?
3^21 ≈ 10^10 possibilities!!
Folded structures: all the same
100 ns MD: 5x10^7 configurations, 2 fs apart
[Figure: RMSD [nm] (0 to 0.4) versus t [ns] (0 to 200)]
Alice Glättli
W.F. van Gunsteren, R. Bürgi, C. Peter, X. Daura , Angew. Chem. Int. Ed. 2001, 40, 352-355
X. Daura, A.G., P. Gee, C. Peter, W.F. van Gunsteren, Adv. Prot. Chem. 2002, 62, 341-360
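The two counts on this slide can be checked with a few lines of arithmetic. This sketch assumes three states per torsional angle and 21 torsional angles, which is how the 3^21 figure reads; the variable names are illustrative.

```python
# Theoretical conformational space: 3 states for each of 21 torsional angles.
states_per_torsion = 3
n_torsions = 21
n_possible = states_per_torsion ** n_torsions

# Configurations generated by 100 ns of MD with a 2 fs time step.
simulation_length_fs = 100 * 1_000_000   # 100 ns expressed in fs
time_step_fs = 2
n_configurations = simulation_length_fs // time_step_fs

print(n_possible)        # about 1.05e10
print(n_configurations)  # 5e7
```

The simulation therefore generates roughly 200 times fewer configurations than there are theoretical conformers, so reversible folding within 100 ns is only possible if the accessible conformational space is much smaller than the theoretical one.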
A sample of unfolded states
[Structural formulas of systems A–H not reproduced; system F is the peptide SYINSDGTWT]
Exp. conformer (EC): A 3_14-helix; B 3_14-helix; C 12/10-helix; D unknown; E hairpin; F β-hairpin 3:5; G α, 3_10 helices; H 1.8_8-helix

System                                              A      B      C      D      E      F      G      H
Solvent                                             MeOH   MeOH   MeOH   MeOH   MeOH   H2O    DMSO   CHCl3
Residues                                            7      6      6      6      6      10     8      3
Torsional angles                                    21     18     18     18     18     20     16     9
Exp. temperature [K]                                298    298    298    298    298    278    340    298
Sim. temperature [K]                                340    298    340    340    340    353    340    340
Sim. length [ns]                                    200    102    101    100    50     60     150    73
RMSD similarity cut-off [nm]                        0.1    0.08   0.08   0.08   0.08   0.12   0.1    0.07
Sampling of EC                                      yes    yes    yes    -      yes    yes    yes    yes
Ranking of EC in cluster analysis                   1      2      2      -      1      21     13     1
Weight of EC [%]                                    30     15     10     -      16     1      2      19
Estimated free energy of folding (EC) [kJ mol-1]    2      4      6      -      5      15     11     4
Life time of EC [ps]                                463    750    90     -      205    113    116    74
# events of folding to EC                           129    21     105    -      38     3      25     209
Time of folding to EC [ps]                          1723   4133   616    -      694    7509   4000   220
# conformers visited during folding to EC           19     22     9      -      9      23     17     6
Weight of MPC [%]                                   30     18     26     19     16     32     19     19
Estimated free energy of folding (MPC) [kJ mol-1]   2      4      3      4      5      2      4      4
Life time of MPC [ps]                               463    278    230    157    205    2079   367    74
# events of folding to MPC                          129    68     123    131    38     9      81     209
Time of folding to MPC [ps]                         1723   1245   204    473    694    2409   1465   220
# conformers visited during folding to MPC          19     10     3      7      9      13     9      6
Total number of conformers                          360    200    76     286    129    111    179    148
# unfolded conformers (to 99% weight)               234    131    39     208    88     78     108    100
# unfolded conformers (to 75% weight)               28     19     7      36     13     14     23     15
# unfolded conformers (to 50% weight)               6      6      3      9      5      5      9      6
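The estimated free energies of folding listed above can be related to the conformer weights with a two-state estimate. This is a minimal sketch, assuming ΔG_fold = -RT ln(w / (1 - w)) with w the weight of the folded conformer; the slides do not state the formula that was used, so this is an assumption, although it gives values of the same size as those listed.

```python
import math

R = 0.0083144626  # gas constant in kJ/(mol K)

def folding_free_energy(weight_folded, temperature):
    """Two-state estimate of the free energy of folding in kJ/mol."""
    if not 0.0 < weight_folded < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    ratio = weight_folded / (1.0 - weight_folded)
    return -R * temperature * math.log(ratio)

# A folded conformer with 30 % weight in a simulation at 340 K.
dg = folding_free_energy(0.30, 340.0)
print(f"{dg:.1f} kJ/mol")
```

A weight of 30 % at 340 K gives about 2.4 kJ/mol, and a weight of 50 % gives zero, the condition that defines the melting temperature in a two-state picture.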
conclusion
unfolded state of peptides
• The accessible conformational space seems to be much smaller than
the theoretical conformational space
(even at high temperature)
→ key factor for the observed fast folding of these peptides
• The correlation analysis suggests that, as the chain length of the
peptide increases, the gain in kinetic stability outweighs the loss in
folding speed.
→ folding: a more efficient process for longer chains
→ more systematic investigation needed
(larger sample of peptides with increasing chain length)
Four Problems
4. The Experimental Problem
A Any experiment involves averaging over time and space (molecules)
So it determines the average of a distribution, not the distribution itself
However: very different distributions may yield the same average
[Figure: probability P(Q) versus quantity Q for distributions with the same (linear) average <Q>]
Example:
circular dichroism (CD) spectra of β-peptides
NOE’s + J-values of peptides in crystal and in solution
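The point that very different distributions may yield the same average can be made concrete with two toy ensembles. This is a minimal sketch; the values of the quantity Q are made up for illustration and are not taken from any experiment on the slides.

```python
import numpy as np

# Ensemble 1: every molecule has Q = 0.5 (a single conformation).
single_state = np.full(1000, 0.5)

# Ensemble 2: half of the molecules have Q = 0.2, the other half Q = 0.8.
two_state = np.concatenate([np.full(500, 0.2), np.full(500, 0.8)])

mean_single, mean_two = single_state.mean(), two_state.mean()
spread_single, spread_two = single_state.std(), two_state.std()
print(mean_single, mean_two)      # both averages are 0.5 within rounding
print(spread_single, spread_two)  # the spreads differ: 0.0 versus 0.3
```

A measurement that reports only the linear average <Q> cannot tell these two ensembles apart, although no molecule in the second ensemble actually has the average value.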
Four Problems
NOE’s: are notoriously insensitive to the (atom-atom-distance)
distribution provided a small part satisfies the NOE bounds
J-values: may be sensitive to dihedral angle distribution
X-ray: crystal contains a much narrower distribution than a
(aqueous) solution
Conclusion: Experimental data cannot define a conformational ensemble
B Experimental data have insufficient accuracy for force field calibration
and testing
accuracy of NOE’s, J-values, structure factors, etc. is limited but may
improve with methodological and technical progress
Example: NMR data on beta-hexapeptide, alpha-octapeptide
Conclusion: Experimental data may converge over time towards simulation
results
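The insensitivity of NOE's to the distance distribution follows from how the distances are averaged. A minimal sketch, assuming r^-6 averaging of the atom-atom distance (r^-3 averaging is also in use, depending on the time scales involved); the populations and distances below are made up for illustration.

```python
def noe_effective_distance(distances, weights, power=6):
    """Effective distance <r^-power>^(-1/power) for a weighted ensemble."""
    total_weight = sum(weights)
    average = sum(w * r ** (-power) for r, w in zip(distances, weights)) / total_weight
    return average ** (-1.0 / power)

# Hypothetical ensemble: 10 % of structures at 0.25 nm, 90 % at 0.60 nm.
distances_nm = [0.25, 0.60]
populations = [0.1, 0.9]

r_eff = noe_effective_distance(distances_nm, populations)
r_mean = sum(r * w for r, w in zip(distances_nm, populations))
print(f"effective NOE distance: {r_eff:.3f} nm, arithmetic mean: {r_mean:.3f} nm")
```

The effective distance is about 0.36 nm although 90 % of the structures are at 0.60 nm, so a small short-distance population is enough to satisfy a typical NOE bound.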
Calculation of CD-spectra from
molecular dynamics trajectories
Calculation of circular dichroism (CD) spectra
Peptide A: DM-BHP
• geminal dimethylation inhibits the formation of a 3_14 helix
• no NMR data available
• CD spectrum shows a pattern, which is “typical” for a 3_14 helix
Peptide B: BHP
• can adopt a 3_14 helix, confirmed by NMR experiments, CD spectrum similar
CD pattern:
• positive Cotton effect at ~200 nm
• zero crossing between 205 and 210 nm
• negative Cotton effect between 215 and 220 nm
[Structural formulas of peptides A and B not reproduced]
A. Glättli, X. Daura, D. Seebach and W.F. van Gunsteren J. Am. Chem. Soc. 2002, 124, 12972 – 12978
Mean CD spectra
MD at T = 298 K, 1 atm in 1462/1463 methanol molecules, starting from extended structure
• DM-BHP:
– peak at 197 nm
– weak negative Cotton effect at 223 nm
– zero crossing at 213 nm
• BHP:
– peak at 197 nm
– negative Cotton effect at 221 nm
– zero crossing at 215 nm
Same spectra
Same structure ?
CD spectra per cluster
Similarity criterion: backbone RMSD ≤ 0.09 nm
10000 structures, 10 ps apart
[Figure: CD spectra per cluster for peptides A and B, with the helical structure, cluster 1 and cluster 2 marked; cluster populations shown: 74.6 %, 20.5 %, 18.1 %, 14.5 %, 12.9 %, 4.5 %, 2.6 %, 1.9 %, 6.8 %, 1.7 %, 4.6 %, 1.6 %]
Non-helical conformers exhibit the CD
pattern assigned to the 3_14 helix, the
“helical” conformer doesn’t.
CD spectra of
single structures
[Figure: CD spectra of single structures a–f for peptides A and B]
A certain CD pattern originates
from spatially very different
structures.
Cluster analysis of the combined (100 ns) trajectories
DM-BHP & BHP at 298 K
[Figure: conformers present in both ensembles; cluster = conformation]
→ virtually NO OVERLAP between the conformational ensembles
of both molecules, which have similar CD spectra !!
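A cluster analysis with an RMSD similarity criterion can be sketched as follows. The slides do not say which algorithm was used, so this is an assumption: a neighbour-count scheme in which the structure with the most neighbours within the cut-off becomes a cluster centre, its neighbours are removed, and the procedure repeats. The function name and the toy distance matrix are hypothetical.

```python
import numpy as np

def cluster_by_cutoff(distance, cutoff):
    """Return a list of (centre index, member indices), largest cluster first."""
    distance = np.asarray(distance, dtype=float)
    neighbours = distance <= cutoff
    remaining = np.ones(len(distance), dtype=bool)
    clusters = []
    while remaining.any():
        # Count, for every structure still available, its available neighbours.
        counts = (neighbours & remaining[None, :]).sum(axis=1)
        counts[~remaining] = -1
        centre = int(np.argmax(counts))
        members = np.flatnonzero(neighbours[centre] & remaining)
        clusters.append((centre, members.tolist()))
        remaining[members] = False
    return clusters

# Toy example: five "structures" on a line stand in for pairwise RMSD values.
positions = np.array([0.0, 0.05, 0.1, 1.0, 1.02])
toy_rmsd = np.abs(positions[:, None] - positions[None, :])
clusters = cluster_by_cutoff(toy_rmsd, cutoff=0.09)
```

In a real application the matrix would hold pairwise backbone RMSD values in nm between trajectory frames; clustering the combined trajectories of two molecules then shows directly which clusters contain frames from both.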
A β-hexapeptide
[Structural formula not reproduced]
• β-hexapeptide with hydroxyl groups attached to the α-carbons
• NMR model structure suggests the formation of a 2_8-P-helix
• MD simulation from totally extended conformation at two different
temperatures (298 K & 340 K) using the GROMOS 45A3 force field
• no NOE-distance or J-value restraining
[Figure: bundle of 20 NMR model structures (protection groups not shown)]
Gademann et al., Angew. Chem. Int. Ed., 42 (2003), p. 1534
NOE distance violations & backbone J-values
• at 298 K
2 violations (~0.05 nm)
average deviation from exp. J-values: 0.44 Hz
• at 340 K
1 violation (~0.03 nm)
average deviation from exp. J-values: 0.91 Hz
• NMR bundle
no violation
average deviation from exp. J-values: 0.57 Hz
Occurrence of Hydrogen Bonds [%]
Donor-Acceptor          MD 298 K   MD 340 K   X-PLOR refinement
NH(i)-O(i-2) [HB8]
  NH(3)-O(1)            0          1          20
  NH(4)-O(2)            0          1          25
  NH(5)-O(3)            2          4          10
NH(i)-O(i-3) [HB12]
  NH(3)-O(0)            0          30         0
  NH(4)-O(1)            0          26         0
  NH(5)-O(2)            0          35         0
  NH(6)-O(3)            1          18         0
NH(i)-O(i+1) [HB10]
  NH(2)-O(3)            11         0          0
  NH(5)-O(6)            11         1          0
OH(i)-O(i-1) [HB7]
  OH(6)-O(5)            0          14         0
OH(i)-O(i-2) [HB11]
  OH(4)-O(2)            0          8          10
  OH(5)-O(3)            1          22         0
  OH(6)-O(4)            1          10         0
OH(i)-O(i-3) [HB15]
  OH(4)-O(1)            1          26         0
  OH(5)-O(2)            0          10         0
OH(i)-O(i+2) [HB13]
  OH(3)-O(5)            38         0          0
None of the H-bond patterns supporting the formation of a 2_8-P-helix
were detected in the simulations.
Conformational Analysis of the combined
MD & NMR “ensembles”
[Figure panels: MD at 298 K + NMR bundle; MD at 340 K + NMR bundle]
Another possible secondary structure element:
2.5_12-P-helix
• the 2.5_12-P-helix is populated for ~35 % at 340 K
• its stability at 298 K is to be confirmed (by simulation at 298 K starting from the helix)
Conclusions
1. MD simulation using a “thermodynamic” force field (GROMOS)
(without NMR restraints) reproduces experimental NOE/J-value
data equally well or better than a set of 20 NMR model structures
derived by classical single structure refinement techniques
(XPLOR)
(aspect: force field problem)
2. Single structures may not be representative of the (Boltzmann)
ensemble of structures in solution
(aspect: ensemble problem)
3. Standard (NMR) structure refinement procedures should be
revised in order to avoid the deposition of non-representative
model structures in structure data banks
(aspect: search problem)
4. When comparing models with experimental data, compare primary
(measured) data (NOE’s, 3J-values), not secondary (derived) data
(structures, angles)
(aspect: experimental problem)
Computer-aided Chemistry: ETH Zuerich
Molecular Simulation Package
GROMOS = Groningen Molecular Simulation + GROMOS Force Field
Generally available: http://www.igc.ethz.ch/gromos
Research Topics
• searching conformational space
• force field development
– atomic
– polarization
– long range Coulomb
• techniques to compute free energy
• 3D structure determination
– NMR data
– X-ray data
• quantum MD: reactions
• solvent mixtures, partitioning
• interpretation exp. data
• applications
– proteins, sugar, DNA, RNA, lipids, membranes, polymers
– protein folding, stability
– ligand binding
– enzyme reactions
Acknowledgements
Gruppe informatikgestützte Chemie (igc)
http://www.igc.ethz.ch
Dirk Bakowies (Germany)
Alice Glättli (Switzerland)
Riccardo Baron (Italy)
David Kony (France)
Indira Chandrasekhar (India)
Chris Oostenbrink (Holland)
Markus Christen (Switzerland)
Merijn Schenk (Holland)
Peter Gee (England)
Daniel Trzesniak (Brasil)
Daan Geerke (Holland)
Haibo Yu (China)
Daniela Kalbermatter (Switzerland)
Bojan Zagrovic (Croatia)
Conformational Distribution in Crystal versus
Solution
Solute Molecule
polypeptide: (Aib)6 – Leu – Aib (Aib: achiral; Leu: L-amino-acid)
NMR: NOE’s suggest helical structure in solution
3J-value (H_N-H_C) = 6.9 Hz
R- or L-helix?
X-ray: R-helix is found
crystal structure: satisfies NOE’s but 3J-value = 4.2 Hz
MD Simulations
DMSO solution: NOE’s satisfied, 3J = 6.8 Hz
L- and R-helical fragments are present
crystal: R-helix agrees with X-ray, NOE’s satisfied, 3J = 4.0 Hz
only one conformation present (R-helix)
Conformational Distribution in Crystal versus
Solution
Conformational Distribution in Solution and in Crystal is different
NOE’s: not sensitive (<NOE> same)
3J’s: are sensitive (<3J> different)
[Figure: probability P(x) versus conformation x for crystal and solution]
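The sensitivity of 3J-values to the dihedral angle distribution can be illustrated with a Karplus relation. This is a minimal sketch under stated assumptions: the form 3J = A cos²θ + B cosθ + C with θ = φ - 60°, and the parameter set A = 6.4, B = -1.4, C = 1.9 Hz that is commonly used for the H_N-H_Cα coupling. The slides do not state which relation or parameters were used, and the two-state ensemble below is made up.

```python
import math

def karplus_3j(phi_degrees, a=6.4, b=-1.4, c=1.9):
    """3J(H_N-H_Calpha) in Hz from the backbone angle phi (assumed parameters)."""
    theta = math.radians(phi_degrees - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c

def ensemble_3j(phi_values, weights):
    """Weighted linear average <3J> over an ensemble of phi angles."""
    total_weight = sum(weights)
    return sum(w * karplus_3j(p) for p, w in zip(phi_values, weights)) / total_weight

j_right = karplus_3j(-60.0)                           # phi typical of a right-handed helix
j_left = karplus_3j(60.0)                             # mirror-image phi
j_mixed = ensemble_3j([-60.0, 60.0], [0.5, 0.5])      # hypothetical 50:50 mixture
print(f"{j_right:.1f} Hz, {j_left:.1f} Hz, mixture {j_mixed:.2f} Hz")
```

With these assumed parameters the two helical senses give clearly different couplings (about 4.2 Hz and 6.9 Hz), so the ensemble-averaged 3J shifts as soon as a second population is present, whereas an NOE bound can remain satisfied.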