Introduction and review of Matlab


Week 1
• Lecture 1: Introduction to cells and their contents. Proteins: polypeptide chains made from 20 amino acids. Paradigm shift in Molecular Biology from soft (descriptive) to hard (predictive) science.
• Lecture 2: Quantum, classical and stochastic description of biomolecules. Classical molecular dynamics as the standard model of molecular biology.
Cells
Cells are the fundamental structural and functional units in organisms.
They are the subject of Molecular Biology, Biochemistry & Biophysics
with significant overlaps among the disciplines.
This course will focus on computational aspects of cellular biophysics.
Two kinds of cells:
• Prokaryotes (single cells, bacteria, e.g. Escherichia coli)
Size: 1 μm (micrometer), thick cell wall, no nucleus.
The first life forms. Simpler molecular structures, hence easier to study.
• Eukaryotes (everything else)
Size: 10 μm, no cell wall (animals), has a nucleus.
Organelles: subcompartments that carry out specific tasks,
e.g. mitochondria produce ATP from metabolism (the energy currency);
chloroplasts produce ATP from sunlight.
Structure of a typical cell
Plasma membrane
Background: salt water
Water (70%) is a highly viscous medium. It also has a very high dielectric constant (ε = 80), which screens charges.
Ions (Na+, K+, Cl-, …)
Typical concentration: 0.15 M
Debye length: 8 Å
Mobile ions completely screen charges beyond a few nm.
⇒ no directed motion is possible in cells beyond a few nm!
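The quoted Debye length can be checked with a short calculation. This is a minimal sketch, assuming a symmetric 1:1 electrolyte (e.g. NaCl) and a temperature of 298 K, neither of which is stated on the slide; the function name `debye_length` is illustrative.

```python
import math

# Physical constants (SI units)
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro's number, 1/mol

def debye_length(conc_molar, eps_r=80.0, temperature=298.0):
    """Debye length in metres for a symmetric 1:1 electrolyte (assumption)."""
    # Number density of each ion species: mol/L -> mol/m^3 -> ions/m^3
    n = conc_molar * 1000.0 * N_A
    return math.sqrt(eps_r * EPS0 * KB * temperature / (2.0 * n * E_CHARGE ** 2))

lambda_d = debye_length(0.15)
print(f"Debye length at 0.15 M: {lambda_d * 1e10:.1f} Å")
```

With ε = 80 and 0.15 M the result comes out close to the 8 Å given above; lowering the salt concentration lengthens the screening distance as 1/√c.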
Organic molecules
Hydrocarbon chains
(hydrophobic)
Double bonds
Functional groups in organic molecules
Polar groups are hydrophilic. When attached to hydrocarbons,
they modify their behaviour.
Four classes of macromolecules in cells
• Sugars (polysaccharides)
Functions: provide energy, form the scaffold of DNA and RNA structure, confer stability on proteins after translation.
• Lipids (triglycerides)
Functions: long-term energy storage; the lipid bilayer forms the cell membrane.
• Proteins (polypeptides)
Functions: main workhorses in cells, perform all the mechanical and chemical operations and signal transduction, and also provide structural elements.
• Nucleic acids (DNA, RNA)
Functions: carry the genetic code; the double helix splits to replicate; contain the blueprint for all proteins.
Simple sugars (monosaccharides): e.g. ribose (C5H10O5)
Six-carbon sugars
Glucose is a product of photosynthesis.
Glucose and fructose have the same formula (C6H12O6) but different structures.
Disaccharides are formed when two monosaccharides are chemically
bonded together.
Lipids (fatty acids)
Saturated fatty acids
Unsaturated fatty acid (C=C bonds)
Phospholipids (lipids with a phosphate head group)
Phosphatide:
At neutral pH (7), the oxygens in the OH groups are deprotonated, leading to a negatively charged membrane.
Phosphatidylcholine (PC):
The most common phospholipid has a choline group attached:
…PO4-CH2-CH2-N+-(CH3)3
Proteins (polypeptide chains folded into functional forms)
The building blocks of proteins are the 20 amino acids.
In water:
pH < 2: NH3+ - C - COOH
2 < pH < 10: NH3+ - C - COO-
pH > 10: NH2 - C - COO-
Gas phase: [figure]
At normal pH, amino acids are neutral overall but carry + and - charges on the amino and carboxyl groups, respectively (zwitterions).
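The three charge states in water (positive below pH 2, zwitterion between pH 2 and 10, negative above pH 10) can be illustrated with the Henderson-Hasselbalch relation. This is a rough sketch: the pKa values of 2 and 10 are assumptions read off the pH boundaries on the slide, side chains are ignored, and `net_charge` is a hypothetical helper name.

```python
def net_charge(ph, pka_carboxyl=2.0, pka_amino=10.0):
    """Average net charge of an amino acid backbone (no side chain) at a given pH."""
    # Fraction of amino groups that carry a proton (NH3+)
    frac_amino_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka_amino))
    # Fraction of carboxyl groups that have lost a proton (COO-)
    frac_carboxyl_deprotonated = 1.0 / (1.0 + 10.0 ** (pka_carboxyl - ph))
    return frac_amino_protonated - frac_carboxyl_deprotonated

for ph in (1.0, 7.0, 12.0):
    print(f"pH {ph:4.1f}: net charge = {net_charge(ph):+.2f}")
```

At pH 7 both groups are almost fully charged, so the two charges cancel and the molecule is neutral overall, which is the zwitterion state.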
Formation of polypeptides
Gas phase reaction
In water:
NH 3+ - C - COO - + NH 3+ - C - COO  NH 3+ - C - CO - NH - C - COO - + H 2O
Protein structure
α-helix: 3.6 amino acids per turn, r = 2.5 Å, pitch (rise per turn) is 5.4 Å
β-sheet
Nucleic acids are formed from ribose+phosphate+base pairs
The base pairs are A-T and C-G in DNA (A pairs only with T and
C pairs only with G).
In RNA, thymine is replaced by uracil.
Adenosine triphosphate (ATP) has three phosphate groups.
The usual nucleotides have only one phosphate group; for adenine this is adenosine monophosphate (AMP).
Another important variant is adenosine diphosphate (ADP).
B-DNA (B helix)
ROM (Read-Only Memory): contains 1.5 gigabytes of genetic information
Base pairs per turn (3.4 nm): 10
Primary structure of a single strand of DNA
Primary structure of a single strand of RNA
Hydrogen bonds among the base pairs A-T and C-G
Local structure of DNA
Dynamic and flexible structure
Bends, twists and knots
Essential for packing 1 m long DNA in a 1 μm long nucleus
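The DNA numbers quoted above (10 base pairs per 3.4 nm turn, about 1 m of DNA, 1.5 gigabytes of information) are mutually consistent, as the arithmetic below suggests. The genome size of 3 × 10^9 base pairs per copy is an assumption (an approximate figure for the human genome, not given on the slides), and the 1.5 GB figure is reproduced here by counting two copies per cell.

```python
RISE_PER_TURN_NM = 3.4       # from the text: one turn of B-DNA spans 3.4 nm
BASE_PAIRS_PER_TURN = 10     # from the text
BASE_PAIRS_PER_COPY = 3.0e9  # assumption: approximate size of one copy of the human genome
BITS_PER_BASE_PAIR = 2       # four possible bases -> log2(4) = 2 bits

# Contour length of one copy of the genome
rise_per_bp_nm = RISE_PER_TURN_NM / BASE_PAIRS_PER_TURN
length_m = BASE_PAIRS_PER_COPY * rise_per_bp_nm * 1e-9

# Information content of two copies (diploid cell), 8 bits per byte
info_gigabytes = 2 * BASE_PAIRS_PER_COPY * BITS_PER_BASE_PAIR / 8 / 1e9

print(f"Length of one genome copy: {length_m:.2f} m")
print(f"Information in two copies: {info_gigabytes:.1f} GB")
```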
Central dogma
Paradigm shift in Molecular Biology
BBC News, July 3, 2014: 99.6% of drug trials for Alzheimer's disease
during the last decade have failed (i.e., only 1 out of 250 succeeded).
This is not specific to Alzheimer's disease but pervades the whole
pharmaceutical industry. All the low-hanging fruit has been picked, and to
find novel drugs one has to do more than trial-and-error work. The answer is
rational drug design, which combines experimental work with
computational models of drug action.
Biomolecular systems are quite complex and their accurate modelling
requires a great deal of computing power. This has become feasible in
the last decade with the advance of the High Performance Computing
systems based on parallel clusters of PCs, which are more affordable.
Computational work has now become cheaper and less laborious than
performing experiments.
The main barrier in turning Molecular Biology into a hard science like
Physics and Chemistry is convincing people that biomolecular systems
can be accurately described using computational methods.
Chemical accuracy has been achieved in relatively few examples so far
and much more work (both in applications and methodological
development) needs to be done to complete the paradigm shift.
[Figure: the scientific method as a cycle linking experimental data, model/theory, prediction and experimental test]
Quantum, classical and stochastic description
of biomolecular systems
• Quantum mechanics (Schroedinger equation)
The most fundamental approach, but feasible only for a few atoms (~10).
Approximate methods (e.g. density functional theory) allow treatment of
larger systems (~1000 atoms) and dynamic simulations for several picoseconds.
• Classical mechanics (Newton's equation of motion)
Most atoms are heavy enough to justify a classical treatment (except H).
The main problem is finding accurate potential functions (force fields).
MD simulation of over 100,000 atoms for microseconds is now feasible.
• Stochastic mechanics (Langevin equation)
Most biological processes occur in the range of microseconds to seconds.
Thus, to describe such processes, a simpler (coarse-grained)
representation of the atomic system is essential (e.g. Brownian dynamics).
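Many-body Schroedinger equation for a molecular system
$$ \left[ H_n + H_e + U_{ne} \right] \Psi(\mathbf{R}_i, \mathbf{r}_\alpha) = E \, \Psi(\mathbf{R}_i, \mathbf{r}_\alpha) \qquad (1) $$
where:
$$ H_n = -\sum_i \frac{\hbar^2}{2M_i} \nabla_i^2 + \sum_{i<j} \frac{z_i z_j e^2}{|\mathbf{R}_i - \mathbf{R}_j|} \qquad \text{(nuclear Hamiltonian)} $$
$$ H_e = -\sum_\alpha \frac{\hbar^2}{2m} \nabla_\alpha^2 + \sum_{\alpha<\beta} \frac{e^2}{|\mathbf{r}_\alpha - \mathbf{r}_\beta|} \qquad \text{(electronic Hamiltonian)} $$
$$ U_{ne} = -\sum_{i,\alpha} \frac{z_i e^2}{|\mathbf{R}_i - \mathbf{r}_\alpha|} \qquad \text{(electron-nucleus interaction)} $$
Here $m$ and $M_i$ are the masses of the electrons and nuclei,
$\mathbf{r}_\alpha$ and $\mathbf{R}_i$ denote the electronic and nuclear coordinates, and
$\nabla_\alpha$ and $\nabla_i$ denote the respective gradients.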
Separation of the electronic wave function
Nuclei are much heavier and hence move much slower than electrons.
This allows decoupling of their motion from those of electrons.
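Introduce the product wave function:
$$ \Psi(\mathbf{R}_i, \mathbf{r}_\alpha) = \Psi_n(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) $$
Substituting this in the Schroedinger equation gives
$$ \left[ H_n + H_e + U_{ne} \right] \Psi_n(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) = E \, \Psi_n(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) $$
For fixed nuclei, the electronic part gives
$$ \left[ H_e + U_{ne} \right] \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) = E_e(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) \qquad (2) $$
Substitute the electronic part back in the Schroedinger equation:
$$ \left[ H_n + E_e(\mathbf{R}_i) \right] \Psi_n(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) = E \, \Psi_n(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) $$
The Born-Oppenheimer (adiabatic) approximation consists of neglecting
the cross terms arising from $H_n \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha)$
(which are of order m/M), so that the nuclear part becomes
$$ \left[ H_n + E_e(\mathbf{R}_i) \right] \Psi_n(\mathbf{R}_i) = E \, \Psi_n(\mathbf{R}_i) \qquad (3) $$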
Eqs. (2) and (3) need to be solved simultaneously, which is a formidable
problem for most systems. For two nuclei there is only one coordinate
for R (the distance), so it is feasible. But for three nuclei there are already
3 internal coordinates (in general, N nuclei require 3N-6 coordinates, or 3N-5
if the nuclei are collinear), which makes numerical solution very difficult.
Classical approximation for nuclear motion
Nuclei are heavy so their motion can be described classically, that is,
instead of solving the Schroedinger Eq. (3), we solve the corresponding
Newton’s eq. of motion
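$$ M_i \frac{d^2 \mathbf{R}_i}{dt^2} = -\nabla_i U(\mathbf{R}_i), \qquad i = 1, \ldots, N $$
$$ U(\mathbf{R}_i) = \sum_{i<j} \frac{z_i z_j e^2}{|\mathbf{R}_i - \mathbf{R}_j|} + E_e(\mathbf{R}_i) \qquad (4) $$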
At zero temperature, the potential can be minimized with respect to the
nuclear coordinates to find the equilibrium conformation of molecules.
At finite temperature, Eqs. (2) and (4) form the basis of ab initio MD
(which ignores quantum effects in nuclear motion and electronic excitations at finite T).
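Zero-temperature minimization can be sketched for the simplest case of two nuclei, where the potential depends on a single distance. The example below is illustrative only: a Morse potential stands in for the true U(R) of Eq. (4), its parameters are made-up values of roughly molecular size, and plain steepest descent is used in place of the more efficient minimizers found in real packages.

```python
import math

D_E = 4.5    # well depth, eV (illustrative value)
A = 1.9      # width parameter, 1/Å (illustrative value)
R_E = 0.74   # equilibrium distance, Å (illustrative value)

def potential(r):
    """Morse potential, used here as a stand-in for U(R) of a diatomic."""
    return D_E * (1.0 - math.exp(-A * (r - R_E))) ** 2

def force(r):
    """Force along the bond, F = -dU/dr."""
    x = math.exp(-A * (r - R_E))
    return -2.0 * D_E * A * x * (1.0 - x)

def minimize(r, step=0.01, tol=1e-8, max_iter=100000):
    """Steepest descent: follow the force until it (nearly) vanishes."""
    for _ in range(max_iter):
        f = force(r)
        if abs(f) < tol:
            break
        r += step * f
    return r

r_min = minimize(1.5)   # start from a stretched bond
print(f"Equilibrium distance: {r_min:.4f} Å, U = {potential(r_min):.2e} eV")
```

The minimizer recovers the equilibrium distance built into the model potential. For a molecule with many nuclei the same idea applies, with the force on each nucleus taken from the gradient in Eq. (4).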
Methods of solution for the electronic equation
The electronic part of the Schroedinger equation, Eq. (2), has the form
$$ \left[ -\sum_\alpha \frac{\hbar^2}{2m} \nabla_\alpha^2 + \sum_{\alpha<\beta} \frac{e^2}{|\mathbf{r}_\alpha - \mathbf{r}_\beta|} - \sum_{i,\alpha} \frac{z_i e^2}{|\mathbf{R}_i - \mathbf{r}_\alpha|} \right] \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) = E_e(\mathbf{R}_i) \, \Psi_e(\mathbf{R}_i, \mathbf{r}_\alpha) \qquad (5) $$
Two basic methods of solution:
1. Hartree-Fock (HF) based methods: HF is a mean field theory.
One finds the average, self-consistent potential in which the electrons move.
Electron correlations are taken into account using various methods.
2. Density functional theory (DFT): solves for the density of electrons.
Scales better than HF (which is limited to ~10 atoms) and can treat thousands of atoms.
Car-Parrinello MD (DFT+MD) has become popular in recent years.
Classical mechanics
Molecular dynamics (MD) is the most popular method for simulation
studies of biomolecules. It is based on Newton’s equation of motion.
For N interacting atoms, one needs to solve N coupled differential equations:
$$ M_i \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i = \sum_{j \neq i}^{N} \mathbf{F}_{ij}, \qquad i = 1, \ldots, N $$
Force fields are determined from experiments and ab initio methods.
Analytically this is an intractable problem for N>2.
But we can solve it easily on a computer using numerical methods.
Current computers can handle N ~ 10^6 particles, which is large enough
for description of most biomolecules.
Integration time, however, is still a bottleneck (10^6 steps at 1 fs = 1 ns).
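A numerical solution of the coupled equations of motion can be sketched with the velocity Verlet scheme, a common choice in MD codes. The example below is a toy model, not a biomolecular simulation: the particles sit on a 1D chain, the pair forces F_ij are harmonic bonds between nearest neighbours rather than a real force field, and all quantities are in reduced units. The function names are illustrative.

```python
def pair_forces(positions, k=1.0, r0=1.0):
    """Total force on each particle, F_i = sum_j F_ij, for harmonic bonds
    between nearest neighbours along a 1D chain."""
    forces = [0.0] * len(positions)
    for i in range(len(positions) - 1):
        stretch = positions[i + 1] - positions[i] - r0
        f = k * stretch        # force on particle i from particle i+1
        forces[i] += f
        forces[i + 1] -= f     # Newton's third law: F_ji = -F_ij
    return forces

def total_energy(positions, velocities, mass=1.0, k=1.0, r0=1.0):
    kinetic = sum(0.5 * mass * v * v for v in velocities)
    potential = sum(0.5 * k * (positions[i + 1] - positions[i] - r0) ** 2
                    for i in range(len(positions) - 1))
    return kinetic + potential

def velocity_verlet(positions, velocities, dt, n_steps, mass=1.0):
    """Advance the system by n_steps using the velocity Verlet scheme."""
    x = list(positions)
    v = list(velocities)
    f = pair_forces(x)
    for _ in range(n_steps):
        # half-step velocity update, then full-step position update
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = pair_forces(x)
        # second half-step velocity update using the new forces
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
    return x, v

x0 = [0.0, 1.2, 2.0, 3.1, 4.0]   # slightly distorted chain of 5 particles
v0 = [0.0] * 5
e_start = total_energy(x0, v0)
x1, v1 = velocity_verlet(x0, v0, dt=0.01, n_steps=10000)
e_end = total_energy(x1, v1)
print(f"Relative energy drift: {abs(e_end - e_start) / e_start:.2e}")
```

The time step must be small compared with the fastest vibration in the system, which is why atomistic simulations use steps of about 1 fs and why reaching long times is expensive.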
Stochastic mechanics (Brownian dynamics)
In order to deal with the time bottleneck in MD, one has to simplify the
simulation system (coarse graining). This can be achieved by
describing parts of the system as a continuum characterized by dielectric constants.
Examples:
• transport of ions in electrolyte solutions (water → continuum)
• protein folding (water → continuum)
• ion channels (lipid, protein, and water → continuum)
To include the effect of the atoms in the continuum, modify Newton’s
eq. of motion by adding frictional and random forces:
Langevin equation:
$$ m_i \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i - \gamma_i m_i \mathbf{v}_i + \mathbf{R}_i $$
Frictional forces:
Friction dissipates the kinetic energy of a particle, slowing it down.
Consider the simplest case of a free particle in a viscous medium
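$$ m \frac{d^2 \mathbf{r}}{dt^2} = -m \gamma \mathbf{v} \quad \Rightarrow \quad \frac{d\mathbf{v}}{dt} = -\gamma \mathbf{v} $$
Solution with the initial values $\mathbf{v}(0) = \mathbf{v}_0$, $\mathbf{r}(0) = 0$:
$$ \mathbf{v}(t) = \mathbf{v}_0 e^{-\gamma t}, \qquad \mathbf{r}(t) = \frac{\mathbf{v}_0}{\gamma} \left( 1 - e^{-\gamma t} \right) $$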
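In liquids frictional forces are quite large, e.g. in water $1/\gamma \approx 20$ fs.
From
$$ \frac{1}{2} m v^2 = \frac{3}{2} kT \quad \Rightarrow \quad v \approx 500 \text{ m/s} \quad \text{and} \quad \frac{v}{\gamma} \approx 0.1 \text{ \AA} $$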
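The estimates for the thermal speed and the stopping distance of a particle in water can be reproduced as follows. The slide does not say which particle is meant, so a sodium ion (about 23 amu) at 300 K is assumed here; the relaxation time 1/γ = 20 fs is taken from the text.

```python
import math

KB = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053907e-27     # atomic mass unit, kg

mass = 23.0 * AMU        # assumption: a sodium ion, about 23 amu
temperature = 300.0      # assumption: room temperature, K
gamma = 1.0 / 20e-15     # friction coefficient, 1/s (1/gamma = 20 fs, from the text)

# Thermal speed from (1/2) m v^2 = (3/2) kT
v_thermal = math.sqrt(3.0 * KB * temperature / mass)

def velocity(t, v0):
    """v(t) = v0 exp(-gamma t) for a free particle with friction."""
    return v0 * math.exp(-gamma * t)

def position(t, v0):
    """r(t) = (v0 / gamma) (1 - exp(-gamma t)), with r(0) = 0."""
    return (v0 / gamma) * (1.0 - math.exp(-gamma * t))

stopping_distance = v_thermal / gamma    # limit of r(t) for t -> infinity

print(f"Thermal speed: {v_thermal:.0f} m/s")
print(f"Stopping distance: {stopping_distance * 1e10:.2f} Å")
```

The particle loses its initial velocity within a few tens of femtoseconds and travels only a fraction of an ångström, so without random kicks it would effectively not move at all.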
Random forces:
Frictional forces would dissipate the kinetic energy of a particle rapidly.
To maintain the average energy of the particle at 1.5 kT, we need to
kick it with a random force at regular intervals.
This mimics the collision of the particle with the surrounding particles,
which are taken as continuum and hence not explicitly represented.
Properties of random forces:
1. Must have zero mean (white):
$$ \langle R_i \rangle = 0, \qquad i = x, y, z $$
2. Uncorrelated with prior velocities:
$$ \langle v_i(0) R_j(t) \rangle = 0 $$
3. Uncorrelated with prior forces (Markovian assumption):
$$ \langle R_i(0) R_j(t) \rangle = 2 m \gamma kT \, \delta(t) \, \delta_{ij} $$
Fluctuation-dissipation theorem:
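Because the frictional and random forces have the same origin,
they are related:
$$ m \gamma = \frac{1}{2kT} \int_{-\infty}^{\infty} \langle R(0) R(t) \rangle \, dt $$
[Figure: the correlation function $\langle R(0) R(t) \rangle$ plotted against t]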
In liquids the decay time is very short, hence one can approximate
the correlation function with a delta function:
$$ \langle R(0) R(t) \rangle = 2 m \gamma kT \, \delta(t) $$
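Random forces have a Gaussian probability distribution:
$$ w(R_i) = \frac{1}{\sqrt{2\pi \langle R_i^2 \rangle}} \exp\left( -\frac{R_i^2}{2 \langle R_i^2 \rangle} \right), \qquad \langle R_i^2 \rangle = \frac{2 m \gamma kT}{\Delta t} $$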
This follows from the fact that the velocities have a Gaussian distribution:
$$ g(v_i) = N \sqrt{\frac{m}{2\pi kT}} \exp\left( -\frac{m v_i^2}{2kT} \right) $$
In order to preserve this distribution, the random forces must be
distributed likewise.
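The balance between friction and random kicks can be checked numerically. The sketch below integrates the Langevin equation for free particles in one dimension, drawing the random force at each step from a Gaussian with variance 2mγkT/Δt. It makes several simplifying assumptions: reduced units (m = γ = kT = 1), a simple Euler step rather than a production-quality integrator, and no systematic force. The function name is illustrative.

```python
import math
import random

def simulate_free_particles(n_particles=1000, n_steps=800, dt=0.01,
                            gamma=1.0, mass=1.0, kT=1.0, seed=0):
    """Langevin dynamics of free particles in 1D (reduced units).
    Returns <v^2> averaged over particles and over the second half of the run."""
    rng = random.Random(seed)
    # Standard deviation of the random force over one time step:
    # <R^2> = 2 m gamma kT / dt
    sigma_r = math.sqrt(2.0 * mass * gamma * kT / dt)
    v = [0.0] * n_particles   # start at rest: all kinetic energy comes from the kicks
    v2_sum = 0.0
    n_samples = 0
    for step in range(n_steps):
        for i in range(n_particles):
            random_force = rng.gauss(0.0, sigma_r)
            # m dv/dt = -m gamma v + R  (Euler step, no systematic force)
            v[i] += dt * (-gamma * v[i] + random_force / mass)
        if step >= n_steps // 2:
            v2_sum += sum(vi * vi for vi in v)
            n_samples += n_particles
    return v2_sum / n_samples

mean_v2 = simulate_free_particles()
print(f"<v^2> = {mean_v2:.3f} (equipartition predicts kT/m = 1 per dimension)")
```

Starting from rest, the particles are heated by the random force until the friction loss balances the kicks, and the mean square velocity settles near kT/m per dimension, i.e. an average kinetic energy of 1.5 kT in three dimensions.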
The standard model of biomolecules: MD
• MD is necessary because:
1. QM is too slow and can handle only very small systems.
2. Stochastic dynamics eliminates water from the system. But water is
not just a passive spectator in biomolecular processes: it plays an
active and essential role in the dynamics. For example, accurate
calculation of free energies is impossible without an explicit description of
water (except in a few lucky cases where errors cancel out).
• MD is also sufficient because atoms are heavy enough to justify a
classical treatment (except H). The only requirement is that accurate
potential functions must be used, which is not quite satisfied at
present; the polarization interaction is not explicitly included in most force fields.