Transcript *** 1

Empirical energy function
Summary of some points about typical MM force fields
• In principle, all force field parameters for a new molecule need to be derived from scratch, because changing even one functional group can affect many of them.
[Figure: example molecular structures; only the atom labels (H, N, C) survive.]
But this would be impractical: imagine how many different molecules there are.
To get around this difficulty, one approach is to build the molecule we want to study from ‘unit molecular fragments’. This way, we only need to build force fields for these unit molecular fragments. (This is certainly an approximation.) This approach has been widely used for polymeric molecules such as proteins and DNA.
Q: How do we know the force fields are accurate?
A: The force field parameters are chosen/optimized so that the calculated results agree well with experimental results. The better the agreement with experiment and the higher the predictive ability, the better the force field.
• Note that there are many types of carbon atoms, even within the ‘unit molecular fragments’ you choose. For example, a carbon may be singly bonded or doubly bonded, etc. This affects the vdW parameters and many other parameters.
• Usually, all the atomic charges are FIXED. (This is an approximation; consider, for example, the cis and trans conformations of n-butane.)
Then, given a geometry of a studied molecule, the energy of that molecule at that structure can be calculated.
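As a reminder of what such an energy evaluation involves, here is a sketch of one commonly used MM functional form. The slide's own equation did not survive, so this particular form (harmonic bonds and angles, cosine torsions, Lennard-Jones plus fixed-charge Coulomb terms) and all the symbols in it are assumptions chosen for illustration, not necessarily the form used in the lecture.

```latex
E(\mathbf{r}) =
    \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left\{
      4\varepsilon_{ij}\left[
        \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
      - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}
      \right]
    + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}}
    \right\}
```

Here r_0, theta_0, k_b, k_theta, V_n, gamma, epsilon_ij, sigma_ij and the fixed charges q_i are the force field parameters; once they are assigned, the energy depends only on the geometry.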
Besides the typical MM force fields, there are empirical force fields that can handle bond breaking and formation.
Molecular dynamics simulation
(Wikipedia)
For transition-metal molecular systems, the empirical force field functional forms can be
more complex because of the complexity of the d-orbitals. The interaction between metal ions
and ligands can be treated as either bonded or non-bonded.
Well-known force fields for biomolecules: AMBER, CHARMM, OPLS, etc. Note that force
fields are still under development. The transferability of a force field between different
conformations (e.g., cis and trans C-C-C-C; the force fields of amino acids are derived from
chosen conformations) and between molecules (different molecules formed from the same unit
molecular fragments) is not yet 100%. Some people even argue that the force field needs to
be changed during an MD simulation to achieve better results. In addition, polarizable
force fields are under development as well.
Besides describing molecular interactions with empirical functional forms
that do not include electrons explicitly, quantum chemistry calculations can be used to
calculate the molecular energy as well. This approach includes electron orbitals in the
calculation explicitly. Quantum chemistry calculations can be
categorized into two groups: semi-empirical methods and ab initio methods ("ab initio"
means "starting from the very beginning", i.e., from first principles).
Between the empirical MM force field and the quantum chemistry calculation, there is a
combined (compromise) method, named the QM (quantum mechanical)/MM method. This
method allows us to examine the local electronic structure (orbitals) in large molecular
systems.
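One common way to write the QM/MM energy is the additive scheme sketched below. The slides do not say which coupling scheme is meant, so this partition and its symbols are an assumption for illustration only.

```latex
E_{\text{total}} =
    E_{\text{QM}}(\text{QM region})
  + E_{\text{MM}}(\text{MM region})
  + E_{\text{QM/MM}}(\text{coupling between the two regions})
```

The small region of interest (for example an active site) is treated with orbitals, the remainder with the empirical force field, and the coupling term carries the interactions between the two regions.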
Ways to alleviate the sampling problem in MD and MC simulation
• Simulated annealing
• Parallel tempering, also known as replica exchange
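A minimal sketch of simulated annealing on a one-dimensional toy energy function is given below. The function name `simulated_annealing`, the geometric cooling schedule, the step size, and the double-well test function are all assumptions made for illustration; they are not taken from the slides.

```python
import math
import random


def simulated_annealing(energy, x0, t_start=5.0, t_end=0.01,
                        n_steps=5000, step=0.5, seed=0):
    """Metropolis Monte Carlo with a slowly decreasing temperature.

    Returns the lowest-energy point visited and its energy.
    Temperatures are in the same (reduced) units as the energy.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        frac = i / max(n_steps - 1, 1)
        t = t_start * (t_end / t_start) ** frac
        # Trial move: uniform random displacement.
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def double_well(x):
    """Toy energy surface with two minima of unequal depth."""
    return (x * x - 1.0) ** 2 + 0.3 * x


if __name__ == "__main__":
    x_min, e_min = simulated_annealing(double_well, x0=2.0)
    print(x_min, e_min)
```

At high temperature the walker crosses the barrier between the two wells easily; as the temperature drops it settles into one of them, which is how annealing reduces the chance of staying trapped in the starting minimum.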
Replica Exchange MD simulation
• Remove net translational and rotational motion due to truncation error at chosen time intervals.
Simulation step
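Replicas 1 and 2 exchange with probability
P = min{1, exp[-(E2-E1)(1/kT1 - 1/kT2)]}
where E1 and E2 are the potential energies of the replicas at temperatures T1 and T2.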
• Enhances sampling (alleviates the problem of being trapped in a local minimum).
• For a helical peptide of ~20 amino acids, this gives a speed-up of about 10 times compared with conventional MD.
(PCCP)
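The exchange criterion P = min{1, exp[-(E2-E1)(1/kT1 - 1/kT2)]} can be evaluated as in the short sketch below. The function name `exchange_probability` and the choice of kcal/(mol K) for the Boltzmann constant are assumptions for illustration; any consistent unit system works.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def exchange_probability(e1, e2, t1, t2, k_b=K_B):
    """Acceptance probability for swapping replicas 1 and 2.

    e1, e2: potential energies of the replicas currently at t1 and t2.
    Implements P = min{1, exp[-(E2 - E1)(1/kT1 - 1/kT2)]}.
    """
    exponent = -(e2 - e1) * (1.0 / (k_b * t1) - 1.0 / (k_b * t2))
    if exponent >= 0.0:
        return 1.0
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical energies (kcal/mol) and temperatures (K).
    print(exchange_probability(-100.0, -105.0, 300.0, 350.0))
    print(exchange_probability(-100.0, -95.0, 300.0, 350.0))
```

With T1 < T2, the swap is always accepted when the hotter replica has the lower energy (E2 < E1); otherwise it is accepted only with a Boltzmann-like probability, which is what preserves the correct ensemble at each temperature.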
Random Number Generators / Monte Carlo Simulation
• An example of a uniform (pseudo)-random number generator:
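(Fortran 77 version)
      double precision function usran(ir)
c
c     this function returns a pseudo-random value between 0.0 and 1.0
c     and updates the integer seed ir in place:
c     ir = mod(16807*ir, 2**31 - 1)
c     it is based on the imsl routine ggubs.
c
c     double precision version
c
      implicit double precision (a-h,o-z)
      parameter(da=16807.d0,db=2147483647.d0,dc=2147483648.d0)
c     the product da*ir is formed in double precision, which avoids
c     integer overflow; adding 0.5d0 guards the truncation to integer
      ir=abs(mod(da*ir,db)+0.5d0)
      usran=dfloat(ir)/dc
      return
      end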
• Another example of a uniform (pseudo)-random number generator:
(Fortran 77 version; the coding here is not meant to be rigorous.)
      double precision function RANF(DUMMY)
c
c     this function returns pseudo-random values between 0.0 and 1.0
c     from a linear congruential generator:
c     SEED = MOD(SEED*L + C, M)
c     the argument DUMMY is not used.
c
c     double precision version
c
      INTEGER L, C, M, SEED
      PARAMETER (L=1029, C=221591, M=1048576)
c     SEED must keep its value between calls
      SAVE SEED
      DATA SEED /0/
      SEED = MOD(SEED*L+C, M)
      RANF = DBLE(SEED)/M
      RETURN
      END
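Both Fortran routines above are linear congruential generators of the form seed = (a*seed + c) mod m. The Python sketch below reproduces that recurrence so the sequences can be inspected easily. The names `lcg`, `stream`, `MINSTD` and `RANF_PARAMS` are hypothetical; the constants are copied from the Fortran listings. One deliberate simplification: `usran` divides by 2147483648 rather than by its modulus, whereas this sketch always divides by m.

```python
def lcg(seed, a, c, m):
    """Advance a linear congruential generator by one step.

    Returns (new_seed, u) where u = new_seed / m lies in [0, 1).
    """
    new_seed = (a * seed + c) % m
    return new_seed, new_seed / m


def stream(seed, a, c, m, n):
    """Return the next n uniform deviates starting from seed."""
    values = []
    for _ in range(n):
        seed, u = lcg(seed, a, c, m)
        values.append(u)
    return values


# Constants from the first routine (usran): multiplicative generator.
MINSTD = dict(a=16807, c=0, m=2147483647)
# Constants from the second routine (RANF): mixed generator.
RANF_PARAMS = dict(a=1029, c=221591, m=1048576)

if __name__ == "__main__":
    print(stream(1, n=3, **MINSTD))
    print(stream(0, n=3, **RANF_PARAMS))
```

Python integers do not overflow, so the product a*seed can be formed directly; the Fortran version of `usran` needs double precision arithmetic for the same step.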
Protein Structure Basics
References:
1. Branden & Tooze, “Introduction to Protein Structure” (second edition), Garland Publishing, 1999, ISBN: 0815323050. (Available at the Yi Hsien (藝軒) bookstore!)
2. Creighton, “Proteins: Structures and Molecular Properties” (second edition), Freeman and Company, 1993.
There are several levels of protein structure
Primary structure: the sequence of amino acid residues
Secondary structure: the polypeptide backbone conformation
Tertiary structure: the three-dimensional structure of a protein
Quaternary structure: the arrangement of one subunit relative to another in space
Proteins are polypeptide chains
The basic repeating unit along the main chain is NH-CH-C’=O, which is the residue of the common parts of the amino acids after peptide bonds have been formed.
Amino acids can be classified by their R groups (1)
[Figure: two forms of the His side chain, labeled "the left form" and "the right form".]
The right form is usually predominant in model peptides. However, which form is predominant depends on the precise conditions in the local environment. Both forms (the left form and the right form) are found in proteins.
Cysteines can form disulfide bridges
The conformation of the main-chain atoms is therefore determined by the values of the phi (φ) and psi (ψ) angles of each amino acid.
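Both φ and ψ are dihedral (torsion) angles defined by four consecutive backbone atoms: φ by C'(i-1), N, CA, C' and ψ by N, CA, C', N(i+1). The sketch below shows one standard way to compute a dihedral angle from four points. The function name `dihedral` and the test coordinates are hypothetical and serve only as an illustration; they are not real protein coordinates.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees, in (-180, 180], for points p0-p1-p2-p3.

    0 corresponds to the cis (eclipsed) arrangement, 180 to trans.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond p1 -> p2.
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of the outer bonds perpendicular to the central bond.
    s0 = _dot(b0, b1)
    s2 = _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: central bond along z, outer atoms in the xy directions.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For φ one would pass the coordinates of C'(i-1), N, CA, C' of residue i, and for ψ those of N, CA, C', N(i+1); plotting the resulting (φ, ψ) pairs for all residues gives a Ramachandran plot.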
Ramachandran Plot
(a) Results from calculations of sterically allowed regions
Glycine residues can adopt many different conformations
(b) Observed values for all residue types except glycine
(c) Observed values for glycine. Note that the values include combinations of φ and ψ that are not allowed for other amino acids.
Rotamers: Most side chains have one or a few conformations that occur more frequently than the other possible staggered conformations. These are called rotamers. Today, collections of these favored conformations, or rotamer libraries, are a standard tool in computer programs used for modeling protein structures.