Protein Folding and Modeling
Carol K. Hall
Chemical and Biomolecular
Engineering
North Carolina State University
Computational Methods for Modeling Protein
Folding and Structure
1. Homology Modeling
Assumes that proteins with similar sequences have
similar structures; relies on sequence alignments
2. Threading
“Threads” sequence of unknown structure through
database of known structures and scores match
based on contact potentials
3. Ab initio or de novo approaches
Deduce 3-d structure for given sequence by finding
minimum energy based on force field
Types of Computer Simulations
1. Molecular Dynamics
a. Decide on model intermolecular forces
b. Distribute 500-100,000 molecules in simulation cell
assigning random positions and velocities to each molecule
c. Monitor each molecule’s motion as a function of time by solving
Newton’s equation of motion (F = m*a) at each time step to
predict its new position and velocity
d. Take time averages of properties of interest
2. Monte Carlo
a. Decide on model intermolecular forces
b. Distribute 500-10,000 molecules at random locations in cell
c. Generate configurations of these molecules randomly (in
proportion to their probability of occurring)
d. Take averages over all configurations generated to
calculate properties of interest
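Steps c and d of the molecular dynamics recipe can be sketched as follows. This is a minimal illustration, assuming a single particle in a 1-D harmonic well rather than hundreds of interacting molecules. The velocity Verlet integrator, the spring constant K_SPRING, and all function names are illustrative choices that do not come from the slides.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation F = m*a with the velocity Verlet scheme."""
    trajectory = []
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # predict new position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # predict new velocity
        a = a_new
        trajectory.append((x, v))
    return trajectory


K_SPRING = 1.0  # hypothetical spring constant (reduced units)
MASS = 1.0      # hypothetical particle mass (reduced units)


def harmonic_force(x):
    """Model force: a simple harmonic restoring force."""
    return -K_SPRING * x


def total_energy(x, v):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


# Run the trajectory, then take a time average of a property of interest.
traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=MASS, dt=0.01, n_steps=10000)
mean_kinetic = sum(0.5 * MASS * v * v for _, v in traj) / len(traj)
```

A real simulation would replace harmonic_force with pairwise intermolecular forces and track every molecule in the cell, but the integrate-then-average loop has the same shape.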
Types of Computer Simulations – cont.
3. Periodic boundary conditions make 1000 molecules look like 10^23
molecules
4. Computer simulation gives exact results for the molecular model
studied
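Periodic boundary conditions are commonly implemented with coordinate wrapping and the minimum-image convention. The sketch below assumes a cubic cell of side box_length; the function names are hypothetical and not taken from the slides.

```python
import math


def wrap(coord, box_length):
    """Map a coordinate back into the primary cell [0, box_length)."""
    return coord % box_length


def minimum_image(delta, box_length):
    """Shortest separation along one axis, considering all periodic images."""
    return delta - box_length * round(delta / box_length)


def pbc_distance(p, q, box_length):
    """Distance between points p and q using the nearest periodic image."""
    return math.sqrt(sum(minimum_image(a - b, box_length) ** 2
                         for a, b in zip(p, q)))
```

For example, two molecules at x = 0.5 and x = 9.5 in a cell of side 10 are treated as 1 unit apart, because each interacts with the other's nearest image across the cell boundary.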
Simulation of a System of Hard
Spheres
Folding Kinetics
New View: Energy Bias
Dill & Chan (1997)
Representing Protein Geometry
• Atomic Resolution Models
Each atom on the protein and on the solvent molecules is represented as a sphere
interacting via a realistic set of potentials based on the Lennard-Jones potential and
the electrostatic Coulomb potential.
Includes correct bond lengths, bond angles, and planar trans peptide bonds, leading to a
faithful representation of protein geometry.
• Low Resolution Models (Coarse-grained or Simplified Folding
Models)
Solvent molecules not included in the simulation.
Lattice Models: protein is chain of single-site amino acid residues arranged on
the sites of a square or cubic lattice
Off-Lattice models: protein is a flexible chain of single-sphere amino acid
residues interacting via Lennard Jones or other potentials
• Intermediate Resolution Models – in between
All-Atom Simulations
Folding of Villin Headpiece
Subdomain
Well-studied, fast-folding 36-residue protein
Folding time is ~10 microseconds
Duan and Kollman (1998) conducted a 1-microsecond simulation of “folding” using
256 dedicated CPUs for 2 months
Unfolded state → hydrophobic collapse and helix formation →
conformational readjustment → partially folded intermediate
All-atom simulations
Folding of Polyalanine 30-mer
Villin Headpiece Folds at Home
Intermolecular Potentials for Spherical Molecules
One Example: Lennard-Jones Potential
Lennard-Jones potential in dimensionless form:
$$u^*(r^*) = 4\left[\left(\frac{1}{r^*}\right)^{12} - \left(\frac{1}{r^*}\right)^{6}\right]$$
r* = r/σ, where σ is the molecular diameter of the system under study
taken from Dr. D. A. Kofke’s lectures on Molecular Simulation, SUNY Buffalo
http://www.eng.buffalo.edu/~kofke/ce530/index.html
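The dimensionless Lennard-Jones potential can be evaluated directly. A minimal sketch, with the function name lj_reduced chosen here for illustration:

```python
def lj_reduced(r_star):
    """Dimensionless Lennard-Jones potential u*(r*) = 4[(1/r*)^12 - (1/r*)^6]."""
    inv6 = (1.0 / r_star) ** 6
    return 4.0 * (inv6 * inv6 - inv6)


# Sample the repulsive wall, the zero crossing, the minimum, and the attractive tail.
samples = {r: lj_reduced(r) for r in (0.9, 1.0, 2 ** (1 / 6), 2.0)}
```

The potential is zero at r* = 1, reaches its minimum of -1 at r* = 2^(1/6), is strongly repulsive at shorter distances, and is weakly attractive at longer ones.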
Why use simplified (coarse-grained) protein models?
• All atom simulations take too long, can
depend sensitively on the details, and
sample only very early folding events.
• Simplified models allow us to learn general
physical principles of protein folding.
They contain few parameters and implicit biases.
• Allow complete exploration of
conformational and sequence space
Lattice Models for Folding:
Monte Carlo Simulations
• Amino acid residues are sites (beads) on a
cubic lattice
• Generate random moves of “beads” on the lattice
protein
• Accept moves based on their
probability of occurring = exp[-(E_new - E_old)/kT]
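The acceptance step can be sketched as below, assuming the standard Metropolis criterion: moves that lower the energy are always accepted, and uphill moves are accepted with probability exp[-(E_new - E_old)/kT]. The function name and the rng parameter are illustrative and not from the slides.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Return True if a trial move should be accepted (Metropolis criterion)."""
    delta = e_new - e_old
    if delta <= 0.0:
        return True  # downhill or neutral moves are always accepted
    # Uphill moves are accepted with Boltzmann probability exp(-delta/kT).
    return rng.random() < math.exp(-delta / kT)
```

In a lattice simulation this function would be called after each trial bead move; a rejected move leaves the chain in its old conformation, which is counted again in the averages.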
Lattice Models -The HP Model
• Energy function: amino acids are either
hydrophobic (H) or polar (P)
• Hydrophobic beads, H, attract each other
with strength ε when they are on
neighboring lattice sites
U = ε × [number of H-H contacts]
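The HP energy function can be sketched for a chain on a 2-D square lattice. Two conventions here are assumptions not stated on the slide: residues bonded along the chain are not counted as contacts, and ε is taken as negative (default -1.0) so that H-H contacts lower the energy. The function name is hypothetical.

```python
def hp_energy(sequence, coords, epsilon=-1.0):
    """Energy U = epsilon * (number of H-H contacts) for an HP lattice chain.

    sequence: string of 'H' and 'P', one letter per residue.
    coords:   list of (x, y) lattice sites, one per residue, in chain order.
    """
    site_to_index = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only in the +x and +y directions so each pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = site_to_index.get(neighbor)
            # Assumption: chain-bonded neighbors (|i - j| == 1) are not contacts.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return epsilon * contacts


# A 4-residue chain folded into a square brings its two H ends together.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

The sketch does not check that the conformation is self-avoiding; a full simulation would reject moves that place two beads on the same site.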
Lattice model of folding
Q0 = # of native contacts
C = total # of contacts
F = free energy
10^16 possible starting conformations rapidly
fold to one of 10^10 disordered globules and then
slowly search for one of 10^3 compact transition states that
rapidly fold to the unique native structure.
[Figure: free energy F plotted against C and Q0, labeled with 10^16 possible starting configurations, 10^10 disordered globules, 10^3 transition states, and 1 native configuration]