Protein structure and folding
Download
Report
Transcript Protein structure and folding
H-bonds
D-H … A
Assignment: if D…A < 3.7 Å in crystal
structure. (normally 2.7-3.1 Å).
Energy of stablization: -12~-40 kJ/mol)
Tends to be linear.
Only weakly stabilize proteins. (!?)
A survey over H-bonds in globular
proteins (J. Mol. Biol. (1992) 226, 1143)
Local H-bonds?
The authors made
this conclusion:
“Most H-bonds are
local.”
Should be more
critically reviewed.
Most H-bonds are
between beckbone
atoms.
Source: K. Schulten Group
University of Illinois Urbana-Champaign
Protein folding, dynamics and
structural evolution
Chapter 9
Questions
How does a peptide sequence find its
native, functional conformation?
Is there a set of fundamental principles?
We discussed several factors determining
protein structure. Can we *predict* the
structure, from sequence information yet?
Determinants of Protein Folding
We will discuss the following factors, as listed in V&V
chapter 9.
Space Packing.
Directed mainly by internal residues
Protein structures are hierarchically organized.
Protein structures are highly adaptable.
Secondary structure can be context dependent.
Changing the fold of a protein.
Still, keep in mind that most of them are based from
observations and deductions.
Exceptions are possible.
Whether the statistical methods/criteria are acceptable is
another question.
Compactness
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
Proteins are like
liquids and glasses,
instead of crystalline
solids.
The reverse
statement is, does
compactness serve
as a factor
determining protein
structure?
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
Compactness helps, but not
enough.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
Folding is directed mainly by
internal residues
Mutations that change surface residues
are accepted more frequently and are
less likely to affect protein conformations
than are changes of internal residues.
This is consistent with the idea of
Hydrophobic force-driven folding.
Determinants of Protein Folding
Space Packing.
Directed mainly by internal residues
Protein structures are hierarchically
organized.
Protein structures are highly adaptable.
Secondary structure can be context
dependent.
Changing the fold of a protein.
Determinants of Protein Folding
Space Packing.
Directed mainly by internal residues
Protein structures are hierarchically
organized.
Protein structures are highly adaptable.
Secondary structure can be context
dependent.
Changing the fold of a protein.
Protein structures are quite
“resistant” to mutations
A large number of single residue
mutations do not yield a very different
structure.
A complete study was done in phage T4
lysozyme by B. W. Matthews.
Homologous proteins comes with some
sequence identity and they are often
structurally similar.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
Determinants of Protein Folding
Space Packing.
Directed mainly by internal residues
Protein structures are hierarchically
organized.
Protein structures are highly adaptable.
Secondary structure can be context
dependent.
Changing the fold of a protein.
Voet Biochemistry 3e Page 300
© 2004 John Wiley & Sons, Inc.
Table 9-1
Propensities and Classifications of Amino Acid
Residues for a Helical and b Sheet Conformations.
Secondary structure
Secondary structure prediction can be
done with more sophisticated algorithms.
Artificial intelligence such as neuronal
networks or support vector machines.
Basically look at a local sequence and
recongnize its pattern.
Usually such methods need a training set.
I.e. knowledge-based methods.
Voet Biochemistry 3e
Page 280
© 2004 John Wiley & Sons, Inc.
• Green: residues 23-33.
• Cyan: residues 42-53.
• Chm-alpha: a new
sequence replaces green
part.
• Chm-beta: the same new
sequence replaced cyan
part.
• Both are structurally
similar to native GB1.
• The same sequence can be
either an alpha helix or a
beta sheet structure,
depending on their
context.
Figure 9-6
NMR structure of protein GB1.
Determinants of Protein Folding
Space Packing.
Directed mainly by internal residues
Protein structures are hierarchically
organized.
Protein structures are highly adaptable.
Secondary structure can be context
dependent.
Changing the fold of a protein.
Homology
Proteins that share some sequence
identity may be structurally similar.
One evidence that support evolution.
Proteins with as little as 20% sequence
identity may have similar structure.
How much should be changed for a
protein to assume a different structure?
Voet Biochemistry 3e
Page 281
© 2004 John Wiley & Sons, Inc.
• GB1 and Rop are
structurally different.
• 50% of the residues of
GB1 is changed, yielding
a new polypeptide that
assumes Rop-like
structure.
• This new peptide has
41% sequence identity
with native Rop.
• The idea of Protein
design and engineering.
Figure 9-7
X-Ray structure of Rop protein, a homodimer
of aa motifs that associate to form a 4-helix bundle
Protein Folding
Levinthal’s paradox
If for each residue there are only two degrees of
freedom (,).
Assume each can have only 3 stable values.
This leads to 32n possible conformations.
If a protein can explore 1013 conformation per
second. (10 per picosecond).
Still requires an astronomical amount of time to fold
a protein.
This is impossible. So protein must fold in a
way that does not randomly explore each
possible conformations.
Molten Globule
Much of the secondary structure that is present
in a native proteins forms within a few
milliseconds.
This is called hydrophobic collapse.
Something called “Molten Globule”.
Slightly (5-15% in radius) larger than native
conformation.
Significant amount of secondary structure formed.
Side chains are still not ordered/packed.
Structure fluctuation is much larger. Not very
thermodynamically stable.
Are proteins sticky tapes?
Are they simply
hetereopolymers that
like to form H-bond,
hydrophobic interactions
with each other?
Proteins are not any random
hetero-polymers
• By observation:
– every protein has a very stable native structure,
– while polymers are usually random in their
conformation.
• Interesting observation for simple models: the
“designability”. In the following materials are from:
– R. Helling et al., “The designability of protein structures”, J. Mol.
Graphics and Modelling, 19, 157, (2001).
– J. Miller et al.,“Emergence of highly designable protein-backbone
conformations in an off-lattice model” Proteins, 47, 506 (2002).
– Steven S. Plotkin and Jose N. Onuchic, “Understanding protein folding
with energy landscape theory Part I : Basic concepts” Quart. Rev. of
Biophys., 35, 2 (2002), 111.
A 3D lattice HP model
• Assuming only two kinds
of residule H and P.
• Well-studied before.
EHH=−2.3, EHP=−1, EPP=0
• Enumerate all 227 possible
sequences.
• Each sequence has a lowest
energy structure.
• Some sequences share the
same structure. Count the
number of sequence per unique
structure NS.
• Plot the distribution of NS.
• (a) Histogram of NS for the 3 × 3 × 3 system. (b)
Average energy gap between the ground state
and the first excited state versus NS for the 3 × 3
× 3 system.
3794 different
sequences
share ONE
structure!
On the average,
these structure
have large
energy gaps
between the
lowest energy
structure and
the next lowest
one.
NS: Number of seq. corresponding to a structure S
Some structures are very “popular”
for many different sequences
3794 different
sequences
share ONE
structure!
On the average,
these structure
have large
energy gaps
between the
lowest energy
structure and
the next lowest
one.
NS: Number of seq. corresponding to a structure S
Such a property does not depend
on the model used
• Very similar behavior are seen in 2D 6×6
HP model and in 2D or 3D models with 20
different amino acids.
Off the lattice: 23mer 3 state model
Zinc finger of 1PSV
Off-lattice model: results
• a: Backbone configuration of the 11th most designable
23-mer structure
• b: Backbone configuration of the zinc finger 1NC8,
truncated to 23 amino acids.
What does it mean?
Sequence
Structure
The energy landscape
Proteins are in a special subset of
heteropolymers
• Such that the number of possible structures are greatly
reduced.
• Evolution!
• Therefore protein structure prediction is not as hard as it
appears. (still a hard problem though..)
• That also explains why knowledge-based methods works.
• Nevertheless, the tools developed offers valuable clues
for the structure of a new protein.
Computer Simulation
•
Goals:
1)
2)
•
•
Structure prediction: From primary sequences to tertiary
structures (so that we can infer its function)
Known structure (from X-Ray or NMR or another simulation).
Want: dynamics (how it moves at room temperature, with a
ligand, or with a mutation).
(1) above is difficult but do-able. We will discuss
about some of the methods.
(2) is often done with the same methods developed
for (1).
Computers
•
•
•
Deals with numbers and logical
operations.
Needs some “principles” (written in
mathematical equations).
For protein simulations there are different
approaches:
1. Physics-based
2. Knowledge-based
Physics
• Laws for particles moving and interaction
– Classical Mechanics (Newton’s Equation of
motion)
F ma
– Quantum Mechanics (Schrödinger’s Equation)
(r, t ) Hˆ (r, t )
t
T ime- Independent : Hˆ (r) E (r )
T ime- Dependent: i
• Many developments in physical chemistry
can be used.
Physics-Based protein simulation
• All quantum mechanics (QM) calculation is
not feasible.
• QM can be applied to a small set of atoms.
– Modeling of an active site (other atoms: not
treated or treated as dielectric continuum)
– Can get total energies (binding vs. nonbinding, pKa etc.), wave function (charge
distribution).
– QM/MM simulations (other atoms: treated with
Molecular Mechanics)
An example of using QM (Case et al.,
J. Biol. Inorg. Chem. 2002, 7, 632)
• Rieske iron-sulfur protein
in bc-type cytochromes
• Calculations based on density
functional theory (DFT) performed.
• pKa and redox potentials can be
obtained from total energies of
several states.
• Change of pKa (proton-binding)
and redox potential (electronbinding) are strongly coupled, as
observed in experiments.
Using classical mechanics for
protein structure and dynamics
• Ignore electrons,
assigning (empirical)
force fields for atoms
(or clusters of atoms).
• A very simple potential:
Force fields: bond stretching
and bending
A. R. Leach, “Molecular Modelling”, 1996
Torsional potential
A. R. Leach, “Molecular Modelling”, 1996
3 point charges
N2 molecules:
Known to have an
Electric quadrupole
moment
5 point charges
Ab initio QM results
Polarization: many-body effect
Physics based: methods
• Energy Minimization
– Steepest descent
– Conjugated gradient
• Monte Carlo Simulation
– Random sampling
– Stimulated annealing
• Molecular Dynamics
– Compute conformational
changes.
Energy surface of two torsional
angles
• Very shallow valleys.
• Similar in energy.
• Determining the (ψ,φ)
conformations of peptide
backbones is even more
complicated.
Trapping at a local minimum
• Standard practice: use Monte Carlo (random sampling)
with stimulated annealing techniques.
Using classical mechanics for
protein structure and dynamics
• With a Force field ( V(rN) ), for lowest energy structure
• Find the structure that gives energy minimum. Hopefully
this is done within finite amount of computer resources.
And hopefully this energy minimum gives the desired
native protein structure.
• For protein dynamics: calculate trajectories (Newton’s
eq.) at thermal condition and find the averaged physical
quantities.
Questions to ask:
• Is the energy function correct?
– Precise enough to discriminate other nonnative structure.
– Yet simple enough for computers to carry out
efficiently.
• Is the conformational search good enough
to cover the global minimum?
Take-home messages:
Physics-based methods
• Protein folding without any prior
knowledge about protein structure is a
difficult task.
• Protein structure prediction is often quoted
as an “N-P complete problem”, i.e. the
complexity of the problem grows
exponentially as the number of residues
increases.
• Structures of small proteins (~101 - 102
a.a.) can be solved in principle.