Protein structure
Recommended reading
• On protein structure and function
– Branden & Tooze, Intro to Protein Structure
– Lesk, Protein Architecture
• On protein structure prediction
– Salzberg, Searls & Kasif, Computational Methods in Molecular Biology
– Setubal & Meidanis, Intro to Computational Molecular Biology
Why?
• Proteins are highly adapted to perform specific functions.
• Functional properties depend on three-dimensional structure.
• Three-dimensional structure forms, on the basis of chemical laws, from linear chains of amino acids (polypeptides).
What?
• Most life functions are performed by proteins:
– scaffolding (cell walls)
– activity (muscle fiber)
– catalysts (enablers)
– enzymes (tools)
– transport (hemoglobin)
– binding (keys & switches)
Whether?
• ‘To understand the biological function of proteins we would ... like to be able to deduce or predict the three-dimensional structure from the amino acid sequence.’
• ‘This we cannot do.’
• ‘In spite of considerable efforts over the past 25 years, this folding problem is still unsolved and remains one of the most basic intellectual challenges in molecular biology.’
Proteins are polypeptide chains
• There are 20 amino acids
• Each amino acid has a central carbon atom, Cα
• Each Cα has an attached amino group and a carboxyl group
• Each Cα has an attached side chain
• Adjacent amino acids link through a peptide bond
Amino acid groups
• Hydrophobic amino acids
– Alanine, Valine, Phenylalanine, Proline,
Methionine, Isoleucine, Leucine
• Charged amino acids
– Aspartic acid, Glutamic acid, Lysine, Arginine
• Polar amino acids
– Serine, Threonine, Tyrosine, Histidine, Cysteine, Asparagine, Glutamine, Tryptophan
• Glycine
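The grouping above can be written down as a lookup table, for example to annotate a sequence residue by residue. This is a minimal sketch: the names `GROUPS`, `RESIDUE_GROUP` and `classify` are illustrative, and the assignment simply follows this slide (other texts place some residues, such as tyrosine or tryptophan, in different groups).

```python
# Grouping taken from the slide, using one-letter amino acid codes.
GROUPS = {
    "hydrophobic": "AVFPMIL",   # Ala, Val, Phe, Pro, Met, Ile, Leu
    "charged": "DEKR",          # Asp, Glu, Lys, Arg
    "polar": "STYHCNQW",        # Ser, Thr, Tyr, His, Cys, Asn, Gln, Trp
    "glycine": "G",             # listed on its own in the slide
}

# Invert the table: residue letter -> group name.
RESIDUE_GROUP = {aa: name for name, letters in GROUPS.items() for aa in letters}


def classify(sequence):
    """Return the group name of each residue in a one-letter sequence."""
    return [RESIDUE_GROUP[aa] for aa in sequence.upper()]
```

For instance, `classify("GAK")` labels the three residues as glycine, hydrophobic and charged.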
‘Levels’ of protein structure
Angles of the chain
The conformational angles
• Peptide units are rigid groups
• There is rotational freedom about the bonds at each Cα
• The N-Cα angle is called phi (φ)
• The Cα-C’ angle is called psi (ψ)
• The peptide sequence and the phi-psi angles completely specify the three-dimensional structure of the main chain
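Phi and psi are dihedral (torsion) angles, each defined by four consecutive main-chain atoms. The sketch below computes a signed dihedral from four 3D points using only the standard library. The function names are illustrative assumptions, and the sign convention is the usual one in which a trans arrangement gives 180 degrees and a cis arrangement gives 0.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, about the p1-p2 bond."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)          # the central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)        # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)        # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_hat = tuple(x / b1_len for x in b1)
    m1 = _cross(b1_hat, n1)    # completes an orthogonal frame with n1 and b1_hat
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def phi(c_prev, n, ca, c):
    """phi: rotation about the N-CA bond (atoms C'(i-1), N, CA, C')."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi: rotation about the CA-C' bond (atoms N, CA, C', N(i+1))."""
    return dihedral(n, ca, c, n_next)
```

Each argument is an `(x, y, z)` tuple, for example atom coordinates read from a structure file.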
Constraints on conformation
• Most phi-psi combinations can’t occur
because they cause the side chain and main
chain to collide
• Permitted combinations are shown in Ramachandran plots
Glycine’s structural role
• Glycine (H side chain) has lots of freedom
• It is therefore good for creating unusual shapes
• Therefore, glycine is highly conserved
among homologous sequences (useful in
prediction)
Rotamers
• ‘Long’ side chains interweave with the main
chain
• These staggered conformations are called
rotamers
• Some rotamers are energetically favorable
and are preferred (useful in prediction)
Metal atoms in proteins
• Some amino acids bind metal atoms
• These atoms are used in function and/or
structure of the protein
• Example: iron is used by hemoglobin for
oxygen binding and transport
• Example: ‘zinc fingers’ stabilize DNA-binding regions
Motifs
‘Perhaps the most remarkable features of the
[protein] molecule are its complexity and its
lack of symmetry. The arrangement seems
to be almost totally lacking in the kind of
regularities which one instinctively
anticipates, and it is more complicated than
has been predicted by any theory of protein
structure.’
– John Kendrew, 1958
Globular structure
• Functional groups attached to rigid
framework
• Many compact regions
• Hydrophobic core
• Hydrophilic surface
• Gaps may have water molecules
Alpha helices
• Predicted by Pauling, 1951
• Supported by Max Perutz’s diffraction data on hemoglobin; confirmed by John Kendrew’s myoglobin structure
• Consecutive residues with phi-psi angles of about -60 and -50 degrees
• Question: how many residues per turn of
helix?
Beta sheets
• Built from ‘distant’ subsequences
• Form hydrogen bonds between C’=O group and NH group
• May be parallel or antiparallel
Loop regions
• Rich in charged and polar residues
• Antiparallel connectors are hairpin turns
Threading
• Match protein to a relative (based on
sequence similarity)
• There aren’t too many (~1000) basic
families of proteins
Formal statement
• Input:
– protein sequence
– core structural model
– score functions
– lower bound (termination condition)
• Output:
– a threading
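The inputs and output listed above can be made concrete with a toy example. In this sketch every modelling choice is an assumption made for illustration: the core model is a list of segments whose positions are marked "B" (buried) or "E" (exposed), a threading is a tuple of start positions placing the segments in order without overlap, the score gives +1 when a hydrophobic residue lands on a buried position (or a non-hydrophobic one on an exposed position) and -1 otherwise, and `good_enough` stands in for the termination bound. Real score functions are considerably richer.

```python
HYDROPHOBIC = set("AVFPMIL")  # hydrophobic group as listed earlier in the deck


def score(sequence, core, starts):
    """Toy score function: +1 per position whose environment suits the residue."""
    total = 0
    for segment, start in zip(core, starts):
        for offset, env in enumerate(segment):
            buried = env == "B"
            hydrophobic = sequence[start + offset] in HYDROPHOBIC
            total += 1 if buried == hydrophobic else -1
    return total


def threadings(seq_len, core):
    """Yield every ordered, non-overlapping placement of the core segments."""
    lengths = [len(segment) for segment in core]

    def place(i, pos):
        if i == len(lengths):
            yield ()
            return
        remaining = sum(lengths[i:])  # room that segments i.. still need
        for start in range(pos, seq_len - remaining + 1):
            for rest in place(i + 1, start + lengths[i]):
                yield (start,) + rest

    yield from place(0, 0)


def best_threading(sequence, core, good_enough=None):
    """Exhaustive search; stops early once a score reaches `good_enough`."""
    best = None
    for starts in threadings(len(sequence), core):
        s = score(sequence, core, starts)
        if best is None or s > best[0]:
            best = (s, starts)
        if good_enough is not None and s >= good_enough:
            break
    return best
```

For the sequence `"KAVLKDIFK"` and the core `["BBB", "BB"]`, the best threading places the first segment on AVL and the second on IF. Exhaustive enumeration is only feasible for tiny inputs like this one.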
Methods
• NP-hard!
• Any combinatorial optimization method
may be applied
– Branch and bound
– Genetic algorithm
– Simulated annealing
• Hundreds of submissions to CASP
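As an illustration of the first method in the list, here is a minimal branch-and-bound sketch for a toy threading problem. The setup is an assumption for illustration only: core segments are strings of "B" (buried) and "E" (exposed) positions, a threading is a tuple of ordered, non-overlapping start positions, and each position scores +1 or -1 depending on whether the residue's hydrophobicity suits it. With this position-by-position score the instance is easy; it only serves to show the pruning step, not the hardness of the general problem.

```python
HYDROPHOBIC = set("AVFPMIL")  # hydrophobic group as listed earlier in the deck


def segment_score(sequence, segment, start):
    """Toy score of one core segment placed at `start`."""
    total = 0
    for offset, env in enumerate(segment):
        buried = env == "B"
        hydrophobic = sequence[start + offset] in HYDROPHOBIC
        total += 1 if buried == hydrophobic else -1
    return total


def branch_and_bound(sequence, core):
    """Return (best score, start positions) for the toy threading problem."""
    lengths = [len(segment) for segment in core]
    # tail[i] = number of positions in segments i.., which is also an
    # optimistic bound on their score (at most +1 per position).
    tail = [sum(lengths[i:]) for i in range(len(core) + 1)]
    best = {"score": float("-inf"), "starts": None}

    def extend(i, pos, partial, starts):
        if i == len(core):
            if partial > best["score"]:
                best["score"], best["starts"] = partial, tuple(starts)
            return
        # Bound: even a perfect fit of the remaining segments cannot
        # beat the best complete threading found so far, so prune.
        if partial + tail[i] <= best["score"]:
            return
        # Branch: try every start position that leaves room for the rest.
        for start in range(pos, len(sequence) - tail[i] + 1):
            s = segment_score(sequence, core[i], start)
            extend(i + 1, start + lengths[i], partial + s, starts + [start])

    extend(0, 0, 0, [])
    return best["score"], best["starts"]
```

On the sequence `"KAVLKDIFK"` with core `["BBB", "BB"]`, the search finds the placement on AVL and IF and prunes every later branch whose optimistic bound cannot exceed it.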
A beta sheet
Abstract views of protein structure