Secondary structure


The Structure and Function of Proteins
Bioinformatics Ch 7
The many functions of proteins
• Mechanoenzymes: myosin, actin
• Rhodopsin: allows vision
• Globins: transport oxygen
• Antibodies: immune system
• Enzymes: pepsin, renin, carboxypeptidase A
• Receptors: transmit messages through membranes
• Vitellogenin: molecular velcro
– And hundreds of thousands more…
Complex Chemistry Tutorial
• Molecules are made of atoms!
• There is a lot of hydrogen out there!
• Atoms make a “preferred” number of covalent (strong)
bonds
– C–4
– N–3
– O, S – 2
• Atoms will generally “pick up” enough hydrogens to “fill
their valence capacity” in vivo.
• Molecules also “prefer” to have a neutral charge
Biochemistry
• In the context of a protein…
– Oxygen tends to exhibit a slight negative charge
– Nitrogen tends to exhibit a slight positive charge
– Carbon tends to remain neutral/uncharged
• Atoms can “share” a hydrogen atom, each making
“part” of a covalent bond with the hydrogen
– Oxygen: H-Bond donor or acceptor
– Nitrogen: H-Bond donor
– Carbon: Neither
Proteins are chains of amino acids
• Polymer – a molecule composed of repeating units
Amino acid composition
• Basic Amino Acid
Structure:
– The side chain, R,
varies for each of
the 20 amino acids
– Amino & Carboxyl
groups, plus a Carbon
make the “Backbone” of
the amino acid
[Figure: generic amino acid, H2N–Cα(H)(R)–COOH, with the amino group, the side chain R, and the carboxyl group labeled]
The Peptide Bond
• Dehydration synthesis
• Repeating backbone: N–Cα–C(=O)–N–Cα–C(=O)
– Convention – start at amino terminus and proceed
to carboxy terminus
Peptidyl polymers
• A few amino acids in a chain are called a
polypeptide. A protein is usually composed
of 50 to 400+ amino acids.
• Since part of the amino acid is lost during
dehydration synthesis, we call the units of a
protein amino acid residues.
[Figure: the peptide bond, joining the carbonyl carbon of one residue to the amide nitrogen of the next]
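The bookkeeping behind "part of the amino acid is lost" can be sketched in a few lines: each peptide bond releases one water, so a chain of n amino acids weighs the sum of the free amino acids minus (n - 1) waters. This is only an illustration; the function name `peptide_mass` is hypothetical, and the masses are approximate average molar masses (glycine about 75.07, alanine about 89.09, water about 18.015 g/mol).

```python
WATER_MASS = 18.015  # g/mol, approximate; one water is released per peptide bond


def peptide_mass(free_amino_acid_masses):
    """Mass of a linear peptide built from the given free amino acids.

    Dehydration synthesis removes one water for each of the (n - 1) bonds.
    """
    n = len(free_amino_acid_masses)
    if n == 0:
        return 0.0
    return sum(free_amino_acid_masses) - (n - 1) * WATER_MASS


GLYCINE = 75.07   # g/mol, approximate
ALANINE = 89.09   # g/mol, approximate

# The dipeptide Gly-Ala: two amino acids, one peptide bond, one water lost.
print(peptide_mass([GLYCINE, ALANINE]))
```

The difference between the free amino acid and what remains in the chain is why the units are called residues.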
Side chain properties
• Recall that the electronegativity of carbon is at
about the middle of the scale for light elements
– Carbon does not make hydrogen bonds with water
easily – hydrophobic
– O and N are generally more likely than C to h-bond to
water – hydrophilic
• We group the amino acids into three general
groups:
– Hydrophobic
– Charged (positive/basic & negative/acidic)
– Polar
The Hydrophobic Amino Acids
Proline severely
limits allowable
conformations!
The Charged Amino Acids
The Polar Amino Acids
More Polar Amino Acids
And then there’s…
Planarity of the peptide bond
Psi (ψ) – the angle of rotation about the Cα–C bond.
Phi (φ) – the angle of rotation about the N–Cα bond.
The planar bond angles and bond
lengths are fixed.
Phi and psi
• φ = ψ = 180° is the extended conformation
• φ: Cα to N–H
• ψ: C=O to Cα
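Phi and psi are ordinary torsion (dihedral) angles, so they can be computed from four atom positions. The sketch below is a generic dihedral routine, not something taken from the slides; it assumes the usual atom choice of C(i-1), N, Cα, C for φ and N, Cα, C, N(i+1) for ψ, and the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Four points in a planar zig-zag: the outer atoms lie on opposite sides (trans).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

The planar zig-zag in the usage line corresponds to the 180° extended conformation; moving the last point to the same side as the first gives 0°.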
The Ramachandran Plot
[Figure: Ramachandran plots of φ against ψ. Panels: calculated, observed (non-glycine), observed (glycine)]
• G. N. Ramachandran – first calculations of
sterically allowed regions of phi and psi
• Note the structural importance of glycine
Primary & Secondary Structure
• Primary structure = the linear sequence of amino
acids comprising a protein:
AGVGTVPMTAYGNDIQYYGQVT…
• Secondary structure
– Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every known protein structure: the α-helix and the β-sheet
– The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
The alpha helix
[Figure: the alpha helix, φ ≈ ψ ≈ −60°]
Properties of the alpha helix
• φ ≈ ψ ≈ −60°
• Hydrogen bonds
between C=O of
residue n, and
NH of residue
n+4
• 3.6 residues/turn
• 1.5 Å/residue rise
• 100°/residue turn
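The three numbers above are tied together by simple arithmetic, which the following sketch spells out. The constants come from the slide; the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6   # from the slide
RISE_PER_RESIDUE = 1.5    # in Å, from the slide


def helix_pitch():
    """Distance travelled along the helix axis in one full turn, in Å."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE


def rotation_per_residue():
    """Rotation about the helix axis contributed by each residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate axial length of an n-residue helix, in Å."""
    return n_residues * RISE_PER_RESIDUE


print(helix_pitch())            # about 5.4 Å per turn
print(rotation_per_residue())   # about 100 degrees per residue
print(helix_length(20))         # 30 Å for a 20-residue helix
```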
Properties of α-helices
• 4 – 40+ residues in length
• Often amphipathic or “dual-natured”
– Half hydrophobic and half hydrophilic
– Mostly when surface-exposed
• If we examine many α-helices, we find trends…
– Helix formers: Ala, Glu, Leu, Met
– Helix breakers: Pro, Gly, Tyr, Ser
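One way to see why a helix can be amphipathic is to place each residue on a "helical wheel", advancing 100° per residue as viewed down the helix axis. The sketch below only computes those angles; the function names and the example positions are illustrative assumptions, not part of the slides.

```python
def wheel_angles(sequence, degrees_per_residue=100.0):
    """Angular position of each residue around the helix axis, in degrees."""
    return [(res, (i * degrees_per_residue) % 360.0)
            for i, res in enumerate(sequence)]


def angular_gap(i, j, degrees_per_residue=100.0):
    """Smallest angle between positions i and j on the wheel, in degrees."""
    d = abs(i - j) * degrees_per_residue % 360.0
    return min(d, 360.0 - d)


# Positions i, i+3, i+4 and i+7 sit close together on one face of the helix,
# while i+2 lands nearly on the opposite side.
for offset in (2, 3, 4, 7):
    print(offset, angular_gap(0, offset))
```

A sequence that puts hydrophobic residues at positions such as i, i+3, i+4, i+7 would therefore present one hydrophobic face and one hydrophilic face.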
The beta strand (& sheet)
φ ≈ −135°
ψ ≈ +135°
Properties of beta sheets
• Formed of stretches of 5-10 residues in extended conformation
• Pleated – each Cα a bit above or below the previous
• Parallel/antiparallel, contiguous/non-contiguous
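The backbone angles quoted in these slides (about −60°, −60° for the alpha helix, about −135°, +135° for the beta strand, and 180°, 180° for the fully extended chain) can serve as reference points for labeling a measured (φ, ψ) pair. The sketch below simply picks the nearest reference point; this nearest-point rule is an illustrative assumption, not a real Ramachandran-region definition, and the names are hypothetical.

```python
import math

# Reference (phi, psi) pairs in degrees, taken from the slides.
CANONICAL = {
    "alpha helix": (-60.0, -60.0),
    "beta strand": (-135.0, 135.0),
    "fully extended": (180.0, 180.0),
}


def angle_diff(a, b):
    """Smallest difference between two angles in degrees (handles wrap-around)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_conformation(phi, psi):
    """Name of the reference conformation closest to (phi, psi)."""
    def distance(name):
        ref_phi, ref_psi = CANONICAL[name]
        return math.hypot(angle_diff(phi, ref_phi), angle_diff(psi, ref_psi))
    return min(CANONICAL, key=distance)


print(nearest_conformation(-57, -47))    # alpha helix
print(nearest_conformation(-120, 130))   # beta strand
```

Because angles wrap around at ±180°, the distance uses `angle_diff` rather than plain subtraction; otherwise −179° and +179° would look far apart.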
Parallel and anti-parallel β-sheets
• Anti-parallel is slightly energetically favored
Anti-parallel
Parallel
Turns and Loops
• Secondary structure elements are connected by
regions of turns and loops
• Turns – short regions of non-α, non-β conformation
• Loops – larger stretches with no secondary
structure. Often disordered.
– “Random coil”
– Sequences vary much more than secondary structure
regions
Levels of Protein Structure
• Secondary structure
elements combine to form
tertiary structure
• Quaternary structure
occurs in multienzyme
complexes
– Many proteins are active
only as homodimers,
homotetramers, etc.
Secondary Structure Prediction
• Based on backbone flexibility
• Various methods
– Statistical, neural networks, evolutionary
computation.
– Conserved aligned sequences as input (degree
calculated)
– PHD can get 70-75% accuracy
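Accuracy figures like the one above are usually per-residue scores: the fraction of residues whose predicted state (helix, strand, or other) matches the observed state. The sketch below assumes that interpretation and a three-letter alphabet of H, E, and C; both the function name `q3_accuracy` and the example strings are hypothetical.

```python
def q3_accuracy(predicted, observed):
    """Fraction of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# H = helix, E = strand, C = coil/other; one of ten positions disagrees.
print(q3_accuracy("HHHHEEEECC", "HHHEEEEECC"))   # 0.9
```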
Chou-Fasman Parameters
Name            Abbrv  P(a)  P(b)  P(turn)  f(i)   f(i+1)  f(i+2)  f(i+3)
Alanine         A      142   83    66       0.060  0.076   0.035   0.058
Arginine        R      98    93    95       0.070  0.106   0.099   0.085
Aspartic Acid   D      101   54    146      0.147  0.110   0.179   0.081
Asparagine      N      67    89    156      0.161  0.083   0.191   0.091
Cysteine        C      70    119   119      0.149  0.050   0.117   0.128
Glutamic Acid   E      151   37    74       0.056  0.060   0.077   0.064
Glutamine       Q      111   110   98       0.074  0.098   0.037   0.098
Glycine         G      57    75    156      0.102  0.085   0.190   0.152
Histidine       H      100   87    95       0.140  0.047   0.093   0.054
Isoleucine      I      108   160   47       0.043  0.034   0.013   0.056
Leucine         L      121   130   59       0.061  0.025   0.036   0.070
Lysine          K      114   74    101      0.055  0.115   0.072   0.095
Methionine      M      145   105   60       0.068  0.082   0.014   0.055
Phenylalanine   F      113   138   60       0.059  0.041   0.065   0.065
Proline         P      57    55    152      0.102  0.301   0.034   0.068
Serine          S      77    75    143      0.120  0.139   0.125   0.106
Threonine       T      83    119   96       0.086  0.108   0.065   0.079
Tryptophan      W      108   137   96       0.077  0.013   0.064   0.167
Tyrosine        Y      69    147   114      0.082  0.065   0.114   0.125
Valine          V      106   170   50       0.062  0.048   0.028   0.053
Chou-Fasman Algorithm
• Identify α-helices
– Nucleate: find 4 out of 6 contiguous amino acids that have P(a) > 100
– Extend the region until 4 contiguous amino acids with P(a) < 100 are found
– Compute the summed P(a) and P(b) over the region; if the region is > 5 residues long and P(a) > P(b), identify it as a helix
• Repeat for β-sheets [use P(b)]
• If an α and a β region overlap, the overlapping region is predicted according to P(a) and P(b)
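The helix step can be sketched as follows. This is one reading of the rules as stated on the slide, not a full Chou-Fasman implementation: it finds only the first nucleus, treats "4 amino acids with P(a) < 100" as four contiguous residues each below 100, and compares summed P(a) and P(b) over the region. The function name `first_helix` is hypothetical; the P(a) and P(b) values are those in the parameter table.

```python
# Chou-Fasman helix and sheet propensities from the parameter table.
P_A = {"A": 142, "R": 98, "D": 101, "N": 67, "C": 70, "E": 151, "Q": 111,
       "G": 57, "H": 100, "I": 108, "L": 121, "K": 114, "M": 145, "F": 113,
       "P": 57, "S": 77, "T": 83, "W": 108, "Y": 69, "V": 106}
P_B = {"A": 83, "R": 93, "D": 54, "N": 89, "C": 119, "E": 37, "Q": 110,
       "G": 75, "H": 87, "I": 160, "L": 130, "K": 74, "M": 105, "F": 138,
       "P": 55, "S": 75, "T": 119, "W": 137, "Y": 147, "V": 170}


def _four_breakers(seq, start):
    """True if seq[start:start+4] is four residues that all have P(a) < 100."""
    four = seq[start:start + 4]
    return len(four) == 4 and all(P_A[r] < 100 for r in four)


def first_helix(seq):
    """Return (start, end, sum P(a), sum P(b), is_helix) for the first nucleus.

    The region is seq[start:end]. Returns None if no nucleus is found.
    """
    for i in range(len(seq) - 5):
        # Nucleation: at least 4 of 6 contiguous residues with P(a) > 100.
        if sum(P_A[r] > 100 for r in seq[i:i + 6]) < 4:
            continue
        start, end = i, i + 6
        # Extend right until the next four residues are all helix breakers.
        while end < len(seq) and not _four_breakers(seq, end):
            end += 1
        # Extend left until the preceding four residues are all helix breakers.
        while start > 0 and not (start >= 4 and _four_breakers(seq, start - 4)):
            start -= 1
        region = seq[start:end]
        sum_a = sum(P_A[r] for r in region)
        sum_b = sum(P_B[r] for r in region)
        return start, end, sum_a, sum_b, len(region) > 5 and sum_a > sum_b
    return None


print(first_helix("CAENKLDHVRGPTCILFMTWYNDGP"))
```

On the sequence used in the worked example, this selects CAENKLDHV with summed P(a) = 972 and P(b) = 843, in agreement with the numbers given there.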
Chou-Fasman, cont’d
• Identify hairpin turns:
– P(t) = f(i) of the residue * f(i+1) of the next residue *
f(i+2) of the following residue * f(i+3) of the residue at
position (i+3)
– Predict a hairpin turn starting at positions where:
• P(t) > 0.000075
• The average P(turn) for the four residues > 100
• P(a) < P(turn) > P(b) for the four residues
• Accuracy ≈ 60-65%
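The three turn conditions can be checked mechanically for any four-residue window. The sketch below encodes the parameter table as tuples and applies the conditions exactly as listed; the names `CF`, `turn_scores`, and `is_turn` are hypothetical.

```python
# Per residue: (P(a), P(b), P(turn), f(i), f(i+1), f(i+2), f(i+3))
CF = {
    "A": (142, 83, 66, 0.060, 0.076, 0.035, 0.058),
    "R": (98, 93, 95, 0.070, 0.106, 0.099, 0.085),
    "D": (101, 54, 146, 0.147, 0.110, 0.179, 0.081),
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),
    "C": (70, 119, 119, 0.149, 0.050, 0.117, 0.128),
    "E": (151, 37, 74, 0.056, 0.060, 0.077, 0.064),
    "Q": (111, 110, 98, 0.074, 0.098, 0.037, 0.098),
    "G": (57, 75, 156, 0.102, 0.085, 0.190, 0.152),
    "H": (100, 87, 95, 0.140, 0.047, 0.093, 0.054),
    "I": (108, 160, 47, 0.043, 0.034, 0.013, 0.056),
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.070),
    "K": (114, 74, 101, 0.055, 0.115, 0.072, 0.095),
    "M": (145, 105, 60, 0.068, 0.082, 0.014, 0.055),
    "F": (113, 138, 60, 0.059, 0.041, 0.065, 0.065),
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),
    "S": (77, 75, 143, 0.120, 0.139, 0.125, 0.106),
    "T": (83, 119, 96, 0.086, 0.108, 0.065, 0.079),
    "W": (108, 137, 96, 0.077, 0.013, 0.064, 0.167),
    "Y": (69, 147, 114, 0.082, 0.065, 0.114, 0.125),
    "V": (106, 170, 50, 0.062, 0.048, 0.028, 0.053),
}


def turn_scores(tetra):
    """Return (P(t), avg P(turn), avg P(a), avg P(b)) for four residues."""
    if len(tetra) != 4:
        raise ValueError("a turn window must contain exactly four residues")
    rows = [CF[r] for r in tetra]
    p_t = 1.0
    for position, row in enumerate(rows):
        p_t *= row[3 + position]          # f(i), f(i+1), f(i+2), f(i+3) in order
    avg_a = sum(row[0] for row in rows) / 4
    avg_b = sum(row[1] for row in rows) / 4
    avg_turn = sum(row[2] for row in rows) / 4
    return p_t, avg_turn, avg_a, avg_b


def is_turn(tetra):
    """Apply the three turn conditions from the slide."""
    p_t, avg_turn, avg_a, avg_b = turn_scores(tetra)
    return p_t > 0.000075 and avg_turn > 100 and avg_a < avg_turn > avg_b


print(turn_scores("VRGP"))
print(is_turn("VRGP"))
```

For the tetrapeptide VRGP this gives P(t) of roughly 0.000085, an average P(turn) of 113.25, and averages of 79.5 and 98.25 for P(a) and P(b), so all three conditions hold.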
Chou-Fasman Example
• CAENKLDHVRGPTCILFMTWYNDGP
• CAENKL – potential helix nucleus: 4 of the 6 residues have P(a) > 100 (all except C and N)
• Residues with P(a) < 100: RNCGPSTY
– Extend: when we reach RGPT, we must stop
– CAENKLDHV: P(a) = 972, P(b) = 843
– Declare alpha helix
• Identifying a hairpin turn
– VRGP: P(t) = 0.000085
– Average P(turn) = 113.25
– Avg P(a) = 79.5, Avg P(b) = 98.25
– All three conditions hold, so a turn is predicted starting at V
Protein Structure Examples
Views of a protein
• Wireframe
• Ball and stick
• Spacefill
• Cartoon
CPK colors
• Carbon = green, black, or grey
• Nitrogen = blue
• Oxygen = red
• Sulfur = yellow
• Hydrogen = white