Transcript Lecture 6
Structural hierarchy in proteins
Color conventions
Protein Geometry
CORN LAW amino acid with L configuration
Greek alphabet
The Polypeptide Chain
Chapter 5
Covalent structures of proteins
Proteins function as:
1. Enzymes: biological catalysts
2. Regulators of catalysis: hormones
3. Transport and storage, e.g. O2, metal ions,
sugars, lipids, etc.
4. Contractile assemblies
Muscle fibers
Separation of chromosomes
etc.
5. Sensory
Rhodopsin, nerve proteins
6. Cellular defense
Immunoglobulins
Antibodies
Killer T cell receptors
7. Structural
Collagen
Silk, etc.
Function is dictated by protein structure!!
There are four levels of protein structure
1. Primary structure
1° = Amino acid sequence, the linear order of
AAs.
Remember: it is written from the N-terminus to the C-terminus.
Above all else this dictates the structure and
function of the protein.
2. Secondary structure
2° = Local spatial alignment of amino acids
without regard to side chains.
Usually repeated structures
Examples: α helix, β sheets, random coil, or β
turns
3. Tertiary Structure
3° = the three-dimensional structure of an entire
polypeptide.
Rich in detail but hard to generalize. Can
reveal the detailed chemical mechanisms of
an enzyme.
4. Quaternary Structure
4° = two or more peptide chains associated
in a protein.
Spatial arrangements of subunits.
Chapter 5.3 is how to determine a protein’s primary
structure.
“Protein Chemistry”
Example of each level of protein structure
Insulin was the first protein to be sequenced
F. Sanger won the Nobel prize for protein
sequencing.
It took 10 years, many people,
and it took 100 g of protein!
Today it takes one person several days to sequence
the same insulin.
β-Galactosidase (1021 AA) was sequenced in 1978.
Steps towards protein sequencing
Above all else, purify it first!! Chapter 5.3 then 5.1 and 5.2
1. Prepare protein for sequencing
a. Determine number of chemically different polypeptides.
b. Cleave the protein’s disulfide bonds.
c. Separate and purify each subunit.
d. Determine amino acid composition for each peptide.
Bovine insulin: note the intra- and interchain disulfide linkages
2. Sequencing the peptide chains:
a. Fragment subunits into smaller peptides of about 50
AAs or fewer in length.
b. Separate and purify the fragments
c. Determine the sequence of each fragment.
d. Repeat step 2 with different fragmentation
system.
3. Organize the completed structure.
a. Span cleavage points between sets of peptides
determined by each peptide sequence.
b. Elucidate disulfide bonds and modified amino
acids.
At best, the automated instruments can sequence about 50
amino acids in one run!
Proteins must be cleaved into smaller pieces to obtain a
complete sequence.
End Group Analysis
How many peptides in protein?
Bovine insulin should give 2 N-termini and 2 C-termini
N-terminus
1-Dimethylaminonaphthalene-5-sulfonyl chloride
(dansyl chloride)
Reacts with amines: the N-terminus and Lys (K) side chains
The disadvantage of the dansyl chloride method is that 6 M HCl must
be used to cleave off the derivatized amino acid, and this hydrolyzes
all the other amide (peptide) bonds as well.
Edman degradation with Phenyl isothiocyanate, PITC
Edman degradation has been automated as a
method to sequence proteins. The PTH-amino acid
is soluble in solvents that the protein is not. This fact
is used to separate the tagged amino acid from the
remaining protein, allowing the cycle of labeling,
degradation, and separation to continue.
Even with the best chemistry, the reaction is about
98% efficient. After sufficient cycles more than one
amino acid is identified, making the sequence
determination error-prone at longer reads.
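The effect of a 98% yield per cycle can be made concrete with a short calculation. The sketch below assumes the simplest possible model, in which every cycle succeeds independently with the same efficiency, so the fraction of chains still in register after n cycles is 0.98 raised to the power n. The function name is illustrative only.

```python
def in_phase_fraction(cycles, efficiency=0.98):
    """Fraction of peptide chains still in register after `cycles` Edman cycles.

    Assumes each cycle succeeds independently with the same efficiency.
    """
    return efficiency ** cycles


if __name__ == "__main__":
    for n in (10, 25, 50, 100):
        print(f"after {n:3d} cycles: {in_phase_fraction(n):.1%} of chains in phase")
```

Under this model only a little over a third of the chains are still in phase after 50 cycles, which is consistent with reads becoming unreliable at about that length.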
Demonstration of Edman
degradation
Use your CD: install it and run the
Chapter 5 Edman degradation demonstration.
Carboxypeptidase cleavage at the C-terminus
[Reaction scheme: carboxypeptidase hydrolyzes the C-terminal peptide bond, releasing residue Rn as a free amino acid and leaving Rn-1 as the new C-terminus.]
Carboxypeptidase A: Rn ≠ R, K, P; Rn-1 ≠ P
If the Tyr-Ser bond is more resistant to cleavage than the Leu-Tyr,
the Ser and the Tyr will appear simultaneously and the C-terminus
would still be in doubt.
Cleavage of disulfide bonds
Permits separation of polypeptide chains
Prevents refolding back to native structure
Performic acid oxidation
Changes cystine or cysteine to cysteic acid
Methionine to methionine sulfone
2-Mercaptoethanol, dithiothreitol, or dithioerythritol
Keeps the equilibrium towards the reduced form
-S-S- ⇌ 2 -SH
Amino acid composition
The amino acid composition of a peptide chain is determined by
its complete hydrolysis followed by the quantitative analysis of
the liberated amino acids.
Acid hydrolysis (6 N HCl) at 120 °C for 10 to 100 h destroys Trp and
partially destroys Ser, Thr, and Tyr. Also, Gln and Asn yield Glu and Asp.
Base hydrolysis (2 to 4 N NaOH) at 100 °C for 4 to 8 h is problematic:
it destroys Cys, Ser, Thr, and Arg but does not harm Trp.
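The consequences of acid hydrolysis for a measured composition can be sketched in a few lines. This is an idealized model, not a description of a real analyzer: it assumes Trp is lost completely, that Gln and Asn are converted completely to Glu and Asp, and it ignores the partial losses of Ser, Thr, and Tyr. The function name and the test peptide are hypothetical.

```python
from collections import Counter


def acid_hydrolysate_composition(sequence):
    """Idealized residue counts recovered after 6 N HCl hydrolysis.

    Assumptions: Trp (W) is fully destroyed; Gln (Q) and Asn (N) are fully
    converted to Glu (E) and Asp (D); partial losses of Ser, Thr, and Tyr
    are ignored.
    """
    counts = Counter(sequence)
    counts.pop("W", None)                # Trp is destroyed
    counts["E"] += counts.pop("Q", 0)    # Gln is recovered as Glu
    counts["D"] += counts.pop("N", 0)    # Asn is recovered as Asp
    return +counts                       # unary plus drops zero counts


if __name__ == "__main__":
    # Hypothetical peptide, one-letter codes.
    print(acid_hydrolysate_composition("QWNES"))
```

This is why compositions from acid hydrolysates are often reported as combined Glx and Asx values, and why Trp has to be determined separately.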
Amino acid analyzer
To quantitate the amino acid residues after hydrolysis, each must be
derivatized at about 100% efficiency to a colored compound.
Pre- or post-column derivatization can be done.
[Reaction scheme: o-phthalaldehyde (OPA) reacts with an amino acid and 2-mercaptoethanol to give the derivatized amino acid.]
These can be separated using HPLC in an automated setup.
Amino acid compositions are indicative
of protein structures
Leu, Ala, Gly, Ser, Val, Glu, and Ile are the most
common amino acids.
His, Met, Cys, and Trp are the least common.
Ratios of polar to non-polar amino acids are
indicative of globular or membrane proteins.
Certain structural proteins are made of repeating
peptide structures, e.g. collagen.
Long peptides have to be broken to shorter
ones to be sequenced
Endopeptidases cleave proteins at specific sites within the chain.
[Structure: peptide backbone showing residues Rn-1 and Rn on either side of the scissile bond.]
Trypsin           Rn-1 = positively charged residues R, K; Rn ≠ P
Chymotrypsin      Rn-1 = bulky hydrophobic residues F, W, Y; Rn ≠ P
Thermolysin       Rn = I, M, F, W, Y, V; Rn-1 ≠ P
Endopeptidase V8  Rn-1 = E
Specific chemical cleavage reagents
Cyanogen Bromide  Rn-1 = M
Cleave the large protein using, e.g., trypsin, separate the fragments and
sequence all of them. (We do not know the order of the
fragments!!)
Cleave with a different reagent, e.g. cyanogen bromide, separate the
fragments and sequence all of them. Align the fragments with
overlapping sequences to get the overall sequence.
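The two digestions described above can be imitated in software. The sketch below is a minimal illustration, not a validated digestion tool: cleavage rules are written as zero-width regular expressions (split after K or R unless the next residue is P for trypsin, after E for endopeptidase V8, after M for cyanogen bromide), and the peptide is a hypothetical sequence made up for the example. It relies on `re.split` accepting empty matches, which requires Python 3.7 or later.

```python
import re

# Zero-width split points. (?<=...) inspects residue Rn-1 and
# (?!P) requires that the following residue Rn is not Pro.
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])(?!P)",
    "V8": r"(?<=E)",
    "CNBr": r"(?<=M)",
}


def digest(sequence, agent):
    """Return the fragments produced by complete cleavage with `agent`."""
    pieces = re.split(CLEAVAGE_RULES[agent], sequence)
    return [p for p in pieces if p]  # drop the empty piece at the C-terminus


if __name__ == "__main__":
    peptide = "GAKMLRPDEFKWMSR"  # hypothetical sequence for illustration
    for agent in CLEAVAGE_RULES:
        print(f"{agent:8s}", digest(peptide, agent))
```

Note that the R-P bond in the example is left intact by the trypsin rule, and that the two reagents cut at different places, which is what produces the overlaps needed to order the fragments.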
How to assemble a protein sequence
1. Write a blank line for each amino acid in the
sequence starting with the N-terminus.
2. Follow logically each clue and fill in the blanks.
3. Identify overlapping fragments and place in
sequence blanks accordingly.
4. Make sure every amino acid fits logically into
the design of the experiment.
5. Double check your work.
H3N+ - _-_-_-_-_-_-_-_-_-_-_-_-_-_ - COO-   (positions 1 to 14)

Trypsin cleaves after K or R (positively charged amino acids).
Cyanogen bromide (CNBr) cleaves after Met, i.e. M - X.

Trypsin fragments: Q-M-K, G-M-D-I-K, F-A-M-K, Y-R
CNBr fragments:    D-I-K-Q-M, K, K-F-A-M, Y-R-G-M

Position     1  2  3  4  5  6  7  8  9 10 11 12 13 14
Sequence     Y  R  G  M  D  I  K  Q  M  K  F  A  M  K
Trypsin      Y  R
                   G  M  D  I  K
                                  Q  M  K
                                           F  A  M  K
CNBr         Y  R  G  M
                         D  I  K  Q  M
                                        K  F  A  M
                                                    K
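The overlap logic of this exercise can also be checked by brute force. The sketch below is one possible approach and is only practical for a handful of fragments: it tries every ordering of the trypsin fragments (Q-M-K, G-M-D-I-K, F-A-M-K, Y-R) and keeps the orderings whose cleavage after Met reproduces the cyanogen bromide fragments (D-I-K-Q-M, K, K-F-A-M, Y-R-G-M). The function names are illustrative.

```python
from itertools import permutations


def cleave_after(sequence, residues):
    """Split `sequence` after every residue found in `residues`."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        # A cut after the last residue would create no new fragment.
        if aa in residues and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def assemble(tryptic, cnbr):
    """Return every ordering of the tryptic fragments that fits the CNBr set."""
    solutions = set()
    for order in permutations(tryptic):
        candidate = "".join(order)
        if sorted(cleave_after(candidate, "M")) == sorted(cnbr):
            solutions.add(candidate)
    return sorted(solutions)


if __name__ == "__main__":
    tryptic = ["QMK", "GMDIK", "FAMK", "YR"]
    cnbr = ["DIKQM", "K", "KFAM", "YRGM"]
    print(assemble(tryptic, cnbr))  # ['YRGMDIKQMKFAMK']
```

Only one ordering survives, so the two fragment sets together fix the sequence. For real proteins with many fragments, an overlap search would replace the exhaustive permutation loop.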
There are a variety of ways to purify peptides
All are based on the physical or chemical properties
of the protein.
Size
Charge
Solubility
Chemical specificity
Hydrophobicity/Hydrophilicity
Reverse Phase High Pressure Liquid Chromatography
is used to separate peptide fragments.
Peptide mapping: digest protein with an appropriate
agent, then separate using two dimensional paper
chromatography
Digested peptides from normal (HbA) and
sickle cell anemia (HbS) hemoglobins
β chain position  1 2 3 4 5 6 7 8
HbA               V-H-L-T-P-E-E-K
HbS               V-H-L-T-P-V-E-K
Position 6 of the β chain contains the altered
amino acid (Glu in HbA, Val in HbS).
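Locating the altered residue amounts to comparing the two peptides position by position. A minimal sketch, assuming both peptides are the same length and already aligned (true for the HbA and HbS peptides V-H-L-T-P-E-E-K and V-H-L-T-P-V-E-K); the function name is illustrative.

```python
def point_differences(reference, variant):
    """List (position, reference residue, variant residue) for each mismatch.

    Positions are 1-based, counted from the N-terminus.
    """
    if len(reference) != len(variant):
        raise ValueError("peptides must be the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


if __name__ == "__main__":
    hba = "VHLTPEEK"  # normal beta chain, residues 1 to 8
    hbs = "VHLTPVEK"  # sickle cell beta chain, residues 1 to 8
    print(point_differences(hba, hbs))  # [(6, 'E', 'V')]
```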
Red blood cells :
(a) normal
(b) sickle cell
Electrophoretic separation of
hemoglobins
Deoxyhemoglobin aggregates and deforms cell. Primary
structure changes dictate quaternary structure.
Why did the problem not die out?
Homozygotic normal: gets malaria, dies
Heterozygotic (sickle cell trait): resistant to malaria
Homozygotic sickle cell: gets sickle cell anemia, dies
Species variation in homologous proteins
The primary structures of a given protein from
related species closely resemble one another. If one
assumes, according to evolutionary theory, that
related species have evolved from a common
ancestor, it follows that each of their proteins must
have likewise evolved from the corresponding
ancestor.
A protein that is well adapted to its function, that is,
one that is not subject to significant physiological
improvement, nevertheless continues to evolve.
Neutral drift: changes not affecting function
Homologous proteins
(evolutionarily related proteins)
Compare protein sequences:
Conserved residues, i.e. invariant residues, reflect
chemical necessities.
Conservative substitutions: substitutions with similar
chemical properties, e.g. Asp for Glu, Lys for Arg, Ile for Val.
Variable regions: no requirement for chemical reactions,
etc.
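The three categories above can be turned into a simple rule for one column of a sequence alignment. The sketch below is deliberately minimal: the only similarity groups it knows are the three pairs named in the text (Asp/Glu, Lys/Arg, Ile/Val), which is an assumption made for illustration and far cruder than a real substitution matrix. The names are hypothetical.

```python
# Similarity groups taken only from the examples in the text (an assumption;
# real analyses use much richer substitution matrices).
SIMILAR_GROUPS = [set("DE"), set("KR"), set("IV")]


def classify_position(residues):
    """Classify one alignment column as invariant, conservative, or variable."""
    observed = set(residues)
    if len(observed) == 1:
        return "invariant"
    if any(observed <= group for group in SIMILAR_GROUPS):
        return "conservative"
    return "variable"


if __name__ == "__main__":
    # Each string is one alignment column across four hypothetical species.
    for column in ("GGGG", "DEDE", "KRKK", "ASTG"):
        print(column, "->", classify_position(column))
```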
Amino acid difference matrix for 26 species of cytochrome c
Number of amino acid differences between each pair of species (lower triangle; column numbers refer to the numbered rows):

                    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
 1 Man, chimp       0
 2 Rh. monkey       1  0
 3 Horse           12 11  0
 4 Donkey          11 10  1  0
 5 Cow, sheep      10  9  3  2  0
 6 Dog             11 10  6  5  3  0
 7 Gray whale      10  9  5  4  2  3  0
 8 Rabbit           9  8  6  5  4  5  2  0
 9 Kangaroo        10 11  7  8  6  7  6  6  0
10 Chicken         13 12 11 10  9 10  9  8 12  0
11 Penguin         13 12 12 11 10 10  9  8 10  2  0
12 Duck            11 10 10  9  8  8  7  6 10  3  3  0
13 Rattlesnake     14 15 22 21 20 21 19 18 21 19 20 17  0
14 Turtle          15 14 11 10  9  9  8  9 11  8  8  7 22  0
15 Bullfrog        18 17 14 13 11 12 11 11 13 11 12 11 24 10  0
16 Tuna fish       21 21 19 18 17 18 17 17 18 17 18 17 26 18 15  0
17 Screwworm fly   27 26 22 22 22 21 22 21 24 23 24 22 29 24 22 24  0
18 Silk moth       31 30 29 28 27 25 27 26 28 28 27 27 31 28 29 32 14  0
19 Wheat           43 43 46 45 45 44 44 44 47 46 46 46 46 46 48 49 45 45  0
20 Bread mold      48 47 46 46 46 46 46 46 49 47 48 46 47 49 49 48 41 47 54  0
21 Yeast           45 45 46 45 45 45 45 45 46 46 45 46 47 49 47 47 45 47 47 41  0
22 Candida k.      51 51 51 50 50 49 50 50 51 51 50 51 51 53 51 48 47 47 50 42 27  0

Average differences between groups (values shown within the table): 10.0, 5.1, 9.9, 14.3, 12.6, 18.5, 25.9, 47.0
Phylogenetic tree
Indicates the ancestral relationships among the
organisms that produced the protein.
Each branch point indicates a common ancestor.
Relative evolutionary distances between neighboring
branch points are expressed as the number of amino
acid differences per 100 residues of the protein.
PAM units, or Percentage of Accepted Mutations
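A difference matrix and a distance in differences per 100 residues can both be computed directly from aligned sequences. The sketch below counts observed differences only, with no correction for multiple substitutions at the same site, and the three short "species" sequences are hypothetical placeholders rather than real cytochrome c data.

```python
def differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def differences_per_100(seq_a, seq_b):
    """Observed differences scaled to a length of 100 residues."""
    return 100.0 * differences(seq_a, seq_b) / len(seq_a)


def difference_matrix(aligned):
    """Map each ordered pair of names to its number of differences."""
    return {
        (p, q): differences(aligned[p], aligned[q])
        for p in aligned
        for q in aligned
    }


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    aligned = {
        "species 1": "GDVEKGKKIF",
        "species 2": "GDVEKGKKIF",
        "species 3": "GDVAKGKKLF",
    }
    matrix = difference_matrix(aligned)
    print(matrix[("species 1", "species 3")])
    print(differences_per_100(aligned["species 1"], aligned["species 3"]))
```

Scaling to 100 residues is what lets proteins of different lengths be placed on the same tree.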
PAM values differ for different proteins.
Although DNA mutates at an assumed constant rate, some proteins
cannot accept mutations because the mutations kill the function
of the protein and thus are not viable.
Mutation rates appear constant in time
Although insects have shorter generation times than mammals and
many more rounds of replication, the number of mutations appears
to be independent of the number of generations but dependent upon time.
Cytochrome c amino acid differences between mammals, insects and
plants: note the similar distances.
Evolution through gene duplication
Many proteins within an organism have sequence similarities with
other proteins.
• These are called gene or protein families.
• The relatedness among members of a family can vary greatly.
• These families arise by gene duplication.
• Once duplicated, individual genes can mutate into separate genes.
• Duplicated genes may vary in their chemical properties due to
mutations.
• These duplicate genes evolve with different properties.
• Example: the globin family.
Hemoglobin:
• is an oxygen transport protein
• it must bind and release oxygen as the cells require
oxygen
Myoglobin:
• is an oxygen storage protein
• it binds oxygen tightly and releases it when oxygen
concentrations are very low
The globin family history
1. The primordial globin gene acted as an oxygen-storage
protein.
2. Duplication occurred 1.1 billion years ago, giving a
monomeric protein with lower oxygen-binding affinity.
3. This developed a tetrameric structure of two α and two β
chains, with increased oxygen transport capabilities.
4. Mammals have fetal hemoglobin with a variant β
chain, i.e. γ (α2γ2).
5. Human embryos contain another hemoglobin, ζ2ε2.
6. Primates also have a δ chain with no known unique
function.
Protein Evolution is not organismal evolution
Chimpanzee and human proteins are about 99% identical in
amino acid sequence!
However:
• Rapid divergence with few mutational changes suggests
altered control of gene expression,
• that is, control of how much, where, and when a protein is
made.