Transcript Translation

Translation
Messenger RNA Structure
• In eukaryotes, genes can be divided into exons
and introns. An exon is any sequence that ends
up in the messenger RNA. An intron is any
sequence that is transcribed but spliced out of
the primary transcript.
• As we move to the level of translation, we need
to add another set of definitions:
• The coding sequence, or CDS, is the part of the
messenger RNA sequence that is translated into
protein. The CDS starts at the first AUG (start
codon) and ends at the first in-frame stop codon.
• From the beginning of the mRNA to the start
codon is the 5’ untranslated region or 5’UTR.
• From the end of the stop codon to the end of the
mRNA is the 3’ untranslated region or 3’UTR.
• Note that the first exon contains the 5’UTR, but
it might also include part of the CDS. Or, the
5’UTR might take up the entire first exon and
also part of the second exon. That is,
intron/exon boundaries are not necessarily the
same as UTR/CDS boundaries.
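The definitions above can be turned into a small sketch that splits an mRNA into its 5’UTR, CDS, and 3’UTR. The function name split_mrna and the example sequence are made up for illustration, and the stop codon is counted as part of the CDS, since the 3’UTR is defined as starting after the end of the stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Return (five_utr, cds, three_utr), or None if there is no complete CDS."""
    start = mrna.find("AUG")              # the first AUG is taken as the start codon
    if start == -1:
        return None
    # Step through the mRNA one codon at a time, staying in the start codon's frame.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3                   # the CDS here includes the stop codon
            return mrna[:start], mrna[start:end], mrna[end:]
    return None                           # no in-frame stop codon was found

# Hypothetical example mRNA (not from the lecture).
print(split_mrna("GGCACCAUGGCUUGGUAAGCAUAAA"))
# -> ('GGCACC', 'AUGGCUUGGUAA', 'GCAUAAA')
```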
Codons and the Genetic Code
• There are 20 different amino acids coded in DNA, but only 4 different nucleotides. To
accommodate all the amino acids, each amino acid is coded for by a group of 3 nucleotides, called
a codon.
• There are 64 possible codons (4 x 4 x 4 = 64). The table that shows the correspondence between
the 64 codons and the amino acid they code for is called the genetic code.
• Most amino acids have more than one codon: we say that the genetic code is degenerate. The
consequence of this is that if you know the amino acid sequence of a protein, you can’t be sure of
the exact nucleotide sequence.
• If an amino acid has 4 possible codons, they all start with the same 2 bases, and the third base can be
anything. See proline, threonine, alanine, glycine.
• If an amino acid has only 2 codons, the first two bases are the same, and the third is either a pyrimidine (C or
U) or a purine (A or G). E.g., histidine/glutamine or asparagine/lysine.
• Three amino acids have 6 possible codons: leucine, serine, and arginine. Their codons differ in the first base
as well as the third base.
• All eukaryotic proteins start translation at AUG, the codon for methionine. This initial methionine
is removed after translation in many proteins. There are also methionines coded by AUG in the
middle of most proteins. AUG is the only methionine codon.
• In addition to AUG, bacteria use GUG and UUG as start codons. All three cause the first amino acid to be
N-formyl methionine. In the rest of the protein, these codons code for valine and leucine.
• Three codons (UAA, UAG, and UGA) are stop codons: they code for no amino acid, and every CDS
ends in one of these 3 stop codons.
• Except in the very special circumstances of the unusual amino acids selenocysteine and pyrrolysine (discussed
later), there are no internal stop codons in a CDS.
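As a rough illustration of the codon arithmetic above, the sketch below builds the standard genetic code as a lookup table and counts how many codons each amino acid has. The 64-letter string is the usual compact way of writing the standard code (one-letter amino acid symbols, bases in the order U, C, A, G, with * for stop); the variable names are arbitrary.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() yields UUU, UUC, UUA, UUG, UCU, ... so the two sequences line up.
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: number of codons per amino acid (or per stop signal).
degeneracy = Counter(GENETIC_CODE.values())

print(len(GENETIC_CODE))          # 64 codons in total
print(GENETIC_CODE["AUG"])        # M (methionine)
print(degeneracy["*"])            # 3 stop codons
print(sorted(aa for aa, n in degeneracy.items() if n == 6))   # amino acids with 6 codons
```

Because the code is degenerate, this table can only be used in one direction: codon to amino acid is a single lookup, but amino acid to codon usually gives several answers.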
The Genetic Code
• The genetic code is almost universal:
nearly all organisms use it, both
prokaryote and eukaryote.
• There are several minor variants, with
one or two codons changed, in
mitochondria and in some single celled
eukaryotes (protists).
• It is thought that these variants developed
late in evolutionary time, because they only
affect a few taxonomic groups.
Reading Frames
• Codons are groups of 3 bases. Since translation can start at any
nucleotide of the mRNA, the same region of DNA can be read in 3
ways, starting one base apart. Each of these 3 modes is a reading
frame.
• The DNA might also be read on the opposite strand, giving a total of 6
possible reading frames.
• The proper reading frame is set by the initiation AUG. Once
this codon is read, the ribosome simply moves down 3
nucleotides at a time.
• Genes occur in open reading frames (ORFs), areas where there are
no stop codons. Genes end at the first stop codon that exists in
their reading frame.
• 3 out of every 64 codons are stop codons, so long open reading
frames are rare in random, unselected DNA. Since genes are under
selection pressure, most long open reading frames contain genes.
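A quick calculation shows why long open reading frames are rare in random DNA. The sketch assumes all 64 codons are equally likely, which real genomes only approximate; p_open is a made-up helper name.

```python
def p_open(n_codons):
    """Chance that n random codons in a row contain no stop codon.

    Assumes every codon is equally likely, so each one has a 61/64 chance
    of not being a stop codon.
    """
    return (61 / 64) ** n_codons

for n in (10, 100, 300):
    print(n, p_open(n))
```

Under this assumption, a stop-free run of 100 codons turns up by chance less than 1% of the time, and a run of 300 codons less than once in a million tries.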
The figure to the left shows all 3
reading frames for a small virus.
Potential start codons (AUG) are
marked with short lines, and stop
codons are marked with full-width
lines. There are 2 ORFs, one in
reading frame 3 and the other in
reading frame 1.
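A scan like the one in the figure can be sketched in a few lines. This is a minimal illustration, not a real gene finder: it works on DNA (so the start codon is ATG), reports only ATG-to-stop stretches of at least min_codons codons, and gives positions on the reverse strand relative to the reverse complement. The function names and the example sequence are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def find_orfs(dna, min_codons=3):
    """Yield (strand, frame, start, codon_count) for ORFs in all 6 reading frames."""
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None                          # position of the current ATG, if any
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    n = (i - start) // 3          # codons before the stop codon
                    if n >= min_codons:
                        yield strand, frame + 1, start, n
                    start = None

print(list(find_orfs("CCATGAAACCCGGGTAGTT")))
# -> [('+', 3, 2, 4)]
```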
Transfer RNA
• Transfer RNA (tRNA) molecules are short RNAs that serve as
the adapters between the codons of mRNA and the amino
acids they code for.
• Transfer RNA molecules fold into a characteristic cloverleaf
pattern formed by base-pairing within the molecule. Higher
level (tertiary) structure then forms as different parts of the
cloverleaf hydrogen-bond with each other.
• Each tRNA has 3 bases that make up the anticodon. These
bases pair with the 3 bases of the codon on mRNA during
translation.
• Some tRNAs can pair with more than one codon. The first base
of the anticodon (which matches the third base of the codon) is
called the wobble position. It can form base pairs with several
different nucleotides, often using non-standard base pairings.
Also, the wobble position is often inosine, a purine that can
base pair with C, A, or U.
• Inosine is made by deaminating adenosine.
• Different species use different numbers of tRNAs; a minimum
of 31 different tRNAs is necessary. Humans have 48 different
tRNAs.
• There are multiple copies of tRNA genes, with about 500 total in
humans.
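The pairing between anticodon and codon, including wobble, can be sketched as a lookup. Only the inosine rule (pairs with C, A, or U) comes from the text above; the other wobble rows are the commonly taught wobble rules and are an assumption added here. Both sequences are written 5’ to 3’, so the first anticodon base pairs with the third codon base.

```python
# Codon third bases that each anticodon first (wobble) base can read.
# The "I" row is from the text; the other rows are the standard wobble rules (assumption).
WOBBLE = {"I": "CAU", "G": "CU", "U": "AG", "C": "G", "A": "U"}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon):
    """List the codons (5'->3') that a tRNA anticodon (5'->3') can pair with."""
    wobble, middle, last = anticodon
    first_codon_base = WATSON_CRICK[last]       # anticodon base 3 pairs with codon base 1
    second_codon_base = WATSON_CRICK[middle]    # anticodon base 2 pairs with codon base 2
    return [first_codon_base + second_codon_base + third for third in WOBBLE[wobble]]

print(codons_read("IGC"))   # -> ['GCC', 'GCA', 'GCU'], three alanine codons
print(codons_read("CAU"))   # -> ['AUG'], the methionine codon
```

This is why fewer than 61 tRNAs are needed: one tRNA with inosine at the wobble position covers three codons.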
tRNA Charging
• A set of enzymes, the aminoacyl-tRNA
synthetases, are used to “charge” (that is,
attach) the tRNA with the proper amino
acid.
• There are 20 aminoacyl-tRNA synthetases,
one for each amino acid. Each enzyme
works with all of the tRNAs that carry
that amino acid.
• The aminoacyl-tRNA synthetases recognize
their tRNAs by bases in both the anticodon
and in the acceptor stem (just below the
amino acid attachment point).
• The –COOH group of the amino acid is
attached to the 3’ –OH at the 3’ end of the
tRNA, using energy from hydrolyzing ATP
to AMP (not just to ADP).
• The bond formed is a high energy bond, and
the energy is later used to drive the
formation of the peptide bond between this
amino acid and the growing peptide chain.
Ribosomes
• Ribosomes are RNA/protein hybrids that perform
protein synthesis. About 60% of the mass of a
ribosome is RNA, with 40% protein.
• The ribosome is a ribozyme: the ribosomal RNA
catalyzes the chemical reactions of protein synthesis.
• Ribosomes consist of a large subunit and a small
subunit.
• In eukaryotes, the large subunit is called the 60S
subunit. It has 3 RNA molecules and 46-50
polypeptides. The small subunit, called 40S, contains
one RNA plus 33 polypeptides. Together, the ribosomal
subunits make up the 80S ribosome.
• The S (Svedberg) units measure sedimentation velocity in a centrifuge,
and they are not strictly additive.
• In prokaryotes, the assembled ribosome is called 70S.
It is smaller than the eukaryotic ribosome. The large
subunit (50S) has 2 RNAs (one fewer than eukaryotic)
plus 31 polypeptides, and the small subunit (30S) has 1
RNA and 21 polypeptides.
• When the ribosome is assembled, it has a binding site
for mRNA as well as 3 binding sites for tRNA, called the
E site, P site, and A site.
G Proteins
• G proteins are used for a variety of signal transmissions within the cell,
as well as between the outside of the cell and the inside. G proteins
can be thought of as molecular switches: they have two different
conformations, corresponding to “On” and “Off”.
• G proteins work by binding a molecule of GTP. When GTP is bound,
the protein is in the “on” state. The G protein slowly hydrolyzes the
GTP to GDP.
• Conversion of GTP to GDP (going from on to off) can be greatly sped up by GTPase-activating
proteins, or GAPs.
• Hydrolysis of GTP to GDP causes a change of conformation of the
protein, converting it to the “off” state.
• The energy derived from converting GTP to GDP is used to change the
conformation of the G protein. Sometimes this energy does other useful work as
well: moving the ribosome from one codon to the next, for example.
• The G protein goes back to the ON state when the GDP is exchanged
for a GTP, with the help of a guanine nucleotide exchange factor
(GEF) protein.
• We will see several uses of G proteins and their associated GEF and
GAP proteins in the translation process.
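The on/off cycle can be pictured as a two-state switch. The toy class below is only a sketch of that idea: it tracks which nucleotide is bound and ignores rates, so the speed-up from a GAP and the help from an exchange factor appear only in the comments. The class and method names are invented for illustration.

```python
class GProtein:
    """Toy two-state switch: ON while GTP is bound, OFF while GDP is bound."""

    def __init__(self):
        self.nucleotide = "GTP"          # start in the ON state

    @property
    def is_on(self):
        return self.nucleotide == "GTP"

    def hydrolyze(self):
        """GTP -> GDP: slow on its own, much faster when a GAP is present."""
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"

    def exchange(self):
        """Swap the bound GDP for a fresh GTP, the step an exchange factor assists."""
        if self.nucleotide == "GDP":
            self.nucleotide = "GTP"

g = GProtein()
print(g.is_on)      # True: GTP bound
g.hydrolyze()
print(g.is_on)      # False: GDP bound
g.exchange()
print(g.is_on)      # True again
```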
Initiation Process
• Translation has 3 phases: initiation, elongation, and termination.
• All proteins, both prokaryotic and eukaryotic, use an initiator
tRNA to insert the first amino acid, which is always methionine or
a derivative.
• In eukaryotes, the first amino acid is always methionine, whose
codon is AUG.
• In bacteria, the first amino acid is a modified version of methionine:
N-formyl methionine. Bacteria usually use AUG as the start codon,
but some proteins use GUG or UUG instead. However, all use the
same initiator tRNA.
• The initiator tRNA (tRNAiMet) is charged by the same aminoacyl-tRNA
synthetase as the regular Met-tRNA (which inserts
methionine into AUG codons in the middle of the protein).
• The initiator tRNA binds to the P site on the ribosome. All other
tRNAs bind to the A site.
• Initiation in both prokaryotes and eukaryotes uses several different
proteins, called initiation factors.
Where Does Translation Start?
• In prokaryotes, translation starts when the small
ribosomal subunit binds to a specific sequence
called a ribosome binding site (or Shine-Dalgarno
sequence), which is just upstream
from the translation start site. Ribosome
binding site sequences are complementary to a
region of the 16S ribosomal RNA.
• Many bacterial mRNAs code for multiple
proteins, each with its own translation start site.
This is an easy way to keep the amounts of
different proteins in the same biochemical
pathway relatively equal.
• An operon is a group of genes that are all
transcribed into a single messenger RNA, and then
translated separately. The mRNA from an operon
is sometimes called polycistronic: cistron is an
old word for protein-coding gene.
• In eukaryotes, protein synthesis starts at the
first AUG in the mRNA.
• This implies that eukaryotic messenger RNAs can
only be translated into a single protein.
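The difference between the two start-site rules can be sketched as follows. The motif AGGAGG is used as a stand-in for the ribosome binding site and the 4 to 10 base spacing is an assumption; real sites vary in both sequence and distance. The function names and the two-gene example mRNA are invented for illustration.

```python
import re

SHINE_DALGARNO = "AGGAGG"    # assumed consensus motif; real sites vary

def bacterial_starts(mrna, min_gap=4, max_gap=10):
    """Positions of AUGs lying min_gap..max_gap bases downstream of the motif."""
    starts = []
    for m in re.finditer(SHINE_DALGARNO, mrna):
        window_start = m.end() + min_gap
        # The window covers every AUG that begins within the allowed spacing.
        window = mrna[window_start:m.end() + max_gap + 3]
        offset = window.find("AUG")
        if offset != -1:
            starts.append(window_start + offset)
    return starts

def eukaryotic_start(mrna):
    """Position of the first AUG in the mRNA, or -1 if there is none."""
    return mrna.find("AUG")

# Hypothetical polycistronic mRNA with two short genes.
mrna = "CCAGGAGGUUCACCAUGAAAUAACCAGGAGGACUUCAUGGGGUGA"
print(bacterial_starts(mrna))    # -> [14, 36]: one start site per gene
print(eukaryotic_start(mrna))    # -> 14: only the first AUG is used
```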
Initiation in Prokaryotes
1. To start the initiation process, the
small ribosomal subunit and 2
initiation factors (IF1 and IF3) bind to
the ribosome binding site.
2. Next, the initiator tRNA binds to the
AUG start codon with the help of IF2,
the third initiation factor.
• IF2 is a G protein, in its active
conformation with GTP bound
3. Then, the large ribosomal subunit
binds.
4. The initiation factors dissociate.
• The large subunit acts as a GTPase-activating
protein (GAP), causing GTP
to be hydrolyzed to GDP. The
conformation change in IF2 causes all
of the initiation factors to dissociate.
Eukaryotic Initiation
• There are many more proteins involved in eukaryotic initiation than in prokaryotic.
• Eukaryotic initiation factors are named eIF (eukaryotic initiation factor).
1. Proteins bind to the 5’ cap and the 3’ poly-A tail. Then, these proteins bind to each other to
form a circular structure with the mRNA.
• This probably facilitates rapid re-use of ribosomes: as soon as a ribosome finishes making a protein, it
can quickly find its way back to the translation start site.
2. The small subunit, combined with the initiator tRNA and several initiation factors, binds to
the 5’ cap structure.
3. The small subunit complex then uses ATP energy to move down the mRNA, scanning for the
first AUG.
• One of the initiation factors, eIF2, is a G protein. Scanning is only possible when it is in the ON conformation (has
GTP bound).
4. When the initiator tRNA’s anticodon binds to the AUG, another initiation factor acts as a GAP
protein and hydrolyzes the GTP to GDP. This locks the initiation complex in place and
prevents further scanning.
5. The large subunit then binds. This causes the initiation factors to be released.
Elongation
• After initiation is completed, the
ribosome is assembled with the
messenger RNA, and the initiator
tRNA is in the P site. All initiation
factors have been released.
Translation then enters the
elongation phase, where the
polypeptide chain is synthesized
one amino acid at a time.
• Elongation also uses proteins,
called elongation factors. Some of
these are G proteins, which help
proofread the match between codon and anticodon.
• The process is very similar
between prokaryotes and
eukaryotes.
• Transfer RNAs, charged with the appropriate
amino acid and bound to elongation factor
EF-1α (called EF-Tu in bacteria) enter the A
site of the ribosome.
• EF-1α is a G protein, and when the anticodon
of the proper tRNA binds to the mRNA codon,
EF-1α hydrolyzes GTP to GDP. This causes a
conformational shift that moves the incoming
amino acid into close proximity with the
amino acid at the P site.
• Note that various incorrect tRNAs enter the A site
but are rejected because their anticodon doesn’t
match the mRNA codon.
• At this point, the P site has a transfer RNA with the
initial methionine attached to its 3’ end, and the A
site has a transfer RNA with the next amino acid
attached to it.
• The ribosomal RNA then catalyzes the transfer of the
polypeptide chain from the tRNA at the P site onto the
NH2 group of the amino acid at the A site. The
ribosome is actually a ribozyme: the reaction is
catalyzed by RNA, not a protein.
• At this point, the tRNA in the P site has nothing attached to it,
and the tRNA in the A site has the growing polypeptide chain
attached to it. The attachment is through the –COOH group of the last
amino acid.
• The ribosome then moves down the mRNA 3
nucleotides: this is called translocation. It occurs
because elongation factor EF2, another G protein,
hydrolyzes its GTP to GDP. This causes a conformation
change that moves the ribosome.
• At this point, the empty tRNA is in the E site and the growing
polypeptide is attached to the tRNA in the P site. The A site is
empty.
• Finally, the empty tRNA in the E site detaches from the
ribosome, leaving it ready to start the next cycle of
adding an amino acid.
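The elongation cycle can be traced as a small simulation of what sits in the E, P, and A sites. This is a bookkeeping sketch only: tRNAs are labelled by the codon they read, incorrect tRNAs and the elongation factors are left out, and the function name elongate is made up.

```python
def elongate(codons):
    """Trace E/P/A site occupancy while adding one amino acid per cycle.

    `codons` lists the codons that follow the start codon.
    Returns one (sites, chain_length) snapshot per cycle, taken right after translocation.
    """
    sites = {"E": None, "P": "tRNA(AUG)", "A": None}    # state at the end of initiation
    chain_length = 1                                    # the initial methionine
    log = []
    for codon in codons:
        sites["A"] = f"tRNA({codon})"     # a charged tRNA enters the A site
        chain_length += 1                 # peptidyl transfer: chain moves onto the A-site tRNA
        # Translocation: the ribosome moves 3 nucleotides, so P -> E and A -> P.
        sites["E"], sites["P"], sites["A"] = sites["P"], sites["A"], None
        log.append((dict(sites), chain_length))
        sites["E"] = None                 # the empty tRNA leaves the E site
    return log

for sites, length in elongate(["GCU", "UGG"]):
    print(sites, "chain length:", length)
```

After each cycle the A site is empty again, which is what lets the next charged tRNA enter.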
Termination
• The protein coding sequence on the mRNA ends in a
stop codon. There are no tRNAs that match stop
codons.
• Termination is accomplished by 2 release factor
proteins.
• When the ribosome has a stop codon under the A
site, the first release factor binds to the stop codon.
This release factor has a shape very similar to a tRNA.
• The release factor catalyzes the hydrolysis of the bond
between the tRNA and the C-terminus of the newly
synthesized polypeptide.
• The second release factor is a G protein. After the
polypeptide has been released, the second release
factor hydrolyzes its GTP, and the resulting
conformation change causes the ribosomal subunits
to separate from each other and from the mRNA.
Inventory of GTP usage in eukaryotic translation
--Initiation: when initiator tRNA binds to start
codon (plus ATP for small subunit scanning)
--Elongation: for each amino acid, GTP is used
when the proper tRNA binds to the codon, and
when the ribosome translocates.
--Termination: when the new polypeptide is
released from its tRNA.
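The inventory can be turned into a rough count for a whole protein. The sketch assumes one GTP at initiation, two per elongation cycle, and one at termination, with n - 1 elongation cycles for a protein of n amino acids because the first methionine is placed during initiation. It leaves out the ATP used for scanning, whose amount is not given here. The function names are hypothetical.

```python
def gtp_cost(n_amino_acids):
    """GTPs hydrolyzed to translate one polypeptide, following the inventory above."""
    initiation = 1                              # initiator tRNA binds the start codon
    elongation = 2 * (n_amino_acids - 1)        # tRNA delivery + translocation, per cycle
    termination = 1                             # release of the finished polypeptide
    return initiation + elongation + termination

def charging_atp_cost(n_amino_acids):
    """ATPs used to charge the tRNAs: one per amino acid, each hydrolyzed to AMP."""
    return n_amino_acids

print(gtp_cost(100))            # -> 200
print(charging_atp_cost(100))   # -> 100
```

Under these assumptions the total works out to about 2 GTP per amino acid, plus the ATP spent on tRNA charging.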
Polyribosomes
• Most mRNAs are translated by several
ribosomes following each other down the
RNA. This structure, the mRNA plus
attached ribosomes, is called a
polyribosome, or polysome.
• The circularization of the mRNA (in
eukaryotes) by proteins binding to both the
cap and the poly A tail makes it easier to
recycle the ribosomes.
• In bacteria, transcription and translation are
coupled, with the first ribosome attached to
the mRNA right after it is transcribed.
Bacterial mRNA isn’t circularized.
Protein Synthesis Inhibitors
• There are some small differences in the way
bacteria and eukaryotes synthesize proteins.
Many antibiotics work by using these
differences to inhibit protein synthesis in
bacteria while leaving human cells
unharmed.
Selenocysteine and Pyrrolysine
• These two amino acids are not part of the regular genetic code,
but they are coded in the DNA: as stop codons that get modified by
other sequences in the mRNA.
• Selenocysteine (Sec, or U) is the best known. It is found in most
organisms in all 3 domains of life, as part of the active site of some
oxidation-reduction enzymes.
• At least 25 human proteins contain Sec.
• Sec uses the UGA stop codon, and has a special tRNA with a
matching anticodon.
• tRNASec is initially charged with serine, and then enzymes
convert the serine to selenocysteine while it is attached to the tRNA.
• The selenocysteine insertion sequence (SECIS) is a hairpin loop in
mRNA
• The SECIS binds a translation elongation factor similar to EF-Tu in
bacteria or EF1A in eukaryotes.
• This elongation factor causes Sec-tRNASec to be inserted at every UGA
in the mRNA, instead of the usual chain-terminating release factor.
• Pyrrolysine has only been found in a few Archaea. It uses the UAG
stop codon and a hairpin loop similar to the SECIS element.
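The recoding of UGA can be sketched as a translation function with a switch for the SECIS element. This follows the simplified picture above, in which the presence of a SECIS causes UGA to be read as selenocysteine (one-letter symbol U). The codon table is a tiny made-up subset, just large enough for the example, and the example CDS is hypothetical.

```python
# Tiny subset of the genetic code, for illustration only.
CODE = {"AUG": "M", "GCU": "A", "UGG": "W", "UGA": "*", "UAA": "*", "UAG": "*"}

def translate(cds, has_secis=False):
    """Translate a CDS; when a SECIS element is present, read UGA as selenocysteine."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon == "UGA" and has_secis:
            protein.append("U")          # Sec-tRNASec is inserted instead of terminating
            continue
        amino_acid = CODE[codon]
        if amino_acid == "*":
            break                        # ordinary stop codon: translation ends
        protein.append(amino_acid)
    return "".join(protein)

cds = "AUGGCUUGAUGGUAA"
print(translate(cds))                    # -> MA   (UGA read as stop)
print(translate(cds, has_secis=True))    # -> MAUW (UGA read as selenocysteine)
```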
Protein Degradation
• The lifespan of proteins is primarily controlled by the process of
protein degradation. Damaged proteins are removed by the same
process.
• The proteasome is a large, multi-subunit molecular machine. All of
its subunits are proteins, with no RNA. It has about 50 subunits and
a sedimentation size of 26S (compare to the 40S small ribosomal
subunit).
• The proteasome has a cylindrical shape with 3 parts. The central
region contains multiple proteases (enzymes that digest proteins).
The end caps (or stoppers) regulate the proteasome’s activity.
• The proteasome can exist with just the central cylinder, or with 1 or 2
end caps.
• Proteins that are to be destroyed enter through the end cap.
Here, ATP is used to unfold the proteins. Also, disulfide
bridges are removed (reduced to –SH). Then, the proteins are
fed through a narrow opening into the central chamber.
Proteases cut them up into short peptides, which are then
released into the cytoplasm. In the cytoplasm, other
proteases hydrolyze them into single amino acids.
Ubiquitin and Proteolysis
• Proteins are marked for destruction by being tagged with ubiquitin
molecules.
• Ubiquitin is a 76 amino acid protein (which is quite short for a
protein).
• Ubiquitin is found in all eukaryotes, and it is one of the slowest evolving
proteins known.
• The COOH end of ubiquitin is attached to the –NH2 at the end of
the R-group of lysine.
• This creates a peptide bond, but it is not part of the main
polypeptide chain. It is thus called an isopeptide linkage.
• A set of 3 enzymes attaches the ubiquitin to the protein being
marked for destruction, using ATP energy. Then more ubiquitins
are attached to a lysine in the first ubiquitin, forming a long chain.
• E3, the last enzyme in the chain, is a large family of proteins that recognize
specific misfolded or defective proteins. E3 thus provides specificity to the
degradation process.
• The proteasome cap binds to proteins with 4 or more ubiquitins
and destroys them. An enzyme removes the ubiquitins before the
protein is destroyed.