Codon Usage
1
Discovering the codon bias
2
In the year 1980
Four researchers from Lyon analyzed ALL published
mRNA sequences longer than about 50 codons.
Altogether they analyzed 90 sequences…
3
4
In the paper they first list all the genes studied and
then compute codon frequencies for various groups:
Single-stranded RNA viruses
Single-stranded DNA viruses
Double-stranded DNA viruses
Double-stranded (DS) bacteria
DS mitochondria
DS yeast
DS animals
IgGs.
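As a rough illustration of the counting step, the sketch below tallies codon frequencies for a single coding sequence. The function name `codon_frequencies` and the toy sequence are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def codon_frequencies(mrna):
    """Return the relative frequency of each codon in a coding sequence.

    Assumes the sequence starts in frame; a trailing partial codon is ignored.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    usable = len(seq) - len(seq) % 3      # drop an incomplete last codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy sequence: AUG CUG CUG UUA UAA
freqs = codon_frequencies("AUGCUGCUGUUAUAA")
```

Frequencies for a whole group of genes could be obtained the same way by pooling the codon counts of all genes in the group before dividing.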
5
They project each sequence onto a 2D plane using
correspondence analysis, so that sequences with
similar codon composition appear near each other.
6
Mostly viruses
Mostly animals & bacteria
7
Papova virus genes cluster together
8
Ig genes cluster together
9
Mammalian
genes that are
not Ig
cluster together
10
The genome hypothesis
All genes in a genome tend to
have the same coding strategy.
That is, they employ the codon
catalog similarly and show
similar choices between
synonymous codons.
Different taxa have different
coding strategies.
Richard Grantham
11
An example:
21 of the 23 leucine residues in
the E. coli outer membrane
protein II (ompA) are encoded
by the codon CUG, although 5
other codons for leucine are
available.
12
Measures of codon-usage bias
13
The relative synonymous codon usage (RSCU) was
first suggested by Sharp et al. (1986).
14
RSCU is the number of times a codon appears in a gene
divided by the number of expected occurrences under
equal codon usage.
\[
\mathrm{RSCU}_i = \frac{X_i}{\frac{1}{n}\sum_{j=1}^{n} X_j}
\]
n = number of synonymous codons (1 ≤ n ≤ 6) for the
amino acid under study, X_i = number of occurrences of
codon i.
15
\[
\mathrm{RSCU}_i = \frac{X_i}{\frac{1}{n}\sum_{j=1}^{n} X_j}
\]
If the synonymous codons of an amino acid are used with
equal frequencies, their RSCU values will equal 1.
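A minimal sketch of the RSCU calculation for one amino acid, applied to the ompA leucine example from earlier. Only the 21 CUG codons out of 23 come from the text; how the remaining two leucines are split among the other codons is an assumption made here for illustration, and the function name `rscu` is hypothetical.

```python
def rscu(counts):
    """RSCU for the synonymous codons of one amino acid.

    `counts` maps every synonymous codon (unused ones included, with 0)
    to its number of occurrences X_i in the gene.
    """
    n = len(counts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("amino acid does not occur in the gene")
    expected = total / n  # occurrences per codon under equal usage
    return {codon: x / expected for codon, x in counts.items()}


# ompA leucine: 21 of 23 residues use CUG (from the text).
# The placement of the other 2 residues is hypothetical.
leu_counts = {"CUG": 21, "CUU": 1, "CUC": 1, "CUA": 0, "UUA": 0, "UUG": 0}
leu_rscu = rscu(leu_counts)
```

With these counts the RSCU of CUG is 21 / (23/6), about 5.48, far above the value of 1 expected under equal usage. The RSCU values of an amino acid always sum to n.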
16
Gouy and Gautier (1982) and
Bennetzen and Hall (1982)
found positive correlation
between degree of codon bias
and level of gene expression.
17
One can locate “optimal”
codons which are expected to
be translated more efficiently
than others.
18
Motivation:
We now want to define the
“codon bias” of a specific gene,
relative to the optimal
codons…
19
The codon adaptation index (CAI) measures the degree
to which genes use preferred codons.
We first compile a table of RSCU values for highly
expressed genes. From this table, it is possible to identify
the codons that are most frequently used for each amino
acid. The relative adaptiveness of a codon (wi) is
computed as
\[
w_i = \frac{\mathrm{RSCU}_i}{\mathrm{RSCU}_{\max}}
\]
where RSCUmax = the RSCU value for the most
frequently used codon for an amino acid.
20
The CAI value for a gene is calculated as
the geometric mean of wi values for all the
codons used in that gene.
\[
\mathrm{CAI} = \left(\prod_{i=1}^{L} w_i\right)^{1/L}
\]
where L = number of codons.
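The two steps above (relative adaptiveness, then geometric mean) can be sketched as follows. The reference RSCU values are invented for this example and do not come from any real set of highly expressed genes; the function names are also hypothetical. The geometric mean is computed through logarithms, and a gene containing a codon with w = 0 is simply given a CAI of 0 in this sketch.

```python
import math


def relative_adaptiveness(rscu_by_aa):
    """w_i = RSCU_i / RSCU_max, computed separately for each amino acid."""
    w = {}
    for codon_rscu in rscu_by_aa.values():
        rscu_max = max(codon_rscu.values())
        for codon, value in codon_rscu.items():
            w[codon] = value / rscu_max
    return w


def cai(codons, w):
    """Geometric mean of the w values of all codons in a gene."""
    if any(w[c] == 0 for c in codons):
        return 0.0  # log(0) is undefined; the product would be 0 anyway
    log_sum = sum(math.log(w[c]) for c in codons)
    return math.exp(log_sum / len(codons))


# Hypothetical reference RSCU table (illustrative values only).
reference_rscu = {
    "Phe": {"UUC": 1.6, "UUU": 0.4},
    "Ile": {"AUC": 2.4, "AUU": 0.6, "AUA": 0.0},
}
w = relative_adaptiveness(reference_rscu)

# Toy gene with L = 4 codons: two optimal, two with w = 0.25.
gene = ["UUC", "UUU", "AUC", "AUU"]
gene_cai = cai(gene, w)  # (1 * 0.25 * 1 * 0.25) ** (1/4) = 0.5
```

A gene that uses only the most frequent codon for every amino acid gets CAI = 1, the maximum.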
21
                              Escherichia coli    Saccharomyces cerevisiae
Amino acid       Codon        High      Low       High      Low
Leucine          UUA            1%      20%         8%      25%
                 UUG            1%      15%        89%      25%
                 CUU            2%      12%         0%      12%
                 CUC            3%      11%         0%       9%
                 CUA            1%       5%         3%      15%
                 CUG           92%      37%         0%      14%
Valine           GUU           60%      27%        52%      28%
                 GUC            2%      25%        48%      19%
                 GUA           28%      16%         0%      30%
                 GUG           10%      32%         0%      23%
Isoleucine       AUU           16%      46%        42%      43%
                 AUC           84%      37%        58%      22%
                 AUA            0%      17%         0%      35%
Phenylalanine    UUU           17%      67%        10%      69%
                 UUC           83%      33%        90%      31%
22
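As a small sketch, the "High" columns of the table above can be used to pick the most used codon per amino acid in each species and to derive w values. Within one amino acid, the ratio of two usage percentages equals the ratio of the corresponding RSCU values, so w can be computed directly from the percentages. The dictionary layout and function names are assumptions made for this example.

```python
# Usage (%) in highly expressed genes, copied from the table.
HIGH_USAGE = {
    "E. coli": {
        "Leu": {"UUA": 1, "UUG": 1, "CUU": 2, "CUC": 3, "CUA": 1, "CUG": 92},
        "Val": {"GUU": 60, "GUC": 2, "GUA": 28, "GUG": 10},
        "Ile": {"AUU": 16, "AUC": 84, "AUA": 0},
        "Phe": {"UUU": 17, "UUC": 83},
    },
    "S. cerevisiae": {
        "Leu": {"UUA": 8, "UUG": 89, "CUU": 0, "CUC": 0, "CUA": 3, "CUG": 0},
        "Val": {"GUU": 52, "GUC": 48, "GUA": 0, "GUG": 0},
        "Ile": {"AUU": 42, "AUC": 58, "AUA": 0},
        "Phe": {"UUU": 10, "UUC": 90},
    },
}


def most_used_codons(usage):
    """Most frequently used codon for each amino acid."""
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def adaptiveness_from_usage(usage):
    """w_i as usage of codon i divided by usage of the top codon."""
    w = {}
    for codons in usage.values():
        top = max(codons.values())
        for codon, pct in codons.items():
            w[codon] = pct / top
    return w


ecoli_best = most_used_codons(HIGH_USAGE["E. coli"])
yeast_best = most_used_codons(HIGH_USAGE["S. cerevisiae"])
ecoli_w = adaptiveness_from_usage(HIGH_USAGE["E. coli"])
```

For these four amino acids the two species agree on valine (GUU), isoleucine (AUC) and phenylalanine (UUC) but differ on leucine: CUG in E. coli, UUG in yeast.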
Universal and species-specific
patterns of codon usage
23
Universal patterns:
Codons that contain the CG
dinucleotide are universally avoided
(low-usage codons). This is
particularly notable for the
arginine codons CGA and CGG.
24
Codon Usage
is related to
Translation
Efficiency
Toshimichi Ikemura
25
Rules determining choice of optimal
codons in unicellular organisms
__________________________________
1. tRNA availability.
2. Preference for A over G when thiolated uridine or
   5-carboxymethyl uridine is at the anticodon wobble position.
3. Preference for T and C over A when inosine is at the
   anticodon wobble position.
4. Codons of the AAN, ATN, TAN, and TTN type prefer C in the
   third codon position.
__________________________________
26
Codon usage and population size
If codon usage is affected by selection, the strength of
such selection ought to be very weak. In fact, it may be
so weak that random genetic drift would dominate the
evolutionary dynamics of codon substitution in species
with a small effective population size, whereas
selection would be the dominant force in species with
large effective population sizes.
Drosophila simulans, which has a larger effective
population size than D. melanogaster, also has a
stronger codon bias.
27