Transcript Genomics

3c
The 3 genomic paradoxes
K
N
C
1
K-value paradox: Complexity
does not correlate with
chromosome number.
Homo sapiens
46
Lysandra atlantica
250
Ophioglossum reticulatum
~1260
2
C-value paradox: Complexity
does not correlate with
genome size.
Homo sapiens: 3.4 × 10^9 bp
Allium cepa: 1.5 × 10^10 bp
Amoeba dubia: 6.8 × 10^11 bp
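To make the mismatch concrete, the short Python sketch below expresses each genome size on this slide as a multiple of the human genome. The three values are the ones quoted on the slide; the dictionary and function names are illustrative only.

```python
# Genome sizes in base pairs, as quoted on the slide.
genome_size_bp = {
    "Homo sapiens": 3.4e9,
    "Allium cepa": 1.5e10,
    "Amoeba dubia": 6.8e11,
}

def fold_vs_human(sizes):
    """Return each genome size as a multiple of the human genome size."""
    human = sizes["Homo sapiens"]
    return {species: size / human for species, size in sizes.items()}

ratios = fold_vs_human(genome_size_bp)
for species, ratio in ratios.items():
    print(f"{species}: {ratio:.1f}x the human genome")
```

With these figures the onion genome comes out at roughly 4 times, and the amoeba genome at roughly 200 times, the size of the human genome.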
3
N-value paradox: Complexity
does not correlate with gene
number.
~21,000 genes ~25,000 genes ~60,000 genes
4
Possible solutions:
5
What is complexity?
6
Solution 1 to the N-value
paradox:
Many protein-encoding genes
produce more than one protein
product (e.g., by alternative
splicing or by RNA editing).
7
RNA editing
Alternative
splicing
8
The combinatorial use of RNA
editing and alternative splicing
probably causes the human
proteome to be 5-10 times larger
than that of Drosophila or
Caenorhabditis.
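A toy Python sketch of why combinatorial processing inflates the proteome: if a gene has n exons that can each be independently included or skipped, it can yield up to 2^n transcripts. The exon names and the independence of the choices are assumptions made for illustration; real splicing is constrained and this is not a model of any particular gene.

```python
from itertools import product

def enumerate_isoforms(exons):
    """List every transcript obtainable by keeping or skipping each optional exon.

    exons: list of (name, is_optional) pairs in genomic order.
    Returns a list of tuples of exon names, with exon order preserved.
    """
    optional = [name for name, is_optional in exons if is_optional]
    isoforms = []
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        transcript = tuple(
            name for name, is_optional in exons
            if not is_optional or included[name]
        )
        isoforms.append(transcript)
    return isoforms

# Hypothetical gene: two constitutive exons (E1, E4) and two optional ones (E2, E3).
toy_gene = [("E1", False), ("E2", True), ("E3", True), ("E4", False)]
all_isoforms = enumerate_isoforms(toy_gene)
for isoform in all_isoforms:
    print("-".join(isoform))
```

Each additional optional exon doubles the count, so a modest number of alternatively spliced exons per gene can multiply the number of distinct products without adding genes.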
9
959 cells 1,031 cells
~10^8 cells
19,000 genes
13,600 genes
10
Solution 2 to the N-value
paradox:
We are counting the wrong
things, we should count other
genetic elements (e.g., small
RNAs).
11
Solution 3 to the N-value
paradox:
We should look at connectivity
rather than at nodes.
12
L. Mendoza and E. R. Alvarez-Buylla. 1998. Dynamics of the genetic regulatory
network for Arabidopsis thaliana flower morphogenesis. J. Theor. Biol. 193:307-319.
13
Solution 4 to the N-value
paradox:
The numbers provided by the
various genome annotations are
wrong!
14
Comparison of three databases
Hogenesch JB, Ching KA, Batalov S, Su AI, Walker JR, Zhou Y, Kay SA, Schultz PG, & Cooke MP. 2001. A
comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes. Cell 106:413-415.
15
Range of C-values in various eukaryotic taxa
___________________________________________________________
Taxon          Genome size range (Kb)       Ratio (highest/lowest)
___________________________________________________________
Eukaryotes     2,300 - 686,000,000          298,261
Amoebae        35,300 - 686,000,000         19,433
Fungi          8,800 - 1,470,000            167
Animals        49,000 - 139,000,000         2,837
Sponges        49,000 - 53,900              1
Molluscs       421,000 - 5,290,000          13
Crustaceans    686,000 - 22,100,000         32
Insects        98,000 - 7,350,000           75
Bony fishes    340,000 - 139,000,000        409
Amphibians     931,000 - 84,300,000         91
Reptiles       1,230,000 - 5,340,000        4
Birds          1,670,000 - 2,250,000        1
Mammals        1,700,000 - 6,700,000        4
Plants         50,000 - 307,000,000         6,140
___________________________________________________________
16
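The ratio column is simply the largest genome size in each taxon divided by the smallest, rounded to a whole number. The Python sketch below recomputes it from the ranges in the table; the variable and function names are illustrative, and the numbers are copied from the table.

```python
# (taxon, lowest genome size in Kb, highest genome size in Kb, ratio listed in the table)
C_VALUE_RANGES = [
    ("Eukaryotes", 2_300, 686_000_000, 298_261),
    ("Amoebae", 35_300, 686_000_000, 19_433),
    ("Fungi", 8_800, 1_470_000, 167),
    ("Animals", 49_000, 139_000_000, 2_837),
    ("Sponges", 49_000, 53_900, 1),
    ("Molluscs", 421_000, 5_290_000, 13),
    ("Crustaceans", 686_000, 22_100_000, 32),
    ("Insects", 98_000, 7_350_000, 75),
    ("Bony fishes", 340_000, 139_000_000, 409),
    ("Amphibians", 931_000, 84_300_000, 91),
    ("Reptiles", 1_230_000, 5_340_000, 4),
    ("Birds", 1_670_000, 2_250_000, 1),
    ("Mammals", 1_700_000, 6_700_000, 4),
    ("Plants", 50_000, 307_000_000, 6_140),
]

def highest_to_lowest(low_kb, high_kb):
    """Fold range of genome sizes within a taxon, rounded to a whole number."""
    return round(high_kb / low_kb)

computed = {taxon: highest_to_lowest(low, high) for taxon, low, high, _ in C_VALUE_RANGES}
mismatches = [taxon for taxon, _, _, listed in C_VALUE_RANGES if computed[taxon] != listed]

for taxon, ratio in computed.items():
    print(f"{taxon:<12} {ratio:>8,}")
print("Rows that disagree with the table:", mismatches)
```

Sorting `computed` by value is a quick way to see which taxa (amoebae, plants, animals as a whole) contribute most of the variation.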
If the variation in C-values is
attributed to genes, it can be due to
interspecific differences in
(1) the number of protein-coding
genes
(2) the size of proteins
(3) the size of protein-coding genes
(4) the number and sizes of genes
other than protein-coding ones.
17
The number of protein-coding genes in eukaryotes
is thought to vary over a 50-fold range. This variation is
insufficient to explain the
300,000-fold variation in
nuclear-DNA content.
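The arithmetic behind this claim can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that genome size would scale in proportion to gene number if genes were the only cause; the two fold ranges are the ones quoted on the slide.

```python
import math

gene_number_fold = 50        # fold range in protein-coding gene number (from the slide)
genome_size_fold = 300_000   # fold range in nuclear-DNA content (from the slide)

# Fold variation left over once gene number is accounted for,
# under the proportional-scaling assumption stated above.
unexplained_fold = genome_size_fold / gene_number_fold

# The same comparison on a log scale: share of the orders of magnitude
# in genome size that gene number could account for.
share_of_log_range = math.log10(gene_number_fold) / math.log10(genome_size_fold)

print(f"Unexplained variation: {unexplained_fold:,.0f}-fold")
print(f"Share of log10 range explained by gene number: {share_of_log_range:.0%}")
```

Even under this generous assumption, a 6,000-fold difference in genome size remains unaccounted for.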
18
19
The bigger the genome, the smaller the genic fraction
20
Nongenic DNA is
the sole culprit
for the C-value
paradox!
99.998%
21
Genome increase:
(1) global increases, i.e., the entire genome
or a major part of it is duplicated
(2) regional increases, i.e., a particular
sequence is multiplied to generate
repetitive DNA.
MECHANISMS FOR
GLOBAL INCREASES
IN GENOME SIZE
22
Polyploidization = the addition
of one or more complete sets of
chromosomes to the original set.
An organism with an odd number
of chromosome sets (e.g., a triploid)
usually cannot complete normal meiosis
or reproduce sexually.
Musa acuminata
23
allopolyploidy
24
Triticum urartu (AA) × Aegilops speltoides (BB) → T. turgidum (AABB)
T. turgidum (AABB) × T. tauschii (DD) → T. aestivum (AABBDD)
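The two hybridization steps can be traced with a small Python sketch that treats a genome formula as a string of paired letters. Two things here are assumptions not stated on the slide: that each hybridization is followed by doubling of the hybrid genome, and that each genome set contributes 7 chromosomes (the commonly cited base number for wheat). The function names are illustrative.

```python
BASE_NUMBER = 7  # assumption: chromosomes per genome set, not given on the slide

def gamete(genome):
    """Take one copy of each homologous pair, e.g. 'AABB' -> 'AB'."""
    return genome[::2]

def allopolyploid(parent_a, parent_b):
    """Combine gametes from two species, then double the hybrid genome."""
    hybrid = gamete(parent_a) + gamete(parent_b)
    return "".join(letter * 2 for letter in hybrid)

def chromosome_count(genome, base_number=BASE_NUMBER):
    """Somatic chromosome number implied by the genome formula."""
    return len(genome) * base_number

tetraploid = allopolyploid("AA", "BB")       # T. urartu x Ae. speltoides
hexaploid = allopolyploid(tetraploid, "DD")  # T. turgidum x T. tauschii

for name, genome in [("tetraploid", tetraploid), ("hexaploid", hexaploid)]:
    print(name, genome, chromosome_count(genome))
```

Under these assumptions the tetraploid carries 28 chromosomes and the hexaploid 42, each step adding one complete pair of genome sets from a different species.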
25
autopolyploidy
26
Following
polyploidization, a
very rapid process of
duplicate-gene loss
ensues.
27
Allohexaploid Triticum aestivum
originated about 10,000 years ago.
In this very short time, many of its
triplicated loci have been silenced.
The proportion of enzymes
produced by triplicate, duplicate,
and single loci is 57%, 25%, and
18%, respectively.
28
During evolution
autopolyploidy
&
allopolyploidy
become
cryptopolyploidy.
29
Genome sizes in 80 grass species (Poaceae).
30
31
32
It has been suggested that the
emergence of vertebrates was made
possible by two rounds of
tetraploidization.
Two cryptooctoploids?
33
Does chromosome
number increase due to
polyploidy affect the
phenotype?
Chrysanthemum species have 9 to 90
chromosomes in haploid cells.
34
54 duplicated regions.
35
2 possible explanations:
(1) the duplicated regions were formed
independently by regional duplications
occurring at different times.
(2) the duplicated regions have been
produced simultaneously by a single
tetraploidization event, followed by
genome rearrangement and loss of
many redundant duplicates.
36
50/54 duplicated regions have
maintained the same orientation
with respect to the centromere.
54 independent regional
duplications are expected to result
in ~7 triplicated regions (i.e.,
duplicates of duplicates), but none
was observed.
37
Loss of 92% of
the duplicate
genes.
Occurrence of
70-100 map
disruptions.
38
Arabidopsis thaliana: regional duplications
39
What about polysomy?
40
trisomy 21
Polysomy is usually deleterious.
41
An exception?
42
MAINTENANCE OF NONGENIC
DNA: HYPOTHESES
(1) The selectionist hypothesis.
(2) The neutralist hypothesis
(junk DNA).
(3) The intragenomic selectionist
hypothesis (selfish DNA).
(4) The nucleotypic hypothesis.
43
44
45
46
47
[Figure: log nuclear volume (y-axis) plotted against log DNA per cell (x-axis).]
Correlation between nuclear volume and nuclear DNA
content in apical meristem cells of 30 herbaceous species.
48
Regression slope = 0.826 fitted by least squares.
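For readers unfamiliar with the method, the Python sketch below shows how a least-squares slope is obtained on log-transformed data. The points are synthetic and noise-free, generated from a line with slope 0.826 and a hypothetical intercept of 1.3; they are not the measurements from the figure.

```python
def least_squares_fit(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic illustration only: points placed exactly on a line with slope 0.826.
# The intercept (1.3) is a made-up value, not taken from the source.
log_dna = [1.0, 1.25, 1.5, 1.75, 2.0]
log_volume = [1.3 + 0.826 * x for x in log_dna]

slope, intercept = least_squares_fit(log_dna, log_volume)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")

# On a log-log plot the slope is a scaling exponent: volume ~ DNA ** slope.
print(f"Doubling DNA content multiplies nuclear volume by {2 ** slope:.2f}")
```

A slope below 1 on log-log axes means nuclear volume grows less than proportionally with DNA content: with an exponent of 0.826, doubling the DNA content corresponds to roughly a 1.77-fold increase in nuclear volume.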
49
MAINTENANCE OF NONGENIC
DNA: EVIDENCE
(1) The selectionist hypothesis.
(2) The neutralist hypothesis
(junk DNA).
(3) The intragenomic selectionist
hypothesis (selfish DNA).
(4) The nucleotypic hypothesis.
50
Even whole chromosomes may
be junk.
A person needs
a Y like a fish
needs a bicycle.
51
with apologies to Irina Dunn, Australian feminist (1970).
52
Nature (2004) 431:988-993.
53