Topic 8. Lecture 13. Generalizations emerging from past
evolution
History unfolds in time, which makes chronology of past events crucial. However, any
history also has a crucial timeless aspect, which can be described by generalizations.
An example of a generalization: complexity is rapidly lost if selection stops maintaining it.
Mycobacterium leprae is in the middle of massive genome degeneration.
A parasitic plant, Epifagus virginiana, has lost many key genes in its chloroplast genome.
Astyanax mexicanus, like many other cave animals, has degenerated eyes.
A crustacean parasite of fish, Lernaea carassii, has profoundly simplified morphology.
In a sense, every feature of living beings, both modern and ancient, is a generalization about
evolution of their ancestors.
Let us view evolutionary generalizations from three complementary perspectives:
1. Generalizations concerned with evolution at a particular level of organization of life - i. e.,
with sequences, molecules, cells, organisms, populations, and ecosystems.
2. Generalizations concerned with evolution of the diversity of life. Such generalizations
describe patterns in the diversity of life at one moment of time, as well as processes that
generate such diversity, i. e., evolution of individual lineages, birth and death of lineages,
independent and dependent evolution of different lineages, and evolution in space.
3. Generalizations concerned with evolution of complex adaptations, the most enigmatic
aspects of evolution. These generalizations describe genotypical and phenotypical
mechanisms of adaptive evolution, origin of novel adaptations, and dynamics of complexity.
Because we still lack a comprehensive theory of macroevolution, generalizations about past
evolution are often all that we have.
Level-specific generalizations:
1. Sequences
a) Mutation strongly affects sequence evolution, and selfish segments are common
b) Functionally important segments and sites of genomes usually evolve slower
c) Complex organisms have larger genomes, mostly due to noncoding sequences
2. Molecules
a) Life possesses fundamental unity
b) A particular function can be performed by very dissimilar molecules
c) Rates of evolution vary across sites of a molecule and often change with time
3. Cells
a) Networks within a cell are modular
b) Networks within a cell consist of a small number of common motifs
4. Multicellular organisms
a) Cell differentiation involves combinatorial regulation of gene expression
b) In the development of vertebrates one stage is particularly conservative
c) Body size often increases, but declines on islands
5. Populations
a) Reproduction almost always involves unicellular channels
b) Amphimixis is pervasive
6. Ecosystems
a) Natural ecosystems can be successfully invaded
Generalizations concerned with diversity of life:
1. Diversity of life at a particular moment of time
a) Every individual belongs to a population of at least ~1000 individuals
b) At any moment, life mostly consists of compact, disconnected forms
c) Genotypes are incompatible if the distance between them exceeds ~1-5%
2. Evolution of a lineage
a) Changes of a lineage are continuous, with some caveats
b) Genomes evolve at much more uniform rates than phenotypes
3. Birth and death of lineages
a) Cladogenesis is often, but not always, triggered by geographic isolation
b) Cladogenesis and extinction are extremely unfair processes
c) Overall diversity of life fluctuates, with the long-term tendency to increase
4. Independent evolution in multiple lineages
a) Evolution is predominantly divergent, but homoplasy is common in simple traits
b) Independent evolution eventually leads to speciation
5. Coevolution
a) Lineages often coevolve for a long time
b) Organisms often imitate each other to avoid being eaten
6. Diversity in space
a) Distributions of ranges of species are strongly affected by limited dispersal
b) Independent evolution at different localities is often parallel
Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
a) Evolution of both coding and non-coding sequences is important for adaptation
b) The target for strong positive selection is narrow at each moment
c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
a) Adaptations can be very general and very specific
b) Evolution is irreversible
c) Perhaps, all adaptations are imperfect
3. Origin of novelties
a) New non-coding regulatory sites, but not new genes, often appear from scratch
b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
a) Complex phenotypes evolve through adaptive intermediate stages
b) Complexity is rapidly lost, if selection stops maintaining it
c) The overall trend is for complexity to increase
Level-specific generalizations:
1. Sequences
The level of sequences is the simplest
of all levels of organization of life.
ACGATCGACGACGATCGATCGACGATCGA
Green, blue, red: targets of
no, negative, and positive selection.
Evolution of sequences is understood relatively well. The two key factors
of Darwinian evolution, mutation and selection, are its main forces.
However, this is of little help for understanding evolution at higher levels.
Whether genotypes drive evolution of phenotypes or it is the other way
around is a classical chicken-and-egg problem.
1a) Mutation strongly affects sequence evolution, and selfish segments are common
This sweeping generalization has many facets. The three most important of them are:
i) Evolution of sequences proceeds through individual changes that are supplied by the
mutation process, first of all by point mutations: single-nucleotide substitutions and short
deletions and insertions.
Sister 1: caagccag---cgtctatcatatacgcagactcggctatttacgccacgatcagcat
Sister 2: catgccagcatcgtctagcatatacacagactc-gctatttacgtcacga-cagcat
Outgroup: catgccagcatcgtgtagcatataggcagactc-gctaattacgtcacgatcagtat
Gaps, left to right: deletion (del.), insertion (in.), deletion (del.).
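The reasoning behind the del./in. labels can be made explicit. The sketch below is a minimal parsimony-style illustration, not a method taken from the lecture: it assumes that the outgroup carries the ancestral state and assigns each difference between the two sisters to one lineage. The function name `polarize` and the per-column output format are assumptions introduced here.

```python
# Aligned sequences copied from the alignment above; "-" marks a gap.
SISTER1 = "caagccag---cgtctatcatatacgcagactcggctatttacgccacgatcagcat"
SISTER2 = "catgccagcatcgtctagcatatacacagactc-gctatttacgtcacga-cagcat"
OUTGROUP = "catgccagcatcgtgtagcatataggcagactc-gctaattacgtcacgatcagtat"


def polarize(sister1, sister2, outgroup):
    """Assign each sister-sister difference to a lineage (hypothetical helper).

    Assumption: the outgroup state is ancestral, so the sister that differs
    from the outgroup is the one that changed.
    Returns (column, lineage, kind) tuples, one per differing alignment column.
    """
    events = []
    for col, (a, b, o) in enumerate(zip(sister1, sister2, outgroup)):
        if a == b:
            continue  # the sisters agree, nothing to polarize
        if a == o:
            lineage, derived = "sister2", b
        elif b == o:
            lineage, derived = "sister1", a
        else:
            # All three states differ: the outgroup cannot resolve this column.
            events.append((col, "unresolved", "unknown"))
            continue
        if derived == "-":
            kind = "deletion"
        elif o == "-":
            kind = "insertion"
        else:
            kind = "substitution"
        events.append((col, lineage, kind))
    return events


events = polarize(SISTER1, SISTER2, OUTGROUP)
indels = [e for e in events if e[2] in ("deletion", "insertion")]
for col, lineage, kind in indels:
    print(col, lineage, kind)
```

Columns are 0-based and reported one at a time, so the three-nucleotide deletion appears as three adjacent deletion columns; merging adjacent columns into a single event is left out to keep the sketch short.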
ii) Long new sequences have identifiable sources, instead of appearing from scratch.
acagcatcgtgactagctatcgagatca -> acagcatcgtgactagctatagctatcgagatca
Tandem duplication, the simplest manifestation of this pattern.
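That the longer sequence above really is the shorter one with a segment repeated in tandem can be checked directly. The following is a minimal sketch under the assumption that exactly one tandem duplication separates the two sequences; the function name `tandem_placements` is hypothetical.

```python
BEFORE = "acagcatcgtgactagctatcgagatca"
AFTER = "acagcatcgtgactagctatagctatcgagatca"


def tandem_placements(before, after):
    """List every (start, unit) such that repeating before[start:start+len(unit)]
    immediately after itself turns `before` into `after`."""
    extra = len(after) - len(before)
    if extra <= 0:
        return []
    placements = []
    for start in range(len(before) - extra + 1):
        unit = before[start:start + extra]
        candidate = before[:start + extra] + unit + before[start + extra:]
        if candidate == after:
            placements.append((start, unit))
    return placements


placements = tandem_placements(BEFORE, AFTER)
for start, unit in placements:
    print(start, unit)
```

More than one placement can be reported, because the boundaries of a tandem duplication are ambiguous whenever the flanking nucleotide matches the end of the repeated unit. All reported placements describe the same event.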
iii) Different genome regions evolve at similar overall rates. This is another piece of
theory-based evidence for evolution.
Human-mouse divergence at synonymous sites of genes on chromosomes 4 (left) and 22 (right).
One important special case of this generalization is that transposable elements (TEs)
accumulate in many genomes.
A mammalian genome is ~50% TEs, a Drosophila
genome is ~10% TEs, and bacterial genomes
usually contain very few TEs and other junk.
In mammals, individual TEs are usually fixed, i. e.
present in every genotype within a lineage. In
Drosophila, an individual TE is usually rare.
Figure panels: Mammals, Drosophila.
Often, TEs or their segments become domesticated,
i. e. start performing some function for their host.
The distribution of ages of TEs in the human genome. This is
measured by divergence from the consensus sequences and
grouped into bins that correspond to 25My of divergence.
Simple explanation for:
1a) Mutation strongly affects sequence evolution, and selfish segments are common
Qualitatively, this pattern is unavoidable - the only feasible mode of genome evolution is
fixation of an individual mutation. Selection is powerless without mutation.
IMPERFECT -> IMPENICESEGMENTRFECT
No selection can accomplish this!
Quantitatively, mutation dictates the course of evolution, but only as long as selection does
not care. For example, in coding regions deletions and insertions of lengths 1 and 2 (but not
3) are rare.
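The special status of length 3 follows from the triplet reading frame, and a short sketch makes the arithmetic concrete. The open reading frame below is a hypothetical six-codon example, and the helper names are assumptions introduced for illustration.

```python
def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def codons(seq):
    """Split a sequence into complete codons, dropping a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical open reading frame, six codons long.
orf = "ATGTCTGGGCGAGGTAAA"
plus_one = orf[:6] + "C" + orf[6:]      # 1-nucleotide insertion after codon 2
plus_three = orf[:6] + "CCC" + orf[6:]  # 3-nucleotide insertion after codon 2

print(codons(orf))
print(codons(plus_one))    # every codon downstream of the insertion changes
print(codons(plus_three))  # one extra codon, downstream codons intact
```

An insertion of one nucleotide rewrites every downstream codon, whereas an insertion of three adds a single codon and leaves the rest of the message readable, which is why selection tolerates the latter far more often.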
Not so simple from here:
Selection is more efficient in lineages represented, at any moment, by many individuals.
Thus, lineages with large populations are better protected against TEs and other junk.
The relative roles of mutation and selection are the key issue of evolution at the sequence
level. To some extent, it will be clarified by the next generalization.
1b) Functionally important segments and sites of genomes usually evolve slower
A nucleotide substitution can kill, but at another location a substitution or even a removal of
1 Mb of sequence has no evident impact on the phenotype.
Pathogenic (black) and benign (grey)
nucleotide substitutions in human
mitochondrial gene for alanine tRNA.
A typical human is heterozygous for ~50
deletions larger than 5,000 nucleotides each.
The detected deletions span a total of 267
genes.
This sweeping generalization has many facets. The three most important of them are:
i) Non-synonymous sites of coding genes evolve slower than synonymous sites
ATG TCT GGG CGA GGT AAA GGT GGC AAG GGG CTG GGT AAG GGA GGC GCC AAG CGC CAC CGG
||| ||| ||  ||| ||  ||| ||  ||| ||  ||| ||  ||  ||  ||  ||| ||  ||  ||| ||  ||
ATG TCT GGA CGA GGC AAA GGC GGC AAA GGG CTC GGA AAA GGT GGC GCT AAA CGC CAT CGT
This alignment of the first 20 codons of histone 4 genes from human and zebra-fish genomes is an
extreme case. On average, nonsynonymous substitutions accumulate ~10 times slower than
synonymous ones.
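The claim that this alignment is an extreme case can be verified by translating both sequences. The sketch below counts codons that differ between the two genes and classifies each difference as synonymous or nonsynonymous using the standard genetic code. It counts differing codons rather than substitutions per site, which is a simplification, and the variable and function names are assumptions introduced here.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons copied from the alignment above.
HUMAN = ("ATG TCT GGG CGA GGT AAA GGT GGC AAG GGG "
         "CTG GGT AAG GGA GGC GCC AAG CGC CAC CGG").split()
ZEBRAFISH = ("ATG TCT GGA CGA GGC AAA GGC GGC AAA GGG "
             "CTC GGA AAA GGT GGC GCT AAA CGC CAT CGT").split()


def translate(codon_list):
    """Translate a list of codons into a one-letter protein string."""
    return "".join(CODON_TABLE[c] for c in codon_list)


def count_codon_differences(codons_a, codons_b):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    synonymous = nonsynonymous = 0
    for a, b in zip(codons_a, codons_b):
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


print(translate(HUMAN))
print(translate(ZEBRAFISH))
print(count_codon_differences(HUMAN, ZEBRAFISH))  # (12, 0)
```

Twelve of the twenty codons differ, yet the two encoded peptides are identical, so every difference in this segment is synonymous.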
ii) functional non-coding segments evolve slower than junk segments
Alignment of four genome regions upstream of the transcription start of apolipoprotein gene. The
binding site of the key transcription factor (protein) is conserved (sequence motif) and highlighted.
Conservation is represented by a
motif logo. Functional non-coding
sequence segments can be
detected using phylogenetic
footprinting.
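The idea behind phylogenetic footprinting can be sketched in a few lines: scan an alignment for runs of columns that are identical in all species. The four sequences below are hypothetical and are not the apolipoprotein regions from the figure. The function name `conserved_runs` and the minimum run length are likewise assumptions, and real footprinting methods take the phylogeny and the neutral substitution rate into account.

```python
def conserved_runs(seqs, min_len=4):
    """Return (start, end) half-open intervals of columns identical in all sequences."""
    runs, start = [], None
    length = len(seqs[0])
    for col in range(length + 1):
        conserved = col < length and len({s[col] for s in seqs}) == 1
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Hypothetical aligned upstream regions sharing one binding site.
SITE = "TGACCT"
ALIGNMENT = [
    "ACGA" + SITE + "GATC",
    "CATG" + SITE + "TCAG",
    "GTAC" + SITE + "AGCA",
    "TGCT" + SITE + "CTGT",
]

for start, end in conserved_runs(ALIGNMENT):
    print(start, end, ALIGNMENT[0][start:end])
```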
iii) exons evolve slower than introns
Coding exons evolve much slower than introns, and this pattern can be used to determine
exon locations by genome comparisons.
Alignment of human (top) and mouse (bottom) orthologous genes. Lines connecting the genomes
show segments where their similarity is moderate (blue) or high (red). Red boxes below the alignment
show predicted exons.
Essential genes, which make up ~25% of all genes,
evolve ~1.5 times slower than non-essential genes.
Still, occasionally even once-essential genes are lost.
The estimated number of lost genes is shown
next to each branch. Approximate divergence
times are shown at the right.
Simple explanation for:
1b) Functionally important segments and sites of genomes usually evolve slower
Negative selection, which favors already-common variants and prevents changes, is much
more common than positive (Darwinian) selection, which favors initially rare variants and
promotes changes. One may wonder why beneficial mutations happen at all.
Still, positive selection does happen, and a particular site or even a segment where
selection strongly promotes changes can evolve faster than selectively neutral sites.
Mutation reigns where selection does not care, but where it cares, selection makes a very
strong impact on sequence evolution, although it can only reject or favor new mutations.
1c) Complex organisms have larger genomes, mostly due to noncoding sequences
Genomes of complex organisms carry only a slightly elevated number of protein-coding
genes. In Drosophila, ~50% of its non-coding DNA is apparently doing something, and in
mammals this fraction is ~10%.
Organisms                Minimal genome size         Number of genes   Maximal coding fraction
                         (millions of nucleotides)   (thousands)       (per cent)
parasitic bacteria       0.5-1.5                     0.5-1.5           85
free-living bacteria     2.5-7.5                     2.5-7.0           85
unicellular eukaryotes   10-30                       7-10              50-70
flowering plants         60-120                      20-30             25-40
most animals             100-200                     15-25             15-20
fishes                   400-1000                    20-30             5-10
birds                    1000-1500                   20                2-3
mammals                  2500-3500                   20                1.5-2
Simple explanation:
Complex organisms need more text to describe themselves, and the extra text comes in the
form of functional non-coding sequences (we do not really understand why). Also, complex
organisms have "bloated", instead of "lean", genomes.
Level-specific generalizations:
2. Molecules
Molecules are the lowest functional level. A molecule is a (relatively) small but fully
functional entity, and each one is incredibly complex (protein folding remains a mystery).
DNA as a functioning molecule.
tRNA, a non-coding RNA.
Hemoglobin, a protein.
Why do we ignore evolution of things like
this? Discontinuous, and its connection to
the genotype is much more complex.
2a) Life possesses fundamental unity
This unity is most striking, as far as translation machinery is concerned - genetic code,
components of ribosomes, tRNAs, aminoacyl tRNA synthetases, etc.
80S ribosome of Saccharomyces cerevisiae
70S ribosome of Escherichia coli
Many proteins not involved in translation are
also universal - almost 50% of E. coli proteins
have homologs among human proteins.
Simple explanation:
Many key features of life are probably frozen accidents, impossible to modify. This allows
us to learn something about LUCA.
2b) A particular function can be performed by very dissimilar molecules
Despite the fundamental unity of life, there are some cases when the same function, either
simple or complex, is performed by clearly non-homologous molecules, similar only to the
extent dictated by this function.
Inorganic pyrophosphatases comprise
two non-homologous families, I and II.
Archaeal and eukaryotic replicative DNA
polymerases (families A and B) and bacterial
replicative DNA polymerases (family C) perhaps
are non-homologous.
Simple explanation:
Apparently, this is a general law of nature: every complex task can be performed more or
less equally well in many rather different ways. Without common ancestry, each species
would probably use its own way to hydrolyze pyrophosphate.
2c) Rates of evolution vary across sites of a molecule and often change with time
In almost every RNA or protein molecule there are sites that evolve very conservatively and
sites that evolve as fast as junk DNA (i. e., at mutation rate) or even faster.
A typical segment of an alignment of several orthologous proteins from different species.
Distribution of amino acid replacements along the Neisseria gonorrhoeae transmembrane
porin sequence. Each dot represents one replacement. Obviously, sequence segments
exposed outside the cell evolve much faster, probably due to positive selection.
The rate of evolution within a molecule is not only heterogeneous across sites at any
moment of time, but it also can change at a particular site while the molecule evolves. This
occasionally includes the most drastic, qualitative changes - a nucleotide or amino acid
replacement which was forbidden by selection may become permitted, or the other way around.
Hs   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK 60
Ag   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK 60

Hs  61 EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP 120
Ag  61 EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP 120

Hs 121 DNEAYEMPSEEGYQDYEPEA 140
Ag 121 DNEAYEMPSEEGYQDYEPEA 140
In humans, T at the 53rd site of the protein alpha-synuclein is pathogenic. However, in the
spider monkey, normal alpha-synuclein contains this T. Probably, some other deviation of
spider monkey alpha-synuclein from its human ortholog renders T at the 53rd site harmless.
Thus, we can call this T a CPD (compensated pathogenic deviation).
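The candidate compensating deviations can be enumerated by comparing the two sequences site by site. The sketch below lists every position at which the spider monkey (Ag) protein differs from the human (Hs) one. It only enumerates differences and does not identify which of them actually compensates for T at site 53. The variable names are assumptions introduced here.

```python
# Sequences copied from the alignment above (140 residues each).
HS = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK"
    "EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)
AG = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK"
    "EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)


def deviations(reference, other):
    """Return (1-based position, reference residue, other residue) for each difference."""
    return [
        (i, r, o)
        for i, (r, o) in enumerate(zip(reference, other), start=1)
        if r != o
    ]


diffs = deviations(HS, AG)
for position, human, monkey in diffs:
    print(f"{human}{position}{monkey}")
```

Besides the deviation at site 53, the two proteins differ at five other sites, and any of these is a candidate for the compensating change.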
As many as 10% of deviations of a non-human protein from its human ortholog would be
deleterious, if placed into the human molecule individually.
CPDs are very common in tRNAs. Three
of them are present in mitochondrial
tRNA(Ser) of Ursus maritimus (polar bear).
Nucleotides corresponding to human
pathogenic mutations are shown in red;
predicted compensatory substitutions are
shown in blue; and other deviations from
the human ortholog, those unrelated to
the pathogenic mutations or their
compensations, are shown in green.
Nucleotides found in healthy humans are
shown in orange alongside the nonhuman
sequence.
At least five mechanisms of
compensation are known for pathogenic
mutations that destroy a Watson-Crick
pair in one of the four tRNA stems.
Simple explanation for:
2c) Rates of evolution vary across sites of a molecule and often change with time
Variation of rate of evolution across sites is not surprising.
Proportions of amino acid replacements in human
proteins that are opposed by selection with
coefficients s > 10^-2, 10^-2 to 10^-4, 10^-4 to 10^-5, or < 10^-5.
Variation of rate of evolution at a site may be unexpected. However, molecules are very
complex things, and their parts interact with each other in complex ways. Thus, evolution of
a molecule changes the rules of the game for each of its sites.
Level-specific generalizations:
3. Cells
The cell is not the lowest functional level (the molecule is), but it is the first living level. Thus, in
cells we encounter a staggering degree of complexity.
Unicellular green alga
Acetabularia is ~5cm tall
A ciliate Stentor
Human Hippocampal
neuron
A cell contains a large number of functional units - promoters, mRNAs, ribosomes, and
proteins. These units interact with each other, forming networks. Networks that describe the
following 3 processes are particularly important:
- transcription of genes
- physical interactions of proteins
- functional interactions of proteins
Transcriptional regulatory network of
Saccharomyces cerevisiae. Transcription
factor genes are green, regulated genes are
brown, and those with both functions are red.
Network of protein complexes in S. cerevisiae. Different functions are shown by colors. The
gray edges connect complexes that share protein components. Exemplar complexes from
each function are expanded to show individual proteins.
A standard map of biochemical pathways, representing the metabolic network of the cell.
Networks of interacting units (produced by evolution!) are the essence of cells.
But can we formulate any useful generalizations about them?
3a) Networks within a cell are modular
Modularity of a network simply means that interactions between some components are tight
and other interactions are loose.
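The contrast between tight and loose interactions can be put into numbers. One common measure, which is not used in the lecture itself, is Newman's modularity score Q: for a given partition of nodes into modules, Q compares the fraction of edges that fall within modules with the fraction expected if edges were placed at random with the same node degrees. The sketch below is a minimal pure-Python version for an undirected, unweighted network; the example network and the module labels are hypothetical.

```python
def modularity(edges, membership):
    """Newman's Q for an undirected, unweighted network without self-loops.

    edges: list of (node, node) pairs; membership: dict mapping node -> module.
    Q = sum over modules of (within_edges / m) - (degree_sum / (2 * m)) ** 2.
    """
    m = len(edges)
    within, degree_sum = {}, {}
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            within[cu] = within.get(cu, 0) + 1
    return sum(
        within.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items()
    )


# Hypothetical network: two tightly connected triangles joined by one loose link.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
TWO_MODULES = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
ONE_MODULE = {node: 1 for node in TWO_MODULES}

print(modularity(EDGES, TWO_MODULES))  # 5/14, about 0.357
print(modularity(EDGES, ONE_MODULE))   # 0.0
```

Splitting the network into its two triangles gives a clearly positive Q, whereas lumping all nodes into one module gives zero, which matches the intuitive notion of modularity described above.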
Complexes of physically interacting proteins are modules.
The genome of yeast Saccharomyces cerevisiae encodes ~7,000 proteins. Within the cell,
they form ~700 different complexes of physically interacting proteins. Such complexes are
modules, but this is not the whole story. Often, a protein can participate in several
complexes. Some proteins always stick together and form "cores". Other proteins form
"submodules" that can attach to different cores.
A protein complex, consisting of the core, 3 submodules, and other attachments.
Modularity is also pervasive in transcription and metabolic networks.
Transcription factors (boxes) separately regulate genes involved in different processes.
Modules, associated with different functions,
in the metabolic network in Escherichia coli.
Hierarchical organization of modularity in
metabolic networks.
Simple explanation for:
3a) Networks within a cell are modular
Well, nothing is going to be simple here! We only might assume that, perhaps, networks
within cells are modular because such networks are evolvable and designable, and not
because they are optimal.
3b) Networks within a cell consist of a small number of common motifs
Network motifs are patterns of interconnections that recur in many different parts of a
network. All networks within cells consist mostly of a small number of motifs that evolved
independently.
Much of the network of transcriptional interactions in Escherichia coli is composed of
repeated appearances of three motifs. Each motif has a specific function in determining
gene expression.
Feedforward loop: a transcription
factor X regulates a second
transcription factor Y, and both
jointly regulate one or more operons
Z1...Zn.
Example of a feedforward loop (L-arabinose utilization).
SIM motif: a single transcription factor, X,
regulates a set of operons Z1...Zn. X is
usually autoregulatory. All regulations are of
the same sign. No other transcription factor
regulates the operons.
Example of a SIM system (arginine
biosynthesis).
DOR motif: a set of operons Z1...Zm are each
regulated by a combination of a set of input
transcription factors, X1...Xn. DORs are
detected as dense regions of connections.
Example of a DOR (stationary phase
response).
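A motif such as the feedforward loop defined above can be searched for mechanically in a directed regulatory network. The sketch below is a minimal illustration on a hypothetical edge list and is not the procedure used to analyze the Escherichia coli network. An edge (X, Y) means "X regulates Y", and the function name `feedforward_loops` is an assumption introduced here.

```python
def feedforward_loops(edges):
    """Return sorted (X, Y, Z) triples with edges X->Y, X->Z and Y->Z."""
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # ignore autoregulation
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Hypothetical network: X regulates Y, and both regulate operons Z1 and Z2.
# W regulates Z3 alone, so it takes part in no feedforward loop.
EDGES = [("X", "Y"), ("X", "Z1"), ("Y", "Z1"),
         ("X", "Z2"), ("Y", "Z2"), ("W", "Z3")]

print(feedforward_loops(EDGES))
```

Deciding whether a motif is "common" additionally requires comparing its count with the counts in randomized networks of the same size, a step that is omitted here.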
The most common motif in metabolic networks that regulate enzyme activity: negative
feedback loop. Again, such loops evolved independently very many times in different
metabolic pathways.
Simple explanation for:
3b) Networks within a cell consist of a small number of common motifs
Apparently, there are not too many feasible solutions for each of the simple regulatory tasks
that a part of the network has to perform. We may be dealing with unique optimality here, as
far as the overall structure of regulatory interactions is considered.
Level-specific generalizations:
4. Multicellular organisms
A cell is alive, but often cells are not independent. Multicellular organisms evolved from
unicellular ones five times. Multicellular organisms are at least as complex as their constituent
cells. Obviously, cell differentiation, pattern formation, and overall properties of organisms
are all essential.
Cell differentiation
Pattern formation
Overall phenotype
4a) Cell differentiation involves combinatorial regulation of gene expression
The genome of a multicellular organism programs development of many different cell types,
although it contains only slightly more genes than the genome of a unicellular organism.
Greater complexity of multicellular organisms appears because, on average, their genes are
regulated by a much larger number of transcription factors.
a, Simple eukaryotic transcriptional unit. A simple core promoter (TATA), upstream activator
sequence (UAS) and silencer element. b, Complex metazoan transcriptional control modules
consisting of multiple clustered enhancer modules interspersed with silencer and insulator
elements.
Moreover, the total number of transcription factors encoded by any genome is not large. For
example, the genome of Drosophila encodes only ~800 transcription factors. Thus,
combinatorics of transcription factors and their binding sites is essential both for
gene-specific and tissue-specific patterns in gene expression.
An array of binding sites for 4
transcription factors in a
controlling region of a typical
gene of a multicellular organism.
Each of these factors regulates
many other genes - but in
different combinations.
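The power of combinatorial regulation is easy to underestimate, so a small calculation may help. The sketch below counts how many distinct sets of factors could, in principle, occupy the controlling region of a gene. It assumes that any set of k different factors is a possible combination, which is a deliberate oversimplification. The figure of 800 factors is the approximate Drosophila number quoted above, and k = 4 matches the array of binding sites described above.

```python
from math import comb

N_FACTORS = 800  # approximate number of Drosophila transcription factors (from the text)
SITES_PER_GENE = 4  # as in the array of binding sites described above


def regulatory_combinations(n_factors, k):
    """Number of distinct unordered sets of k different factors out of n_factors."""
    return comb(n_factors, k)


for k in range(1, SITES_PER_GENE + 1):
    print(k, regulatory_combinations(N_FACTORS, k))
```

Even with only four sites per gene, the number of possible combinations exceeds the number of genes in any genome by many orders of magnitude, so a modest set of factors can, in principle, specify a unique expression pattern for every gene.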
Simple explanation:
This generalization is not fully understood. Perhaps, combinatorial regulation evolves
because novel transcription factors are more difficult to acquire than novel binding sites for
them. Or, alternatively, such regulation may be the most efficient one feasible. Opportunism
or optimality? - we do not know.
4b) In the development of vertebrates one stage is particularly conservative
The embryonic development of all
vertebrates shows remarkable similarities at
the early - but not the earliest - stage called
the pharyngula. At this stage all vertebrates
have notochord, dorsal hollow nerve cord,
post-anal tail, and a series of paired
branchial grooves, matched on the inside by
a series of paired gill pouches. The pattern
has been known since the 19th century as von Baer's
law.
Simple explanation:
Perhaps, early stages of development are
less evolvable, because their changes affect
all subsequent stages.
4c) Body size often increases, but declines on islands
The tendency of body size to increase is known as Cope's rule, and has been observed repeatedly. Larger animals are
apparently more prone to extinction.
Body size plotted against time for species of Borophaginae (a clade of extinct carnivores).
Clearly, this pattern cannot be universal! Indeed, there are many exceptions. In fact, on
islands body size of many - but again not all - organisms declines, a pattern known as
Foster's rule.
The Pygmy Mammoth
(Mammuthus exilis) was a
dwarfed descendant of
full-sized mammoths that
lived on an island known
as Santa Rosae.
Wrangel island - the range of
another dwarfed mammoth,
extinct only ~3,500ya.
Simple explanation:
Clearly, this is a mess!
Skeleton of a Cretan Dwarf
Elephant.
Level-specific generalizations:
5. Populations
Here the complexity of our object drops again - a part may be more complex than the whole,
if we can view parts as black boxes. Populations are sets of similar individuals, and we care
only about those properties of individuals that describe them as members of such sets,
without looking under the hood.
Organism
Individual
Population of individuals
Complexity of life peaks at cells and organisms - lower and upper levels are simpler.
5a) Reproduction almost always involves unicellular channels
Why recreate big organisms from single cells every generation?
Indeed, vegetative reproduction, i. e. reproduction by many cells, is perfectly feasible even
in humans - but it almost never replaces single-cell reproduction completely.
Some other examples of vegetative reproduction
Even when reproduction is nominally vegetative - for example, a
branch of a moss becomes an independent organism - all the cells
of this branch may originate from a single apical meristemal cell.
Even mitochondria in the female germline of mammals go through
drastic bottlenecks - all mitochondria of a newborn are descendants
of just 3-4 stem maternal mitochondria. Why?
Simple explanation:
Probably, single-cell (and single-genotype) channels make selection more efficient. In fact,
this is not that simple, and other explanations are feasible. We will return to this issue later.
5b) Amphimixis is pervasive
Why is such a crazy process - alternation of syngamy and meiosis - ubiquitous?
Indeed, apomixis (asexual reproduction) is very common, but almost never represents the
only mode of reproduction. The only known exception is the bdelloid rotifers.
An obligately apomictic
bdelloid rotifer.
Simple explanation:
There is no definite explanation for the ubiquity of amphimixis. Almost 20 hypotheses have
been proposed, and 3 or 4 among them make sense. We will consider this issue later.
Level-specific generalizations:
6. Ecosystems
As you know, ecosystems consist of interacting populations.
6a) Natural ecosystems can be successfully invaded.
Purple loosestrife, Lythrum salicaria, a
very successful invader of European
origin in North America.
Elodea canadensis, a very successful
invader of North American origin in
Eurasia.
An invasion presents a paradox: why should an invader be successful within the new
environment, to which it never had a chance to adapt? Apparently, natural ecosystems have
a lot of empty niches.
Simple explanation:
There are several hypotheses but none is universally accepted. Still, it is clear that the
problem is an evolutionary one.
Quiz:
Formulate your own generalization about evolution at any level of organization of life. This
generalization does not need to be sweeping and very important - just make sure that it
makes sense.
Generalizations concerned with diversity of life:
As long as we are ready to ignore the complexity of life, evolution of its diversity is
understood reasonably well.
1. Diversity of life at a particular moment of time
1a) Every individual belongs to a population of at least ~1000 individuals
This fundamental fact is so familiar that it is often taken for granted - although it should not.
The Loch Ness monster does not exist: otherwise, there would have to be at least 1000 of them.
The same is probably true for the yeti.
Mating ball of Garter
snakes.
Simple explanation:
Population genetic theory demonstrates that a small population will soon become extinct
due to inefficient selection against new deleterious mutations. We will consider this theory.
1b) At any moment, life mostly consists of compact, disconnected forms
Indeed, at least among multicellular eukaryotes, we often encounter "good species", i. e.
compact sets of similar and compatible organisms.
Often, a form of life is not very compact phenotypically, but it is still compatible and connected
within itself, and disconnected from other forms.
Aquilegia formosa
Aquilegia pubescens
Sometimes, two compatible phenotypes are connected by only a relatively small number of
hybrids, so it is not clear whether to treat them all as one form of life or not.
Occasionally, connection exists even between incompatible genotypes.
Of course, according to the Strong Claim, every two organisms are connected, if we take
into account all organisms, present and past. Still, "intermediate" genotypes and
phenotypes have a tendency to disappear. Among modern organisms, continuous paths
within the space of genotypes are no longer than 0.01-0.1 of DNA-level differences.
Simple explanation:
There are probably several reasons behind this generalization:
(i) species might be adapted to discontinuous ecological niches,
(ii) reproductive isolation (which can arise only in sexual taxa) might create gaps between
taxa by allowing them to evolve independently,
(iii) anagenesis is only rarely coupled with continuous range expansion, and such
expansion cannot be too long - because the Earth is too small.
1c) Genotypes are incompatible if the distance between them exceeds ~1-5%
Very often, incompatible genotypes are also disconnected (again, only within modern
organisms). There are no living, fit intermediates between dog and cat, or horse and donkey.
nothing to
show
Two incompatible, disconnected genotypes - the most common situation.
Two partially compatible, disconnected genotypes (mules are viable, but sterile).
Still, compatible genotypes (left and right) may be disconnected, due to geographical
isolation (hybrid in the center was produced artificially).
Occasionally, even incompatible genotypes remain connected.
Despite this variation, there is a strong correlation between incompatibility and dissimilarity.
Incompatibility appears when the genetic distance between two genotypes exceeds 0.01-0.05.
Each point represents a pair of species of Drosophila.
No wonder that genotypes that are very dissimilar are also incompatible. However, it seems
that incompatibility kicks in surprisingly abruptly.
Mitochondrial genetic distances between the most distant hybridizable species do not differ
between birds and mammals. Such distances correspond to nuclear DNA genetic distances of
0.01-0.03 (mitochondria evolve faster).
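The genetic distances discussed here can be illustrated with the simplest possible measure, the proportion of aligned sites at which two sequences differ (the p-distance). The sketch below is an assumption-laden illustration: the two sequences are hypothetical, the p-distance ignores multiple substitutions at the same site, and the 0.01-0.05 range is the approximate empirical range quoted above, not a sharp rule.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def zone(distance, low=0.01, high=0.05):
    """Place a distance relative to the approximate range quoted in the text."""
    if distance < low:
        return "below the range"
    if distance <= high:
        return "within the range"
    return "above the range"


# Hypothetical 100-site alignment with three substitutions.
reference = "ACGT" * 25
variant = list(reference)
for position, base in ((10, "A"), (50, "T"), (90, "C")):
    variant[position] = base
variant = "".join(variant)

d = p_distance(reference, variant)
print(d, zone(d))
```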
Curiously, in mammals, within clades with invasive placenta, hybridization is possible
between more dissimilar species.
Simple explanation:
There is no need to explain, really, why incompatibility generally increases with dissimilarity.
However, the likely reason for a rapid transition from compatibility to incompatibility is
nontrivial and is known as Orr's snowball effect - we will consider it later.
Generalizations concerned with diversity of life:
2. Evolution of a lineage
2a) Changes of a lineage are continuous, with some caveats
Children are similar to parents
A rare exception: WGD (whole-genome duplication)
A rare exception: symbiogenesis
Simple explanation:
With some exceptions, long parent-offspring leaps within the space of genotypes are just
impossible: most of potential genotypes are junk, and a long leap will land you in junk.
b) Genomes evolve at much more uniform rates than phenotypes
At the level of sequences, different lineages can easily accumulate changes at rates that
vary within a factor of 1.5-2.0, but variation of rates is rarely large.
Lengths of dog, mouse, and human branches of the unrooted phylogenetic tree in numbers
of nucleotide substitutions per a synonymous (Ks) and a nonsynonymous site (Ka).
Sequences evolved almost 3 times faster on
the mouse branch than on the human
branch - because the number of generations
was much higher in the mouse branch.
In contrast, phenotypes occasionally evolve at very different rates along different branches.
Simple explanation:
No law of nature prescribes a constant rate of genome evolution. Thus, its approximate
uniformity is something of a mystery. We will consider this issue later.
Heterogeneity in rates of phenotypical evolution must be, at least partially, due to
heterogeneity of strength of Darwinian natural selection.
Generalizations concerned with diversity of life:
3. Birth and death of lineages
3a) Cladogenesis is often, but not always, triggered by geographic isolation
Geographical isolation always leads to
unlimited divergence. However, this is
not the whole story.
A lineage can also split into two even without geographic subdivision. This process is called
sympatric speciation.
For example, in a crater lake Apoyo in Nicaragua a new species of cichlids evolved
sympatrically in the course of ~10,000 years.
Amphilophus citrinellus (left) is the
ancestral form and A. zaliosus (right)
is a new species.
By now, these two species are quite different morphologically, occupy substantially different
ecological niches, and do not hybridize in nature.
Simple explanation:
If a lineage is subdivided into two isolated parts, these parts are bound to evolve
independently and, eventually, will become very dissimilar, disconnected and incompatible, because evolution is primarily divergent. This is trivial.
In contrast, cladogenesis without prior geographical isolation is a complex and fascinating
subject, to be considered later.
3b) Cladogenesis and extinction are extremely unfair processes
We already saw this many times.
Simple explanation:
Why should they be fair? Is life fair? Specific reasons for unfairness, however, are not clear,
and may be 1) "Key innovations", 2) Ecological opportunities, 3) Chance.
Questions to think about:
Can we say that a clade which diversifies faster has a selective advantage over a clade
which diversifies slower? Is it true that species from a more diverse clade are more
advanced (derived)? Is Amborella a living fossil?
3c) Overall diversity of life fluctuates, with the long-term tendency to increase
Reliable data exist only for
times since Cambrian, but the
tendency is clear.
Simple explanation:
Initially, the diversity of life was low, so it could only grow from there. However, it is not
clear why an equilibrium has not yet been reached.
Generalizations concerned with diversity of life:
4. Independent evolution in multiple lineages
4a) Evolution is predominantly divergent, but homoplasy is common in simple traits
When a complex enough genotype or phenotype is considered, divergence always
dominates. Divergence of sequences eventually reaches saturation at ~75%, but divergence
of phenotypes is unlimited.
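One way to see where the ~75% ceiling comes from is the Jukes-Cantor substitution model. This model is an assumption brought in here for illustration and is not named in the lecture: all four nucleotides are equally frequent and every replacement is equally likely. Under it, two sequences separated by d substitutions per site are expected to differ at a fraction 3/4 * (1 - exp(-4d/3)) of sites, which approaches 3/4 as d grows. A minimal sketch (the function name is hypothetical):

```python
import math

def expected_divergence(d, n_states=4):
    """Expected fraction of differing sites between two sequences.

    d is the number of substitutions per site separating the sequences.
    Assumes a Jukes-Cantor-like model: n_states equally frequent states
    and equal rates for all replacements (an illustrative assumption).
    """
    ceiling = (n_states - 1) / n_states      # 0.75 for nucleotides
    return ceiling * (1.0 - math.exp(-d / ceiling))

for d in (0.1, 0.5, 1.0, 3.0, 10.0):
    print(f"d = {d:5.1f}  expected divergence = {expected_divergence(d):.3f}")
```

At small d the observed divergence is close to d itself; at large d multiple hits at the same site hide further change, and two unrelated nucleotide sequences still match at about one site in four by chance.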
However, homoplasy is also common, as long as we consider simple traits that can only
accept a small number of states. In proteins, per site rate of parallel amino acid
replacements is above the average.
Homo    fkVmnasdfrtshnmcvadnmd
Macaca  fklmnasdfrtshnmcvqdnmd
Rattus  fklmnatdfrtshnmcvadnmd
Mus     fkvmnasdfrtshnicvadnmd
At sites (painted red) where an amino acid replacement occurred between rat and mouse, the
same replacement occurs between human and monkey with a probability that is ~5 times
higher than the probability of replacement at other sites.
In contrast, at the level of complex phenotypes, homoplasy, although widely-known, is
always superficial.
Simple explanation:
When we consider the whole multidimensional space of possibilities, homoplasy is very
improbable. Imagine two hikers wandering in a 41,000,000-dimensional forest, starting from the
same location. Thus, at the level of complex genotypes and phenotypes, homoplasy must be
forced by similar selection operating on different lineages. In contrast, when we consider a
1-dimensional subspace with only 4 or 20 states, random homoplasy is possible, and is
quite common because if an event occurred in one lineage it must be harmless.
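The contrast between a single site and the whole space can be put into numbers with a deliberately crude model. The model is an assumption made here, not something stated in the lecture: two lineages start from the same ancestral state, each undergoes exactly one replacement at every site considered, and every alternative state is equally likely. The function name is hypothetical.

```python
from fractions import Fraction

def chance_coincidence(n_states, n_sites=1):
    """Probability that two independent lineages end up identical.

    Assumes one replacement per site in each lineage, drawn uniformly
    from the (n_states - 1) alternatives to the shared ancestral state.
    """
    per_site = Fraction(1, n_states - 1)
    return per_site ** n_sites

print(chance_coincidence(4))               # one nucleotide site: 1/3
print(chance_coincidence(20))              # one amino acid site: 1/19
print(float(chance_coincidence(4, 100)))   # 100 nucleotide sites at once
```

Even under this generous model, chance agreement at a single site is common, while chance agreement across a hundred sites is already negligible, which is why homoplasy in complex traits points to similar selection and not to luck.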
4b) Independent evolution eventually leads to speciation
This is always the case, qualitatively. Quantitatively, however, the rates of speciation can
vary. On the one hand, lineages which became geographically isolated over 40 Mya may still
hybridize and produce fertile offspring.
Platanus orientalis
from Asia
Their hybrid, "London plane"
Platanus occidentalis
from North America
On the other hand, host races of insects with high degree of reproductive isolation can
appear after ~100 years of different selection.
Alfalfa (left) and clover (right) races of pea aphid Acyrthosiphon
pisum are reproductively isolated to a large degree.
Simple explanation:
Because independent evolution is mostly divergent and incompatibility increases with
dissimilarity, this pattern is unavoidable.
Generalizations concerned with diversity of life:
5. Coevolution
5a) Lineages often coevolve for a long time
Often, a host and its symbiont
or parasite have congruent
phylogenies, suggesting their
co-divergence (cospeciation).
An example of this is provided
by mealybugs and their
bacterial symbiont Tremblaya.
A spectacular example of a
long-term coevolving
association are figs and fig
wasps, which have been cospeciating
for over 60 My.
Simple explanation:
This is not surprising, when the host and the symbiont totally depend on each other.
5b) Organisms often imitate each other to avoid being eaten
Mimicry is a spectacular phenomenon. There are two kinds of mimicry.
Batesian mimicry, where the mimic resembles the successful species but does not share the
attribute that discourages predation.
Palatable viceroy Limenitis
archippus (top) mimics bitter
monarch Danaus plexippus.
Non-venomous Scarlet kingsnake
Lampropeltis triangulum (top)
mimics deadly coral snake
Micruroides euryxanthus.
Müllerian mimicry, where the mimic resembles the successful species and shares the anti-predation attribute (dangerous or unpalatable).
Heliconius erato (above), and H. melpomene (below), a pair of unpalatable Müllerian mimics
from different areas of Ecuador and Northern Peru. Within any area, the two species are
extremely accurate mimics of one another, but major geographic differences in colour
pattern have evolved within each species.
Simple explanation:
This is natural selection!
A question to think about:
Does mimicry constitute evidence for evolution?
(do not tell anybody, but here I disagree with Darwin).
Generalizations concerned with diversity of life:
6. Diversity in space
6a) Distributions of ranges of species are strongly affected by limited dispersal
This is a truly pervasive pattern.
For example, there are 13 species of
finches on the Galapagos islands,
occupying a wide variety of ecological
niches.
Simple explanation:
Limited dispersal can strongly affect the
outcomes of even slow evolution.
6b) Independent evolution at different localities is often parallel
Simple
explanation:
This is natural
selection! The
arrays of ecological
niches available for
similar organisms
at different places
tend to be similar.
Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
a) Evolution of both coding and non-coding sequences is important for adaptation
b) The target for strong positive selection is narrow at each moment
c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
a) Adaptations can be very general and very specific
b) Evolution is irreversible
c) Perhaps, all adaptations are imperfect
3. Origin of novelties
a) New non-coding regulatory sites, but not new genes, often appear from scratch
b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
a) Complex phenotypes evolve through adaptive intermediate stages
b) Complexity is rapidly lost, if selection stops maintaining it
c) The overall trend is for complexity to increase
Generalizations concerned with adaptation and complexity:
Of the two main assignments of evolutionary biology, to understand the origin of diversity
and of complexity and adaptation of life, the second one is by far the more difficult.
1. Genetical aspects of adaptive evolution
Genetics of adaptive evolution is understood better than its other aspects.
1a) Evolution of both coding and non-coding sequences is important for adaptation
........................................Allele.C...................Lys.........
........................................Allele.S...................Val.........
transcription.start..............................MetValHisLeuThrProGluGluLysSer...
catttgcttctgacacaactgtgttcactagcaacctcaaacagacaccATGGTGCACCTGACTCCTGAGGAGAAGTCT...
........................................Allele.S....................T..........
........................................Allele.C...................A...........
A recent and imperfect adaptation in humans. Alleles S and C of beta-hemoglobin, both
causing a single amino acid replacement (of the same glutamate) protect against malaria in
the heterozygous state.
In our ancestors and in many modern human populations adults are lactose-intolerant. The
ability of adults to produce lactase is due to T -> C substitution at site -13910 upstream of
the start codon of LCT locus (in Europeans) and due to a G -> C substitution at site -14010 (in
Africans).
Simple explanation:
It is only natural that adaptation may involve changes both in proteins and in regulation of
their synthesis.
1b) The target for strong positive selection is narrow at each moment
In a typical protein sites that are currently under positive selection are rare and interspersed
among numerous sites under negative selection.
Sites that were under recent positive selection are painted red in a primate seminal protein
Kallikrein 2 and in HIV-1 protein gp120. Usually, the fraction of such sites is much lower.
Positive selection in non-coding segments also appears to be relatively rare.
Simple explanation:
Natural selection is, above all, a conservative force. Also, a large target for positive
selection would imply substantial suboptimality of the phenotype, which may be lethal.
1c) Tightly related genes can perform rather different functions
Occasionally, a protein completely changes its function after relatively few changes.
MATEGDKLLGGRFVGSTDPIMEILSSSISTEQRLTEVDIQASMAYAKALEKASILTKTELEKILSGLEKISEESSKGVLV
MA+EGDKL.GGRF.GSTDPIME+L+SSI+.+QRL+EVDIQ.SMAYAKALEKA.ILTKTELEKILSGLEKISEE..SKGVV
MASEGDKLWGGRFSGSTDPIMEMLNSSIACDQRLSEVDIQGSMAYAKALEKAGILTKTELEKILSGLEKISEEWSKGVFV
MTQSDEDIQTAIERRLKELIGDIAGKLQTGRSRNEQVLTDLKLLLKSSTSVISTHLLQLIKTLVERAAIEIDIIMPGYTH
+.QSDEDI.TA.ERRLKELIGDIAGKL.TGRSRN+QV+TDLKLLLKSS.SVISTHLLQLIKTLVERAA.EID+IMPGYTH
VKQSDEDIHTANERRLKELIGDIAGKLHTGRSRNDQVVTDLKLLLKSSISVISTHLLQLIKTLVERAATEIDVIMPGYTH
LQKALPIRWSQFLLSHAVALTRDSERLGEVKKRITVLPLGSGALAGNPLEIDRELLRSELDMTSITLNSIDAISERDFVV
LQKALPIRWSQFLLSHAVAL.RDSERLGEVKKR++VLPLGSGALAGNPLEIDRELLRSELD..SI+LNS+DAISERDFVV
LQKALPIRWSQFLLSHAVALIRDSERLGEVKKRMSVLPLGSGALAGNPLEIDRELLRSELDFASISLNSMDAISERDFVV
ELISVATLLMIHLSKLAEDLIIFSTTEFGFVTLFDAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAAILMVLKGIPST
EL+SVATLLMIHLSKLAEDLIIFSTTEFGFVTL.DAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAA+LMVLKG+PST
ELLSVATLLMIHLSKLAEDLIIFSTTEFGFVTLSDAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAAVLMVLKGLPST
FSKDLQEDKEAVLDVVDTLTAVLQAATEVISTLQVNKENMEKALTPELLSTDLALYLVRKGMPIRQAQTASGKAVHLAET
++KDLQEDKEAV.DVVDTLTAVLQ.AT.VISTLQVNKENMEKALTPELLSTDLALYLVRKGMP.RQA..ASGKAVHLAET
YNKDLQEDKEAVFDVVDTLTAVLQVATGVISTLQVNKENMEKALTPELLSTDLALYLVRKGMPFRQAHVASGKAVHLAET
KGITINNLTLEDLKSISPLFASDVSQVFSVVNSVEQYTAVGGTAKAA
KGI.IN.LTLEDLKSISPLFASDVSQVF++VNSVEQYTAVGGTAK++
KGIAINKLTLEDLKSISPLFASDVSQVFNIVNSVEQYTAVGGTAKSS
Delta-crystallin (top) and argininosuccinate lyase (bottom), both of chicken, Gallus gallus.
Simple explanation:
There are often many peaks on the fitness landscape, and the closest peak may not be far
away from any point in phase space. A little editing replaces function A with function B.
Generalizations concerned with adaptation and complexity:
2. Phenotypic aspects of adaptive evolution
This is the most difficult, and the least understood, facet of evolutionary biology.
2a) Adaptations can be very general and very specific
Heart is a general adaptation,
necessary for any large active
organism.
Coloration of viceroy
would not be adaptive
without monarch
Simple explanation:
Why not?
2b) Evolution is irreversible
Reversals at individual simple traits are common, but a reversal of substantial evolution has
never been observed.
Can a hermit crab abandon its
dependence on gastropod shells?
Yes, it can! King crabs
originated from hermit
crabs.
Still, king crabs retained asymmetrical abdomens of their hermit crab ancestors.
Another example: aquatic tetrapods still need air for breathing:
Simple explanation:
Perhaps, fitness landscapes are too complex and variable to allow exact return to the
starting point of an evolutionary trajectory.
2c) Perhaps, all adaptations are imperfect
We have no good data to directly support this hypothesis, because measuring adaptation
precisely is currently impossible. However, it seems very likely because:
i) Most of fitness landscapes probably have multiple peaks, and the probability of reaching
the highest peak, after climbing strictly up from a random starting point, is very low.
ii) Often, an apparently good adaptation is achieved by a slight modification of a phenotype
that performed an unrelated function.
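Point i) can be illustrated with a toy simulation. Everything about the setup is an assumption chosen for simplicity and is not taken from the lecture: genotypes are binary strings, each genotype gets an independent random fitness (a maximally rugged landscape), and "climbing strictly up" is implemented as always moving to the fittest single-mutation neighbor.

```python
import itertools
import random

def hill_climb_success(n_sites=10, seed=0):
    """Return (number of local peaks, fraction of starts reaching the top).

    Toy model: every binary genotype of length n_sites receives an
    independent uniform random fitness. A climb moves to the fittest
    one-mutation neighbor until no neighbor is fitter.
    """
    rng = random.Random(seed)
    genotypes = list(itertools.product((0, 1), repeat=n_sites))
    fitness = {g: rng.random() for g in genotypes}

    def neighbors(g):
        # All genotypes that differ from g at exactly one site.
        for i in range(n_sites):
            yield g[:i] + (1 - g[i],) + g[i + 1:]

    def climb(g):
        while True:
            best = max(neighbors(g), key=fitness.__getitem__)
            if fitness[best] <= fitness[g]:
                return g          # g is a local peak
            g = best

    global_peak = max(genotypes, key=fitness.__getitem__)
    peaks = [g for g in genotypes
             if all(fitness[m] < fitness[g] for m in neighbors(g))]
    reached = sum(climb(g) == global_peak for g in genotypes)
    return len(peaks), reached / len(genotypes)

n_peaks, frac = hill_climb_success()
print(f"local peaks: {n_peaks}, starts reaching the highest peak: {frac:.1%}")
```

Real fitness landscapes are certainly less rugged than this one, so the simulation only shows the direction of the effect: as soon as there is more than one peak, some starting points are trapped below the highest one.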
Generalizations concerned with adaptation and complexity:
3. Origin of novelties
Origin of new functions is a particularly intriguing aspect of adaptive evolution.
3a) New non-coding regulatory sites, but not new genes, often appear from scratch
Binding sites of transcription factor Zeste are shown by boxes; the sites in the top two and
bottom two species are located differently, indicating their gains and losses.
In contrast, the origin of a protein-coding gene from scratch (i. e., entirely from a non-coding
sequence) is a very rare event - so far, only ~10 such occasions have been documented.
Simple explanation:
A typical transcription factor-binding site is small enough to "condense from chaos". In
contrast, a random segment very rarely encodes even a slightly useful protein.
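A back-of-the-envelope calculation shows why short sites can arise by chance. The numbers are assumptions picked for illustration and are not from the lecture: a binding site is treated as one exact 8-nucleotide word, all four bases are equally frequent, and the regulatory region is 1000 nucleotides long. The function name is hypothetical.

```python
def expected_random_matches(seq_len, site_len, alphabet=4):
    """Expected number of exact matches to one specific word of length
    site_len in a random sequence of length seq_len, assuming equally
    frequent, independent letters."""
    if site_len > seq_len:
        return 0.0
    positions = seq_len - site_len + 1
    return positions / alphabet ** site_len

# An 8-nucleotide site in a 1000-nucleotide region (illustrative sizes).
print(expected_random_matches(1000, 8))
# One specific 300-nucleotide coding sequence in the same region.
print(expected_random_matches(1000, 300))
```

The first number is small but far from negligible, especially because a near-match needs only one or two mutations to become a site. The second is effectively zero. Requiring one exact coding sequence is much stricter than requiring any slightly useful protein, so the second figure exaggerates the difficulty, but the gap between the two cases is the point.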
3b) Origin of phenotypic novelties is usually opportunistic and can happen fast
Even a novel function tends to appear on the basis of a pre-existing adaptation.
MATEGDKLLGGRFVGSTDPIMEILSS
MA+EGDKL.GGRF.GSTDPIME+L+S
MASEGDKLWGGRFSGSTDPIMEMLNS
Origin of a crystallin from an
enzyme.
Males of a toothed whale narwhal (Monodon
monoceros) have a 2-3 m long tusk, which is an incisor
tooth on the left side of the upper jaw.
Feathers of flightless
dinosaurs allowed birds
to evolve flight.
Skeletons of semiaquatic
mammals transitional from
land to sea in the origin of
whales.
Making a whale from a
"protohippo" took ~15 My.
Making a human from an ape
took ~5My.
Simple explanation:
Evolution is not apt to perform long jumps in the space of phenotypes, and works with pre-existing material. Apparently, under strong positive selection evolution can be fast.
Generalizations concerned with adaptation and complexity:
4. Dynamics of complexity
Here, generalizations are almost all we have.
4a) Complex phenotypes evolve through adaptive intermediate stages
Simple explanation:
Evolution cannot do it any other way, but we still do not really know how this happens.
4b) Complexity is rapidly lost, if selection stops maintaining it
Mycobacterium
leprae is in the
middle of massive
genome
degeneration
A parasitic plant
Epifagus virginiana
lost many key
genes in its
chloroplast genome
Astyanax mexicana, like
many other cave animals,
has degenerated eyes
A crustacean parasite of
fish, Lernaea carassii, has
profoundly simplified
morphology
Simple explanation:
To demolish is easier than to build.
4c) The overall trend is for complexity to increase
There is no law of nature that would force complexity to always increase - because it can
easily decline. Still, the overall trend is up.
Simple explanation:
Initially, the complexity was low, so the only direction for it to change was up. This is not the
whole story, of course.
Epilogue for generalizations regarding past evolution:
was evolution a purely natural phenomenon and does it matter?
We know that modern life is a product of evolution - simple creationism is refuted
by evidence for past evolution. However, such evidence does not necessarily refute the claim
that Supernatural Power somehow guided evolution. Do we have any reasons to think this
was the case (of course, here we need strong reasons, due to Occam's razor)?
One such reason may be an apparent inability of evolution to produce complex
adaptations - an argument going back to Darwin. However, here we are on shaky ground: our theoretical understanding of phenotypic evolution remains so poor that we simply
cannot say a priori what is impossible and what is possible, or to discern Supernatural
guidance in the course of past evolution of complex phenotypes.
It is better to address this issue at a much better understood level of sequences.
Because phenotype is mostly determined by genotype (we just do not know how exactly), in
order to guide evolution a Supernatural Power would have to guide the evolution of
genomes. Do we see any traces of this? The answer (so far) is "No".
Let us consider the last 5My of human evolution, because this short episode in the
history of life is of special interest for any anthropocentric religion. We see no traces of
supernatural intervention in changes accepted by our lineage after human-chimpanzee
divergence - only substitutions, deletions, and insertions (duplications), all suppliable by
natural mutation - and no new pieces of DNA that look as if they came from Heaven. Thus,
the null hypothesis of purely natural evolution of humans from apes must be kept.
Still, to prove that something is absent is essentially impossible. Thus, if you
believe (for any reason) that evolution of humans from apes was supernaturally guided,
study human-chimpanzee-orangutan alignments. I am not optimistic - but if you find a
change in the human lineage that cannot be explained naturally, this would be the most
important discovery in the history of all natural (or supernatural?) sciences.
What to search for in such an analysis? An obvious trace of a supernatural intervention
would be a long meaningful sequence that appeared without a plausible source in the
human lineage, something like this:
Homo sapiens    nlpirqrtgillygppsinnersrepentgtgktllagviaresrmnfi
Pan troglodytes nipirqrtgillygpp-------------gtgktllagvivresrmnfi
Pongo pygmaeus  nlpirqrtgillygpp-------------gtgktllagviaresrmnyi
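A first pass at such a search could look like the sketch below. It scans an existing three-way alignment for runs of columns where the target sequence has residues and every other sequence has gaps. The function name and the minimum run length are hypothetical choices, and the input is the mock alignment shown above.

```python
def lineage_specific_insertions(target, others, min_len=5):
    """Find runs where `target` has residues and all `others` have gaps.

    All sequences must be rows of one alignment (equal length, '-' for
    gaps). Returns a list of (start_index, inserted_segment) tuples.
    """
    if any(len(o) != len(target) for o in others):
        raise ValueError("aligned sequences must have equal length")
    runs, start = [], None
    for i, ch in enumerate(target):
        inserted = ch != "-" and all(o[i] == "-" for o in others)
        if inserted and start is None:
            start = i
        elif not inserted and start is not None:
            if i - start >= min_len:
                runs.append((start, target[start:i]))
            start = None
    if start is not None and len(target) - start >= min_len:
        runs.append((start, target[start:]))
    return runs

# The mock alignment from the text above.
homo  = "nlpirqrtgillygpp" + "sinnersrepent" + "gtgktllagviaresrmnfi"
pan   = "nipirqrtgillygpp" + "-" * 13        + "gtgktllagvivresrmnfi"
pongo = "nlpirqrtgillygpp" + "-" * 13        + "gtgktllagviaresrmnyi"

print(lineage_specific_insertions(homo, [pan, pongo]))
# prints [(16, 'sinnersrepent')]
```

Finding a human-specific insertion is only the first step. Ordinary insertions and duplications have natural sources, so each hit would still have to be checked for a plausible origin elsewhere in the genome.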
If no such sequences are found, weaker evidence would be unexplained differences
between overall patterns of sequence evolution in the human and chimpanzee lineages.
However, as long as no overt traces of supernatural intervention are evident, we have to
assume that such intervention did not happen.
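Comparing patterns of sequence evolution in the two lineages starts with assigning each difference to a lineage. A minimal sketch, assuming simple parsimony with orangutan as the outgroup: if human and chimpanzee differ at an ungapped column and one of them matches the outgroup, the change is assigned to the other. The function name is hypothetical, and the input is the mock alignment from the example above.

```python
def polarize_substitutions(ingroup_a, ingroup_b, outgroup):
    """Count substitutions specific to each ingroup lineage.

    Uses parsimony on aligned sequences of equal length: at a column
    without gaps where the ingroups differ, the one that disagrees
    with the outgroup is assumed to have changed. Columns where all
    three differ, or only the outgroup differs, are not counted.
    """
    a_specific = b_specific = 0
    for a, b, o in zip(ingroup_a, ingroup_b, outgroup):
        if "-" in (a, b, o) or a == b:
            continue
        if b == o:
            a_specific += 1
        elif a == o:
            b_specific += 1
    return a_specific, b_specific

homo  = "nlpirqrtgillygpp" + "sinnersrepent" + "gtgktllagviaresrmnfi"
pan   = "nipirqrtgillygpp" + "-" * 13        + "gtgktllagvivresrmnfi"
pongo = "nlpirqrtgillygpp" + "-" * 13        + "gtgktllagviaresrmnyi"

print(polarize_substitutions(homo, pan, pongo))
# prints (0, 2): no human-specific and two chimpanzee-specific replacements
```

With real genome-scale alignments, counts like these, broken down by type of change, are the raw material for asking whether the two lineages evolved in detectably different ways.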
Does this all really matter? I am (to the best of my knowledge) 1/2 Jewish, 23/64
Russian, 1/8 Latvian, and 1/64 French. OK, Jewish and Russian components (my father and
mother) are to some extent important - but do I really care that my great-great-great-great-grandfather (named Laurent) was a soldier of Napoleon, captured (according to our family
legend) by my great-great-great-great-grandmother after Kutuzov destroyed the invading
Great Army? And this happened less than 200 years ago - so why do people care about their
great-great-...-great-grandparents being apes (or worms, or protozoans)?
A human being does not "evolve from apes", but develops from the zygote, and
slow evolution of our remote ancestors is only marginally relevant to the mystery of the
origin of the newborn (or, occasionally, of identical twins) 40 weeks after a fusion of two
gametes. Human nature emerges, again and again, in the course of individual ontogeneses,
instead of appearing just once during the evolution of Homo sapiens. The lack of traces of
any overt, proximal Supernatural involvement in human evolution does not imply that an
individual human being, with their mind, consciousness, and, according to the views of
some, immortal soul and free will, is a purely natural phenomenon.
Also, instead of proximal causes, one can contemplate philosophical issues. If the
Material Universe is inherently stochastic, we are free to attribute an apparently random
mutation, or any other event, to the will of Providence. Moreover, evolutionary origin of
humans required a lot of conditions, from the right values of physical constants to the
suitable distance between the Earth and the Sun and the timely meteorite strike which
cleared the Earth of dinosaurs. Can we interpret these conditions as a work of Providence?
This is a possibility, although an alternative, called Anthropic Principle, also exists - the
conditions were right for our origin because otherwise we would not be here to ponder such
questions. However, philosophical questions, such that neither answer can ever be proven
on the basis of laws of nature, do not belong to the domain of natural sciences.
Quiz:
Imagine that for every amino acid sequence we completely know the properties of the
corresponding protein. Consider new insights that can be provided by this knowledge for
any three evolutionary generalizations.