Evolution
What causes evolution?
Speciation & hybridization.
Uncovering evolutionary history.
The four forces of evolution:
• Mutation -- spontaneous changes in the DNA of gametes.
Mutations are the result of mistakes in DNA replication,
exposure to UV or to some chemicals (mutagens) and other
causes. Prerequisite to all other evolution.
• Natural Selection -- genetically-based differences in survival
or reproduction that lead to genetic change in a population.
• Gene flow -- movement of genes between populations. In
plants this can be accomplished by pollen or seed dispersal.
• Genetic drift -- random changes in gene frequency. This is
very important in small populations.
All these plants are the same species: Brassica oleracea
Mutation: Generation of new alleles:
Point mutations
(changing one base to another, e.g., C-->T)
• unrepaired DNA damage, e.g. from UV-light, chemicals
• uncorrected copying errors: in any system, error-free
transmission of information is a theoretical impossibility
• Mutations that are transmitted into gametes are
evolutionarily important
Sickle cell anemia is an example of a point
mutation causing a big change in phenotype.
Point mutations are only one of many kinds of chance
genetic change:
• Indels
– (insertions/deletions)
– Cause frame-shifts, & usually premature ‘stops’
• Chromosomal mutations
– Inversions, translocations, deletions
• Gene duplication
– May lead to new functions
• Polyploidy
– May lead to new species in one step
– Very common in plants
Q: What are the consequences of mutations for an
individual’s ability to survive and reproduce?
A: Most mutations have no effect or almost no effect.
• Why?
1. Most of the genome seems to be ‘junk’ -- at least it doesn’t code
for proteins. We still may have a lot to learn here but the empirical
evidence regarding mutations’ effects supports this view.
2. Many mutations within protein-coding genes don’t change the
amino acid specified, i.e., there is redundancy in the genetic code.
For example, 6 different codons specify the amino acid leucine.
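The redundancy of the genetic code can be checked directly. The sketch below is only an illustration: it builds the standard genetic code from its usual compact 64-letter listing (bases in the order T, C, A, G), lists the codons for leucine, and counts how many single-base changes to a codon leave the amino acid unchanged. The function names are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_for(amino_acid):
    """All codons that specify the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

def synonymous_neighbors(codon):
    """How many of the 9 possible point mutations keep the same amino acid."""
    same = 0
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                same += 1
    return same

leucine_codons = codons_for("L")
print(leucine_codons)               # the codons for leucine
print(synonymous_neighbors("CTT"))  # silent point mutations of one leucine codon
```

For the leucine codon CTT, every change at the third position is silent, which is one reason many point mutations in coding regions have no effect on the protein.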
This distribution of the fitness effects shows:
• most mutations have no effect (are neutral)
• the remainder are usually deleterious;
the relatively high freq. of lethals is due to nonsense
mutations -- those that cause a premature ‘stop’ in protein
synthesis.
• Very few mutations are beneficial,
so conservation of genetic variation is extremely important.
[Figure: distribution of mutational effects; x-axis: Fitness]
Gene flow tends to homogenize populations.
Rates of gene flow depend on the spatial
arrangement of populations.
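The homogenizing effect of gene flow can be sketched with a deliberately simple model that is not from the slides: two populations that swap a fraction m of their genes every generation, with no mutation, selection or drift. The function names and the starting frequencies are illustrative assumptions.

```python
def migrate(p1, p2, m):
    """One generation of symmetric migration between two populations.

    p1, p2 are the frequencies of one allele; m is the fraction of each
    population replaced by migrants from the other (an assumed parameter).
    """
    return (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1

def simulate_gene_flow(p1, p2, m, generations):
    """Allele frequencies in both populations, generation by generation."""
    history = [(p1, p2)]
    for _ in range(generations):
        p1, p2 = migrate(p1, p2, m)
        history.append((p1, p2))
    return history

history = simulate_gene_flow(p1=0.9, p2=0.1, m=0.05, generations=50)
for generation in (0, 10, 25, 50):
    a, b = history[generation]
    print(generation, round(a, 3), round(b, 3))
```

In this model the difference between the two populations shrinks by a factor of (1 - 2m) each generation, so higher migration rates homogenize the populations faster, while the average frequency across both stays the same.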
More models of gene flow
Founder effect: Restricted gene flow, together with genetic drift, is
responsible for the limited genetic variation on islands,
relative to mainland populations.
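Why drift matters most in small populations can be illustrated with a small simulation. This is a sketch using the Wright-Fisher model, which is an assumption brought in for illustration and is not named on the slides; the population sizes, replicate count and seed are arbitrary choices.

```python
import random

def wright_fisher(pop_size, p0, generations, rng):
    """Allele-frequency trajectory in a diploid population of pop_size individuals."""
    copies = 2 * pop_size
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Every gene copy in the next generation is drawn at random
        # from the current gene pool: pure chance, no selection.
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory

def fraction_fixed_or_lost(pop_size, replicates=20, generations=100, seed=1):
    """Share of replicate populations in which the allele was fixed or lost."""
    rng = random.Random(seed)
    ended = 0
    for _ in range(replicates):
        final = wright_fisher(pop_size, 0.5, generations, rng)[-1]
        if final in (0.0, 1.0):
            ended += 1
    return ended / replicates

small = fraction_fixed_or_lost(pop_size=10)    # e.g. a small island population
large = fraction_fixed_or_lost(pop_size=500)   # e.g. a mainland population
print("small populations that lost variation:", small)
print("large populations that lost variation:", large)
```

Starting from the same allele frequency, the small populations typically lose one allele or the other within a few dozen generations, while the large populations keep both.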
Convergence: similar features in unrelated
organisms due to evolution of traits that
“work” in similar environments
• spiny succulent growth habit in deserts
• sclerophyllous (tough, sclerenchyma-rich) leaves in many families that
live in dry habitats
• similar flower sizes, shapes & colors for
attracting pollinators have evolved in many
plant groups.
• low, prostrate growth form of high-altitude plants
Convergent structures in the ocotillo (left)
from the American Southwest, and in
Alluaudia (right) from Madagascar.
This, believe it or not, is a South African
member of the milkweed family.
alpine clover and forget-me-nots -- convergence in growth habit
Nectar feeders have converged on
this hovering long-tongued
morphology.
Another animal example of
convergence.
Hybridization (between species)
• Well -- what is a species, anyway?
• Most species were described by their morphology.
• In vertebrates, morphological discontinuities
generally correspond to fertility barriers, in line with the
Biological Species Concept (BSC).
• In plants, many named species can hybridize.
• Hybridization can lead to:
– Homogenization of divergent ‘species’
– Production of new species; hybrids are better
than parents and/or can’t mate with parents
– If hybrids are unfit and parents waste resources
making them, then selection could act to
minimize hybridization.
Most dandelions are asexual.
So the biological species
concept doesn’t apply.
How can you name species
depending on who can mate
with whom when the
organisms do not mate at all?!
These two Calochortus have been named as separate species.
But they are interfertile -- should we combine them as one
species?
Their ranges do not overlap so the chance of hybridization in
Nature is very remote.
These milkweeds hybridize in the central plains.
[Figure labels: A. syriaca, hybrid, A. speciosa]
Scarlet and Black
oaks can hybridize
and inhabit the
same range -- but
they have different
microhabitat
preferences and
so hybridization is
rare.
These pines can also hybridize, but they shed their pollen at
different times of the season.
Speciation by hybridization
Hybridization often
shows how difficult it
is to apply the BSC
to plants. The hybrid
in this case is a new
species. The
rearrangements of
its chromosomes
make it +/- infertile
with either parent.
Tragopogon miscellus is a new species formed in North America by
hybridization between two European Tragopogon species, T. dubius and
T. pratensis, that were introduced about 150 years ago.
T. miscellus is a polyploid formed by the union of unreduced
gametes, i.e. 2n x 2n => 4n
(Normally n x n => 2n)
As the climate
becomes drier the
desert splits the
range of this
hypothetical tree
species. This
reduces gene flow
between the now
isolated populations
and sets the stage
for speciation.
Geographical isolation
leads to genetic differences
among the different
populations.
Theorem: geographic
isolation is necessary for
new species to arise.
Counter-theorem: strong
natural selection or big
mutations can cause
divergence within
populations.
Taxonomy vs. Systematics
• Taxonomy
– discovering
– describing
– naming
– classifying
• Systematics
– Figuring out the evolutionary
relationships of species to each other.
Taxonomy vs. Systematics
• Taxonomy products are
– descriptions of new species in journals
– Keys
– Entries in floras
e.g., Flora of Missouri lists all the species
found in MO and has keys for identifying
plants.
• Systematics produces trees that attempt to
summarize the evolutionary history of a group.
– Usually done with DNA sequences, these
days.
Phylogenetic trees have more
information than a list of names.
E.g., the nine animal phyla are hypothesized to have the
relationships shown at left.
Modern taxonomic groups generally correspond to clades on a
phylogenetic tree (=cladogram)
plant taxonomy
taxon - any group at any rank
species
genus
family
order
class
division (phylum)
kingdom
discovering
describing
naming
classifying
2 basic rules of naming organisms:
- each species name must be a binomial
- all scientific names must be in Latin or
be “Latinized”
" ironweed "
" ironweed "
Acer rubrum - red maple
Genus (Acer)
- always capitalized
species (rubrum)
- not capitalized
- either italicized or underlined
Acer rubrum :
the scientific name
the Latin name
the genus & species name
Carolus Linnaeus (Carl Linnaeus, later ennobled as Carl von Linné)
- wrote Species Plantarum in 1753
- first consistent use of binomial nomenclature
- named 7,300 species
My academic
lineage can be traced
back to Linnaeus …
… and now,
so can yours.
Systematic relationships are illustrated on a
phylogenetic tree
This tree is not cladistic
either. Extant groups
seem to give rise to
other extant groups.
For example, human ancestors
are not the apes we know now.
[Tree figure: Gorillas, Chimps and Humans at the present; time axis; node marked as the common ancestor of chimps and humans]
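We need fossils to look back in time for
morphological traits --
[Tree figure: Gorillas, Chimps and Humans at the present; fossil taxa Neanderthal and Australopithecus placed further back along the time axis]
-- and even then we’re not sure where to
put the fossils on the tree.
[Tree figure: Gorillas, Chimps and Humans at the present; Neanderthal placed back along the time axis]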
Angiosperm Phylogeny Group tree.
“Dicots” are not a monophyletic group.
There are many kinds of
information that can be used to
estimate a phylogeny.
• Types of data
– Crossability
• Uses the ‘Biological
Species Concept’
– Cytology
• Chromosome number
• Chromosome features
• Pairing in hybrids
– Morphology
• Continuous traits
• Meristic (countable)
traits
– Molecular data
• Secondary chemicals
• Proteins
• DNA
Kinds of DNA data
• DNA/DNA hybridization
– How well do 2 spp. DNAs match as revealed by binding
kinetics
• Comparison of “Bands on a gel”, not genes per se
– RAPD, ISSR
• Genetic distance estimates from:
– Allele frequencies at many loci (isozymes, SSR)
– DNA sequences, considered as a whole
• DNA sequences, considered site-by-site
– Parsimony; the simplest pathway is probably correct
– Maximum likelihood: specify a model for evolution, fit
that model to the data & use that model to make the tree.
Distance-based approaches begin with
comparing each taxon to every other taxon…
      Sp1   Sp2   Sp3   Sp4   Sp5
Sp1   0     d12   d13   d14   d15
Sp2         0     d23   d24   d25
Sp3               0     d34   d35
Sp4                     0     d45
Sp5                           0
…to estimate a
“distance matrix”
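As a rough sketch of how such a matrix might be filled in from DNA data, the code below compares every taxon with every other taxon using the proportion of sites that differ (the "p-distance"). The five short sequences are invented for illustration, and p-distance is only one of many possible distance measures.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)

def distance_matrix(sequences):
    """Pairwise distances, keyed by (taxon, taxon)."""
    names = list(sequences)
    return {
        (x, y): p_distance(sequences[x], sequences[y])
        for x in names
        for y in names
    }

# Hypothetical aligned sequences, made up for this example.
toy_alignment = {
    "Sp1": "ACGTACGTAC",
    "Sp2": "ACGTACGTAT",
    "Sp3": "ACGAACGGAT",
    "Sp4": "TCGAACGGAT",
    "Sp5": "TCGAACGGTT",
}

D = distance_matrix(toy_alignment)
for x in toy_alignment:
    print(x, [round(D[(x, y)], 1) for y in toy_alignment])
```

The diagonal is 0 (each taxon compared with itself) and the matrix is symmetric, which is why only the upper triangle needs to be written out.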
Distances are then ‘clustered’ to
estimate a phylogenetic tree.
• Types of clustering algorithms
– UPGMA
– Fitch-Margoliash
– Neighbor-Joining
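To give a feel for what "clustering" means here, the following is a minimal sketch of UPGMA, the simplest of the three: repeatedly join the two closest clusters, then recompute distances to the new cluster as a size-weighted average. It returns only the branching order as nested tuples (no branch lengths), and the four taxa and their distances are invented for the example.

```python
from itertools import combinations

def upgma(labels, dist):
    """Cluster taxa by UPGMA.

    dist maps frozenset({a, b}) to the distance between taxa a and b.
    Returns the tree as nested tuples.
    """
    sizes = {label: 1 for label in labels}   # cluster -> number of taxa in it
    d = dict(dist)                           # working copy of the distances
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        # Distance from the new cluster to each other cluster is the
        # average over all member taxa (weighted by cluster size).
        for c in sizes:
            if c != a and c != b:
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
    return next(iter(sizes))

# Hypothetical distances among four taxa.
toy_distances = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 4,
    frozenset(("B", "C")): 4,
    frozenset(("A", "D")): 6,
    frozenset(("B", "D")): 6,
    frozenset(("C", "D")): 6,
}

tree = upgma(["A", "B", "C", "D"], toy_distances)
print(tree)
```

A and B are joined first because they are closest, then C joins them, and D joins last. Fitch-Margoliash and Neighbor-Joining use the same kind of input but different rules for choosing and scoring joins.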
Many kinds of data are appropriate for the
distance matrix, then clustering approach.
– Crossability
• Uses the ‘Biological
Species Concept’
– Morphology
• Continuous traits
• Meristic (countable)
traits
– Cytology
• Chromosome number
• Chromosome features
• Pairing in hybrids
– Molecular data
• Secondary chemicals
• Proteins
• DNA
Parsimony and ML approaches
use a different data structure.
            trait1   trait2   trait3   trait4   trait5
Species 1   0        1.2      red      A        T
Species 2   0        3.4      blue     G        C
Species 3   1        3.5      red      A        T
Species 4   1        4.0      red      A        T
Species 5   1        2.8      blue     G        T
Traits must
have discrete
character
states.
Using only trait 1 …
[Tree figure: sp1 and sp2 on one side, sp3, sp4 and sp5 on the other, separated by a single change in trait 1 (0<->1)]
But traits 3 & 4 disagree with
trait 1. Trait 5 is no help.
[Tree figure: sp2 and sp5 on one side, sp1, sp3 and sp4 on the other, separated by changes in trait 3 (red<->blue) and trait 4 (A<->G)]
Since two traits (blue, G) suggest the left
tree it is more parsimonious than the right
tree, which is based on one trait (0).
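[Figure: left tree groups species 2 and 5 (blue, G) apart from species 1, 3 and 4 (red, A); right tree groups species 1 and 2 (state 0) apart from species 3, 4 and 5 (state 1)]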
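The comparison of the two trees can be made concrete by counting the minimum number of character-state changes each tree requires. The sketch below uses Fitch's counting method, which is brought in here as an assumption since the slides do not name an algorithm. Trait 2 is left out because it is continuous, and the way the three-species group is resolved in each tree is an arbitrary choice made for the example.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, number of changes) for one character.

    tree is a species name or a pair (left subtree, right subtree);
    states maps each species name to its character state.
    """
    if isinstance(tree, str):                 # a tip: observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    shared = left_set & right_set
    if shared:                                # the two sides can agree: no change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, characters):
    """Total number of changes the tree requires, summed over all characters."""
    return sum(fitch_score(tree, states)[1] for states in characters.values())

# Discrete traits from the table of five species.
characters = {
    "trait1": {"sp1": "0", "sp2": "0", "sp3": "1", "sp4": "1", "sp5": "1"},
    "trait3": {"sp1": "red", "sp2": "blue", "sp3": "red", "sp4": "red", "sp5": "blue"},
    "trait4": {"sp1": "A", "sp2": "G", "sp3": "A", "sp4": "A", "sp5": "G"},
    "trait5": {"sp1": "T", "sp2": "C", "sp3": "T", "sp4": "T", "sp5": "T"},
}

left_tree = (("sp2", "sp5"), ("sp1", ("sp3", "sp4")))    # grouped by blue / G
right_tree = (("sp1", "sp2"), ("sp3", ("sp4", "sp5")))   # grouped by trait 1

print("left tree:", tree_length(left_tree, characters), "changes")
print("right tree:", tree_length(right_tree, characters), "changes")
```

The left tree needs fewer changes in total, so it is the more parsimonious of the two. Trait 5 costs one change on either tree, which is why it is no help in choosing between them.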
Maximum likelihood begins with
a model of nucleotide substitution
      A          C          G          T
A     P(A)       P(A->C)    P(A->G)    P(A->T)
C     P(C->A)    P(C)       P(C->G)    P(C->T)
G     P(G->A)    P(G->C)    P(G)       P(G->T)
T     P(T->A)    P(T->C)    P(T->G)    P(T)
Probabilities are iteratively estimated for
all the substitutions in the substitution
matrix until the probabilities are found
that best fit the data.
Other parameters often estimated are:
• rate variation among nucleotide sites,
• AT/GC ratio
Then the best model of evolution for the
data is used to generate the tree.
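The idea of "fitting a model to the data" can be shown on the smallest possible case: two sequences and one parameter. The sketch below assumes the Jukes-Cantor model (all substitutions equally likely), which is the simplest substitution model and is chosen here only for illustration. It tries many values of the evolutionary distance and keeps the one that makes the observed sequences most probable. The two sequences are invented.

```python
import math

def jc69_prob(base_from, base_to, t):
    """Probability of base_to given base_from after distance t (Jukes-Cantor)."""
    decay = math.exp(-4.0 * t / 3.0)
    if base_from == base_to:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay

def log_likelihood(seq_a, seq_b, t):
    """Log-probability of the two aligned sequences at distance t."""
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the assumed equal frequency of each base.
        total += math.log(0.25 * jc69_prob(a, b, t))
    return total

def best_distance(seq_a, seq_b, step=0.001, maximum=2.0):
    """Grid search for the distance that maximizes the likelihood."""
    candidates = [step * k for k in range(1, int(maximum / step) + 1)]
    return max(candidates, key=lambda t: log_likelihood(seq_a, seq_b, t))

# Hypothetical aligned sequences that differ at 4 of 20 sites.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGAACGTGCGTATGTACAT"

t_hat = best_distance(seq_a, seq_b)
p = sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)
closed_form = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
print("grid-search estimate:", round(t_hat, 3))
print("closed-form Jukes-Cantor distance:", round(closed_form, 3))
```

The estimated distance is larger than the raw proportion of differing sites because the model allows for repeated changes at the same site. Real maximum likelihood programs do the same kind of search over many more parameters and over the tree itself.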