Transcript Slide 1
3. The reconstruction of phylogeny
Darwin's first principle states that every phylogenetic tree has a single common ancestor.
Phylogenetic analysis is the study of taxonomic relationships among lineages.
Phylogenetic systematics
Numerical taxonomy
Cladistics (Greek κλάδος: branch)
Willi Hennig
(1913-1976)
Robert Sokal
(1927-)
http://www.faunaeur.org/
http://tolweb.org/tree/phylogeny.html
http://www.eol.org/
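The cladistic methodology
[Figure: cladogram with lineages A, B, C and D descending from a common ancestor; the tips carry the character sets ade, adf, abc and abd, and the branches are labelled with characters a to f]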
Apomorphies are derived characters; when shared by several lineages they are called synapomorphies.
Autapomorphies are derived characters that are restricted to single lineages.
Plesiomorphies are ancestral characters.
e: Autapomorphy of lineage D
b: Synapomorphy of lineage C+D
d: Plesiomorphy of lineage A
It is a symplesiomorphy
a: Apomorphy of the whole tree
It is the ancestral state.
The collective set of plesiomorphies defines the ground plan of a phylogenetic tree.
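[Figure: cladogram with lineages A, B and C descending from a common ancestor; the tips carry the character sets ade, adf and abd, the branches are labelled with characters a, b, d, e and f, and d appears on two branches]
C is the sister taxon of A and B.
Character d in lineages A, B, and C is not homologous because it arose twice independently. It is homoplasious.
Character a in lineages A, B, and C is homologous because it is a synapomorphy.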
[Figure: cladograms of lineages A to E descending from a common ancestor, illustrating a monophyletic taxon, a paraphyletic taxon and a polyphyletic taxon; branches are labelled with characters b, d, e and f]
The ultimate aim of taxonomy is to group higher taxa into monophyletic subtaxa. For this task we have to infer autapomorphies.
Autapomorphy defines monophyly.
[Figure: cladogram descending from a common ancestor with the taxa Actinopterygia, Dipnoi, Tetrapoda, Anura, Urodela, Amniota, Mammalia, Squamata, Archosauria, Aves and Therosauria; Reptilia is marked as paraphyletic. Character labels: lungs (plesiomorphic), tetrapod limbs (apomorphic), amnion (apomorphic), feathers (apomorphic), loss of tail (apomorphic), mammae (autapomorphic)]
The diversification of an evolutionary tree is called cladogenesis.
The evolutionary change within a lineage is called anagenesis.
Linnaean systematics and cladistics
Linnaean approach | Hennigian approach
Hierarchical encaptic system | Hierarchical encaptic system
Phenomenological method based on similarity | Analytical method based on lineage branching
It uses grades (groups of similar body plan) | It uses clades (groups with a common root)
Different taxonomies are possible | Only one taxonomic solution is allowed
There is no clear decision instrument for taxonomies | Autapomorphies decide about taxonomic position
The number of higher taxa is rather small (Pisces, Amphibia, Reptilia, Aves, Mammalia) | The number of higher taxa is large (Pisces, Amphibia, Reptilia are not valid taxa)
It does not assume common evolutionary history | It is based on common evolutionary history
It does not reconstruct evolution | It does reconstruct evolution
Taxonomy is independent of evolution | Taxonomy is a part of evolutionary theory
Low resolution trees | High resolution trees
Phylogenetic tree of winged insect orders
[Figure: phylogenetic tree of winged insect orders on a time axis (Devonian, Carboniferous, Permian, Triassic, Jurassic, Cretaceous, Paleogene to recent). Annotations: Devonian origin (Rhyniognatha hirsti), low resolution, two radiations]
Orders shown: Palaeodictyoptera, Odonata, Ephemeroptera, Dictyoptera, Plecoptera, Zoraptera, Embioptera, Isoptera, Dermaptera, Grylloblattodea, Phasmida, Orthoptera, Mallophaga, Psocoptera, Thysanoptera, Heteroptera, Hymenoptera, Neuroptera, Coleoptera, Siphonaptera, Mecoptera, Diptera, Trichoptera, Lepidoptera
The tree lacks 9 orders that went extinct by the end of the Permian.
In the Triassic period all extant taxa already existed.
The construction of phylogenetic trees from numerical methods
The principle of maximum parsimony (Occam’s razor) holds that we should accept the phylogenetic tree that can be constructed with the smallest number of morphological changes.
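The raw data
Characters (rows) by species (columns)
Character | A | B | C | D | E
1 | 1 | 1 | 0 | 0 | 1
2 | 1 | 1 | 1 | 0 | 0
3 | 0 | 1 | 0 | 1 | 1
4 | 1 | 1 | 0 | 1 | 1
5 | 1 | 1 | 1 | 0 | 0
6 | 1 | 1 | 0 | 1 | 1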
[Figure: tree joining species A, B, D, E and C; node labels 001101, 110111, 101101, 010010 and 111111; 8 changes]
Distance matrix
Species | A | B | C | D | E
A | 0 | 1 | 3 | 4 | 3
B | 1 | 0 | 4 | 3 | 2
C | 3 | 4 | 0 | 5 | 6
D | 4 | 3 | 5 | 0 | 1
E | 3 | 2 | 6 | 1 | 0
We are looking for the tree that minimizes the sum of distances.
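The distance matrix can be recomputed from the raw data by counting the characters in which two species differ. The following is a minimal sketch, assuming the character strings below are read per species from the raw data (characters 1 to 6); the function names are illustrative and not part of the lecture.

```python
# Character strings per species (characters 1 to 6), as read from the raw data.
characters = {
    "A": "110111",
    "B": "111111",
    "C": "010010",
    "D": "001101",
    "E": "101101",
}


def hamming(s, t):
    """Count the positions at which two equally long character strings differ."""
    if len(s) != len(t):
        raise ValueError("character strings must have equal length")
    return sum(a != b for a, b in zip(s, t))


def distance_matrix(chars):
    """Return a nested dict with the pairwise number of character differences."""
    names = sorted(chars)
    return {x: {y: hamming(chars[x], chars[y]) for y in names} for x in names}


matrix = distance_matrix(characters)
for name in sorted(matrix):
    print(name, [matrix[name][other] for other in sorted(matrix)])
```

The resulting matrix is symmetric by construction, which is a quick consistency check for a matrix that was filled in by hand.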
[Figure: the tree of species A, B, D, E and C rooted with an outgroup; node labels 001101, 101101, 010010, 111111, 010111 and 110111; 7 changes]
How to define the root?
Parsimony analysis
To find the most parsimonious tree we have to combine every possible arrangement of lineages (every tree) with every possible character combination at the root.
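Counting the changes that one given tree requires can be sketched as follows. This sketch uses Fitch's small-parsimony algorithm, which is not named in the lecture and is swapped in here as one common way to get the minimum number of changes without listing all root states explicitly. The character strings are the species data (characters 1 to 6); the two tree shapes and all function names are assumptions chosen for illustration.

```python
# Character strings per species (characters 1 to 6).
characters = {
    "A": "110111",
    "B": "111111",
    "C": "010010",
    "D": "001101",
    "E": "101101",
}


def fitch_changes(tree, states):
    """Return (possible states at this node, minimum number of changes below it).

    tree   : a species name (str) or a pair (left, right) of subtrees
    states : dict mapping species name -> state of one character
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_changes(left, states)
    right_set, right_cost = fitch_changes(right, states)
    common = left_set & right_set
    if common:
        # Both subtrees can agree on a state: no extra change is needed.
        return common, left_cost + right_cost
    # No shared state: one additional change is required at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, chars):
    """Sum the minimum number of changes over all characters."""
    n_chars = len(next(iter(chars.values())))
    total = 0
    for i in range(n_chars):
        column = {name: s[i] for name, s in chars.items()}
        total += fitch_changes(tree, column)[1]
    return total


# Two hypothetical tree shapes, written as nested pairs.
tree_1 = ((("A", "B"), ("D", "E")), "C")
tree_2 = ((("A", "C"), ("B", "D")), "E")

print(parsimony_score(tree_1, characters))
print(parsimony_score(tree_2, characters))
```

A full parsimony search would call `parsimony_score` for every possible tree and keep the tree with the lowest score, which becomes expensive quickly as the number of species grows.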
The number of possible trees
Species (S) | Number of trees (N)
2 | 1
3 | 3
4 | 15
5 | 105
6 | 945
7 | 10395
8 | 135135
9 | 2027025
10 | 34459425
$N = \frac{(2S-2)!}{2^{S-1}(S-1)!}$
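The table of tree numbers can be checked directly against the formula N = (2S-2)! / (2^(S-1) (S-1)!). A minimal sketch; the function name is illustrative.

```python
from math import factorial


def number_of_trees(s):
    """Number of possible trees for s species: (2s-2)! / (2^(s-1) * (s-1)!)."""
    if s < 2:
        raise ValueError("at least two species are required")
    # The division is exact, so integer division keeps the result an int.
    return factorial(2 * s - 2) // (2 ** (s - 1) * factorial(s - 1))


for s in range(2, 11):
    print(s, number_of_trees(s))
```

The rapid growth shows why an exhaustive parsimony search is only feasible for small numbers of species.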
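Neighbour joining
Neighbour joining is widely used to generate phylogenetic trees.
You need dissimilarities (phylogenetic distances) $\delta(X, Y)$ between all $n$ elements X and Y.
Calculate the sum of dissimilarities for each element:
$\Delta(X) = \sum_{i=1}^{n} \delta(X, Y_i)$
Calculate
$Q(X, Y) = (n-2)\,\delta(X, Y) - \Delta(X) - \Delta(Y)$
Select the pair with the lowest value of Q, here A and B, and join it at a new node U.
Calculate new dissimilarities of every remaining element X to the new node:
$\delta(X, U_{AB}) = \frac{\delta(X, A) + \delta(X, B) - \delta(A, B)}{2}$
Calculate the distances from the new node:
$\delta(A, U) = \frac{(n-2)\,\delta(A, B) + \Delta(A) - \Delta(B)}{2(n-2)}$
$\delta(B, U) = \frac{(n-2)\,\delta(A, B) - \Delta(A) + \Delta(B)}{2(n-2)}$
[Figure: star tree of the elements A to F around a root; A and B are joined at a new node and the procedure is repeated for the remaining elements]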
Distance matrix
 | Mouse | Raven | Octopus | Lumbricus | Delta values
Mouse | 0 | 0.2 | 0.6 | 0.7 | 1.5
Raven | 0.2 | 0 | 0.6 | 0.8 | 1.6
Octopus | 0.6 | 0.6 | 0 | 0.5 | 1.7
Lumbricus | 0.7 | 0.8 | 0.5 | 0 | 2
$\Delta(X) = \sum_{i=1}^{n} \delta(X, Y_i)$
Q-values
Mouse/Raven | -2.7
Mouse/Octopus | -2
Mouse/Lumbricus | -2.1
Raven/Octopus | -2.1
Raven/Lumbricus | -2
Octopus/Lumbricus | -2.7
$Q(X, Y) = (n-2)\,\delta(X, Y) - \Delta(X) - \Delta(Y)$
[Figure: tree with Raven, Octopus, Mouse and Lumbricus]
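The first round of the worked example can be retraced in a few lines. This is a minimal sketch of one neighbour-joining step, assuming the four-species distance matrix of the example; the function names are illustrative. Mouse/Raven and Octopus/Lumbricus tie for the lowest Q-value, and the sketch joins Octopus and Lumbricus, matching the Protostomia node used in the continuation of the example.

```python
from itertools import combinations

names = ["Mouse", "Raven", "Octopus", "Lumbricus"]
rows = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.6, 0.8],
    [0.6, 0.6, 0.0, 0.5],
    [0.7, 0.8, 0.5, 0.0],
]
# Nested dict: d[x][y] is the dissimilarity between x and y.
d = {name: dict(zip(names, row)) for name, row in zip(names, rows)}


def delta_sums(dist):
    """Delta value of each element: the sum of its dissimilarities to all others."""
    return {x: sum(dist[x][y] for y in dist if y != x) for x in dist}


def q_values(dist):
    """Q(X, Y) = (n - 2) * d(X, Y) - Delta(X) - Delta(Y) for every pair."""
    n = len(dist)
    delta = delta_sums(dist)
    return {
        (x, y): (n - 2) * dist[x][y] - delta[x] - delta[y]
        for x, y in combinations(dist, 2)
    }


def join(dist, a, b, new_name):
    """Replace a and b by a new node; d(X, U) = (d(X, a) + d(X, b) - d(a, b)) / 2."""
    others = [x for x in dist if x not in (a, b)]
    reduced = {x: {y: dist[x][y] for y in others} for x in others}
    reduced[new_name] = {new_name: 0.0}
    for x in others:
        value = (dist[x][a] + dist[x][b] - dist[a][b]) / 2
        reduced[x][new_name] = value
        reduced[new_name][x] = value
    return reduced


delta = delta_sums(d)
q = q_values(d)
# Both Mouse/Raven and Octopus/Lumbricus have the lowest Q; join the latter.
reduced = join(d, "Octopus", "Lumbricus", "Protostomia")

for pair, value in q.items():
    print(pair, round(value, 3))
print({x: round(v, 3) for x, v in reduced["Protostomia"].items()})
```

Repeating `q_values` and `join` on the reduced matrix until two nodes remain gives the complete tree.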
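Distance matrix
 | Mouse | Raven | Protostomia | Delta values
Mouse | 0 | 0.2 | 0.4 | 0.6
Raven | 0.2 | 0 | 0.45 | 0.65
Protostomia | 0.4 | 0.45 | 0 | 0.85
$\delta(X, U_{AB}) = \frac{\delta(X, A) + \delta(X, B) - \delta(A, B)}{2}$
$\Delta(X) = \sum_{i=1}^{n} \delta(X, Y_i)$
Q-values
Mouse/Raven | -1.05
Mouse/Protostomia | -1.05
Raven/Protostomia | -1.05
$Q(X, Y) = (n-2)\,\delta(X, Y) - \Delta(X) - \Delta(Y)$
With n = 3 the three Q-values are identical. Mouse and Raven are joined as Vertebrata.
Distance matrix
 | Vertebrata | Protostomia
Vertebrata | 0 | 0.325
Protostomia | 0.325 | 0
[Figure: final tree with Vertebrata (Mouse, Raven) and Protostomia]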
Assumptions of the numerical methods
Characters (or transitions) have to be independent.
Impossible character states have to be excluded.
[Figure: Fish, Birds and Mammals with the characters Scales, Feathers and Hairs; the transitions "Loss of hairs" and "Loss of feathers" are marked as incompatible]
Characters are assumed to have equal importance. In reality transitions are not comparable.
To overcome this problem you give character weights. Technically you multiply the occurrence of a character in a distance matrix by its weight.
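Character weighting can be sketched as a weighted count of differences. This is a minimal illustration, assuming that each differing character contributes its weight to the distance instead of 1; the weights and the function name are hypothetical and not taken from the lecture.

```python
def weighted_distance(s, t, weights):
    """Sum the weights of all characters in which two species differ."""
    if not (len(s) == len(t) == len(weights)):
        raise ValueError("strings and weights must have equal length")
    return sum(w for a, b, w in zip(s, t, weights) if a != b)


# Character strings of two species (characters 1 to 6).
species_d = "001101"
species_e = "101101"

equal_weights = [1, 1, 1, 1, 1, 1]
# Hypothetical weighting: character 1 counts three times as much as the others.
unequal_weights = [3, 1, 1, 1, 1, 1]

print(weighted_distance(species_d, species_e, equal_weights))
print(weighted_distance(species_d, species_e, unequal_weights))
```

With equal weights the result is the plain number of differing characters, so the unweighted distance matrix is the special case in which all weights are 1.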
http://evolution.genetics.washington.edu/phylip/software.html
Trees from molecular data
Distance matrix
Species
A
B
C
D
E
Sequence
A
A
C
A
C
G
C
G
G
G
T
T
T
T
T
T
T
T
G
G
A
A
T
T
T
A
A
G
G
G
C
C
G
G
C
C
C
A
A
C
C
C
A
A
C
A
A
T
T
A
A
A
G
A
A
T
T
A
A
T
A
A
C
A
A
Species | A | B | C | D | E
A | 0 | 1 | 11 | 10 | 5
B | 1 | 0 | 10 | 9 | 5
C | 11 | 10 | 0 | 3 | 9
D | 10 | 9 | 3 | 0 | 6
E | 5 | 5 | 9 | 6 | 0
Evolutionary time scales
The molecular clock
Motoo Kimura (1924-1994), Emile Zuckerkandl (1922-), Tomoko Ohta (1933-), Linus Pauling (1901-1994)
Numbers of amino acid substitutions, and therefore the respective numbers of nucleotide substitutions, are for many proteins and genomes approximately proportional to time.
Hence, numbers of substitutions are a measure of the time of divergence from the latest common ancestor.
Substitutions alone provide a relative time scale.
An appropriate calibration adds the absolute time scale.
[Figure: Superoxide dismutase. Number of amino acid differences (0 to 80) plotted against the paleontological divergence estimate (0 to 1000), with errors indicated]
Applying the molecular clock
[Figure: tree of lineages A, B, C and D descending from a common ancestor, with nodes 1 to 4 and nucleotide substitutions marked along the branches (for example T→C, A→G, G→C→G, G→C→A). The labelled cases are single substitution, parallel substitution, back substitution and multiple substitution]
The length of a tree segment is a measure of the duration of a lineage.
Is it possible to convert numbers of character changes into evolutionary time scales?
The Jukes-Cantor model assumes that the probability $\lambda$ of any transition among the 4 nucleotides is the same, so each nucleotide changes into each of the other three with probability $\lambda/3$.
[Figure: the four nucleotides A, C, G and T connected by transitions, each labelled $\lambda/3$]
The transition probability is assumed to be time independent (every period has the same transition probability).
The probability distribution follows an Arrhenius model:
$p_{trans} = 1 - e^{-\lambda t}$
A→T: $\frac{1}{4}(1 - e^{-\lambda t})$
A→G: $\frac{1}{4}(1 - e^{-\lambda t})$
A→C: $\frac{1}{4}(1 - e^{-\lambda t})$
A→A: $1 - 3\left(\frac{1}{4}(1 - e^{-\lambda t})\right) = 1 - \frac{3}{4}(1 - e^{-\lambda t})$
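What is the probability to get exactly x differences out of n possible?
We apply the binomial:
$L(x; t) = \binom{n}{x} p^{x} (1 - p)^{n - x}$, with $p = \frac{3}{4} - \frac{3}{4} e^{-\lambda t}$
We apply the principle of maximum likelihood.
$\ln L(x; t) = \ln \binom{n}{x} + x \ln\left(\frac{3}{4} - \frac{3}{4} e^{-\lambda t}\right) + (n - x) \ln\left(1 - \left(\frac{3}{4} - \frac{3}{4} e^{-\lambda t}\right)\right)$
We are interested in the time that maximizes this function. Hence we need the root of the first derivative:
$t = -\frac{1}{\lambda} \ln\left(1 - \frac{4x}{3n}\right)$
The distances t are now used in distance matrices to construct the phylogenetic tree.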
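The maximum likelihood estimate t = -(1/λ) ln(1 - 4x/(3n)) is easy to evaluate numerically. A minimal sketch, assuming x observed differences among n compared sites and a known rate λ (passed as `lam`); the function name and the example numbers are illustrative only.

```python
import math


def jukes_cantor_time(x, n, lam):
    """Divergence time t = -(1/lam) * ln(1 - 4x / (3n)).

    x   : number of observed differences
    n   : number of compared sites
    lam : substitution rate (assumed known, for example from a calibration)
    """
    if n <= 0 or lam <= 0:
        raise ValueError("n and lam must be positive")
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    fraction = 4.0 * x / (3.0 * n)
    if fraction >= 1.0:
        # At 75 percent differences or more the sequences are saturated
        # and the logarithm is no longer defined.
        raise ValueError("too many differences: the estimate is undefined")
    return -math.log(1.0 - fraction) / lam


# Hypothetical example: 3 differences in 8 sites with a rate of 1 per time unit.
print(jukes_cantor_time(3, 8, 1.0))
```

For small x/n the estimate is close to x/(n·λ) multiplied by 4/3 · 3/4, that is, close to the raw proportion of differences divided by the rate; the logarithm corrects for multiple and back substitutions as the proportion grows.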
Paleontological versus molecular timescales
Molecular estimates frequently point to much more ancient divergences of lineages than estimates based on the fossil record.
The reason is the different speeds of morphological and genetic change.
[Figure: time axis for placental orders with curves for genetic change and morphological change. Molecular divergence of placental orders (120-140 mya), Eomaia (125 mya), first fossils of placental orders (65 mya)]
Changes in genetic constitution first involve basic regulatory elements.
Changes in genetic constitution accumulate to a point where basic regulatory elements are involved.
[Figure: time axis for hominids with curves for genetic change and morphological change. First fossils of erect hominids (6-7 mya), molecular divergence (4-5 mya), gene flow up to 2 mya]
Paleontological versus molecular timescales
Matching of molecular and paleontological timescales in Echinodermata
[Figure: molecular divergence estimate (0 to 300) plotted against paleontological divergence estimate (0 to 300)]
For the majority of echinoderm subtaxa, molecular divergence estimates are higher than the paleontological estimates.
Data from Smith et al. (2006)
Paleontological versus molecular timescales
Divergences | Earliest fossil record (mya) | Molecular estimates (mya)
Placental-marsupials | 175–145 | 185–161
Amniotes-amphibians | 310 | 375–345
Myriapods-chelicerates | 530 | 705–579
Mosses-vascular plants | 450 | 899–515
Crustaceans-insects | 530 | 726–539
Echinoderms-chordates | <530 | 1001–586
Spiralian-Ecdysozoans | 560–540 | 643–544
Protostomes-deuterostomes | 560–540 | 678–556
Arthropods-chordates | 560–540 | 1200–588
Cnidaria-bilaterians | <600 | 724–615
Sponges-chordates | <600 | 1350–592
Data from Qun et al. (2007)
Do all phylogenetic trees have a single root?
Darwin’s first principle: All species of a given taxon have a common ancestor.
[Figure: Theory of Lamarck. Parallel lineages (a brush) rise along the scale of organization (scala naturae) through time, each starting from a spontaneous origin of simple life forms]
A brush means:
• No speciation.
• If we accept that extinction occurs, this would mean a constant decrease in the number of species.
• Character change within whole species.
• No genetic (character) variability within populations.
• Extreme longevity of lineages.
Parsimony analysis cannot answer this question. A brush would always have a lower number of character changes.
But horizontal gene transfer might, at least in bacteria, result in networks and rings!
Evolution and development (EvoDevo)
August Weismann (1834-1914)
The soma-germ line distinction makes it impossible to transmit acquired characters to the next generation.
Ernst Haeckel (1834-1919)
Theory of recapitulation
The ontogeny of advanced species recapitulates respective stages in ancestral forms.
In fact, only basic genetic programs are conserved, and modifications appear at all stages of ontogenesis.
Haeckel’s rule is only a crude approximation.
Today’s reading
Phylogenetic systematics:
http://evolution.berkeley.edu/evolibrary/article/phylogenetics_01
Cladistics: http://en.wikipedia.org/wiki/Cladistics
Ernst Haeckel: Kunstformen der Natur (Internet exhibition of original drawings):
http://caliban.mpiz-koeln.mpg.de/~stueber/haeckel/kunstformen/liste.html
The modern molecular clock:
http://awcmee.massey.ac.nz/people/dpenny/pdf/BromhamPenny_2003.pdf