Lecture 4: (Part 1) Phylogenetic inference

BIOE 109
Summer 2009
Lecture 4- Part II
Phylogenetic Inference
What is phylogeny?
[Figure: two diagrams, labeled A and B]
(a) None
(b) Both are phylogenetic trees
(c) Only A is a phylogenetic tree
(d) Only B is a phylogenetic tree
What is phylogeny?
Phylogeny: evolutionary history of a group of species
or a gene
Phylogenetic tree: graphical summary of the
evolutionary history
Phylogeny describes:
1. Pattern and/or timing of events that occurred as species
diversified.
2. Sequence in which lineages appeared.
3. Which organisms are more closely or distantly related.
Phylogenetic Inference
Two points to keep in mind:
1. Phylogenetic trees are hypotheses
-how reliable?
2. Gene trees are not the same as species trees
• a species tree depicts the evolutionary history of a
group of species.
• a gene tree depicts the evolutionary history of a
specific locus.
Conflict between gene trees and species trees
Phylogenetic Inference
• phylogenetic trees are built from “characters”.
• characters can be morphological, behavioral,
physiological, or molecular.
• there are two important assumptions about
characters used to build trees:
1. they are independent.
2. they are homologous.
What is a homologous character?
• a homologous character is shared by two species
because it was inherited from a common
ancestor.
• a character that is possessed by two species but was not
present in their recent ancestors is said to exhibit
“homoplasy”.
Examples of convergent evolution
Types of homoplasy:
1. Convergent evolution
Example: evolution of eyes, flight.
2. Parallel evolution
Example: drug resistance in HIV.
What is the difference between convergent
and parallel evolution?
                     Convergent               Parallel
Species compared:    distantly related        closely related
Trait produced by:   different genes/         same genes/
                     developmental pathways   developmental pathways
Types of homoplasy:
1. Convergent evolution
Example: evolution of eyes, flight.
2. Parallel evolution
Example: lactose tolerance in human adults
3. Evolutionary reversals
Example: back mutations at the DNA sequence level (C → A → C).
Evolutionary reversals are common in DNA sequences



Our objective is to identify monophyletic
groups
A monophyletic group is derived from a single ancestral
species and includes all descendants (e.g.,
mammals).
Three monophyletic groups:
Reptiles are paraphyletic
Two mistakes are possible:
1. A paraphyletic group is derived from a single ancestral
species but does not include all descendants (e.g.,
reptiles).
2. A polyphyletic group does not include the most recent
common ancestor of its members.
“Warm blooded animals” is a polyphyletic group
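A minimal sketch of how monophyly can be checked once a tree is given. The tree below is a simplified, illustrative amniote tree written as nested tuples, and the function names are hypothetical. A group counts as monophyletic only if it matches the full set of tips below some node.

```python
# Simplified illustrative tree (an assumption for this sketch); tips are strings.
TREE = ("mammals", (("lizards", "snakes"), ("crocodiles", "birds")))


def leaves(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the tip set of every clade (a node plus all of its descendants)."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the group equals the tip set of some clade in the tree."""
    return any(clade == set(group) for clade in clades(tree))


print(is_monophyletic(TREE, {"crocodiles", "birds"}))             # True
print(is_monophyletic(TREE, {"lizards", "snakes", "crocodiles"}))  # False
print(is_monophyletic(TREE, {"mammals", "birds"}))                 # False
```

The second group leaves out birds, so it is not monophyletic (the "reptiles" case). The third group fails as well (the "warm blooded animals" case). Telling paraphyly from polyphyly would also need the ancestral character states, which this sketch does not model.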
Contending schools of systematics
1. Phenetics (Distance methods)
Objectives:
1. Tree should reflect overall degree of similarity.
2. Tree should be based on as many characters as
possible.
3. Tree should minimize the distance among taxa.
Examples of distance trees-HIV strains
Discrete character data is converted into a distance value
Distance tree—HIV strains
- Captures overall degree of similarity
- Branch lengths are important
-Drawbacks:
(a) loss of information about which traits have
changed.
(b) have to correct for multiple substitutions at
the same site.
(c) the tree may not reflect the “true” phylogenetic
relationships.
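A minimal sketch of how discrete character data can be turned into a distance value, and of one way to correct for multiple substitutions at the same site. The slides do not name a correction; the Jukes-Cantor formula, d = -3/4 ln(1 - 4p/3), is used here as an assumption, and the sequences are made up.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; only defined for p < 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned sequences, for illustration only.
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as the raw proportion, because some observed matches hide more than one change. Drawback (a) is visible here too: the result is a single number, with no record of which sites changed.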
Contending schools of systematics
2. Cladistics
Objectives:
1. Tree should reflect the true phylogeny.
2. Tree should use characters that are shared (among two or
more taxa) and derived (from some inferred or known
ancestral state).
• shared and derived characters are called synapomorphies.
3. Ancestral state of characters inferred from an outgroup
that roots the tree.
• an outgroup is ideally picked from the fossil record.
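A minimal sketch of how synapomorphies can be picked out once an outgroup fixes the ancestral states. The 0/1 character matrix, the taxon names, and the function name are all hypothetical, and characters are assumed to be binary.

```python
# Hypothetical character matrix: rows are taxa, columns are binary characters.
MATRIX = {
    "outgroup": [0, 0, 0, 0],
    "taxon_A": [1, 0, 0, 1],
    "taxon_B": [1, 1, 0, 0],
    "taxon_C": [1, 1, 0, 0],
}


def synapomorphies(matrix, outgroup):
    """Map each character index to the ingroup taxa sharing its derived state.

    A character is kept only if its derived state (different from the
    outgroup state) is shared by two or more ingroup taxa.
    """
    ancestral = matrix[outgroup]
    ingroup = {taxon: states for taxon, states in matrix.items() if taxon != outgroup}
    result = {}
    for index, ancestral_state in enumerate(ancestral):
        derived = sorted(t for t, states in ingroup.items() if states[index] != ancestral_state)
        if len(derived) >= 2:
            result[index] = derived
    return result


print(synapomorphies(MATRIX, "outgroup"))
# {0: ['taxon_A', 'taxon_B', 'taxon_C'], 1: ['taxon_B', 'taxon_C']}
```

Character 2 never changes and character 3 is derived in a single taxon, so neither is shared and derived, and neither helps to group taxa.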
Example of a cladogram
How do distance trees differ from
cladograms?
                   Distance trees         Cladograms
Characters used    as many as possible    synapomorphies only
Monophyly          not required           absolute requirement
Emphasis           branch lengths         branch-splitting
Outgroup           not required           absolute requirement
How do we select the “best” tree?
No. of Taxa    No. of possible trees
4              3
5              15
6              105
7              945
10             2 x 10^6
11             34 x 10^6
50             3 x 10^74
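The counts above match the number of unrooted, strictly bifurcating trees for n taxa, which is (2n - 5)!! = 3 x 5 x ... x (2n - 5). The slides do not state the formula, so reading the table this way is an assumption, and the function name is hypothetical. A short sketch:

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa labeled tips: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 6, 7, 10, 11, 50):
    print(n, num_unrooted_trees(n))
```

Each added taxon multiplies the count by the next odd number, which is why checking every tree stops being practical after a few dozen taxa.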
How do we select the “best” tree?
A. Maximum parsimony: the “best” tree is that which
minimizes the number of evolutionary steps (changes
among characters).
-the simplest explanation is preferred over more
complicated ones.
Examples of convergent evolution
Independent gain of camera eye requires two
changes
Evolution and loss of camera eye
requires six changes
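A minimal sketch of counting evolutionary steps for one character on a given tree. It uses Fitch's algorithm, a standard method for unordered characters that the slides do not name. The four taxa, their character states, and the two candidate trees are hypothetical.

```python
def fitch(node, states):
    """Return (candidate ancestral states, number of changes) for a binary tree.

    The tree is written as nested 2-tuples; tips are taxon names.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # No shared state: one change must have occurred on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, states):
    """Minimum number of changes needed to explain the states on this tree."""
    return fitch(tree, states)[1]


# Hypothetical character: present (1) in taxa A and B, absent (0) in C and D.
STATES = {"A": 1, "B": 1, "C": 0, "D": 0}
print(parsimony_score((("A", "B"), ("C", "D")), STATES))  # 1
print(parsimony_score((("A", "C"), ("B", "D")), STATES))  # 2
```

Under maximum parsimony the first tree is preferred for this character, since it needs one change rather than two. With many characters, the scores are summed over characters and the tree with the smallest total wins.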
How do we select the “best” tree?
B. Maximum likelihood: the “best” tree is that which
maximizes the likelihood of producing the observed
data.
- likelihood scores are estimated from a specific model
of base substitution and a specific tree.
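A minimal sketch of a likelihood score in the simplest possible case: two sequences joined by one branch. The slides do not name a substitution model; the Jukes-Cantor (JC69) model is assumed here, and the sequences, grid, and function names are hypothetical. The score is the log probability of the observed data for a given branch length d (expected substitutions per site).

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d under JC69."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay  # probability a site shows the same base
    p_diff = 0.25 - 0.25 * decay  # probability of one specific different base
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the equal base frequency assumed by JC69.
        log_l += math.log(0.25 * (p_same if a == b else p_diff))
    return log_l


def best_distance(seq_a, seq_b, grid):
    """Pick the grid value of d that maximizes the likelihood of the data."""
    return max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))


# Hypothetical sequences: 20 sites, 2 differences.
SEQ_A = "ACGTACGTACGTACGTACGT"
SEQ_B = "ACGTACGTACCTACGTACGA"
GRID = [i / 1000 for i in range(1, 2001)]  # d = 0.001 ... 2.000
print(best_distance(SEQ_A, SEQ_B, GRID))
```

For a full tree the same idea applies, but the likelihood is computed over every branch and the search covers both branch lengths and tree shapes, which is far more expensive.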
How do we select the “best” tree?
C. Bootstrapping
Evaluating tree support by bootstrapping
Species 1   A A C G C C T… G
Species 2   A T C G C C T… G
Species 3   A T T G A C C… G
Species 4   A T T G A C C… G
Step 1. Randomly select a base to represent position 1
Species 1   T
Species 2   T
Species 3   C
Species 4   C
Step 2. Randomly select a base to represent position 2
Species 1   T G
Species 2   T G
Species 3   C G
Species 4   C G
Step 3. Generate complete data set (sampling with
replacement).
Step 4. Build tree and record if groupings match original
tree.
Step 5. Repeat 1,000 times.
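A minimal sketch of the five steps above. The alignment is a toy data set of seven sites for four species. The tree-building step is a toy stand-in for four taxa, not a method named in the slides: it picks the pairing of species with the smallest sum of within-pair differences. All function names are hypothetical.

```python
import random

# Toy alignment (seven sites), assumed for illustration.
ALIGNMENT = {
    "Species 1": "AACGCCT",
    "Species 2": "ATCGCCT",
    "Species 3": "ATTGACC",
    "Species 4": "ATTGACC",
}


def resample_columns(alignment, rng):
    """Steps 1-3: draw sites with replacement to build a same-sized data set."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def differences(seq_a, seq_b):
    """Number of sites at which two sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def best_pairing(alignment):
    """Toy four-taxon tree builder: the pairing with the strictly smallest
    sum of within-pair differences, or None if the best score is tied."""
    names = sorted(alignment)
    first = names[0]
    scored = []
    for partner in names[1:]:
        rest = [n for n in names[1:] if n != partner]
        score = differences(alignment[first], alignment[partner]) + differences(
            alignment[rest[0]], alignment[rest[1]]
        )
        scored.append((score, frozenset({frozenset({first, partner}), frozenset(rest)})))
    scored.sort(key=lambda item: item[0])
    if scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]


def bootstrap_support(alignment, replicates=1000, seed=0):
    """Steps 4-5: percentage of replicates that recover the original grouping."""
    rng = random.Random(seed)
    original = best_pairing(alignment)
    if original is None:
        raise ValueError("original data do not favor a single grouping")
    matches = sum(
        best_pairing(resample_columns(alignment, rng)) == original
        for _ in range(replicates)
    )
    return 100.0 * matches / replicates


print(bootstrap_support(ALIGNMENT))
```

A replicate fails to recover the grouping only when none of the informative sites happens to be drawn, so support is high for this toy data set. Real analyses rebuild the tree for each replicate with the same method used for the original tree.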
Evaluating tree support by bootstrapping
[Tree figure: bootstrap value 98 at the node joining Species 1 and Species 2; bootstrap value 92 at the node joining Species 3 and Species 4]