Molecular Systematics

Molecular Systematics
• Systematics - the science of identifying, naming, and classifying living
organisms into groups
• A natural activity of the human brain
• Aristotle - Scala Naturae, or “Great Chain of Being,” which consisted of God, man,
mammals, oviparous with perfect eggs (e.g., birds), oviparous with
nonperfect eggs (e.g., fish), insects, plants, and non-living matter.
• Dominated for ~2000 yrs but made no real attempt at an orderly, consistent
classification
• Linnaeus and others – downward classification
• Dividing larger groups into smaller ones via dichotomies
• Actually a method of ‘identification’ not ‘classification’
• Highly dependent on the order in which the dichotomies were investigated
• Upward classification –
• Grouping organisms with similar characteristics
• Still, all of this was influenced heavily by the idea of archetypes, distinct
types or kinds of organisms that are unchanging
• This was an attempt to find a ‘natural’ system. But, what is the basis for
this ‘natural’ system?
Molecular Systematics
• Enter The Origin of Species
• Provided the rationale for a coherent system
• Common descent as the basis for classification
• Phylogenetic Systematics
• Classifying organisms based on evolutionary relationships
• “The time will come, I believe, though I shall not live to see it, when we
shall have fairly true genealogical trees of each great kingdom of
Nature” - Charles Darwin
• Therefore, 2 goals for phylogenetics
• (1) reconstruct life's genealogy
• (2) use genealogy as the basis for classification
Molecular Systematics
• What can we do with molecular phylogenies?
• Classify organisms according to evolutionary history
• bring order to the chaos of living things
There is only one "fundamental law" in biology: life evolves. That's
it. The messiness comes because life is an emergent property of
chemistry and physics. Just as a grain of sand behaves differently
from a pile of sand, biology behaves differently from physics and
chemistry: the number of interacting forces in biology is orders of
magnitude greater than in physics or chemistry.
Molecular Systematics
• What can we do with molecular phylogenies?
• Determine evolutionary patterns and processes in organisms
• evolutionary rates among organisms (speciation, extinction, morphological change)
• identification of key adaptations
• correlations between traits or characters
FOXP2 in bats and other mammals
Molecular Systematics
• What can we do with molecular phylogenies?
• Determine evolutionary patterns and processes in genomes
[Figure: DNA transposon superfamilies (Helitron, Tc1/mariner, hAT, piggyBac,
Mutator, Merlin, unclassified), with numbers of families and copies, mapped
onto a mammal phylogeny (little brown bat, non-vesper bats, dog, mouse, rat,
human, macaque, marmoset, galago); time axis from ~150 to 0 MY]
Molecular Systematics
• What can we do with molecular phylogenies?
• Inform conservation efforts
[Figure: conservation example; labels include Tabasco and Peten]
Molecular Systematics
• What can we do with molecular phylogenies?
• Inform medical and forensic genetics
Molecular Systematics
• What can we do with molecular phylogenies?
• Investigate population histories and demography
Molecular Systematics
• Tree – a mathematical model of a proposed evolutionary history of
organisms or some aspect of organisms
• The ultimate goal of phylogenetics is to recover an accurate tree of life
Molecular Systematics
[Figure: tree diagram with terminal taxa labeled as operational taxonomic
units (OTUs)]
Molecular Systematics
• Levels of resolution
Molecular Systematics
• Polytomies
Molecular Systematics
• Cladograms, phylograms, phenograms, etc…
• Cladogram – illustrates evolutionary relationships of organisms via relative
common ancestry
• branch lengths are meaningless and arbitrary
• Phylogram – illustrates relationships of organisms with branch lengths
proportional to the amount of evolutionary change (or, if time-calibrated, to time)
• a subset of cladograms
• Phenogram – illustrates relative amounts of similarity or difference (NOTE:
intent is not necessarily to represent common ancestry)
• more on this distinction later
[Figure: example trees drawn as a cladogram and as a phylogram]
Molecular Systematics
• Rooted vs. unrooted trees
• Rooted trees have a node from which all other nodes have
descended
• It is directional
• Allow for the inference of ancestor-descendant relationships
• Unrooted trees lack a root, direction and indications of
ancestral relationships
[Figure: trees of taxa A, B, C, and D with the root placed in different
positions]
Molecular Systematics
• Rooted vs. unrooted trees
• Two major ways to root a tree
• By outgroup: use taxa (the “outgroup”) that are known to fall outside of the
group of interest (the “ingroup”). Requires prior knowledge about the
relationships among the taxa.
• By midpoint: roots the tree at the midway point between the two most distant
taxa in the tree, as determined by branch lengths. Assumes that the taxa are
evolving in a clock-like manner.
• Example: d(A,D) = 10 + 3 + 5 = 18; midpoint = 18 / 2 = 9
[Figures: a tree with a labeled outgroup; a tree of taxa A, B, C, and D with
branch lengths 10, 3, 2, 2, and 5]
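The midpoint arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taxa and the five branch lengths come from the example, but the topology (A and B joined at one internal node, C and D at the other), the internal node names "X" and "Y", and all function names are hypothetical.

```python
from itertools import combinations

# Unrooted tree as weighted edges. Branch lengths are taken from the example;
# the topology and the internal node names "X" and "Y" are assumptions.
EDGES = [("A", "X", 10), ("B", "X", 2), ("X", "Y", 3), ("C", "Y", 2), ("D", "Y", 5)]


def build_graph(edges):
    """Turn an edge list into an adjacency map: node -> {neighbor: length}."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, {})[v] = length
        graph.setdefault(v, {})[u] = length
    return graph


def find_path(graph, start, goal, seen=None):
    """Return the unique path between two nodes of a tree as a list of nodes."""
    seen = seen or {start}
    if start == goal:
        return [start]
    for neighbor in graph[start]:
        if neighbor not in seen:
            rest = find_path(graph, neighbor, goal, seen | {neighbor})
            if rest:
                return [start] + rest
    return None


def path_length(graph, path):
    """Sum the branch lengths along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def midpoint_root(edges):
    """Find the two most distant taxa and the point halfway between them."""
    graph = build_graph(edges)
    taxa = [node for node in graph if len(graph[node]) == 1]
    paths = (find_path(graph, a, b) for a, b in combinations(taxa, 2))
    longest = max(paths, key=lambda p: path_length(graph, p))
    total = path_length(graph, longest)
    half = total / 2
    walked = 0
    for u, v in zip(longest, longest[1:]):
        if walked + graph[u][v] >= half:
            # The root sits on branch (u, v), this far from node u.
            return longest[0], longest[-1], total, (u, v, half - walked)
        walked += graph[u][v]


print(midpoint_root(EDGES))
```

Under these assumptions the most distant pair is A and D (distance 18), and the root falls on A's own branch, 9 units from A, because that branch (length 10) is longer than half the total distance.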
Molecular Systematics
• Rooted vs. unrooted trees
• Most phylogenetic methods infer unrooted trees
• Thus, choosing a root is an extremely important decision
• 5 potential roots to this one unrooted tree (one on each branch)
• each one has a different interpretation
Molecular Systematics
• Rooted vs. unrooted trees
• Numbers of rooted and unrooted trees

Number of OTUs    Possible number of    Possible number of
                  rooted trees          unrooted trees
2                 1                     1
3                 3                     1
4                 15                    3
5                 105                   15
6                 945                   105
7                 10395                 945
8                 135135                10395
9                 2027025               135135
10                34459425              2027025

[Figure: log number of trees plotted against number of OTUs]
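The counts in the table follow the standard double-factorial formulas for strictly bifurcating trees: (2n - 3)!! rooted trees and (2n - 5)!! unrooted trees for n OTUs. A minimal sketch that reproduces them (the function names are illustrative, not from the slides):

```python
def double_factorial(k):
    """Return k * (k - 2) * (k - 4) * ... down to 1; treated as 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_rooted_trees(n_otus):
    """Number of rooted bifurcating trees for n_otus >= 2: (2n - 3)!!"""
    return double_factorial(2 * n_otus - 3)


def num_unrooted_trees(n_otus):
    """Number of unrooted bifurcating trees for n_otus >= 2: (2n - 5)!!"""
    return double_factorial(2 * n_otus - 5)


for n in range(2, 11):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

Each unrooted tree with n >= 3 OTUs has 2n - 3 branches, and a root can be placed on any one of them, which is why the rooted count equals the unrooted count multiplied by 2n - 3.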
Molecular Systematics
• More terminology
• Tree -phyly
• Monophyletic groups – a group on a tree that includes one ancestor and
all the terminal taxa that arose from it
• Paraphyletic groups – a group of terminal taxa and their ancestor(s) that
excludes one or more descendants of that ancestor
• Polyphyletic groups – a group of terminal taxa that excludes their most
recent common ancestor; a completely unnatural grouping
Molecular Systematics
• More terminology
• Tree -phyly
• Monophyletic – Archosauria, Lepidosauria
• Paraphyletic – “reptiles”, “dinosaurs”
• Polyphyletic – “homeotherms”
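A monophyly check is easy to sketch: a group is monophyletic if the smallest clade containing all of its members contains nothing else. The tree below is a deliberately simplified, illustrative amniote tree (turtles and tuatara are left out), written as nested tuples; the function names are hypothetical. The check only separates monophyletic from non-monophyletic groups; it does not distinguish paraphyly from polyphyly.

```python
# Simplified illustrative tree: ((Archosauria, Lepidosauria), mammals).
TREE = ((("crocodiles", "birds"), ("lizards", "snakes")), "mammals")


def leaves(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def smallest_clade(node, group):
    """Return the taxa in the smallest clade that contains every member of group."""
    if not isinstance(node, str):
        for child in node:
            if group <= leaves(child):
                return smallest_clade(child, group)
    return leaves(node)


def is_monophyletic(tree, group):
    """True if the group is exactly one complete clade of the tree."""
    group = set(group)
    return smallest_clade(tree, group) == group


print(is_monophyletic(TREE, {"crocodiles", "birds"}))               # Archosauria
print(is_monophyletic(TREE, {"lizards", "snakes"}))                 # Lepidosauria
print(is_monophyletic(TREE, {"crocodiles", "lizards", "snakes"}))   # "reptiles"
print(is_monophyletic(TREE, {"birds", "mammals"}))                  # "homeotherms"
```

On this tree the first two groups pass and the last two fail: "reptiles" leaves out birds, and "homeotherms" can only be united by taking the whole tree.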
Molecular Systematics
• Gene trees vs. species trees
• We usually assume that trees inferred from molecular data (sequences)
reflect the history of the organisms. What happens when we assume?
[Figure: gene trees and species trees for taxa A, B, and C]
Molecular Systematics
• Incomplete lineage sorting
• We usually assume that trees inferred from molecular data (sequences)
reflect the history of the organisms. What happens when we assume?
Salem et al. 2003, PNAS
Molecular Systematics
• More terminology
• Characters and character states
• Organisms comprise sets of features
• A particular feature that is heritable is a “character”
• a nucleotide position, the shape of a bone, presence or absence of a bone
• When taxa differ with respect to a feature (e.g. the presence or absence or
difference of a base at a particular locus) the different conditions are called
“character states”
• Character states can be discrete or continuous, reversible or nonreversible,
ordered or non-ordered, ancestral or derived (polarity)
Character               Possible states
Nucleotide position     A, T, C, G, gap
TE insertion            Presence, absence
Amino acid              Polar, nonpolar, acid, base, etc.
Mandibular symphysis    Unfused, partially fused, ossified
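For sequence data, each alignment column is a character and the bases (or gaps) observed in that column are its character states. A minimal sketch, using a made-up three-taxon alignment (taxon names, sequences, and function names are all hypothetical):

```python
# Hypothetical aligned sequences; "-" marks a gap.
ALIGNMENT = {
    "taxon1": "ATGGTCC",
    "taxon2": "ATGATCC",
    "taxon3": "ATG-TCA",
}


def character_states(alignment):
    """Return, for each alignment column, the set of states observed in it."""
    seqs = list(alignment.values())
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [set(column) for column in zip(*seqs)]


def variable_sites(alignment):
    """Return the 0-based positions where taxa differ in character state."""
    states = character_states(alignment)
    return [i for i, observed in enumerate(states) if len(observed) > 1]


print(variable_sites(ALIGNMENT))
print(character_states(ALIGNMENT)[3])
```

Only the variable columns (positions 3 and 6 here, counting from 0) can carry information about relationships; the invariant columns have a single state in every taxon.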
Molecular Systematics
• Homology assignment
• All phylogenetic methods (molecular and morphological) assume that you are
comparing homologous loci/structures
• Homologous – descended from a common ancestor
• Two loci are either homologous or not; there is no such thing as 95%
homologous – 95% similar, yes; 95% homologous, no
• Homology comes in two flavors
• Paralogy – loci originating from a duplication event recent enough to reveal
their common ancestry
• Orthology – loci that share ancestry via lineage divergence
• One must be able to discern the two a priori
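The similarity-versus-homology point can be made concrete: similarity is a number computed from an alignment, whereas homology is a yes-or-no statement about ancestry that no percentage can express. A minimal sketch of the similarity side, assuming the two sequences are already aligned and counting every column, gaps included (the function name is hypothetical):

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences carry the same state."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Two 7-base sequences that differ at one position: 6 of 7 columns match.
print(round(percent_identity("ATGGTCC", "ATGATCC"), 1))
```

The result describes how alike the sequences are; whether the two loci are homologous (and whether they are orthologs or paralogs) has to be established separately.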
Molecular Systematics
• More terminology
• Homoplasy
• Similarity that is not homologous (not due to common ancestry)
• Can be the result of convergence, parallelism, reversals of state
• Can provide misleading evidence of phylogenetic affinity (if interpreted
incorrectly as homology)
• Common in DNA sequence data
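One way to see homoplasy is to count the minimum number of state changes a character needs on a given tree. The sketch below uses Fitch's parsimony counting for an unordered character on a bifurcating tree; if the minimum exceeds the number of states minus one, some state must have arisen more than once or reverted on that tree. The four-taxon tree, the character codings, and the function names are hypothetical.

```python
# Hypothetical rooted tree as nested pairs: (((A, B), C), D).
TREE = ((("A", "B"), "C"), "D")


def fitch(node, states):
    """Return (candidate ancestral states, minimum number of changes) for a subtree."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The two subtrees can agree on a state: no extra change needed here.
        return shared, left_changes + right_changes
    # No state in common: at least one change on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


def has_homoplasy(tree, states):
    """True if the character needs more changes than (number of states - 1)."""
    _, changes = fitch(tree, states)
    n_states = len(set(states.values()))
    return changes > n_states - 1


consistent = {"A": 1, "B": 1, "C": 0, "D": 0}    # state 1 unites A and B
conflicting = {"A": 1, "B": 0, "C": 1, "D": 0}   # state 1 shared by A and C only

print(has_homoplasy(TREE, consistent))
print(has_homoplasy(TREE, conflicting))
```

The first character fits the tree with a single change. The second needs two, so on this tree the shared state in A and C would be interpreted as convergence, parallelism, or reversal rather than homology.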
Molecular Systematics
• Challenges to inferring trees with molecular data
• Paralogy
• Gene conversion
• Varying rates of mutation
• Horizontal gene transfer
Identity by Descent/State
[Figure: identity by descent vs. identity by state. Panels show Species A,
Species A’, and Species B along a time axis, with an insertion event and a
mutation event; example sequences are ATGGTCC and ATGATCC]