Lecture 22 : Phylogeography and Coalescence
November 13, 2015
Final Exam
We will have the exam on the last day of class,
Monday, December 7 at 2:30 pm
Exam will be in the computer lab, 3306 LSB
Exam will last 50 minutes, so it will be half as
long as the previous two exams.
Non-cumulative, covers Mutation through
Association Genetics
Review session on Friday, Dec 4
Last Time
Signatures of selection
Hudson-Kreitman-Aguade Test
Synonymous versus Nonsynonymous substitutions
McDonald-Kreitman
Molecular clock
Introduction to phylogenetics
Today
Phylogeography
Limitations of phylogenetic analysis
Coalescence introduction
Influence of demography on
coalescence time
Coalescence and human origins
Choosing Phylogenetic Trees
If multiple trees are equally likely, summarize them with a majority-rule or strict consensus tree
Strict consensus is the most conservative approach
Bootstrap the data matrix (resample characters with replacement) to determine
robustness of nodes
[Figure: example trees of taxa A-F with nodes labeled 60. Lowe, Harris, and Ashton 2004; Felsenstein 2004]
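The bootstrap step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular phylogenetics package: the toy alignment and the function name `bootstrap_alignment` are hypothetical, and the step of rebuilding a tree from each replicate is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns (sites) resampled with replacement."""
    n_sites = len(alignment[0])
    # Draw as many column indices as there are sites, allowing repeats.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in alignment]

# Hypothetical toy alignment: three taxa, six sites.
alignment = ["ACGTAC", "ACGTTC", "ATGTTC"]
replicate = bootstrap_alignment(alignment, random.Random(1))
```

In a full analysis, a tree would be estimated from each of many replicates, and the support for a node would be the percentage of replicate trees that contain it.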
Phylogeography
The study of evolutionary relationships among individuals based
on phylogenetic analysis of DNA sequences in geographic context
Can be used to infer evolutionary history of populations
Migrations
Population subdivisions
Bottlenecks/Founder Effects
Can provide insights on current relationships among populations
Connectedness of populations
Effects of landscape features on gene flow
Phylogeography
Topology of tree provides
clues about evolutionary
and ecological history of a
set of populations
Dispersal creates poor
correspondence between
geography and tree
topology
Vicariance (division of
populations preventing
gene flow among
subpopulations) results in
neat mapping of geography
onto haplotypes
Example: Pocket gophers (Geomys pinetis)
Fossorial rodent that inhabits
3-state area in the U.S.
RFLP for mtDNA of 87
individuals revealed 23
haplotypes
Parsimony network reveals
geographic relationships
among haplotypes
Haplotypes generally
confined to single populations
Major east-west split in
distribution revealed
Avise 2004
Problems with using Phylogenetics for
Inferring Evolution
It is a black box: starting from end point,
reconstructing past based on assumed
evolutionary model
Orthologs versus paralogs
Hybridization
Differential evolutionary rates
Assumes coalescence
Gene Orthology
Phylogenetics requires unambiguous identification of
orthologous genes
Paralogous genes are duplicated copies that diverged by gene duplication
rather than speciation, so their history need not track the species history
Difficult to determine orthology relationships
[Figure: gene tree with orthologs and paralogs labeled. Lowe, Harris, Ashton 2004]
Gene Trees vs Species Trees
Genes (or loci) evolve at different rates
Why?
Topology derived by a single gene may not match
topology based on whole genome, or morphological traits
[Figure: gene tree for taxa A, B, and C]
Coalescence
Retrospective tracing
of existing alleles to a
common ancestral allele
A reverse reconstruction of the evolution of
modern variation
Allows explicit simulation of sequence
evolution
Incorporation of factors that cause deviation
from neutrality: selection, drift, and gene flow
[Figure: 9 generations in the history of a population of 14 gene copies, with individual alleles on one axis and time running to the present on the other. Slide courtesy of Yoav Gilad]
Gene Trees vs Species Trees
Failure to coalesce within species
lineages drives divergence of
relationships between gene and species
trees
[Figure: concordant gene tree versus divergent gene tree for lineages a, b, and c. In one tree b is closer to a than to c; in the other b is closer to c than to a]
How to model this process?
Modeling from Theoretical Ancestors: Forward Evolution
Can model
populations in a
forward direction,
starting with
theoretical past
Fisher-Wright model
of neutral evolution
Very computationally
intensive for large
populations
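As a rough illustration of forward modeling, the sketch below tracks the frequency of one allele under neutral drift. It assumes a constant population with non-overlapping generations in which every gene copy picks its parent at random; the function name `wright_fisher` and the parameter values (14 gene copies, 9 generations) are chosen for illustration only.

```python
import random

def wright_fisher(n_copies, p0, generations, rng):
    """Track one allele's frequency forward in time under neutral drift."""
    count = round(p0 * n_copies)
    trajectory = [count / n_copies]
    for _ in range(generations):
        p = count / n_copies
        # Every gene copy in the next generation draws its parent at random,
        # so it carries the allele with probability p.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        trajectory.append(count / n_copies)
    return trajectory

trajectory = wright_fisher(n_copies=14, p0=0.5, generations=9, rng=random.Random(0))
```

Every gene copy is simulated in every generation, including copies that leave no descendants, which is why the cost grows quickly with population size.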
Alternative: Start at the end and work your way back
[Figure: lineages of individual alleles traced back through time from the present to the most recent common ancestor (MRCA). Slide courtesy of Yoav Gilad]
[Figure: the genealogy of a sample of 5 gene copies, traced back through time from the present to the most recent common ancestor (MRCA). Slide courtesy of Yoav Gilad]
[Figure: examples of coalescent trees for a sample of 6 individual alleles. Slide courtesy of Yoav Gilad]
Coalescence Advantages
Don’t have to model dead ends
Only consider lineages that survive to modern
day: computationally efficient
Based on actual observations
Can simulate different evolutionary scenarios
to see what best fits the observed data
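A backward simulation can be sketched as follows. It assumes a constant-size population with random choice of parents, and it follows only the sampled lineages: each generation back in time, every surviving lineage picks a parent copy, and lineages that pick the same parent merge. The name `simulate_tmrca` and the parameter `n_copies` (total gene copies in the population) are hypothetical.

```python
import random

def simulate_tmrca(sample_size, n_copies, rng):
    """Generations back until all sampled lineages share one ancestor."""
    lineages = sample_size
    generations = 0
    while lineages > 1:
        # Each lineage picks a parent; lineages with the same parent coalesce.
        parents = {rng.randrange(n_copies) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations

rng = random.Random(7)
times = [simulate_tmrca(sample_size=2, n_copies=100, rng=rng) for _ in range(2000)]
mean_time = sum(times) / len(times)
```

Only the sampled lineages are tracked, never the rest of the population, so the work per generation depends on the sample size rather than the population size.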
Coalescent Tree Example
Coalescence: Merging
of two lineages in the
Most Recent Common
Ancestor (MRCA)
Waiting Time: time to
coalescence for two
lineages
Increases with each
coalescent event
$E(T_k) = \frac{2N}{k(k-1)/2}$
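To see how the waiting time grows as lineages merge, the expectation above can be tabulated for a sample of 5 gene copies. The population size N = 1000 is an arbitrary value chosen for illustration, and the function names are hypothetical.

```python
def expected_waiting_time(k, n):
    """E(T_k) = 2N / (k(k-1)/2): expected generations while k lineages remain."""
    return 2 * n / (k * (k - 1) / 2)

def expected_tmrca(sample_size, n):
    """Sum the waiting times from sample_size lineages down to 2."""
    return sum(expected_waiting_time(k, n) for k in range(2, sample_size + 1))

n = 1000  # assumed population size, for illustration only
waits = {k: expected_waiting_time(k, n) for k in range(5, 1, -1)}
total = expected_tmrca(5, n)
```

The interval with 5 lineages is the shortest and the final interval with 2 lineages is the longest, so the last coalescent event accounts for most of the expected time to the MRCA.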
Probability of Coalescence
For any two lineages, function of population
size
$P_{coalescence} = \frac{1}{2N_e}$
Also a function of number of lineages
$P_{coalescence} = \frac{k(k-1)}{2} \cdot \frac{1}{2N_e}$
where k is number of lineages
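The formula for k lineages counts the k(k-1)/2 possible pairs and gives each the pairwise probability. A quick check, assuming random choice of parents among 2Ne gene copies, compares it with the exact probability that at least two of the k lineages pick the same parent in one generation. Function names and the value Ne = 1000 are illustrative.

```python
def p_coalescence(k, ne):
    """Per-generation probability from the slide: k(k-1)/2 * 1/(2Ne)."""
    return (k * (k - 1) / 2) * (1 / (2 * ne))

def p_any_shared_parent(k, ne):
    """Exact probability that k lineages do not all pick distinct parents."""
    p_all_distinct = 1.0
    for i in range(1, k):
        p_all_distinct *= 1 - i / (2 * ne)
    return 1 - p_all_distinct

ne = 1000  # assumed effective population size
approx = p_coalescence(5, ne)
exact = p_any_shared_parent(5, ne)
```

The two values agree closely when k is much smaller than 2Ne, which is the situation the pair-counting formula is meant for.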
Probability of Coalescence
Probability declines over time
Lineages decrease in number
Can be estimated based on negative
exponential
$P_{coalescence} = e^{-\frac{k(k-1)}{2} \cdot \frac{1}{2N_e} \cdot t}$
where k is number of lineages
Average time to coalescence:
$\text{Time} = \frac{2N}{k(k-1)/2}$
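The link between the negative exponential and the average time can be checked numerically. The sketch below assumes waiting times are exponentially distributed with rate k(k-1)/2 × 1/(2N), draws many of them, and compares their mean with the formula. The values k = 3 and N = 1000 and the function names are illustrative.

```python
import random

def coalescence_rate(k, n):
    """Per-generation rate of coalescence among k lineages."""
    return (k * (k - 1) / 2) / (2 * n)

def mean_simulated_wait(k, n, replicates, rng):
    """Average of exponentially distributed waiting times."""
    rate = coalescence_rate(k, n)
    return sum(rng.expovariate(rate) for _ in range(replicates)) / replicates

k, n = 3, 1000  # assumed values, for illustration only
simulated = mean_simulated_wait(k, n, 20000, random.Random(42))
expected = 2 * n / (k * (k - 1) / 2)
```

The mean of an exponential distribution is the reciprocal of its rate, which is why the average time is the per-generation probability turned upside down.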
Time to Coalescence Affected by Population
History
Bottleneck
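One way to explore the effect of a bottleneck is to let the number of gene copies change with time in a pairwise simulation. The sketch below assumes two lineages coalesce in a given generation with probability 1 divided by the number of gene copies in that generation. The population sizes and the timing of the bottleneck are invented for illustration.

```python
import random

def pair_coalescence_time(copies_at, rng):
    """Generations back until two lineages pick the same parent copy."""
    t = 0
    while True:
        t += 1
        if rng.random() < 1 / copies_at(t):
            return t

def constant(t):
    return 2000  # assumed number of gene copies in every generation

def bottleneck(t):
    # Assumed history: only 50 copies between 100 and 149 generations ago.
    return 50 if 100 <= t < 150 else 2000

def mean_time(copies_at, replicates, rng):
    return sum(pair_coalescence_time(copies_at, rng) for _ in range(replicates)) / replicates

mean_constant = mean_time(constant, 400, random.Random(1))
mean_bottleneck = mean_time(bottleneck, 400, random.Random(2))
```

Under these assumptions most pairs of lineages that reach the bottleneck coalesce inside it, so the average coalescence time is shorter than in the constant-size population.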
Time to Coalescence Affected by Population
History
Population Growth
How will population structure
affect coalescence times?
Time to Coalescence Affected by Population
Structure
Applications of the Coalescent Approach
Framework for efficiently testing
alternative models for evolution
Inferences about effective population
size
Detection of population structure
Signatures of selection
Reconstructing history of populations
Origins of Modern Humans
Most fossil evidence points to origins in Africa and subsequent migrations
Skulls found in Omo Valley, Ethiopia
Dated at ~195,000 years old
[Figure: Omo 1 skull compared with a modern skull]
http://wwwv1.amnh.org/exhibitions/permanent/humanorigins/history/origin.php
http://www.dhushara.com/book/unraveltree/unravel.htm
Human Phylogeography:
mtDNA
Most ancient and
diverse haplotypes in
Africa (dots)
Migration and
admixture are evident
from presence of
African haplotypes in
other clades
Complexities to Human Phylogeography
Sequence of X-linked ribonucleotide reductase M2 pseudogene 4 (RRM2P4)
Coalescent evidence of Asian origin
Strongly negative Tajima’s D in Europe due to low π
How might you explain this?
Garrigan and Hammer 2006 Nature Reviews Genetics 7:669
Tajima’s D Expectations
• D=0: Neutrality
• D>0
– Balancing Selection: Divergence of alleles (π) increases
OR
– Bottleneck: S decreases
• D<0
– Purifying or Positive Selection: Divergence of alleles decreases
OR
– Population expansion: Many low frequency alleles cause low
average divergence
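These expectations can be checked on toy data with a short implementation of Tajima's D. The normalizing constants (a1, a2, b1, b2, c1, c2, e1, e2) follow Tajima's 1989 formulation and do not appear on the slides. The input is assumed to be aligned sequences of equal length with no missing data, and the two toy samples are invented.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences with no missing data."""
    n = len(sequences)
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("D is undefined when there are no segregating sites")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Every variant is a singleton: many low-frequency alleles.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA", "AAAT"])
# Two divergent haplotypes at intermediate frequency.
d_balanced = tajimas_d(["AA", "AA", "TT", "TT"])
```

The singleton-only sample gives a negative D because π is low relative to S, and the sample with two intermediate-frequency haplotypes gives a positive D because π is high relative to S.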