
Lecture 22 : Phylogeography and Coalescence
November 13, 2015
Final Exam
We will have the exam on the last day of class,
Monday, December 7 at 2:30 pm
Exam will be in the computer lab, 3306 LSB
Exam will last 50 minutes, so it will be half as
long as the previous two exams.
Non-cumulative, covers Mutation through
Association Genetics
Review session on Friday, Dec 4
Last Time
Signatures of selection
 Hudson-Kreitman-Aguade Test
 Synonymous versus Nonsynonymous substitutions
 McDonald-Kreitman
Molecular clock
Introduction to phylogenetics
Today
Phylogeography
Limitations of phylogenetic analysis
Coalescence introduction
Influence of demography on
coalescence time
Coalescence and human origins
Choosing Phylogenetic Trees

If multiple trees are equally likely, summarize them with a majority-rule or strict consensus tree

Strict consensus is the most conservative approach

Bootstrap data matrix (sample with replacement) to determine
robustness of nodes
[Figure: example trees for taxa A-F with bootstrap support values of 60 on nodes. Sources: Lowe, Harris, and Ashton 2004; Felsenstein 2004]
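The bootstrap step on this slide can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular phylogenetics package: the function names `bootstrap_alignment` and `clade_support` are hypothetical, tree building for each replicate is not shown, and representing each replicate tree as a set of clades is an assumption made for the example.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns (sites) with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns a pseudo-replicate alignment of the same size.
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement; the same columns are
    # used for every taxon so that sites stay aligned.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def clade_support(clade, replicate_trees):
    """Percent of replicate trees that contain the given clade.

    replicate_trees: list of trees, each represented (by assumption) as a
    set of frozensets of taxon names, one frozenset per clade.
    """
    hits = sum(1 for tree in replicate_trees if frozenset(clade) in tree)
    return 100.0 * hits / len(replicate_trees)
```

In use, a tree would be rebuilt from each pseudo-replicate alignment, and a node found in 3 of 5 replicate trees would receive a support value of 60.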
Phylogeography
 The study of evolutionary relationships among individuals based
on phylogenetic analysis of DNA sequences in geographic context
 Can be used to infer evolutionary history of populations
 Migrations
 Population subdivisions
 Bottlenecks/Founder Effects
 Can provide insights on current relationships among populations
 Connectedness of populations
 Effects of landscape features on gene flow
Phylogeography
Topology of tree provides
clues about evolutionary
and ecological history of a
set of populations
Dispersal creates poor
correspondence between
geography and tree
topology
Vicariance (division of
populations preventing
gene flow among
subpopulations) results in
neat mapping of geography
onto haplotypes
Example: Pocket gophers (Geomys pinetis)
 Fossorial rodent that inhabits
three-state area of the southeastern U.S. (Alabama, Georgia, Florida)
 RFLP for mtDNA of 87
individuals revealed 23
haplotypes
 Parsimony network reveals
geographic relationships
among haplotypes
 Haplotypes generally
confined to single populations
 Major east-west split in
distribution revealed
Avise 2004
Problems with using Phylogenetics for
Inferring Evolution
 It is a black box: starting from the end point, the past is
reconstructed based on an assumed evolutionary model
 Orthologs versus paralogs
 Hybridization
 Differential evolutionary rates
 Assumes coalescence
Gene Orthology
 Phylogenetics requires unambiguous identification of
orthologous genes
 Paralogous genes are duplicated copies whose divergence dates to a
gene duplication rather than a speciation event, so they do not
track the evolutionary history of the species
 Difficult to determine orthology relationships
[Figure: gene tree with pairs of genes labeled as orthologs or paralogs. Lowe, Harris, Ashton 2004]
Gene Trees vs Species Trees
 Genes (or loci) evolve at different rates. Why?
 Topology derived by a single gene may not match
topology based on whole genome, or morphological traits
[Figure: gene tree for taxa A, B, and C]
Coalescence
 Retrospective tracing
of existing alleles to a
common ancestral allele
 A reverse reconstruction of the evolution of
modern variation
 Allows explicit simulation of sequence
evolution
 Incorporation of factors that cause deviation
from neutrality: selection, drift, and gene flow
[Figure: 9 generations in the history of a population of 14 gene copies; axes: time (ending at the present) and individual alleles. Slide courtesy of Yoav Gilad]
Gene Trees vs Species Trees
Failure to coalesce within species
lineages drives divergence of
relationships between gene and species
trees
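[Figure: two gene trees for lineages a, b, and c drawn inside a species tree, one labeled "Concordant Gene Tree" and one labeled "Divergent Gene Tree". Panel captions: "b is closer to a than to c" and "b is closer to c than to a"]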
How to model this process?
Modeling from Theoretical Ancestors: Forward Evolution
Can model
populations in a
forward direction,
starting with
theoretical past
Fisher-Wright model
of neutral evolution
Very computationally
intensive for large
populations
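To make the forward approach concrete, here is a minimal sketch of neutral forward simulation in which every gene copy in each generation picks its parent copy at random from the previous generation. The function name `wright_fisher_forward` and the choice to track only founder labels are assumptions made for illustration. Every copy is simulated in every generation, including lineages that later die out, which is why the approach becomes expensive for large populations.

```python
import random

def wright_fisher_forward(n_copies, generations, rng):
    """Simulate neutral reproduction forward in time.

    Each gene copy starts with a unique founder label. In every
    generation each of the n_copies copies inherits the label of a
    parent copy chosen uniformly at random from the previous generation.
    Returns a list of generations, each a list of founder labels.
    """
    lineage = list(range(n_copies))
    history = [lineage]
    for _ in range(generations):
        lineage = [lineage[rng.randrange(n_copies)] for _ in range(n_copies)]
        history.append(lineage)
    return history
```

For example, `wright_fisher_forward(14, 9, random.Random(0))` follows 14 gene copies for 9 generations; the number of distinct founder labels can only stay the same or shrink over time as lineages are lost to drift.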
Alternative: Start at the end and work your way back
[Figure: lineages of present-day individual alleles traced back through time to the most recent common ancestor (MRCA). Slide courtesy of Yoav Gilad]
The genealogy of a sample of 5 gene copies
[Figure: genealogy of 5 sampled gene copies traced back from the present to the most recent common ancestor (MRCA); axes: time and individuals. Slide courtesy of Yoav Gilad]
Examples of coalescent trees for a sample of 6
[Figure: example coalescent trees; axes: time and individual alleles. Slide courtesy of Yoav Gilad]
Coalescence Advantages
 Don’t have to model dead ends
 Only consider lineages that survive to modern
day: computationally efficient
 Based on actual observations
 Can simulate different evolutionary scenarios
to see what best fits the observed data
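The efficiency argument can be seen in a small backward simulation that tracks only the sampled lineages. This is a sketch under simple assumptions (neutral, constant size, discrete generations); the name `generations_to_mrca` and the parameter `n_copies` (the total number of gene copies in the population, 2N for diploids) are hypothetical.

```python
import random

def generations_to_mrca(sample_size, n_copies, rng):
    """Trace sampled lineages backward until one ancestor remains.

    Going back one generation, each remaining lineage picks a parent
    uniformly among n_copies gene copies. Lineages that pick the same
    parent coalesce. Only the sampled lineages are tracked; the rest of
    the population is never simulated.
    Returns the number of generations back to the MRCA of the sample.
    """
    lineages = sample_size
    t = 0
    while lineages > 1:
        parents = {rng.randrange(n_copies) for _ in range(lineages)}
        lineages = len(parents)  # distinct parents = surviving lineages
        t += 1
    return t
```

Repeating `generations_to_mrca(5, 50, rng)` many times gives a distribution of times to the MRCA for a sample of 5 gene copies, which could then be compared with observed data under different scenarios.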
Coalescent Tree Example

Coalescence: Merging
of two lineages in the
Most Recent Common
Ancestor (MRCA)

Waiting Time: time to
coalescence for two
lineages

Increases with each
coalescent event
$E(T_k) = \frac{2N}{k(k-1)/2}$
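A short numerical sketch of the waiting-time formula, with k the number of lineages and N the population size as on the slide. The function names are hypothetical, and summing the waiting times from k = n down to k = 2 to get the expected time to the MRCA of the whole sample is a standard step that the slide does not state explicitly.

```python
def expected_waiting_time(k, n):
    """E(T_k) = 2N / (k(k-1)/2): expected generations while k lineages remain."""
    return 2 * n / (k * (k - 1) / 2)

def expected_tmrca(sample_size, n):
    """Sum of expected waiting times for k = sample_size, ..., 2 lineages."""
    return sum(expected_waiting_time(k, n) for k in range(2, sample_size + 1))
```

With N = 1000, the expected wait is 200 generations while 5 lineages remain but 2000 generations for the last 2 lineages, so the waiting time grows with each coalescent event.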
Probability of Coalescence
 For any two lineages, function of population
size
$P_{coalescence} = \frac{1}{2N_e}$
 Also a function of number of lineages
$P_{coalescence} = \frac{k(k-1)}{2} \cdot \frac{1}{2N_e}$
where k is number of lineages
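The two formulas can be evaluated directly. This is a minimal sketch with a hypothetical function name; it assumes k is small relative to N_e, so that at most one pair of lineages coalesces in any generation.

```python
def p_coalescence(k, ne):
    """Per-generation probability that some pair among k lineages coalesces.

    k(k-1)/2 is the number of distinct pairs of lineages, and 1/(2*Ne)
    is the chance that a given pair shares a parent copy.
    """
    return (k * (k - 1) / 2) * (1 / (2 * ne))
```

For N_e = 500, two lineages coalesce with probability 0.001 per generation, while five lineages (10 possible pairs) give a probability of 0.01.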
Probability of Coalescence
 Probability declines over time
 Lineages decrease in number
 Can be estimated based on negative
exponential
$P_{coalescence} \propto e^{-t \left[ \frac{k(k-1)}{2} \cdot \frac{1}{2N_e} \right]}$
where k is number of lineages
 Average time to coalescence:
$\text{Time} = \frac{2N}{k(k-1)/2}$
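As a worked step linking the negative exponential to the average time, the mean of an exponential waiting time is the reciprocal of its rate. The symbol $\lambda_k$ is introduced here for illustration, and N in the time formula is assumed to be the same effective size N_e used in the probability formulas.

```latex
\lambda_k = \frac{k(k-1)}{2} \cdot \frac{1}{2N_e},
\qquad
E(T_k) = \frac{1}{\lambda_k} = \frac{2N_e}{k(k-1)/2}
% Example with k = 2 lineages: \lambda_2 = 1/(2N_e), so E(T_2) = 2N_e generations.
```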
Time to Coalescence Affected by Population
History
Bottleneck
Time to Coalescence Affected by Population
History
Population Growth
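One way to explore how population history changes coalescence times is to let the per-generation probability 1/(2N_e) vary with time. The sketch below computes the expected coalescence time for a pair of lineages; the function name, the `size_at` callback, and the truncation at `max_generations` are all assumptions made for illustration.

```python
def mean_pairwise_coalescence(size_at, max_generations=20_000):
    """Expected generations back to coalescence for two lineages.

    size_at(g) returns the (diploid) population size g generations
    before the present, so the pair coalesces in that generation with
    probability 1 / (2 * size_at(g)). The sum is truncated at
    max_generations, which should be large relative to the sizes used.
    """
    surviving = 1.0  # probability the two lineages are still distinct
    expected = 0.0
    for g in range(1, max_generations + 1):
        p = 1.0 / (2 * size_at(g))
        expected += g * surviving * p
        surviving *= 1.0 - p
    return expected
```

With a constant size of 100 the result is 2N = 200 generations. A bottleneck such as `lambda g: 10 if 50 <= g <= 60 else 100` shortens the expected time, because lineages are more likely to coalesce while the population is small. Growth toward the present corresponds to a `size_at` that decreases as g increases.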
How will population structure
affect coalescence times?
Time to Coalescence Affected by Population
Structure
Applications of the Coalescent Approach
 Framework for efficiently testing
alternative models for evolution
 Inferences about effective population
size
 Detection of population structure
 Signatures of selection
 Reconstructing history of populations
Origins of Modern Humans
 Most fossil evidence points to origins in Africa and subsequent migrations
Skulls found in the Omo Valley, Ethiopia
Dated to ~195,000 years ago
[Figure: Omo 1 skull compared with a modern human skull]
http://wwwv1.amnh.org/exhibitions/permanent/humanorigins/history/origin.php
http://www.dhushara.com/book/unraveltree/unravel.htm
Human Phylogeography:
mtDNA
Most ancient and
diverse haplotypes in
Africa (dots)
Migration and
admixture is evident
from presence of
African haplotypes in
other clades
Complexities to Human Phylogeography




Sequence of X-linked ribonucleotide reductase M2 pseudogene 4 (RRM2P4)
Coalescent evidence of Asian origin
Strongly negative Tajima’s D in Europe due to low π
How might you explain this?
Garrigan and Hammer 2006 Nature Reviews Genetics 7:669
Tajima’s D Expectations
$d = \pi - \theta_S$ (numerator of Tajima's D; $\theta_S$ is estimated from the number of segregating sites S)
• D=0: Neutrality
• D>0
– Balancing Selection: Divergence of alleles (π) increases
OR
– Bottleneck: S decreases
• D<0
– Purifying or Positive Selection: Divergence of alleles decreases
OR
– Population expansion: Many low frequency alleles cause low
average divergence
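These expectations can be checked on toy data with a direct calculation of Tajima's D. This is a minimal sketch: the function name `tajimas_d` is hypothetical, the input is assumed to be equal-length aligned sequences with no missing data, and the variance constants follow the standard definition of the statistic (Tajima 1989) rather than anything shown on the slides.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Return (pi, theta_S, D) for a list of aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero for n < 4)")
    # S: number of segregating (polymorphic) sites
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # theta_S: estimate of theta based on S
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_s = s / a1
    # Constants for the variance of d = pi - theta_S
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = pi - theta_s
    return pi, theta_s, d / sqrt(e1 * s + e2 * s * (s - 1))
```

A sample in which every variant is a singleton, such as `["AAAA", "TAAA", "ATAA", "AATA"]`, gives D < 0 (many low-frequency alleles, low π), while two deeply divergent haplotype groups, such as `["AAA", "AAA", "TTT", "TTT"]`, give D > 0.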