Evolution and conservation genetics
Neutral model of evolution
What governs heterozygosity levels?
Neutral model of drift and mutation
Single population
Constant size
Drift occurs at rate 1/(2N) per generation
Mutation creates new or alternative
alleles and prevents fixation of alleles
What model of mutation does a gene
locus follow under the neutral model?
Infinite Alleles Model
Stepwise-Mutation Model
Infinite Alleles Model (IAM)
(Crow-Kimura Model)
Average protein contains about 300
amino acids (900 nucleotides)
$4^{900} \approx 10^{542}$ possible sequences
Mutations always occur to new
alleles
Finite population size (drift)
How is loss of alleles due to drift
balanced by new mutations
Do allozymes really
fall under a mutation-drift process?
What is the equilibrium heterozygosity
predicted by IAM?
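F = probability that two alleles are both copies of the same ancestral allele (identical by descent)
$$F_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right)F_{t-1}\right](1-\mu)^2$$
$1/(2N_e)$: probability that the two alleles are copies of the same allele in the previous generation (IBD, no mutation)
$(1 - 1/(2N_e))F_{t-1}$: probability that they are not copies of the same allele in the previous generation but were already identical by descent
$(1-\mu)^2$: probability that both alleles do not mutate ($\mu$ is the mutation rate per generation)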
At equilibrium then…
$$F_t = F_{t-1} = F \approx \frac{1}{4N_e\mu + 1}$$
But we have two measures of homozygosity; both measure the same thing and thus equal each other:
$$\sum_i p_i^2 = F = \frac{1}{4N_e\mu + 1}$$
Can you derive this?
If H=1-F, then what is H at a mutation drift equilibrium?
Heterozygosity at a mutation-drift equilibrium, given an IAM, is…
$$H = \frac{4N_e\mu}{4N_e\mu + 1}$$
[Figure: equilibrium heterozygosity (y-axis) plotted against population size (x-axis, log scale from 1e+1 to 1e+7), with curves for μ=0.001, μ=10^-5 and μ=10^-7]
When mutation rates are held constant, then as population size increases: higher equilibrium heterozygosity.
When mutation rates are high and population size is held constant: higher equilibrium heterozygosity.
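The relationship shown in the figure can be tabulated directly from the IAM equilibrium formula H = 4Neμ/(4Neμ + 1). The following is a minimal sketch; the function name `iam_heterozygosity` and the grid of population sizes are illustrative choices, not part of the slides.

```python
def iam_heterozygosity(ne, mu):
    """Equilibrium heterozygosity under the infinite alleles model."""
    theta = 4.0 * ne * mu          # the compound parameter 4*Ne*mu
    return theta / (theta + 1.0)

# Mutation rates taken from the figure legend; population sizes span its x-axis.
mutation_rates = [1e-3, 1e-5, 1e-7]
population_sizes = [10 ** k for k in range(1, 8)]

for mu in mutation_rates:
    row = " ".join(f"{iam_heterozygosity(n, mu):.3f}" for n in population_sizes)
    print(f"mu={mu:g}: {row}")
```

Each row rises with population size, and for any fixed population size the row with the larger mutation rate has the larger heterozygosity.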
Stepwise-mutation model (SMM)
(Ohta and Kimura)
Generated by slipped-strand mispairing; mutations occur only to
adjacent allelic states (one repeat more or one repeat fewer).
Mutation can produce alleles already
present in the population.
Expect the equilibrium level of
heterozygosity under SMM to be lower
than that under IAM.
$$H = 1 - \frac{1}{\sqrt{8N_e\mu + 1}}$$
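To check the expectation that SMM gives lower equilibrium heterozygosity than IAM, the two formulas can be evaluated side by side. This is a small sketch; the function names and the example values of Ne and μ are assumptions chosen for illustration.

```python
import math

def iam_h(ne, mu):
    """IAM equilibrium heterozygosity: 4*Ne*mu / (4*Ne*mu + 1)."""
    theta = 4.0 * ne * mu
    return theta / (theta + 1.0)

def smm_h(ne, mu):
    """SMM equilibrium heterozygosity: 1 - 1/sqrt(8*Ne*mu + 1)."""
    return 1.0 - 1.0 / math.sqrt(8.0 * ne * mu + 1.0)

# Illustrative comparison over a few population sizes at one mutation rate.
mu = 1e-3
for ne in (10, 100, 1000, 10000):
    print(f"Ne={ne:>5}: IAM H={iam_h(ne, mu):.3f}  SMM H={smm_h(ne, mu):.3f}")
```

For every positive value of Neμ the SMM value is the smaller of the two, because back-mutation to existing allele states removes some of the diversity that new mutations would otherwise add.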
Genetic diversity and population size
What is the effect of “finite” population size on
gene frequencies?
The various ways to mathematically study it
Effective population size
Drift defined
Random changes of gene frequencies
among generations
More important with
Small population sizes
Fluctuation in population size
Low selection and migration
Long time periods
A simple simulation of drift: “replicated
outcomes” (mean frequency is dotted)
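A simulation of this kind can be sketched with binomial (Wright-Fisher) sampling of 2N allele copies each generation. The population size, starting frequency, number of replicates and random seed below are all illustrative assumptions, not values taken from the slide.

```python
import random

def simulate_drift(n_diploid, p0, generations, rng):
    """Return the allele frequency trajectory of one replicate population."""
    copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each of the 2N copies in the next generation is drawn independently
        # from the current allele frequency.
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory

rng = random.Random(1)            # fixed seed so the run is repeatable
generations = 20
replicates = [simulate_drift(16, 0.5, generations, rng) for _rep in range(8)]

# Mean frequency across replicates (the dotted line in such plots).
mean_by_generation = [
    sum(rep[g] for rep in replicates) / len(replicates)
    for g in range(generations + 1)
]

for rep in replicates:
    print(" ".join(f"{p:.2f}" for p in rep))
print("mean:", " ".join(f"{p:.2f}" for p in mean_by_generation))
```

Individual replicates wander toward 0 or 1 while the mean across replicates stays near the starting frequency, which is the pattern the replicated-outcomes plot is meant to show.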
Buri’s (1956) classic genetic drift experiment showing the number of
wildtype versus neutral mutant alleles in populations of 16 Drosophila
followed through time: “gene frequency distribution”
Generalized effect of drift
Allele frequencies do not change (much) on the landscape
scale
Within populations, drift decreases genetic variance
Between populations, drift increases genetic variance
Consider the following to simply illustrate the principle:
In a Buri-like experiment on 4 lines of n=4 hermaphrodite
snails, the frequency of an albinism allele was as follows at
generations 2 and 6.
[Figure: distribution of the albinism allele among the four lines, x-axis from 0 to 8]
Generation 2: variance = 2.67
Generation 6: variance = 21.33
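The two variances can be reproduced with a small calculation. The allele counts below are hypothetical: they are one set of values out of 8 copies per line (4 diploid snails) whose sample variance matches the figures quoted, and they are not the data from the slide.

```python
from statistics import variance

# Hypothetical number of albinism allele copies (out of 8) in each of 4 lines.
generation_2 = [2, 4, 4, 6]
generation_6 = [0, 0, 8, 8]     # every line has fixed or lost the allele

var_2 = variance(generation_2)  # sample variance, divisor n - 1
var_6 = variance(generation_6)

print(f"generation 2: variance among lines = {var_2:.2f}")
print(f"generation 6: variance among lines = {var_6:.2f}")
```

In the hypothetical generation 6 every line is monomorphic, so variation within lines is gone, while the variance among lines has grown eightfold.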
Observed vs. expected changes of mean and
variance of gene frequency
Loss of heterozygosity due to drift
Buri's populations of 16 flies lost heterozygosity at the rate expected for an effective size of about nine
Effective population size
Governs random change of gene frequency, p
Depends on several factors
All those that reduce the size of the breeding
population
Ne = number of individuals in an ideal population which
has the same magnitude of genetic drift as the actual
population.
Wright-Fisher model
Assume that the number of offspring is
distributed as a Poisson variable with
Mean = 2 ; Variance = 2
In this case, Ne = N
No selection,
Random mating, random number of offspring
Factors reducing N to Ne
Only adults of reproductive age count
Sex ratio
Variation in size over time
Variation in offspring number
Inbreeding (self-fertilization)
Factors reducing N to Ne -- 1
Ne usually less than census population
size
Non-breeding individuals do not
contribute juveniles
“bachelor males”
post-reproductives
Factors reducing N to Ne -- 2
different number of breeding individuals in the
two sexes – one sex represented by a small
number of breeding individuals
example:
Captive bred animals – only one male used
for breeding
Different numbers of males and females
Analogous to having two different population sizes
Unequal sex ratio
The effective population size is strongly
influenced by the rarer of the two sexes.
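The slide does not give a formula here; the standard expression for unequal numbers of breeding males (Nm) and females (Nf) is Ne = 4·Nm·Nf / (Nm + Nf). The sketch below uses that expression, with illustrative herd sizes chosen to mimic the single-breeding-male example.

```python
def ne_sex_ratio(n_males, n_females):
    """Effective size with unequal numbers of breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)

# Illustrative cases: 100 breeders in total, with different sex ratios.
for males, females in ((50, 50), (10, 90), (1, 99)):
    print(f"{males} males, {females} females: Ne = {ne_sex_ratio(males, females):.2f}")
```

With one breeding male and 99 females the effective size is just under 4, which shows how strongly the rarer sex dominates the result.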
Factors reducing N to Ne -- 3
Variation in number of offspring
produced by different individuals
Ne smaller when offspring numbers are
more unequal
Ne can be larger when variation in
offspring number is reduced
$$N_e = \frac{4N - 2}{V + 2}$$
V is the variance of reproductive
success
What is upper limit for effective
population size?
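Plugging values into Ne = (4N - 2)/(V + 2) shows how the variance of reproductive success moves Ne away from N. This is a minimal sketch; the census size of 100 and the chosen variances are illustrative assumptions.

```python
def ne_offspring_variance(n, v):
    """Effective size given census size n and variance v in offspring number."""
    return (4.0 * n - 2.0) / (v + 2.0)

n = 100
for v in (0.0, 2.0, 8.0):
    print(f"V = {v}: Ne = {ne_offspring_variance(n, v):.1f}")
```

With Poisson variance (V = 2) the result is N - 1/2, essentially N. With no variance at all (V = 0, every individual leaving exactly two offspring) the formula gives 2N - 1, which is the largest value it can take.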
Factors reducing N to Ne -- 4
• Variation of population size in different generations
• Consider the effect on loss of variation caused by the
specific population sizes in generations 1, 2, 3, ..., t.
harmonic mean: occasional severe reductions in population
size will predominate over long stretches of stable large
population size in reducing variability
N=1000, 10, 1000
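For the three-generation example, the harmonic mean is 1/Ne = (1/t)·Σ(1/N_i). A short sketch of the arithmetic follows; the function name is an illustrative choice.

```python
def harmonic_mean_ne(sizes):
    """Effective size over several generations as the harmonic mean of N."""
    return len(sizes) / sum(1.0 / n for n in sizes)

sizes = [1000, 10, 1000]
ne = harmonic_mean_ne(sizes)
arithmetic = sum(sizes) / len(sizes)

print(f"arithmetic mean = {arithmetic:.1f}")
print(f"harmonic mean (Ne) = {ne:.1f}")
```

The arithmetic mean of these sizes is 670, but the harmonic mean is about 29: the single generation of 10 individuals dominates the long-term effective size.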
Factors reducing N to Ne -- 5
Self-fertilization causes increases of
homozygosity (most extreme form of close
inbreeding, or mating between relatives)
f = fraction of loci in which both alleles are
copies of an immediate ancestor
Ne = N / (1+f)
Effective size in continuous populations
What if there is one population, and
mating occurs to nearby individuals
progeny are dispersed a short distance
“Neighborhood size” (Wright 1943)
Number of individuals within which 95% of the alleles derive from
the previous generation
(twice the standard deviation of gene flow in one direction, … don’t
worry about the formula…)
Mainly applied to plants, Ne= 500-1000; why?
Estimation of effective population size
Demographic data (variance of number of offspring, variation of population size)
direct… but usually difficult to obtain
Can use genetic data
reconstruct parentage of current population (paternity analysis, in a few weeks)
temporal changes of gene frequency (need to separate from sampling variance)
heterozygote excess, between few parents (only applicable to very small populations)
Heterozygosity vs. allele number
as indicators of variation
Rarer alleles are lost in bigger bottlenecks
(n)
Predicted reduction of H = (1/N)    Observed reduction of Na = (8-n)/8
______________________________________________________________________
0.001                               0.000
0.005                               0.024
0.050                               0.518
0.100                               0.664
0.500                               0.831
rarer genes lost faster than predicted by heterozygosity model!
Bottlenecks and founding effects
These are special cases of genetic drift
Especially important in conservation genetics
The Founder effect
•New populations often started by small numbers of migrants (analogous
to bottleneck)
• Carry only a fraction of the genetic variability of the parental population
• New populations tend to differ randomly both from the parent population
and from each other, tend to be “inbred”
•Applies to:
•Invasive species
•Island colonists
•Examples…
•Amish of Lancaster Co., PA (Ellis-van Creveld syndrome)
•Pirates of Pitcairn Isle
The Cheetah bottleneck
15,000 to 20,000 cats in the wild
All sampled cheetah share the same allozymes (Cohn 1996)
homozygosity of 100%, population 0% polymorphic
For genes mediating immune response, foreign skin is
recognized as their own
Why? Two bottlenecks – 10,000 years ago and another in the
last two centuries
Work of Stephen J. O’Brien and collaborators (Cat genome
project)
Intrinsic rate of growth affects H after a bottleneck
Dotted line: N=10
Solid line: N=2
Loss of alleles mainly depends on bottleneck size, not
rate of growth following bottleneck
Genetics 144: 2001-2014 (December, 1996)
Heterozygosity excess: difference between
the observed heterozygosity and the heterozygosity
expected from the observed number of alleles.
Journal of Heredity, 1998
Data from real populations
Inference of colonization history: the
Northern elephant seal
Formerly ranged from Mexico-California
Hunted and collected to death
Few survivors on Isla Guadalupe, Mexico (10-100?)
Currently 200,000, many in Central/Southern Calif.
How small was the bottlenecked population?
Attempted reconstruction of the bottleneck of
Northern Elephant Seals
Currently, two mitochondrial DNA haplotypes have frequency
0.27, 0.73, giving He=0.40
Museum sample of pre-bottleneck samples gave He=0.80
Use $H_t = H_0 (1 - 1/(2N_{e1}))(1 - 1/(2N_{e2}))\ldots(1 - 1/(2N_{et}))$
One-generation bottleneck of 15 gives
This allows Ne to increase following the bottleneck (1922-1960)
Rate of increase about 1.7 per generation
Allows population to grow from 15 to 200,000 in 38 years
H0 =.80, H1 =.59, H2 =.50, H3 =.45… to H=.40 very shortly
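The product formula can be iterated generation by generation with Ne growing by a constant factor. In the sketch below the starting value of 2Ne is set to 15/4 = 3.75, which is an assumption on my part: it is the value that approximately reproduces the sequence quoted above (as might be expected for a maternally inherited haploid marker such as mitochondrial DNA), and the slide itself does not state it. The function and parameter names are likewise illustrative.

```python
def heterozygosity_trajectory(h0, two_ne_start, growth, generations):
    """Apply H_t = H_(t-1) * (1 - 1/(2*Ne_t)) while 2*Ne grows each generation."""
    h = h0
    two_ne = two_ne_start
    values = [h]
    for _generation in range(generations):
        h *= 1.0 - 1.0 / two_ne
        values.append(h)
        two_ne *= growth          # population recovers after the bottleneck
    return values

# Assumed inputs: H0 = 0.80, 2Ne = 15/4 in the bottleneck generation,
# growth of 1.7 per generation thereafter.
trajectory = heterozygosity_trajectory(0.80, 15 / 4, 1.7, 10)
for t, h in enumerate(trajectory):
    print(f"generation {t}: H = {h:.3f}")
```

Under these assumptions the loss is concentrated in the first few generations, and heterozygosity levels off near 0.4 once the population has grown.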
But microsatellites don’t show such a reduction of diversity, why?
Inbreeding due to small population size
– Has predictable consequences for allele frequencies
and genotype frequencies:
• Increases the frequency of homozygous genotypes
– Similar in effect to:
• Genetic drift
• Variation in population size over time
• Skewed sex ratios, etc.
– Two “kinds” of inbreeding:
nonrandom – self-fertilization
random
Random inbreeding
Mutational meltdown?
Populations enter a positive feedback loop
– Inbreeding depression increases, population size
decreases
– Effect of drift increases: deleterious mutations
become fixed
– As deleterious mutations become fixed, inbreeding
depression increases
–Maybe the population dies!
Among-population gene diversity
Within populations (so far)
Between populations
Genetic variation in space and time in
populations
• Genetic structure of populations and frequency of alleles
varies in space or time
• Space: Allele frequency clines in the blue mussel.
Variation across time: temporal variation in a prairie vole (Microtus
ochrogaster) esterase gene.
Measuring Genetic Differentiation: Fst
Fst= normalized variance in allele frequencies among
populations
Fst = Var(p) / [p*(1-p*)], where Var(p) is the variance in the
frequency p of an allele among populations and p* is the
observed mean allele frequency across populations
Or Fst= the relative reduction in gene diversity in a
single population compared to pooling all populations
Fst = (Ht - Hs)/Ht, where Ht is the expected heterozygosity for a
pooled sample of alleles and Hs is the average expected
heterozygosity within each sub-population
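The two definitions of Fst can be checked against each other for a single biallelic locus. The sketch assumes equally weighted populations and uses the population variance (divisor k, not k - 1); the two allele frequencies are made-up illustrative values.

```python
def fst_from_variance(freqs):
    """Fst = Var(p) / [p*(1 - p*)] for one allele across populations."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k   # population variance
    return var_p / (p_bar * (1.0 - p_bar))

def fst_from_heterozygosity(freqs):
    """Fst = (Ht - Hs) / Ht for a biallelic locus."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    ht = 2.0 * p_bar * (1.0 - p_bar)                   # pooled expected H
    hs = sum(2.0 * p * (1.0 - p) for p in freqs) / k   # mean within-population H
    return (ht - hs) / ht

freqs = [0.2, 0.8]   # hypothetical allele frequencies in two populations
print(f"variance form:       {fst_from_variance(freqs):.2f}")
print(f"heterozygosity form: {fst_from_heterozygosity(freqs):.2f}")
```

Both forms give 0.36 for this example, which illustrates that they are two expressions of the same quantity under these assumptions.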
Wright’s F statistics
Separate components of genetic
variation into a hierarchy:
How much genetic variation is contained in
a subpopulation compared to region
a region compared to total
a subpopulation compared to total
Partition of Wright’s F
In general sense, F is the probability that two alleles share a
common ancestor (identity by descent)
Total F = Fit (individual-total)
Local F = Fis (individual-subpopulation)
Regional F = Fst (subpopulation-total)
Fit = Fis + (1- Fis ) Fst
If it ain’t locally inbred, then maybe it is regionally
Fundamental concept; can be defined for any number of levels
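The partition Fit = Fis + (1 - Fis)·Fst is equivalent to (1 - Fit) = (1 - Fis)(1 - Fst), which a short calculation confirms. The values of Fis and Fst below are arbitrary illustrative numbers.

```python
def total_f(fis, fst):
    """Combine local (Fis) and regional (Fst) components into Fit."""
    return fis + (1.0 - fis) * fst

fis, fst = 0.10, 0.20          # hypothetical values
fit = total_f(fis, fst)

print(f"Fit = {fit:.2f}")
print(f"(1 - Fis)(1 - Fst) = {(1 - fis) * (1 - fst):.2f}, 1 - Fit = {1 - fit:.2f}")
```

With no local inbreeding (Fis = 0) the total reduces to Fst alone, which is the sense of "if it ain't locally inbred, then maybe it is regionally".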
Stepwise mutation model
(for SSRs=simple sequence repeats=microsatellites)
Mutation is a progressive change so fragments that migrate
similar distances have had few mutations.
In the case of SSRs, mutation is assumed to change the
number of repeats, increasing or decreasing step by step.
The square of the difference in the number of repeats
between 2 microsatellites is proportional to the time of
divergence from a common ancestor.
Partitioning variation of SSRs: Rst - differentiation based on variance
in allele sizes between populations (Slatkin 1995)
Microsatellite analog of Fst that explicitly takes into account mutational
differences among alleles
Rst = (S - Sw)/S, where S is the average squared difference in size of all
alleles and Sw is the average sum of squares of the differences in allele
sizes within each population
Analogous to Fst = (Ht - Hs)/Ht
Assumes step-wise mutation model and weights differences between
alleles by size (= # repeats) differences
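A rough sketch of the Rst calculation follows. It reads S as the mean squared size difference over all pairs of allele copies in the pooled sample and Sw as the same quantity computed within each population and then averaged. That reading, the function names and the repeat counts are assumptions for illustration; Slatkin's published estimator includes sample-size details that are not reproduced here.

```python
from itertools import combinations

def mean_squared_difference(sizes):
    """Average squared difference in repeat number over all pairs of copies."""
    pairs = list(combinations(sizes, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def rst(populations):
    """Rst = (S - Sw) / S from lists of allele sizes, one list per population."""
    pooled = [size for pop in populations for size in pop]
    s_total = mean_squared_difference(pooled)
    s_within = sum(mean_squared_difference(pop) for pop in populations) / len(populations)
    return (s_total - s_within) / s_total

# Hypothetical repeat counts for one microsatellite locus in two populations.
pop_a = [10, 10, 11, 11]
pop_b = [14, 14, 15, 15]
rst_value = rst([pop_a, pop_b])
print(f"Rst = {rst_value:.3f}")
```

Because the two hypothetical populations differ by about four repeats while alleles within each differ by at most one, almost all of the size variance lies between populations and Rst is close to 1.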
Issues with the use of the marker type
(Hedrick. 1999. Evolution 53:313-318)
High level of variation constrains maximum value of Fst that
is possible
Max Fst < 1 - Hs or the observed level of homozygosity
Complicates interpretation of significance of Fst values
Biological significance of statistically significant but small
values of Fst (e.g. 0.01) from microsatellite data
Genetic distance
Measures the genetic “difference” between populations; alternative to
variance partitioning
Proportional to the time of separation from a common ancestor
Between-population distance increases with time
Due to genetic drift, mutation
Four major models:
Mutation to infinite alleles
isozymes, sometimes microsatellites
Stepwise mutation
microsatellites
Genetic drift causes random changes of gene frequencies
Mutation in the nucleotide sequence
Genetic distance: infinite allele model of Nei
Expected homozygosities within and between populations
Two populations, "x" and "y"
Jx = probability that two alleles from population x are the same
(expected homozygosity) = $\sum_i p_{ix}^2$
Jy likewise defined for population y
Jxy = probability that two alleles chosen from different populations
x and y are the same = $\sum_i p_{ix} p_{iy}$
Nei's gene identity I=Jxy/√(JxJy)
Analogous to a correlation coefficient
With multiple loci, take average of Jx , Jy, Jxy over loci
Nei’s genetic distance D = -ln(I)
Increases linearly with time under infinite allele mutation model
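Nei's identity and distance can be computed from allele frequency lists as follows. This is a single-locus sketch with made-up frequencies; with several loci, Jx, Jy and Jxy would each be averaged over loci before forming I.

```python
import math

def nei_distance(freqs_x, freqs_y):
    """Return (I, D) for one locus from allele frequencies in two populations."""
    jx = sum(p * p for p in freqs_x)                      # homozygosity in x
    jy = sum(p * p for p in freqs_y)                      # homozygosity in y
    jxy = sum(px * py for px, py in zip(freqs_x, freqs_y))
    identity = jxy / math.sqrt(jx * jy)
    return identity, -math.log(identity)

# Hypothetical frequencies of the same two alleles in populations x and y.
identity, distance = nei_distance([1.0, 0.0], [0.5, 0.5])
print(f"I = {identity:.4f}, D = {distance:.4f}")
```

Two populations with identical allele frequencies give I = 1 and D = 0; in this example x is fixed for one allele while y carries both equally, giving I of about 0.71 and D of about 0.35.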
Genetic distance: stepwise mutation model
Based on squared difference of mean allele size
$\mu_x$ = mean allele size for population x
$\mu_y$ = mean allele size for population y
$(\delta\mu)^2 = (\mu_x - \mu_y)^2$
Take average over multiple loci
Increases linearly with time with stepwise mutation
Highly dependent on allele size distribution
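The stepwise-mutation distance is the squared difference between population mean allele sizes, averaged over loci. The sketch below uses hypothetical repeat counts at two loci; the data layout (one list of allele sizes per locus per population) is an illustrative choice.

```python
def delta_mu_squared(pop_x, pop_y):
    """Average over loci of (mean size in x - mean size in y) squared."""
    per_locus = []
    for sizes_x, sizes_y in zip(pop_x, pop_y):
        mean_x = sum(sizes_x) / len(sizes_x)
        mean_y = sum(sizes_y) / len(sizes_y)
        per_locus.append((mean_x - mean_y) ** 2)
    return sum(per_locus) / len(per_locus)

# Hypothetical repeat counts: outer list = loci, inner list = sampled alleles.
pop_x = [[9, 10, 11], [19, 20, 21]]
pop_y = [[11, 12, 13], [20, 21, 22]]
dist = delta_mu_squared(pop_x, pop_y)
print(f"distance = {dist:.2f}")
```

The first locus contributes a squared difference of 4 and the second contributes 1, so the distance is 2.5. Only the means enter the calculation, which is why the measure depends so heavily on the allele size distribution.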
Often Nei’s infinite allele model better
PNAS December 2, 2008