Evolution and conservation genetics

Neutral model of evolution
What governs heterozygosity levels?
• Neutral model of drift and mutation
• Single population
• Constant size
• Drift occurs at rate 1/(2N) per generation
• Mutation creates new or alternative alleles and prevents fixation of alleles

What model of mutation does a gene
locus follow under the neutral model?

Infinite Alleles Model

Stepwise-Mutation Model
Infinite Alleles Model (IAM)
(Crow-Kimura Model)

Average protein contains about 300
amino acids (900 nucleotides)
$4^{900} \approx 10^{542}$ possible sequences

• Mutations always occur to new alleles
• Finite population size (drift)
• How is loss of alleles due to drift balanced by new mutations?
Do allozymes really fall under a mutation-drift process?
What is the equilibrium heterozygosity
predicted by IAM?
F = probability that two alleles
are both copies of the same
ancestral allele (identical by
descent)
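$F_t = \frac{1}{2N_e}(1-\mu)^2 + \left(1 - \frac{1}{2N_e}\right) F_{t-1} (1-\mu)^2$
• First term: probability that two alleles are IBD (copies of the same allele in the previous generation). No mutation.
• Second term: probability that the two alleles are not copies of the same allele in the previous generation but are identical by descent from an earlier generation ($F_{t-1}$), and neither allele has mutated.
• $(1-\mu)^2$: both alleles do not mutate.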
At equilibrium then…
$F_t = F_{t-1}$
$F_t = \frac{1}{4N_e\mu + 1}$
But we have two measures of homozygosity; both measure the same thing and thus equal each other:
$\sum_i p_i^2 = F_t = \frac{1}{4N_e\mu + 1}$
Can you derive this?
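One way to work through the derivation is sketched below. It assumes the standard infinite-alleles recursion $F_t = (1-\mu)^2\left[\frac{1}{2N_e} + \left(1-\frac{1}{2N_e}\right)F_{t-1}\right]$ and drops terms of order $\mu^2$ and $\mu/N_e$, so the result is an approximation for small mutation rates.

```latex
\begin{align*}
F &= (1-\mu)^2\left[\frac{1}{2N_e} + \left(1-\frac{1}{2N_e}\right)F\right]
  && \text{set } F_t = F_{t-1} = F \\
F &\approx (1-2\mu)\left[\frac{1}{2N_e} + \left(1-\frac{1}{2N_e}\right)F\right]
  && \text{drop } \mu^2 \\
F\left[\frac{1}{2N_e} + 2\mu - \frac{\mu}{N_e}\right] &\approx \frac{1-2\mu}{2N_e}
  && \text{collect terms in } F \\
F &\approx \frac{1-2\mu}{1 + 4N_e\mu - 2\mu} \approx \frac{1}{4N_e\mu + 1}
  && \text{since } \mu \ll 1
\end{align*}
```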
If H=1-F, then what is H at a mutation drift equilibrium?
Heterozygosity at a mutation drift
equilibrium, given an IAM is…
$H = \frac{4N_e\mu}{4N_e\mu + 1}$
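A quick numerical check of the equilibrium formula, as a minimal sketch. The function name `iam_heterozygosity` and the population sizes looped over are illustrative assumptions; the mutation rates are the three shown in the figure.

```python
def iam_heterozygosity(ne, mu):
    """Equilibrium heterozygosity under the infinite alleles model."""
    theta = 4 * ne * mu
    return theta / (theta + 1)

# Tabulate H for the three mutation rates across a range of population sizes.
for mu in (1e-3, 1e-5, 1e-7):
    row = [round(iam_heterozygosity(10 ** k, mu), 3) for k in range(1, 8)]
    print(f"mu={mu:g}: {row}")
```

For a fixed mutation rate the values rise with population size, and for a fixed population size they rise with mutation rate.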
[Figure: equilibrium heterozygosity (y-axis, 0.0 to 1.2) versus population size (x-axis, log scale, 1e+1 to 1e+7), with curves for μ=0.001, μ=10^-5 and μ=10^-7]
When mutation rates are held constant, equilibrium heterozygosity is higher as population size increases.
When population size is held constant, equilibrium heterozygosity is higher when mutation rates are high.
Stepwise-mutation model (SMM)
(Ohta and Kimura)
Generated by slipped strand mispairing,
mutations occur only at adjacent sites.
Mutation can produce alleles already
present in the population.
Expect the equilibrium level of heterozygosity under SMM to be lower than that of IAM.
$H = 1 - \frac{1}{\sqrt{8N_e\mu + 1}}$
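The claim that SMM gives lower equilibrium heterozygosity than IAM can be checked numerically. This is a sketch that assumes the two equilibrium formulas $H = 4N_e\mu/(4N_e\mu+1)$ for IAM and $H = 1 - 1/\sqrt{8N_e\mu+1}$ for SMM; the function names and the parameter values are illustrative.

```python
import math

def iam_heterozygosity(ne, mu):
    """Equilibrium H under the infinite alleles model."""
    theta = 4 * ne * mu
    return theta / (theta + 1)

def smm_heterozygosity(ne, mu):
    """Equilibrium H under the stepwise-mutation model."""
    return 1 - 1 / math.sqrt(8 * ne * mu + 1)

# Compare the two models at a microsatellite-like mutation rate (assumed value).
mu = 1e-3
for ne in (10, 100, 1000, 10000):
    print(ne, round(iam_heterozygosity(ne, mu), 3), round(smm_heterozygosity(ne, mu), 3))
```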
Genetic diversity and population size

What is the effect of “finite” population size on
gene frequencies

The various ways to mathematically study it

Effective population size
Drift defined

Random changes of gene frequencies
among generations

More important with:
• Small population sizes
• Fluctuation in population size
• Low selection and migration
• Long time periods

A simple simulation of drift: “replicated
outcomes” (mean frequency is dotted)
Buri’s (1956) classic genetic drift experiment showing the number of
wildtype versus neutral mutant alleles in populations of 16 Drosophila
followed through time: “gene frequency distribution”
Generalized effect of drift


• Allele frequencies do not change (much) on the landscape scale
• Within populations, drift decreases genetic variance
• Between populations, drift increases genetic variance
Consider the following to simply illustrate the principle:
• In a Buri-like experiment on 4 lines of n=4 hermaphrodite snails, the frequency of an albinism allele was as follows at generations 2 and 6.
[Figure: distribution of the albinism allele among the 4 lines at each generation, x-axis 0 to 8]
Generation 2: variance = 2.67
Generation 6: variance = 21.33
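The increase in between-line variance can be illustrated with a short calculation. The allele counts below are hypothetical: they are assumptions chosen only because they reproduce the two stated variances (as sample variances of copy counts out of 2n = 8), and they are not data from the experiment.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

# HYPOTHETICAL copies of the allele (out of 8) in each of the 4 lines.
generation_2 = [2, 4, 4, 6]   # lines still similar to one another
generation_6 = [0, 0, 8, 8]   # lines have drifted to loss or fixation

print(round(sample_variance(generation_2), 2))
print(round(sample_variance(generation_6), 2))
```

In both hypothetical generations the mean count is 4, so the mean frequency is unchanged while the variance among lines grows.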
Observed vs. expected changes of mean and
variance of gene frequency
Loss of heterozygosity due to drift
Buri's populations of 16 flies lost heterozygosity as expected for an effective population size of about nine
Effective population size
• Governs random change of gene frequency, p
• Depends on several factors
• All those that reduce the size of the breeding population
• Ne = number of individuals in an ideal population which has the same magnitude of genetic drift as the actual population.

Wright-Fisher model

• Assume that the number of offspring is distributed as a Poisson variable with mean = 2 and variance = 2
• In this case, Ne = N
• No selection
• Random mating, random number of offspring
Factors reducing N to Ne
• Only adults of reproductive age count
• Sex ratio
• Variation in size over time
• Variation in offspring number
• Inbreeding (self-fertilization)

Factors reducing N to Ne -- 1
• Ne usually less than census population size
• Non-breeding individuals do not contribute juveniles
   – “bachelor males”
   – post-reproductives
Factors reducing N to Ne -- 2

different number of breeding individuals in the
two sexes – one sex represented by a small
number of breeding individuals

example:

Captive bred animals – only one male used
for breeding
Different numbers of males and females
Analogous to having two different population sizes
Unequal sex ratio
The effective population size is strongly
influenced by the rarer of the two sexes.
Factors reducing N to Ne -- 3

Variation in number of offspring
produced by different individuals

Ne smaller when offspring numbers are
more unequal

Ne can be larger when variation in
offspring number is reduced
$N_e = \frac{4N - 2}{V + 2}$
• V is the variance of reproductive success
• What is the upper limit for effective population size?
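A small sketch for exploring the formula $N_e = (4N-2)/(V+2)$ and its upper limit. The function name and the census size of 100 are illustrative assumptions.

```python
def ne_from_offspring_variance(n, v):
    """Effective size given census size n and variance v in offspring number."""
    return (4 * n - 2) / (v + 2)

n = 100
for v in (0, 1, 2, 4, 8):
    print(v, ne_from_offspring_variance(n, v))
```

With Poisson variance (V = 2) the result is close to N. With V = 0, where every individual leaves exactly two offspring, it reaches 2N - 1, which is the largest value the formula allows.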
Factors reducing N to Ne -- 4
• Variation of population size in different generations
• Consider the effect on loss of variation caused by the specific population size in generations 1, 2, 3, ..., t.
• Ne is the harmonic mean of these sizes: occasional severe reductions in population size predominate over long stretches of stable large population size in reducing variability
• Example: N = 1000, 10, 1000
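The harmonic mean for the three-generation example can be worked out directly. This is a minimal sketch; the function name is an assumption.

```python
def harmonic_mean_ne(sizes):
    """Effective size over several generations as the harmonic mean of N."""
    return len(sizes) / sum(1 / n for n in sizes)

sizes = [1000, 10, 1000]
print(round(harmonic_mean_ne(sizes), 1))   # harmonic mean, about 29.4
print(round(sum(sizes) / len(sizes), 1))   # arithmetic mean, 670.0
```

The single generation at N = 10 pulls the effective size down to about 29, far below the arithmetic mean of 670.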
Factors reducing N to Ne -- 5

Self-fertilization causes increases of
homozygosity (most extreme form of close
inbreeding, or mating between relatives)

f = fraction of loci in which both alleles are
copies of an immediate ancestor

Ne = N / (1+f)
Effective size in continuous populations

What if there is one population, and
• mating occurs to nearby individuals
• progeny are dispersed a short distance
“Neighborhood size” (Wright 1943)

Number of individuals within which 95% of the alleles derive from
the previous generation


(twice the standard deviation of gene flow in one direction, … don’t
worry about the formula…)
Mainly applied to plants, Ne= 500-1000; why?
Estimation of effective population size

• Demographic data (variance of number of offspring, variation of population size)
   – direct… but usually difficult to obtain
• Can use genetic data
   – reconstruct parentage of current population (paternity analysis, in a few weeks)
   – temporal changes of gene frequency (need to separate from sampling variance)
   – heterozygote excess, between few parents (only applicable to very small populations)
Heterozygosity vs. allele number as indicators of variation
Rarer alleles are lost in bigger bottlenecks (n)

Predicted reduction of H = (1/N)    Observed reduction of Na = (8-n)/8
0.001                               0.000
0.005                               0.024
0.050                               0.518
0.100                               0.664
0.500                               0.831

Rarer genes are lost faster than predicted by the heterozygosity model!
Bottlenecks and founding effects
• These are special cases of genetic drift
• Especially important in conservation genetics
The Founder effect
•New populations often started by small numbers of migrants (analogous
to bottleneck)
• Carry only a fraction of the genetic variability of the parental population
• New populations tend to differ randomly both from the parent population
and from each other, tend to be “inbred”
•Applies to:
•Invasive species
•Island colonists
•Examples…
•Amish of Lancaster Co., PA (Ellis-van Creveld syndrome)
•Pirates of Pitcairn Isle
The Cheetah bottleneck





• 15,000 to 20,000 cats in the wild
• All sampled cheetah share the same allozymes (Cohn 1996)
   – homozygosity of 100%, population 0% polymorphic
• For genes mediating immune response, foreign skin (grafts from unrelated cheetahs) is recognized as their own
• Why? Two bottlenecks – 10,000 years ago and another in the last two centuries
• Work of Stephen J. O’Brien and collaborators (Cat genome project)
Intrinsic rate of growth affects H after a bottleneck
Dotted line: N=10
Solid line: N=2
Loss of alleles mainly depends on bottleneck size, not
rate of growth following bottleneck
Genetics 144: 2001-2014 (December, 1996)
Heterozygosity excess: difference between
the observed heterozygosity and the heterozygosity
expected from the observed number of alleles.
Journal of Heredity, 1998
Data from real populations
Inference of colonization history: the
Northern elephant seal





• Formerly ranged from Mexico to California
• Hunted and collected nearly to extinction
• Few survivors on Isla Guadalupe, Mexico (10-100?)
• Currently 200,000, many in Central/Southern Calif.
• How small was the bottlenecked population?
Attempted reconstruction of the bottleneck of
Northern Elephant Seals

Currently, two mitochondrial DNA haplotypes have frequencies 0.27 and 0.73, giving He=0.40
Museum samples from before the bottleneck gave He=0.80
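The present-day value can be checked from the two haplotype frequencies, assuming He is computed as one minus the sum of squared frequencies (no small-sample correction). The function name is illustrative.

```python
def expected_heterozygosity(freqs):
    """Gene diversity: probability that two random copies differ."""
    return 1 - sum(p * p for p in freqs)

print(round(expected_heterozygosity([0.27, 0.73]), 2))
```

This gives about 0.39, close to the He=0.40 quoted above.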

Use $H_t = H_0 \left(1 - \frac{1}{2N_{e1}}\right)\left(1 - \frac{1}{2N_{e2}}\right)\cdots\left(1 - \frac{1}{2N_{et}}\right)$
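The product formula is easy to apply to any sequence of per-generation effective sizes. The sketch below uses the formula exactly as written; the function name and the example trajectory (Ne held at 10 for five generations) are hypothetical and are not the elephant seal reconstruction.

```python
def heterozygosity_after(h0, effective_sizes):
    """Apply H_t = H_0 * prod(1 - 1/(2*Ne_i)) over successive generations."""
    h = h0
    for ne in effective_sizes:
        h *= 1 - 1 / (2 * ne)
    return h

# Hypothetical example: starting H of 0.80, Ne = 10 for five generations.
print(round(heterozygosity_after(0.80, [10] * 5), 3))
```

Passing a list of growing sizes instead of a constant shows how quickly the loss slows once the population recovers.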





One generation bottleneck of 15 gives


This allows Ne to increase following the bottleneck (1922-1960)
Rate of increase about 1.7 per generation
Allows population to grow from 15 to 200,000 in 38 years
H0 =.80, H1 =.59, H2 =.50, H3 =.45… to H=.40 very shortly
But microsatellites don’t show such a reduction of diversity, why?
Inbreeding due to small population size
– Has predictable consequences for allele frequencies
and genotype frequencies:
• Increases the frequency of homozygous genotypes
– Similar in effect to:
• Genetic drift
• Variation in population size over time
• Skewed sex ratios, etc.
– Two “kinds” of inbreeding:
nonrandom – self-fertilization
random
Random inbreeding
Mutational meltdown?
Populations enter a positive feedback loop
– Inbreeding depression increases, population size
decreases
– Effect of drift increases: deleterious mutations
become fixed
– As deleterious mutations become fixed, inbreeding
depression increases
–Maybe the population dies!
Among-population gene diversity

Within populations (so far)

Between populations
Genetic variation in space and time in
populations
• Genetic structure of populations and frequency of alleles varies in space or time
• Space: allele frequency clines in the blue mussel.
• Time: temporal variation in a prairie vole (Microtus ochrogaster) esterase gene.
Measuring Genetic Differentiation: Fst

Fst= normalized variance in allele frequencies among
populations


Fst = Var(p) / [p*(1 - p*)], where Var(p) is the variance in the frequency p of an allele among populations and p* is the observed mean allele frequency across populations
Or Fst= the relative reduction in gene diversity in a
single population compared to pooling all populations

Fst = (Ht - Hs)/Ht, where Ht is the expected heterozygosity for a
pooled sample of alleles and Hs is the average expected
heterozygosity within each sub-population
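The two definitions of Fst can be compared on a small example. This sketch assumes a single locus with two alleles, equally weighted populations, and the population variance (dividing by the number of populations); under those assumptions the two definitions agree. The function names and the frequencies are illustrative.

```python
def fst_from_variance(ps):
    """Fst = Var(p) / [pbar * (1 - pbar)] for one allele's frequencies."""
    pbar = sum(ps) / len(ps)
    var = sum((p - pbar) ** 2 for p in ps) / len(ps)
    return var / (pbar * (1 - pbar))

def fst_from_heterozygosity(ps):
    """Fst = (Ht - Hs) / Ht for a two-allele locus."""
    pbar = sum(ps) / len(ps)
    ht = 2 * pbar * (1 - pbar)                        # pooled expected heterozygosity
    hs = sum(2 * p * (1 - p) for p in ps) / len(ps)   # mean within-population value
    return (ht - hs) / ht

freqs = [0.2, 0.8]  # hypothetical allele frequencies in two populations
print(round(fst_from_variance(freqs), 3), round(fst_from_heterozygosity(freqs), 3))
```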
Wright’s F statistics

Separate components of genetic
variation into a hierarchy:

How much genetic variation is contained in
• a subpopulation compared to region
• a region compared to total
• a subpopulation compared to total

Partition of Wright’s F

In general sense, F is the probability that two alleles share a
common ancestor (identity by descent)

Total F = Fit (individual-total)
Local F = Fis (individual-subpopulation)
Regional F = Fst (subpopulation-total)



Fit = Fis + (1 - Fis) Fst
• If it ain’t locally inbred, then maybe it is regionally

Fundamental concept; can be defined for any number of levels
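The partition Fit = Fis + (1 - Fis) Fst is equivalent to saying that the non-inbred fractions multiply across levels, (1 - Fit) = (1 - Fis)(1 - Fst). A minimal sketch with hypothetical values; the function name is an assumption.

```python
def total_f(fis, fst):
    """Combine local (Fis) and regional (Fst) components into Fit."""
    return fis + (1 - fis) * fst

fis, fst = 0.10, 0.20  # hypothetical values
fit = total_f(fis, fst)
print(round(fit, 2))
print(round(1 - fit, 2), round((1 - fis) * (1 - fst), 2))  # the two sides agree
```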
Stepwise mutation model
(for SSRs=simple sequence repeats=microsatellites)

Mutation is a progressive change so fragments that migrate
similar distances have had few mutations.

In the case of SSRs, mutation is assumed to change the
number of repeats, increasing or decreasing step by step.

The square of the difference in the number of repeats
between 2 microsatellites is proportional to the time of
divergence from a common ancestor.
Partitioning variation of SSRs: Rst - differentiation based on variance
in allele sizes between populations (Slatkin 1995)

Microsatellite analog of Fst that explicitly takes into account mutational
differences among alleles

Rst = (S - Sw)/S, where S is the average squared difference in size of all
alleles and Sw is the average sum of squares of the differences in allele
sizes within each population

Analogous to Fst = (Ht - Hs)/Ht

Assumes step-wise mutation model and weights differences between
alleles by size (= # repeats) differences
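A simplified reading of the Rst definition can be sketched as follows. This is not Slatkin's full estimator: it assumes S is the mean squared difference in repeat number over all pairs of alleles in the pooled sample, Sw is the same quantity averaged over populations, and it applies no sample-size weighting. The function names and repeat counts are hypothetical.

```python
from itertools import combinations

def mean_squared_difference(sizes):
    """Average squared difference in repeat number over all pairs of alleles."""
    pairs = list(combinations(sizes, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def rst(populations):
    """Rst = (S - Sw) / S under the simplified definitions described above."""
    pooled = [size for pop in populations for size in pop]
    s_total = mean_squared_difference(pooled)
    s_within = sum(mean_squared_difference(pop) for pop in populations) / len(populations)
    return (s_total - s_within) / s_total

# Hypothetical repeat counts sampled from two populations.
pops = [[10, 10, 11, 11], [14, 14, 15, 15]]
print(round(rst(pops), 3))
```

Because the populations differ by about four repeats while alleles within each differ by at most one, almost all of the size variance lies between populations and the value is close to 1.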
Issues with the use of the marker type
(Hedrick. 1999. Evolution 53:313-318)

• High level of variation constrains maximum value of Fst that is possible
   – Max Fst < 1 - Hs, i.e. the observed level of homozygosity
   – Complicates interpretation of significance of Fst values
• Biological significance of statistically significant but small values of Fst (e.g. 0.01) from microsatellite data
Genetic distance




• Measures the genetic “difference” between populations; alternative to variance partitioning
• Proportional to the time of separation from a common ancestor
• Between-population distance increases with time
   – Due to genetic drift, mutation
• Four major models:
   – Mutation to infinite alleles (isozymes, sometimes microsatellites)
   – Stepwise mutation (microsatellites)
   – Genetic drift causes random changes of gene frequencies
   – Mutation in the nucleotide sequence
Genetic distance: infinite allele model of Nei

• Expected homozygosities within and between populations
   – Two populations, "x" and "y"
   – Jx = probability that two alleles from population x are the same (expected homozygosity) = $\sum_i p_{ix}^2$
   – Jy likewise defined for population y
   – Jxy = probability that two alleles chosen from different populations x and y are the same = $\sum_i p_{ix} p_{iy}$
• Nei's gene identity I = Jxy/√(JxJy)
   – Analogous to a correlation coefficient
   – With multiple loci, take average of Jx, Jy, Jxy over loci
• Nei’s genetic distance D = -ln(I)
   – Increases linearly with time under infinite allele mutation model
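Nei's identity and distance follow directly from the definitions of Jx, Jy and Jxy. In this sketch each population is given as a list of loci, and each locus as a list of allele frequencies in the same allele order for both populations; that data layout, the function names and the example frequencies are assumptions made for illustration.

```python
import math

def nei_identity(loci_x, loci_y):
    """I = Jxy / sqrt(Jx * Jy), with each J averaged over loci."""
    n_loci = len(loci_x)
    jx = sum(sum(p * p for p in locus) for locus in loci_x) / n_loci
    jy = sum(sum(p * p for p in locus) for locus in loci_y) / n_loci
    jxy = sum(
        sum(px * py for px, py in zip(lx, ly))
        for lx, ly in zip(loci_x, loci_y)
    ) / n_loci
    return jxy / math.sqrt(jx * jy)

def nei_distance(loci_x, loci_y):
    """D = -ln(I)."""
    return -math.log(nei_identity(loci_x, loci_y))

# Hypothetical single-locus example: x is polymorphic, y is fixed for allele 1.
pop_x = [[0.5, 0.5]]
pop_y = [[1.0, 0.0]]
print(round(nei_identity(pop_x, pop_y), 3), round(nei_distance(pop_x, pop_y), 3))
```

Two populations with identical frequencies give I = 1 and D = 0.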
Genetic distance: stepwise mutation model

• Based on squared difference of mean allele size
   – $\mu_x$ = mean for population x
   – $\mu_y$ = mean for population y
   – $(\delta\mu)^2 = (\mu_x - \mu_y)^2$
• Take average over multiple loci
• Increases linearly with time with stepwise mutation
• Highly dependent on allele size distribution
• Often Nei’s infinite allele model better
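The stepwise-mutation distance is the squared difference in mean allele size, averaged over loci. A minimal sketch, assuming each population is given as a list of loci and each locus as a list of sampled repeat counts; the function name and the repeat counts are hypothetical.

```python
def delta_mu_squared(loci_x, loci_y):
    """Average over loci of (mean size in x - mean size in y) squared."""
    total = 0.0
    for sizes_x, sizes_y in zip(loci_x, loci_y):
        mean_x = sum(sizes_x) / len(sizes_x)
        mean_y = sum(sizes_y) / len(sizes_y)
        total += (mean_x - mean_y) ** 2
    return total / len(loci_x)

# Hypothetical repeat counts at two microsatellite loci.
pop_x = [[10, 12], [20, 20, 22]]
pop_y = [[14, 16], [20, 21, 21]]
print(delta_mu_squared(pop_x, pop_y))
```

Only the means enter the calculation, which is one reason the measure depends so strongly on the allele size distribution at each locus.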
PNAS December 2, 2008