Transcript statgen4a

Genetic Drift
 In small, reproductively isolated populations,
special circumstances exist that can produce rapid
changes in gene frequencies totally independent of
mutation, recombination, and natural
selection. These changes are due solely to chance
factors. The smaller the population, the more
susceptible it is to such random changes. This
phenomenon is known as genetic drift.
Neutral alleles
 Controversial when first proposed (Kimura 1968)
 Incontrovertible in 2001:
 DNA sequence polymorphisms are abundant
 In Eukarya, most of the genome is noncoding
 Most sequence polymorphism lies in noncoding regions
 Most sequence polymorphisms appear selectively neutral
• Not useful for studies of genetic adaptation
• Ideal for detection of population substructure and
phylogenetic relationships
Genetic Drift
 For example, when both parents are heterozygous (Aa) for a trait,
we expect 1/4 of their children to be homozygous recessive (aa).
By chance, however, a particular couple might not have any children
with this genotype (as shown below in the Punnett square on the right).
 Unless other families have an unexpectedly large number of
homozygous (aa) children for this trait, the population's gene pool
frequencies will change in the direction of having fewer recessive alleles.
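How often a couple "misses" the aa genotype purely by chance can be worked out directly. The sketch below assumes independent births, each with probability 1/4 of being aa; the function name is illustrative and not part of the lecture.

```python
def prob_no_recessive_child(n_children: int, p_aa: float = 0.25) -> float:
    """Probability that none of n_children of an Aa x Aa couple is aa.

    Assumes each birth is independent with P(aa) = p_aa.
    """
    return (1.0 - p_aa) ** n_children


# Chance of having no aa child for a few family sizes
for k in (1, 2, 4, 8):
    print(k, round(prob_no_recessive_child(k), 3))
```

Even with four children, such a couple has roughly a one-in-three chance of having no aa child, which is why small populations can drift away from the expected frequencies.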
Genetic Drift
The net effect of genetic drift on a small population's gene pool can be
rapid evolution, as illustrated in the hypothetical inheritance patterns
shown below. Note that the red trait dramatically increases from
generation to generation. It is important to remember that this can
occur independent of natural selection or any other evolutionary
mechanism.
Inbreeding in finite populations
 Assume a population size of N, therefore 2N alleles
in population. Imagine eggs and sperm released
randomly into environment (e.g. sea)
 What is the probability of 2 gametes drawn
randomly having the same allele?
[Diagram: 2N alleles in gen 0, gametes drawn for gen 1]
Probability = 1/(2N)
Inbreeding in finite populations
 Therefore, after 1 generation the level of
inbreeding is $F_1 = \frac{1}{2N}$
 After t generations the probability is
$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$$
Why?
[Diagram: gen t-1 to gen t; branch probabilities 1/(2N) and 1 - 1/(2N); label: probability of picking 2nd allele]
$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$$
Genetic drift and heterozygosity
$$F_t = 1 - \left(1 - \frac{1}{2N}\right)^t$$
 Genetic drift will lead to a gradual loss of genetic
diversity
 Follow an individual locus and gene frequency
will drift until one allele becomes fixed
Genetic diversity: $H_t = \left(1 - \frac{1}{2N}\right) H_{t-1}$
Probability of identity: $F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$
Average time to fixation: 4N generations
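The recurrence for the probability of identity and its closed form can be checked against each other numerically. This is a minimal sketch assuming a constant population size N and $F_0 = 0$; the function names are illustrative.

```python
def inbreeding_by_recurrence(n: int, generations: int) -> list[float]:
    """Iterate F_t = 1/(2N) + (1 - 1/(2N)) * F_{t-1}, starting from F_0 = 0."""
    f = [0.0]
    for _ in range(generations):
        f.append(1.0 / (2 * n) + (1.0 - 1.0 / (2 * n)) * f[-1])
    return f


def inbreeding_closed_form(n: int, t: int) -> float:
    """Closed form F_t = 1 - (1 - 1/(2N))^t (valid for F_0 = 0)."""
    return 1.0 - (1.0 - 1.0 / (2 * n)) ** t


n = 50
f = inbreeding_by_recurrence(n, 100)
# With F_0 = 0, the remaining fraction of heterozygosity is H_t / H_0 = 1 - F_t
h_fraction = [1.0 - x for x in f]
print(round(f[100], 4), round(inbreeding_closed_form(n, 100), 4))
print(round(h_fraction[100], 4))
```

The two routes give the same value, and the heterozygosity fraction shrinks by the factor (1 - 1/(2N)) each generation.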
Several populations
 Genetic drift will make initially
identical populations different
 Eventually, each population will
be fixed for one allele, not necessarily the same one
 If there are very many populations,
the proportion of populations fixed
for each allele will correspond to
the initial frequency of the allele
 Small populations will diverge
more rapidly
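The claim that the proportion of populations fixed for an allele matches its initial frequency can be illustrated with a small simulation. The sketch below assumes a Wright-Fisher style model (each generation, 2N allele copies are drawn at random from the previous generation); the function names, the seed and the parameter values are illustrative choices.

```python
import random


def simulate_fixation(n: int, p0: float, rng: random.Random) -> int:
    """Follow allele A in one population of n diploids until it is fixed or lost.

    Returns 1 if A is fixed, 0 if A is lost.
    """
    copies = 2 * n
    count = round(p0 * copies)
    while 0 < count < copies:
        p = count / copies
        # Draw 2N allele copies at random from the parental gene pool
        count = sum(1 for _ in range(copies) if rng.random() < p)
    return 1 if count == copies else 0


def fraction_fixed(n: int, p0: float, n_populations: int, seed: int = 1) -> float:
    """Fraction of replicate populations in which allele A becomes fixed."""
    rng = random.Random(seed)
    fixed = sum(simulate_fixation(n, p0, rng) for _ in range(n_populations))
    return fixed / n_populations


# 500 initially identical populations, N = 10, initial frequency of A = 0.3
print(fraction_fixed(n=10, p0=0.3, n_populations=500))
```

With many replicate populations the printed fraction should lie close to the initial frequency of 0.3, although any single run shows sampling noise.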
Importance of genetic drift
 Two causes for allele substitutions:
 Selection -> adaptive evolution
 Genetic drift ->non-adaptive evolution
 Most populations are geographically structured
 All populations are finite in size
 All genetic variation is subject to genetic drift but
not necessarily to selection
 Genetic drift serves as a null hypothesis against which
evidence for selection has to be tested
Effective population size
 Effective population size < Census size (in most cases)
 The effective population size is the size of an "ideal" population
having the genetic properties of the studied population
 The effective population size is determined by
• Large variation in the number of offspring
• Overlapping generations
• Fluctuations in population size: $\frac{1}{N_e} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{N_i}$ (harmonic mean)
• Unequal numbers of males and females contributing to reproduction: $N_e = \frac{4 N_f N_m}{N_f + N_m}$
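The effect of an unequal sex ratio on the effective size is easy to see with numbers. This sketch evaluates $N_e = 4 N_f N_m / (N_f + N_m)$ for a few hypothetical breeding populations of 100 individuals; the function name and the example values are illustrative.

```python
def ne_sex_ratio(n_females: float, n_males: float) -> float:
    """Effective size with unequal numbers of breeding females and males."""
    return 4.0 * n_females * n_males / (n_females + n_males)


# 100 breeding individuals in each case, with increasingly skewed sex ratios
for nf, nm in ((50, 50), (75, 25), (90, 10), (99, 1)):
    print(nf, nm, round(ne_sex_ratio(nf, nm), 1))
```

Only the balanced case reaches the census number; with 90 females and 10 males the effective size drops to 36.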
Effective population size
 What is an ideal population like?
 (Remember: each parent has an equal and independent chance of
being the parent of each descendant allele.)
 This is approximated by a Poisson distribution of reproductive success.
 (Reproductive success = # of offspring per parent, or per parental
allele.)
Effective population size
If $F_t = F_{t-1}$:
$$F_t = \frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right) F_{t-1}$$
$$N_e = \infty$$
Effective population size
If $F_t = 1$:
$$F_t = \frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right) F_{t-1}$$
$$N_e = \frac{1}{2}$$
Effective population size
 For diploids:
$$N_e = \frac{4N - 2}{V + 2}$$
 where V is the variance in reproductive success among
diploid individuals. (Note that the variance among individuals
in reproductive success with a Poisson distribution is 2 in a
steady-state population, so that $N_e = N - \frac{1}{2}$.)
Note also that $N_e$ can be as great as 2N - 1 if there is no
variance in reproductive success. This fact is often used in
animal and plant breeding to slow the loss of genetic variation.
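The two special cases mentioned in the note follow directly from the formula. This is a small sketch of $N_e = (4N - 2)/(V + 2)$; the function name is illustrative.

```python
def ne_from_variance(n: float, v: float) -> float:
    """Effective size of n diploids with variance v in reproductive success."""
    return (4.0 * n - 2.0) / (v + 2.0)


n = 100
print(ne_from_variance(n, 2.0))   # Poisson case: N - 1/2
print(ne_from_variance(n, 0.0))   # no variance: 2N - 1
print(ne_from_variance(n, 10.0))  # a few parents dominate reproduction
```

The larger the variance in reproductive success, the smaller the effective size relative to N.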
Effective population size
$$1 - F_t = \left(1 - \frac{1}{2N_{t-1}}\right)\left(1 - \frac{1}{2N_{t-2}}\right)\left(1 - \frac{1}{2N_{t-3}}\right) \cdots \left(1 - F_0\right)$$
 Population size in natural populations
does not remain constant
 Ne with population size fluctuations is
approximately the harmonic mean of N
over time:
 The harmonic mean is very sensitive to
small values.
 ($N_e \ll \bar{N}$, the arithmetic mean, if N is variable)
$$\frac{1}{N_e} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{N_i} \quad \text{(harmonic mean)}$$
$$N_e = \frac{4 N_f N_m}{N_f + N_m}$$
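How strongly a single small generation pulls down the harmonic mean can be shown with a short calculation. The sketch below uses a hypothetical series of population sizes with one crash; the function name and numbers are illustrative.

```python
def ne_harmonic(sizes: list[float]) -> float:
    """Harmonic mean of per-generation population sizes (approximate N_e)."""
    return len(sizes) / sum(1.0 / n for n in sizes)


# Four generations, one of which is a crash to 10 individuals
sizes = [1000, 1000, 10, 1000]
arithmetic_mean = sum(sizes) / len(sizes)
print(round(arithmetic_mean, 1), round(ne_harmonic(sizes), 1))
```

The arithmetic mean is about 750, whereas the harmonic mean is below 40: the one small generation dominates the effective size.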
Backwards: the coalescent approach
Simplification: 0, 1 or 2 offspring
Coalesce: have the same parent
Probability to coalesce: 1/N
Probability not to coalesce: 1 – 1/N
Probability not to coalesce in t generations: $(1 - 1/N)^t$
Average time to coalesce for 2 genes: N generations
For the whole population: 2N generations
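The average coalescence time of N generations for two genes follows from the per-generation probability 1/N. The sketch below takes N as used on the slide and sums over the waiting-time distribution, truncated at a large number of generations; the function names and the truncation point are assumptions made for illustration.

```python
def prob_not_coalesced(n: int, t: int) -> float:
    """Probability that two genes have not coalesced after t generations."""
    return (1.0 - 1.0 / n) ** t


def mean_coalescence_time(n: int, max_t: int = 10_000) -> float:
    """Expected waiting time: sum of t * P(first coalescence in generation t).

    The sum is truncated at max_t, which must be much larger than n.
    """
    return sum(
        t * (1.0 / n) * prob_not_coalesced(n, t - 1)
        for t in range(1, max_t + 1)
    )


n = 100
print(round(prob_not_coalesced(n, 100), 3))
print(round(mean_coalescence_time(n), 3))
```

The expected waiting time comes out at N generations, matching the value stated for two genes.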
Founder effects
 Another important small population effect is
the founder effect or founder principle. This
occurs when a small number of people have
many descendants surviving after a number
of generations. The result for a population is
often high frequencies of specific genetic
traits inherited from the few common
ancestors who first had them.
A new population emerges from a
relatively small group of people.
Population      # of founders   # of generations   Current size
Costa Rica      4,000           12                 2,500,000
Finland         500             80-100             5,000,000
Hutterites      80              14                 36,000
Japan           1,000           80-100             120,000,000
Iceland         25,000          40                 300,000
Newfoundland    25,000          16                 500,000
Quebec          2,500           12-16              6,000,000
Sardinia        500             400                1,660,000
Founder effect example
 In the Lake Maracaibo region of northwest
Venezuela, for instance, there is an extremely high
frequency of a severe genetically inherited
degenerative nerve disorder known as Huntington's
disease. Approximately 150 people in the area
during the 1990s had this fatal condition and more
than 1,000 others were at high risk of developing
it.
 All of the Lake Maracaibo region Huntington's
patients trace their ancestry to one woman who
moved into the area a little over a century ago. She
had an unusually large number of descendants and
was therefore the "founder" of this population with
its genetically inherited disease.
Founder effect example
It is also possible to find the results of the
founder effect even though the original
ancestors are unknown. For example,
South and Central American Indians were
nearly 100% type O for the ABO blood
system. Since nothing in nature seems to
strongly select for or against this trait, it is
likely that most of these people are
descended from a small band of closely
related "founders" who also shared this
blood type. They migrated into the region
from the north, mostly by the end of the
last Ice Age.
Bottleneck
In some species, there have been periods of
dramatic ecological crisis caused by changes
in natural selection, during which most
individuals died without passing on their
genes. The few survivors of these
evolutionary "bottlenecks" then were
reproductively very successful, resulting in
large populations in subsequent
generations. The consequence of this
bottleneck effect is the dramatic reduction in
genetic diversity of a species since most
variability is lost at the time of the bottleneck.
Migration
 A cline is a gradual change in allele frequency along a
geographic gradient
 Ecotypes are genetically distinct forms that are consistently
found in certain habitats.
 Changes in allele frequency can be mapped across geographical
or linguistic regions.
 Allele frequency differences between current populations can
be correlated to certain historical events.
 In contrast to selection and genetic drift, gene flow homogenizes
allele frequencies
 Genetic diversity is restored if immigrants carry new alleles or
alleles which are rare in the population
Patterns of geographic variation
 Sympatric, parapatric and allopatric variants
 Subspecies are recognizable geographic variants within a
species (their delimitation is usually debated)
 A hybrid zone is a region where genetically different
parapatric species or populations interbreed.
 Character displacement: sympatric populations of two
species differ more than allopatric populations
Allele distributions can reflect
historical events

Creutzfeldt-Jakob disease (CJD) is caused by a
mutation in the prion protein
 70% of families with CJD share the same allele
 Families from Libya, Tunisia, Italy, Chile and Spain
share a common haplotype.
 These populations were expelled from Spain in the
Middle Ages.
Genetic drift and mutation
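No mutation:
$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$$
[Diagram: gen t-1 to gen t; branch probabilities 1/(2N) and 1 - 1/(2N)]
With mutation at rate $\mu$:
$$F_t = \frac{1}{2N}(1 - \mu)^2 + \left(1 - \frac{1}{2N}\right)(1 - \mu)^2 F_{t-1}$$
Probability of neither of 2 alleles being mutated is $(1 - \mu)^2$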
Equilibrium Between Mutation
and Drift
 Run the recurrence equation over and over
and eventually it will settle down to an
equilibrium
[Diagram: gen t-1 to gen t; branch probabilities 1/(2N) and 1 - 1/(2N); label: probability of picking 2nd allele and it not being mutated]
$$F_t = \frac{1}{2N}(1 - \mu)^2 + \left(1 - \frac{1}{2N}\right)(1 - \mu)^2 F_{t-1}$$
$$\hat{F} = F_t = F_{t-1} \approx \frac{1}{1 + 4N\mu}$$
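A short worked derivation may help show where the equilibrium value comes from. It is a sketch that writes $\mu$ for the mutation rate, assumes $\mu$ is small, and sets $F_t = F_{t-1} = \hat{F}$ in the mutation-drift recurrence.

```latex
\begin{align*}
\hat{F} &= (1-\mu)^2\left[\frac{1}{2N} + \left(1-\frac{1}{2N}\right)\hat{F}\right] \\
\hat{F}\left[2N - (2N-1)(1-\mu)^2\right] &= (1-\mu)^2 \\
\hat{F} &= \frac{(1-\mu)^2}{2N - (2N-1)(1-\mu)^2} \\
&\approx \frac{1-2\mu}{1 + 4N\mu - 2\mu} \approx \frac{1}{1+4N\mu}
\end{align*}
```

The second line multiplies through by 2N and collects the terms in $\hat{F}$; the last line uses $(1-\mu)^2 \approx 1 - 2\mu$ and drops terms of order $\mu$ that are not multiplied by N.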
Gene flow
[Diagram: population of 2N alleles, migration rate m]
Probability of identity: $F_t = \left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}\right] (1 - m)^2$
(assumes that the immigrants are different from each other)
[Figure: two panels, N = 100 with m = 0 and N = 100 with m = 0.01; vertical axis 0 to 0.9, horizontal axis 0 to 200]
(Mutation has the same effect)
[Diagram: population of 2N alleles, mutation rate μ]
Probability of identity: $F_t = \left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}\right] (1 - \mu)^2$
Assumes that each mutation creates a new allele: Infinite allele model
Mutation also retards the loss of genetic variability due to genetic drift
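The two scenarios from the slide (N = 100 with m = 0 and with m = 0.01, over 200 generations) can be reproduced by iterating the probability-of-identity recurrence. This is a minimal sketch assuming $F_0 = 0$; the function name is illustrative, and the same code applies to mutation by passing μ as the rate.

```python
def identity_trajectory(n: int, rate: float, generations: int,
                        f0: float = 0.0) -> list[float]:
    """Iterate F_t = [1/(2N) + (1 - 1/(2N)) * F_{t-1}] * (1 - rate)^2.

    `rate` is the migration rate m (or the mutation rate mu).
    """
    f = [f0]
    for _ in range(generations):
        drift_term = 1.0 / (2 * n) + (1.0 - 1.0 / (2 * n)) * f[-1]
        f.append(drift_term * (1.0 - rate) ** 2)
    return f


no_flow = identity_trajectory(n=100, rate=0.0, generations=200)
with_flow = identity_trajectory(n=100, rate=0.01, generations=200)
print(round(no_flow[-1], 3), round(with_flow[-1], 3))
```

Without gene flow the probability of identity keeps rising toward 1, while with m = 0.01 it levels off at a much lower value, close to 1/(4Nm + 1).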
Equilibrium
 After a long time (the longer the larger the
population) there will be an equilibrium
between genetic drift, gene flow and
mutation
$F_t = \left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}\right] (1 - m - \mu)^2$
 F and H will not change any more (if
everything remains constant!)
Mutation – drift equilibrium: $F \approx \frac{1}{4N\mu + 1}$
Migration – drift equilibrium: $F \approx \frac{1}{4Nm + 1}$
Migration Models – Island Model One Way Migration
m is the probability that a randomly
chosen allele is a migrant
[Diagram: mainland with allele frequency $p^*$; island with allele frequency $p$]
Change of allele frequency with one-way migration
• A is fixed on the island
• a arrives from the mainland at rate m = 0.01
$p = P(A)$ in generation 1 on the island
$p' = P(A)$ in generation 2 on the island
$$p' = (1 - m)\,p + m\,p^*$$
$$p' - p^* = (1 - m)(p - p^*)$$
$$p_t - p^* = (1 - m)^t (p_0 - p^*)$$
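The step from the one-generation recursion to the closed form after t generations can be checked numerically. The sketch below uses the slide's scenario (A fixed on the island, m = 0.01) and additionally assumes the mainland is fixed for a, so that p* = 0; that value and the function names are assumptions made for illustration.

```python
def iterate_one_way(p0: float, p_star: float, m: float, t: int) -> float:
    """Apply p' = (1 - m) * p + m * p_star for t generations."""
    p = p0
    for _ in range(t):
        p = (1.0 - m) * p + m * p_star
    return p


def closed_form_one_way(p0: float, p_star: float, m: float, t: int) -> float:
    """Closed form p_t = p_star + (1 - m)^t * (p0 - p_star)."""
    return p_star + (1.0 - m) ** t * (p0 - p_star)


# A fixed on the island (p0 = 1); mainland assumed fixed for a (p_star = 0)
for t in (10, 100, 500):
    print(t,
          round(iterate_one_way(1.0, 0.0, 0.01, t), 4),
          round(closed_form_one_way(1.0, 0.0, 0.01, t), 4))
```

Both columns agree, and the island frequency decays geometrically toward the mainland frequency.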
Mutation and Migration
Mutation, $\mu = 10^{-4}$: $p_t = p_0 (1 - \mu)^t$
Migration, $m = 10^{-2}$: $p_t - p^* = (1 - m)^t (p_0 - p^*)$
Example: $0.484 - 0.507 = (1 - m)^{10} (0.474 - 0.507)$
Estimation of m
10 generations
Present allele frequency in "island": $p_t = 0.484$
Allele frequency in mainland: $p^* = 0.507$
Initial allele frequency in island: $p_0 = 0.474$
m = proportion of migrant alleles each generation
$$0.484 - 0.507 = (1 - m)^{10} (0.474 - 0.507)$$
$$(1 - m)^{10} = \frac{0.484 - 0.507}{0.474 - 0.507}$$
$$1 - m = \sqrt[10]{\frac{0.484 - 0.507}{0.474 - 0.507}} \approx 0.9645$$
$$m \approx 0.035$$
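The estimate of m can be reproduced by solving the one-way migration formula for m. This is a small sketch using the values given on the slide; the function name is illustrative.

```python
def estimate_m(p0: float, pt: float, p_star: float, t: int) -> float:
    """Solve p_t - p* = (1 - m)^t * (p0 - p*) for the migration rate m."""
    ratio = (pt - p_star) / (p0 - p_star)
    return 1.0 - ratio ** (1.0 / t)


# Slide values: p0 = 0.474, p_t = 0.484, p* = 0.507, t = 10 generations
m = estimate_m(p0=0.474, pt=0.484, p_star=0.507, t=10)
print(round(m, 4))
```

The result is about 3.5% migrant alleles per generation, in line with the value on the slide.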
Island Model of Migration
Many large subpopulations
Average allele frequency $\bar{p}$ = frequency in migrants
The change in allele frequency in the subpopulations:
$$p_t = \bar{p} + (1 - m)^t (p_0 - \bar{p})$$
Example with m = 0.10 and $\bar{p} = \frac{0.2 + 0.8}{2} = 0.5$:
$p_0 = 0.2$: $p_{10} = 0.5 + (1 - 0.1)^{10} (0.2 - 0.5) = 0.395$
$p_0 = 0.8$: $p_{10} = 0.5 + (1 - 0.1)^{10} (0.8 - 0.5) = 0.605$
[Figure: change in allele frequency over time for five subpopulations with initial frequencies 1, 0.75, 0.5, 0.25, 0]
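The convergence of the five subpopulations can be followed generation by generation. The sketch below assumes equally sized subpopulations, so the migrant pool has the unweighted mean frequency, and uses m = 0.10 as in the two-subpopulation example; the function name is illustrative.

```python
def island_model_step(freqs: list[float], m: float) -> list[float]:
    """One generation of the island model.

    Each subpopulation receives a fraction m of migrants whose allele
    frequency is the mean over all subpopulations (equal sizes assumed).
    """
    p_bar = sum(freqs) / len(freqs)
    return [(1.0 - m) * p + m * p_bar for p in freqs]


freqs = [1.0, 0.75, 0.5, 0.25, 0.0]
for generation in range(1, 31):
    freqs = island_model_step(freqs, m=0.10)
    if generation % 10 == 0:
        print(generation, [round(p, 3) for p in freqs])
```

The mean frequency stays at 0.5 while every subpopulation moves toward it, each difference shrinking by the factor (1 - m) per generation.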
Summary
 Genetic drift: In a finite population allele frequencies fluctuate at
random and eventually one allele will be fixed
 After 4N generations all individuals descend from one ancestor
 Genetic diversity is lost more rapidly in small populations
 Inbreeding reduces the number of heterozygotes
 Inbred individuals can have lower fitness: inbreeding depression
 The genetic composition of isolated populations diverges under the
effect of genetic drift
 Gene flow homogenizes allele frequencies among populations
 After a long time, the genetic variability in a population reaches an
equilibrium level: mutation – immigration – drift equilibrium