Transcript statgen4
Genetic diversity and evolution
Content
Summary of previous class
H.W equilibrium
Effect of selection
Genetic Variance
Drift, mutations and migration
Hardy-Weinberg assumptions
If all assumptions were met, the population would not evolve.
Real populations generally do not meet all the assumptions:
Mutations may change allele frequencies or create new alleles
Selection may favour particular alleles or genotypes
Mating is not random -> changes in genotype frequencies
Population is not infinite -> random changes in allele frequencies: genetic drift
Immigrants may import alleles with different frequencies (or new alleles)
Fitness
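With genotype fitnesses 1-r (A1A1), 1 (A1A2) and 1-s (A2A2), the average fitness is:
$W = (1-r)p^2 + 2pq + (1-s)q^2 = 1 - rp^2 - sq^2$
The change in the frequency p of A1 in one generation is:
$\Delta p = \frac{(1-r)p^2 + pq}{W} - p = \frac{pq\,[s - (r+s)p]}{W}$
The fixed points [p, q] are:
[0, 1], [1, 0] and [s/(r+s), r/(r+s)]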
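The two expressions for the change in p and the three fixed points can be checked numerically. The following is a minimal sketch; the function names and the values r = 0.2, s = 0.1 are illustrative assumptions, not part of the lecture.

```python
def mean_fitness(p, r, s):
    """Average fitness W for genotype fitnesses 1-r (A1A1), 1 (A1A2), 1-s (A2A2)."""
    q = 1.0 - p
    return (1 - r) * p**2 + 2 * p * q + (1 - s) * q**2


def delta_p(p, r, s):
    """Change in the frequency p of A1 after one generation of selection."""
    q = 1.0 - p
    return ((1 - r) * p**2 + p * q) / mean_fitness(p, r, s) - p


def delta_p_closed_form(p, r, s):
    """The simplified form pq[s - (r+s)p] / W of the same quantity."""
    q = 1.0 - p
    return p * q * (s - (r + s) * p) / mean_fitness(p, r, s)


# Illustrative (assumed) selection coefficients.
r, s = 0.2, 0.1
fixed_points = [0.0, 1.0, s / (r + s)]

for p in fixed_points:
    print(f"p = {p:.4f}  delta_p = {delta_p(p, r, s):.2e}")
```

At each of the three fixed points the printed change is zero up to floating-point rounding, and the two functions agree for any p in between.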
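Heterozygote Advantage
$p_{n+1} - \frac{s}{r+s} = \frac{(1-r)p_n^2 + p_n q_n}{W} - \frac{s}{r+s}$
$= \frac{(1-r)p_n^2 + p_n q_n - \frac{s}{r+s} W}{W}$
$= \frac{(1-r)p_n^2 + p_n q_n - \frac{s}{r+s}(1 - r p_n^2 - s q_n^2)}{W}$
$= \frac{1 - r p_n - s q_n}{W} \left[ p_n - \frac{s}{r+s} \right]$
The difference decreases to zero only for positive r and s, because only
then is the factor $(1 - r p_n - s q_n)/W$ smaller than 1. Thus the scenario
in which both alleles can survive is heterozygote advantage.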
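The convergence argument above can be illustrated by iterating the recursion directly. This is a sketch under assumed values (r = 0.2, s = 0.1, starting frequency 0.9); the helper names are hypothetical.

```python
def next_p(p, r, s):
    """One generation of selection: p' = [(1-r)p^2 + pq] / W."""
    q = 1.0 - p
    w = 1.0 - r * p**2 - s * q**2
    return ((1 - r) * p**2 + p * q) / w


def iterate(p0, r, s, generations):
    """Apply next_p repeatedly, starting from p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, r, s)
    return p


# Heterozygote advantage (both r and s positive): p approaches s/(r+s).
r, s = 0.2, 0.1
p_star = s / (r + s)
p_final = iterate(0.9, r, s, 2000)
print(p_final, p_star)

# Heterozygote disadvantage (both negative): the interior point is unstable
# and one allele is lost; which one depends on the starting frequency.
p_lost = iterate(0.4, -0.2, -0.1, 2000)
print(p_lost)
```

With positive r and s the iteration settles on the interior fixed point; with both negative, a start slightly above s/(r+s) drifts to p = 1.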
Recessive diseases
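If r>0 and s=0, the disadvantage appears only in A1 homozygotes.
In this case: $p_{n+1} = \frac{p_n (1 - r p_n)}{1 - r p_n^2}$
$\frac{1}{p_{n+1}} - \frac{1}{p_n} = \frac{1}{p_n} \left[ \frac{1 - r p_n^2}{1 - r p_n} - 1 \right] = \frac{r (1 - p_n)}{1 - r p_n} \approx r$ for small $p_n$
$\frac{1}{p_n} - \frac{1}{p_0} \approx nr$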
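The slow decay of a recessive deleterious allele can be compared with the approximation 1/p_n - 1/p_0 ≈ nr. A minimal sketch, assuming p0 = 0.05 and r = 0.5 purely for illustration:

```python
def next_p_recessive(p, r):
    """Exact recursion for s = 0: p' = p(1 - r p) / (1 - r p^2)."""
    return p * (1 - r * p) / (1 - r * p**2)


def approx_p(n, p0, r):
    """Approximation p_n ~ 1 / (n r + 1/p0), valid when p is small."""
    return 1.0 / (n * r + 1.0 / p0)


# Illustrative (assumed) starting frequency and selection coefficient.
p0, r = 0.05, 0.5
n_generations = 200

p = p0
for _ in range(n_generations):
    p = next_p_recessive(p, r)

p_approx = approx_p(n_generations, p0, r)
relative_error = abs(p - p_approx) / p_approx
print(p, p_approx, relative_error)
```

The exact frequency stays slightly above the approximation, since each step increases 1/p by a little less than r.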
Fitness Summary
The third fixed point is in the range [0,1] only if r
and s have the same sign.
It is stable only if both r and s are positive.
In all other cases one allele goes extinct.
If r>0 and s=0 then the steady state is still
p=0, but it is approached only slowly, as
$p_n \approx 1/(nr + 1/p_0)$
New Concepts
Genetic Variation
Genetic drift
Founder effects
Bottleneck effect
Mutations
Selection
Non Random Mating
Migration
Genetic Variation
Three fundamental levels, each a genetic resource
of potential importance to conservation:
1) Genetic variation within individuals (heterozygosity)
2) Genetic differences among individuals within a population
3) Genetic differences among populations
Species rarely exist as a panmictic population (a single,
randomly interbreeding population).
Typically, genetic differences exist among populations;
these geographic genetic differences are a crucial component
of overall genetic diversity.
Heterozygosity
Several measures of heterozygosity exist. The value of these
measures will range from zero (no heterozygosity) to nearly 1.0
(for a system with a large number of equally frequent
alleles). We will focus primarily on expected heterozygosity
(HE, or gene diversity, D). The simplest way to calculate it for a
single locus is:
$H = 1 - \sum_{i=1}^{k} p_i^2$   (Eqn 4.1)
where $p_i$ is the frequency of the ith of k alleles. [Note
that p1, p2, p3 etc. may correspond to what you would normally
think of as p, q, r, s etc.] If we want the gene diversity over
several loci we need double summation and subscripting as
follows:
$H = 1 - \frac{1}{m} \sum_{j=1}^{m} \sum_{i=1}^{k} p_{ij}^2$
where $p_{ij}$ is the frequency of the ith allele at the jth of m loci.
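The single-locus and multi-locus gene diversity formulas translate directly into code. This is a sketch only: the function names are hypothetical, and the multi-locus version assumes the value is averaged over loci.

```python
def expected_heterozygosity(freqs):
    """Single-locus gene diversity H = 1 - sum(p_i^2).

    `freqs` holds the allele frequencies at one locus and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def mean_expected_heterozygosity(loci):
    """Gene diversity averaged over loci (assumption: simple mean over loci).

    `loci` is a list of per-locus allele-frequency lists.
    """
    return sum(expected_heterozygosity(f) for f in loci) / len(loci)


two_alleles = expected_heterozygosity([0.5, 0.5])
ten_alleles = expected_heterozygosity([0.1] * 10)
two_loci = mean_expected_heterozygosity([[0.5, 0.5], [1.0]])
print(two_alleles, ten_alleles, two_loci)
```

A locus fixed for one allele contributes zero, so the two-locus example averages 0.5 and 0 to give 0.25.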
Heterozygosity
Under Hardy-Weinberg equilibrium, heterozygosity is given by 2pq. The rest
of the expression ($p^2 + q^2$) is the homozygosity.
What does heterozygosity tell us and what patterns emerge as we go to
multi-allelic systems? Let’s take an example. Say p = q = 0.5; then
2pq = 0.5. The heterozygosity for a two-allele system is described by a
concave-down parabola that starts at zero (when p = 0), rises to a maximum
of 0.5 at p = 0.5 and falls back to zero when p = 1. In fact, for any
multi-allelic system, heterozygosity is greatest when
p1 = p2 = p3 = ... = pk
The maximum heterozygosity for a 10-allele system comes when each
allele has a frequency of 0.1; D or HE then equals 0.9. Later, we will
see that the simplest way to view FST (a measure of the differentiation of
subpopulations) will be as a function of the difference between the
observed heterozygosity, Ho, and the expected heterozygosity, HE.
Genetic Variation
HT = HP + DPT
where HT = total genetic variation (heterozygosity) in the
species;
HP = average diversity within populations (average
heterozygosity)
DPT = average divergence among populations across total
species range
*Divergence arises among populations from random processes (founder effects,
genetic drift, bottlenecks, mutations) and from local selection.
Genetic differentiation
Inbreeding coefficients can be used to measure genetic diversity
at different hierarchical levels:
Individual
Subpopulation
Total population
Wright’s F statistics
Used to measure genetic differentiation
Sometimes called fixation index
Defines the reduction in heterozygosity at any
one level of the population hierarchy relative to
any other
Levels: Individual - Subpopulation - Total
Wright’s F statistics
Heterozygosity based on allele frequencies,
H = 2pq.
HI, HS, HT refer to the average heterozygosity
within individuals, subpopulations and the
total population, respectively
Wright’s F statistics
Drop in heterozygosity defined as
$F_{ST} = \frac{H_T - H_S}{H_T}$
$F_{IS} = \frac{H_S - H_I}{H_S}$
$F_{IT} = \frac{H_T - H_I}{H_T}$
Example
2 subpopulations, gene frequencies p1 = 0.8, p2 = 0.3.
Gene frequency in total population midway between them
pt = 0.55
HS1 = 2p1q1 = 2 x 0.8 x (1-0.8) = 0.32
HS2 = 2p2q2 = 2 x 0.3 x (1-0.3) = 0.42
HS = average(HS1, HS2) = (0.32 + 0.42)/2 = 0.37
HT = 2 x 0.55 x (1 - 0.55) = 0.495
$F_{ST} = \frac{H_T - H_S}{H_T} = \frac{0.495 - 0.37}{0.495} \approx 0.2525$
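The worked example can be reproduced in a few lines. This sketch assumes a biallelic locus and equally sized subpopulations (so the total allele frequency is the plain mean, as in the example); the function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted subpopulations.

    Undefined (division by zero) if every subpopulation is fixed
    for the same allele, because H_T is then zero.
    """
    n = len(subpop_freqs)
    p_total = sum(subpop_freqs) / n
    h_s = sum(heterozygosity(p) for p in subpop_freqs) / n
    h_t = heterozygosity(p_total)
    return (h_t - h_s) / h_t


example = fst([0.8, 0.3])
print(round(example, 4))  # 0.2525
```

Two subpopulations with identical allele frequencies give F_ST = 0, since H_S then equals H_T.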
Identity by descent
Imagine a self-fertilising plant A with genotype 1,2:
A (1,2) x A (1,2)
        |
        X (?)
1/4 of offspring will be of genotype 1,1
1/2 of offspring will be of genotype 1,2
1/4 of offspring will be of genotype 2,2
FX (inbreeding coefficient) is the probability that the two alleles of X
are IBD = 1/2
Equivalently, let fAA be the probability of 2 gametes taken at
random from A being IBD; then FX = fAA = 1/2.
Mutation occurred once
•Every mutation creates a new allele
•Identity in state = identity by descent (IBD)
[Figure: pedigree of A1A1, A1A2 and A2A2 genotypes in which the mutation arose once]
The same mutation arises independently
[Figure: pedigree of A1A1, A1A2 and A2A2 genotypes; two A2 A2 homozygotes are
labelled IBD, a third A2 A2 is labelled alike in state (AIS), not identical by descent]
Identity by descent
A - B     C - D
  |         |
  P    -    Q
       |
       X
Let fAC be the coancestry of A with C etc., i.e. the probability of
2 gametes taken at random, one from A and one from C, being
IBD.
The probability of two gametes, one from P and one from Q, being
IBD is FX:
$F_X = f_{PQ} = \frac{1}{4} f_{AD} + \frac{1}{4} f_{AC} + \frac{1}{4} f_{BC} + \frac{1}{4} f_{BD}$
Identity by descent
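Example, imagine a full-sib mating
  A - B
   / \
  P - Q
    |
    X
Individual X has 2 alleles; what is the probability that they are IBD?
$F_X = f_{PQ} = \frac{1}{4} f_{AD} + \frac{1}{4} f_{AC} + \frac{1}{4} f_{BC} + \frac{1}{4} f_{BD}$
Here C = A and D = B. With A and B unrelated and not inbred
(fAB = 0, fAA = fBB = 1/2):
$F_X = \frac{1}{4} \left( 2 f_{AB} + f_{AA} + f_{BB} \right) = \frac{1}{4} \left( 0 + \frac{1}{2} + \frac{1}{2} \right) = \frac{1}{4}$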
Identity by descent
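Example, imagine a half-sib mating
A - B - C
  |   |
  P - Q
    |
    X
$F_X = f_{PQ} = \frac{1}{4} f_{AD} + \frac{1}{4} f_{AC} + \frac{1}{4} f_{BC} + \frac{1}{4} f_{BD}$
Here P has parents A and B, and Q has parents B and C. With A, B and C
unrelated and not inbred:
$F_X = \frac{1}{4} \left( f_{AB} + f_{AC} + f_{BC} + f_{BB} \right) = \frac{1}{4} \left( 0 + 0 + 0 + \frac{1}{2} \right) = \frac{1}{8}$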
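The coancestry rule used in these examples can be applied recursively to any small pedigree. The sketch below is one possible implementation, not the lecture's own method; it assumes founders are unrelated and not inbred, and that the pedigree dictionary lists parents before their offspring. All names are hypothetical.

```python
from functools import lru_cache


def make_coancestry(pedigree):
    """Return a function f(a, b) giving the coancestry of a and b.

    `pedigree` maps each individual to a (parent, parent) tuple,
    or to None for a founder. Parents must be listed before offspring.
    """
    order = {name: i for i, name in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def f(a, b):
        if a == b:
            parents = pedigree[a]
            if parents is None:
                return 0.5  # non-inbred founder: two gametes IBD with prob. 1/2
            return 0.5 * (1.0 + f(*parents))
        # Expand the individual listed later, so it cannot be an ancestor of the other.
        if order[a] < order[b]:
            a, b = b, a
        parents = pedigree[a]
        if parents is None:
            return 0.0  # founders are assumed unrelated to non-descendants
        return 0.5 * (f(parents[0], b) + f(parents[1], b))

    return f


def inbreeding(pedigree, x):
    """F_X is the coancestry of the two parents of X."""
    return make_coancestry(pedigree)(*pedigree[x])


selfing = {"A": None, "X": ("A", "A")}
full_sib = {"A": None, "B": None, "P": ("A", "B"), "Q": ("A", "B"), "X": ("P", "Q")}
half_sib = {"A": None, "B": None, "C": None,
            "P": ("A", "B"), "Q": ("B", "C"), "X": ("P", "Q")}

print(inbreeding(selfing, "X"))   # 0.5
print(inbreeding(full_sib, "X"))  # 0.25
print(inbreeding(half_sib, "X"))  # 0.125
```

The three printed values match the selfing, full-sib and half-sib results derived by hand.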
Mutations
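$\mu = 0.0001$
With one-way mutation, $p_t = p_0 (1-\mu)^t$. The frequency is halved
($p_t = 0.5\,p_0$) when
$0.5 = (1-\mu)^t$
$t_{1/2} = \frac{\ln 0.5}{\ln (1-\mu)} \approx 6931$ generations
A mutates to a at the rate μ
a reverts back to A at the rate ν
The equilibrium value for the frequency of A
is given by
$\hat{p} = \frac{\nu}{\mu + \nu}$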
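Both mutation results can be checked numerically. In this sketch the two-way recursion p' = p(1 - μ) + (1 - p)ν is an assumption (the standard forward/back mutation model, which the slide does not write out), and the rates used for the equilibrium check are chosen large only so that the loop converges quickly.

```python
import math


def half_life(mu):
    """Generations for one-way mutation at rate mu to halve the frequency of A."""
    return math.log(0.5) / math.log(1.0 - mu)


def equilibrium_frequency(mu, nu):
    """Equilibrium frequency of A with forward rate mu and back rate nu."""
    return nu / (mu + nu)


def iterate_two_way(p0, mu, nu, generations):
    """Assumed recursion: A is lost at rate mu and regained from a at rate nu."""
    p = p0
    for _ in range(generations):
        p = p * (1.0 - mu) + (1.0 - p) * nu
    return p


t_half = half_life(0.0001)
print(round(t_half))  # 6931

# Illustrative (assumed) rates, much larger than realistic mutation rates.
mu, nu = 0.01, 0.005
p_end = iterate_two_way(0.9, mu, nu, 5000)
print(p_end, equilibrium_frequency(mu, nu))
```

The iterated frequency approaches ν/(μ+ν) regardless of the starting value.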
SNP
Single Nucleotide Polymorphism (SNP) =
naturally occurring variant that affects a single
nucleotide
-predominant form of segregating variation
at the molecular level
SNPs are classified according to the nature of
the nucleotide that is affected
-Noncoding SNPs
  5' or 3' nontranscribed region (NTR)
  5' or 3' untranslated region (UTR)
  introns
  intergenic spacers
-Coding SNPs
  replacement polymorphisms
  synonymous polymorphisms
-Transitions [A to G OR C to T]
-Transversions [A/G to C/T OR C/T to A/G]
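The transition/transversion distinction reduces to asking whether both bases belong to the same chemical class (purines A, G or pyrimidines C, T). A minimal sketch with a hypothetical function name:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base change as a 'transition' or a 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(ref, alt, classify_substitution(ref, alt))
```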
Natural Selection
Tuberculosis (TB) infections have historically
swept across susceptible populations, killing many.
TB epidemic among Plains Indians of Qu’Appelle
Valley Reservation:
Year     Annual deaths
1880s    10%
1921     7%
1950     0.2%
Nonrandom mating
Random mating occurs when individuals of one genotype
mate randomly with individuals of all other genotypes.
Nonrandom mating means that individuals of one genotype
reproduce more often with each other than expected by chance
Ethnic or religious preferences
Isolated communities
Worldwide, 1/3 of all marriages are between people born within 10
miles of each other
Cultures in which consanguinity is more prominent
Consanguinity is marriage between relatives,
e.g. second or third cousins