Slides-Brian_Charlesworth-Sex_and_molecular_evolution

Sex and molecular evolution
Brian Charlesworth
Institute of Evolutionary Biology
School of Biological Sciences
University of Edinburgh
• Sex is the most prevalent mode of reproduction
among the great division of life (the eukaryotes)
that includes the animals, green plants, algae,
fungi and protozoa.
• All mammals and all birds reproduce sexually,
and there are only a few dozen examples of
asexually reproducing species among reptiles,
amphibia and fish.
• Only about 0.1% of the more than 300,000 species of
flowering plants are thought to reproduce
asexually.
An exception to the recent origin of asexual species
The Bdelloid rotifer Philodina roseola
(Meselson laboratory)
• A regular cycle of sexual reproduction is absent
from the other division of life (the prokaryotes, such as
bacteria) and from viruses.
• There is, however, often detectable exchange of
pieces of genetic information between
individuals within prokaryote populations,
involving a variety of processes that act as a
substitute for sex.
The essence of sexual reproduction is
the reshuffling of genetic information
derived from the two parents of an
individual (genetic recombination).
To understand the evolutionary
significance of sex, we need to
understand the significance of
recombination.
Structure of DNA
How evolution works
• Evolution involves the transformation of variation between
members of a population into differences between ancestral and
descendant populations
• At the level of DNA sequences, this variation can be studied by
comparing the sequences of the same region of the genome in
different individuals
The gene for glucose-6-phosphate dehydrogenase
The set of basepairs for a given DNA sequence is known
as a haplotype
• The most general description of the state of a population would
thus be a list of the frequencies of all possible haplotypes.
• We could then characterise evolutionary change in terms of the
rates of change of the frequency of each haplotype, x_i, where
x_i is the frequency of the ith haplotype.
• In practice, we usually collect data on, or model the evolution
of, only limited portions of the genome, the simplest level being
that of a single basepair.
The genetic processes of evolution
• Mutation: changes in the sequence of DNA that occur during
transmission of a chromosome from parent to offspring.
• Natural selection: differences in fitness (survival and
reproductive success) between individuals with different genetic
make-ups.
• Recombination: reshuffling of genetic material between the
chromosomes derived from different parents.
• Genetic drift: random fluctuations in the frequencies of genetic
variants, caused by the finite size of the population.
R. A. Fisher
J.B.S. Haldane
Sewall Wright
Mutation rates
• The most common type of mutation is a change from one
basepair to another, e.g. GC mutates to AT.
• Direct estimates have recently been made in several species of
animals and plants, and show that the probability that a given site in
the DNA changes its state is of the order of 10^-9 to 10^-8 per
generation; this is the mutation rate per basepair.
• This means that mutation is a very weak force, operating on a
timescale of hundreds of millions of generations.
• Nevertheless, it is crucial for evolutionary change to happen.
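The timescale quoted above follows from simple arithmetic: if a site mutates with probability u per generation, the expected wait for a change at that site is about 1/u generations. A minimal sketch of that calculation (the function name is illustrative, not from the slides):

```python
def expected_generations_per_change(u):
    """Expected number of generations before a given site mutates,
    assuming a constant per-site mutation probability u per generation."""
    return 1.0 / u

# The two ends of the range of per-basepair mutation rates quoted above.
for u in (1e-9, 1e-8):
    wait = expected_generations_per_change(u)
    print(f"u = {u:.0e} per site per generation: about {wait:.0e} generations per change")
```

With u between 10^-9 and 10^-8, the wait is between a hundred million and a billion generations, which is why mutation alone changes a site's state so slowly.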
Measuring the mutation rate
[Figure: mutation accumulation experiment. An initial isogenic stock is used to found many separate single-pair mated lines, which are maintained for ~200 generations to allow mutations to arise. DNA is then extracted and sequenced, or mutations are detected by special methods.]
Selection
The simplest form of selection is when two alternative variants
at a given site in the genome confer differences in fitness on
their carriers.
Let one variant be called A1 and the other A2.
Let the ratio of the fitnesses of A2 and A1 individuals be 1 – s.
The quantity s measures the intensity of selection, and is called
the selection coefficient.
If q is the frequency of A2, the change in q per generation is Δq, given
approximately by –sq(1 – q) when s is small (negative for s > 0, since
A2 is then the less fit variant).
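The approximation for the change in q can be checked against an exact one-generation recursion. The sketch below assumes the simplest case, a haploid (or purely genic) model in which A1 has fitness 1 and A2 has fitness 1 – s; that model and the function names are assumptions made for illustration, not details given on the slide. With A2 the less fit variant, the change in its frequency is negative.

```python
def next_q_exact(q, s):
    """Frequency of A2 after one generation of selection.
    Assumes haploid selection: fitness 1 for A1 and 1 - s for A2."""
    mean_fitness = (1.0 - q) * 1.0 + q * (1.0 - s)
    return q * (1.0 - s) / mean_fitness

def delta_q_approx(q, s):
    """Weak-selection approximation to the change in q."""
    return -s * q * (1.0 - q)

q, s = 0.5, 0.01
exact_change = next_q_exact(q, s) - q
print("exact change:      ", exact_change)
print("approximate change:", delta_q_approx(q, s))
```

The two values differ by a factor of 1/(1 – sq), which is close to 1 when s is small.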
The peppered moth Biston betularia, with
melanic (dark) and non-melanic forms
Genetic recombination
Parental combinations of two variants:
A1 B1: A T A G C T T G A C C T A T G
A2 B2: A A A G C C T G A G C T A T A
Recombination between A and B sites or between polymorphic sites in a
DNA sequence
Recombinant haplotypes:
A2 B1: A A A G C C T G A C C T A T G
A1 B2: A T A G C T T G A G C T A T A
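The recombinant haplotypes in this diagram can be reproduced by exchanging the two parental sequences at a single breakpoint. The sketch below is illustrative only: the function name and the choice of a breakpoint after the ninth base are assumptions, chosen because that position lies between the variable sites on the left and those on the right.

```python
def single_crossover(seq_a, seq_b, breakpoint):
    """Exchange the segments to the right of `breakpoint` between two
    equal-length sequences and return the two recombinant products."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    product_1 = seq_a[:breakpoint] + seq_b[breakpoint:]
    product_2 = seq_b[:breakpoint] + seq_a[breakpoint:]
    return product_1, product_2

parent_a1b1 = "ATAGCTTGACCTATG"
parent_a2b2 = "AAAGCCTGAGCTATA"

a1b2, a2b1 = single_crossover(parent_a1b1, parent_a2b2, breakpoint=9)
print("A2 B1:", a2b1)
print("A1 B2:", a1b2)
```

Any breakpoint between the last variable site on the left and the first on the right gives the same two products, because the sequences are identical in between.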
Genetic drift
• The frequencies of genetic variants in one generation are a
random draw from the frequencies of the parents in the previous
generation.
• This causes variant frequencies to experience a “random walk”
(genetic drift).
• The state of the population must then be characterised by a
probability distribution of variant frequencies, e.g. the
probability that variant A2 at a site takes frequency q at time t.
• The rate at which this distribution spreads out is
measured by the inverse of the effective size of the population,
Ne.
• The timescale of genetic drift is of the order of Ne generations.
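The random draw described above can be sketched as a Wright-Fisher simulation: each of the 2Ne gene copies in the next generation independently carries the variant with probability equal to its current frequency. This is a minimal sketch assuming a diploid population with no mutation or selection; the function name, seed and population sizes are illustrative choices, not values from the slides.

```python
import random

def wright_fisher(q0, n_e, generations, rng):
    """Trajectory of a variant's frequency under pure genetic drift.
    Assumes a diploid population, so 2 * n_e gene copies per generation."""
    n_copies = 2 * n_e
    q = q0
    trajectory = [q]
    for _ in range(generations):
        # Each copy in the next generation is a random draw from the
        # parental frequencies.
        count = sum(rng.random() < q for _ in range(n_copies))
        q = count / n_copies
        trajectory.append(q)
    return trajectory

rng = random.Random(42)
small_population = wright_fisher(q0=0.5, n_e=16, generations=50, rng=rng)
large_population = wright_fisher(q0=0.5, n_e=1000, generations=50, rng=rng)
print("Ne = 16:  ", [round(q, 2) for q in small_population[::10]])
print("Ne = 1000:", [round(q, 2) for q in large_population[::10]])
```

Repeating the run with different seeds shows the frequency wandering much further in the small population than in the large one over the same number of generations, in line with a drift timescale of the order of Ne generations.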
Drift in lab populations of the fruitfly Drosophila melanogaster
[Figure: numbers of copies of the bw eye-colour variant, plotted across generations]
The general equation of evolution
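(forward diffusion equation)
$$
\frac{\partial \phi(\mathbf{x}, t)}{\partial t} = -\sum_i \frac{\partial \left[\phi(\mathbf{x}, t)\, \Delta x_i\right]}{\partial x_i} + \frac{1}{2} \sum_{i,j} \frac{\partial^2 \left[\phi(\mathbf{x}, t)\, C_{ij}\right]}{\partial x_i\, \partial x_j}
$$
Here, x is the vector of haplotype frequencies, φ(x, t) is the
probability density of x at time t, Δx_i is the deterministic change in
the frequency of the ith haplotype, and C_ij is the covariance between the
random changes in frequencies of haplotypes i and j (the variance when i = j).
The C_ij are all proportional to 1/Ne; this means that we can multiply
both sides by Ne, and work with time measured in units of Ne generations
(t/Ne) and with Ne Δx_i instead of Δx_i.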
What has all this got to do with the evolution of
sex and recombination?
• In order to understand how sexual reproduction and genetic
recombination influence the evolutionary process, we need to
have well-formulated models that can be related to data.
• To produce these models, we need to include processes that are
likely to be operating in the real world.
• Before introducing them, let’s look at some patterns that are
revealed by studying DNA sequence variation and evolution.
• Differences between regions of the genome that
experience different levels of genetic recombination have
proved particularly useful for revealing these patterns.
The Drosophila melanogaster genome
(heterochromatin in black)
Some correlates of low versus high
recombination
1. Regions of the genome with unusually low rates of genetic
recombination often seem to have low levels of within-species
DNA sequence variability.
2. Species with low levels of genome-wide recombination, such
as largely self-fertilizing plants and animals, also show
reduced variability.
3. The level of adaptation at the protein and DNA sequence level
is often reduced in non-recombining genomic regions.
The fourth/dot chromosome in D. melanogaster
1.2 million basepairs (Mb) / ~80 genes
Low recombination:
No crossing over under normal lab conditions
Characteristics of the dot chromosome:
Low silent site variability (about 10% of genomewide mean)
Divergence between D. melanogaster and D. yakuba
P. Haddrill et al. 2007 Genome Biology 8:R18
Drosophila miranda neo-sex chromosomes
Polymorphisms on the two neo-sex chromosomes
Neo-X
Neo-Y
Mean silent diversities: neo-X = 0.39% , neo-Y = 0.004%
A general feature of low recombination
genome regions
A lack of recombination among a set of genes in a genome or
genomic region means that the evolutionary fates of mutations
at different sites are not independent of each other, so that they
can interfere with each other’s evolution.
This is the Hill-Robertson effect.
(Hill and Robertson 1966 Genetical Research 8: 269-294)
The Hill-Robertson effect
Haplotype A1 B1: fitness = 0.9
Mutation A1 → A2 gives haplotype A2 B1: fitness = 0.95
Mutation B1 → B2 gives haplotype A1 B2: fitness = 1
Maximum fitness possible with both advantageous mutations A2 and B2:
haplotype A2 B2, fitness = 1.05
• The effective population size (Ne) of large non-recombining
portions of the genome is substantially reduced by such
interference among genes subject to selection.
• This leads to a reduction in the level of neutral variability in
DNA sequences
• Genes in low-recombination genomic regions are more likely to
accumulate deleterious mutations, and less likely to fix
selectively advantageous mutations, than genes in regions with
normal or high recombination rates, since the chance of spread
of a mutation with selection coefficient s is determined by the
magnitude of the product Ne s.
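The dependence on Ne s can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is not given on the slides; the sketch assumes genic selection (heterozygotes have fitness 1 + s, homozygotes 1 + 2s), a census size equal to Ne, and an initial frequency of 1/(2Ne). The function names and numerical values are illustrative.

```python
import math

def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for a single new mutation,
    assuming census size = Ne and genic selection with coefficient s."""
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral case
    # (1 - exp(-2s)) / (1 - exp(-4 Ne s)), written with expm1 for accuracy
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_e * s)

def relative_to_neutral(s, n_e):
    """Fixation probability divided by the neutral value 1/(2 Ne)."""
    return fixation_probability(s, n_e) * 2 * n_e

for n_e in (1000, 100):
    print(f"Ne = {n_e}:",
          "deleterious (s = -0.001):", round(relative_to_neutral(-0.001, n_e), 3),
          "advantageous (s = +0.001):", round(relative_to_neutral(0.001, n_e), 3))
```

Under these assumptions, lowering Ne from 1000 to 100 moves both kinds of mutation towards neutral behaviour: the deleterious mutation fixes more readily and the advantageous one less readily, relative to a neutral mutation.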
Selection against deleterious mutations
with low recombination (background selection)
[Figure: illustration of background selection; panel labels: "10 different sequences" and "7 different sequences"]
Does background selection have important
effects?
To answer this, we need to know:
1. The rate of input of deleterious mutations into the population
each generation
2. The frequency distribution of the sizes of effects of these
mutations on fitness (i.e. their selection coefficients)
• Our previous theoretical work showed that, using
estimates of the selection intensities against amino-acid
mutations, background selection models greatly overpredict
the reduction in neutral variability on the dot
chromosome and neo-Y chromosome.
(Loewe and Charlesworth 2007 Genetics 175: 1381-1393.)
• What is going on?
Weak Selection Hill-Robertson Effects
These models assumed that selection is sufficiently strong
relative to drift that deleterious mutations are mostly held close
to their equilibrium frequencies for an infinitely large population.
If a large number of sites with low recombination are under
selection, this does not hold because of their mutual Hill-Robertson
(H-R) interference, which means that deleterious variants can drift to
intermediate frequencies.
This reduces the strength of their H-R effects.
The Model:
• Population of 500 diploid individuals
• 2 selected sites alternating with one neutral site
00101000010100111001011100010101
01110100111011000101001001110001
• mutation: 0 → 1 or 1 → 0 (equal rates); red sites are under selection against
amino-acid mutations (0 is good, 1 is bad), black sites have no fitness effects
• crossing over/gene conversion:
parental haplotypes:
0011000111000011111001
1111000000010111000000
recombinant haplotypes:
0011000111000011000000
1111000000010111111001
• multiplicative effects on fitness of mutations at different sites
(Kaiser and Charlesworth 2009, Trends in Genetics 25: 9-12)
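A rough sketch of how one generation of a model of this kind might be simulated is given below, for the case with no recombination. It is not the authors' program. The values of s and u, the number of sites, the rule that each deleterious copy multiplies fitness by (1 – s), and the fitness-weighted sampling of parents (which allows selfing) are all assumptions made for illustration.

```python
import random

def site_is_selected(n_sites):
    """Two selected sites followed by one neutral site, repeated."""
    return [i % 3 != 2 for i in range(n_sites)]

def fitness(individual, selected, s):
    """Multiplicative fitness: every '1' at a selected site, on either
    haplotype, multiplies fitness by (1 - s)."""
    n_bad = sum(allele == "1" and sel
                for hap in individual
                for allele, sel in zip(hap, selected))
    return (1.0 - s) ** n_bad

def mutate(hap, u, rng):
    """Flip each site 0 <-> 1 independently with probability u."""
    return "".join(("1" if a == "0" else "0") if rng.random() < u else a
                   for a in hap)

def next_generation(pop, selected, s, u, rng):
    """One generation without recombination: parents are sampled in
    proportion to fitness and each passes on one intact haplotype."""
    weights = [fitness(ind, selected, s) for ind in pop]
    offspring = []
    for _ in range(len(pop)):
        mother, father = rng.choices(pop, weights=weights, k=2)
        gametes = (rng.choice(mother), rng.choice(father))
        offspring.append(tuple(mutate(g, u, rng) for g in gametes))
    return offspring

rng = random.Random(1)
n_sites = 30  # hypothetical; far shorter than the chromosomes simulated in the study
selected = site_is_selected(n_sites)
pop = [("0" * n_sites, "0" * n_sites) for _ in range(500)]
for _ in range(5):
    pop = next_generation(pop, selected, s=0.01, u=0.001, rng=rng)
print("mean fitness after 5 generations:",
      sum(fitness(ind, selected, 0.01) for ind in pop) / len(pop))
```

Adding crossing over or gene conversion would mean building each gamete from segments of both parental haplotypes instead of passing one on intact.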
Results:
Compare observed reduction in neutral diversity (π/π0) to expectation under
background selection model:
$$
B = \exp\left( -\sum_i \frac{u_i}{s_i \left( 1 + [1 - s_i]\, r_i / s_i \right)^2} \right)
$$
[Figure: log(B observed / B expected) plotted against number of sites (0 to 350,000) for three cases: no recombination; only gene conversion; crossing over and gene conversion]
→ formula overpredicts reduction in π if recombination rates are
low and chromosomes long
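The expectation under the background selection model can be evaluated numerically from the formula for B. In the sketch below the symbols are read as follows, which is an interpretation since the slide does not define them: u_i is the deleterious mutation rate at selected site i, s_i the selection coefficient against mutations at that site, and r_i the recombination frequency between site i and the neutral site of interest. All numerical values are hypothetical.

```python
import math

def background_selection_b(u, s, r):
    """Expected neutral diversity relative to the value without selection,
    B = exp(-sum_i u_i / (s_i * (1 + (1 - s_i) * r_i / s_i)**2))."""
    total = 0.0
    for u_i, s_i, r_i in zip(u, s, r):
        total += u_i / (s_i * (1.0 + (1.0 - s_i) * r_i / s_i) ** 2)
    return math.exp(-total)

n_sites = 1000
u = [1e-5] * n_sites   # hypothetical mutation rate per selected site
s = [0.01] * n_sites   # hypothetical selection coefficient

no_recombination = background_selection_b(u, s, r=[0.0] * n_sites)
some_recombination = background_selection_b(u, s, r=[0.01] * n_sites)
print("B with no recombination:  ", no_recombination)
print("B with some recombination:", some_recombination)
```

With no recombination the sum reduces to the total mutation rate divided by s, so the predicted diversity falls exponentially as more selected sites are added; this is the expectation against which the simulated values are compared.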
Results:
Exponential decline of neutral diversity with chromosome length
[Figure: neutral diversity (relative value, 0 to 1.2) plotted against number of sites (0 to 350,000) for three cases: no recombination; only gene conversion; crossing over and gene conversion. Annotations: Drosophila dot chromosome; D. miranda neo-Y]
Conclusions
• Many features of variability in the non-recombining
regions of the Drosophila genome are captured by
models involving mutations at a large number of
selected sites, corresponding to the amino-acid coding
sites in genes.
• This suggests that the reduction in effectiveness of
selection resulting from Hill-Robertson interference of
this kind may be a major player in the evolutionary
significance of recombination and sexual reproduction.
ACKNOWLEDGEMENTS
• THEORY: Deborah Charlesworth, Isabel Gordo, Vera
Kaiser, Laurence Loewe, Martin Morgan, Magnus
Nordborg
• DATA AND ANALYSES: Andrea Betancourt, Doris
Bachtrog, Penny Haddrill, John Welch
• MONEY: BBSRC, NSF, Royal Society