Lecture 26 : Tests of Neutrality 2
April 17, 2014
Last Time
Sequence data and quantification of variation
 • Infinite sites model
 • Nucleotide diversity (π)
Sequence-based tests of neutrality
 • Ewens-Watterson Test
 • Tajima’s D
Today
Hudson-Kreitman-Aguade Test
Synonymous versus Nonsynonymous substitutions
McDonald-Kreitman Test
Hudson-Kreitman-Aguade Test
Under neutrality, divergence between species should be of the same magnitude as variation within species
Provides a correction factor for mutation rates at different sites
Complex goodness of fit test
Perform the test on loci suspected to be under selection and on supposedly neutral control loci
Hudson-Kreitman-Aguade (HKA) test (Hamilton 266)

                                      Adh      Control locus
Polymorphism within species (S/m)     0.101    0.022
Divergence between species (D/m)      0.056    0.052
Ratio (within/between)                1.80     0.42
χ2 = 6.09, p-value = 0.016
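The ratio row of the table can be reproduced from the two rows above it. A minimal Python sketch, using only the values shown on the slide (the variable names are illustrative, not from the lecture):

```python
# Values copied from the HKA table above (Adh vs. control locus).
polymorphism = {"Adh": 0.101, "Control locus": 0.022}  # S/m, within species
divergence = {"Adh": 0.056, "Control locus": 0.052}    # D/m, between species

# Under neutrality the within/between ratio should be similar across loci.
ratios = {locus: polymorphism[locus] / divergence[locus] for locus in polymorphism}

for locus, ratio in ratios.items():
    print(f"{locus}: within/between = {ratio:.2f}")
```

The Adh ratio comes out roughly four times the control ratio, which is the kind of discrepancy the goodness of fit test evaluates.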
[Figure: Teosinte, Maize, and Maize w/TBR mutation. Image: http://www.nsf.gov/news/mmg/media/images/corn-and-teosinte_h1.jpg]
Mauricio 2001; Nature Reviews Genetics 2, 376
Problem 3. Files utr_mays.arp, utr_par.arp,
exon_mays.arp, and exon_par.arp contain sequence
data from the 5’ untranslated region and from an exon
of the teosinte branched1 (tb1) gene of maize (Zea
mays ssp. mays) and its most likely wild progenitor
Zea mays ssp. parviglumis.
File            Region of tb1   Subspecies
utr_mays.arp    5’ UTR          mays
utr_par.arp     5’ UTR          parviglumis
exon_mays.arp   exon            mays
exon_par.arp    exon            parviglumis
For each of these regions of tb1 and for each subspecies:
a) Use Arlequin to determine the number of segregating sites (S) and calculate the nucleotide
diversity (π). What can you infer by comparing nucleotide diversity between the two subspecies
for each region?
b) Use Arlequin to perform the tests of neutrality developed by Ewens-Watterson and Tajima.
Interpret and discuss the results both statistically and biologically.
c) Interpret and discuss the results from the following 2 HKA tests:

Test A
                                  tb1 5’ untranslated region   Average of control loci
Polymorphism within subspecies    0.00093                      0.01996
Divergence between subspecies     0.05255                      0.02242
χ2 = 13.58, p-value = 0.001

Test B
                                  tb1 translated region        Average of control loci
Polymorphism within subspecies    0.00243                      0.01996
Divergence between subspecies     0.01273                      0.02242
χ2 = 2.70, p-value = 0.26
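As a starting point for reading the two tests, the within/between ratios and the tail probabilities can be recomputed from the tabulated values. The sketch below is a minimal Python illustration, not Arlequin output. It assumes the χ2 statistic is referred to a chi-square distribution with 2 degrees of freedom, which the problem does not state; that choice is used only because the upper-tail probability then has the closed form exp(-χ2/2). All function and variable names are hypothetical.

```python
import math

# Values copied from Test A and Test B above.
tests = {
    "Test A (5' UTR)": {"poly": 0.00093, "div": 0.05255, "chi2": 13.58},
    "Test B (translated)": {"poly": 0.00243, "div": 0.01273, "chi2": 2.70},
}
control = {"poly": 0.01996, "div": 0.02242}


def within_between(poly, div):
    """Ratio of polymorphism within subspecies to divergence between subspecies."""
    return poly / div


def chi2_sf_df2(chi2):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom.

    ASSUMPTION: df = 2 is not stated in the problem; with df = 2 the
    survival function is exactly exp(-x / 2).
    """
    return math.exp(-chi2 / 2.0)


control_ratio = within_between(control["poly"], control["div"])
results = {}
for name, t in tests.items():
    results[name] = (within_between(t["poly"], t["div"]), chi2_sf_df2(t["chi2"]))

print(f"control loci: within/between = {control_ratio:.3f}")
for name, (ratio, p) in results.items():
    print(f"{name}: within/between = {ratio:.3f}, p (df = 2) = {p:.3f}")
```

Under the df = 2 assumption, the computed probabilities round to the p-values listed in the two tables.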
HKA Example: Teosinte Branched
Using Synonymous Substitutions to Control for Factors Other Than Selection
dN/dS or Ka/Ks Ratios
Types of Mutations (Polymorphisms)
Synonymous versus Nonsynonymous SNP
 • First and second position SNP often changes amino acid
 • UCA, UCU, UCG, and UCC all code for Serine
 • Third position SNP often synonymous
 • Majority of positions are nonsynonymous
 • Not all amino acid changes affect fitness: allozymes
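The position effect described above can be checked by enumerating every single-base change in the standard genetic code. The following Python sketch builds the codon table from the usual TCAG ordering (written with T in place of U) and reports, for each codon position, the fraction of changes that leave the amino acid unchanged. Treating a change to a stop codon as nonsynonymous is a choice made for this illustration, and the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction_by_position():
    """Fraction of single-base changes in sense codons that keep the amino acid."""
    synonymous = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # start only from sense codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                # A change to a stop codon counts as nonsynonymous here.
                if CODON_TABLE[mutant] == aa:
                    synonymous[pos] += 1
    return [s / t for s, t in zip(synonymous, total)]


fractions = synonymous_fraction_by_position()
for pos, frac in enumerate(fractions, start=1):
    print(f"codon position {pos}: {frac:.1%} of changes are synonymous")

# The four serine codons from the slide, written with T in place of U.
serine_codons = ["TCA", "TCT", "TCG", "TCC"]
print({codon: CODON_TABLE[codon] for codon in serine_codons})
```

This counts every possible change as equally likely, which ignores the transition/transversion bias.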
Synonymous & Nonsynonymous Substitutions
Synonymous substitution rate can be used to set neutral expectation for nonsynonymous rate
dS is the relative rate of synonymous mutations per synonymous site
dN is the relative rate of nonsynonymous mutations per nonsynonymous site
ω = dN/dS
If ω = 1, neutral evolution
If ω < 1, purifying selection
If ω > 1, positive Darwinian selection
For human genes, ω ≈ 0.1
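The decision rule above amounts to a comparison of ω with 1. A minimal Python sketch, with a hypothetical function name and hypothetical input values chosen so that ω matches the figure quoted for human genes:

```python
def interpret_omega(d_n, d_s):
    """Return (omega, label) for given nonsynonymous and synonymous rates.

    The labels follow the rule on the slide; in practice one would also ask
    whether omega differs significantly from 1.
    """
    if d_s <= 0:
        raise ValueError("dS must be positive to form the ratio dN/dS")
    omega = d_n / d_s
    if omega < 1:
        label = "purifying selection"
    elif omega > 1:
        label = "positive Darwinian selection"
    else:
        label = "neutral"
    return omega, label


# Hypothetical rates, not taken from the lecture.
omega, label = interpret_omega(d_n=0.02, d_s=0.2)
print(f"omega = {omega:.2f}: {label}")
```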
Complications in Estimating dN/dS
 • Multiple mutations in a codon give multiple possible paths:
     CGT(Arg)->AGA(Arg)
     CGT(Arg)->AGT(Ser)->AGA(Arg)
     CGT(Arg)->CGA(Arg)->AGA(Arg)
 • Two types of nucleotide base substitutions resulting in SNPs: transitions and transversions are not equally likely
 • Back-mutations are invisible
 • Complex evolutionary models using likelihood and Bayesian approaches must be used to estimate dN/dS (also called KA/KS or KN/KS depending on method) (PAML package)
http://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html
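The multiple-paths problem for the CGT to AGA example can be made concrete by enumerating every order in which the differing positions could have changed and counting the nonsynonymous steps along each path. This Python sketch is an illustration only: it is not how PAML estimates dN/dS, and the function names are hypothetical. Amino acids are printed with one-letter codes (R = Arg, S = Ser).

```python
from itertools import permutations, product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mutational_paths(start, end):
    """All shortest paths from start to end, changing one differing base per step."""
    diff_positions = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(diff_positions):
        codon = start
        path = [codon]
        for pos in order:
            codon = codon[:pos] + end[pos] + codon[pos + 1:]
            path.append(codon)
        paths.append(path)
    return paths


def count_nonsynonymous_steps(path):
    """Number of steps along the path that change the encoded amino acid."""
    return sum(CODON_TABLE[a] != CODON_TABLE[b] for a, b in zip(path, path[1:]))


paths = mutational_paths("CGT", "AGA")
for path in paths:
    labelled = " -> ".join(f"{codon}({CODON_TABLE[codon]})" for codon in path)
    print(f"{labelled}: {count_nonsynonymous_steps(path)} nonsynonymous step(s)")
```

The two paths give different counts of nonsynonymous changes for the same pair of observed codons, which is why simple counting is ambiguous.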
dN/dS ratios for 363 mouse-rat comparisons
 • Most genes show purifying selection (dN/dS < 1)
 • Some evidence of positive selection, especially in genes related to immune system
[Figure label: interleukin-3: mast cells and bone marrow cells in immune system]
Hartl and Clark 2007
McDonald-Kreitman Test
Conceptually similar to HKA test
Uses only one gene
Contrasts the ratio of synonymous divergence to polymorphism with the ratio of nonsynonymous divergence to polymorphism
The gene provides an internal control for evolutionary rates and demography
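The contrast can be laid out as a 2x2 table of counts: fixed differences between species versus polymorphisms within species, each split into nonsynonymous and synonymous. The Python sketch below uses a plain Pearson chi-square without continuity correction as a stand-in; published analyses may use a different statistic for the 2x2 table. The counts are hypothetical and the function name is illustrative.

```python
import math


def mcdonald_kreitman(dn, ds, pn, ps):
    """Compare nonsynonymous:synonymous ratios for divergence and polymorphism.

    dn, ds: fixed nonsynonymous / synonymous differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within species
    Returns (divergence_ratio, polymorphism_ratio, chi2, p_value).
    """
    n = dn + ds + pn + ps
    # Pearson chi-square for a 2x2 table (1 degree of freedom).
    chi2 = n * (dn * ps - ds * pn) ** 2 / (
        (dn + ds) * (pn + ps) * (dn + pn) * (ds + ps)
    )
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return dn / ds, pn / ps, chi2, p_value


# Hypothetical counts, not taken from the lecture or from any data set.
div_ratio, poly_ratio, chi2, p_value = mcdonald_kreitman(dn=20, ds=30, pn=5, ps=40)
print(f"dN/dS among fixed differences: {div_ratio:.3f}")
print(f"pN/pS among polymorphisms:     {poly_ratio:.3f}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under neutrality the two ratios are expected to be similar; an excess of nonsynonymous fixed differences, as in these made-up counts, is the pattern usually read as positive selection.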
Application of McDonald-Kreitman Test:
 • Aligned 11,624 gene sequences between human and chimp
 • Calculated synonymous and nonsynonymous substitutions between species (Divergence) and within humans (SNPs)
 • Identified 304 genes showing evidence of positive selection (blue) and 814 genes showing purifying selection (red) in humans
 • Positive selection: defense/immunity, apoptosis, sensory perception, and transcription factors
 • Purifying selection: structural and housekeeping genes
Bustamante et al. 2005. Nature 437, 1153-1157
[Figure: Genes showing purifying (red) or positive (blue) selection in the human genome based on the McDonald-Kreitman Test]
Problem 4. Calculate the ω = dN/dS ratio based on the following 2 DNA
sequences:
5’-ATG GTT CAT TTT ACC GGA CGA AGT CGA TTA-3’
5’-ATG GAT CAC TTG ACC GCA CGA AGT AGA TTA-3’
What does the value of ω indicate?
Problem 5. GRADUATE STUDENTS ONLY: Search the literature for an
example of an application of one of the tests for departures from
neutrality. Describe the question that the test is addressing, the results,
and the authors’ interpretation of the results. Receive two points of
extra credit if you can find a case in which the test is inappropriately
applied and/or interpreted. Please send the paper to Rose when you
submit your report.