Generation and Analysis of AFLP Data

ESPM 150/290: Biology, Ecology, and
Genetics of Forest Diseases
Laboratory Exercise
April 1, 2010
Some Considerations in
Choosing a Genotyping Method
• What is the level of taxonomic resolution desired?
(Populations? Species? Phyla?)
– Comparison of distantly related individuals requires slowly
evolving markers (e.g., protein-coding DNA or amino acid
sequences) because quickly evolving markers become saturated
with changes
– Comparison of closely related individuals requires rapidly
evolving markers (e.g., microsatellites or non-coding DNA
sequences)
• What is the level of genotypic resolution desired?
– Dominant vs. codominant markers
– Fine (e.g., nucleotide-level) vs. coarse (e.g., fragment-size) data
– Genomic scale: detailed information about one or a few loci vs.
less-detailed information about more loci
• How much previous sequence knowledge is available?
– DNA sequencing, microsatellite amplification, PCR-RFLP, etc.
require previous sequence information so that PCR primers can
be designed
– AFLPs and RAPDs allow genetic fingerprinting when previous
sequence knowledge is not available
• What are the cost and labor constraints?
– DNA sequencing is more costly than fragment analysis
– Techniques requiring fluorescent labeling are generally more
costly than techniques that don’t require labeling
A review of PCR amplification
Requirements:
• DNA template
• 2 oligonucleotides (primers)
• Nucleotides: dATP, dCTP, dGTP, dTTP
• Taq polymerase
1. Double-strand denaturation
2. Annealing of the primers
3. Elongation
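As a rough worked example of why repeating these three steps amplifies the template, the idealized yield can be written as below. The symbols N_0 (starting template copies), n (number of cycles), and E (per-cycle efficiency between 0 and 1) are introduced here for illustration and do not come from the slides; real reactions plateau well before this ideal.

```latex
N_n = N_0 \, (1 + E)^n ,
\qquad \text{so with } E = 1:\quad
N_{30} = N_0 \cdot 2^{30} \approx 1.07 \times 10^{9} \, N_0
```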
Restriction Enzymes
• Found in bacteria
• Cut DNA within the molecule (endonuclease)
• Cut at sequences that are specific for each enzyme
(restriction sites)
• Leave either blunt or sticky ends, depending upon the
specific enzyme
Tobin & Dusheck, Asking About Life, 2nd ed. Copyright 2001, Harcourt, Inc.
http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/R/RestrictionEnzymes.html
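The cutting behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a lab tool: it treats the DNA as a linear top strand only, the function names are hypothetical, and the toy sequence is invented. The recognition sites and cut positions used (EcoRI G^AATTC and MseI T^TAA leave sticky ends, SmaI CCC^GGG leaves blunt ends) are standard.

```python
# Recognition site and cut offset (position within the site, top strand).
# EcoRI: G^AATTC (sticky), MseI: T^TAA (sticky), SmaI: CCC^GGG (blunt).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "MseI": ("TTAA", 1),
    "SmaI": ("CCCGGG", 3),
}


def cut_positions(seq, enzyme):
    """Return the top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    return positions


def digest(seq, enzymes):
    """Cut a linear sequence with one or more enzymes; return fragments in order."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence, invented for illustration only.
toy = "ACGTGAATTCGGCTTAACCATGGAATTCAT"
fragments = digest(toy, ["EcoRI", "MseI"])
print(fragments)
```

Joining the fragments back together reproduces the input, which is a quick sanity check that no bases were lost at the cut points.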
Random Genomic Markers
• Useful when the DNA sequence of suitable SNPs is not available
• Relatively inexpensive
• Scan the entire genome, producing information
on several variations in the same reaction
• RAPD: Random Amplification of Polymorphic DNA
• AFLP: Amplified Fragment Length Polymorphism
AFLP
Amplified Fragment Length Polymorphisms
(Vos et al., 1995)
• Genomic DNA is digested with 2 restriction
enzymes:
– EcoRI (6 bp restriction site, GAATTC/CTTAAG):
cuts infrequently
– MseI (4 bp restriction site, TTAA/AATT):
cuts frequently
• Fragments of DNA resulting from restriction digestion are
ligated with end-specific adaptors (a different one for
each enzyme) to create a new PCR priming site
• Preselective PCR amplification is done using primers
complementary to the adaptor + 1 bp (N, chosen by the
user)
• Selective amplification uses primers complementary to
the adaptor (+1 bp) + 2 bp (NNN in total)
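The arithmetic behind "cuts infrequently" vs. "cuts frequently" and behind the selective bases can be sketched as follows. This is a back-of-the-envelope model that assumes all four bases are equally frequent and independent, which real genomes only approximate. All function names are hypothetical, and the fragment insert in the example is invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def expected_site_spacing(site_length):
    """Expected distance (bp) between sites of a given length, random-sequence model."""
    return 4 ** site_length


def selective_fraction(bases_per_primer, n_primers=2):
    """Expected fraction of fragments amplified when each primer carries
    the given number of selective bases (random-sequence model)."""
    return 1 / 4 ** (bases_per_primer * n_primers)


def is_amplified(insert, eco_selective, mse_selective):
    """True if a fragment insert matches both primers' selective bases.

    `insert` is the top-strand sequence between the adaptors, read 5'->3'
    from the EcoRI end. The MseI primer anneals to the opposite strand,
    so its selective bases are compared against the reverse complement.
    """
    return (insert.startswith(eco_selective)
            and reverse_complement(insert).startswith(mse_selective))


eco_spacing = expected_site_spacing(6)   # 6 bp site (EcoRI)
mse_spacing = expected_site_spacing(4)   # 4 bp site (MseI)
preselective = selective_fraction(1)     # adaptor + 1 bp on each primer
selective = selective_fraction(3)        # adaptor + 3 bp on each primer
print(eco_spacing, mse_spacing, preselective, selective)
print(is_amplified("ACGGATTCCA", "ACG", "TGG"))
```

Under this model, each added selective base cuts the pool of amplified fragments by a factor of four per primer, which is how the number of bands is brought down to a scorable range.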
Figure: AFLP overview (Vos et al., 1995)
Figure: Sample AFLP gel
Figure: AFLP electropherogram; x-axis: fragment size (bp), y-axis: peak height (Source: Wikimedia Commons)
Figure: AFLP fluorescent electrophoresis
Figure: AFLP data map from Urbanelli et al. (2007); rows: individuals, columns: alleles
AFLP genotyping
• PCR amplification using primers corresponding to the
new sequence
– If there are 2 new priming sites within 400 – 1600 bp
there is amplification
• The result is presence or absence of amplification,
scored as 1 or 0
– Dominant marker: does not distinguish between
heterozygote and homozygote for the band-present allele
• Polymorphisms are due mostly to SNPs but also to deletions/insertions
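The 1/0 scoring described above can be sketched as follows. The peak lists, bin sizes, height threshold, and size tolerance are all hypothetical values chosen for illustration, not settings from the course. Because the marker is dominant, allele frequencies cannot be counted directly; the second function shows one common workaround, which assumes Hardy-Weinberg equilibrium (an assumption not stated in the slides).

```python
import math


def score_bands(peaks, bin_sizes, min_height=100.0, tolerance=0.5):
    """Score one individual as a list of 1/0 values, one per size bin.

    peaks: list of (fragment_size_bp, peak_height) tuples.
    A bin is scored 1 if any peak lies within `tolerance` bp of the bin
    size and reaches `min_height`; otherwise it is scored 0.
    """
    return [
        int(any(abs(size - b) <= tolerance and height >= min_height
                for size, height in peaks))
        for b in bin_sizes
    ]


def null_allele_frequency(scores):
    """Estimate the band-absent allele frequency q at one locus.

    Assumes Hardy-Weinberg equilibrium, so the fraction of band-absent
    individuals is q**2 and q is its square root.
    """
    absent_fraction = scores.count(0) / len(scores)
    return math.sqrt(absent_fraction)


# Hypothetical electropherogram peaks: (size in bp, height).
individuals = {
    "ind1": [(101.9, 850.0), (150.2, 400.0), (233.1, 60.0)],
    "ind2": [(150.0, 900.0), (232.8, 700.0)],
}
bins = [102, 150, 233]
matrix = {name: score_bands(p, bins) for name, p in individuals.items()}
print(matrix)
```

In the toy data, the 233 bp peak of ind1 falls below the height threshold and is scored 0, which illustrates how threshold choice affects the resulting matrix.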
Analysis of AFLP data
• Similarity (cluster analysis)
– NJ (Neighbor Joining)
– UPGMA (Unweighted Pair Group Method with Arithmetic mean)
• AMOVA (Analysis of Molecular Variance)
• Model-based
– Maximum likelihood
– Bayesian
Figure: Example of a sequence distance matrix
Image Source: http://media.wiley.com/CurrentProtocols/BI/bi0603/bi0603-fig-0002-1-full.gif
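Cluster analysis starts from a pairwise distance matrix like the one pictured. A minimal sketch for 1/0 AFLP profiles is given below. It uses the Jaccard distance, which ignores shared absences. That choice is an assumption made here for illustration; the slides do not name a similarity coefficient. The profiles are invented.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 1/0 band profiles.

    Only bands present in at least one profile are counted, so shared
    absences (0/0) do not make two individuals look more similar.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1 - shared / either


def distance_matrix(profiles):
    """Return (names, square matrix) of pairwise Jaccard distances."""
    names = list(profiles)
    matrix = [[jaccard_distance(profiles[p], profiles[q]) for q in names]
              for p in names]
    return names, matrix


# Hypothetical band profiles: rows are individuals, columns are AFLP bands.
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 1, 1, 0, 1, 0],
}
names, dist = distance_matrix(profiles)
for name, row in zip(names, dist):
    print(name, [round(d, 3) for d in row])
```

The resulting matrix is symmetric with zeros on the diagonal, which is the input format that NJ and UPGMA expect.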
Figure: AFLP clustering analysis; panels: clustering dendrogram and fragment visualization (Source: Wikimedia Commons)
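A dendrogram of this kind can be built with UPGMA, which repeatedly merges the two closest clusters and averages distances over all member pairs. The sketch below is a minimal pure-Python version with hypothetical names and an invented distance matrix. It is not the software used to produce the figure, and it breaks ties between equal distances arbitrarily.

```python
def upgma(names, dist):
    """Build a UPGMA tree from a symmetric distance matrix.

    Returns (tree, merges): `tree` is a nested tuple of leaf names and
    `merges` lists (subtree, node_height) in the order clusters were joined.
    """
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (tree, size)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted mean, i.e. the average over all member pairs.
        new_d = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        d = {pair: v for pair, v in d.items()
             if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((ti, tj), ni + nj)
        merges.append(((ti, tj), dij / 2))  # node height is half the distance
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical distance matrix for four individuals.
labels = ["A", "B", "C", "D"]
example = [
    [0.0, 0.2, 0.6, 0.8],
    [0.2, 0.0, 0.6, 0.8],
    [0.6, 0.6, 0.0, 0.4],
    [0.8, 0.8, 0.4, 0.0],
]
tree, merges = upgma(labels, example)
print(tree)
```

UPGMA assumes a roughly constant rate of change across lineages; Neighbor Joining relaxes that assumption and is often compared against it on the same matrix.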
AFLP Data Map with UPGMA
dendrogram from Urbanelli et al.
(2007): “Distinguishing taxa in the
Pleurotus eryngii (King Oyster
Mushroom) complex using AFLPs”
• 90 populations sampled
• 94 AFLP loci scored
Photos: (Top) The New York Times
(Bottom L) Wikimedia Commons
(Bottom R) http://steinpilz.up.seesaa.net
Example Structure Output
“Estimated population structure for 10 runs of structure using 1056 individuals from
52 human populations. Each graph represents the output of one run of structure. In
each graph, each individual is represented by a vertical line, which is partitioned into
5 colors that represent its estimated membership fractions in K=5 clusters.”
(Source: http://rosenberglab.bioinformatics.med.umich.edu/clumppExample.html)
Rosenberg et al. (2002). Science 298: 2381-2385.