GLYPHOSATE RESISTANCE Background / Problem


Reverse genetics:
Quantitative Trait Locus
(QTL) mapping
Association mapping
Integrating Mendelian and Quantitative
Genetics using molecular techniques
Mendelian trait
[Figure: individuals 1-10 genotyped for alleles A1 and A2]
Genotype = 12 11 22 22 11 22 12 11 22 12
Quantitative trait
[Figure: distribution of Height, axis ticks 16 to 88]
Courtesy of Glenn Howe
Identifying Genes Underlying Phenotypes
• Linkage and quantitative trait locus (QTL) analysis
• Need a pedigree with segregating traits
• Linkage map with moderate number of markers
• Very large regions of chromosomes represented by markers
Quantitative Trait Locus Mapping
[Figure: Parent 1 x Parent 2 cross, F1 x F1 intercross, and segregating progeny. Each individual is drawn with its GENOTYPE at marker loci A, B, and C and its HEIGHT; progeny are labeled by their genotype at locus B (BB, Bb, bb).]
“Genetic architecture” of quantitative traits
QTL studies can reveal the following facets of
the genetic architecture of a quantitative trait:
-Number of genes underlying the trait
-The strength of effect of each gene
-Additive vs. dominant effects of genes
-Potential interactions among genes (epistasis)
-Ultimately, the actual genes involved, down to the causal nucleotides (“QTN”)
Quantitative Trait Locus Analysis
• Step 1: Make a controlled cross to create a large family (or a collection of families)
  • Parents should differ for phenotypes of interest
  • Segregation of trait in the progeny
• Step 2: Create a genetic map
  • Large number of markers genotyped for all progeny
• Step 3: Measure phenotypes
  • Need phenotypes with moderate to high heritability
• Step 4: Detect associations between markers and phenotype using a model
• Step 5: Identify underlying molecular mechanisms
Step 1: Construct Pedigree
• Cross two individuals with contrasting characteristics
• Create population with segregating traits
• Ideally: inbred parents crossed to produce F1s, which are intercrossed to produce F2s
• Recombinant Inbred Lines created by repeated inbreeding (selfing or sib mating)
• Allows precise phenotyping, isolation of allelic effects
Grisel 2000 Alcohol Research & Health 24:169
Step 2: Construct Genetic Map
• Based on nonrandom association of alleles at different loci in pedigree
• Calculate pairwise likelihood of linkage
• Gives overview of structure of entire genome
• Most efficient with anonymous markers: AFLP
• Codominant markers much more informative: SSR
Step 3: Determine Phenotypes of Offspring
• Phenotype must be segregating in pedigree
• Must differentiate genotype and environment effects
• How?
• Works best with phenotypes with high heritability
• Proportion of total phenotypic variance due to genetic effects
• Why is this important?
[Figure: panels labeled 0.1, 0.5, and 0.9 (heritability)]
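One common way to separate genotype from environment effects is to measure several replicates of each genotype (for example clones or inbred lines) and partition the variance. The sketch below assumes equal replication and uses one-way ANOVA variance components; the function name and the toy heights are hypothetical and not taken from the slides.

```python
def broad_sense_heritability(lines):
    """Estimate heritability from replicated genotypes.

    lines: one list of replicate phenotype values per genotype
    (equal replication assumed). Uses one-way ANOVA variance components.
    """
    k = len(lines)      # number of genotypes
    n = len(lines[0])   # replicates per genotype
    means = [sum(reps) / n for reps in lines]
    grand = sum(means) / k

    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for reps, m in zip(lines, means) for y in reps
    ) / (k * (n - 1))

    var_env = ms_within                               # environmental variance
    var_gen = max((ms_between - ms_within) / n, 0.0)  # genetic variance
    return var_gen / (var_gen + var_env)

# Hypothetical heights: three genotypes, two replicates each.
h2 = broad_sense_heritability([[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]])
print(round(h2, 2))  # 0.98
```

Here replicates of the same genotype differ only slightly while genotypes differ a lot, so almost all of the phenotypic variance is genetic.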
Step 4: Detect Associations between Markers and Phenotypes
• Single-marker associations are simplest
  • Simple ANOVA, correcting for multiple comparisons
  • Log likelihood ratio: LOD (log10 of odds)
• If QTL is between two markers, situation more complex
  • Recombination between QTL and markers (genotype doesn't predict phenotype)
  • 'Ghost' QTL due to adjacent QTL
• Use interval mapping or composite interval mapping
  • Simultaneously consider pairs of loci across the genome
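A single-marker test can be sketched as follows. Assuming normally distributed residuals, the LOD score compares a model with one mean per marker genotype class against a model with a single overall mean, LOD = (n/2) log10(RSS0/RSS1). The function name and the toy data are hypothetical; a real analysis would also need a genome-wide significance threshold to correct for multiple comparisons.

```python
import math

def single_marker_lod(genotypes, phenotypes):
    """LOD score for one marker under a normal model.

    Assumes the residual sum of squares within genotype classes is
    greater than zero.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group phenotypes by marker genotype class.
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)

    rss_marker = 0.0
    for ys in groups.values():
        class_mean = sum(ys) / len(ys)
        rss_marker += sum((y - class_mean) ** 2 for y in ys)

    return (n / 2) * math.log10(rss_null / rss_marker)

# Hypothetical progeny: genotype at marker B and measured height.
genotypes = ["BB", "BB", "Bb", "Bb", "bb", "bb"]
heights = [70.0, 74.0, 60.0, 62.0, 48.0, 52.0]
lod = single_marker_lod(genotypes, heights)
print(round(lod, 2))
```

Repeating this at every marker gives a LOD profile along the map, which interval mapping then refines between markers.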
Step 5: Identify underlying molecular
mechanisms
[Figure: chromosome with a Genetic Marker and a QTL region, narrowed down to the QTG and then the QTN]
QTG: Quantitative Trait Gene
QTN: Quantitative Trait Nucleotide
Adapted from Richard Mott, Wellcome Trust
Center for Human Genetics
QTL mapping: model for a single marker locus
[Diagram: backcross AQ/aq x aq/aq, with marker locus A and QTL Q separated by recombination rate r]
• Marker locus A, quantitative trait locus Q, recombine at rate r
• Qq genotype has mean $\mu_{Qq}$
• qq genotype has mean $\mu_{qq}$
• Offspring
  • Aa has mean $\mu_{Aa} = \mu_{Qq}(1-r) + \mu_{qq} r$
  • aa has mean $\mu_{aa} = \mu_{Qq} r + \mu_{qq}(1-r)$
• QTL effect $= \mu_{Qq} - \mu_{qq} = (\mu_{Aa} - \mu_{aa})/(1-2r)$
• Recombination rate confounded with QTL effect
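The confounding can be seen numerically. The sketch below evaluates the expected marker class means from the backcross model above for several recombination rates; the QTL genotype means (60 and 50) are hypothetical values chosen only for illustration.

```python
def marker_class_means(mu_Qq, mu_qq, r):
    """Expected phenotype means of the Aa and aa marker classes in the
    backcross AQ/aq x aq/aq, given recombination rate r."""
    mu_Aa = mu_Qq * (1 - r) + mu_qq * r
    mu_aa = mu_Qq * r + mu_qq * (1 - r)
    return mu_Aa, mu_aa

mu_Qq, mu_qq = 60.0, 50.0  # hypothetical QTL genotype means (true effect = 10)
differences = {}
for r in (0.0, 0.1, 0.25, 0.5):
    mu_Aa, mu_aa = marker_class_means(mu_Qq, mu_qq, r)
    differences[r] = mu_Aa - mu_aa
    print(f"r={r:.2f}  marker class difference={differences[r]:.1f}")
```

A true effect of 10 shows up as a marker difference of 8 at r = 0.1 and disappears at r = 0.5, so a single marker cannot distinguish a small QTL nearby from a large QTL far away.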
QTL mapping: model for flanking marker loci
[Diagram: backcross AQB/aqb x aqb/aqb, with recombination rate r1 between A and Q and r2 between Q and B]
• In simplest case, two markers A and B flank the QTL
• Enough degrees of freedom to estimate the QTL effect separately from the recombination rates
• "Interval mapping": estimate QTL effect in a sliding window along the marker map
• Many approaches developed...
QTL map in Douglas-fir (bud opening date)
Figure 2. Seven QTL for terminal bud flush were detected in the growth initiation experiment. QTL were found on six linkage groups (2, 3, 4, 5, 12, and 14) and were detected in five of the six treatment combinations.
Jermstad et al. (2003) Genetics, Vol. 165, 1489-1506
Jermstad et al. (2003) Genetics, Vol. 165, 1489-1506
QTL Vary by Year, Site, and Population
• Loblolly pine QTL measured in different years at same site, in different sites, and with a different genetic background
• Stippled: not repeated across years
[Figure panels: % latewood; wood-specific gravity]
Brown et al
Drawbacks of QTL mapping
• Results are often difficult to reproduce, and vary by year, pedigree, and location
• Multiple experiments are needed to confirm results, but experiments are large undertakings (population size, genotyping, phenotyping)
• Even if a QTL is localized to a few cM, this could correspond to 1000s of kb of DNA, containing many genes
• Because controlled crosses are used, only a fraction of natural variation is surveyed
• Biased towards detecting large-effect QTL, as small-effect QTL often fail to reach statistical significance
Association
Genetics
Methods for associating phenotypes with SNPs
Effects of population structure
Candidate gene approaches
QTL mapping vs. association genetics
Indirect vs. direct association
Two approaches to association studies
Population-based
Cases (affected individuals) and unrelated population controls (unaffected
individuals) collected from “one” population
Effects of population structure can be incorporated
Family-based
Parent-offspring trios with the TDT design are the most common
Robust to effects of population structure
Case – control association test



The simplest method
• Compare SNP frequencies of affected vs. unaffected
• Chi-square test with one degree of freedom

              Genotype “AA”   Genotype “Aa”   Total
affected      a               b               a+b
unaffected    c               d               c+d
Total         a+c             b+d             N

$\chi^2_1 = \frac{(ad - bc)^2 N}{(a+c)(b+d)(a+b)(c+d)}$
Case-Control Example: Diabetes
• Knowler et al. (1988) collected data on 4920 individuals from Pima and Papago Native American populations in the Southwestern United States
• High rate of Type II diabetes in these populations
• Found significant associations with Immunoglobulin G marker (Gm)
• Does this indicate underlying mechanisms of disease?
Case-control test for association (case=diabetic, control=not diabetic)
                     Type 2 Diabetes
Gm Haplotype     present    absent    Total
present              8        29        37
absent              92        71       163
Total              100       100       200
Question: Is the Gm haplotype associated with risk of Type 2 diabetes???
(1) Test for an association
$\chi^2_1 = \frac{(ad - bc)^2 N}{(a+c)(b+d)(a+b)(c+d)} = \frac{[(8 \times 71) - (29 \times 92)]^2 (200)}{(100)(100)(37)(163)} = 14.62$
(2) Chi-square is significant. Therefore presence of the Gm haplotype seems to confer reduced occurrence of diabetes. (Note the test is exactly analogous to calculating $r^2$ between two loci.)
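The calculation above can be checked with a few lines of code. This is a minimal sketch using only the standard library; the function name is hypothetical, and the p-value relies on the fact that a chi-square variable with 1 df is a squared standard normal.

```python
import math

def case_control_chi2(a, b, c, d):
    """1-df chi-square for the 2x2 table [[a, b], [c, d]] and its p-value."""
    n = a + b + c + d
    chi2 = (a * d - b * c) ** 2 * n / ((a + c) * (b + d) * (a + b) * (c + d))
    # Upper-tail probability of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts from the Gm haplotype by Type 2 diabetes table.
chi2, p_value = case_control_chi2(8, 29, 92, 71)
r_squared = chi2 / 200  # chi-square = N * r^2 for a 2x2 table

print(round(chi2, 2))  # 14.62
print(p_value < 0.001)  # True
```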
Case-control test for association (continued)
Question: Is the Gm haplotype actually associated with risk of Type 2 diabetes???
The real story: Stratify by American Indian heritage
0 = little or no Indian heritage;
8 = complete Indian heritage

Index of Indian heritage   Gm Haplotype   Percent with diabetes
0                          Present        17.8
0                          Absent         19.9
4                          Present        28.3
4                          Absent         28.8
8                          Present        35.9
8                          Absent         39.3

Conclusion: The Gm haplotype is NOT a risk factor for Type 2
diabetes, but is a marker of American Indian heritage
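How stratification produces a spurious association can be illustrated with made-up numbers. In the sketch below the counts are hypothetical (they are NOT the Knowler et al. data): within each heritage stratum the diabetes rate is identical with and without the haplotype, yet pooling the strata yields a significant chi-square because haplotype frequency and diabetes rate both differ between strata.

```python
def chi2_2x2(a, b, c, d):
    """1-df chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + c) * (b + d) * (a + b) * (c + d))

# Hypothetical counts, ordered as:
# (Gm present & diabetic, Gm present & not diabetic,
#  Gm absent & diabetic,  Gm absent & not diabetic)
low_heritage = (16, 64, 4, 16)    # 20% diabetic with or without Gm
high_heritage = (8, 12, 72, 108)  # 40% diabetic with or without Gm

# Ignoring heritage means adding the two tables cell by cell.
pooled = tuple(x + y for x, y in zip(low_heritage, high_heritage))

print(chi2_2x2(*low_heritage))      # 0.0: no association within stratum
print(chi2_2x2(*high_heritage))     # 0.0
print(round(chi2_2x2(*pooled), 2))  # 5.88: exceeds 3.84, the 5% critical value
```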
Family-Based Association: The Transmission Disequilibrium Test (TDT)
Still an association test (like a case-control), but we study parents
and offspring and we condition on the parental genotypes
-this reduces effects of population stratification
Given the genotypes of the parents, is there an allele that is transmitted more
frequently to affected individuals?
Only look at affected offspring with at least one heterozygous parent
Under the null hypothesis (H0) of no linkage, what proportion of alleles do we expect the heterozygous parent to transmit?
[Diagram: parents AB x AA; affected offspring AB or AA?]
To do TDT,
(1) we count the number of kids inheriting A or B across many families (trios) with affected kids
(2) Statistically test whether this observed number is different from 50:50
(3) If NOT 50:50, then affected kids may be inheriting one allele preferentially over the other
Transmission Disequilibrium Test (TDT)
(with known parental genotypes and 2 alleles at the locus)
For each heterozygous parent in each family, we determine which allele is
transmitted to the affected offspring and which is not.
[Diagram: parents AB x AA. Affected offspring AB: number = b. Affected offspring AA: number = c.]
H0: Two alleles are transmitted equally (no linkage and no association)
Ha: One of the alleles is preferentially transmitted (linkage and association)
Test statistic is $\frac{(b - c)^2}{b + c} \sim \chi^2$ with 1 df
Transmission Disequilibrium Test (TDT) : Example
For each heterozygous parent in each family, see which allele is transmitted to
the affected offspring and which is not.
[Diagram: parents 12 x 11. Affected offspring 12: 10 families. Affected offspring 11: 15 families.]
TDT test
b = , c =
$\frac{(b - c)^2}{b + c}$ = , p-value =
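The TDT statistic is easy to compute once the two transmission counts are known. The sketch below assumes that the 10 and 15 families in the example are the counts b and c (the statistic is symmetric, so the order does not matter); the function name is hypothetical.

```python
import math

def tdt(b, c):
    """TDT statistic (b - c)^2 / (b + c) and its 1-df chi-square p-value.

    b and c are the numbers of times a heterozygous parent transmitted
    each of the two alleles to an affected offspring.
    """
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail, chi-square 1 df
    return stat, p_value

stat, p_value = tdt(10, 15)
print(stat)               # 1.0
print(round(p_value, 3))  # 0.317
```

With these counts the departure from 50:50 transmission is well within what chance would produce.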
Methods for genetic association in natural
populations
• Standard general linear models (GLMs), usually with p values computed by permutation.
$y = \mu + m_i + e_{ij}$, where $y$ is the trait value, $\mu$ is a general mean, $m_i$ is the genotype of the i-th SNP and $e_{ij}$ is the residual.
• Structured Association (Pritchard et al. 2000; Thornsberry et al. 2001) and PCA Association (Price et al. 2006).
Controls for population structure by incorporating a Q matrix. This matrix is an n × p population structure incidence matrix where n is the number of individuals assayed and p is the number of populations defined.
• Mixed Linear Models (MLMs; Yu et al. 2006).
They incorporate a Q matrix (fixed effect) but also a pairwise relatedness matrix (K matrix, a random effect), which accounts for within-population structure.
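The simplest of these models, a GLM with one mean per SNP genotype class and a permutation p-value, can be sketched as below. This is an illustration under stated assumptions, not any package's implementation: the function names, the toy data, and the choice of the between-class sum of squares as the test statistic are all hypothetical, and the sketch includes no Q or K matrix, so it does not correct for population structure.

```python
import random

def between_class_ss(genotypes, y):
    """Sum of squares explained by fitting one mean per SNP genotype class."""
    grand = sum(y) / len(y)
    groups = {}
    for g, value in zip(genotypes, y):
        groups.setdefault(g, []).append(value)
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in groups.values())

def permutation_p(genotypes, y, n_perm=2000, seed=1):
    """P-value: share of phenotype shuffles at least as extreme as observed."""
    rng = random.Random(seed)
    observed = between_class_ss(genotypes, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if between_class_ss(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: 8 individuals per genotype class (0, 1, 2 copies of
# an allele), each copy adding 2 units to the trait, plus fixed noise.
noise = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.8, 0.3, 0.6]
genotypes = [g for g in (0, 1, 2) for _ in noise]
trait = [10.0 + 2.0 * g + e for g in (0, 1, 2) for e in noise]

p_value = permutation_p(genotypes, trait)
print(p_value < 0.01)  # True
```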
Genetic association method depends upon population structure
SA=structured association
GC=genomic control
GLM=general linear model
TDT=transmission disequilibrium
MLM=mixed linear model
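[Figure: methods (SA, GC, GLM, MLM, TDT) placed according to the degree of Population structure and Familial relatedness in the sample, including the case where population structure is unknown]
Based on Yu & Buckler (2006) Current Opinion in Biotechnology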
Pinus taeda L
Continuous range, no clear
population genetic structure
Pinus pinaster Ait.
22 populations
Fragmented range, significant
population structure
Pinus pinaster geographic range
[Map: sampled populations, labeled by number, across France, Spain, Portugal, Tunisia, and Morocco: Pleucadec (46), Erdeven (47), St Jean de Monts (45), Olonne/Mer (44), Le Verdon (43), Hourtin (42), Mimizan (41), Petrock (40), San Cipriano (27), San Leonardo de Yagüe (25), Cuellar (23), Bayubas de Abajo (26), Coca (22), Arenas de San Pedro (21), Valdemaqueda (24), Cenicientos (20), Restonica (2), Pinia (15), Pineta (11), Aulenne (10), Ahin (28), Oria (29), Tabarka (50), Tamrabta (30)]
ADEPT project
TREESNIPS project
(also P. sylvestris, Picea abies and oaks)
Genetic association with wood property traits in
loblolly pine
Phenotypic traits
• Earlywood specific gravity (ewsg)
• Latewood specific gravity (lwsg)
• Percent latewood (lw)
• Earlywood microfibril angle (ewmfa)
• Lignin & cellulose content (lgn-cel)
• Synthetic PCAs for different wood-age types
[Figure: cell wall layers (1° wall; 2° wall with layers S1, S2, S3) showing the microfibril angle]
González-Martínez et al. 2007 Genetics
Significant genetic association of cad gene with
earlywood specific gravity and 4cl with % latewood
[Figure: gene structure maps of 4cl and cad with base-pair scales, primer positions (F/R labels), and SNP positions]
Power considerations: structured populations
[Figure: Power plotted against % variation explained by QTN, comparing K vs. Q matrix models across the Traits measured]
Zhao et al. (2007) PLoS Genetics
(Small association population of ~100 accessions)
Candidate Gene Associations vs. Whole Genome Scans

• If LD is high and haplotype blocks are conserved, entire genome can be efficiently scanned for associations with phenotypes
• If LD is low, candidate genes are usually identified a priori, and a limited number are scanned for associations
• Biased by existing knowledge
• Use "Candidate Regions" from high LD populations, assess candidate genes in low LD populations
• Simplest for case-control studies (e.g., disease, gender)
[Figure: linkage map with marker names and map positions from 0.0 to 262.9, showing a QTL for the traits labeled ABOVE:BELOW and COARSE ROOT; the QTL interval defines a Candidate Region used for Candidate Gene Identification]
The “Candidate gene” approach
Candidate genes are
selected by knowledge of
how they influence similar
traits in other organisms.
There is increasing
evidence that some genes
can control similar
phenotypic traits even in
distantly related species.
Easy to apply: let's see if
this primer set works on
this particular species!
Candidate gene definitions
Biological candidates: genes of known biological action involved with the development or physiology of the trait
They may be structural genes or genes in a regulatory or biochemical pathway affecting trait expression
Positional candidates: genes that lie within the QTL regions that affect the trait
Traditional candidate genes
and traits
MHC related genes for studying disease and
parasite resistance, and mate choice
Heat shock proteins (HSP) for temperature and
stress tolerance
Growth hormone and its receptors for growth,
size
Candidate genes also available for many
ecologically relevant traits incl. morphology,
color, foraging, learning and memory, social
interactions, alternative mating strategies
Success story: Melanocortin-1
receptor gene
Coat colour variation in mice (Robbins et al. 1993)
Hair and skin color in humans (Valverde et al. 1995)
Feather coloration in chickens (Takeuchi et al. 1996)
Coat colour in pigs (Kijas et al. 1998)
Feather coloration in several bird species
(Theron et al. 2001; Mundy et al. 2004)
Coat colour in several mammals such as
horse, red fox and pocket mice (Mundy et al. 2004)
Skin color in lizards (Rosenblum et al. 2004).
Coat color of Kermode Bear (Ritland et al. 2001)
Melanocortin-1 receptor gene (MC1R)
Mundy 2005
MC1R in pocket mouse
Nachman et al. 2003
MC1R in pocket mouse: habitat
differences
Nachman et al. 2003
MC1R in lesser snow goose
Mundy et al. 2004
MC1R in Arctic skua
Mundy et al. 2004