GEM_McMullen_05

Genetic Diversity and the Effects of Artificial
Selection in Maize
Maize Diversity Project Team
Molecular Diversity
How has selection shaped molecular diversity in
maize?
What is the relationship of selected genes to
agronomic traits?
Goal: Identify genes exhibiting selection
– Domestication, agronomic improvement, and local
adaptation
Community resource: SNP marker collection
Teosinte
Landraces
Inbreds/Hybrids
Photos courtesy J. Doebley
Major predictions for the model
The genes that have contributed most to maize improvement, i.e.
those that have experienced the strongest history of selection,
have the least genetic variability left to contribute to crop
improvement by classical breeding.
These genes will not be detected in standard QTL experiments
because all lines will contain similar alleles.
Can we develop genomics screens to identify
genes that have undergone selection?
Invariant SSR approach (Vigouroux et al. 2002 PNAS 99:9650)
Directly contrast sequence diversity among teosintes and
inbreds (Wright et al. 2005 Science 308:1310)
Are genes with low inbred diversity enriched for selected
genes? (Yamasaki et al. 2005 Plant Cell 17:2859)
Contact the presenter for .pdfs
Summary of Sequencing on Random Genes
(Irie Vroh Bi, Masanori Yamasaki, Kate Houchins)
MPZ inbreds – (temperate) B73(2), Mo17(2), Hp301, Il14H,
Ky21, M37W, Oh43, (tropical) CML69, CML247, CML322,
CML333, KUI3, KUI11, NC350.
1095 alignments - 6169 SNPs.
MPZ inbreds + 16 teosinte partial inbreds
774 alignments: 3463 SNPs in MPZ inbreds, 6136 SNPs in
teosintes.
Sequence statistics for 1095 genes
for diverse maize inbred lines.

             N      L       Total L   S     Total S   π
All Maize    13.1   280.4   307034    5.6   6169      0.0067
Temperate    6.7    292.2   310306    4.3   4560      0.0065
Tropical     6.6    290.8   308816    4.2   4427      0.0061

N = number of sequences, L = length of alignment,
S = number of segregating sites, π = average number
of pairwise differences per bp.
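The statistics S and π defined above can be illustrated with a minimal Python sketch. It assumes an alignment given as equal-length strings and skips any position that is not A, C, G or T; the toy sequences and function names are hypothetical and are not the project's data or pipeline.

```python
from itertools import combinations

VALID = set("ACGT")

def segregating_sites(seqs):
    """Count alignment columns showing more than one distinct base."""
    count = 0
    for column in zip(*seqs):
        bases = {b for b in column if b in VALID}
        if len(bases) > 1:
            count += 1
    return count

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per bp (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = 0
    for a, b in pairs:
        # Count mismatches, ignoring gaps and ambiguous bases.
        total += sum(1 for x, y in zip(a, b)
                     if x != y and x in VALID and y in VALID)
    return total / (len(pairs) * length)

# Hypothetical toy alignment: 4 sequences, 10 bp each.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
S = segregating_sites(toy)
pi = nucleotide_diversity(toy)
print(S, round(pi, 4))
```

Dividing by the full alignment length is a simplification; a production analysis would normally correct for gapped or missing sites per pair.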
Inbred-Teosinte Sequence Summary
• Number of alignments >5 in both sets: 774
• Average sample size inbreds: 12.0
• Average sample size teosinte: 12.7
• Average alignment length: 294
• Total SNPs in inbreds: 3463
• Total SNPs in teosintes: 6136
Diversity in maize inbreds vs. teosinte
[Figure: scatter plot of θ inbreds (y-axis, 0 to 0.07) vs. θ teosintes (x-axis, 0 to 0.08)]
Average θ.inbred/θ.teosinte 0.57
Excluding θ.inbred=0 values 0.63
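A ratio summary like the one above can be sketched as follows. The slide does not name the estimator, so this assumes the plotted diversity measure is Watterson's θ per bp, computed from the number of segregating sites S, the sample size n and the alignment length L. It also assumes the reported averages are means of per-gene ratios. The function names and input values are hypothetical.

```python
def watterson_theta(S, n, L):
    """Watterson's estimator per bp: S / (a_n * L), a_n = sum of 1/i for i < n."""
    a_n = sum(1.0 / i for i in range(1, n))
    return S / (a_n * L)

def mean_ratio(pairs, drop_zero=False):
    """Mean of inbred/teosinte diversity ratios over genes.

    pairs: list of (theta_inbred, theta_teosinte) tuples.
    drop_zero: if True, skip genes with no diversity in the inbreds.
    Genes with zero teosinte diversity are always skipped (ratio undefined).
    """
    ratios = [inb / teo for inb, teo in pairs
              if teo > 0 and not (drop_zero and inb == 0)]
    return sum(ratios) / len(ratios)

# Hypothetical per-gene values, not taken from the slides.
genes = [(0.0, 0.02), (0.01, 0.02), (0.02, 0.02)]
print(mean_ratio(genes), mean_ratio(genes, drop_zero=True))
```

Dropping genes with zero inbred diversity can only raise the mean ratio, which matches the direction of the two figures reported on the slide (0.57 vs. 0.63).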
To identify the selected genes we need
new statistical approaches
• There are two models: a selection model and a
bottleneck model
• We must estimate the size of the bottleneck
• For each model, we estimate the probability of the
data given the model (the likelihood) for each gene
• This is very simulation- and compute-intensive!
• This approach allows us to estimate the proportion
of genes under selection and to identify the
candidates
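One way to picture the last step is as a two-class mixture. Given per-gene likelihoods under the bottleneck-only and selection models, a mixing proportion f and a per-gene probability of belonging to the selected class follow from Bayes' rule. The sketch below is an illustration under that assumption, not the project's actual method. The function names are hypothetical, and the likelihoods are treated as already estimated (for example, by simulation).

```python
def posterior_selected(lik_neutral, lik_selected, f):
    """Probability that a gene is in the selected class, given prior proportion f."""
    num = f * lik_selected
    # Assumes at least one of the two likelihoods is non-zero.
    return num / (num + (1.0 - f) * lik_neutral)

def estimate_proportion(likelihoods, f=0.5, iterations=200):
    """Expectation-maximization for the proportion of selected genes.

    likelihoods: list of (lik_neutral, lik_selected) per gene, held fixed.
    """
    for _ in range(iterations):
        post = [posterior_selected(ln, ls, f) for ln, ls in likelihoods]
        f = sum(post) / len(post)  # updated proportion = mean posterior
    return f

# Hypothetical likelihood pairs for five genes.
liks = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.1), (0.05, 0.6), (0.9, 0.2)]
f_hat = estimate_proportion(liks)
candidates = [i for i, (ln, ls) in enumerate(liks)
              if posterior_selected(ln, ls, f_hat) > 0.5]
print(round(f_hat, 3), candidates)
```

The 0.5 cutoff used to flag candidates is arbitrary here; any threshold on the posterior probability could be substituted.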
Two models: to be considered selected, a gene needs to fail the
neutral model and be accepted by the selected model.
[Figure: two demographic diagrams, "neutral" and "selected", each labeled with Na, Nb, Np, t1 and t2]
Genes significant for selection

Locus         S inb.  S teo.  Probability of being  Annotated BLAST hit
                              in selected class
scl394_p3**   0       27      0.74                  Arabidopsis thaliana L28 ribosomal protein
scl491_p3**   0       13      0.62                  Maize dihydrodipicolinate synthase
scl405_p3**   0       12      0.59                  Unknown expressed protein
scl427_p2*    0       16      0.54                  A. thaliana DNAJ heat shock protein
scl526_p3**   1       16      0.54                  Maize hexokinase
scl499_p5**   0       12      0.51                  Unknown expressed protein
scl512_p1**   0       16      0.51                  Triticum adenylosuccinate synthetase
scl536_p4**   0       17      0.49                  Oryza sativa putative acetyl transferase
scl531_p4**   0       11      0.46                  Oryza sativa putative auxin-induced protein
scl457_p4*    0       7       0.45                  Oryza sativa putative growth factor
On a genomic scale….
• Assume 40,000 genes in maize
• 40,000 x 0.04 = 1600 selected genes
• Before genome scans, 11 genes had been identified
as selected by population genetic approaches
• By sequencing 1000 genes, we have ~30 novel
candidates
• These genes need to be divided between
domestication and improvement
What genes show evidence of selection?
• Genes involved in amino acid synthesis or
metabolism
• Genes involved in growth response.
• Transcription factors and signal transduction
components.
• Unique genes with no significant BLAST
homologies.
Are genes with low inbred diversity enriched for
domestication and improvement candidates?
(Masanori Yamasaki)
Chose 35 genes with no diversity among the MPZ inbred set.
Sequenced same region in 16 haploid landrace samples, 16
teosinte partial inbreds and a Tripsacum dactyloides sample.
Performed Hudson-Kreitman-Aguadé (HKA) tests for
selection on inbreds, landraces and teosintes against the
neutral genes adh1, glb1, fus6 and bz2.
Performed coalescent simulations of domestication (CS) of
inbreds vs. teosintes and landraces vs. teosintes.
[Figure: plots of π (y-axis) against nucleotide position in bp (x-axis) for eight genes: ARF, Amino Acid Transporter, GTP-binding Protein, Unknown, F-box (circadian clock), Ankyrin repeat, Fruit protein, Chromatin remodeling]
Unigene   Inbreds                                    Teosintes                                  Candidate      Homology search
          N   L      S   P HKAtotal   P HKAsilent    N   L      S   P HKAtotal   P HKAsilent    status
AY108876  14  1,055  1   < 0.0068 **  < 0.0120 *     16  1,026  13  < 0.0433 *   < 0.1849       Selected Gene  Amino acid transporter
AY107195  14  3,119  1   < 0.0058 **  < 0.0087 **    11  3,097  81  < 0.5889     < 0.6761       Selected Gene  Auxin response factor
AY110109  14  1,466  1   < 0.0051 **  < 0.0054 **    14  1,355  43  < 0.3613     < 0.5321       Selected Gene  GTP-binding protein
AY105060  14  1,090  0   < 0.0041 **  < 0.0053 **    15  1,112  59  < 0.7005     < 0.7631       Selected Gene
AY108178  14  1,259  0   < 0.0054 **  < 0.0082 **    13  1,224  54  < 0.3233     < 0.4719       Selected Gene
AY106616  14  2,745  84  < 0.4395     < 0.2205       7   2,619  97  < 0.6859     < 0.7214       -              Ankyrin repeat-like protein
AY107952  14  2,469  23  < 0.1193     < 0.0927       14  2,599  38  < 0.1453     < 0.1678       -              Putative fruit protein, Oxidoreductase
AY106371  14  1,574  4   < 0.0094 **  < 0.0061 **    15  1,615  65  < 0.4603     < 0.4047       Selected Gene  Circadian clock

N = number of sequences, L = length of alignment, S = number of segregating sites; P values are from HKA tests on total (HKAtotal) and silent (HKAsilent) sites.
Homology entry not matched to a row in the source: Putative methyl-binding domain protein
Do genes exhibiting signatures of
selection control agronomic traits?
(Sherry Flint-Garcia)
• Hypothesis: manipulation of the expression of
domestication and improvement genes will alter key
agronomic traits
• Methods: use genetic and transgenic approaches to
examine teosinte, exotic, and inbred alleles
• Test case: amino acid composition in kernels
• Evidence for selection for cysteine synthase,
chorismate mutase, dihydrodipicolinate synthase
and hexokinase
To what extent has diversity in amino
acid synthesis genes been reduced by
selection? (Sherry Flint-Garcia)
• Whitt et al., 2002 demonstrated that 3 of 6 genes in
the starch synthesis pathway in maize show solid
evidence of artificial selection
• Evidence for selection for cysteine synthase,
chorismate mutase, dihydrodipicolinate synthase
and hexokinase from random sequencing
• Chose 16 additional genes for important steps in
amino acid synthesis, sequenced in teosintes,
landraces and inbreds and conducted tests of
selection
[Figure: bar charts of kernel amino acid content for Teosinte (n = 7), Landraces (n = 11) and Maize (n = 27). One panel shows total amino acid as percent of kernel weight; the other shows each amino acid as percent of total amino acid: Alanine, Arginine, Aspartic Acid, Cysteine, Glutamic Acid, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine. Each bar group carries significance markers (** or ns) for the comparisons Teosinte vs. Landraces and Teosinte vs. Inbred Lines; most are **.]
[Figure: amino acid biosynthesis pathway map running from glucose, the TCA cycle and nitrate assimilation (NO3–, NO2–, NH4) to the individual amino acids, lignin and S-adenosylmethionine. Labeled enzymes and regulators: PAL, cysteine synthase, 2-isopropylmalate synthase, chorismate mutase, anthranilate synthase β, acetohydroxy acid synthase, asparagine synthetase, threonine deaminase, aspartate kinase, cystathionine γ-synthase, tryptophan synthase β1, aspartate aminotransferase, DHDP synthase, glutamate dehydrogenase, proline dehydrogenase, SAM synthetase I, SAM synthetase II, nitrate reductase, hexokinase (N:C sensing), ntl1 (nitrogen regulating protein)]
Sequencing candidate genes
• Goal is to sequence 1000 candidate genes in all
inbreds for the 25DL, 16 teosintes, 2 Tripsacum,
and W22 R-std
• Shared responsibility by E. Buckler and
M. McMullen laboratories
• Develop SNP (or sequence) based assays for
association analysis
• Develop a mechanism to accept candidate gene
suggestions from outside the project
• www.panzea.org
[Figure: diversity levels of 100%, 80% and 60%, with gene classes of 38,000 genes, 1,000 genes and 1,000 genes]
Implications for GEM
• For the vast majority of genes, inbred lines retain
on average 60% of the common diversity of teosinte
and 80% of the diversity of landraces. Therefore
the loss of diversity is a problem specific to
particular genes and traits rather than a general
problem
• Most of the diversity lost in unselected genes is in
rare alleles and therefore hard to capture
Implications for GEM
• Our studies to date have not addressed
specific adaptation, possibly a more
important justification for GEM than limited
diversity per se
• It is hard for me to think about how to tap
diversity for specific adaptation without
considering diversity in a trait context.