Reaping the benefits of genome sequence and resequence information for chickpea improvement
Mahendar Thudi
The International Year of Pulses (IYP) 2016 aims to heighten public awareness of the nutritional benefits of pulses as part of sustainable food production aimed towards food security and nutrition
Chickpea is an important pulse crop for food and nutritional security
The chickpea crop
- Second most important food legume globally
- Genome size: ~740 Mbp
- Grown on about 13.54 m ha across 59 countries, with 13.10 m tons of total production
- Marginal environments in Sub-Saharan Africa and Asia
[Photos: Kenya, Myanmar, Ethiopia, India]
Production constraints
- Abiotic stresses: terminal drought, salinity, low temperature
- Biotic stresses: wilt, Ascochyta blight, pod borer, Botrytis grey mould
[Figure: bar chart in million US$ (axis from -1500 to 3000) with bars for total production, drought, low temperature, wilt, Ascochyta blight and pod borer]
Translational genomics in agriculture (TGA)
How can we use genomics to overcome the problems of international agriculture?
[Diagram: Genomics (incl. informatics), Genetics, Breeding, -logy disciplines]
Varshney et al. 2014 PLoS Biology
Varshney et al. 2015 Crit Rev Plant Sci
Genome sequence and re-sequence
- Illumina sequencing used to generate 153.01 Gb
- 73.8% of the genome is captured in scaffolds
- Genome analysis predicted 28,269 genes
- High levels of synteny observed between chickpea and Medicago
- More than 81,845 SSRs and 4.4 million variants (SNPs and indels)
Samples:
- 5 wild species
- 25 landraces
- 60 breeding lines
Or
- 5 wild species
- 57 Desi genotypes
- 28 Kabuli genotypes
Varshney et al. 2013 Nature Biotechnology
35 parental lines of mapping populations
Parental lines segregating for abiotic, biotic and nutritionally important traits were targeted
Crosses:
- ICC 4958 × ICC 1882
- ICC 283 × ICC 8261
- ICC 6263 × ICC 1431; ICCV 2 × JG 11
- WR 315 × C104
- ICCV 2 × ICC 1496; ICCV 10 × ICC 1496
- ICC 506-EB × Vijay; ICC 3137 × IG 72953; ICC 3137 × IG 72933
- ICCV 2 × JG 62
- JG 62 × ICCV 05530; Pb 7 × ICCV 04516
- ICC 995 × ICC 5912
Segregating traits: drought, salinity, Ascochyta blight, Fusarium wilt (FW), Botrytis grey mould (BGM), Helicoverpa (pod borer), protein content
[Figure: pie chart of the lines by type: Kabuli 17%, Wild 9%, Desi]
Data generation and analysis
- 35 select inbred lines sequenced using HiSeq 2500
- 3.41 trillion base pair data
- 916.21 million 150 bp reads
- Avg. sequencing depth of 8.6X
- 916 million clean reads
- 92.19% reads with 82.18% average coverage from each genotype were aligned to the reference genome CDC Frontier
Outline of data analysis: Illumina sequencing -> raw reads -> quality filtering (failed reads discarded) -> clean reads -> mapping reads to reference genome using SOAP -> structural variations
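The quality-filtering step in the outline can be illustrated with a minimal sketch. The slides do not state which filter or threshold was used, so the mean Phred cutoff of 20, the Phred+33 encoding and the function names below are all assumptions made for illustration only.

```python
# Illustrative read filter: NOT the pipeline used in the study.
# Assumptions: Phred+33 quality strings, reads pass when mean quality >= 20.

def mean_phred(quality, offset=33):
    """Average Phred score of one read's quality string."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

def split_reads(records, min_mean_q=20):
    """Split (name, sequence, quality) records into clean and failed lists."""
    clean, failed = [], []
    for name, seq, qual in records:
        if qual and mean_phred(qual) >= min_mean_q:
            clean.append((name, seq, qual))
        else:
            failed.append((name, seq, qual))  # these would be discarded
    return clean, failed

# Hypothetical toy reads
toy_reads = [
    ("r1", "ACGT", "IIII"),  # Phred 40 at every base
    ("r2", "ACGT", "####"),  # Phred 2 at every base
    ("r3", "ACGT", "5555"),  # Phred 20 at every base
]
clean_reads, failed_reads = split_reads(toy_reads)
print([r[0] for r in clean_reads], [r[0] for r in failed_reads])
```

Only the clean list would move on to the mapping step; in practice a dedicated read-trimming tool would normally do this work.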
Indels in 35 inbred lines
The maximum and minimum indel ratios were 1.03 (JAKI 9218) and 0.81 (ILC 3279), respectively
Codon insertions in genes
Two codon insertions in gene Ca_04570, related to the 7S seed storage gene family
Jaganathan et al. 2015
Identified 26 candidate genes
Structural variations
Type of structural variation, by pseudomolecule:

Type    Ca1   Ca2   Ca3   Ca4   Ca5   Ca6   Ca7   Ca8   Total
CTX       4     2     2     4     9    10    30     6      67
DEL      92    58    42    46    75    65    53    10     441
INS      22     1    12    14    73    26    30    23     201
INV       0     1     0     2     0     8     2     0      13
ITX       0    11     2     4     4     2     4     3      30
Total   118    73    58    70   161   111   119    42     752

Ca5 had the largest number of structural variations
[Figure label: large deletion]
CTX - inter-chromosomal translocations, DEL - deletions, INS - insertions, INV - inversions, ITX - intra-chromosomal translocations
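As a quick consistency check, the row and column totals of the structural-variation table can be recomputed from the per-pseudomolecule counts. The sketch below simply re-enters the numbers from the slide; the variable and function names are illustrative.

```python
# Counts transcribed from the structural-variation table (Ca1..Ca8 order).
PSEUDOMOLECULES = ["Ca1", "Ca2", "Ca3", "Ca4", "Ca5", "Ca6", "Ca7", "Ca8"]
SV_COUNTS = {
    "CTX": [4, 2, 2, 4, 9, 10, 30, 6],
    "DEL": [92, 58, 42, 46, 75, 65, 53, 10],
    "INS": [22, 1, 12, 14, 73, 26, 30, 23],
    "INV": [0, 1, 0, 2, 0, 8, 2, 0],
    "ITX": [0, 11, 2, 4, 4, 2, 4, 3],
}

def totals_by_type(counts):
    """Sum each variation type across all pseudomolecules."""
    return {sv_type: sum(row) for sv_type, row in counts.items()}

def totals_by_pseudomolecule(counts, names):
    """Sum all variation types within each pseudomolecule."""
    return {
        name: sum(row[i] for row in counts.values())
        for i, name in enumerate(names)
    }

by_type = totals_by_type(SV_COUNTS)
by_chrom = totals_by_pseudomolecule(SV_COUNTS, PSEUDOMOLECULES)
richest = max(by_chrom, key=by_chrom.get)  # pseudomolecule with most SVs

print(by_type)
print(by_chrom)
print(richest, sum(by_type.values()))
```

The recomputed totals agree with the table, with Ca5 carrying the most structural variations.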
Genome wide line specific variations
- PI 489777, ICC 3137, IG 72953 and IG 72933 segregate for Helicoverpa resistance
- The maximum number of line specific variations, 78,320, was observed in PI 489777 (68,799 SNPs and 9,521 indels), followed by IG 72953 (55,393 SNPs and 7,415 indels)
Thudi et al. 2016, BMC Plant Biology
[Figure: Circos diagrams plotted from line specific variations (SNPs, indels, deletions and duplications) using Circos software. The outer circle represents line specific SNPs, followed by line specific indels, deletions and duplications. Inside each circle, four lines with distinct colors show the genotypes (PI 489777, ICC 3137, IG 72953 and IG 72933).]
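One way to think about "line specific" variations is as variants called in exactly one line of the compared set. The slides do not spell out the definition used, so the sketch below works under that assumption, with hypothetical toy positions and an illustrative function name.

```python
from collections import Counter

def line_specific_variants(calls):
    """Return, for each line, the variants seen in that line only.

    calls: dict mapping line name -> set of (chromosome, position, allele).
    Assumption: 'line specific' means present in exactly one line.
    """
    seen_in = Counter()
    for variants in calls.values():
        seen_in.update(variants)  # sets, so each line counts a variant once
    return {
        line: {v for v in variants if seen_in[v] == 1}
        for line, variants in calls.items()
    }

# Hypothetical toy calls; positions are made up for illustration.
toy_calls = {
    "PI 489777": {("Ca1", 101, "A"), ("Ca1", 250, "T"), ("Ca4", 77, "G")},
    "ICC 3137": {("Ca1", 101, "A"), ("Ca2", 12, "C")},
    "IG 72953": {("Ca1", 101, "A"), ("Ca4", 77, "G"), ("Ca5", 900, "T")},
    "IG 72933": {("Ca1", 101, "A")},
}
specific = line_specific_variants(toy_calls)
for line, variants in specific.items():
    print(line, sorted(variants))

# The reported total for PI 489777 is the sum of its SNPs and indels.
pi_489777_total = 68_799 + 9_521
print(pi_489777_total)
```

The last line reproduces the 78,320 line specific variations reported for PI 489777.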
129 released varieties
- Representing 10 countries (India, Canada, Australia, Ethiopia, Bangladesh, Kenya, Myanmar, Sudan, Bulgaria, Spain) and ICARDA
- Grouped by year of release:
a. before 1993
b. 1993-2002
c. 2003-2012
Genome wide variation - 129 lines

Group                         SNPs        Indels    CNVs    PAVs     SVs
All_129 genotypes             1,378,790   151,440   3,822   24,593   24,126
Market type (129 genotypes)   1,378,790   151,440   3,822   24,593   24,126
Desi (88 genotypes)           1,323,116   144,046   2,954   22,819   22,398
Kabuli (41 genotypes)         1,150,365   103,036   3,273   20,818   20,406
Year wise (124 genotypes)     1,374,090   150,246   3,811   24,078   23,652
Before 1993 (38 genotypes)    1,172,636   123,549   2,315   19,856   19,482
1993-2002 (40 genotypes)      1,194,990   103,514   2,318   20,331   19,967
2003-2012 (46 genotypes)      1,242,979   112,370   3,511   21,137   20,776

CNVs - copy number variations; PAVs - presence absence variations; SVs - structural variations
- Varieties released between 2003 and 2012 are more diverse
- SNPs, indels, CNVs, PAVs and SVs were high on Ca4
Population structure and genetic diversity
- More kabuli varieties developed after 1993
- Increase in diversity in released varieties after 2002
Chickpea reference set (300 lines from 35 countries)
Genome wide association for drought and heat tolerance
Reference set phenotyped:
- 1-6 seasons
- 1-3 locations in India (Patancheru, Kanpur, Bangalore)
- three locations in Africa (Nairobi and Egerton in Kenya, and Debre Zeit in Ethiopia)
Whole genome re-sequencing provided 4 million SNPs
Both generalized linear model (GLM) and mixed linear model (MLM) were used for identifying markers associated with traits of interest.
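The GLM part of such a scan can be pictured as one least-squares regression per SNP. The sketch below is a simplified illustration, not the software or settings used in the study: it assumes genotypes coded as allele dosage (0, 1, 2), has no covariates, and reads the regression R-squared as the phenotypic variance explained (PVE). An MLM would additionally model relatedness among lines as a random effect, which is not shown here.

```python
def single_marker_glm(genotypes, phenotypes):
    """Fit phenotype = a + b * genotype for one SNP by least squares.

    Returns (slope, r_squared, f_statistic). r_squared * 100 is taken here
    as PVE (%), which is an assumption about how PVE is defined.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching genotype/phenotype lists, n >= 3")
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic marker or constant phenotype")
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    slope = sxy / sxx
    ss_reg = slope * sxy            # variation explained by the marker
    ss_res = syy - ss_reg           # residual variation
    r_squared = ss_reg / syy
    f_stat = ss_reg / (ss_res / (n - 2)) if ss_res > 0 else float("inf")
    return slope, r_squared, f_stat

# Hypothetical toy data: six lines, one SNP, one trait.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_phenotypes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
slope, r_squared, f_stat = single_marker_glm(toy_genotypes, toy_phenotypes)
print(slope, round(r_squared * 100, 2), round(f_stat, 2))
```

A genome-wide scan would repeat this for every SNP and convert each F statistic to a P-value (F distribution with 1 and n - 2 degrees of freedom) before applying a significance threshold.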
Root related traits

Trait                            Seasons
Root length (cm)                 3
Root length density (cm cm-3)    3
Root volume (cm3)                3
Root dry weight (g)              3
Rooting depth (cm)               3
Root surface area                3
R-T ratio (%)                    3
Shoot dry weight (g)             3
Stem dry weight (g)              3
Leaf dry weight (g)              3
Projected area                   2
Average diameter                 2

[Photos: root length screening; experiment of chickpea root growth in ROS]
Agronomic traits
Traits were phenotyped for between 2 and 14 seasons each.
- Morphological traits: plant height (cm), plant width (cm), plant stand, apical primary branch, apical secondary branch, basal primary branch, basal secondary branch, tertiary branches
- Yield related traits: pods/plant, 100 SDW (g), yield (g/m2), yield (Kg/ha), yield per plant, production, biomass, biomass/plant, harvest index, TDM weight (g/m2), seeds per pod, seeds/plant
- Phenological traits: days to flowering, days to maturity
- Also recorded: transpiration efficiency, 13C and SPAD
[Photos: field phenotyping under rainfed and irrigated environments; heat tolerance phenotyping]
249 MTAs for drought tolerance related traits

Trait                                 Number of MTAs   P-value                              PVE (%)
Root length density (RLD, cm cm-3)    3                5.73 × 10^-6 to 2.1 × 10^-8          6.5 - 16.6
Root dry weight (RDW, g plant-1)      11               6.81 × 10^-6 to 9.18 × 10^-10        5.58 - 10.49
Root surface area (RSA, cm2 plant-1)  6                9.17 × 10^-6 to 1.65 × 10^-7         5.9 - 10.12
Root volume (RV, cm3 plant-1)         13               7.28 × 10^-6 to 1.43 × 10^-7         5.77 - 10.41
Days to 50% flowering (DF)            24               8.1 × 10^-6 to 7.8 × 10^-9           9.09 - 20.36
Days to maturity (DM)                 48               9.06 × 10^-6 to 4.82 × 10^-8         8.96 - 21.29
100 seed weight (100SDW, g)           98               1.07 × 10^-6 to 2.89 × 10^-22        10.34 - 14.4
Yield (YLD, Kg/ha)                    22               9.42 × 10^-6 to 2.77 × 10^-7         7.16 - 18.6
Biomass (BM, g)                       8                1.6 × 10^-6 to 6.35 × 10^-8          6.29 - 12.02
Harvest index (HI, %)                 15               8.87 × 10^-6 to 1.46 × 10^-8         5.97 - 14.84
Delta carbon ratio (δ13C)             1                6.02 × 10^-7                         20.7
84 MTAs for heat tolerance related traits

Trait                          Number of MTAs   P-value                            PVE (%)
Days to 50% flowering (DF)     10               3.69 × 10^-6 to 1.54 × 10^-8       9.64 - 18.26
Pods per plant (PPP)           11               1.02 × 10^-6 to 1.17 × 10^-8       9.19 - 17.42
100 seed weight (100SDW, g)    16               1.07 × 10^-6 to 2.89 × 10^-22      10.34 - 14.4
Yield (YLD, g)                 16               2.03 × 10^-6 to 7.31 × 10^-9       10.96 - 21.56
Harvest index (HI, %)          14               2.55 × 10^-6 to 2.77 × 10^-8       6.94 - 14.79
Heat tolerance index (HTI)     17               8.58 × 10^-6 to 7.75 × 10^-7       7.45 - 14.11
[Figure: GWAS signals for 100 seed weight]
[Figure: GWAS signals for days to 50% flowering]
Time for coffee
24 significant MTAs under drought
GWAS signals: root length density
SNP locus Ca4_37195367 on Ca4, associated with RLD, explained 16.67% PVE
[Figure: GWAS signals across pseudomolecules Ca1 to Ca8, with Ca_09763 labelled]
- The Ca_09763 gene, associated with correlated traits like RSA and RV, encodes an aspartic protease
- Root proteases are shown to play a key role in nitrogen acquisition and drought tolerance
GWAS signals: biomass
[Figure: GWAS signals across pseudomolecules Ca1 to Ca8]
- 8 significant MTAs for biomass (BM), of which 6 loci were on Ca5 and one each on Ca1 and Ca4
- Two loci were in candidate genes Ca_8986 and Ca_8946, which code for a key plant disease resistance gene (CC-NBS-LRR) and for potassium ion transport
- Four SNP loci (Ca5_26316428, Ca5_26343334, Ca5_26421763 and Ca5_26788879) on Ca5 were found associated with BM and HI under drought stress
Whole genome re-sequencing (WGRS) of MAGIC lines
- On average, the sequencing depth of the 8 parental lines was 48X and 96% of the reads generated were aligned
- 1135 MAGIC lines were sequenced at an average depth of 7X and 96% of the reads generated were aligned
- 51.69 billion clean reads (90 bp) containing 4.65 TB of clean data were generated
[Figure: bar chart of variant counts (axis from 0 to 250,000) per pseudomolecule, Ca1 to Ca8; series: deletions, insertions, total SNPs in genes, synonymous SNPs, non-synonymous SNPs, total SNPs, total variant count]
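The reported data volume follows directly from the read count and read length. The short calculation below is only an arithmetic check of the figures on this slide; the function name is illustrative.

```python
def total_bases(read_count, read_length_bp):
    """Total sequenced bases for reads of uniform length."""
    return read_count * read_length_bp

clean_reads = 51.69e9   # 51.69 billion clean reads, from the slide
read_length = 90        # bp, from the slide

bases = total_bases(clean_reads, read_length)
terabases = bases / 1e12  # 1 terabase = 10^12 bases
print(f"{terabases:.2f} x 10^12 bases")
```

The product is about 4.65 × 10^12 bases, which matches the 4.65 figure quoted on the slide when read as terabases.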
Genome-wide variations in parents of MAGIC population
(LS = line specific)

Genotype      Total SNPs   LS SNPs   Deletions   LS deletions   Insertions   LS insertions   Genes deleted   Genes duplicated
ICC 4958      340,803      3         29976       1              28429        0               23              47
ICCV 00108    125,680      49        9217        6              8907         2               30              323
ICCV 10       174,644      219       14137       20             13534        22              14              21
ICCV 97105    261,406      120       24016       6              24274        13              18              42
JAKI 9218     315,032      1         30320       0              31339        0               8               1120
JG 11         207,785      496       17690       36             17640        31              21              25
JG 130        166,662      13        12936       2              12342        3               25              17
JG 16         147,615      95        10579       10             10094        6               27              73

Deletions, insertions and duplications were highest in JAKI 9218
Summary
- Resequencing of 90 elite lines provided thousands of SSRs and SNPs
- Resequencing of parental lines provided genome wide variations for trait mapping
- Resequencing of 129 lines provided insights into spatial and temporal trends in diversity
- Resequencing of the reference set provided insights into genetic diversity and population structure
- Diversity existing in the germplasm lines, including the MAGIC population, can be deployed for chickpea improvement
Acknowledgements
ICRISAT
Rajeev K Varshney
Manish Roorkiwal
Anu Chitikineni
Abhishek Rathore
Dadakhalandar Doddamani
Aamir W Khan
Pooran Gaur
Hari Upadhayaya
BGI
Weiming He
Jianbo Jian
Gengyun Zhang
Jun Wang
ICAR
Narendra P Singh
SK Chaturvedi
Swapan K Datta
Funding Agencies
AISRF
GCP
Thanks for your attention