Molecular_Plant_Breeding_Theories_and_Applications-4
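Lecture outline
1. What is molecular breeding
2. Historical review
3. Whole-genome strategies
4. Genotyping
5. Phenotyping
6. Envirotyping (etyping)
7. Marker-trait association analysis
8. Marker-assisted selection
9. Decision support systems
10. Outlook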
Evolution of Genotyping (1980-2010s)
Systems: from gels to chips and sequencing (GBS)
Throughput: from singles to millions
Resolution: from 10-30 cM to many markers per gene
Cost (per data point): from several US dollars to 1/1000 of a cent
Marker type: morphological, cytological, protein, DNA (RFLP, RAPD, AFLP, SSR, SNP)
Xu 2010, Molecular Plant Breeding, CABI Publisher
Molecular basis of DNA markers
A single-nucleotide polymorphism (SNP) is a DNA sequence variation occurring when a single nucleotide (A, T, C, or G) in the genome differs between members of a biological species. (Wiki)
Revised from: Xu 2010, Molecular Plant Breeding, CABI Publisher
Copy-number variations (CNVs), a form of structural variation, are alterations of the DNA of a genome that result in the cell having an abnormal number of copies of one or more sections of the DNA.
Presence/Absence Variation (PAV)
[Figure: chromosome distribution in two samples. A: presence; B: absence.]
Presence/Absence Variation (PAV) results in many genes that cannot be mapped by regular linkage mapping with SNP markers.
The concept of the haplotype and its development
A haplotype is a group of genes within an organism that was inherited
together from a single parent. A haplotype can describe a pair of genes
inherited together from one parent on one chromosome, or it can
describe all of the genes on a chromosome that were inherited together
from a single parent. This group of genes was inherited together
because of genetic linkage.
The term "haplotype" can also refer to the inheritance of a cluster of
single nucleotide polymorphisms (SNPs), which are variations at single
positions in the DNA sequence among individuals.
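Haplotypes formed by SNPs within a functional region
Haplotypes formed by SNPs within a gene
Haplotypes formed by SNPs within a chromosome
Haplotypes formed by SNPs across the whole genome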
From SNPs to haplotypes and tag SNPs

SNPs (SNP1, SNP2 and SNP3 compared across four chromosomes):
Chromosome 1: AACACGCCA .... TTCGGGGTC .... AGTCGACCG ....
Chromosome 2: AACACGCCA .... TTCGAGGTC .... AGTCAACCG ....
Chromosome 3: AACATGCCA .... TTCGGGGTC .... AGTCAACCG ....
Chromosome 4: AACACGCCA .... TTCGGGGTC .... AGTCGACCG ....

Haplotypes (Haplotype 1 to Haplotype 4; the 12 individuals carry four distinct sequences):
Individual 01: CTCAAAGTACGGTTCAGGCA
Individual 02: CTCAAAGTACGGTTCAGGCA
Individual 03: CTCAAAGTACGGTTCAGGCA
Individual 04: CTCAAAGCACGGTTGAGGCA
Individual 05: CTCAAAGCACGGTTGAGGCA
Individual 06: CTCAAAGCACGGTTGAGGCA
Individual 07: CTCGAAGTACGGTTCAGGCA
Individual 08: CTCGAAGTACGGTTCAGGCA
Individual 09: CTCGAAGTACGGTTCAGGCA
Individual 10: CTCAAAGCACGGTTCAGGCA
Individual 11: CTCAAAGCACGGTTCAGGCA
Individual 12: CTCAAAGCACGGTTCAGGCA

Tag SNPs: A/G, T/C, C/G
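The step from individual sequences to haplotypes and tag SNPs can be sketched in a few lines of Python. This is only an illustration on the 12 sequences shown on the slide, not the method used in any particular study: it collapses the sequences into distinct haplotypes, finds the variable sites, and brute-forces the smallest set of sites that still tells every haplotype apart. The function names are made up for this sketch, and the exhaustive search is only practical for toy-sized inputs.

```python
from itertools import combinations

# The 12 individual sequences from the slide (four distinct sequences, three copies each).
sequences = (
    ["CTCAAAGTACGGTTCAGGCA"] * 3
    + ["CTCAAAGCACGGTTGAGGCA"] * 3
    + ["CTCGAAGTACGGTTCAGGCA"] * 3
    + ["CTCAAAGCACGGTTCAGGCA"] * 3
)


def distinct_haplotypes(seqs):
    """Return the distinct sequences in order of first appearance."""
    return list(dict.fromkeys(seqs))


def variable_sites(haps):
    """Return 0-based positions where the haplotypes are not all identical."""
    return [i for i in range(len(haps[0])) if len({h[i] for h in haps}) > 1]


def minimal_tag_snps(haps):
    """Brute-force the smallest set of variable sites that separates every haplotype."""
    sites = variable_sites(haps)
    for size in range(1, len(sites) + 1):
        for subset in combinations(sites, size):
            signatures = {tuple(h[i] for i in subset) for h in haps}
            if len(signatures) == len(haps):
                return list(subset)
    return sites


haps = distinct_haplotypes(sequences)
tags = minimal_tag_snps(haps)
for pos in tags:
    # Alleles are listed in order of first appearance among the haplotypes.
    alleles = "/".join(dict.fromkeys(h[pos] for h in haps))
    print(f"position {pos + 1}: {alleles}")
```

On this toy data no pair of sites separates all four haplotypes, so all three variable sites are needed, which agrees with the three tag SNPs (A/G, T/C, C/G) on the slide.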
SNP Genotyping Platforms
Winner?
# Markers
Throughput
Cost
Data delivery
Service
Genotyping by Arraying (Chips)
● Three Illumina 1536-SNP chips:
Illumina-Cornell-CIMMYT collaboration
Yan et al 2009; Yan et al 2010
● Illumina MaizeSNP50 BeadChip:
Up to 56,110 SNPs, 1 SNP/40 kb
Covering 19,540 genes, 2 SNPs/gene
Functionally tested with over 30 diverse maize lines
Developed by Illumina in collaboration with TraitGenetics, INRA, and Syngenta
SNP genotyping by Array Tape
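The Douglas Scientific Array Tape platform includes:
Nexar Inline Liquid Handling System
Soellex Thermal Cycler
Araya Inline Fluorescence Scanner
Centrifuge
Kraken SNPline XL System
High throughput: processes 400 384-well reaction arrays per day (about 150,000 data points)
Low running cost: very small reaction volumes save 80-90% of reagents
Modular design:
NEXAR low-volume liquid handling system
SOELLEX high-throughput PCR system
ARAYA scanning system
Especially suited to analysing large numbers of samples with a small number of markers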
Genotyping By Sequencing (GBS)
GBS technology enables the detection of a wider range
of polymorphisms: SNPs plus small indels
No pre-discovery or validation
Applicable to any species or population
GBS approaches
Simply sequencing the entire genome of every individual: expensive
Several extant methods each enrich for a portion of the genome, which is then sequenced. Enrichment is most often achieved via restriction enzyme (RE) digestion. Because recognition sites come only in lengths of 4, 6 or 8 bp, the "tunability" of extant methods is limited.
Huang et al., 2010, Nature Genetics; Andolfatto et al., 2011, Genome Research; Elshire et al., 2011, PLoS ONE; Davey et al., 2011, Nature Reviews Genetics
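The tunability point can be made concrete with a rough calculation. The sketch below assumes equal and independent base frequencies, which real genomes do not satisfy, so the numbers are orders of magnitude only. Under that assumption a recognition site of length n occurs about once every 4^n bp, so moving from a 4-bp to a 6-bp to an 8-bp cutter changes the expected site spacing in coarse 16-fold steps. The helper names and the toy sequence are invented for illustration; GAATTC is the EcoRI recognition site.

```python
import re


def expected_site_spacing(site_length):
    """Expected bp between recognition sites, assuming equal, independent base frequencies."""
    return 4 ** site_length


def cut_positions(sequence, recognition_site):
    """Return 0-based start positions of every occurrence of the recognition site."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = "(?=" + re.escape(recognition_site) + ")"
    return [m.start() for m in re.finditer(pattern, sequence.upper())]


for n in (4, 6, 8):
    print(f"{n}-bp cutter: about one site every {expected_site_spacing(n):,} bp")

toy = "ACGAATTCGGTTGAATTCAA"  # made-up sequence for illustration
print(cut_positions(toy, "GAATTC"))
```

With only three site lengths available, the fraction of the genome sampled cannot be adjusted smoothly, which is the limitation the slide refers to.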
Genotyping-By-Sequencing (GBS)
Created for high-throughput, semi-automated genotyping
Workflow: Sample plants → Isolate DNA → Restriction digest → Ligate adaptors → Pool & amplify → Sequence
[Figure labels: genomic DNA, sticky ends, barcode, sequencing adaptor]
• Advantages
One-step SNP discovery + genotyping
Simple protocol; no reference required
Large numbers of SNPs found cheaply
Broadly applicable
• Drawbacks
False SNPs from sequencing errors
Missing data from stochastic sampling
Images: Qiagen, Illumina, Elshire et al 2011, PLoS ONE
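1. Restriction digestion
2. Adaptor ligation
3. Pooling
4. Fragment size selection
5. Sequencing
6. Quality control
7. Sequence alignment
8. HMM fitting
9. Downstream analysis
Andolfatto et al. 2011, Genome Research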
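Because samples are pooled before sequencing, the reads have to be assigned back to individual samples before quality control and alignment. A minimal sketch of that demultiplexing step follows, assuming each read starts with its sample barcode and that barcodes are matched exactly. The barcodes, sample names and reads are hypothetical, and production pipelines typically also tolerate sequencing errors in the barcode, which this sketch does not.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Assign each read to a sample by exact match of its leading barcode.

    The barcode is trimmed from the read. Reads that match no barcode are
    collected under the key None.
    """
    by_sample = defaultdict(list)
    # Try longer barcodes first so a short barcode cannot shadow a longer one.
    ordered = sorted(barcode_to_sample, key=len, reverse=True)
    for read in reads:
        for barcode in ordered:
            if read.startswith(barcode):
                by_sample[barcode_to_sample[barcode]].append(read[len(barcode):])
                break
        else:
            by_sample[None].append(read)
    return dict(by_sample)


# Hypothetical barcodes and reads, for illustration only.
barcode_table = {"ACGT": "plant_01", "TGCAA": "plant_02"}
reads = ["ACGTCAGCTTGA", "TGCAACAGCAAGT", "GGGGCAGCTTGA", "ACGTCAGCATGA"]

result = demultiplex(reads, barcode_table)
for sample, sample_reads in result.items():
    print(sample, sample_reads)
```

Keeping the unmatched reads under a separate key makes it easy to check what fraction of the run could not be assigned, which is a useful quality-control figure.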
GBS: Competitive Landscape
1 Commercialized by Floragenex Inc.
2 Not disclosed; Data2Bio’s proprietary technology
From P. S. Schnable
Maize GBS 2.7 Build
Trained on 32K taxa, including extensive CIMMYT material (landraces and diverse breeding materials)
45K taxa now scored with this build
960K core SNPs
Production Tags On Physical Map (TOPM) file for one-step SNP calling available at panzea.org (imputation and calling in 15 min)
Ed Buckler, personal comm.
Genotyping by Whole Genome Sequencing
Sequencing everything!
Resequencing to discover SNPs, haplotypes and tag SNPs
Tag SNPs can be developed to represent haplotypes. Each tag SNP represents one haplotype fragment.
A set of tag SNPs can be developed to represent whole-genome diversity.
Approaches to Reduce Cost and Increase Scale in Genotyping
Seed-based DNA genotyping
Efficient sample tracking
Selective genotyping and pooled DNA analysis
Integrated diversity analysis, genetic mapping and MAS
Developing breeding strategies for simultaneous improvement of multiple traits
Seed DNA-based Genotyping in Maize
① Soaking
② Sampling
③ Grinding
④ DNA extraction
⑤ PCR and genotyping
⑥ Tracking back and planting
Gao et al 2008 Mol Breed 22:477–494
Automatic seed chipping
Laser-assisted seed selection
Selective Genotyping: QTL Effects and Population/Tail Sizes
[Figure: four panels, one per population size (N = 200, 500, 1000 and 3000). Vertical axis: 0 to 100. Labels: 20%, 15%, 10%, 5%, 3%, 1% and 100, 50, 30, 15.]
Sun et al 2010 Mol Breed 26:493–511
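The selection step behind selective genotyping can be sketched as follows: rank the population by phenotype and genotype only the individuals in the two tails. The sketch assumes that the values 100, 50, 30 and 15 in the figure are tail sizes (individuals per tail), which the extracted text does not state explicitly. The phenotypes are simulated from a standard normal distribution purely for illustration, and the function name is made up.

```python
import random


def select_tails(phenotypes, tail_size):
    """Return (low_tail, high_tail): IDs of the tail_size lowest and highest phenotypes."""
    if tail_size <= 0 or 2 * tail_size > len(phenotypes):
        raise ValueError("tail_size must be positive and the tails must not overlap")
    ranked = sorted(phenotypes, key=phenotypes.get)  # IDs sorted by ascending phenotype
    return ranked[:tail_size], ranked[-tail_size:]


random.seed(1)
N = 200  # population size, as in the first panel
phenotypes = {f"plant_{i:03d}": random.gauss(0.0, 1.0) for i in range(N)}

low, high = select_tails(phenotypes, tail_size=30)
genotyped = len(low) + len(high)
print(f"genotyping {genotyped} of {N} plants ({genotyped / N:.0%})")
```

With N = 200 and 30 individuals per tail, only 60 plants are genotyped while all 200 are phenotyped, which is where the cost saving comes from.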
Bulked or Pooled DNA Analysis
[Figure, panels A and B: individuals are selected from the two extremes of the population distribution (high tail and low tail, or R plants and S plants) and combined into DNA pools. The pools are genotyped with PCR markers, chip genotyping, DNA sequencing or RNA sequencing. Allele frequency (0.0 to 1.0) is shown for linked and unlinked markers.]
Xu, 2010, Molecular Plant Breeding, CABI
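The contrast between linked and unlinked markers in pooled analysis comes down to an allele-frequency difference between the two pools. The sketch below uses made-up diploid genotypes for four plants per pool. In practice pool frequencies are estimated from signal intensities or read counts rather than individual genotypes, so this only illustrates the arithmetic, and the function names are hypothetical.

```python
def allele_frequency(genotypes, allele):
    """Frequency of `allele` among diploid genotypes such as 'AA', 'AG' or 'GG'."""
    return sum(g.count(allele) for g in genotypes) / (2 * len(genotypes))


def pool_contrast(high_pool, low_pool, allele):
    """Allele-frequency difference between the high-tail and low-tail pools."""
    return allele_frequency(high_pool, allele) - allele_frequency(low_pool, allele)


# Made-up genotypes at two markers, for illustration only.
linked_high = ["AA", "AA", "AG", "AA"]
linked_low = ["GG", "GG", "AG", "GG"]
unlinked_high = ["AA", "AG", "GG", "AG"]
unlinked_low = ["AG", "GG", "AA", "AG"]

print("linked marker:  ", pool_contrast(linked_high, linked_low, "A"))
print("unlinked marker:", pool_contrast(unlinked_high, unlinked_low, "A"))
```

A marker linked to the trait shows a large frequency difference between the pools, while an unlinked marker stays near 0.5 in both pools and the difference is close to zero.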