Genomics
• Structural, functional
• Genome, Transcriptome, Proteome,
Metabolome, Interactome
www.the-scientist.com
Genomics or Genetics?
“What's the Difference?
Well, as a rule, genetics is the study of single genes in
isolation. Genomics is the study of all the genes in the
genome and the interactions among them and their
environment(s).
Analogy 1
If genomics is like a garden, genetics is like a single
plant. If the plant isn’t flowering, you could study the
plant itself (genetics) or look at the surroundings to see
if it is too crowded or shady (genomics) – both
approaches are probably needed to find out how to
make your plant blossom.”
http://www.genomebc.ca/education/articles/genomics-vs-genetics/
Genomics and Molecular Markers
Structural genomics for plant breeders and
applied geneticists = molecular markers
• How many genes determine important traits?
• Where are these genes located?
• How do the genes interact?
• What is the role of the environment in the phenotype?
• Molecular breeding: Gene discovery, characterization, and
selection using molecular tools
• Molecular markers are a key implement in the molecular
breeding toolkit
What is a Molecular Marker
Markers are based on polymorphisms
• Amplified fragment length polymorphism
• Restriction fragment length polymorphism
• Single nucleotide polymorphism
• The polymorphisms become the alleles at marker loci
• The marker locus is not necessarily a gene: the
polymorphism may be in the dark matter, in a UTR, in an
intron, or in an exon
• Non-coding regions may be more polymorphic
DNA Mutations & Polymorphisms
• Changes in the nucleotide sequence of genomic DNA
that can be transmitted to the descendants.
• If these changes occur in the sequence of a gene, it is
called a mutant allele. The most frequent allele is
called the wild type.
• A DNA sequence is polymorphic if there is variation
among the individuals of the population.
Types of DNA Mutations (1)
Wildtype
5’ – AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG -3’
Substitution (transition: A → G)
5’ – AGCTGAACTCGGCCTCGCGATCCGTAGTTAGACTAG -3’
Substitution (transversion: G → C)
5’ – AGCTCAACTCGACCTCGCGATCCGTAGTTAGACTAG -3’
Deletion (single bp: C, removed from the transversion sequence)
5’ – AGCTAACTCGACCTCGCGATCCGTAGTTAGACTAG -3’
Deletion (DNA segment: CAACTCGACC, removed from the transversion sequence)
5’ – AGCTTCGCGATCCGTAGTTAGACTAG -3’
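The difference between a transition and a transversion can be checked mechanically: a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and anything else is a transversion. The sketch below is only an illustration using the slide's sequences; the function name `classify_substitutions` is made up for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitutions(wildtype, mutant):
    """List (position, ref, alt, kind) for every mismatch; positions are 1-based."""
    if len(wildtype) != len(mutant):
        raise ValueError("substitutions only: sequences must have equal length")
    changes = []
    for pos, (ref, alt) in enumerate(zip(wildtype, mutant), start=1):
        if ref != alt:
            pair = {ref, alt}
            # Same chemical class on both sides means a transition.
            same_class = pair <= PURINES or pair <= PYRIMIDINES
            changes.append((pos, ref, alt, "transition" if same_class else "transversion"))
    return changes

WILDTYPE = "AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG"
TRANSITION = "AGCTGAACTCGGCCTCGCGATCCGTAGTTAGACTAG"
TRANSVERSION = "AGCTCAACTCGACCTCGCGATCCGTAGTTAGACTAG"

print(classify_substitutions(WILDTYPE, TRANSITION))
print(classify_substitutions(WILDTYPE, TRANSVERSION))
```

Insertions and deletions change the sequence length, so they would need an alignment step first; this sketch deliberately rejects them.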
Types of DNA Mutations (2)
Wildtype
5’ – AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG -3’
Insertion (single bp)
5’ – AGCTGAACTACGACCTCGCGATCCGTAGTTAGACTAG -3’
Insertion (DNA segment)
5’ – AGCTGAACTAGTCTGCCCGACCTCGCGATCCGTAGTTAGACTAG -3’
Inversion
5’ – AGCAGTTGACGACCTCGCGATCCGTAGTTAGACTAG -3’
Transposition
5’ – AGCTCGACCTCGCGATCCGTAGTTATGAACGACTAG -3’
Why Use Markers?
A way of dealing with the
• Large number of genes per genome
• Huge genome size
• Technical challenges and cost of whole genome sequencing
The search for DNA polymorphisms was not driven by a desire
to complicate things, but rather by the low number of naked
eye polymorphisms (NEPs)
Markers may be linked to target genes
Markers in target genes are perfect markers
What is a perfect marker for a gene deletion?
DNA Markers
• Polymorphisms can be visualized at the metabolome,
proteome, or transcriptome level but for a number of reasons
(both technical and biological) DNA-level polymorphisms are
currently the most targeted
• Regardless of whether it is a “perfect” or a “linked” DNA
marker, there are two key considerations that need to be
addressed in order for the researcher/user to visualize the
underlying genetic polymorphism
• Applications in Mapping and Marker Assisted Breeding
Key steps for DNA Markers
1. Finding and understanding the genetic basis of the
DNA-level polymorphism, which may be as small as a
single nucleotide polymorphism (SNP) or as large as an
insertion/deletion (INDEL) of thousands of nucleotides
2. Detecting the polymorphism via a specific assay or
"platform". The same DNA polymorphism may be
amenable to different detection assays
Applications of Marker Maps
1. Establish evolutionary relations: homoeology, synteny and orthology
• Homoeology: Chromosomes, or chromosome segments, that are similar in terms
of the order and function of the genetic loci.
Homoeologous chromosomes may occur within a single allopolyploid
individual (e.g. the A, B, and D genomes in wheat)
May also be found in related species (e.g. the 1A, 1B, 1D series of wheat
and the 1H of barley)
• Orthology: Refers to genes in different species which are so similar in sequence
that they are assumed to have originated from a single ancestral gene.
• Synteny:
Classically refers to linked genes on same chromosome
Also used to refer to conservation of gene order across species
2. Associations due to linkage or pleiotropy
• Identify markers that can be used in marker assisted selection
3. Locate genes for qualitative and quantitative traits
• Map-based cloning strategies
Polymorphism Detection Issues
Polymorphisms vs. assays
An ever-increasing number of technology platforms have been,
and are being, developed to deal with these two key
considerations
These platforms lead to a bewildering array of acronyms for
different types of molecular markers. To add to the complexity,
the same type of marker may be assayed on a variety of
platforms
The ideal marker is one that targets the causal polymorphism
(a perfect marker). However, a perfect marker is not always available.
Restriction Fragment Length
Polymorphism (RFLP)
• RFLPs (Botstein et al. 1980) are differences in restriction
fragment lengths caused by a SNP or INDEL that creates
or abolishes a restriction endonuclease recognition site.
• RFLP assays are based on hybridization of a labeled DNA
probe to a Southern blot (Southern 1975) of DNA digested
with a restriction endonuclease
Labeled probe: 3’ TGGCTAGCT 5’
Target 1 (probe hybridizes):
    3’ TGGCTAGCT 5’
       |||||||||
5’-CCTAACCGATCGACTGAC-3’
Target 2 (no complementary site):
5’-GGATTGGCTAGCTGACTG-3’
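Whether the probe can hybridize to a target strand comes down to finding the probe's reverse complement in that strand. The sketch below checks the two target strands from the slide; it is a simplification that requires a perfect match and looks only at the strand shown, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def probe_binding_sites(probe_5to3, target_5to3):
    """0-based start positions on the target where the probe base-pairs perfectly."""
    site = reverse_complement(probe_5to3)
    return [i for i in range(len(target_5to3) - len(site) + 1)
            if target_5to3.startswith(site, i)]

# The slide writes the probe 3'->5' (TGGCTAGCT), so reverse it to get 5'->3'.
PROBE = "TGGCTAGCT"[::-1]
TARGET_1 = "CCTAACCGATCGACTGAC"
TARGET_2 = "GGATTGGCTAGCTGACTG"

print(probe_binding_sites(PROBE, TARGET_1))
print(probe_binding_sites(PROBE, TARGET_2))
```

Target 2 contains the probe's own sequence rather than its complement, so the strand shown gives no binding site.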
RFLP Steps
Co-Dominant RFLP Polymorphism
Allele A (restriction site GAATTC present)
ACATTGCGAATTCATGTACGCAT
TGTAACGCTTAAGTACATGCGTA
Allele a (restriction site absent)
ACATTGCGAAGTCATGTACGCAT
TGTAACGCTTCAGTACATGCGTA
Gel lanes:
Ind 1  Ind 2  Ind 3  Ind 4  Ind 5  Ind 6  Ind 7  Ind 8
A      a      a      A      a      a      A      Aa
Dominant RFLP Polymorphisms
Allele A (restriction site GAATTC present)
ACATTGCGAATTCATGTACGCAT
TGTAACGCTTAAGTACATGCGTA
Allele a (restriction site absent)
ACATTGCGAAGTCATGTACGCAT
TGTAACGCTTCAGTACATGCGTA
Gel lanes:
Ind 1  Ind 2  Ind 3  Ind 4  Ind 5  Ind 6  Ind 7  Ind 8
A      a      a      A      a      a      A      Aa
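To see why a single base change produces a fragment length polymorphism, the two allele sequences from the slides can be digested in silico. The sketch assumes the enzyme is EcoRI (recognition site GAATTC, cutting after the G); the slides do not name the enzyme. The 23 bp toy sequences stand in for real genomic fragments, and the function names are illustrative.

```python
ALLELES = {
    "A": "ACATTGCGAATTCATGTACGCAT",  # contains GAATTC
    "a": "ACATTGCGAAGTCATGTACGCAT",  # T -> G change abolishes the site
}

def digest(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the cut position inside the site (assumed EcoRI: G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]

def band_pattern(genotype):
    """Distinct fragment sizes expected for a genotype written as on the slide (A, a, Aa)."""
    sizes = set()
    for allele in genotype:
        sizes.update(digest(ALLELES[allele]))
    return sorted(sizes)

for genotype in ("A", "a", "Aa"):
    print(genotype, band_pattern(genotype))
```

In this toy case the heterozygote shows the fragments of both alleles, which is what makes a marker co-dominant. Which fragments are actually visible on a blot depends on where the probe hybridizes.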
Features of RFLPs
• Co-dominant, unless probe contains restriction site
• Locus-specific
• Genes can be mapped directly
• Supply of probes and markers is unlimited
• Highly reproducible
• Requires no special instrumentation
• Radioactive detection…
Amplified Fragment Length
Polymorphism (AFLP)
• Fragment genomic DNA with frequent and rare cutters
• AFLPs (Vos et al. 1995) are differences in restriction
fragment lengths caused by SNPs or INDELs that
create or abolish restriction endonuclease recognition
sites.
• AFLP assays are performed by selectively amplifying a
pool of restriction fragments using PCR.
AFLP Protocol
1. Digestion with 2 restriction enzymes: EcoRI (1/4096) and MseI (1/256)
2. Restriction site adapter ligation
3. Selective preamplification (one selective nucleotide per primer; bases shown: T, A)
4. Amplification (three selective nucleotides per primer; bases shown: CTT, ATG)
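The 1/4096 and 1/256 figures for the rare and frequent cutters follow from the length of the recognition site: in random sequence with equal base frequencies, a specific n-bp site is expected once every 4^n bp. The sketch below reproduces those numbers and applies the same arithmetic to selective nucleotides. The recognition sequences GAATTC (EcoRI) and TTAA (MseI) are standard but not stated on the slide, and the equal-frequency model is a simplifying assumption that real genomes do not follow exactly.

```python
def expected_site_spacing(site_length, alphabet_size=4):
    """Average spacing in bp between occurrences of one specific site in random sequence."""
    return alphabet_size ** site_length

def selective_fraction(n_selective_bases):
    """Expected fraction of fragments matching n selective bases (both primers combined)."""
    return 1 / (4 ** n_selective_bases)

ecori_spacing = expected_site_spacing(len("GAATTC"))  # 6-bp recognition site
msei_spacing = expected_site_spacing(len("TTAA"))     # 4-bp recognition site
print(ecori_spacing, msei_spacing)

# Three selective bases on each of the two primers, as in the final amplification step.
print(selective_fraction(3 + 3))
```

Under this model each added selective base cuts the amplified fragment pool roughly fourfold, which is how the number of bands per lane is kept manageable.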
AFLP Polymorphisms
• Polymorphisms between genotypes may arise from:
– Sequence variation in one or both restriction sites
– Sequence variation in the region immediately adjacent to the
restriction sites
– Insertions or deletions within an amplified fragment
• Band Detection
– Denaturing polyacrylamide gel electrophoresis &
autoradiography or silver staining
– Sequencing
Features of AFLPs
• Very high multiplex ratio
• Very high throughput
• Off-the-shelf technology
• Fairly reproducible
• Dominant and co-dominant
• Radioactive detection but less hazardous options available
• Can convert favourite marker to SCAR
Simple Sequence Repeats (SSR)
• SSRs or microsatellites (Nakamura et al. 1987) are
tandemly repeated mono-, di-, tri-, tetra-, penta-, and
hexa-nucleotide motifs
• SSR length polymorphisms are caused by differences
in the number of repeats
• Assayed by PCR amplification using pairs of
oligonucleotide primers specific to unique sequences
flanking the SSR
• Detection by autoradiography, silver staining,
sequencing…
SSR Repeats
Repeat Motifs
• AC repeats tend to be more abundant than other di-nucleotide
repeat motifs in animals (Beckmann and Weber 1992)
• The most abundant di-nucleotide repeat motifs in plants, in
descending order, are AT, AG, and AC (Lagercrantz et al. 1993;
Morgante and Oliveri 1993)
• Typically, SSRs are developed for di-, tri-, and tetra-nucleotide
repeat motifs
• CA and GA have been widely used in plants
• Tetra-nucleotide repeats have the potential to be very highly
polymorphic; however, many are difficult to amplify
Simple sequence repeat in hazelnut
Note the difference in repeat length AND the consistent flanking
sequence
SSR Protocol
Individual 1: (AC)x9, 51 bp
Individual 2: (AC)x11, 55 bp
Chloroplast SSRs of pine
Powell et al. 1995. Proc Natl Acad Sci U S A. 92(17): 7759–7763.
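Because the primers sit in the conserved flanking sequence, the amplicon length is the flanking length plus the repeat tract. The sketch below assumes that the 51 bp product belongs to the (AC)x9 individual and the 55 bp product to the (AC)x11 individual; the slide layout suggests this pairing but does not state it.

```python
def amplicon_length(flank_bp, motif, repeats):
    """PCR product size: primers plus flanking sequence, plus the repeat tract."""
    return flank_bp + len(motif) * repeats

# Infer the combined flanking length from individual 1 (assumed 51 bp with 9 AC repeats).
flank_bp = 51 - len("AC") * 9

for name, repeats in (("Individual 1", 9), ("Individual 2", 11)):
    print(name, amplicon_length(flank_bp, "AC", repeats), "bp")
```

Two extra dinucleotide repeats add 4 bp, which accounts for the 51 bp versus 55 bp difference.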
Features of SSRs
• Highly polymorphic
• Highly abundant and randomly dispersed
• Co-dominant
• Locus-specific
• High throughput
• Can be automated
Diversity Arrays Technology - DArT
DArT Analysis
• 2,500 markers per sample
• 94 samples - ~$4,500
• ~ 2 cents per datapoint
http://www.diversityarrays.com/
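The cost per datapoint can be checked from the other two figures, taking one datapoint as one marker scored in one sample. This is only the arithmetic from the slide; the prices will be out of date.

```python
markers_per_sample = 2500
samples = 94
total_cost_usd = 4500

datapoints = markers_per_sample * samples
cost_per_datapoint_usd = total_cost_usd / datapoints

print(f"{datapoints} datapoints, {cost_per_datapoint_usd * 100:.2f} cents each")
```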
Features of DArT
• Very high multiplex ratio
• Very high throughput
• Bi-allelic
• Dominant marker system
• Requires substantial investment
• Fairly reproducible
• DArT sequences now available
Single Nucleotide Polymorphisms (SNP)
• DNA sequence variations that occur when a
single nucleotide (A, T, C, or G) in the genome
sequence is altered
Alleles      …..ATGCTCTTACTGCTAGCGC……
             …..ATGCTCTTACTGCTAGCGC……
             …..ATGCTCTTCCTGCTAGCGC……
             …..ATGCTCTTACTGCAAGCGC……
Consensus    …..ATGCTCTTNCTGCNAGCGC……
(N marks the single nucleotide polymorphisms, SNPs)
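Finding SNPs in aligned sequences amounts to scanning the alignment column by column and flagging any column with more than one base. The sketch below does this for the four allele sequences on the slide and rebuilds the consensus with N at the variable positions. It assumes the sequences are already aligned and of equal length, and the function name is illustrative.

```python
def call_snps(aligned_seqs):
    """Return (consensus, snps) where snps maps 1-based position -> sorted alleles."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    snps = {}
    for pos, column in enumerate(zip(*aligned_seqs), start=1):
        alleles = sorted(set(column))
        if len(alleles) == 1:
            consensus.append(alleles[0])
        else:
            consensus.append("N")  # variable column
            snps[pos] = alleles
    return "".join(consensus), snps

ALLELES = [
    "ATGCTCTTACTGCTAGCGC",
    "ATGCTCTTACTGCTAGCGC",
    "ATGCTCTTCCTGCTAGCGC",
    "ATGCTCTTACTGCAAGCGC",
]

consensus, snps = call_snps(ALLELES)
print(consensus)
print(snps)
```

Real SNP discovery also has to separate true variants from sequencing errors, which this sketch ignores.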
Features of SNP
• Highly abundant (1 every 200 bp in barley; Rostoks et
al., 2005)
• Locus-specific
• Co-dominant and bi-allelic
• Basis for high-throughput and massively parallel
genotyping technologies
• Genic rather than anonymous marker
• Phenotype due to SNP can be mapped directly
SNPs in Allopolyploids
www.cerealsdb.uk.net
Varietal SNPs in Allopolyploids
www.cerealsdb.uk.net
SNP Detection Strategy
• Locus specific system
– Many samples with few markers
• Marker assisted selection in commercial breeding
programmes for key target characters
• Addition of characteristic major genes to e.g. mapping
populations and association panels
• KASP – buy master mix and synthesise own primers
• Genome wide system
– Fewer samples with many markers
• Germplasm characterization, academic and breeding
• Genotyping panels for GWAS
• Illumina or Affymetrix for higher density arrays, costs↓
• What about bi-parental populations??
Affymetrix Axiom Technology
• Two colour ligation based assay
• Utilises unique oligonucleotide complementary to
flanking genomic sequence
• Automated parallel processing
Wheat SNP Arrays
KASP™ Genotyping
More Information:
http://www.lgcgroup.com/services/genotyping/#.VCMgyPldWJ0
Wheat SNP Resources
www.cerealsdb.uk.net
Wheat SNP Haplotypes
www.cerealsdb.uk.net
Sequencing Approaches
• RRL – Reduced Representation Library
• RAD-Seq – Restriction Site Associated DNA Sequencing
• GBS – Genotyping by Sequencing
• See Davey et al. (2011) Nature Reviews Genetics 12: 499-510
RADseq: Restriction-site Associated DNA markers
• Uses Illumina sequencing technology
• Based on digestion with restriction enzymes. An adapter binds to the restriction
site and fragments of up to 5 kb around the target site are sequenced.
• Bioinformatics work is used to find SNPs in the amplified regions
Genotyping by Sequencing
GBS library preparation:
1. Genomic DNA digestion (PstI, MseI)
2. Ligation of barcode adaptor + common adaptor
3. Pooling and cleanup
4. PCR enrichment
5. Library size analysis
6. Illumina sequencing
[Figure: GP x Morex map. Linkage groups 1, 2 and 3 with map positions (0.0 to 140.5, 0.0 to 156.0 and 0.0 to 172.7) and marker names with MR_, BK_ and BW_ prefixes, some suffixed 1H, 2HL, 3HS or 3HL. Figure labels also include P1 and P2.]
SNPs vs GbS
• SNPs
– Minimal input, don’t even have to isolate DNA
– Rapid turnaround and data is ready to use
– Markers in known genes and generally mapped
– More useful in GWAS
• GbS
– Now quite cheap and potentially many markers
– Rapid generation of sequence output but markers are anonymous
• Find an expert bio-informatician to align your data and, if possible, align to reference sequence
– More useful in bi-parental mapping studies
Marker to Candidate Gene
[Figure: chromosome 4H map. Trait labels: Head, GrainN, Yield, Ferm_Ext, Viscosity, SNR, HWE, Glucose, GT25Sv, TGW, Tot_Sugars, Grains, Gwidth, Ferment, MillEn. Map markers: markers with an 11_ prefix, plus mlo, mlo07646, mlo04264, mlo02559 and af459084_02.]

Line/Marker   11_21508  ari-eGP  11_20392  sdw1
Derkado       a         a        a         a
B83-12/21/5   b         b        b         b
DH_001        b         b        b         a
DH_002        a         a        a         a
DH_003        b         b        b         b
DH_004        a         a        a         b
DH_005        b         b        b         a
DH_006        b         b        b         a
DH_007        a         a        a         a
DH_008        a         a        a         a
DH_009        a         a        a         a
DH_010        b         b        b         b
DH_011        a         a        a         a
DH_012        b         b        b         b
DH_013        a         a        a         b
DH_014        b         b        b         a
DH_015        b         b        b         b
DH_016        b         b        b         a
DH_018        a         a        a         a
DH_019        b         b        b         b
DH_020        b         b        b         b
DH_021        b         b        b         a
DH_022        a         a        a         a
DH_023        a         a        a         b
DH_024        b         b        b         a
DH_026        b         b        b         b
DH_027        a         a        a         a
DH_028        a         a        a         b
DH_029        a         a        a         b
DH_030        b         b        b         b
DH_031        b         b        b         b
DH_032        a         a        a         b
DH_033        a         a        a         b
DH_034        b         b        b         b
DH_035        a         a        a         b
DH_036        a         a        a         a