Transcript Slide 1
Polymorphisms
Polymorphism
A DNA polymorphism is a sequence difference
compared to a reference standard that is present in at
least 1%–2% of a population.
Polymorphisms can be single bases or thousands of
bases.
Polymorphisms may or may not have phenotypic
effects
Polymorphic DNA Sequences
Polymorphisms are found throughout the genome.
If the location of a polymorphic sequence is known, it
can serve as a landmark or marker for locating other
genes or genetic regions.
Each polymorphic marker has different versions, or
alleles.
Types of Polymorphic DNA Sequences
RFLP: restriction fragment length polymorphisms
VNTR: variable number tandem repeats (8 to >50 base
pairs)
STR: short tandem repeats (1–8 base pairs)
SNP: single-nucleotide polymorphisms
Types of Polymorphisms
Single nucleotide polymorphisms (SNPs)
A Single Nucleotide Polymorphism, or SNP
(pronounced "snip"), is a small genetic change, or
variation, that can occur within a person's DNA
sequence.
Common SNPs are defined as having a frequency >1% in at
least one population.
Rare SNPs are hard to identify and validate,
but it is estimated that each individual carries a large
number of them.
Some regions are more variable than others (e.g., HLA).
Restriction Fragment Length Polymorphisms
Restriction fragment sizes are altered by changes in or
between enzyme recognition sites.
GTCCAGTCTAGC GAATTC GTGGCAAAGGCT
CAGGTCAGATCG CTTAAG CACCGTTTCCGA
GTCCAGTCTAGC GAA ATC CG TGGC CAAGGCT
CAGGTCAGATCG CTTTAG GC ACCG GTTCCGA
Point mutations
GTCCAGTCTAGC GA AGCGA ATTC GTGGC AAAGGCT
CAGGTCAGATCG CTTCGCT TAAG CACCG TTTCCGA
Insertion
GTTCTAGC GAATTC GTGGC AAA GGCT GAATTC GTGG
TCAGATCG CTTAAG CACCG TTTCCGA CTTAAG CACC
GTTCTAGC GAATTC GTGGC AAAAAA GGCT GAATTC GTGG
TCAGATCG CTTAAG CACCG TTTTT TCCGA CTTAAG CACC
Fragment insertion (or deletion)
RFLP
RFLP (often pronounced "rif lip") is used to follow a
particular sequence of DNA as it is passed on to other
cells.
RFLPs can be used in paternity cases or criminal cases
to determine the source of a DNA sample. RFLPs can
be used to determine the disease status of an individual.
RFLPs can be used to measure recombination rates
which can lead to a genetic map with the distance
between RFLP loci measured in centiMorgans.
RFLP
An RFLP is a sequence of DNA that has a restriction
site on each end with a "target" sequence in between.
A target sequence is any segment of DNA that binds to
a probe by forming complementary base pairs.
A probe is a sequence of single-stranded DNA that has
been tagged with radioactivity or an enzyme so that
the probe can be detected. When a probe base pairs to
its target, the investigator can detect this binding and
know where the target sequence is since the probe is
detectable.
RFLP
RFLP produces a series of bands when a Southern blot
is performed with a particular combination of
restriction enzyme and probe sequence
RFLP
Recall: EcoRI binds to its recognition sequence
GAATTC and cuts the double-stranded DNA as shown.
EcoRI
5′-GAATTC-3′
3′-CTTAAG-5′
In the segment of DNA shown below, you can see the
elements of an RFLP: a target sequence flanked by a pair of
restriction sites. When this segment of DNA is cut by EcoRI,
three restriction fragments are produced, but only one
contains the target sequence, which can be bound by the
complementary probe sequence (purple).
Restriction Fragment Length Polymorphisms
RFLP genotypes are inherited.
For each locus, one allele is inherited from each parent.
[Figure: Southern blot band patterns at loci 1 and 2 for the parents (father and mother) and the child]
Let's look at two people and the segments of DNA they
carry that contain this RFLP (for clarity, we will only see
one of the two strands of DNA). Since Jack and Jill are both
diploid organisms, they have two copies of this RFLP.
When we examine one copy from Jack and one copy from
Jill, we see that they are identical:
Jack 1: -GAATTC---(8.2 kb)---GCATGCATGCATGCATGCAT---(4.2 kb)---GAATTC-
Jill 1: -GAATTC---(8.2 kb)---GCATGCATGCATGCATGCAT---(4.2 kb)---GAATTC-
When we examine their second copies of this RFLP, we
see that they are not identical. Jack 2 lacks an EcoRI
restriction site that Jill has 1.2 kb upstream of the
target sequence (Jack has CCCTTT where Jill has GAATTC).
Jack 2: -GAATTC--(1.8 kb)-CCCTTT--(1.2 kb)--GCATGCATGCATGCATGCAT--(1.3 kb)-GAATTC-
Jill 2: -GAATTC--(1.8 kb)-GAATTC--(1.2 kb)--GCATGCATGCATGCATGCAT--(1.3 kb)-GAATTC-
Therefore, when Jack and Jill have their DNA subjected
to RFLP analysis, they will have one band in common
and one band that does not match the other's in
molecular weight:
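The Jack and Jill comparison can be sketched in Python. This is a minimal illustration under stated assumptions, not a laboratory tool. The helper names `digest`, `probe_bands`, and `make_copy` are hypothetical. Each 1 kb of spacer DNA in the maps above is scaled down to 10 placeholder bases to keep the strings short. Only one strand is modeled, and EcoRI is taken to cut after the G of GAATTC.

```python
ECORI_SITE = "GAATTC"
TARGET = "GCATGCATGCATGCATGCAT"  # sequence bound by the probe

def digest(seq, site=ECORI_SITE, cut_offset=1):
    """Cut seq at every occurrence of site (EcoRI cuts after the G: G^AATTC)."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

def probe_bands(seq, target=TARGET):
    """Lengths of the fragments that contain the probe's target sequence."""
    return sorted(len(f) for f in digest(seq) if target in f)

def make_copy(*parts):
    """Join map elements; integers are spacers (assumed scale: 10 bases per kb)."""
    return "".join("T" * p if isinstance(p, int) else p for p in parts)

# Spacer sizes follow the maps above: 8.2 kb -> 82, 4.2 kb -> 42, and so on.
jack = [make_copy(ECORI_SITE, 82, TARGET, 42, ECORI_SITE),
        make_copy(ECORI_SITE, 18, "CCCTTT", 12, TARGET, 13, ECORI_SITE)]
jill = [make_copy(ECORI_SITE, 82, TARGET, 42, ECORI_SITE),
        make_copy(ECORI_SITE, 18, ECORI_SITE, 12, TARGET, 13, ECORI_SITE)]

jack_bands = {b for copy in jack for b in probe_bands(copy)}
jill_bands = {b for copy in jill for b in probe_bands(copy)}

print("shared bands:", sorted(jack_bands & jill_bands))
print("Jack only:", sorted(jack_bands - jill_bands))
print("Jill only:", sorted(jill_bands - jack_bands))
```

Fragments without the target sequence are produced by the digest but never appear as bands, because the probe cannot bind them. Jill's extra site shortens her probe-bound fragment, which is why her second band runs at a lower molecular weight than Jack's.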
Parentage Testing by RFLP
Which alleged father’s genotype has the paternal
alleles?
[Figure: Southern blot band patterns at loci 1 and 2 for the mother, the child, and alleged fathers AF 1 and AF 2]
Evidence Testing by RFLP
Which suspect—S1 or S2—was at the crime scene?
(V = victim, E = crime scene evidence, M = molecular weight
standard)
[Figure: three gels, one each for Locus 1, Locus 2, and Locus 3, with lanes M, S1, S2, V, E, M]
Short Tandem Repeat Polymorphisms (STR)
STRs are repeats of nucleotide sequences.
AAAAAA…: mononucleotide
ATATAT…: dinucleotide
TAGTAGTAG…: trinucleotide
TAGTTAGTTAGT…: tetranucleotide
TAGGCTAGGCTAGGC…: pentanucleotide
Different alleles contain different numbers of
repeats.
TTCTTCTTCTTC : four-repeat allele
TTCTTCTTCTTCTTC: five-repeat allele
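Naming an allele by its repeat number amounts to counting consecutive copies of the repeat unit. A minimal Python sketch follows. The function name `count_repeats` is hypothetical, and the approach assumes a perfect, uninterrupted repeat run.

```python
import re

def count_repeats(seq, unit):
    """Return the number of units in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)

print(count_repeats("TTCTTCTTCTTC", "TTC"))     # four-repeat allele
print(count_repeats("TTCTTCTTCTTCTTC", "TTC"))  # five-repeat allele
```

Flanking sequence does not affect the count, because only the longest run of the unit is measured.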
Short Tandem Repeat Polymorphisms
STR alleles can be analyzed by fragment size
(Southern blot).
[Figure: Southern blot with lanes M, 1, 2, M for alleles 1 and 2; the restriction sites and one repeat unit are labeled on the sequences below]
Allele 1
GTTCTAGC GGCC GTGGC AGCTAGCTAGCTAGCT GCTG GGCC GTGG
CAAGATCG CCGG CACCG TCGATCGATCGATCGA CGAC CCGG CACC
tandem repeat
Allele 2
GTTCTAGC GGCC GTGGC AGCTAGCTAGCT GCTG GGCC GTGG
CAAGATCG CCGG CACCG TCGATCGATCGA CGAC CCGG CACC
Short Tandem Repeat Polymorphisms
STR alleles can also be analyzed by amplicon size
(PCR).
Primers
Allele 1
....TCATTCATTCATTCATTCATTCATTCAT....
....AGTAAGTAAGTAAGTAAGTAAGTAAGTA....
Allele 2
....TCATTCATTCATTCATTCATTCATTCATTCAT....
....AGTAAGTAAGTAAGTAAGTAAGTAAGTAAGTA....
PCR products:
Allele 1: 187 bp (7 repeats)
Allele 2: 191 bp (8 repeats)
Genotype: 7/8
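The amplicon sizes above imply a simple relation: amplicon length = flanking sequence + (number of repeats × repeat unit length). A short Python sketch of the reverse calculation follows. The 159 bp flank is derived from the slide's own numbers (187 bp minus 7 × 4 bp). The function names are hypothetical, and sizes are assumed to be whole base pairs, whereas measured sizes would need rounding first.

```python
UNIT_BP = 4                      # TCAT repeat unit
FLANK_BP = 187 - 7 * UNIT_BP     # 159 bp of non-repeat sequence, from allele 1

def allele_from_amplicon(size_bp, flank_bp=FLANK_BP, unit_bp=UNIT_BP):
    """Convert an amplicon size to a repeat-number allele name.

    Partial repeats are written as 'full.extra' (for example 9.3),
    following the usual STR naming convention.
    """
    full, extra = divmod(size_bp - flank_bp, unit_bp)
    return str(full) if extra == 0 else f"{full}.{extra}"

def genotype(sizes_bp):
    """Name a genotype from the amplicon sizes seen in one sample."""
    return "/".join(allele_from_amplicon(s) for s in sorted(sizes_bp))

print(genotype([191, 187]))
```

A single peak yields a single allele name, which is how a homozygous genotype is usually written.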
Short Tandem Repeat Polymorphisms
Allelic ladders are standards representing all
alleles observed in a population.
[Figure: allelic ladder spanning 5 to 11 repeats, run alongside samples with genotypes 7,9 and 6,8]
Short Tandem Repeat Polymorphisms
Multiple loci are genotyped in the same reaction using
multiplex PCR.
Allelic ladders must not overlap in the same reaction.
Short Tandem Repeat Polymorphisms by Multiplex PCR
FGA
Penta E
TPOX
D18S51
D21S11
D8S1179
TH01
vWA
D3S1358
Amelogenin Locus, HUMAMEL
The amelogenin locus is not an STR.
The HUMAMEL gene codes for amelogenin-like
protein.
The gene is located at Xp22.1–22.3 and Y.
X allele = 212 bp
Y allele = 218 bp
Females (X, X): homozygous
Males (X, Y): heterozygous
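The amelogenin call can be sketched as a lookup on the two product sizes given above. This is a simplified Python illustration. The function name `amelogenin_call` and the 1 bp sizing tolerance are assumptions, not part of any kit's software.

```python
X_BP = 212  # X allele product size from the slide
Y_BP = 218  # Y allele product size from the slide

def amelogenin_call(peak_sizes_bp, tolerance_bp=1.0):
    """Call X,X or X,Y from the amelogenin peak sizes observed in a sample."""
    has_x = any(abs(p - X_BP) <= tolerance_bp for p in peak_sizes_bp)
    has_y = any(abs(p - Y_BP) <= tolerance_bp for p in peak_sizes_bp)
    if has_x and has_y:
        return "X,Y"   # two products: heterozygous pattern
    if has_x:
        return "X,X"   # one product: homozygous pattern
    return "inconclusive"

print(amelogenin_call([212.3]))
print(amelogenin_call([212.3, 217.8]))
```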
Analysis of Short Tandem Repeat Polymorphisms
by PCR
STR genotypes are analyzed using gel or capillary
gel electrophoresis.
[Figure: gel and capillary electrophoresis results, each with an allelic ladder spanning 5 to 11 repeats; sample genotype: 7,9]
STR-PCR
STR genotypes are inherited.
One allele is inherited from each parent.
[Figure: the child's alleles aligned with the mother's alleles and the father's alleles]
Parentage Testing by STR-PCR
Which alleged father’s genotype has the paternal
alleles?
Locus     Child   Mother  AF 1    AF 2
D3S1358   16/17   16      17      17
vWA       14/18   16/18   14/15   16/17
FGA       21/24   20/21   24      24
TH01      6       6/9.3   6/9     6/7
TPOX      10/11   10/11   8/11    8/9
CSF1PO    11/12   12      11      11/13
D5S818    11/13   10/11   13      9/13
D13S317   9/12    9       12/13   11/12
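One way to work through this exercise is to find, at each locus, the child allele that cannot have come from the mother, then check whether each alleged father carries it. The Python sketch below uses the genotypes from this slide. The function names are hypothetical, and the logic assumes the mother is the true mother and ignores mutation, which real casework must account for.

```python
LOCI = ["D3S1358", "vWA", "FGA", "TH01", "TPOX", "CSF1PO", "D5S818", "D13S317"]
CHILD = dict(zip(LOCI, ["16/17", "14/18", "21/24", "6", "10/11", "11/12", "11/13", "9/12"]))
MOTHER = dict(zip(LOCI, ["16", "16/18", "20/21", "6/9.3", "10/11", "12", "10/11", "9"]))
AF1 = dict(zip(LOCI, ["17", "14/15", "24", "6/9", "8/11", "11", "13", "12/13"]))
AF2 = dict(zip(LOCI, ["17", "16/17", "24", "6/7", "8/9", "11/13", "9/13", "11/12"]))

def alleles(genotype):
    """'16/17' -> {'16', '17'}; a single value such as '6' means homozygous."""
    return set(genotype.split("/"))

def possible_paternal(child, mother):
    """Child alleles that could have come from the father."""
    not_maternal = alleles(child) - alleles(mother)
    # If one allele is absent from the mother, it must be paternal;
    # otherwise either child allele could be the paternal one.
    return not_maternal or alleles(child)

def excluded_loci(child, mother, alleged_father):
    """Loci where the alleged father carries no possible paternal allele."""
    return [locus for locus in child
            if not possible_paternal(child[locus], mother[locus])
            & alleles(alleged_father[locus])]

print("AF 1 excluded at:", excluded_loci(CHILD, MOTHER, AF1))
print("AF 2 excluded at:", excluded_loci(CHILD, MOTHER, AF2))
```

An alleged father with an empty exclusion list is consistent with paternity at every tested locus. That is not proof of paternity, since it does not weigh how common the shared alleles are in the population.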
Evidence Testing by STR-PCR
Which suspect—S1 or S2—was at the crime scene?
(V = victim, E = crime scene evidence, AL = allelic
ladder)
[Figure: three gels, one each for Locus 1, Locus 2, and Locus 3, with lanes AL, S1, S2, V, E, M]
Short Tandem Repeat Polymorphisms:
Y-STR
The Y chromosome is inherited in a block without
recombination.
STRs on the Y chromosome are inherited paternally as a
haplotype.
Y haplotypes are used for exclusion and paternal
lineage analysis.
Short Tandem Repeat Polymorphisms: miniSTR
Mini-STRs are STRs on smaller amplicons.
Recommended for degraded specimens
Used to identify remains from mass graves and disaster
areas
MHC
The major histocompatibility complex (MHC) is a large gene family,
found in all vertebrates, that encodes cell surface molecules. MHC
molecules mediate interactions of leukocytes. MHC determines
compatibility of donors for organ transplant as well as one's
susceptibility to autoimmune disease via cross-reacting immunization.
In humans, MHC is also called human leukocyte antigen (HLA).
Protein molecules, either the host's own or those of other biologic
entities, are continually synthesized and degraded in a cell. Each MHC
molecule on the cell surface displays a fragment of one of these
proteins, called an epitope. The presented antigen can be either self or
nonself.
The MHC gene family is divided into three subgroups: class I, class II,
and class III.
We will discuss these later.
SNPs & Function: We know so little
Majority are “silent”: no known functional change
Alter gene expression/regulation
  Promoter/enhancer/silencer
  mRNA stability
  Small RNAs
Alter function of gene product
  Change sequence of protein
Large and Small Scale Polymorphisms
Copy Number Polymorphisms
  Regional “repeat” of sequence
  10s to 100s kb of sequence
  Estimate of >10% of human genome
  Multi-copy in many individuals
Duplicons
  90-100% similarity for >1 kb
  5-10% of genome (5% exons elsewhere)
  Multi-copy (high N) in all individuals
Copy Number Variation Across the Genome
Frequency of CNVs
Most are uncommon (<5%)
Familial vs Unrelated Studies
Associated with Disease
  OPN1LW: red/green color blindness
  CCL3L1: reduced HIV infection
  CYP2A6: altered nicotine metabolism
  VKORC1: warfarin metabolism
Engraftment Testing Using STR
Allogeneic bone marrow transplants are
monitored using STR.
Autologous transplant: Recipient receives his or her own purged cells.
Allogeneic transplant: Recipient receives donor cells.
A recipient with donor marrow is a chimera.
Engraftment Testing Using STR
There are two parts to chimerism testing: pretransplant
informative analysis and post-transplant engraftment
analysis.
[Figure: band patterns before transplant (donor, recipient) and after transplant (complete or full chimerism, mixed chimerism, graft failure)]
Engraftment Testing Using STR:
Informative Analysis
STRs are scanned to find informative loci (donor
alleles differ from recipient alleles).
Which loci are informative?
[Figure: gel with a marker lane (M) followed by donor (D) and recipient (R) lanes for loci 1 through 5]
Engraftment Testing Using STR:
Informative Analysis
There are different degrees of informativity.
With the most informative loci, recipient bands or
peaks do not overlap stutter in donor bands or peaks.
Stutter is a technical artifact of PCR in which a minor
product one repeat unit shorter (n−1 repeats) is produced.
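The stutter rule can be sketched as a set comparison on repeat numbers. The Python below is an illustration only. The function name and the three category labels are hypothetical and are not the numbered types of the Thiede classification. Alleles are assumed to be whole repeat numbers.

```python
def informativity(donor_alleles, recipient_alleles):
    """Classify a locus for detecting recipient DNA after transplant."""
    specific = set(recipient_alleles) - set(donor_alleles)
    if not specific:
        return "non-informative"  # no recipient-specific allele
    # Each donor allele of n repeats can give a minor stutter product at n-1.
    donor_stutter = {a - 1 for a in donor_alleles}
    if specific - donor_stutter:
        return "informative, clear of donor stutter"
    return "informative, overlaps donor stutter"

print(informativity(donor_alleles=(8, 10), recipient_alleles=(8, 10)))
print(informativity(donor_alleles=(8, 10), recipient_alleles=(9, 12)))
print(informativity(donor_alleles=(8, 10), recipient_alleles=(9, 10)))
```

In the last case the only recipient-specific allele (9) sits where stutter from the donor's 10 allele falls, so a small recipient peak could be confused with the artifact.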
Examples of Informative Loci
(Type 5) [Thiede et al. Leukemia 2004;18:248]
[Figure: two informative loci; recipient peaks are distinct from donor peaks and from donor stutter]
Examples of Non-informative Loci (Type 1)
[Figure: recipient and donor peaks coincide]
Engraftment Testing Using STR:
Informative Analysis
Which loci are informative?
[Figure: donor and recipient profiles at vWA, TH01, Amel, TPOX, and CSF1PO]
Engraftment Testing Using STR:
Engraftment Analysis
Using informative loci, peak areas are determined
in fluorescence units or from densitometry scans
of gel bands.
A(R) = area under recipient-specific peaks
A(D) = area under donor-specific peaks
[Figure: electropherogram with peaks labeled A(R), A(D), and A(R) + A(D)]
Engraftment Testing Using STR:
Engraftment Analysis
Formula for calculation of % recipient or % donor (no
shared alleles).
% Recipient DNA = A(R) / [A(R) + A(D)] × 100
% Donor DNA = A(D) / [A(R) + A(D)] × 100
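The percentage formula for loci with no shared alleles translates directly into code. The Python sketch below uses hypothetical function names and made-up peak areas chosen only to show the arithmetic. A(R) and A(D) are taken as the sums of the recipient-specific and donor-specific peak areas.

```python
def percent_recipient(recipient_areas, donor_areas):
    """% recipient DNA = A(R) / (A(R) + A(D)) x 100, for loci without shared alleles."""
    a_r = sum(recipient_areas)
    a_d = sum(donor_areas)
    total = a_r + a_d
    if total == 0:
        raise ValueError("no peak area to evaluate")
    return 100 * a_r / total

def percent_donor(recipient_areas, donor_areas):
    """% donor DNA is the remainder of the same total."""
    return 100 - percent_recipient(recipient_areas, donor_areas)

# Hypothetical areas: one recipient-specific peak, two donor-specific peaks.
recipient = [1200]
donor = [5400, 5400]
print(f"{percent_recipient(recipient, donor):.1f}% recipient")
print(f"{percent_donor(recipient, donor):.1f}% donor")
```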
Engraftment Testing Using STR:
Engraftment Analysis
Calculate % recipient DNA in post-1 and post-2.
Use the area under these peaks to calculate percentages.
[Figure: electropherograms for the post-1 and post-2 samples]
Engraftment Analysis of Cellular Subsets
Cell subsets (T cells, granulocytes, NK cells, etc.)
engraft with different kinetics.
Analysis of cellular subsets provides a more detailed
description of the engrafting cell population.
Analysis of cellular subsets also increases the
sensitivity of the engraftment assay.
Engraftment Analysis of Cellular Subsets
T cells (CD3), NK cells (CD56), granulocytes, myeloid
cells (CD13, CD33), myelomonocytic cells (CD14), B
cells (CD19), stem cells (CD34)
Methods
Flow cytometric sorting
Immunomagnetic cell sorting
Immunohistochemistry + XY FISH
Engraftment Analysis of Cellular Subsets
Detection of different levels of engraftment in
cellular subsets is split chimerism.
R = recipient alleles
D = donor alleles
T = T-cell subset (mostly recipient)
G = granulocyte subset (mostly donor)
[Figure: gel with lanes R, D, T, G]
Single Nucleotide Polymorphisms (SNPs)
Single-nucleotide differences between DNA
sequences.
One SNP occurs approximately every 1250 base
pairs in human DNA.
SNPs are detected by sequencing, melt curve
analysis, or other methods.
99% have no biological effect; 60,000 are within
genes.
SNP Detection by Sequencing
T/T: 5′ AGTCTG
T/A: 5′ AG(T/A)CTG
A/A: 5′ AGACTG
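The three genotypes above differ at a single position. A small Python sketch shows how that position can be located by comparing the two allele sequences and how a diploid genotype is then written. The function names are hypothetical, positions are 0-based, and the two sequences are assumed to be aligned and of equal length.

```python
def snp_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for every single-base difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def genotype_at(copy_1, copy_2, position):
    """Write the genotype at one position from a person's two copies."""
    return f"{copy_1[position]}/{copy_2[position]}"

t_allele = "AGTCTG"
a_allele = "AGACTG"
position = snp_positions(t_allele, a_allele)[0][0]

print(genotype_at(t_allele, t_allele, position))  # homozygous T
print(genotype_at(t_allele, a_allele, position))  # heterozygous
print(genotype_at(a_allele, a_allele, position))  # homozygous A
```

In a sequencing trace the heterozygous case appears as two overlapping base calls at the same position, which the slide writes as (T/A).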
SNP Haplotypes
SNPs are inherited in blocks or haplotypes.
haplotype
~10,000 bp
Applications of SNP Analysis
SNPs can be used for mapping genes, human
identification, chimerism analysis, and many other
applications.
The Human Haplotype Mapping (HapMap) Project is
aimed at identifying SNP haplotypes throughout the
human genome.
Mitochondrial DNA Polymorphisms
Sequence differences in
the hypervariable regions
(HV) of the
mitochondrial genome
[Figure: mitochondrial genome (16,600 bp) showing HV 1 (342 bp), HV 2 (268 bp), and the positions labeled PH1, PH2, and PL]
Mitochondrial DNA Polymorphisms
Mitochondria are maternally inherited.
There is an average of 8.5 base differences in the
mitochondrial HV sequences of unrelated
individuals.
All maternal relatives will have the same
mitochondrial sequences.
Mitochondrial typing can be used for legal
exclusion of individuals or confirmation of
maternal lineage.
Summary
Four types of polymorphisms are used for a
variety of purposes in the laboratory: RFLP,
VNTR, STR, and SNP.
Polymorphisms are used for human
identification and parentage testing.
Y-STR haplotypes are paternally inherited;
maternal relatives have the same mitochondrial
DNA alleles.
Polymorphisms are used to measure
engraftment after allogeneic bone marrow
transplants.
Summary
Single-nucleotide polymorphisms are detected
by sequencing, melt curve analysis, or other
methods.
SNPs can be used for the same applications as
other polymorphisms.
Mitochondrial DNA typing is performed by
sequencing the mitochondrial HV regions.
Mitochondrial types are maternally inherited.