Molecular and Immunological Methods

BLAT: Molecular and
Immunological Methods
Lyle McMillen
Contact: [email protected]
Molecular and Immunological
Methods
• Two basic approaches will be covered, both based on specific
interactions found in vivo:
  • Nucleic acid specificity (DNA and RNA binding)
  • Antibody recognition and interactions
Nucleic acid techniques
• These techniques all depend upon the specific nature
of nucleic acid interactions.
• Namely, adenine forms 2 hydrogen bonds with
thymine (DNA) or uracil (RNA). Guanine forms 3
hydrogen bonds with cytosine.
• Purines hydrogen bond with pyrimidines.
• So, A is complemented by T/U, while G is
complemented by C.
• Interactions outside of these specific pairings are not
stable.
• The specific nature of these pairings allows one strand of
DNA or RNA to specify the nucleic acid sequence of a
complementary strand.
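As a minimal sketch of how one strand specifies its complement, the pairing rules above can be written as a lookup table. The function names are illustrative only and are not part of the lecture.

```python
# Pairing rules from the slide: A pairs with T (DNA) or U (RNA), G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(seq.upper()))


def transcribe_complement(seq):
    """Return the RNA strand complementary to a DNA template, written 5'->3'."""
    return reverse_complement(seq).replace("T", "U")


print(reverse_complement("ATGGCA"))     # TGCCAT
print(transcribe_complement("ATGGCA"))  # UGCCAU
```

Taking the reverse complement twice returns the original sequence, which is the sense in which either strand carries the full information.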
Base pairing
PCR – a quick review
• In vivo, DNA is replicated by DNA polymerases and transcribed
to RNA by RNA polymerases. These enzymes covalently link
single nucleotides (the deoxynucleoside triphosphates dATP,
dCTP, dGTP and dTTP for DNA; the ribonucleoside triphosphates
ATP, CTP, GTP and UTP for RNA) into a sequence complementary
to the single stranded DNA template.
• Each strand of DNA serves as a template for the
synthesis of a second, complementary strand of DNA.
PCR – a quick review
• DNA polymerases can duplicate DNA in vitro when used
with a pair of primers complementary to the ends of the
target DNA sequence. Unfortunately, the polymerase
originally used (isolated from E. coli) was rapidly
inactivated at high temperatures, and high temperatures
were needed to denature the double stranded DNA produced
and allow further rounds of replication.
• The discovery of a thermostable DNA polymerase in
Thermus aquaticus allowed its inclusion in a series of
repeated thermal cycles, in which the DNA was
denatured to single strands, the primers annealed, and
the Taq polymerase allowed to synthesise new
complementary DNA.
• Since the discovery of Taq DNA polymerase, a number of
alternative thermostable DNA polymerases have been
discovered or engineered to provide different
characteristics and performance in PCR.
Taq polymerase
Taq polymerase activities:
• Activity optimum at 75-80 ºC
• 5’-3’ DNA polymerase (~100 bases/second)
• No 3’-5’ exonuclease activity (i.e. no proofreading, and an
error rate of 1 in 9000 bases)
• Low 5’-3’ exonuclease activity
• Adds a single non-templated dA to the 3’ end, creating a
3’-dA overhang
Taq polymerase – 94 kDa monomer
DNA sequencing – a variant PCR
• As DNA polymerases synthesise a second strand of
DNA complementary to the sequence of the template
strand, dNTPs are covalently linked to the growing
polymer in a specific order.
• A modified PCR reaction is used to determine the
order in which these nucleotides are added to the
DNA polymer. This is DNA sequencing.
• Addition of dideoxynucleotides (ddNTPs, lacking the
3’-OH required for formation of the phosphodiester
bond between 2 nucleotides) at a low concentration
terminates extension of the DNA polymer at random
points, generating a series of fragments terminated at
random points in the DNA sequence.
DNA sequencing – a variant PCR
Key concept: Reporter molecules
• DNA and RNA are hard to detect in a research
environment, particularly at low concentrations.
• A variety of reporter molecules, or labels, are used to
make DNA/RNA easier to detect.
• These fall into 2 broad categories:
  • Molecules which bind to nucleic acids and fluoresce (dyes),
  used in agarose gels and some other applications.
  Examples: ethidium bromide, GelRed, SYBR green
  • Modified nucleotides which have an integral label,
  which are incorporated into the DNA or RNA (labels).
  Examples: radioisotope (35S or 32P) labelled dNTPs,
  fluorescently tagged ddNTPs
DNA sequencing – a variant PCR




• Historically, radioactively labelled dATP was included in four
separate sequencing mixes, each containing one of the four
ddNTPs (which terminate extension when incorporated) at a low
concentration. This was the Sanger, or dideoxy chain terminator,
method, published by Frederick Sanger and colleagues in
the UK in 1977.
• Each mix generates a population of radioactively labelled DNA
fragments of varying length, all of which start with the primer
sequence.
• These mixed populations could be separated on the basis of
size (and therefore number of bases) by gel electrophoresis
on a denaturing polyacrylamide-urea gel, and the different
sized fragments visualised on an autoradiograph.
• The terminal nucleotide of each fragment was identified
by which ddNTP was included in the reaction that produced it.
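The logic of reading a sequence from four ddNTP reactions can be sketched as follows. This is an idealised illustration, assuming every possible termination product is present in each reaction and that fragment lengths are counted from the end of the primer. The function names are hypothetical.

```python
def dideoxy_lanes(new_strand):
    """For each ddNTP reaction, list the fragment lengths that end in that base."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand.upper(), start=1):
        # A fragment of this length exists only in the reaction containing
        # the ddNTP that matches the base at this position.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest fragment to the longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = dideoxy_lanes("GATTACAGC")
print(lanes["A"])       # [2, 5, 7]
print(read_gel(lanes))  # GATTACAGC
```

Each fragment length appears in exactly one lane, which is why the four datasets have to be combined before the sequence can be read.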
DNA sequencing – a variant PCR
A number of limitations arise from this technique.
• 4 datasets per DNA fragment, which need to be
integrated.
• Data collected manually.
• Short lengths of sequence data: 200-300 bases was generally
as much as could realistically be achieved, although
500-800 bases were possible.
• Radioisotopes present a hazard to researchers and
a problem for waste disposal.
Reporter molecules – Fluorophores,
fluorescent labels and dyes





• A fluorophore is the portion of a molecule which causes that molecule to be
fluorescent. It is a functional group which absorbs light at a specific
wavelength and re-emits the energy at a different, specific wavelength.
• The wavelength absorbed is the excitation wavelength, while the
wavelength emitted is the emission wavelength.
• The wavelength shift is due to a loss of energy as heat, resulting in the
emission of a lower energy, longer wavelength photon. This is the Stokes shift.
• Fluorescent labels include a fluorophore and bind specifically to a target
nucleic acid sequence.
• Fluorescent dyes bind to the target molecule type (e.g. all DNA, or all
double stranded DNA), but binding is not dependent on the target
sequence. Dyes also include a fluorophore functional group.
Reporter molecules – Fluorophores,
fluorescent labels and dyes

Examples include:





• Fluorescein and the derivative fluorescein isothiocyanate (FITC): Excitation at
494 nm, emission at 521 nm. Fluorophore used in immunohistochemistry and
Fluorescent In-Situ Hybridisation (FISH).
• Ethidium bromide (EtBr): A nucleic acid dye commonly used to stain DNA in
electrophoresis.
• SYBR green: A nucleic acid dye that fluoresces when intercalated in
double-stranded DNA. Typically excited at one of three wavelengths (290 nm,
380 nm, and 497 nm), and emits at 520 nm.
• Dichlororhodamine: A range of fluorophores with different emission spectra.
Used to label ddNTPs.
• 6-carboxyfluorescein (6-FAM): Fluorophore used to label oligonucleotides in
real time PCR.
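As a rough worked example of the Stokes shift, the fluorescein wavelengths listed above (494 nm excitation, 521 nm emission) can be converted to photon energies using E = hc/λ. The function names are illustrative; the constants are the standard SI values.

```python
PLANCK = 6.62607015e-34     # Planck constant, J s
LIGHT_SPEED = 2.99792458e8  # speed of light, m/s


def photon_energy_joules(wavelength_nm):
    """Energy of a single photon of the given wavelength (E = h*c / lambda)."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)


def stokes_shift_nm(excitation_nm, emission_nm):
    """Stokes shift expressed as a wavelength difference."""
    return emission_nm - excitation_nm


shift = stokes_shift_nm(494, 521)
# Fraction of the absorbed photon's energy that is not re-emitted as light.
fraction_lost = 1 - photon_energy_joules(521) / photon_energy_joules(494)
print(shift)                    # 27
print(round(fraction_lost, 3))  # 0.052
```

So for fluorescein, roughly 5% of the absorbed energy is lost before emission, which appears as the 27 nm shift to a longer wavelength.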
DNA sequencing – current
technologies
Dichlororhodamine dyes are
used to label ddNTPs in a
dideoxy terminator reaction.
Each ddNTP is labelled with a
particular variant dye with a
different emission wavelength
(i.e. a different colour), so
a single reaction generates
random fragments, each
labelled with a dye
that corresponds to its
terminal base.
Dichlororhodamine dye
DNA sequencing – current
technologies
• These fragments can be separated on a gel
or using capillary gel electrophoresis.
Detection is via a laser filtered to the dye
excitation wavelengths, with a
corresponding emission wavelength filter to
detect any fluorescence.
• This generates a chromatographic trace of the
four emission wavelengths (corresponding
to the four labelled ddNTPs).
DNA sequencing – current
technologies
This trace is easily interpreted, with each peak corresponding to
the terminal base on the labelled DNA fragment.
DNA sequencing – current
technologies
This technology presents a number of advantages compared to
radioisotope labelling approaches:
• Single tube reaction vs 4 reactions/sample.
• Automated data collection into a single data set vs manual data
collection, collating 4 data sets.
• Generally able to read 800-1200 bases/reaction vs
200-300 bases/reaction.
• No significant hazardous waste vs radioisotope waste.
A number of high throughput sequencing technologies are being
developed, with the goal of sequencing millions of bases very
rapidly.
PCR – end-point analysis
• Conventional PCR is typically analysed by
electrophoresis and visualisation of the
amplicon (PCR product) on an agarose gel.
Visualisation is achieved through the use of
a fluorescent dye such as ethidium
bromide.
• This occurs at the end of the PCR reaction,
so it is an end-point analysis.
PCR – end-point analysis
PCR kinetics
• There are three distinct phases during a PCR
reaction.
  • Exponential phase: exact doubling of product
  every cycle (assuming 100% efficiency). Very
  specific and precise.
  • Linear phase: highly variable. Reaction
  components are starting to be consumed, products
  are starting to degrade, and the reaction is slowing.
  The extent of slowing will vary from replicate to
  replicate.
  • Plateau/end-point: the reaction has stopped,
  and no more product is being produced.
  Product may begin to degrade. Final yield will
  vary significantly between replicates.
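The exponential phase can be sketched with the usual idealised model N = N0 × (1 + E)^n, where E is the amplification efficiency (1.0 for exact doubling) and n is the number of cycles. This sketch deliberately ignores the linear and plateau phases, and the function name is illustrative.

```python
def amplicon_copies(initial_copies, cycles, efficiency=1.0):
    """Idealised exponential-phase yield: N = N0 * (1 + E) ** n."""
    return initial_copies * (1.0 + efficiency) ** cycles


# Exact doubling (100% efficiency) from a single template over 10 cycles.
print(amplicon_copies(1, 10))  # 1024.0

# A small loss of efficiency compounds over many cycles.
perfect = amplicon_copies(10, 30)
reduced = amplicon_copies(10, 30, efficiency=0.9)
print(reduced / perfect)
```

The compounding effect of efficiency is one reason yields measured after the exponential phase are a poor guide to the starting amount of template.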
PCR kinetics
• So, conventional PCR (via end-point
analysis) is not an accurate way to
quantitate the PCR template. It is also
limited in its ability to distinguish different
yields of amplicon by staining.
• It would be preferable to measure the
accumulation of amplicon during the
exponential phase, when the rate limiting
factors are the amount of template and the
efficiency of amplification.
Real time PCR






• Also called quantitative or kinetic PCR (but not RT-PCR, which is
Reverse Transcriptase PCR).
• Adds a reporter molecule to a PCR reaction, allowing detection of the
amplicon through the course of the PCR. This is the most important
difference from conventional PCR methodologies. These reporter
molecules are attached to primers, oligonucleotide probes, or the
amplicon, conferring fluorescent potential on these molecules.
• Reporter molecules are fluorescent molecules, and are detected using
a fluorimeter built into the real time PCR platform.
• There are two broad categories of reporter molecule: they interact either
specifically (labels) or non-specifically (dyes) with the amplicon’s
nucleotide sequence.
• Quantitative analysis is based on detection of the amplicon during the
exponential phase of the PCR.
• Data is presented as the thermal cycle at which the level of
fluorescence reaches an arbitrary threshold, set within the exponential
phase of the PCR. This is referred to as the CT value.
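One simple way to extract a CT value from per-cycle fluorescence readings is sketched below. It assumes one reading per cycle and uses linear interpolation between the two cycles that bracket the threshold; real platform software may use a different method. The function name and the example readings are hypothetical.

```python
def cycle_threshold(fluorescence, threshold):
    """Return the fractional cycle at which fluorescence first crosses threshold.

    fluorescence[0] is the reading at the end of cycle 1.
    Returns None if no reading rises through the threshold.
    """
    for i in range(1, len(fluorescence)):
        previous, current = fluorescence[i - 1], fluorescence[i]
        if previous < threshold <= current:
            # previous was read at cycle i, current at cycle i + 1.
            fraction = (threshold - previous) / (current - previous)
            return i + fraction
    return None


readings = [1, 2, 4, 8, 16]          # hypothetical doubling signal
print(cycle_threshold(readings, 6))  # 3.5
```

A sample with more starting template crosses the same threshold earlier, giving a lower CT.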
Real time PCR



So, how does it work?
Two commonly used approaches.
Double stranded DNA detection





• This approach utilises a fluorescent dye which binds
specifically to double stranded DNA (an intercalating agent):
SYBR green, and later derivatives such as SYBR greener, LC
green 1, SYTO 9 and EVA Green.
• The PCR proceeds as normal, and the dye intercalates into
the double stranded amplicon.
• The more amplicon is produced, the more dye is
intercalated.
• As these dyes intercalate, their emission intensity increases
(over 100-fold for SYBR green), due to conformational
changes on binding.
• It is worth noting that SYBR green inhibits PCR, and is therefore
used at extremely low concentrations. There are saturation dyes
available that do not inhibit PCR, and can be used at higher
concentrations, giving stronger fluorescence.
Real time PCR





• The second major approach utilises hydrolysis of a specific
oligonucleotide probe containing a fluorescent label. It is often called the
TaqMan method, but is also known as the 5’ nuclease, Taq nuclease or
dual-labelled probe method.
• Taq polymerase has a 5’-3’ exonuclease activity, which hydrolyses any DNA
annealed to the template ahead of the strand being synthesised.
• The oligonucleotide probe contains 2 functional groups: a 5’
fluorophore, and a 3’ quencher, which is either a second fluorophore
(e.g. TAMRA) or a non-fluorescent quencher (NFQ). Energy generated by the
excitation of the 5’ fluorophore is captured by the 3’ quencher, and
emitted as fluorescence (fluorophore quencher) or heat (NFQ). If a second
fluorophore is the quencher, its emission wavelength is different to that
of the 5’ fluorophore. This process is called Fluorescence Resonance
Energy Transfer, or FRET.
• The probe anneals to the target region specifically. As the Taq
polymerase synthesises DNA, it hydrolyses the probe. Cleavage of the
5’ fluorophore from the rest of the probe separates it from the quencher,
enabling it to emit fluorescence, which can be detected.
• The level of fluorescence detected is proportional to the amount of probe
hydrolysis, and therefore the amount of amplicon synthesised.
Real time PCR
Real time PCR – alternative probe
strategies

There are a number of other probe strategies available,
many of which are patented.

Hybridisation probing entails using two probes, each labelled
with a different fluorophore (typically 6-FAM and a red
fluorophore).

• These probes hybridise within 1-5 bases of each other on the
amplicon.
• Excitation of the first fluorophore allows the excitation of the
second fluorophore via FRET.
• This leads to fluorescence at the second fluorophore’s emission
wavelength (610, 640, 670 or 705 nm, depending on the
fluorophore), while exciting at the wavelength of the first
(470 nm).
• Detection occurs at the end of the annealing step.
• Once detection is complete, an increase in temperature
triggers DNA polymerase activity, displacing the probes and
amplifying the target region.
• Note that when not hybridised, the first fluorophore will emit
fluorescence at its own emission wavelength (530 nm for 6-FAM).
[Figure: Hybridisation probes. Panels: Annealing, Denaturation, Extension, Completion]
Real time PCR platforms
• The real time PCR platform consists of a few basic
elements:
  • Thermal cycler: basically a PCR machine, usually
  capable of rapid and precise variations in
  temperature (usually between 15 and 99 ºC).
  • Excitation wavelength emitter, capable of transmitting
  the excitation wavelength of the fluorescent reporter
  to each sample.
  • Emission detector, capable of precise quantitation of
  the amount of fluorescence being emitted by the
  sample at the fluorophore’s emission wavelength.
  • Data recorder, recording the fluorescence from each
  sample at the end point of each thermal cycle (end of
  the extension step).
Real time PCR platforms
• There are a few major types of real time PCR platform
from a range of suppliers, but they all perform the
same function.
• Most are capable of managing multiple fluorophores
simultaneously, allowing multiple amplicons to be
probed in a multiplex assay (we’ll discuss this in more
detail later).
• All are also associated with sophisticated data
management and analysis software, which makes
data analysis easy, reliable and reproducible.
• Raw data integrity is always protected, which is important for
clinical and diagnostic applications.
Data output
Real time PCR applications
• The most obvious application of real
time PCR is for detection and
quantitation of a specific DNA
sequence.
• It may also be used for monitoring
changes in gene expression,
genotyping, or detection of genetic
variations such as single nucleotide
polymorphisms (SNPs).
Quantitative real time PCR
• Real time PCR data is presented as CT (cycle
threshold) values, defined as the thermal cycle at
which the fluorescence reaches an arbitrary threshold.
• If a series of samples with known concentrations of
initial template DNA is included in the assay, a linear
plot of CT vs log [initial template] may be generated.
• These standards can be a known number of cells, a
defined number of copies of a plasmid, or any other
defined, quantifiable and reproducible number of
target templates.
• This plot permits linear regression analysis, allowing
the calculation of the copy number of any unknown
target relative to the standards.
• The plot also indicates amplification efficiency (from the slope)
and gives some indication of sensitivity (from the y-intercept).
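The standard curve calculation can be sketched as an ordinary least squares fit of CT against log10 of the starting copy number. The efficiency relation E = 10^(-1/slope) - 1 is the commonly used one and is not stated in the lecture; the standards and CT values below are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least squares fit of CT = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency_from_slope(slope):
    """Amplification efficiency; 1.0 means exact doubling each cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Estimate starting copies of an unknown from its CT value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of a plasmid standard.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cts = [31.4, 28.0, 24.7, 21.4, 18.1]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
print(f"slope={slope:.2f}, efficiency={efficiency_from_slope(slope):.2f}")
print(f"unknown with CT 26.0: {copies_from_ct(26.0, slope, intercept):.0f} copies")
```

A slope close to -3.32 corresponds to perfect doubling, because one ten-fold dilution then costs log2(10), or about 3.32, extra cycles.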
Quantitative real time PCR

There are 4 basic assumptions underlying quantitation by real time
PCR:

• The initial template is double stranded.
  • When analysing RNA, reverse transcriptase produces single stranded
  cDNA, which is made double stranded in the first amplification cycle.
  True amplification begins in cycle 2.
• PCR efficiency is 100%, and both strands of all templates are copied
into full length copies each cycle.
  • This never happens, due to inefficient primer hybridisation, template
  folding and probe and dye interference.
• PCR efficiency is constant throughout the amplification process.
  • Secondary structures may inhibit amplification from long templates such
  as genomic DNA, or from supercoiled plasmids and mitochondrial and
  bacterial genomes.
  • Compare with standards based on the same starting material.
• Fluorescence is proportional to the amount of template.
  • This depends on the dye used, the sequence amplified, the length of the
  amplicon, the optical properties of the platform, data acquisition and
  instrument settings.
Gene expression analysis


• One application of quantitative real time PCR is analysis of gene expression
in different tissues or under different treatment regimes.
• mRNA expression from the gene of interest is quantitated from each sample
using a real time RT-PCR.
  • These are real time PCRs performed on cDNA, generated from RNA (extracted
  from the target tissue) by reverse transcriptase. The reverse transcriptase
  primer can be the same as one of the real time PCR primers, or lie just
  outside the real time PCR amplicon. Only one primer is needed for the
  reverse transcription step.
  • Reverse transcriptase efficiency is a significant contributor to the
  variability observed in real time RT-PCR, and needs to be taken into account
  when developing any real time RT-PCR method.
• Samples are normalised to the level of expression of a house-keeping gene.
These are genes that are always expressed at constant levels in each cell,
thought to be involved in routine cellular metabolism, e.g.
glyceraldehyde-3-phosphate dehydrogenase (G3PDH or GAPDH), beta actin, some
ribosomal proteins.
• This allows proportional comparison of target mRNA levels between
samples.
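One widely used way to carry out this normalisation, which the lecture does not name, is the 2^-ΔΔCT calculation. The sketch below assumes both the target and the house-keeping amplicon amplify at close to 100% efficiency; the function name and CT values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Fold change of the target gene in a sample relative to a control.

    Each target CT is first normalised to the house-keeping (reference)
    gene measured in the same sample, then the two samples are compared.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Treated tissue: target CT 22, house-keeping CT 18.
# Control tissue: target CT 25, house-keeping CT 18.
print(relative_expression(22, 18, 25, 18))  # 8.0
```

Here the target crosses the threshold 3 cycles earlier in the treated tissue once the house-keeping gene is accounted for, which corresponds to 2^3 = 8 times more target mRNA.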
Genotyping







• A range of real time PCR methods can be used to determine the genotype of a
target amplicon.
• The simplest approach relies upon determining the melting temperature of
the amplicon using a melting curve.
• The real time PCR is performed as normal, incorporating a non-hydrolysed
probe or dye. It is typically performed with SYBR Green or a saturation dye
such as SYTO 9 or LC Green 1.
• Once the amplification program is complete (and quantitation data
collected), the samples are heated through a gradient, with fluorescence data
gathered at set temperature intervals (typically every 1 ºC, but can be as
often as every 0.2 ºC in high resolution equipment). The gradient is
typically from 50 ºC to 95 ºC, but can be refined to a narrower range.
• As the temperature increases, the amplicon will denature, “unzipping” from
double stranded to single stranded. The fluorescence of the dyes will
decrease as more of the amplicon denatures.
• The temperature at which this decrease in fluorescence is fastest is
called the melting temperature (TM), and varies with the G+C% and
sequence of the amplicon. It is determined by plotting the rate of decrease
in fluorescence with temperature (-dF/dT) against temperature.
• Different genotypes of the amplicon will have different TMs. TM analysis is
also used to ensure that the desired target has been amplified.
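A minimal sketch of finding TM from melting curve data is shown below. It takes the midpoint of the temperature interval with the steepest fall in fluorescence, which is a crude finite-difference version of the -dF/dT peak; instrument software typically smooths the data first. The function name and readings are hypothetical.

```python
def melting_temperature(temperatures, fluorescence):
    """Return the midpoint of the interval where fluorescence falls fastest."""
    best_rate = None
    best_tm = None
    for i in range(1, len(temperatures)):
        d_temp = temperatures[i] - temperatures[i - 1]
        # Negative derivative, so a fall in fluorescence gives a positive rate.
        rate = -(fluorescence[i] - fluorescence[i - 1]) / d_temp
        if best_rate is None or rate > best_rate:
            best_rate = rate
            best_tm = (temperatures[i] + temperatures[i - 1]) / 2
    return best_tm


temps = [80, 81, 82, 83, 84, 85]   # ºC, 1 ºC intervals
signal = [100, 98, 90, 50, 12, 10]  # hypothetical dye fluorescence
print(melting_temperature(temps, signal))  # 82.5
```

Smaller temperature intervals give a finer TM estimate, which is why high resolution equipment is preferred when genotypes differ only slightly.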
Genotyping
SNP analysis – a specific form of
genotyping





• Single nucleotide polymorphisms (SNPs) are the most common
form of genetic variation.
• SNPs are a single base variation at a specific locus,
usually consisting of two alleles. The rare allele is generally
present in >1% of the population.
• The SNP may be in a coding sequence, a non-coding region, or an
intergenic region, and can have impacts on polypeptide sequence,
gene splicing, transcription factor binding or non-coding RNA
sequence.
• They can be detected through melting curve analysis, with specific
patterns being generated for each homozygote (both diploid alleles
containing the wild type or mutant) and for a heterozygote (one
wild type and one mutant allele).
• Heterozygote amplifications include both alleles in the one
reaction. Strands from the two alleles will anneal to each other, but
do not match perfectly (a heteroduplex). As a consequence, the TM of
the heteroduplex will be lower than that of the amplicons from the
homozygotes.
SNP analysis
NB: This plot is
normalised to the
mutant allele melting
curve.
SNP analysis




• A second option involves the inclusion of hydrolysis probes
for each allele in the one reaction. Each hydrolysis probe
contains a different fluorophore with distinct excitation and
emission wavelengths. This is an example of a multiplex
assay.
• If probe design is right and reaction conditions are stringent
enough, a single base variation from the probe target will
be sufficient to prevent annealing of the probe to the
variant sequence.
• Amplification of each allele will be reported by the probe
specific for that allele.
• Multiple SNPs can be interrogated simultaneously, although
this is limited by the number of distinct probes available,
the optimisation of reaction conditions, and the capabilities
of the real time PCR platform.
Real time PCR design
considerations

In order to have the most efficient amplification and detection possible, a
number of guidelines have been developed for optimum real time PCR assay
design.

• The length and structure of the amplicon are fundamental to good real time PCR
design. In general, real time PCR amplicons are very short (70-300 bp) compared
to conventional PCR amplicons.
• G+C% of the amplicon should be between 30% and 80%, and runs of identical
nucleotides, particularly 4 or more Gs, should be avoided.
• Primers should not be complementary to themselves or each other, to avoid
primer-dimer formation.
• Primer TMs should be within 2 ºC of each other (58-60 ºC is optimal), and
primers should be 18-22 nucleotides long.
• The number of Gs and Cs in the last 5 bases of the 3’ end of the primer should
not exceed 2.
• Hydrolysis probes should have a TM 10 ºC higher than the primers (but not above
75 ºC), and should anneal close to the primer on the same strand.
• Hydrolysis probes should be 20-30 bases long, unless stabilised with a minor
groove binder moiety (a tricyclic functional group that folds back into the minor
groove of the probe-target duplex and stabilises the interaction). MGB probes
may be only 13-18 bases long, and still achieve the desired TM.
• The probe should not have a G at the 5’ end, to avoid unpaired Gs quenching
fluorescence, and there should be more Cs than Gs (this increases the change in
fluorescence when the probe is hydrolysed).
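Some of the sequence-based guidelines above can be checked automatically. The sketch below covers only the length, G+C% and G-run rules; it does not estimate TM or test for primer-dimers, and the function names and example sequences are hypothetical. Primers are assumed to be written 5’ to 3’.

```python
import re


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def check_primer(primer):
    """Return a list of guideline violations (empty if none were found)."""
    primer = primer.upper()
    problems = []
    if not 18 <= len(primer) <= 22:
        problems.append("length outside 18-22 nucleotides")
    if sum(base in "GC" for base in primer[-5:]) > 2:
        problems.append("more than 2 G/C in the last 5 bases of the 3' end")
    return problems


def check_amplicon(amplicon):
    """Return a list of guideline violations (empty if none were found)."""
    amplicon = amplicon.upper()
    problems = []
    if not 70 <= len(amplicon) <= 300:
        problems.append("length outside 70-300 bp")
    if not 30 <= gc_percent(amplicon) <= 80:
        problems.append("G+C% outside 30-80%")
    if re.search(r"G{4,}", amplicon):
        problems.append("run of 4 or more Gs")
    return problems


print(check_primer("AGCTGATCGATCGTTAGCAT"))  # []
print(check_primer("ATCGGGGC"))              # two violations
print(check_amplicon("ACGT" * 25))           # []
```

Passing these checks only means a candidate is worth examining further; it still needs to be reviewed against the remaining guidelines and the specific application.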
Real time PCR design
considerations
• Fortunately, there are a number of
software tools available (e.g.
PrimerExpress, Oligo 6.0, Vector NTI)
which can generate a number of
design options from a DNA sequence.
These designs will meet the
basic rules for assay design, but will
still need to be checked manually by the
researcher to ensure they are suitable
for the specific application.
Another use for fluorescent probes
- microarrays
• A DNA microarray is a series of microscopic
spots of specific oligonucleotides (typically
stretches of a gene) covalently bound to a
matrix (i.e. a slide or chip).
• Under high stringency conditions, only a
complementary sequence will bind to these
probes.
• If the sample to be probed is fluorescently
labelled, array sites containing probes that
bind the sample will fluoresce.
Another use for fluorescent probes
- microarrays






• Typical uses include gene expression profiling, comparing genome content,
and SNP detection.
• In gene expression profiling, mRNA is isolated from two samples, and cDNA
is prepared by reverse transcriptase.
• During cDNA synthesis, fluorescently labelled nucleotides are incorporated,
with a different fluorophore used for each sample. Cy3 (emission at 570 nm,
or green) and Cy5 (emission at 670 nm, or red) are commonly used.
• The cDNA samples are then hybridised to the microarray, which contains
thousands of oligos specific to individual genes in known locations on the
array.
• Fluorescence is measured at each location on the array, and variations in
expressed genes are identified.
• Note that if both Cy3 and Cy5-labelled cDNA bind to an array location, the
spot appears yellow. This gives four possible outcomes:
  • Black: no expression of that gene in either sample.
  • Red or green: expression of that gene in only one of the samples.
  • Yellow: expression of the gene in both samples.
• Housekeeping genes are always included as a reference.
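The four outcomes can be expressed as a simple classification of the two channel intensities at each spot. This is only an illustration: the background cut-off is a hypothetical value, and real analysis software works with background-corrected intensity ratios rather than a yes/no call.

```python
def classify_spot(cy3_intensity, cy5_intensity, background=100.0):
    """Classify a two-colour microarray spot by which channels are above background."""
    green = cy3_intensity > background  # Cy3-labelled cDNA bound
    red = cy5_intensity > background    # Cy5-labelled cDNA bound
    if green and red:
        return "yellow"  # expressed in both samples
    if green:
        return "green"   # expressed only in the Cy3-labelled sample
    if red:
        return "red"     # expressed only in the Cy5-labelled sample
    return "black"       # not expressed in either sample


print(classify_spot(2500, 40))    # green
print(classify_spot(1800, 2100))  # yellow
print(classify_spot(20, 35))      # black
```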
Microarrays
A 40000 spot two-colour oligo microarray.
