PowerPoint - Parabiosys
Plotting the path from RNA to
microarray: the importance of
experimental planning and
methods
Glenn Short
Microarray Core Facility/Lipid Metabolism Unit
Massachusetts General Hospital
Talk Outline
Why perform a microarray experiment?
Choosing a microarray platform
Sources of variability that lead to
experimental considerations
Overcoming experimental variability
Why perform a microarray
experiment?
Genomic vantage point
– Detect gene expression
– Compare gene expression levels
• Over time
• Over treatment course
– Map genes to phenotypes
– Map deleted or duplicated regions
– Identify genes that modulate other
genes
Binary decision-making
When not to perform a
Microarray Experiment
Interested in a small number of specific
genes
QRT-PCR, Northern blots
Desire quantitative results
Low tolerance of variability
Cannot afford to perform experiment with
adequate replication
Asking a Specific Question
The most fundamental; the MOST
IMPORTANT
Simplifies experimental design
Empowers interpretation of data
Simplicity, simplicity, simplicity! I say, let your affairs
be as two or three, and not a hundred or a
thousand… We are happy in proportion to the things
we can do without. --Henry David Thoreau
Considerations of Microarray
Experimental Design
Which microarray platform will be used?
What is the end goal of the experiment?
What is the specific question being asked?
What are the most pertinent comparisons?
What controls will be applied to the experiments?
Which statistical methods will be used during data
analysis?
What methods will be used to verify results from the
microarrays?
Choosing a Microarray Platform
Are genes of interest included on the
array?
Are genes replicated?
Tiling of genes that undergo splicing
Controls on array
Quantity of RNA needed for testing
Are the arrays adequately QC’d?
Cost
Affymetrix Platform
Pros
– standardized production
– gene replication
– probe tiling across gene
– Reproducible
– Affymetrix custom
database user-friendly
Cons
– Expensive
– Annotation differences
– single sample per chip
cDNA Platform
Pros
– Genome sequence
independent
– High stringency
hybridization
– Little need for signal
amplification
cDNA clones
(probes)
Cons
– Clone handling
– Clone authentication
– cDNA resources difficult to
access and often cross-contaminated
1. PCR product
amplification
2. Purification
3. Printing
PCR products
used as probes
Spotted oligonucleotide Platform
Synthesized oligonucleotides in 384-well plates
Pros
– Complete control over oligo
sequences
– Absence of contamination
– Additional probes may be
added when needed
– Flexibility of design, probe
replication, and tiling
– Inexpensive, enabling
experimental replication
Cons
– Sequence data required for
probe design
– No consensus set of probe
design algorithms
– Must have arraying
instrumentation
1. Purification
2. QC
3. Printing
Oligonucleotides used as probes
Spotted Oligonucleotide vs Affymetrix
Arrays
Oligonucleotide
Probe design and synthesis
probe set
Affymetrix
ParaBioSys Platform
Long Oligonucleotides, 70mer
Designed and synthesized in-house
5’-amine modified
Extensively QC’d
Probes designed to the 5’-orf
Set is updated as known orf list grows
– Currently 20,000 probes
ParaBioSys probe design and synthesis
Probe design using OligoPicker
– based on the GenPept database
– Tm’s of selected oligos approx. the
same
– improved specificity
Oligonucleotide Quality Control
Capillary Electrophoresis
– Identifies relative abundance of full-length product
Use of mass spectral analysis
– Identifies relative abundance
– Ensures probe is of the expected mass based upon sequence
[Figure: example results labeled pass and fail]
Array Quality Control
Spotted probes are 3’-labeled with dCTP-Cy3 using terminal deoxynucleotidyl transferase
First and last array of the print-run are QC’d
Understanding sources of variability
in microarray experiments
Sources of Variation
Differences in identical treatments
Intrinsic biological variation
Technical variation in extraction and labeling
of RNA samples
Technical variation in hybridization
Spot size variation
Measurement error in scanning
When graphing expression data, use log ratios
[Figure: histograms of the raw ratio (T/C) and of the log2 ratio (T/C)]
Plotting expression data
log2 T vs log2 C
M = log2(T/C) vs A = 1/2 log2(T·C)
M = log ratio vs A = log geometric mean
Expression data-cont
[Figure: M-A plot; the x-axis is A = 1/2 log2(Ti·Ci), running from low expressed to highly expressed genes]
Genes expressed up relative to reference by a factor of 32.
Genes expressed down relative to reference by a factor of 1/32.
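The M and A quantities defined on these slides can be computed directly from a pair of spot intensities. The following is a minimal sketch, not part of the original talk; the function name `ma_values` and the example intensities are hypothetical, and background-corrected, strictly positive intensities are assumed.

```python
import math

def ma_values(t, c):
    """Return (M, A) for a treatment intensity t and a control intensity c.

    M = log2(T/C) is the log ratio; A = 1/2 * log2(T*C) is the log of the
    geometric mean. Both inputs are assumed to be positive,
    background-corrected intensities (an assumption of this sketch).
    """
    if t <= 0 or c <= 0:
        raise ValueError("intensities must be positive")
    m = math.log2(t / c)
    a = 0.5 * math.log2(t * c)
    return m, a

# Hypothetical intensities: a 32-fold increase and a 32-fold decrease.
up = ma_values(3200.0, 100.0)
down = ma_values(100.0, 3200.0)
print(up, down)
```

On the log scale the 32-fold increase and the 32-fold decrease sit symmetrically around zero, which is the reason for plotting log ratios rather than raw ratios.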
Differences Due to Treatment
RNA isolation protocol differences
Cell-culture media changes
Expression differences over time
– Cell cycle genes (synchronization)
Variables need to be minimized!
Biological Variability
Self-self hybridizations of four independent biological replicates
Biological variability of inhibitory PAS domain protein
Technical Variability
[Figure: plot panels labeled Sample 1, Sample 2, and Sample 3]
Self-self hybridization (Cerebellar vs cerebellar)
– Sample 1 and 2 labeled together and hybridized on separate
slides
– Sample 3 labeled separately
Arises from differences in labeling, efficiency in RT,
hybridization, arrays, etc.
Dye Effects
Environmental Health Perspectives • VOLUME 112 | NUMBER 4 | March 2004
Variation in quantum yield of fluorophores
Variation in the incorporation efficiency
Differential dye effects on hybridization
Hybridization Variability
Printing Variability
Differences in Probe Performance
[Figure: comparison of probe sets labeled Academic_1, Academic_2, ParaBioSys, and Vendor]
Probe design algorithms will cause changes in the
expression pattern
Once a platform is chosen all future comparisons should
be performed on the same platform
Cross-platform comparisons as a means of validation
Differences Across Commercial
Platforms
P<0.001
Nucleic Acids Research, 2003, Vol. 31, No. 19, 5676-5684
Controlling Variability
Experimental Plan
Increased Quality Control
Probe QC
Array QC
Total RNA QC
– denaturing agarose gel
– Agilent Bioanalyzer
[Figure: Agilent Bioanalyzer traces of total RNA, fluorescence vs time (seconds), with the 18S and 28S peaks labeled]
Labeling QC
Controlling biological and technical variability with replication
[Figure: M-A plots of M = log2(T/C) against A = log10(sqrt(T*C)) for a single array and for the average across 2, 5, and 10 replicates; labeled genes: Integrin alpha 2b and Pro-platelet basic protein]
Essential to the estimation of variance
Critical for valid statistical analysis
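To illustrate why replication is essential to estimating variance, the sketch below averages one gene's M values across replicate arrays and reports the spread. It is an illustrative example rather than the analysis used in the talk; `summarize_replicates` and the input values are hypothetical.

```python
import math
import statistics

def summarize_replicates(m_values):
    """Return (mean, sample SD, standard error) of replicate M values.

    At least two replicates are needed: a single array gives a ratio
    but no estimate of its variability.
    """
    n = len(m_values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate variance")
    mean = statistics.mean(m_values)
    sd = statistics.stdev(m_values)   # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)           # shrinks as replicates are added
    return mean, sd, sem

# Hypothetical log2(T/C) values for one gene on three replicate arrays.
mean_m, sd_m, sem_m = summarize_replicates([1.0, 2.0, 3.0])
print(mean_m, sd_m, sem_m)
```

The standard error falls with the square root of the number of replicates, which is consistent with the averaged plots tightening as replicates increase.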
Controlling Dye Effects
Dye-Swap
[Figure: dye-swap diagram pairing samples T and C with the dye assignments exchanged between arrays]
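One common way to use a dye-swap pair is to combine the two log ratios so that a dye bias cancels. The sketch below assumes the bias is additive on the log2 scale and identical on both arrays, and it assumes a Cy5/Cy3 dye pair; none of these details are stated on the slide, and the function name `dye_swap_estimate` is hypothetical.

```python
def dye_swap_estimate(m_forward, m_swapped):
    """Combine log2(Cy5/Cy3) ratios from a dye-swap pair for one gene.

    m_forward: array with T labeled Cy5 and C labeled Cy3 (assumed layout)
    m_swapped: array with C labeled Cy5 and T labeled Cy3 (assumed layout)

    Under the additive-bias assumption:
        m_forward =  log2(T/C) + bias
        m_swapped = -log2(T/C) + bias
    so half the difference recovers log2(T/C) and half the sum gives the bias.
    """
    log_ratio = (m_forward - m_swapped) / 2.0
    dye_bias = (m_forward + m_swapped) / 2.0
    return log_ratio, dye_bias

# Hypothetical gene: true log2(T/C) = 2.0 with a dye bias of 0.5.
log_ratio, dye_bias = dye_swap_estimate(2.5, -1.5)
print(log_ratio, dye_bias)
```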
Controlling Variability through
Experimental Design
Replication
– Spot
– Multiple arrays per sample comparison (technical)
• Dye swap
– Multiple samples per treatment group (biological)
Increased precision and quality control
Estimate measurement error
Estimate biological variation
Pooling
– Reduce biological variation
Controlling Variability through
Experimental Design –cont.
Normalize data to correct for systematic
differences (spot intensity, location on array,
hybridization, dye, scanner, scanner parameters…)
on the same slide or between slides that are not a
result of biological variation between mRNA
samples
Minimize printing differences by using a contiguous
series of slides from the same print run
For historical comparisons, use the
same platform
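The slides do not name a normalization method. As one simple illustration, the sketch below applies global median centering to the log ratios of a single array, which assumes that most genes are not differentially expressed. The function name `median_center` and the input values are hypothetical; intensity-dependent or spatial effects would need a more elaborate method.

```python
import statistics

def median_center(m_values):
    """Shift an array's log2(T/C) values so that their median is zero.

    This removes a constant offset between the two channels of one array
    (for example, from unequal labeling or scanner settings). It does not
    correct intensity- or location-dependent effects.
    """
    shift = statistics.median(m_values)
    return [m - shift for m in m_values]

# Hypothetical array whose log ratios are all offset by about +1.5.
centered = median_center([0.5, 1.5, 2.5])
print(centered)
```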
Planning your experiment
Experimental Aim
– Specific questions and priorities among them
– How will the experiments answer the questions posed?
Experimental logistics
– Types of total RNA samples
• Reference, control, cell line, tissue sample, treatment A….
• How will the samples be compared?
• Number of arrays needed
Other Considerations
– Plan of experimental process prior to hybridization:
• Sample isolation, RNA extraction, amplification, pooling,
labeling
– Limitations: number of arrays, amount of material
– Extensibility (linking)
Planning your Experiment- cont
Other Considerations-cont
– Controls: positive, negative, in-spike controls
– Methods of verification:
• QRT-PCR, Northern, in situ hybridization,…
Performing the experiment
– Reagents (arrays-from same print run), equipment
(scanners), order of hybridizations
Controls
Positive Controls
– used to ensure that target DNAs are labeled to an acceptable
specific activity
– single pool of all probe elements on array
Negative Controls
– used to assess the degree of non-specific cross- hybridization
– probes derived from organisms with no known homologs/paralogs
to the organism of study
– derived in silico (alien sequences)
In-spike controls
– Known amounts of polyadenylated mRNAs added to each labeling
reaction
– Should not cross-hybridize with any probe sequences
• Alien sequences
• Spot-report (Stratagene)
• Lucidea ScoreCard (Amersham Biosciences)
– Can be used to assess dynamic range of the system
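One way spike-in data might be used to look at dynamic range is to check how linearly signal tracks the known input amount on a log-log scale. The sketch below is illustrative only: the function name `log_log_slope`, the amounts, and the signals are hypothetical, and a slope near 1 over the tested range is taken as the sign of a proportional response.

```python
import math

def log_log_slope(amounts, signals):
    """Least-squares slope of log10(signal) against log10(spiked amount).

    A slope close to 1 indicates that signal is proportional to input
    across the tested range; flattening at either end suggests the edge
    of the usable dynamic range.
    """
    xs = [math.log10(a) for a in amounts]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical spike-in series spanning three orders of magnitude.
slope = log_log_slope([1, 10, 100, 1000], [50, 500, 5000, 50000])
print(slope)
```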
Validation
If you have failed to validate your array data, you have NOT completed your analysis
ParaBioSys has developed Primer Bank for QRT-PCR primer sequences
http://pga.mgh.harvard.edu/primerbank/
Many thanks for your attention
https://dnacore.mgh.harvard.edu
http://pga.mgh.harvard.edu
Glenn Short
Microarray Core
Massachusetts General Hospital