PowerPoint - Parabiosys


Plotting the path from RNA to
microarray: the importance of
experimental planning and
methods
Glenn Short
Microarray Core Facility/Lipid Metabolism Unit
Massachusetts General Hospital
Talk Outline

Why perform a microarray experiment?
 Choosing a microarray platform
 Sources of variability that lend to
experimental considerations
 Overcoming experimental variability
Why perform a microarray
experiment?

Genomic vantage point
– Detect gene expression
– Compare gene expression levels
• Over time
• Over treatment course
– Map genes to phenotypes
– Map deleted or duplicated regions
– Identify genes that modulate other
genes

Binary decision-making
When not to perform a
Microarray Experiment

Interested in a small number of specific
genes
QRT-PCR, Northern blots
 Desire quantitative results
 Low tolerance of variability
 Cannot afford to perform experiment with
adequate replication
Asking a Specific Question

The most fundamental; the MOST
IMPORTANT
 Simplifies experimental design
 Empowers interpretation of data
Simplicity, simplicity, simplicity! I say let your affairs
be as one, two, three and to a hundred or a
thousand… We are happy in proportion to the things
we can do without.--Henry David Thoreau
Considerations of Microarray
Experimental Design

Which microarray platform will be used?
 What is the end goal of the experiment?
 What is the specific question being asked?
 What are the most pertinent comparisons?
 What controls will be applied to the experiments?
 Which statistical methods will be used during data
analysis?
 What methods will be used to verify results from the
microarrays?
Choosing a Microarray Platform

Are genes of interest included on the array?
Are genes replicated?
Tiling of genes that undergo splicing
Controls on array
Quantity of RNA needed for testing
Are the arrays adequately QC’d?
Cost
Affymetrix Platform

Pro’s
– standardized production
– gene replication
– probe tiling across gene
– Reproducible
– Affymetrix custom
database user-friendly

Con’s
– Expensive
– Annotation differences
– single sample per chip
cDNA Platform

Pro’s
– Genome sequence independent
– High stringency hybridization
– Little need for signal amplification

Con’s
– Clone handling
– Clone authentication
– cDNA resources difficult to access and often cross-contaminated

Workflow: cDNA clones (probes)
1. PCR product amplification
2. Purification
3. Printing
PCR products used as probes
Spotted Oligonucleotide Platform

Pro’s
– Complete control over oligo sequences
– Absence of contamination
– Additional probes may be added when needed
– Flexibility of design, probe replication, and tiling
– Inexpensive, enabling experimental replication

Con’s
– Sequence data required for probe design
– No consensus set of probe design algorithms
– Must have arraying instrumentation

Workflow: synthesized oligonucleotides in 384-well plates
1. Purification
2. QC
3. Printing
Oligonucleotides used as probes
Spotted Oligonucleotide vs Affymetrix Arrays
[Figure: probe design and synthesis, comparing a spotted oligonucleotide probe with an Affymetrix probe set]
ParaBioSys Platform

Long Oligonucleotides, 70mer
Designed and synthesized in-house
5’-amine modified
Extensively QC’d
Probes designed to the 5’-ORF
Set is updated as the list of known ORFs grows
– Currently 20,000 probes
ParaBioSys probe design and synthesis

Probe design using OligoPicker
– based on the GenPept database
– Tm’s of selected oligos are approximately the same
– improved specificity
Oligonucleotide Quality Control

Capillary Electrophoresis
– Identifies relative abundance of full-length product

Use of mass spectral analysis
– Identifies relative abundance
– Ensures probe is of the expected mass based upon sequence

[Figure: example traces labeled pass and fail]
Array Quality Control

Spotted probes are 3’-labeled with dCTP-Cy3 using terminal deoxynucleotidyl transferase

First and last array of
the print-run are QC’d
Understanding sources of variability
in microarray experiments
Sources of Variation

Differences in identical treatments
Intrinsic biological variation
Technical variation in extraction and labeling of RNA samples
Technical variation in hybridization
Spot size variation
Measurement error in scanning
When graphing expression data, use log ratios
[Figure: two histograms of the same data, one plotted against ratio (T/C) and one against log2 ratio (T/C)]
Plotting expression data
[Figure: two plots of the same data, log2 T vs log2 C, and M vs A]
$M = \log_2(T/C)$ vs $A = \tfrac{1}{2}\log_2(TC)$
M = log ratio vs A = log geometric mean
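The two quantities plotted on this slide, the log ratio M = log2(T/C) and the log geometric mean A = (1/2) log2(TC), can be computed per spot from the treatment (T) and control (C) intensities. The following is a minimal Python sketch; the function name `ma_values` and the intensity values are illustrative assumptions, not part of the talk.

```python
import math

def ma_values(t, c):
    """Return (M, A) for one spot from treatment (T) and control (C) intensities."""
    if t <= 0 or c <= 0:
        raise ValueError("intensities must be positive to take logarithms")
    m = math.log2(t / c)        # M: log2 ratio of treatment to control
    a = 0.5 * math.log2(t * c)  # A: log2 of the geometric mean sqrt(T*C)
    return m, a

# Hypothetical background-corrected intensities, for illustration only.
spots = {
    "gene_up": (3200.0, 100.0),
    "gene_down": (100.0, 3200.0),
    "gene_same": (800.0, 800.0),
}

for name, (t, c) in spots.items():
    m, a = ma_values(t, c)
    print(f"{name}: M = {m:+.2f}, A = {a:.2f}")
```

On the log scale, a 32-fold increase and a 32-fold decrease sit symmetrically at M = +5 and M = -5, which is the practical reason for plotting log ratios rather than raw ratios.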
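Expression data-cont
[Figure: M vs A plot, with $A = \tfrac{1}{2}\log_2(T_i C_i)$ running from low expressed to highly expressed genes]
– Upper boundary: genes expressed up relative to reference by a factor of 32
– Lower boundary: genes expressed down relative to reference by a factor of 1/32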
Differences Due to Treatment

RNA isolation protocol differences
 Cell-culture media changes
 Expression differences over time
– Cell cycle genes (synchronization)

Variables need to be minimized!
Biological Variability

Self-self hybridizations of four independent biological replicates
Biological variability of inhibitory PAS domain protein

Technical Variability
[Figure: scatter plots of Sample 1 vs Sample 1, Sample 2, and Sample 3]

Self-self hybridization (cerebellar vs cerebellar)
– Samples 1 and 2 labeled together and hybridized on separate slides
– Sample 3 labeled separately

Arises from differences in labeling, efficiency in RT, hybridization, arrays, etc.
Dye Effects
Environmental Health Perspectives, Volume 112, Number 4, March 2004

Variation in quantum yield of fluorophores
Variation in the incorporation efficiency
Differential dye effects on hybridization
Hybridization Variability
Printing Variability
Differences in Probe Performance
[Figure: comparison of probe sets labeled Academic_1, Academic_2, ParaBioSys, and Vendor]

Differences in probe design algorithms change the observed expression pattern
Once a platform is chosen, all future comparisons should be performed on the same platform
Cross-platform comparisons can serve as a means of validation
Differences Across Commercial
Platforms
P<0.001
Nucleic Acids Research, 2003, Vol. 31, No. 19, 5676-5684
Controlling Variability
Experimental Plan
Increased Quality Control
Probe QC
Array QC
Total RNA QC
– denaturing agarose gel
– Agilent Bioanalyzer
[Figure: two electropherograms of total RNA, fluorescence vs time (seconds), with the 18S and 28S peaks labeled]
Labeling QC
Controlling biological and technical variability with replication
[Figure: four plots of M log2(T/C) vs A log10(sqrt(T*C)): single array, replicates = 2, replicates = 5, and replicates = 10, each showing the average across replicates; Integrin alpha 2b and Pro-platelet basic protein are highlighted]
Essential to the estimation of variance
Critical for valid statistical analysis
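Averaging the log ratio of a gene across replicate arrays gives a variance estimate as well as a steadier mean, which a single array cannot provide. The sketch below shows that calculation for one gene; the function name `summarize_replicates` and the replicate values are hypothetical, and the talk does not specify which statistics were used.

```python
import statistics

def summarize_replicates(m_values):
    """Return mean log ratio, sample SD, and standard error across replicate arrays."""
    n = len(m_values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate variance")
    mean_m = statistics.mean(m_values)
    sd = statistics.stdev(m_values)  # sample standard deviation (n - 1 denominator)
    sem = sd / n ** 0.5              # standard error of the mean shrinks with more replicates
    return mean_m, sd, sem

# Hypothetical log2(T/C) values for one gene on five replicate arrays.
replicate_m = [1.8, 2.3, 2.0, 1.6, 2.3]
mean_m, sd, sem = summarize_replicates(replicate_m)
print(f"mean M = {mean_m:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

With a single array the function refuses to return a result, mirroring the point that variance cannot be estimated without replication.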
Controlling Dye Effects
Dye-Swap
[Figure: dye-swap diagram in which samples T and C are each labeled in both dye orientations]
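In a dye-swap pair, the same T and C samples are hybridized twice with the dye assignments reversed. If the dye bias is assumed to add the same offset to the log ratio on both arrays, half the difference of the two log ratios estimates the treatment effect and half their sum estimates the bias. This is a minimal sketch under that assumption; the function name and the numbers are illustrative only.

```python
def dye_swap_estimate(m_forward, m_swapped):
    """Combine log2(channel 1 / channel 2) from a dye-swap pair for one gene.

    m_forward: array where T is in channel 1 and C is in channel 2.
    m_swapped: array where C is in channel 1 and T is in channel 2.
    Assumes the dye bias is the same additive log-scale offset on both arrays.
    """
    effect = (m_forward - m_swapped) / 2.0    # estimate of log2(T/C), bias cancels
    dye_bias = (m_forward + m_swapped) / 2.0  # estimate of the dye offset, effect cancels
    return effect, dye_bias

# Hypothetical values: a true log2(T/C) of 2.0 with a dye offset of 0.5.
effect, dye_bias = dye_swap_estimate(m_forward=2.5, m_swapped=-1.5)
print(f"estimated log2(T/C) = {effect:.2f}, estimated dye bias = {dye_bias:.2f}")
```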
Controlling Variability through
Experimental Design

Replication
– Spot
– Multiple arrays per sample comparison (technical)
• Dye swap
– Multiple samples per treatment group (biological)

Increased precision and quality control
 Estimate measurement error
 Estimate biological variation
 Pooling
– Reduce biological variation
Controlling Variability through
Experimental Design –cont.



Normalize data to correct for systematic differences (spot intensity, location on array, hybridization, dye, scanner, scanner parameters…) on the same slide or between slides that are not a result of biological variation between mRNA samples
Minimize printing differences by using a contiguous series of slides from the same print run
For historical comparisons, use the same platform
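The talk does not name a normalization method. As an illustration only, one of the simplest approaches is global median centering: subtract the array-wide median log ratio so that a typical spot has M = 0. This sketch assumes most genes on the array are not differentially expressed; the function name and values are hypothetical, and intensity-dependent or location-dependent methods would be needed for the other systematic effects listed above.

```python
import statistics

def median_center(m_values):
    """Subtract the array-wide median log ratio from every spot on one array."""
    shift = statistics.median(m_values)  # systematic offset shared by the whole array
    return [m - shift for m in m_values]

# Hypothetical log2(T/C) values from one array with an overall offset of about 0.5.
raw_m = [0.5, 0.7, 0.3, 2.5, 0.5]
normalized = median_center(raw_m)
print([round(m, 2) for m in normalized])
```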
Planning your experiment

Experimental Aim
– Specific questions and priorities among them
– How will the experiments answer the questions posed?

Experimental logistics
– Types of total RNA samples
• Reference, control, cell line, tissue sample, treatment A….
• How will the samples be compared?
• Number of arrays needed

Other Considerations
– Plan of experimental process prior to hybridization:
• Sample isolation, RNA extraction, amplification, pooling,
labeling
– Limitations: number of arrays, amount of material
– Extensibility (linking)
Planning your Experiment- cont

Other Considerations-cont
– Controls: positive, negative, in-spike controls
– Methods of verification:
• QRT-PCR, Northern, in situ hybridization,…

Performing the experiment
– Reagents (arrays-from same print run), equipment
(scanners), order of hybridizations
Controls

Positive Controls
– used to ensure that target DNAs are labeled to an acceptable
specific activity
– single pool of all probe elements on array
 Negative Controls
– used to assess the degree of non-specific cross-hybridization
– probes derived from organisms with no known homologs/paralogs
to the organism of study
– derived in silico (alien sequences)
 In-spike controls
– Known amounts of polyadenylated mRNAs added to each labeling
reaction
– Should not cross-hybridize with any probe sequences
• Alien sequences
• Spot-report (Stratagene)
• Lucidea ScoreCard (Amersham Biosciences)
– Can be used to assess dynamic range of the system
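Because the spiked mRNAs are added in known amounts, their signals show how well measured intensity tracks input. One simple check, sketched here as an assumption rather than the speaker's procedure, is to fit a straight line to log2 signal against log2 spiked amount: a slope near 1 suggests a proportional response over the tested range, and a lower slope suggests compression or saturation. The function name, units, and data are hypothetical.

```python
import math

def loglog_slope(amounts, signals):
    """Least-squares slope and intercept of log2(signal) against log2(spiked amount)."""
    if len(amounts) != len(signals) or len(amounts) < 2:
        raise ValueError("need at least two paired amount/signal values")
    xs = [math.log2(a) for a in amounts]
    ys = [math.log2(s) for s in signals]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical spike-in series (arbitrary units); the highest amount saturates.
amounts = [1.0, 10.0, 100.0, 1000.0]
signals = [50.0, 500.0, 5000.0, 20000.0]
slope, intercept = loglog_slope(amounts, signals)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Fitting the series with and without the highest amount shows where the response stops being proportional, which gives a rough picture of the usable dynamic range.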
Validation
If you have failed to validate your array data, you have NOT completed your analysis
ParaBioSys has developed Primer Bank for QRT-PCR primer sequences
http://pga.mgh.harvard.edu/primerbank/
Many thanks for your attention
https://dnacore.mgh.harvard.edu
http://pga.mgh.harvard.edu
Glenn Short
Microarray Core
Massachusetts General Hospital