Transcript PPS - VCU

Introduction to DNA Microarrays
Michael F. Miles, M.D., Ph.D.
Depts. of Pharmacology/Toxicology and
Neurology and the Center for Study of
Biological Complexity
225-4054
Biological Regulation:
“You are what you express”
• Levels of regulation
• Methods of measurement
• Concept of genomics
Regulation of Gene Expression
• Transcriptional
– Altered DNA binding protein complex abundance or function
• Post-transcriptional
– mRNA stability
– mRNA processing (alternative splicing)
• Translational
– RNA trafficking
– RNA binding proteins
• Post-translational
– Many forms!
Regulation of Gene Expression
• Genes are expressed when they are transcribed into RNA
• Amount of mRNA indicates gene activity
• Some genes expressed in all tissues -- but are still regulated!
• Some genes expressed selectively depending on tissue, disease, environment
• Dynamic regulation of gene expression allows long term
responses to environment
[Figure labels: Mesolimbic dopamine; ? Other; Acute Drug Use; Reinforcement; Intoxication; Altered Signaling; Gene Expression; Tolerance; Dependence; ? Synaptic Remodeling; Sensitization; Chronic Drug Use; ? Synaptic Remodeling; Persistent Gene Exp.; Compulsive Drug Use; “Addiction”]
Progress in Studies on Gene Regulation
1960: mRNA, tRNA discovered
1970: Nucleic acid hybridization, protein/RNA electrophoresis
1980: Molecular cloning; Southern, Northern & Western blots; 2-D gels
1990: Subtractive hybridization, PCR, differential display, MALDI/TOF MS
2000: Genome sequencing, DNA/protein microarrays
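Nucleic Acid Hybridization: How It Works
Primer on Nucleic Acid Hybridization
• Hybridization rate depends on time, the concentration of nucleic acids, and the reassociation rate constant for the nucleic acid:
C/C_0 = 1/(1 + k C_0 t)
where C is the concentration of nucleic acid still single-stranded at time t, C_0 is the initial concentration, and k is the reassociation rate constant.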
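A minimal sketch of the reassociation equation, included only to make the dependence on time and concentration concrete. It assumes ideal second-order kinetics; the function names and the numeric values for k and the starting concentration are illustrative and are not taken from the lecture.

```python
def fraction_single_stranded(c0, k, t):
    """Fraction of nucleic acid still single-stranded: C/C0 = 1 / (1 + k*C0*t)."""
    if c0 < 0 or k < 0 or t < 0:
        raise ValueError("c0, k and t must be non-negative")
    return 1.0 / (1.0 + k * c0 * t)


def cot_half(k):
    """Value of C0*t at which half of the strands have reassociated (C/C0 = 0.5)."""
    return 1.0 / k


# Made-up example values: k in L/(mol*s), c0 in mol/L, t in seconds.
k = 5.0
c0 = 0.01
for t in (0, 20, 200):
    print(t, round(fraction_single_stranded(c0, k, t), 3))
```

With these assumed values, half of the strands have reassociated when the product of starting concentration and time reaches 1/k = 0.2, that is, at t = 20 s. A tenfold more abundant sequence would reach the same point ten times sooner.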
High Density DNA Microarrays
A Bit of History
~1992-1996: Oligo arrays developed by Fodor, Stryer,
Lockhart, others at Stanford/Affymetrix and Southern in
Great Britain
~1994-1995: cDNA arrays usually attributed to Pat Brown
and Dari Shalon at Stanford who first used a robot to print
the arrays. In 1994, Shalon started Synteni which was
bought by Incyte in 1998.
However, in 1982 Augenlicht and Kobrin proposed a
DNA array (Cancer Research) and in 1984 they made a
4000 element array to interrogate human cancer cells.
(Rejected by Science, Nature and the NIH)
Biological Networks
Types of Biological Networks
Gene Regulation Network
Examining Biological Networks:
Experimental Design
Examining Biological Networks
Use of S-score in Hierarchical Clustering of Brain Regional Expression Patterns
[Figure panels: AvgDiff; S-score. Color scale: -2, 0, +2 (relative change)]
Expression Profiling: A Non-biased, Genomic Approach to Resolving the Mechanisms of Addiction
[Figure labels: Candidate Gene Studies; Cycles of Expression Profiling; Merge with Biological Databases]
Utility of Expression Profiling
• Non-biased, genome-wide
• Hypothesis generating
• Gene hunting
• Pattern identification:
– Insight into gene function
– Molecular classification
– Phenotypic mechanisms
[Figure: analysis workflow. Box labels: Comparisons (S-score, dChip); De-noise; GE Database (SQL Server); Statistical Filtering (e.g. SAM); Hybridization and Scanning; Clustering Techniques; Experimental Design; Behavioral Validation; Provisional Gene “Patterns”; Molecular Validation (RT-PCR, in situ, Western); Candidate Genes; Filtered Gene Lists; Overlay Biological Databases (PubGen, GenMAPP, QTL, etc.)]
Experimental Design with DNA Microarrays
High Density DNA Microarrays
Synthesis and Analysis of 2-color Spotted cDNA Arrays: “Brown Chips”
Comparative Hybridization with Spotted cDNA Microarrays
Synthesis of High Density Oligonucleotide Arrays by Photolithography/Photochemistry
GeneChip Features
• Parallel analysis of >30K human, rat or mouse genes/EST clusters with 15-20 oligos (25-mer) per gene/EST
• Entire genome analysis (human, yeast, mouse)
• 3-4 orders of magnitude dynamic range (1-10,000 copies/cell)
• Quantitative for changes >25% ??
• SNP analysis
Oligonucleotide Array Analysis
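[Figure: target preparation and detection. Total RNA (5’ ... AAAA) is primed with Oligo(dT)-T7 and converted by RTase/Pol II to dsDNA (AAAA-T7 / TTTT-T7); T7 pol with CTP-biotin produces Biotin-cRNA (TTTT-5’); then Hybridization to PM and MM probes, Streptavidin-phycoerythrin staining, and Scanning]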
Stepwise Analysis of
Microarray Data
• Low-level analysis -- image analysis,
expression quantitation
• Primary analysis -- is there a change in
expression?
• Secondary analysis -- what genes show
correlated patterns of expression?
(supervised vs. unsupervised)
• Tertiary analysis -- is there a phenotypic
“trace” for a given expression pattern?
Affymetrix Arrays: Image Analysis
“.DAT” file
“.CEL” file
Affymetrix Arrays: PM-MM Difference Calculation
Probe pairs control for non-specific hybridization of oligonucleotides
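The PM-MM idea can be illustrated with a short sketch: for each probe pair, the mismatch (MM) intensity is subtracted from the perfect-match (PM) intensity, and the differences are averaged over the probe set. This is a simplified, assumed version of an average-difference calculation; the function name and the intensity values are hypothetical, and production software applies additional outlier handling that is not shown here.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) over all probe pairs of one probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean([p - m for p, m in zip(pm, mm)])


# Made-up intensities for four probe pairs of a single gene.
pm = [1200, 950, 1430, 800]
mm = [300, 280, 410, 760]
print(average_difference(pm, mm))
```

In this made-up example the last probe pair shows nearly equal PM and MM signal, which suggests mostly non-specific hybridization; it contributes little to the average, which is the point of including the MM control.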
Variability and Error in DNA Microarray Hybridizations
[Figure (a): Variability in Ln(FC); axes: Ln(FC1), Ln(FC2)]
• Position Dependent Nearest Neighbor (PDNN) - 2003
Zhang, Miles and Aldape (2003). A model of molecular interactions on short oligonucleotide microarrays: implications for probe design and data analysis. Nature Biotech. In Press.
Chip Normalization Procedures
• Whole chip intensity
– Assumes relatively few changes, uniform
error/noise across chip and abundance classes
• Spiked standards
– Requires exquisite technical control, assumes
uniform behavior
• Internal Standards
– Assumes no significant regulation
• “Piece-wise” linear normalization
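As a rough illustration of the first option, whole-chip intensity normalization can be sketched as scaling every value on a chip by one factor so that the chip's trimmed mean matches a common target. The target of 1000, the 2% trim, and the function name are assumptions made for this example, not values from the lecture.

```python
from statistics import mean


def scale_to_target(intensities, target=1000.0, trim=0.02):
    """Scale one chip so that its trimmed mean intensity equals `target`.

    `trim` is the fraction of values dropped from each end before averaging.
    Returns the scaled values and the scaling (normalization) factor.
    """
    if not intensities or not 0.0 <= trim < 0.5:
        raise ValueError("need data and 0 <= trim < 0.5")
    ordered = sorted(intensities)
    cut = int(len(ordered) * trim)
    core = ordered[cut:len(ordered) - cut] if cut else ordered
    factor = target / mean(core)
    return [x * factor for x in intensities], factor


# Made-up chip with four probe-set intensities, no trimming.
scaled, factor = scale_to_target([100, 200, 300, 400], target=1000.0, trim=0.0)
print(factor, scaled)
```

A single factor per chip only makes sense under the assumptions listed above: relatively few genes change, and noise behaves uniformly across the chip and across abundance classes. The normalization factor itself is also worth recording as a quality measure.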
S-score
Normalization Confounds: Non-uniform Chip Behavior
[Figure; axis label: Gene]
Normalization Confounds: Non-linearity
Slide Normalization: Pieces and Pins
“Lowess” normalization,
Pin-specific Profiles
After Print-tip Normalization
http://www.ipam.ucla.edu/publications/fg2000/fgt_tspeed9.pdf
See also: Schuchhardt, J. et al., NAR 28: e47 (2000)
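The idea of pin-specific correction can be sketched in a much simplified form. The example below is not a lowess fit: it swaps in a constant per-pin shift, subtracting each print tip's median log ratio from the spots that tip printed. The function names, the pin identifiers and the data are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def log_ratios(red, green):
    """M = log2(R/G) for each spot."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def print_tip_median_normalize(m_values, pin_ids):
    """Center log ratios separately for each print tip (pin).

    Simplification: a constant shift per pin instead of an
    intensity-dependent lowess curve per pin.
    """
    by_pin = defaultdict(list)
    for m, pin in zip(m_values, pin_ids):
        by_pin[pin].append(m)
    pin_median = {pin: median(vals) for pin, vals in by_pin.items()}
    return [m - pin_median[pin] for m, pin in zip(m_values, pin_ids)]


# Made-up log ratios: pin 1 reads systematically high, pin 2 does not.
m = [1.0, 1.5, 2.0, -0.5, 0.0, 0.5]
pins = [1, 1, 1, 2, 2, 2]
print(print_tip_median_normalize(m, pins))
```

After the shift, both pins are centered on zero, so a spot's log ratio no longer depends on which tip printed it. A lowess-based version would additionally remove intensity-dependent curvature within each pin group.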
Quality Assessment
• Gene specific: R/G correlation, %BG,
%spot
• Array specific: normalization factor, %
genes present, linearity, control/spike
performance (e.g. 5’/3’ ratio, intensity)
• Across arrays: linearity, correlation,
background, normalization factors, noise
Statistical Analysis of Microarrays:
“Not Your Father’s Oldsmobile”
Normal vs. Normal
Normal vs. Tumor
Sources of Variability
• Target Preparation
– Group target preps
• Chip Run
– Minor, BUT…
– Be aware of processing order
• Chip Lot
– Stagger lots across experiment if necessary
• Chip Scanning Order
– Cross and block chip scanning order
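One way to act on these points is to make sure that neither processing order nor chip lot lines up with treatment group. The sketch below is an assumed, deterministic illustration (round-robin interleaving and cycling of lots within each group); the function names, sample labels and lot names are hypothetical, and a real design would usually add randomization within blocks.

```python
from itertools import cycle, zip_longest


def interleave_groups(groups):
    """Round-robin processing order so no group is run as one contiguous batch."""
    order = []
    for batch in zip_longest(*groups.values()):
        order.extend(sample for sample in batch if sample is not None)
    return order


def stagger_lots(groups, lots):
    """Cycle through chip lots within each group so every lot sees every group."""
    assignment = {}
    for samples in groups.values():
        for sample, lot in zip(samples, cycle(lots)):
            assignment[sample] = lot
    return assignment


groups = {"control": ["C1", "C2", "C3"], "treated": ["T1", "T2", "T3"]}
print(interleave_groups(groups))
print(stagger_lots(groups, ["lotA", "lotB"]))
```

The same interleaved order can be reused for scanning, so that any drift over the course of a run is spread across both groups instead of being mistaken for a treatment effect.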
Secondary Analysis: Expression
Patterns
• Supervised multivariate analyses
– Support vector machines
• Non-supervised clustering methods
– Hierarchical
– K-means
– SOM
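To make the unsupervised case concrete, here is a minimal sketch of hierarchical (agglomerative) clustering of expression profiles. The choice of average linkage and of 1 minus the Pearson correlation as the distance are assumptions made for this example, as are the gene names and values; the other listed methods (support vector machines, K-means, SOM) are not shown.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def hierarchical_clusters(profiles, n_clusters):
    """Average-linkage agglomerative clustering with distance 1 - r.

    profiles: dict mapping gene name -> list of expression values.
    Merges the two closest clusters until n_clusters remain.
    """
    clusters = [[name] for name in profiles]

    def dist(a, b):
        return mean([1 - pearson(profiles[i], profiles[j]) for i in a for j in b])

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


# Made-up profiles across four conditions: two rising genes, two falling genes.
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [8, 6, 4, 3],
}
print(hierarchical_clusters(profiles, 2))
```

Because the distance is correlation-based, geneA and geneB group together despite a twofold difference in magnitude: the clustering responds to the shape of the expression pattern, not its absolute level.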
Use of S-score in Hierarchical Clustering of Brain Regional Expression Patterns
[Figure panels: AvgDiff; S-score. Color scale: -2, 0, +2 (relative change)]
Expression Profiling
[Figure labels: Prot-Prot Interactions; BioMed Lit Relations; Expression Networks; HomoloGene; Ontology; Pharmacology; Genetics; Behavior]
Array Analysis: Conclusions
• Be careful! Assess quality control
parameters rigorously
• Single arrays or experiments are of limited
value
• Normalization and weighting for noise are
critical procedures
• Across investigator/platform/species
comparisons will most easily be done with
relative data
Comparison of Primary Analysis Algorithms II
Spotted cDNA Microarrays