Transcript Spotted
EXPRESSION MICROARRAYS
STUDY OF THE TRANSCRIPTION FACTOR ASH2
M. Corominas: [email protected]
Spotted array experiment
1. Prepare sample (test and reference).
2. Label with fluorescent dyes.
3. Combine cDNAs.
4. Print microarray.
5. Hybridize to microarray.
6. Scan.
Spotted microarrays rely on delivery technologies to
place biologic material (purified cDNA, oligonucleotides)
onto allocated locations of the chip.
(competitive hybridization: Cy3 vs Cy5)
Drosophila melanogaster
Wolpert (2001)
ash2
- member of the trithorax group
Belongs to multiprotein chromatin remodeling complexes
-Polycomb (PcG) : transcriptional repression
-trithorax (trxG) : transcriptional activation
Transcriptional Regulator
TYPES OF MICROARRAYS
1) From full length cDNA
Plates from the Berkeley Drosophila Gene Collection
with 384 wells (clones) each: DGC1.0 and 2.0
Approx. 12000 genes in total
Direct PCR from bacterial growth using vector-specific primers
Analysis of PCR results by electrophoresis
Spotting on slide
2) From 400 bp amplicons
a) correspond to approximately 75% of genes
predicted in release 3.1 (gene specific primers
kindly donated by Incyte Genomics and
Brian Oliver, NIH).
b) based on a novel annotation of the fly genome.
It contains 21376 gene-specific probes.
Performed and available from Eurogentec.
Carried out in collaboration (ZMBH, Univ. of Heidelberg; DKFZ, MPI Molecular Genetics Computational Molecular Biology, Germany)
3) From oligonucleotides
a) INDAC project: International Drosophila Array Consortium
www.indac.net
70-mer oligonucleotides designed towards the 3’ end
of the genes (based on the 3.1 release) with specific
algorithms and synthesized by Illumina.
b) Qiagen/Operon oligo set
70-mer oligonucleotides representing 13,664
genes designed from release 3.1,
already available at the Plataforma de Transcriptòmica,
Serveis Científico-Tècnics, UB-PCB
- RNA samples:
- total RNA
- polyA+ RNA
- T7 polymerase amplified RNA
- labeling method (competitive hybridization):
- direct
- indirect
- positive and negative controls
MIAME describes the Minimum Information About a
Microarray Experiment that is needed to enable the
interpretation of the results of the experiment
unambiguously and potentially to reproduce the
experiment.
http://www.mged.org/Workgroups/MIAME/miame.html
Production of cDNA chips
17 plates from the Berkeley Drosophila Gene Collection
with 384 wells (clones) each.
Approx. 5000 genes in total
Direct PCR from bacterial growth
Analysis of PCR results by electrophoresis
Spotting on slide
Hybridization of Chips
mutant flies (ash2) / wild-type flies
Trizol RNA Extraction & Poly A+ Purification
mRNA from each
Two-Step Fluorescent Labelling
Cy5 test sample / Cy3 control sample
Hybridize Slide
Scanning of Chips
Scan Slide (GenePix) at 532 nm and 635 nm
Fluorescent intensities for each cDNA, spot or gene at each wavelength
- Integrate Data
- Filter Data
- Adjust dye bias
- Calculate Ratios
- Adjust Data
- Set Thresholds
“Bad” Spots Filtering
- Is the process in which spots that don’t look right are
discarded according to different criteria
GenePix discards data according to internal filters like:
x % pixels > Median Background intensity
Convert Data 3.33 to further filter data.
Spots were flagged as OK if:
medianFx > mBx +/- XSD
- Spots must pass filtering for both channels
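The flagging rule above can be sketched in a few lines. This is only an illustration: it assumes the threshold is the background median plus X background standard deviations, and the `Spot` fields, the value `x=2.0` and the example intensities are hypothetical, not taken from GenePix or from the experiment.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    """Per-spot summary values (hypothetical field names)."""
    name: str
    f532: float   # median foreground, 532 nm channel
    b532: float   # median background, 532 nm channel
    sd532: float  # background standard deviation, 532 nm channel
    f635: float   # median foreground, 635 nm channel
    b635: float   # median background, 635 nm channel
    sd635: float  # background standard deviation, 635 nm channel


def channel_ok(f_median, b_median, b_sd, x):
    """Assumed rule: foreground must exceed background by x SDs."""
    return f_median > b_median + x * b_sd


def spot_ok(spot, x=2.0):
    """A spot is kept only if it passes the filter in both channels."""
    return (channel_ok(spot.f532, spot.b532, spot.sd532, x)
            and channel_ok(spot.f635, spot.b635, spot.sd635, x))


# Made-up example spots: the second one is too faint at 635 nm.
spots = [
    Spot("strong", 1200, 100, 20, 900, 120, 25),
    Spot("weak_635", 1200, 100, 20, 150, 120, 25),
]
good = [s.name for s in spots if spot_ok(s)]
```

Requiring both channels to pass means a ratio is never computed from a channel that is indistinguishable from background.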
[Figure: Distribution for Good spots at both wavelengths. Histogram of log(F532Median-B532) and log(F635Median-B635); x-axis: log (F Median - B); y-axis: Number in each class.]
Adjusting Ratios
- A Ratio measures how much sample cDNA over control
cDNA we have of a given gene. This is:
Ratio = Intensity sample / Intensity control
- Different measures for the ratios:
- Ratio of Medians
- Ratio of Means
- Regression Ratio
- Take the log (base 2) of the ratios:
•Makes variation of intensities and ratios of
intensities more independent of absolute
magnitude.
•Gives a more realistic sense of variation.
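A small worked example may help show why the log2 transform is used. The sketch below assumes a background-subtracted ratio of medians; the function name and the intensity values are made up for illustration.

```python
import math


def log2_ratio(f_sample, b_sample, f_control, b_control):
    """log2 of the background-subtracted sample/control ratio.

    Returns None when either corrected intensity is not positive,
    because the ratio (and its log) is then undefined.
    """
    sample = f_sample - b_sample
    control = f_control - b_control
    if sample <= 0 or control <= 0:
        return None
    return math.log2(sample / control)


# Hypothetical intensities: a two-fold increase and a two-fold decrease.
two_fold_up = log2_ratio(900, 100, 500, 100)    # ratio 800/400 = 2
two_fold_down = log2_ratio(500, 100, 900, 100)  # ratio 400/800 = 0.5
```

On the raw scale the two changes are 2 and 0.5, which look unequal; on the log2 scale they become +1 and -1, symmetric around 0.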
Multiple Experiment
Comparison
Modify data the same way in all
experiments:
- bad spots filtering methods
- ratio (eg. Ratio of Medians)
- adjust ratios:
- mean centering
- Normalization
- main class centering
- We expect:
- few genes upregulated
- few genes downregulated
- most genes unchanged (log2 Ratio = 0)
-Therefore:
- a Normal distribution
- with mean (all log2 Ratio ) = 0
-Draw distribution of Ratios and check mean:
- if really not N: filter bad spots better
try to Normalize (mean = 0; SD = 1)
discard experiment
- if close to N: adjust mean (product or sum)
Normalize (0; 1)
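The two adjustments named above (mean centering, and normalization to mean 0 and SD 1) can be sketched as follows. The input list is made-up data, and the use of the sample standard deviation is an assumption.

```python
import statistics


def mean_center(log_ratios):
    """Shift the log2 ratios so that their mean becomes 0."""
    m = statistics.mean(log_ratios)
    return [v - m for v in log_ratios]


def standardize(log_ratios):
    """Normalize to mean 0 and SD 1 (sample SD assumed)."""
    m = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)
    return [(v - m) / sd for v in log_ratios]


# Hypothetical log2 ratios from one experiment with a dye bias of about +0.6.
experiment = [0.5, 0.7, 0.4, 2.6, -1.2, 0.6]
centered = mean_center(experiment)
normalized = standardize(experiment)
```

Mean centering only removes the global shift; standardizing also rescales the spread, which makes distributions from different experiments directly comparable.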
Multiple Experiment Comparison
[Figure: Norm log Ratio of Medians for Experiment 1 to Experiment 4; x-axis: log Ratio of Medians Class; y-axis: % Genes in Class.]
Set method to select up- or downregulated genes
- highly subjective method like fold-change (e.g. two, three)
- semi-statistical method like Mean ± xSD
- statistical method like SAM:
- missing values imputed using K-nearest neighbors
- computes a statistic
- set threshold for statistic (to call significant genes)
- will give you an FDR
- set fold-change threshold
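The first two selection methods are simple enough to sketch directly (SAM is not reproduced here). Gene names, values and the cutoffs `fold=2.0` and `x=1.0` are hypothetical choices for illustration.

```python
import math
import statistics


def select_by_fold_change(log2_ratios, fold=2.0):
    """Genes whose ratio is at least `fold` up or down."""
    cutoff = math.log2(fold)
    up = [g for g, v in log2_ratios.items() if v >= cutoff]
    down = [g for g, v in log2_ratios.items() if v <= -cutoff]
    return up, down


def select_by_mean_sd(log2_ratios, x=2.0):
    """Genes lying outside mean +/- x*SD of all log2 ratios."""
    values = list(log2_ratios.values())
    m = statistics.mean(values)
    sd = statistics.stdev(values)
    up = [g for g, v in log2_ratios.items() if v > m + x * sd]
    down = [g for g, v in log2_ratios.items() if v < m - x * sd]
    return up, down


# Made-up log2 ratios for six hypothetical genes.
ratios = {"g1": 0.1, "g2": -0.2, "g3": 1.3, "g4": -1.1, "g5": 0.0, "g6": 0.05}
fc_up, fc_down = select_by_fold_change(ratios, fold=2.0)
sd_up, sd_down = select_by_mean_sd(ratios, x=1.0)
```

The fold-change cutoff is fixed in advance, while the mean ± xSD cutoff adapts to the spread of each data set; neither gives an error rate, which is what a statistical method such as SAM adds.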
Results
5139 different genes with FBgn in total
Mean Corr. Coef.: 0.88
4163 different genes with FBgn (SAM INPUT)
SAM: 2.5% FDR, 1.75 fold change
95
140
Filtering of Bad Spots
-If a gene is downregulated in the
mutant (ash2I1):
Ratio = F sample / F control <1
log2 Ratio <0, because log2 1=0
ash2 is in activation pathway
- If a gene remains unchanged in the
mutant (ash2I1):
Ratio = F sample / F control = 1
log2 Ratio = 0, because log2 1=0
ash2 is not regulating this gene
-If a gene is upregulated in the
mutant (ash2I1):
Ratio = F sample / F control > 1
log2 Ratio > 0, because log2 1=0
ash2 is in repression pathway
Controls and Quality assessment
- Sequencing of some clones from the Collection plates
- RT-PCR of some genes in a semiquantitative way
- Western Blot
- in situ hybridization
- Northern Blot
- immunolocalization
- Clonal Analysis
RT-PCR
+ = wt
ASH2
- = ash2
Classification according
to GO (Gene Ontology)
- Gene Ontology is a “controlled vocabulary that can be
applied to all eukaryotes”. Each gene product is classified in one or more categories.
- Is the distribution of misexpressed genes significantly
different from that of our initial set of genes?
- maybe ash2 acts predominantly upon a group
of genes of similar function or pathway
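One common way to ask whether a GO category is over-represented among the misexpressed genes is a one-sided hypergeometric test. The slides do not say which test was used, so the sketch below is an assumption, and all counts in the example are made up.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, observed):
    """P(X >= observed) under the hypergeometric distribution.

    population: genes in the initial set
    annotated:  genes in that set belonging to the GO category
    selected:   misexpressed genes
    observed:   misexpressed genes belonging to the GO category
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# Toy numbers: 10 genes, 4 in the category, 3 selected, 2 of them in the category.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

A small p-value for a category would support the idea that ash2 acts predominantly on genes of that function; testing many categories at once would additionally need a multiple-testing correction.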
[Figure: fluorescence vs. time (seconds) trace with the 18S and 28S peaks marked.]
Operon D. melanogaster Array
16416 spots
14593 70mer probes representing 13664 genes and 17899 transcripts
POSITIVE CONTROLS
- 10 A. thaliana oligos (TIGR spikes) - each printed 4 times by pin = 640 spots
- 12 D. melanogaster oligos - each printed 17 times = 204 spots
NEGATIVE CONTROLS
- 12 Randomly Generated Negative Controls – printed several times = 188 spots
- 352 Empty spots
- 449 Buffer spots
(hybridized with aRNA ISOash2I1 vs ISO)
ANALYSIS LAYOUT
2 TIFF images (Cy3 & Cy5) + GAL file (gene matrix)
Input to: GenePix Pro 4.0 Image analysis
Output: 1 GPR file per experiment
Input to: TIGR Express Converter 1.4.1
Output: 1 MEV file per experiment (total=5)
Input to: TIGR MIDAS
- Each experiment analyzed independently
- Background filter applied
- Normalization applied: Lowess (LOC) for each experiment
independently
Input to: EXCEL & TIGR MEV
- Spike-in, negative and positive control Check
- MA Plots
- Experiment Comparison (Scatter Plots)
- Relevant Genes Finding
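The coordinates of an MA plot can be computed per spot as sketched below. The conventional definitions M = log2(Cy5/Cy3) and A = 0.5 * log2(Cy5*Cy3) are assumed here; the function name and the intensities are illustrative only and are not part of the TIGR tools.

```python
import math


def ma_values(cy5, cy3):
    """Return (M, A) for one spot from its two channel intensities.

    M is the log2 ratio; A is the mean log2 intensity
    (conventional definitions, assumed here).
    """
    m = math.log2(cy5 / cy3)
    a = 0.5 * math.log2(cy5 * cy3)
    return m, a


# Hypothetical background-corrected intensities for one spot.
m_value, a_value = ma_values(1024.0, 256.0)
```

Plotting M against A shows whether the log ratio drifts with overall intensity, which is the kind of bias that Lowess normalization is meant to remove.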
TIGR spike-in Mix
We can use the spikes to assess quality of experiment and analysis
On chip: 10 A. thaliana oligos spotted 64 times each (4 times by pin)
To add to labeling reaction: In vitro synthesized RNA from each
gene at different proportions and quantities:
GENE   Ratio    Mix A (pg in 2 ul)   Mix B (pg in 2 ul)
RCA    1 to 1   5000                 5000
Cab    1 to 1   2000                 2000
RbcL   1 to 1   500                  5000
Ltp4   1 to 1   20                   20
Ltp6   2 to 1   3000                 1500
PRK    2 to 1   500                  250
TIM    2 to 1   100                  50
Nac    1 to 3   10                   30
RCP    1 to 3   200                  600
XCP    1 to 3   1000                 3000
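Because the spike ratios are known, the observed log2 ratios of the spike spots can be compared with their expected values. The sketch below uses the nominal ratios listed above; the direction (Mix A over Mix B), the tolerance of 0.5 and the observed values are assumptions for illustration.

```python
import math

# Nominal Mix A : Mix B ratios for the ten spikes.
SPIKE_RATIOS = {
    "RCA": (1, 1), "Cab": (1, 1), "RbcL": (1, 1), "Ltp4": (1, 1),
    "Ltp6": (2, 1), "PRK": (2, 1), "TIM": (2, 1),
    "Nac": (1, 3), "RCP": (1, 3), "XCP": (1, 3),
}


def expected_log2(gene):
    """Expected log2(Mix A / Mix B) for a spike gene."""
    a, b = SPIKE_RATIOS[gene]
    return math.log2(a / b)


def spikes_off_target(observed, tolerance=0.5):
    """Spike genes whose observed log2 ratio misses the expected value."""
    return sorted(
        gene for gene, value in observed.items()
        if abs(value - expected_log2(gene)) > tolerance
    )


# Made-up observed log2 ratios for three spikes.
observed = {"RCA": 0.1, "Ltp6": 0.9, "Nac": -0.4}
off_target = spikes_off_target(observed)
```

The expected values are 0 for the 1 to 1 spikes, 1 for the 2 to 1 spikes and about -1.58 for the 1 to 3 spikes; spikes that land far from these values point to a problem in labeling, hybridization or normalization.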
For Amplification experiments
we use the spikes diluted 1:500
TIGR spikes MA plot from an
experiment with total RNA
Experimental procedure and analysis seem good
(spikes fall where expected)
[Figure: DOO-016 TIGR Spikes MA Plot. Axes: log2(Cy5/Cy3) and log2(Cy5*Cy3). Series: RCA (11), CAB (11), rbcL (11), LTP4 (11), XCP2 (13), RCP1 (13), NAC1 (13), Ltp6 (21), PRKase (21), TIM (21); reference lines: 3 to 1 ratio, 1 to 2 ratio.]
Operon Arrays Insets
ISO ash2I1 vs ISO
L3 total RNA
- 60 ug indirectly labelled
aRNA from L3 total RNA
- 2 ug amplified to 70 ug in 4 h
- 20 ug of labelled aRNA
TIGR Spikes
Amplification Test:
total RNA vs aRNA log2 ratios
Correlation coef = 0.94
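A correlation coefficient between two sets of log2 ratios can be computed as below. This sketch assumes a Pearson coefficient over genes present in both experiments; the slides do not state which coefficient was used, and the input lists are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 ratios for the same five genes in two experiments.
total_rna = [0.1, -0.8, 1.5, 0.0, 2.1]
arna = [0.2, -0.6, 1.2, 0.1, 1.9]
r = pearson(total_rna, arna)
```

A value close to 1 indicates that amplification preserved the relative expression changes seen with total RNA.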
Biological Replicates
[Figure: fluorescence vs. time (seconds) traces for REPLICATE 1 and REPLICATE 2, with the 18S and 28S peaks marked.]
Biological Replicates
Microarray Insets
REPLICATE 1
REPLICATE 2
Amplified TIGR spikes
(diluted 1:100) together with probes
Biological Replicates
Replicate 1 vs Replicate 2 log2 ratios
Correlation coef = 0.92