RADical microarray data: standards, databases, and analysis

Chris Stoeckert, Ph.D.
University of Pennsylvania
Yale Microarray Data Analysis
Workshop
December 5, 2003
Science 298:601-604, 2002
Science 298:597-600, 2002
Very few “stemness” genes were
common between the two
studies. Why?

Inherent problem of testing the stemness
hypothesis using a profiling approach?


Summary by Fortunel et al. (Science 2003)
who did a third study and found only one
common “stemness” gene.
Or did experimental and computational
differences reduce the overlap?

~66% overlap if only hematopoietic bone marrow samples
are considered (Ivanova et al. Science 2003)
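The overlap question above comes down to comparing gene lists. A minimal sketch follows, assuming overlap is measured as the shared fraction of the smaller list; the slides do not say how the ~66% figure was computed, and the gene identifiers and the `overlap_fraction` helper are made up for illustration.

```python
def overlap_fraction(genes_a, genes_b):
    """Fraction of the smaller gene list that also appears in the other list.

    Assumption: this is one of several reasonable overlap definitions;
    it is not taken from any of the cited studies.
    """
    set_a, set_b = set(genes_a), set(genes_b)
    if not set_a or not set_b:
        return 0.0
    shared = set_a & set_b
    return len(shared) / min(len(set_a), len(set_b))


# Made-up gene identifiers, for illustration only.
study_1 = ["g1", "g2", "g3", "g4", "g5", "g6"]
study_2 = ["g2", "g4", "g6", "g7"]

common = sorted(set(study_1) & set(study_2))
print(common)
print(overlap_fraction(study_1, study_2))
```

Such a comparison is only meaningful when both lists use the same gene identifiers and come from comparable samples, which is the point the slides go on to make.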
To compare experiments, you need some minimum
information about the microarray experiments.
Ivanova et al. Science 2003
MIAME formalizes that minimum information
MIAME and MAGE are Defined Standards from
the Microarray Gene Expression Data (MGED)
Society



MIAME - a document which outlines the minimum information
that should be reported about a microarray experiment to enable
its unambiguous interpretation and reproduction
  www.mged.org/miame
  Nature Genetics (2001), 29: 365-371.
MAGE - MAGE consists of three parts: an object model (MAGE-OM),
a document exchange format, which is derived directly
from the object model (MAGE-ML), and software toolkits
(MAGE-stk), which seek to enable users to create MAGE-ML
  www.mged.org/mage
  Genome Biology (2002), 3: research0046.1-0046.9.
In addition, the MGED Ontology provides the language
(vocabulary and relationships) for MIAME and MAGE.
  www.mged.org/ontology
  Comparative & Functional Genomics (2003), 4: 127-132.
Applying MGED Standards

Experiment design:
  Name: cell_comparison_design
  Type:
    development_or_differentiation_design
    species_design
    cell_type_comparison_design
Experiment Factors:
  hematopoietic cell population (LT-HSC, ST-HSC, HSC, LCP, MBC)
    Type: BioMaterialCharacteristicCategory: targeted_cell_type
  mouse developmental stage (fetal, adult)
    Type: BioMaterialCharacteristicCategory: developmental_stage
  species (human, mouse)
    Type: BioMaterialCharacteristicCategory: organism
  stem cell type (hematopoietic, embryonic, neural)
    Type: BioMaterialCharacteristicCategory: cell_type
MIAME/MAGE info MGED Ontology terms
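The annotation above can be held in a small data structure. This is a minimal sketch, not the RAD schema or MAGE-OM: the class names `ExperimentFactor` and `ExperimentDesign` are hypothetical, and the pairing of each factor with its BioMaterialCharacteristicCategory term is inferred from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentFactor:
    name: str        # free-text factor name from the slide
    category: str    # BioMaterialCharacteristicCategory term (pairing inferred)
    values: tuple    # factor values used in the study


@dataclass
class ExperimentDesign:
    name: str
    design_types: tuple
    factors: list = field(default_factory=list)

    def factor_categories(self):
        """Map each factor name to its ontology category."""
        return {f.name: f.category for f in self.factors}


design = ExperimentDesign(
    name="cell_comparison_design",
    design_types=(
        "development_or_differentiation_design",
        "species_design",
        "cell_type_comparison_design",
    ),
    factors=[
        ExperimentFactor("hematopoietic cell population", "targeted_cell_type",
                         ("LT-HSC", "ST-HSC", "HSC", "LCP", "MBC")),
        ExperimentFactor("mouse developmental stage", "developmental_stage",
                         ("fetal", "adult")),
        ExperimentFactor("species", "organism", ("human", "mouse")),
        ExperimentFactor("stem cell type", "cell_type",
                         ("hematopoietic", "embryonic", "neural")),
    ],
)

for factor_name, category in design.factor_categories().items():
    print(f"{factor_name}: {category}")
```

Keeping the controlled ontology terms separate from the free-text names is what makes annotations searchable across studies.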
RAD Enables Use of MGED Standards

RNA Abundance Database (RAD)
  http://www.cbil.upenn.edu/RAD
  MIAME-based
  Incorporates the MGED Ontology
Can search for experiments/studies based on annotations
  Graphs of a study are generated automatically
RAD Study-Annotator for entering annotations
MR_T (MAGE-RAD Translator) for exporting in MAGE
Get RAD
  All source code available
RAD view of stem cell study
RAD Study-Annotator collects MIAME
and Uses the MGED Ontology
RAD helps you publish!
[Diagram: Study-Annotator -> RAD -> MAGE-RAD Translator -> ArrayExpress]
Journals are requiring deposition of microarray experiments in a public repository.
Patterns of Differential Gene Expression
PaGE

PaGE stands for Patterns from Gene Expression.
A goal is to compare patterns across more than 2 groups to look at co-regulation.
  Focuses on fold-change significance, since t-statistics are not really
  applicable to describing co-regulation
PaGE was developed by our group at Penn!
  Manduchi et al. Bioinformatics 2000.
PaGE uses the False Discovery Rate (FDR).
  FDR = # false positives / (# false positives + # true positives)
  PaGE takes a minimum confidence level as a parameter, and finds all
  genes which exceed this confidence.
  Each gene is reported with its own confidence. FDR = 1 - Confidence
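The FDR and confidence relationship stated above can be written out directly. This is a minimal sketch of the arithmetic only, not PaGE's code; the function names and the per-gene confidence values are hypothetical, and whether PaGE's cutoff is strict or inclusive is not stated in the slides (an inclusive cutoff is assumed here).

```python
def false_discovery_rate(false_positives, true_positives):
    """FDR = # false positives / (# false positives + # true positives)."""
    total = false_positives + true_positives
    if total == 0:
        return 0.0  # nothing was called, so there are no false discoveries
    return false_positives / total


def confidence_from_fdr(fdr):
    """Confidence as used on the slide: FDR = 1 - Confidence."""
    return 1.0 - fdr


def genes_meeting_confidence(gene_confidence, min_confidence):
    """Keep genes whose confidence is at or above the cutoff (assumed inclusive)."""
    return {g: c for g, c in gene_confidence.items() if c >= min_confidence}


fdr = false_discovery_rate(false_positives=1, true_positives=3)
print(fdr, confidence_from_fdr(fdr))

# Made-up per-gene confidences, for illustration only.
made_up = {"geneA": 0.95, "geneB": 0.60, "geneC": 0.80}
print(genes_meeting_confidence(made_up, min_confidence=0.8))
```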
PaGE uses ratios of means: B/A, C/A, D/A,
where A, B, C, and D are group means for each gene and A is the reference
group.

Use permutations to generate the random distribution of ratios.
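As an illustration of the ratio-of-means and permutation idea, here is a generic sketch for one gene and one comparison group. It is not the PaGE implementation: the slides do not describe PaGE's permutation scheme or how the null distribution becomes a per-gene confidence, so the pooling-and-shuffling approach, the function names, and the intensity values are all assumptions.

```python
import random
from statistics import mean


def ratio_of_means(group, reference):
    """Ratio of a group mean to the reference group mean (e.g. B/A)."""
    return mean(group) / mean(reference)


def permuted_ratios(group, reference, n_permutations=1000, seed=0):
    """Null distribution of ratios from shuffling group labels.

    Assumption: labels are permuted by pooling both groups and
    re-splitting at random, keeping the original group sizes.
    """
    rng = random.Random(seed)
    pooled = list(group) + list(reference)
    n_group = len(group)
    ratios = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        ratios.append(mean(pooled[:n_group]) / mean(pooled[n_group:]))
    return ratios


def fraction_at_least(null_ratios, observed):
    """Share of permuted ratios at least as large as the observed ratio."""
    return sum(r >= observed for r in null_ratios) / len(null_ratios)


# Made-up positive intensities for a single gene, for illustration only.
group_a = [100.0, 110.0, 90.0, 105.0]   # reference group A
group_b = [210.0, 190.0, 205.0, 220.0]  # comparison group B

observed = ratio_of_means(group_b, group_a)
null = permuted_ratios(group_b, group_a, n_permutations=500, seed=1)
print(observed, fraction_at_least(null, observed))
```

With only four replicates per group there are few distinct relabelings, so the permuted distribution is coarse; real analyses typically draw on many genes and replicates.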
Mouse Hematopoietic Stem Cell PaGEs
[Figure: expression patterns for Group A/0, Group B/1, Group C/2, Group D/3]
StemCellDB: http://stemcell.princeton.edu/v2/
Available real soon!
Summary

Standards
  Using MIAME, MAGE, and the MGED Ontology improves your experiment
Databases
  Databases like RAD facilitate using standards
Analysis
  PaGE provides profiles using differential expression with
  False Discovery Rate based on ratios.
Acknowledgements

MGED
  MIAME, MAGE, and Ontology Working Groups
RAD
  Elisabetta Manduchi, Trish Whetzel, Junmin Liu,
  Angel Pizarro, Greg Grant, Hongxian He, Matt Mailman
PaGE
  Greg Grant, Junmin Liu, Elisabetta Manduchi
Stem cells
  Ihor Lemischka, Kateri Moore, Natalia Ivanova, Jason
  Hackney, Laurie Kramer
  Hongxian He, Greg Grant, Lyle Ungar