Sharing Microarray
Experiment Knowledge
Chips to Hits Oct. 28, 2002
Chris Stoeckert, Ph.D.
Dept. of Genetics & Center for Bioinformatics
University of Pennsylvania
Nature, October 3, 2002
http://plasmodb.org/
David Roos, Jessie Kissinger, Bindu Gajria, Martin Fraunholz, Jules Milgram, Phil
Labo, Amit Bahl, Dave Pearson, Dinesh Gupta, Hagai Ginsburg
Jonathan Crabtree, Jonathan Schug, Brian Brunk, Greg Grant, Trish Whetzel, Matt
Mailman, Li Li
Desirable Microarray Queries
• Return all experiments using developmental
stage X.
– Sort by platform type
– Which are untreated? Treated?
• Treated by what?
• How comparable are these?
• What can these experiments tell me?
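A minimal sketch of how these queries might look once every experiment carries structured annotation. The record fields (`stage`, `platform`, `treatment`) and the sample records are hypothetical and are not taken from any real database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Experiment:
    # Hypothetical annotation fields, for illustration only.
    name: str
    stage: str                # developmental stage
    platform: str             # array platform type
    treatment: Optional[str]  # None means untreated


def by_stage(experiments, stage):
    """Return experiments at one developmental stage, sorted by platform type."""
    hits = [e for e in experiments if e.stage == stage]
    return sorted(hits, key=lambda e: (e.platform, e.name))


def split_by_treatment(experiments):
    """Partition experiments into (untreated, treated)."""
    untreated = [e for e in experiments if e.treatment is None]
    treated = [e for e in experiments if e.treatment is not None]
    return untreated, treated


# Made-up records standing in for annotated experiments.
experiments = [
    Experiment("exp1", "stage X", "spotted cDNA", None),
    Experiment("exp2", "stage X", "oligo", "compound A"),
    Experiment("exp3", "stage Y", "oligo", None),
]

stage_x = by_stage(experiments, "stage X")
untreated, treated = split_by_treatment(stage_x)
```

The queries only work because `stage`, `platform`, and `treatment` are separate, consistently filled fields rather than free text.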
Microarray Information to be Shared
Figure from:
David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14
The Computational View of Microarray Information
Need an ontology to unambiguously represent this information.
What is an Ontology?
• In philosophy, an ontology is a systematic account
of Existence.
• In AI, an ontology is a systematic account of what
can be represented.
• The knowledge of a domain is represented in a
declarative formalism.
– Classes, relations, functions, or other objects are
defined with human-readable text describing what the
names mean, and formal axioms that constrain the
interpretation.
• A common ontology defines the vocabulary with
which queries and assertions are exchanged.
Excerpted and adapted from:
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
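As a rough illustration of the definition above, a toy ontology can be written as named classes with human-readable definitions, a subclass relation, and one constraint on allowed values. The definitions and the allowed values for `Sex` below are assumptions made for this sketch, not text from any published ontology.

```python
# Toy ontology: class name -> definition, parent class, optional allowed values.
# All definitions and allowed values here are hypothetical.
ONTOLOGY = {
    "BiosourceProperty": {
        "definition": "A property of the biological source material.",
        "parent": None,
    },
    "Sex": {
        "definition": "The sex of the organism the sample came from.",
        "parent": "BiosourceProperty",
        "allowed": {"female", "male", "unknown"},
    },
    "Age": {
        "definition": "Time elapsed since a stated starting point.",
        "parent": "BiosourceProperty",
    },
}


def is_a(cls, ancestor):
    """Follow parent links to test whether cls is ancestor or a subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = ONTOLOGY.get(cls, {}).get("parent")
    return False


def check_assertion(cls, value):
    """Accept an assertion only if the class exists and the value is permitted."""
    entry = ONTOLOGY.get(cls)
    if entry is None:
        return False
    allowed = entry.get("allowed")
    return allowed is None or value in allowed
```

Two parties that share `ONTOLOGY` can exchange assertions such as `("Sex", "female")` and each can check them the same way.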
An Experimental Ontology
• An ontology for microarray experiments
– Not an ontology of life but of experiments
– Parts are applicable to describing experiments in
general
• Our approach to interfacing with other ontologies
is “experimental”
– Not mapping terms from related ontologies
– Provide a framework to hang other ontologies off of
• Know where to find different types of annotation
• How to interpret that annotation
http://www.mged.org
Relationship of MGED Efforts
[Diagram relating: software and database developers; investigators annotating experiments; MIAME; MAGE; MGED Ontology; external ontologies/CVs; MIAME DB]
The MGED Ontology Home Page
http://www.cbil.upenn.edu/Ontology
The MGED Ontology Home Page
http://mged.sourceforge.net/ontologies/
The MGED Ontology Provides a Listing of Resources for Many Species
The MGED Ontology Organizes the Resources According to Concepts
The MGED Ontology is Structured in
DAML+OIL using OILed 3.4
MGED Ontology: BiomaterialDescription:
BiosourceProperty: Age
MGED Ontology: BiosourceOntologyEntry:
DiseaseState
MGED Ontology class | External reference | Instance
BioMaterialDescription
  BiosourceProperty
    Organism | NCBI Taxonomy | Mus musculus musculus id: 39442
    Age | | 7 weeks after birth
    DevelopmentStage | Mouse Anatomical Dictionary | Stage 28
    Sex | | Female
    StrainOrLine | International Committee on Standardized Genetic Nomenclature for Mice | C57BL/6N
    BiosourceProvider | | Charles River, Japan
    OrganismPart | Mouse Anatomical Dictionary | Liver
  BioMaterialManipulation
    EnvironmentalHistory
      CultureCondition
        Temperature | | 22 ± 2 °C
        Humidity | | 55 ± 5%
        Light | | 12 hours light/dark cycle
        PathogenTests | | Specified pathogen free conditions
        Water | | ad libitum
        Nutrients | | MF, Oriental Yeast, Tokyo, Japan
    Treatment
      CompoundBasedTreatment
        (Compound) | ChemIDplus | Fenofibrate, CAS 49562-28-9
        (Treatment_application) | | in vivo, oral gavage
        (Measurement) | | 100 mg/kg body weight
An example of microarray sample annotation using the MGED ontology
Susanna A. Sansone, Helen Parkinson, Philippe Rocca-Serra,
Chris Stoeckert and Alvis Brazma
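One way to hold part of this annotation in code is as a list of (class, value, external reference) entries. This is only a sketch: the `Annotation` type and helper functions are hypothetical, and the pairing of classes with values and references is read off the figure, so treat it as an assumption.

```python
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    ontology_class: str                       # MGED Ontology class name
    value: str                                # instance value for this sample
    external_reference: Optional[str] = None  # vocabulary supplying the term


# Pairings below are assumed from the figure layout.
SAMPLE = [
    Annotation("Organism", "Mus musculus musculus", "NCBI Taxonomy"),
    Annotation("Age", "7 weeks after birth"),
    Annotation("Sex", "Female"),
    Annotation(
        "StrainOrLine",
        "C57BL/6N",
        "International Committee on Standardized Genetic Nomenclature for Mice",
    ),
    Annotation("OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    Annotation("Compound", "Fenofibrate", "ChemIDplus"),
]


def value_of(annotations, ontology_class):
    """Return the first value recorded for a class, or None if absent."""
    for a in annotations:
        if a.ontology_class == ontology_class:
            return a.value
    return None


def externally_sourced(annotations):
    """Map each class to the external resource that supplied its term."""
    return {
        a.ontology_class: a.external_reference
        for a in annotations
        if a.external_reference is not None
    }
```

Keeping the external reference next to each value tells a reader where to look up the term and how to interpret it.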
The MGED Ontology in Action: MIAMExpress
Journals are Adopting the MGED Standards
Use of Minimal Information About Microarray Experiment (MIAME)
The MGED Ontology in Action: RAD
Generating Forms from the
MGED Ontology
RAD3
PHP/SQL WWW
OntologyEntry
RAD Forms
SRES
MGED Ontology
ExternalDatabases
MGED Ontology
Anatomy
DevelopmentalStage
Disease
Lineage
PATOAttribute
Phenotype
Taxon
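A rough sketch of the idea of generating form fields from ontology classes: a class with enumerated terms becomes a pick list, and a class without them becomes a free-text box. The function name and the example terms are hypothetical; the actual RAD forms are built with PHP/SQL and are not shown here.

```python
from html import escape


def form_field(name, terms=None):
    """Render one HTML form field for an ontology class.

    Classes with a controlled term list become a <select>;
    classes without one fall back to a free-text <input>.
    """
    safe = escape(name)
    if terms:
        options = "".join(
            f'<option value="{escape(t)}">{escape(t)}</option>' for t in terms
        )
        return f'<label>{safe} <select name="{safe}">{options}</select></label>'
    return f'<label>{safe} <input type="text" name="{safe}"></label>'


# Hypothetical term list, for illustration only.
sex_field = form_field("Sex", ["female", "male"])
age_field = form_field("Age")
```

Because the pick list is built from the ontology, adding a term to the ontology updates the form without editing the form code.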
Using the MGED standards in RAD
• RAD: RNA Abundance Database
– Stoeckert et al. (2000) Bioinformatics
• RAD 3.0
– MIAME compliant and MAGE supportive
– Building importers and exporters for MAGE
• Incorporates MGED ontology
– Uses OntologyEntry to point to internal tables and
external resources
• Expand processing and analysis information storage
– Driven by experience and new approaches
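A minimal sketch of the OntologyEntry idea, using an in-memory SQLite database: annotation tables store a reference to a shared term table instead of free text. The table and column names are hypothetical and do not reproduce the actual RAD schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One row per controlled term; external_db names the outside resource, if any.
    CREATE TABLE ontology_entry (
        id          INTEGER PRIMARY KEY,
        category    TEXT NOT NULL,
        value       TEXT NOT NULL,
        external_db TEXT,
        UNIQUE (category, value)
    );
    -- Annotations point at terms rather than repeating free text.
    CREATE TABLE biomaterial_characteristic (
        biomaterial       TEXT NOT NULL,
        ontology_entry_id INTEGER NOT NULL REFERENCES ontology_entry(id)
    );
    """
)
conn.executemany(
    "INSERT INTO ontology_entry (id, category, value, external_db) VALUES (?, ?, ?, ?)",
    [
        (1, "Sex", "Female", None),
        (2, "OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    ],
)
conn.executemany(
    "INSERT INTO biomaterial_characteristic VALUES (?, ?)",
    [("sample1", 1), ("sample1", 2), ("sample2", 2)],
)


def biomaterials_with(conn, category, value):
    """Return biomaterials annotated with the given controlled term."""
    rows = conn.execute(
        """
        SELECT bc.biomaterial
        FROM biomaterial_characteristic AS bc
        JOIN ontology_entry AS oe ON oe.id = bc.ontology_entry_id
        WHERE oe.category = ? AND oe.value = ?
        ORDER BY bc.biomaterial
        """,
        (category, value),
    )
    return [r[0] for r in rows]
```

Since every annotation resolves to one row in the term table, a query for "Liver" cannot miss samples because of spelling variants.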
RAD schema uses MAGE/MIAME
[Figure: RAD schema class diagram showing the tables listed below and the relationships between them]
MAGE packages: Experiment; Array; BioMaterial; BioAssay; BioAssayData; Protocol, Descr.; HigherLevelAnalysis
MIAME sections: Experimental Design; Array design; Samples; Hybridization, Measure; Normalization
RAD tables: Study, StudyAssay, StudyDesign, StudyDesignAssay, StudyDesignDescription, StudyFactor, StudyFactorValue, Array, ArrayAnnotation, ElementImp, ElementAnnotation, CompositeElementImp, CompositeElementAnnotation, Control, BioMaterialImp, BioMaterialCharacteristic, BioMaterialMeasurement, Treatment, LabelMethod, Assay, AssayLabeledExtract, Channel, Acquisition, AcquisitionParam, RelatedAcquisition, Quantification, QuantificationParam, RelatedQuantification, ElementResultImp, CompositeElementResultImp, Protocol, ProtocolParam, ProcessImplementation, ProcessImplementationParam, ProcessInvocation, ProcessInvocationParam, ProcessIO, ProcessResult, Analysis, AnalysisImplementation, AnalysisImplementationParam, AnalysisInvocation, AnalysisInvocationParam, AnalysisInput, AnalysisOutput, OntologyEntry, MAGEDocumentation, MAGE_ML
RAD is now part of GUS-3.0
GUS has 5 namespaces compartmentalizing different types of information.
Namespace | Domain | Features
Core | Data provenance | Workflows
SRes | Shared resources | Ontologies
DoTS | Sequence and annotation | Central dogma
RAD | Gene expression | MIAME/MAGE
TESS | Gene regulation | Grammars
Data Integration
Core: Data Provenance
• Ownership
• Protection
• Algorithms
• Similarity
• Versioning
• Workflow
SRes: Ontologies
• GO
• Species
• Tissue
• Dev. Stage
DoTS: Genomic Sequence
• Genes, gene models
• STSs, repeats, etc
• Cross-species analysis
DoTS: Transcribed Sequence
• Characterize transcripts
• RH mapping
• Library analysis
• Cross-species analysis
• DOTS
DoTS: Protein Sequence
• Domains
• Function
• Structure
• Cross-species analysis
RAD: Transcript Expression
• Arrays
• SAGE
• Conditions
TESS: Gene Regulation
• Binding Sites
• Patterns
• Grammars
Transcription factors up-regulated in acute myeloid leukemia
with sequence similarity to c-fos and common promoter motifs
GUS Supports Multiple Projects
AllGenes
PlasmoDB
EPConDB
Java Servlets
DoTS RAD TESS SRES Core
Oracle RDBMS
Object Layer for Data Loading
Other sites,
Other projects,
e.g. GeneDB
Available at http://www.gusdb.org
Summary
• The MGED ontology is being developed within the
microarray community to provide consistent terminology
for experiments.
– Make it easier and more accurate to annotate a
microarray experiment.
– Use structured fields and controlled terms to query
databases.
• This community effort has resulted in a list of multiple
resources for many species and a machine-readable
document of microarray concepts, definitions, and values.
– The MGED Ontology is a work in progress but can be
used now to build forms for databases
• RAD has incorporated the MGED ontology for forms
– Can export data from RAD into MAGE
– RAD as part of GUS provides integration of gene
expression, annotation, and sequence.
Acknowledgements
• RAD/GUS
– Brian Brunk
– Jonathan Crabtree
– Steve Fischer
– Yongchang Gan
– Greg Grant
– Hongxian He
– Li Li
– Junmin Liu
– Matt Mailman
– Elisabetta Manduchi
– Joan Mazzarelli
– Shannon McWeeney (OHSU)
– Debbie Pinney
– Angel Pizarro
– Jonathan Schug
– Trish Whetzel
www.cbil.upenn.edu
• MGED Ontology
– Helen Parkinson (EBI)
– Trish Whetzel
– The MGED Ontology Working Group
– MAGE working group
www.mged.org
http://www.ebi.ac.uk/SOFG