MAGE-TAB - The ArrayExpress Production Experience
Helen Parkinson, PhD
EBI is an Outstation of the European Molecular Biology Laboratory.
Content
• All change at ArrayExpress
• Data acquisition
• Validation
• Extension
• Downloads
• Long Term Future
• Tutorial – submitting in MAGETAB format
www.ebi.ac.uk/arrayexpress
[Diagram labels: MAGEML, AE, M.EXPRESS, MAGETABULATOR, Tracking, AE2, MAGETAB, MIGRATION]
Data acquisition
• MAGETAB data acquisition is integrated with existing tab2mage submissions
• MAGETAB export is being added to the MIAMExpress system
• All MAGE-ML submissions will be converted to MAGETAB
• We will unify data acquisition on MAGETAB
• We decided to do most curation/validation/ontology matching at the end for MAGETAB submissions
• MAGETAB makes curator edits and user updates much easier
• Human-readable tab-delimited formats = efficient curation
• 1600 experiments processed (1600/3700)
• All curated
• Subset of ArrayExpress MAGETAB data will be re-curated at migration
Automated processing and validation
• Sections
• MAGETAB Column Headers
• MAGETAB Column Orders
• MAGETAB Content – length, terms
• External data files – released monthly
• vs. ArrayExpress content
• MIAME score
• DW candidates
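
The header and column-order checks listed above can be sketched roughly as follows. This is a minimal illustration, not the ArrayExpress validation code: the function name `check_headers`, the subset of SDRF header patterns, and the assumed node order are all chosen here for the example.

```python
import re

# Illustrative subset of SDRF header patterns (an assumption for this sketch,
# not the list used by the ArrayExpress validator).
ALLOWED_HEADER_PATTERNS = [
    r"Source Name",
    r"Sample Name",
    r"Extract Name",
    r"Labeled Extract Name",
    r"Hybridization Name",
    r"Array Data File",
    r"Protocol REF",
    r"Characteristics\s*\[[^\]]+\]",
    r"Factor Value\s*\[[^\]]+\]",
    r"Comment\s*\[[^\]]+\]",
]

# Assumed left-to-right order of the material/assay node columns.
NODE_ORDER = [
    "Source Name",
    "Sample Name",
    "Extract Name",
    "Labeled Extract Name",
    "Hybridization Name",
]


def check_headers(header_line):
    """Return a list of problems found in one tab-delimited header line."""
    headers = header_line.rstrip("\n").split("\t")
    problems = []

    # Header check: every column name must match one allowed pattern.
    for header in headers:
        if not any(re.fullmatch(p, header.strip()) for p in ALLOWED_HEADER_PATTERNS):
            problems.append(f"unrecognised header: {header!r}")

    # Order check: node columns that are present must appear in NODE_ORDER.
    positions = [headers.index(name) for name in NODE_ORDER if name in headers]
    if positions != sorted(positions):
        problems.append("node columns out of order")

    return problems


if __name__ == "__main__":
    print(check_headers("Hybridization Name\tSource Name\tFoo"))
```

A content check (term length, controlled terms) would follow the same shape: iterate over cells and collect problems rather than stopping at the first one, so a curator sees everything in a single pass.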
Extensibility
• Solexa data
• Proteomics
• Metabolomics
• Array Genotype data (Gen2Phen)
• Association study data (Gen2Phen, Engage)
• Locus specific SNP data
• Clinical Data
• …..
Downloads
• All ArrayExpress data will now be available in MAGETAB format (exported directly from AE)
• ~90% is currently available and passes checks (issues with MAGE-OM->MAGETAB conversion)
• More ontology term sources will be added incrementally: NCI Thesaurus, OBI, ArrayExpress Factor Ontology
• Beta MAGETAB ArrayExpress Bioconductor module (Huber, Kauffmann)
• All MAGETAB generation code is available
• All validation code is available
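
Because the exported files are plain tab-delimited text, they can be read with standard tools. The sketch below assumes an SDRF-like layout; the sample rows are made up for illustration, and `read_sdrf` and `factor_values` are hypothetical helper names, not part of any ArrayExpress or Bioconductor package.

```python
import csv
import io

# Made-up SDRF fragment for illustration; real exports have many more columns.
EXAMPLE_SDRF = (
    "Source Name\tCharacteristics[Organism]\tHybridization Name\tFactor Value[compound]\n"
    "sample 1\tHomo sapiens\thyb 1\tnone\n"
    "sample 2\tHomo sapiens\thyb 2\taspirin\n"
)


def read_sdrf(handle):
    """Return (headers, rows) from a tab-delimited SDRF-like file handle."""
    reader = csv.reader(handle, delimiter="\t")
    headers = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    return headers, rows


def factor_values(headers, rows):
    """Map each hybridization name to its Factor Value columns."""
    hyb_col = headers.index("Hybridization Name")
    factor_cols = [i for i, h in enumerate(headers) if h.startswith("Factor Value")]
    return {row[hyb_col]: {headers[i]: row[i] for i in factor_cols} for row in rows}


headers, rows = read_sdrf(io.StringIO(EXAMPLE_SDRF))
print(factor_values(headers, rows))
```

`csv.reader` is used rather than `csv.DictReader` because SDRF files can repeat a column name (for example several `Protocol REF` columns), and a dictionary keyed on header would silently keep only the last one.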
Ontologies
• Working to develop OBI to replace the MGED Ontology
• Generating a sample/factor ontology for ArrayExpress based on data content
• Developed in Protégé/OWL format
• Will be served from OLS
• Also mapping to external ontologies for samples, e.g. NCI Thesaurus
• Text mining to annotate external data using dictionaries based on the NCI Thesaurus and some custom ones (GEOimporter, tab2mage->MAGETAB)
• Data import, meta-analysis
Future: ArrayExpress and Community
• ArrayExpress submission in MAGETAB ADF format
• All ArrayExpress ADFs in MAGETAB format
• Alpha ArrayExpress-MAGETAB Bioconductor MAGETAB importer
• AE2
• AE2 data migration
• More people post their MAGETAB examples and we agree on a gold-standard validated set for typical cases
• Community lists of MAGETAB supportive tools where people can register their interests and describe their applications (like GO tools)
• Addressing HLA
• MAGETAB model, firm up the spec
• Decide what factors really are, and whether the MAGE case is still valid – controlled vs uncontrolled variables instead?
• Issues with global variables: inter-experiment comparison of compounds needs to know dose even if dose doesn't vary in an experiment
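
The global-variable issue can be made concrete with a small sketch. It assumes sample annotations are held as simple name/value dictionaries; the experiments, values, and the function name `split_variables` are invented for illustration and do not describe how ArrayExpress stores factors.

```python
def split_variables(samples):
    """Split annotation names into those that vary within one experiment
    and those that are constant across all of its samples."""
    names = sorted({name for sample in samples for name in sample})
    varying, constant = [], {}
    for name in names:
        values = {sample.get(name) for sample in samples}
        if len(values) > 1:
            varying.append(name)            # candidate experimental factor
        else:
            constant[name] = values.pop()   # "global" variable for this experiment
    return varying, constant


# Made-up annotations: dose is fixed inside each experiment but differs between them.
exp_a = [
    {"compound": "aspirin", "dose": "10 mg/kg", "time": "2 h"},
    {"compound": "aspirin", "dose": "10 mg/kg", "time": "8 h"},
]
exp_b = [
    {"compound": "aspirin", "dose": "50 mg/kg", "time": "2 h"},
    {"compound": "aspirin", "dose": "50 mg/kg", "time": "8 h"},
]

for label, experiment in (("A", exp_a), ("B", exp_b)):
    varying, constant = split_variables(experiment)
    print(label, "varying:", varying, "constant:", constant)
```

In both invented experiments only `time` varies, so a factor-only view would make them look identical. The dose appears only among the constant values, which is why those have to be retained for any comparison across experiments.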
Acknowledgments
• Anna Farne
• Ele Holloway
• James Malone
• Margus Lukk
• Helen Parkinson
• Tim Rayner
• Faisal Rezwan
• Eleanor Williams
• Mengyao Zhao
• Holly Zheng
• Mohammad Shojatalab
ArrayExpress Production Team
ArrayExpress Development Team
• Funding
EC - FELICS, EMERALD, Gen2Phen, MUGEN
NIH - MAGE grant
Tutorial
• Creation of MAGETAB templates
• Completion of a pre-made template
• Curation
• Scoring and validation templates
• Viewing Data in ArrayExpress
• Backend of the template generation/tracking system
• www.ebi.ac.uk/~parkinso/MAGETAB_tutorial/