“Ontology” Group Report: Summary
Xiaoshu, John, Vinay, Duncan, Robert, Amit,
Alfredo, Vipul
- An attempt to summarize and organize …
Outline
• Use Cases
• What is an “ontology”?
• What knowledge can/should they represent?
• How should they represent knowledge?
• How are ontologies created?
• How are ontologies maintained?
• How is their quality evaluated?
• How may they be used? In which applications?
Use Cases
• A clinical researcher wants to explore mass spectrometry data and the results of its analysis for clinical use.
• The ability to share patient labs, test results, observations, etc. across various systems.
• The ability of a researcher to reuse gene information across various data sets (URI/identifier mapping).
• A pathologist proposes a rating of a diagnostic test. These ratings need to be reused by:
— Billing, for revenue generation
— Cancer Registry, for registering patients for trials, support groups, etc.
— Surgeon, for determining the course of treatment
• Brain Atlas project: data related to the brain model of a male mouse is stored and annotated. A sleep disorder researcher wants to use these results to propose cures for sleepwalking.
• A bioinformaticist discovers an uncharacterized DNA sequence. Which web services will help get information for characterizing the DNA sequence, say for clinical significance…
• A high-throughput experiment using microarray data is performed for environmental reasons. When using terms from an environmental ontology, I discover that I need terms from the toxicology ontology. I discover that some terms are missing… I suggest those missing terms and request that they be included in the ontology.
• A glycomics researcher wants experimental data annotated and combined with other annotated data generated at a different center.
• A diabetic patient visits a clinic in an emergency. Sugar and insulin levels need to be measured. To speed up the process, we want the patient to be able to provide information.
— Need a patient-centric ontology, a clinical ontology, and mappings between them
• Chemotherapy at home
What is an ontology?
• Model of use v/s model of meaning…
— Need to respect and assimilate current usage models…
• Some “ontologies” out there…
— Thesauri, Controlled Vocabularies
— Taxonomies
— Database Schemas
— Metadata models
— Ontologies
— First Order Theories…
• Need to look at the current W3C definition of an ontology (does it have one?) and “specialize” it for HCLS…
Need for a common shared vision
• HCLSIG
• HL7
• CDISC
• NLM
• FDA
• …
• Take pointers from the “Ecosystem” Group…
What knowledge should they represent?
• Terminologies: SNOMED, GO
• Information Models
— Various Genomic Artifacts: Genes, Proteins, Variants, Clinical Significances, Gene Test Result Reporting Templates
— Various Clinical Artifacts: Documentation Templates, Clinical Decision Support Rules, …
• Process Models
— Pathways
— Clinical Guidelines
— Clinical Care Protocols
— Clinical/Genomic Research Protocols
• Web Services Annotation Models
— Web Services for Ontologies v/s Ontologies for Web Services
• Namespaces
• Mappings to underlying heterogeneous database schemas?
• ID/Value Mappings?
— Gene X has ID1 in GenBank and ID2 in NCBI OMIM
— Identifier mapping algorithms?
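
The ID/value mapping question ("Gene X has ID1 in one database and ID2 in another") can be illustrated with a small sketch. This is only one possible approach, assuming that "same entity" links are asserted pairwise and are transitive; the class name `IdentifierMap`, the `(source, id)` tuple convention, and the sample identifiers are hypothetical and not taken from the group's discussion.

```python
class IdentifierMap:
    """Groups (source, id) pairs that denote the same entity.

    Uses a union-find structure, so links are treated as transitive:
    if A = B and B = C, then A = C. That is an assumption of this sketch.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Register unseen keys as their own group, then walk to the root.
        self._parent.setdefault(key, key)
        while self._parent[key] != key:
            self._parent[key] = self._parent[self._parent[key]]  # path halving
            key = self._parent[key]
        return key

    def link(self, a, b):
        """Declare that identifiers a and b refer to the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a, b):
        return self._find(a) == self._find(b)

    def aliases(self, key):
        """All known identifiers in the same group as key, sorted."""
        root = self._find(key)
        return sorted(k for k in list(self._parent) if self._find(k) == root)


# Hypothetical identifiers, mirroring the "Gene X" example.
id_map = IdentifierMap()
id_map.link(("GenBank", "ID1"), ("OMIM", "ID2"))
id_map.link(("OMIM", "ID2"), ("SwissProt", "ID3"))
print(id_map.aliases(("GenBank", "ID1")))
```

Treating links as transitive is convenient but can propagate a single wrong mapping across a whole group, which is one reason the quality of mappings needs its own evaluation.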
How should the knowledge be represented?
• Best practices related to use of RDF, OWL, SWRL … and any other relevant information
• Probabilistic Information
— Uncertainty in data: uncertainty in genotyping data from an Affymetrix chip
— Uncertainty in evidence
— Uncertainty in hypotheses
— Quality/Value judgements/Trust… e.g., I trust HCM results from Lab X more than from Lab Y
• Should we propose OWL/RDF extensions for these?
• Or can the current standards accommodate these issues?
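
To make the trust example concrete, here is a minimal sketch of assertions that carry a confidence value and a source, with belief discounted by how much each source is trusted. The noisy-OR combination rule, the `Assertion` fields, the predicate name, and all numeric values are assumptions chosen for illustration; the slides do not prescribe any particular model, and this is not an RDF or OWL extension.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    obj: str
    confidence: float  # confidence reported by the source, in [0, 1]
    source: str


def combined_belief(assertions, trust):
    """Combine supporting assertions with a noisy-OR rule.

    Each assertion's confidence is multiplied by the trust placed in its
    source; unknown sources get trust 0. Assumes the assertions are
    independent pieces of evidence for the same statement.
    """
    return 1.0 - prod(
        1.0 - a.confidence * trust.get(a.source, 0.0) for a in assertions
    )


# Hypothetical trust levels: Lab X is trusted more than Lab Y.
trust = {"LabX": 0.9, "LabY": 0.5}
from_x = Assertion("patient1", "hasFinding", "HCM", 0.8, "LabX")
from_y = Assertion("patient1", "hasFinding", "HCM", 0.8, "LabY")

belief_x = combined_belief([from_x], trust)
belief_y = combined_belief([from_y], trust)
belief_both = combined_belief([from_x, from_y], trust)
print(belief_x, belief_y, belief_both)
```

Here the same reported confidence yields a higher belief when it comes from Lab X than from Lab Y. Whether such annotations belong inside the ontology language or alongside it is exactly the open question raised above.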
How should ontologies be created?
• Collaborative Ontology Development
— In the context of a specific use case? – Application Requirement
— Reasoning/Inferencing Requirements
— E.g., interleaving the process of annotating data with the process of creating the ontology… (typically these are independent?)
• Need to distinguish between various actors:
— Subject Matter Experts (create “knowledge”)
— Information Modelers (create “models” or “ontologies”)
— Consumers (evaluate “goodness” of ontologies indirectly via how well the application performs)
• Enable and facilitate collaboration processes
• Community v/s Collaborative Ontology Development
— Sociological issues, Spheres of Influence
• NLP, Data Mining approaches to create ontologies
• Best practice guidelines
• Recommendations for namespaces, identifiers?
• Human language descriptions of various pieces of knowledge
• When to use RDF/OWL/SWRL, etc.
• Provide quality guidance?
• Provide guidance related to modularity
• Building blocks and templates for HCLS?
— E.g., foundational biomedical relations by Barry Smith.
— We could be the Q/A testing group for these?
• Ontology Registries
• Identifier Registries
How should ontologies be maintained…
• Evolution
— Use of old data against a new ontology
— Use of new data against an old ontology
— Evolution of Mappings…
• Versioning
• History/Diffs
• Merging/Partitioning
• Provenance
• Reason for the ontology
• Dependency Propagation
• Ontology Lifecycles
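
The evolution and history/diff items above can be sketched in a few lines. The sketch assumes a deliberately simplified representation, where an ontology version is a dictionary from term ID to a `(label, parent IDs)` pair; the term IDs, labels, and annotation records are hypothetical, and real ontologies would need a much richer comparison.

```python
def diff_versions(old, new):
    """Report terms added, removed, or changed between two versions."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(t for t in old.keys() & new.keys() if old[t] != new[t]),
    }


def stale_annotations(annotations, ontology):
    """Annotations (record ID, term ID) whose term is absent from the
    given ontology version, i.e. old data used against a new ontology."""
    return [(rec, term) for rec, term in annotations if term not in ontology]


# Hypothetical versions: term ID -> (label, frozenset of parent term IDs).
v1 = {
    "T1": ("disease", frozenset()),
    "T2": ("sleep disorder", frozenset({"T1"})),
    "T3": ("sleep walking", frozenset({"T2"})),
}
v2 = {
    "T1": ("disease", frozenset()),
    "T2": ("sleep disorder", frozenset({"T1"})),
    "T4": ("parasomnia", frozenset({"T2"})),
}

changes = diff_versions(v1, v2)
stale = stale_annotations([("rec1", "T2"), ("rec2", "T3")], v2)
print(changes)
print(stale)
```

A diff like this only detects that a term disappeared. Deciding whether it was retired, merged, or renamed requires provenance information kept with each version.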
How should ontologies be evaluated?
• General principles of
— Sound ontology design (from KR literature)
— Taxonomy design (from Library Sciences)
• Quality of Ontologies?
— Content
— Application performance (indirect)
• Quality of Mappings?
• Can this be used to provide guidance to the ontology development process?
How should ontologies be used?
• Scalability of ontologies and applications using them… ontologies with 100,000s of concepts and relationships
• Used in tools, exposed as web services
• Web Services for Ontologies v/s Ontologies for Web Services
• Ontologies for Data Mining
• Ontologies for creating Social Communication Structures
• What’s special with HCLS?
— Specific vs exclusive
— We have problems in “spades”: rapidly changing knowledge
— Legacy of ontology development and use… (e.g., Linnaeus classification)… Better chances of adoption/acceptance
Deliverables
• Best Practice Guidelines
• Use Cases
• Solution Design for a particular use case
— Conversion of a subset of SNOMED+GO+MedDRA into OWL
— Creation of mappings of the subset ontology to well-known databases (GenBank, SwissProt) and some clinical data…?
— Design some queries against these data sources
— Prototype?
• Collaborative Ontology Development Wiki?
• Wiki of Wikis that could include:
— HCLSIG Wiki
— BioPortal
— … other Wikis …