amia panel - National Center for Biomedical Ontology

Ontology: The New Era
Barry Smith
http://ncbo.us
humans with SHH mutations can
suffer midline defects: cleft palate,
holoprosencephaly
but holoprosencephaly can also
appear in individuals with normal
SHH
Question: due to what other factors?
Answer: Let’s look at orthologs of SHH in
other model organisms
what we find
mutations in shh, the zebrafish ortholog of
SHH, yield analogous defects
but so do mutations in oep, another
zebrafish gene
molecular identification of oep allowed
discovery of mutations in the human oep
ortholog (TDGF1) which could be shown
to cause holoprosencephaly
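The reasoning on this slide can be sketched as a lookup: start from a human phenotype, find model-organism genes whose mutations yield an analogous defect, and follow ortholog links back to human genes not yet associated with that phenotype. The sketch below is only an illustration under assumptions: the gene names come from the slides, but the table layout, the "midline defect" label, and the function name are hypothetical.

```python
# Toy data. Gene names are from the slides; the phenotype label
# "midline defect" and all table names are illustrative assumptions.
ZEBRAFISH_MUTANT_PHENOTYPES = {
    "shh": {"midline defect"},
    "oep": {"midline defect"},
}
ZEBRAFISH_TO_HUMAN_ORTHOLOG = {"shh": "SHH", "oep": "TDGF1"}

# Human genes already associated with each human phenotype.
KNOWN_HUMAN_GENES = {"holoprosencephaly": {"SHH"}}

# Hand-curated link between a human phenotype and its zebrafish analogue.
ANALOGOUS_PHENOTYPE = {"holoprosencephaly": "midline defect"}


def candidate_human_genes(human_phenotype):
    """Return human orthologs of zebrafish genes whose mutants show the
    analogous phenotype, excluding genes already known for the phenotype."""
    fish_phenotype = ANALOGOUS_PHENOTYPE[human_phenotype]
    known = KNOWN_HUMAN_GENES.get(human_phenotype, set())
    candidates = set()
    for fish_gene, phenotypes in ZEBRAFISH_MUTANT_PHENOTYPES.items():
        if fish_phenotype in phenotypes:
            human_gene = ZEBRAFISH_TO_HUMAN_ORTHOLOG.get(fish_gene)
            if human_gene and human_gene not in known:
                candidates.add(human_gene)
    return sorted(candidates)


print(candidate_human_genes("holoprosencephaly"))  # ['TDGF1']
```

The hard part in practice is the hand-curated phenotype link, which is exactly what the rest of the talk argues needs good ontologies.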
… but this took four years
The holy grail
What would it take to detect patterns of
similarity between human phenotypes
and those model organism phenotypes
for which we have potentially useful
molecule-level data and to isolate those
automatically?
First: good (realistic, scientific) data
sources
genes associated with cleft palate
445 genes
abnormal proteins associated with cleft palate
Second: good (realistic, scientific) ontologies
Finding shared cross-species
phenotypic features with implications
for our understanding of human
diseases is like finding needles in
haystacks
haystack with needle
haystack without needle
Needle with haystack as represented
in a good, realist ontology
Good (scientific, realist) ontologies
require hard work and staying
power
the haystack ontology John built last
Tuesday
the haystack ontology Sally built this morning
So: to detect cross-species
similarities
we can’t just google our way across John’s,
and Sally’s, and Bill’s, and Tom’s ontologies
• they are all still fragments
• they are syntactically unregimented
• none are interoperable with the FMA (or with
anything else)
• none allow automatic error-checking or
automatic reasoning
all are full of weird artefacts
Bill confuses portions of tissue with limbs
Sally thinks a gland is identical with an observation of
a gland
Tom thinks blood pressure is an act of
measurement
Singupta thinks the heart is an ordered pair
consisting of a preferred term and a concept
unique identifier
Jim thinks there are exactly 26 kinds of chemicals
Olivier wastes half his life constructing mappings
between these various bits of nonsense
Finding shared cross-species
phenotypic features is like finding
needles in haystacks
where our search is constrained by
the need to reason back and forth
across heterogeneous data
sources relating to entities at
different levels of granularity
genes associated with cleft palate
445 genes
abnormal proteins associated with cleft palate
medical records
Referent tracking data
SNOMED codes
We know that high-quality
ontologies can help
in creating high-quality mappings between
human and model organism phenotypes
OWL is not enough
The use of a common syntax and logical
machinery and the careful separating out
of ontologies into namespaces does not
solve the problem of ontology integration
And it certainly does not solve the problem
of ontology quality.
“Alignment of Multiple Ontologies of
Anatomy: Deriving Indirect Mappings from
Direct Mappings to a Reference Ontology”
Songmao Zhang
Olivier Bodenreider
AMIA 2005
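The general idea named in that title, deriving an indirect mapping between two ontologies from their direct mappings to a shared reference ontology, can be sketched as a join on the reference term. This is not the authors' procedure, only a minimal illustration; the term identifiers and the function name are hypothetical.

```python
from collections import defaultdict


def derive_indirect_mappings(a_to_ref, b_to_ref):
    """Pair every term of ontology A with every term of ontology B
    that is directly mapped to the same reference-ontology term."""
    # Invert B's direct mappings: reference term -> set of B terms.
    ref_to_b = defaultdict(set)
    for b_term, ref_term in b_to_ref.items():
        ref_to_b[ref_term].add(b_term)

    pairs = set()
    for a_term, ref_term in a_to_ref.items():
        for b_term in ref_to_b.get(ref_term, ()):
            pairs.add((a_term, b_term))
    return sorted(pairs)


# Hypothetical direct mappings to a reference ontology.
a_to_ref = {"A:heart": "REF:Heart", "A:palate": "REF:Palate"}
b_to_ref = {"B:cor": "REF:Heart", "B:liver": "REF:Liver"}

print(derive_indirect_mappings(a_to_ref, b_to_ref))  # [('A:heart', 'B:cor')]
```

A join like this is only as good as the direct mappings and the reference ontology behind it, which is the point the talk goes on to make.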
Robin McEntire, GSK
What we need is a strong push toward
"industrial-strength" ontologies. … ontologies
with a consistent and rich representation
formalism that are amenable for use as an
integration framework, and support reasoning
capabilities. We anticipate that pharma's need to
bring together mountains of data and information
and to properly analyse that information all
depend on having a stable, well-developed
semantic framework that links information/data
and that allows reasoning systems to perform
some of our more "mundane" analysis work.
Scientific, rigorously tested
reference ontologies
in anatomy
in physiology
in pathology
in chemistry
a reformed GO for genes and
gene products
…
One Central Goal of the National
Center for Biomedical Ontology
apply the scientific method to the
development of biomedical ontologies
treat them not as word lists for hobbyists
but as scientific theories which are
subject to empirical testing against real-world benchmarks
able to support tools for automatic
reasoning and error-checking
To realize this goal we will
1. organize hands-on workshops* with
different community groups, coaxing
and cajoling them to incorporate
ontology best practices in their work
and to foster mutual learning and
coordination
*Workshop in Schloss Dagstuhl, Germany
2. use top-level reference ontologies
as constraints on the Open Biomedical
Ontologies (OBO) library*
http://obo.sourceforge.net
to create the conditions for a step-by-step
evolution towards high-quality interoperable
reference ontologies in the biomedical domain
*elite GOLD membership category