T1_BadZwischenahn9_0.. - Buffalo Ontology Site

Introduction to Applied and Theoretical
Ontology
Barry Smith
http://ontologist.com
1
2
The Challenge of Biomedical
Research
Each (clinical, pathological, genetic,
proteomic, pharmacological …) information
system uses its own terminology and
category system
But biomedical research demands the
ability to navigate through all such
information systems
How can we overcome the incompatibilities
which become apparent when data from
different sources is combined?
3
Database and terminology
standardization
is desperately needed in medical informatics
and bioinformatics
to enable the huge amounts of existing
data to be fused together automatically
4
Current standard solution
The Unified Medical Language System (UMLS)
a metathesaurus of some 100 source vocabularies:
SNOMED
ICD-10
MeSH – Medical Subject Headings
Foundational Model of Anatomy
LOINC (Logical Observation Identifiers Names
and Codes)
Gene Ontology
HL7
5
UMLS Metathesaurus
> 1,800,000 Concepts
> 10,000,000 Relations
compiled by US National Library of
Medicine, Bethesda MD
6
Problem
The Source Vocabularies contain
bad coding
7
MeSH
Organisms
Plants
Plant Components
Plant Components, Aerial
Flowering Tops
Flowers
Pollen
Confusion of mass and count senses of ‘substance’
8
SNOMED
both_testes is_a testis
9
Problem
The UMLS Source Vocabularies
are Mutually Inconsistent
10
Representation of Blood in SNOMED
Blood is_a Tissue
11
Representation of Blood in MeSH
Blood is_a Bodily Fluid
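
The two assertions above can be compared mechanically. The following is a minimal sketch, assuming the is_a statements have already been extracted as (source, term, parent) triples; the function name find_conflicts and the data layout are illustrative assumptions, not part of UMLS or of either vocabulary.

```python
from collections import defaultdict

# is_a assertions quoted on the slides, tagged by source vocabulary.
ASSERTIONS = [
    ("SNOMED", "Blood", "Tissue"),
    ("MeSH", "Blood", "Bodily Fluid"),
]


def find_conflicts(assertions):
    """Return terms whose is_a parent differs between source vocabularies."""
    parents = defaultdict(dict)  # term -> {source: parent}
    for source, term, parent in assertions:
        parents[term][source] = parent
    return {
        term: by_source
        for term, by_source in parents.items()
        if len(set(by_source.values())) > 1
    }


conflicts = find_conflicts(ASSERTIONS)
print(conflicts)
# {'Blood': {'SNOMED': 'Tissue', 'MeSH': 'Bodily Fluid'}}
```

A differing parent only marks a candidate for review. Whether Tissue and Bodily Fluid really exclude one another is an ontological question that the code cannot settle.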
12
How to make ONE SYSTEM out of
different source terminologies?
Through the UMLS Semantic
Network
134 Semantic Types
55 Links (is_a, part_of, etc.)
built by linguists for the sake of your
health and well-being …
13
built by Saussurian linguists
AND MANDATED BY THE US
FEDERAL GOVERNMENT
for the sake of your health and
well-being …
14
15
16
UMLS Semantic Network
entity
physical object
event
conceptual entity
organism
17
conceptual entity
Organism Attribute
Finding
Idea or Concept
Occupation or Discipline
Organization
Group
Group Attribute
Intellectual Product
Language
18
19
Idea or Concept
Functional Concept
Qualitative Concept
Quantitative Concept
Spatial Concept
Body Location or Region
Body Space or Junction
Geographic Area
Molecular Sequence
Amino Acid Sequence
Carbohydrate Sequence
Nucleotide Sequence
20
Bad Zwischenahn
is an Idea or Concept
21
22
23
24
entity
physical object
organism
conceptual entity
anatomical structure
fully formed anatomical structure
body part, organ or organ component
25
entity
physical object
conceptual entity
idea or concept
functional concept
body system
26
Body System
Circulatory System
Nervous System
Immune System
Musculo-Skeletal System
etc.
27
Your digestive system, according to UMLS,
is a conceptual entity
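
The classification can be traced by following is_a links upward. The following is a minimal sketch, assuming the hierarchy fragments shown on the preceding slides and, as a simplification, a single parent per type; the IS_A table and the ancestors helper are illustrative names, not a UMLS API.

```python
# Child -> parent is_a links, transcribed from the Semantic Network
# fragments on the slides. Single-parent storage is a simplifying assumption.
IS_A = {
    "body system": "functional concept",
    "functional concept": "idea or concept",
    "idea or concept": "conceptual entity",
    "conceptual entity": "entity",
    "body part, organ or organ component": "fully formed anatomical structure",
    "fully formed anatomical structure": "anatomical structure",
    "anatomical structure": "physical object",
    "physical object": "entity",
}


def ancestors(term, is_a=IS_A):
    """Follow is_a links from term up to the root and return the chain."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain


print(ancestors("body system"))
# ['functional concept', 'idea or concept', 'conceptual entity', 'entity']
```

On these links a body system never passes through physical object, while a body part, organ or organ component does. That asymmetry is the oddity the slide points at.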
28
GO: the Gene Ontology
3 large telephone directories of standardized
designations for gene functions and products
designed to cover the whole of biology
model for
fungal ontology,
plant ontology,
drosophila ontology,
etc.
29
GO: the Gene Ontology
GO organized into 3 hierarchies via is_a
and part_of
30
The intended meaning of part-of
as explained in the GO Usage Guide is:
« can be a part of »
GO axiom: flagellum part-of cell,
means: “a flagellum is part-of some cells”
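
The gap between the two readings can be made explicit in first-order notation. This is a sketch only: the predicate names Flagellum, Cell and part_of are notation assumed here, and neither formula is quoted from the GO Usage Guide.

```latex
% Weak ("can be a part of") reading: some flagellum is part of some cell.
\exists x\,\exists y\;\bigl(\mathrm{Flagellum}(x)\wedge\mathrm{Cell}(y)\wedge\mathrm{part\_of}(x,y)\bigr)

% Stronger all-some reading: every flagellum is part of some cell.
\forall x\;\bigl(\mathrm{Flagellum}(x)\rightarrow\exists y\,(\mathrm{Cell}(y)\wedge\mathrm{part\_of}(x,y))\bigr)
```

The weak reading is satisfied by a single flagellated cell, so it licenses very few inferences about flagella in general.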
31
GO divided into three disjoint
term hierarchies
cellular component ontology:
flagellum, chromosome, cell
molecular function ontology:
ice nucleation, binding, protein stabilization
biological process ontology:
glycolysis, death
32
GO divided into three disjoint
term hierarchies
= no is_a and no part_of relations between
them
cellular component ontology
molecular function ontology
biological process ontology
How are functions and processes linked
together?
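
Disjointness in this sense can be checked mechanically. The following is a minimal sketch, assuming each term is labelled with its hierarchy and relations are given as (subject, relation, object) triples; the example terms come from the slides, while the function cross_hierarchy_links and the extra link in the usage check are hypothetical illustrations, not GO content.

```python
# Example terms from the slides, labelled with their GO hierarchy.
TERM_ONTOLOGY = {
    "flagellum": "cellular component",
    "chromosome": "cellular component",
    "cell": "cellular component",
    "ice nucleation": "molecular function",
    "binding": "molecular function",
    "protein stabilization": "molecular function",
    "glycolysis": "biological process",
    "death": "biological process",
}

# The one relation quoted on the slides.
RELATIONS = [("flagellum", "part_of", "cell")]


def cross_hierarchy_links(relations, term_ontology):
    """Return the relations whose two terms sit in different hierarchies."""
    return [
        (subj, rel, obj)
        for subj, rel, obj in relations
        if term_ontology[subj] != term_ontology[obj]
    ]


# Disjointness holds when no relation crosses a hierarchy boundary.
print(cross_hierarchy_links(RELATIONS, TERM_ONTOLOGY))
# []

# Hypothetical link, added only to show what a violation looks like.
hypothetical = RELATIONS + [("binding", "part_of", "glycolysis")]
print(cross_hierarchy_links(hypothetical, TERM_ONTOLOGY))
# [('binding', 'part_of', 'glycolysis')]
```

The check shows the cost of disjointness: a link from a function to the process it contributes to is exactly what gets rejected.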
33
Definition of «Function»
UMLS Semantic Network:
Functional Concept =df A concept which is
of interest because it pertains to the
carrying out of a process or activity.
GO:
Molecular Function =df the action
characteristic of a gene product.
34
UMLS brings clarity
In March 2003 all nodes in the Molecular
Function ontology (except the root) had
‘activity’ added to their names
Function = activity
35
Confusion of Function and Activity
If function = activity (= functioning),
then how do we deal with dormant/suppressed
functions?
36
How are the ontologies related?
Function = “the action characteristic of a
gene product.”
Process = “phenomenon marked by
changes that lead to a particular result,
mediated by one or more gene products”
37
Result:
constant coding errors, owing to
the lack of clear principles about
what basic notions like
‘function’, ‘process’, ‘part’
mean
38
Examples of GO Molecular
Functions
anti-coagulant (defined as: “a substance
that retards or prevents coagulation”)
enzyme (defined as: “a substance that
catalyzes”)
structural molecule (defined as: “the action
of a molecule that contributes to structural
integrity”)
39
Problems with Bioinformatics
Terminology Systems
Circular definitions
Confusion of use and mention
Confusion of concepts and objects
Confusion of concepts and classes
Confusion of terms and objects
Confusion of knowledge with what is known
Simple stupidity
… all of which lead to poor coding
40
These problems are derived
1. from the drive for rapid population of
bioinformatics databases
-- for funding bodies quantity overwhelms
quality (KR mentality: quick and dirty)
2. from ignorance of the basic principles
of ontology (and logic)
3. from relativism/reductionism of
linguists
41
42
The problem
Different communities of medical
researchers use different and often
incompatible category systems in
expressing the results of their work
43
The solution
“ONTOLOGY”
Remove “Ontology Impedance”
But what does “ontology” mean?
44
Two alternative readings
Ontologies are oriented around terms or
concepts = currently popular IT conception
Ontologies are oriented around the entities in
reality = traditional philosophical conception,
embraced also by IFOMIS
45
Ontology as a branch of
philosophy
seeks to establish
the science of the kinds and structures
of objects, properties, events,
processes and relations in every
domain of reality
46
Ontology a kind of generalized
chemistry or zoology
(Aristotle’s ontology grew out of
biological classification)
47
Aristotle
world’s first ontologist
48
World’s first ontology
(from Porphyry’s Commentary on Aristotle’s Categories)
49
Linnaean Ontology
50
Ontology is distinguished from
the special sciences
it seeks to study all of the various
types of entities existing at all
levels of granularity
51
and to establish how they
hang together to form a single
whole (‘reality’ or ‘being’)
52
different concept/terminology
systems
53
need not interconnect at all
for example they may relate to
entities of different granularity
54
we cannot make incompatible
terminology-systems interconnect
just by looking at concepts,
or knowledge or language
55
we cannot make incompatible
terminology-systems interconnect
by staring at the terminology
systems themselves
56
to decide which of a plurality of
competing definitions to accept
we need some tertium quid
57
we need, in other words,
to take the world itself into account
58
BFO
= basic formal ontology
59
BFO
ontology is defined not as the
‘standardization’ or ‘specification’ of
conceptualizations
(not as a branch of knowledge or concept
engineering)
but as an inventory of the entities existing
in reality
60
The BFO framework
will solve the problem of ontological
impedance and provide tools for quality
control on the output of computer
applications
61
BFO not a computer application
but a Reference Ontology
(something like old-fashioned
metaphysics)
62
Reference Ontology
a theory of a domain of entities in the world
63
BFO
not just a system of categories
but a formal theory
with definitions, axioms, theorems
designed to provide the resources for
reference ontologies for specific domains
of sufficient richness that terminological
incompatibilities can be resolved
intelligently rather than by brute force
64
Proposed solution
distinguish two separate tasks:
- the task of developing computer
applications capable of running in real time
- the task of developing an expressively rich
framework of a sort which will allow us to
resolve incompatibilities between
definitions
and formulate intuitive and reliable
principles for database curation
65
Reference Ontology
a theory of the tertium quid
– called reality –
needed to hand-calibrate
database/terminology systems
66
Methodology
Get ontology right first
(realism; descriptive adequacy; rather
powerful logic);
solve tractability problems later
67
The Reference Ontology
Community
IFOMIS (Leipzig)
Laboratories for Applied Ontology (Trento/Rome,
Turin)
Foundational Ontology Project (Leeds)
Ontology Works (Baltimore)
Ontek Corporation (Buffalo/Leeds)
Language and Computing (L&C)
(Belgium/Philadelphia)
68
Domains of Current Work
IFOMIS Leipzig: Medicine, Bioinformatics
Laboratories for Applied Ontology
Trento/Rome: Ontology of Cognition/Language
Turin: Law
Foundational Ontology Project: Space, Physics
Ontology Works: Genetics, Molecular Biology
Ontek Corporation: Biological Systematics
Language and Computing: Natural Language
Understanding
69
THE END
70