Transcript Document

R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
CHSS Data Center Work Weekend
Ontology, Terminology, and
Cardiovascular Surgery
Nov 21, 2008 – Toronto, Canada
Werner CEUSTERS, MD
Center of Excellence in Bioinformatics and Life Sciences, and
National Center for Biomedical Ontology, University at Buffalo, NY, USA
Short personal history
[Timeline figure, 1959 - 2008, with milestones at 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004 and 2006]
Structure of this presentation
• Data and where they (should) come from
• Realism-based ontology
• Referent Tracking
• How to build ontologies from terminologies
• How to link to patient data
• How can disparate views be accommodated
The central hypothesis
• For disease registries to facilitate meaningful multi-institutional outcomes analysis, there must be:
1. Common language = nomenclature,
2. Mechanism of data collection (database or registry) with an
established uniform core data set,
3. Mechanism of evaluating case complexity,
4. Mechanism to ensure and verify data completeness and
accuracy,
5. Collaboration between medical subspecialties.
JP Jacobs et.al. Nomenclature and Databases — The Past, the Present, and the Future:
A Primer for the Congenital Heart Surgeon. Pediatr Cardiol (2007)
Would this do ?
• For disease registries to facilitate meaningful multi-institutional outcomes analysis, there must be:
1. Whatever sort of Common language = nomenclature,
2. Whatever sort of Mechanism of data collection (database or
registry) with an established uniform core data set,
3. Whatever sort of Mechanism of evaluating case complexity,
4. Whatever sort of Mechanism to ensure and verify data
completeness and accuracy,
5. Whatever sort of Collaboration between medical
subspecialties.
The answer is clearly …
• … No !
• There are
– many such animals
– of various sorts,
– which all have shortcomings,
– and therefore lead to the creation of even more such animals,
– which finally end up suffering, more or less, from the same flaws.
MeSH 2008: congenital heart defects
All MeSH Categories
  Diseases Category
    Congenital, Hereditary, and Neonatal Diseases and Abnormalities
      Congenital Abnormalities
        Cardiovascular Abnormalities
          Heart Defects, Congenital
            Alagille Syndrome
            Aortic Coarctation
            Arrhythmogenic RV Dysplasia
            Cor Triatriatum
            ...
All MeSH Categories
  Diseases Category
    Cardiovascular Diseases
      Cardiovascular Abnormalities
        Heart Defects, Congenital
          Alagille Syndrome
          Aortic Coarctation
          Arrhythmogenic RV Dysplasia
          Cor Triatriatum
          ...
All MeSH Categories
  Diseases Category
    Cardiovascular Diseases
      Heart Diseases
        Heart Defects, Congenital
          Aortic Coarctation
          Arrhythmogenic RV Dysplasia
          Cor Triatriatum
          ...
SNOMED-CT version 2008.01.7AC
SNOMED-CT’s
‘Fallot’s trilogy’
versus
‘Fallot’s triad’
Trilogy of Fallot
• Definition:
– Combination of pulmonary valve stenosis and atrial septal
defect with right ventricular hypertrophy.
• Typical representational mistake:
– From (correct, if the definition is right):
• ‘a patient who has Fallot’s triad
– has a pulmonary valve stenosis,
– has an atrial septal defect,
– has a right ventricular hypertrophy.’
– To (wrong, even if the definition is right) :
• ‘a Fallot’s triad
– is a pulmonary valve stenosis,
– is an atrial septal defect,
– is a right ventricular hypertrophy.’
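The difference between the two readings can be made concrete in a small Python sketch. All class and attribute names here are illustrative assumptions, not taken from SNOMED-CT or from the talk. The triad is modelled as a combination that a patient has, made up of three distinct defects, rather than as something that is each of them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Disorder:
    """A disorder type, identified here only by its name (illustrative)."""
    name: str


PULMONARY_VALVE_STENOSIS = Disorder("pulmonary valve stenosis")
ATRIAL_SEPTAL_DEFECT = Disorder("atrial septal defect")
RV_HYPERTROPHY = Disorder("right ventricular hypertrophy")

# 'Has' reading: the triad is a combination of three components.
# It is not modelled as a subtype of any single one of them.
FALLOT_TRIAD_COMPONENTS = frozenset(
    {PULMONARY_VALVE_STENOSIS, ATRIAL_SEPTAL_DEFECT, RV_HYPERTROPHY}
)


@dataclass
class Patient:
    name: str
    disorders: set = field(default_factory=set)

    def has_fallot_triad(self) -> bool:
        # The patient has the triad only when every component is present.
        return FALLOT_TRIAD_COMPONENTS <= self.disorders


patient = Patient("example patient", {PULMONARY_VALVE_STENOSIS, ATRIAL_SEPTAL_DEFECT})
print(patient.has_fallot_triad())  # False
patient.disorders.add(RV_HYPERTROPHY)
print(patient.has_fallot_triad())  # True
```

Under the 'is a' reading, the triad would be declared a kind of stenosis, a kind of septal defect and a kind of hypertrophy at once, which no single entity can be.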
In general: some alarming publications
• Why most published research findings are false.
Ioannidis JPA (2005). PLoS Med 2(8): e124.
– Institute for Clinical Research and Health Policy Studies, Department of
Medicine, Tufts-New England Medical Center, Tufts University School of
Medicine, Boston, Massachusetts.
• Why Current Publication Practices May Distort Science.
Young NS, Ioannidis JPA, Al-Ubaydli O (2008, October 7) PLoS
Med 5(10): e201. doi:10.1371/journal.pmed.0050201.
– Hematology Branch, National Heart, Lung, and Blood Institute, National
Institutes of Health, Bethesda, Maryland,
Key question:
Why is this ?
‘The spectrum of the Health Sciences’
Turning data into knowledge
http://www.uvm.edu/~ccts
What is missing here ?
Turning data into knowledge
http://www.uvm.edu/~ccts
Source of all data
Reality !
Today’s data generation and use
[Diagram; labels: observation & measurement; data organization; model development; use; application; add; verify; Δ = outcome; further R&D; generic beliefs; (instrument and study optimization)]
Key components
[Diagram; nodes: reality (players, HIT, outcomes), data, information, knowledge, hypotheses, representation; edge labels: generates, influences, about]
Current deficiencies
• At the level of reality:
– Desired outcomes different for distinct players
• Competing interests
– Multitude of HIT applications and paradigms
• At the level of representations:
– Variety of formats
– Silo formation
– Doubtful semantics
• In their interplay:
– Very poor provenance or history keeping
– No formal link with what the data are about
– Low quality
Where should we go?
Ultimate goal (at least mine)
A digital copy of the world
Requirements for this digital copy
• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
– what is generic → scientific theories
– what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
– … allow queries over the world’s past and present,
– … make predictions,
– … fill in gaps,
– … identify mistakes,
– ...
In fact … the ultimate crystal ball
The ‘binding’ wall
A cartoon of the world
Major problems
1. A mismatch between what is - and has been - the case in reality, and representations thereof in:
a) (generic) Knowledge repositories, and
b) (specific) Data and Information repositories.
2. An inadequate integration of a) and b).
Solutions
Philosophy: Philosophical realism, Realism-based Ontology
HIT: Referent Tracking
Realism-based Ontology
‘Ontology’: one word, two meanings
• In philosophy:
– Ontology (no plural) is the study of what entities exist and how they
relate to each other;
• In computer science and (biomedical informatics)
applications:
– An ontology (plural: ontologies) is a shared and agreed upon
conceptualization of a domain;
• Our ‘realist’ view within the Ontology Research Group
combines the two:
– We use realism, a specific theory of ontology, as the basis for
building high quality ontologies, using reality as benchmark.
Realism-based ontology
• Basic assumptions:
1. reality exists objectively in itself, i.e. independent of
the perceptions or beliefs of cognitive beings;
2. reality, including its structure, is accessible to us,
and can be discovered through (scientific) research;
3. the quality of an ontology is at least determined by
the accuracy with which its structure mimics the
pre-existing structure of reality.
However: the dominant view in Comp Sc is conceptualism
[Diagram: the Semantic Triangle (concept, term, object), as embedded in Terminology]
The concept-based view
[Diagram: particulars (P) linked to a class by ‘isa’]
The realism-based view
[Diagram: particulars (P) are instance-of a universal (e.g. human) and member-of a class (e.g. all humans), which is the extension-of that universal; a defined class groups some of them (e.g. all humans in this room)]
Ontology versus Terminology
[Diagram. Ontology side: particulars (P) are instance-of a universal (e.g. human) and member-of a class (e.g. all humans), the extension-of that universal; a defined class covers e.g. all humans in this room. Terminology side: particulars (P) are instance-of a class/concept.]
The ‘terminology / ontology divide’
• Terminology:
– solves certain issues related to language use, i.e. with respect to
how we talk about entities in reality (if any);
• Relations between terms / concepts
– does not provide an adequate means to represent independent of
use what we talk about, i.e. how reality is structured;
• Women, Fire and Dangerous Things (Lakoff).
• Ontology (of the right sort):
– Language and perception neutral view on reality.
• Relations between entities in first-order reality
Terminological versus Ontological approach
• The terminologist defines:
– ‘a clinical drug is a pharmaceutical product given to (or taken
by) a patient with a therapeutic or diagnostic intent’. (RxNorm)
• The ontologist thinks:
– Does ‘given’ include ‘prescribed’?
– Is ‘manufactured with the intent to …’ not sufficient?
• Are newly marketed products – available in the pharmacy, but not yet
prescribed – not clinical drugs?
• Are products stolen from a pharmacy not clinical drugs?
• What about such products taken by persons that are not patients?
– e.g. children mistaking tablets for candies.
Cardiovascular surgery examples
• Systemic venous anomaly, SVC, Bilateral SVC
• Systemic venous anomaly, SVC, Bilateral SVC, Innominate absent
• Systemic venous anomaly, SVC, Bilateral SVC, Innominate present
• VA valve overriding
• VA valve overriding, Aortic valve
• VA valve overriding, Left sided VA Valve
• VA valve overriding, Pulmonary valve
• VA valve overriding, Right sided VA Valve
• VA valve overriding-modifier for degree of override, Override of VA valve <50%
• VA valve overriding-modifier for degree of override, Override of VA valve >90%
• VA valve overriding-modifier for degree of override, Override of VA valve 50–90%
JP Jacobs et al. The nomenclature, definition and classification of cardiac structures in the setting of heterotaxy.
Cardiol Young 2007; 17(Suppl. 2): 1–28
The semantic triangle revisited
[Diagram, ‘Representation and Reference’: concepts and terms are about objects in First Order Reality]
Terminology versus Realist Ontology
[Diagram, ‘Representation and Reference’. Terminology: terms and concepts, about objects. Realist Ontology: representational units, about universals and particulars in First Order Reality.]
Terminology versus Realist Ontology
[Diagram, ‘Representation and Reference’. Labels: terms; concepts; representational units; cognitive units; communicative units; about; objects; universals; particulars; First Order Reality.]
Three levels of reality in Realist Ontology
(3) Representational units in various forms about (1), (2) or (3)
(2) Cognitive entities which are our beliefs about (1)
(1) Entities with objective existence which are not about anything
[Diagram labels: Terminology; Representation and Reference; representational units; cognitive units; communicative units; universals; particulars; First Order Reality]
The three levels in medical practice
3. Representation
– Generic: ‘atrial septal defect’
– Specific: ‘W. Ceusters’, ‘my heart defect’
2. Beliefs (knowledge)
– Generic: DIAGNOSIS, INDICATION
– Specific: my doctor’s work plan, my doctor’s diagnosis
1. First-order reality
– Generic: PATHOLOGICAL STRUCTURE, DRUG, MOLECULE, PERSON, DISEASE, BLOOD PRESSURE
– Specific: my doctor’s computer, my doctor, me, my ASD, my blood pressure
Terminology is too reductionist
What concepts do we need?
How do we name concepts properly?
The power of realism in ontology design
Reality as benchmark !
1. Is the scientific ‘state of the art’
consistent with biomedical reality ?
2. Is my doctor’s knowledge up to date?
3. Does my doctor have an accurate
assessment of my health status?
4. Is our terminology rich enough
to communicate about all three levels?
5. How can we use case studies better
to advance the state of the art?
Referent Tracking
Another problem to solve: how many disorders?
PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract
• The same type of location code used in relation to three different events might or might not refer to the same location.
• Three references of hypertension for the same patient denote three times the same disease.
• If the same fracture code is used for the same patient on different dates, then these codes might or might not refer to the same fracture.
• If two different fracture codes are used in relation to observations made on the same day for the same patient, they might refer to the same fracture.
• The same fracture code used in relation to two different patients can not refer to the same fracture.
• If two different tumor codes are used in relation to observations made on different dates for the same patient, they may still refer to the same tumor.
Requirements for a digital copy of the world
• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
– what is generic → scientific theories → realism-based ontologies
– what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
– … allow queries over the world’s past and present,
– … make predictions,
– … fill in gaps,
– … identify mistakes,
– ...
The reality: a digital copy of part of the world
Applying the grid should not give a
distorted representation of reality, but only
an incomplete representation !!!
Key issue: keeping track of what the bits denote
Fundamental goal of Referent Tracking
• explicit reference to the
concrete individual entities
relevant to the accurate
description of each patient’s
condition, therapies,
outcomes, ...
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records.
J Biomed Inform. 2006 Jun;39(3):362-78.
Method: numbers instead of words
– Introduce an Instance
Unique Identifier (IUI)
for each relevant
particular (individual)
entity
78
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records.
J Biomed Inform. 2006 Jun;39(3):362-78.
The essence of
Referent Tracking
• Keeping track of particulars
• By means of singular and globally unique
identifiers (#1, #2, #3, …)
• That function as surrogates for these entities in
information systems, documents, etc
• And are managed IN a referent tracking system.
Ceusters W. and Smith B. Tracking Referents in Electronic Health Records. In: Engelbrecht R. et
al. (eds.) Medical Informatics Europe, IOS Press, Amsterdam, 2005;:71-76
The principle of Referent Tracking
[Diagram: two sentences annotated with IUIs.
‘John Doe’s liver tumor was treated with RPCI’s irradiation device’: #1 (person), #2 (liver), #3 (tumor), #4 (treating), #5 (clinic), #6 (device), with instance-of links at t1.
‘John Smith’s liver tumor was treated with RPCI’s irradiation device’: #10 (person), #20 (liver), #30 (tumor), #40 (treating), and the same #5 (clinic) and #6 (device), with inst-of links at t2.]
EHR – Ontology “collaboration”
Reasoning over instances and universals
[Diagram; labels: instance-of at t; #105; caused by]
Codes for types AND identifiers for instances
PtID   Date        ObsCode    IUI      Narrative
5572   04/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   04/07/1990  81134009   IUI-001  Fracture, closed, spiral
5572   12/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   12/07/1990  9001224    IUI-007  Accident in public building (supermarket)
5572   04/07/1990  79001      IUI-005  Essential hypertension
0939   24/12/1991  255174002  IUI-004  benign polyp of biliary tract
2309   21/03/1992  26442006   IUI-002  closed fracture of shaft of femur
2309   21/03/1992  9001224    IUI-007  Accident in public building (supermarket)
47804  03/04/1993  58298795   IUI-006  Other lesion on other specified region
5572   17/05/1993  79001      IUI-005  Essential hypertension
298    22/08/1993  2909872    IUI-003  Closed fracture of radial head
298    22/08/1993  9001224    IUI-007  Accident in public building (supermarket)
5572   01/04/1997  26442006   IUI-012  closed fracture of shaft of femur
5572   01/04/1997  79001      IUI-005  Essential hypertension
0939   20/12/1998  255087006  IUI-004  malignant polyp of biliary tract
7 distinct disorders
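The count of 7 can be checked with a short Python sketch over the rows of this slide. The tuple layout and the decision to treat code 9001224 as a location rather than a disorder are assumptions made for the illustration. The sketch compares counting by record, by patient and code, and by IUI.

```python
# Rows from the slide: (PtID, date, ObsCode, IUI).
ROWS = [
    ("5572", "04/07/1990", "26442006", "IUI-001"),
    ("5572", "04/07/1990", "81134009", "IUI-001"),
    ("5572", "12/07/1990", "26442006", "IUI-001"),
    ("5572", "12/07/1990", "9001224", "IUI-007"),
    ("5572", "04/07/1990", "79001", "IUI-005"),
    ("0939", "24/12/1991", "255174002", "IUI-004"),
    ("2309", "21/03/1992", "26442006", "IUI-002"),
    ("2309", "21/03/1992", "9001224", "IUI-007"),
    ("47804", "03/04/1993", "58298795", "IUI-006"),
    ("5572", "17/05/1993", "79001", "IUI-005"),
    ("298", "22/08/1993", "2909872", "IUI-003"),
    ("298", "22/08/1993", "9001224", "IUI-007"),
    ("5572", "01/04/1997", "26442006", "IUI-012"),
    ("5572", "01/04/1997", "79001", "IUI-005"),
    ("0939", "20/12/1998", "255087006", "IUI-004"),
]

# Assumption: this code records where an accident happened, not a disorder.
LOCATION_CODES = {"9001224"}

disorder_rows = [row for row in ROWS if row[2] not in LOCATION_CODES]
by_patient_and_code = {(pt, code) for pt, _, code, _ in disorder_rows}
by_iui = {iui for _, _, _, iui in disorder_rows}

print(len(disorder_rows))        # 12 records
print(len(by_patient_and_code))  # 8 patient/code pairs
print(len(by_iui))               # 7 distinct disorders
```

Only the IUI column gives the intended answer: counting records or patient/code pairs over-counts, because the same fracture or tumor can be recorded under different codes or on different dates.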
Requirements for a digital copy of the world
• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
– what is generic → scientific theories
– what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
– … allow queries over the world’s past and present,
– … make predictions,
– … fill in gaps,
– … identify mistakes,
– ...
Eternal memory
Accept that everything may change:
1. changes in the underlying reality:
• Particulars come, change and go
Identity & instantiation
[Diagram along a time axis t; labels: child, adult, person, living creature, animal, caterpillar, butterfly]
Accept that everything may change:
1. changes in the underlying reality:
• Particulars come, change and go
2. changes in our (scientific) understanding:
• The planet Vulcan does not exist
Reality and representation: both in evolution
[Diagram along a time axis t. Reality: particular p3 (IUI-#3); universals U1: benign tumor and U2: malignant tumor. Representation: O-#0: diabolic possession; O-#1: ‘benign tumor’; O-#2: ‘cancer’. Arrows stand for “denotes”, which is what constitutes the meaning of representational units.]
… Therefore: O-#0 is meaningless
Accept that everything may change:
1. changes in the underlying reality:
• Particulars come, change and go
2. changes in our (scientific) understanding:
• The planet Vulcan does not exist
3. reassessments of what is considered to be
relevant for inclusion (notion of purpose).
4. encoding mistakes introduced during data entry
or ontology development.
Changes over time
• In John Smith’s Electronic Health Record:
– At t1: “male”
at t2: “female”
• What are the possibilities ?
• Change in reality:
• transgender surgery
• change in legal self-identification
• Change in understanding: it was female from the very
beginning but interpreted wrongly
• Correction of data entry mistake: it was understood as
male, but wrongly transcribed
• (Change in word meaning)
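One way to keep these possibilities apart in a record is to store, with every new value, the reason why it replaces the previous one. The following Python sketch is only an illustration. The names `ChangeReason`, `Entry` and `earlier_entry_was_wrong` are hypothetical and do not come from any Referent Tracking implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChangeReason(Enum):
    REALITY = "change in reality"
    UNDERSTANDING = "change in understanding"
    DATA_ENTRY = "correction of data entry mistake"
    WORD_MEANING = "change in word meaning"


@dataclass(frozen=True)
class Entry:
    time: str
    value: str
    # Why this entry replaced the previous one; None for the first entry.
    reason: Optional[ChangeReason] = None


def earlier_entry_was_wrong(entry: Entry) -> bool:
    """True when the new entry implies the previous one misdescribed reality."""
    return entry.reason in (ChangeReason.UNDERSTANDING, ChangeReason.DATA_ENTRY)


history = [Entry("t1", "male"), Entry("t2", "female", ChangeReason.REALITY)]
print(earlier_entry_was_wrong(history[-1]))  # False
```

With the reason recorded, a query about the past can tell whether the entry at t1 still holds for t1 (a change in reality) or has to be disregarded (a change in understanding or a corrected mistake).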
Requirements for a digital copy of the world
• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
– what is generic → scientific theories
– what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
– … allow queries over the world’s past and present,
– … make predictions,
– … fill in gaps,
– … identify mistakes,
– ...
Referent Tracking System Components
• Referent Tracking Software
Manipulation of statements about facts and beliefs
• Referent Tracking Datastore:
• IUI repository
A collection of globally unique singular identifiers
denoting particulars
• Referent Tracking Database
A collection of facts and beliefs about the particulars
denoted in the IUI repository
Manzoor S, Ceusters W, Rudnicki R. Implementation of a Referent Tracking System.
International Journal of Healthcare Information Systems and Informatics 2007;2(4):41-58.
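The split between an IUI repository and a database of statements about the identified particulars can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the system described by Manzoor et al. The class names, the use of `uuid4` for identifiers and the four-part statement tuple are all choices made for the illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IUIRepository:
    """A collection of globally unique singular identifiers for particulars."""
    _iuis: set = field(default_factory=set)

    def assign(self) -> str:
        # Assumption: a random UUID is unique enough to serve as an IUI.
        iui = f"IUI-{uuid.uuid4()}"
        self._iuis.add(iui)
        return iui

    def __contains__(self, iui) -> bool:
        return iui in self._iuis


@dataclass
class ReferentTrackingDatabase:
    """Statements about the particulars denoted in the IUI repository."""
    repository: IUIRepository
    statements: list = field(default_factory=list)

    def assert_relation(self, subject: str, relation: str, obj: str, time: str) -> None:
        # Only particulars that already have an IUI can be described.
        for iui in (subject, obj):
            if iui not in self.repository:
                raise KeyError(f"unknown IUI: {iui}")
        self.statements.append((subject, relation, obj, time))


repo = IUIRepository()
db = ReferentTrackingDatabase(repo)
patient, tumor = repo.assign(), repo.assign()
db.assert_relation(tumor, "affects", patient, "t1")
print(len(db.statements))  # 1
```

The check in `assert_relation` reflects the ordering implied by the slide: an identifier is assigned first, and facts or beliefs are recorded about it afterwards.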
Place in the Health IT arena
[Diagram: an Ontology (continuant, disorder, person, CAG repeat, Juvenile HD) and an EHR, connected through a Referent Tracking Database holding statements such as:
#IUI-1 ‘affects’ #IUI-2
#IUI-3 ‘affects’ #IUI-2
#IUI-1 ‘causes’ #IUI-3
...]
How to build an ontology from a
terminology?
Steps in ontology building
1. For all terms identified in the terminology, find the
entities in reality that are directly denoted;
2. Determine the top categories these entities belong to;
3. Determine for any dependent entity:
• If process: the continuants that participate in it
• If dependent continuant: the continuant upon which it depends
4. For any entity determined in step 3, go to step 2.
Rudnicki R, Ceusters W, Manzoor S, Smith B. What Particulars are Referred to in EHR Data? A Case
Study in Integrating Referent Tracking into an Electronic Health Record Application. In Teich JM,
Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium
Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago
IL, 2007;:630-634.
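The four steps on this slide amount to a worklist procedure: start from the entities the terms denote, categorize each one, and queue whatever a dependent entity depends on. The Python sketch below assumes that the human analysis (what a term denotes, which category an entity falls under, what it depends on) is already available as lookup tables. The table contents and the function name `build_ontology` are hypothetical.

```python
from collections import deque

# Hypothetical results of the manual analysis.
DENOTES = {  # step 1: term -> entities directly denoted
    "structure change": ["structure change"],
    "structure integrity": ["structure integrity"],
}
CATEGORY = {  # step 2: entity -> top category
    "structure change": "process",
    "structure integrity": "dependent continuant",
    "anatomical structure": "independent continuant",
}
DEPENDS_ON = {  # step 3: participants of a process, bearer of a dependent continuant
    "structure change": ["anatomical structure"],
    "structure integrity": ["anatomical structure"],
}


def build_ontology(terms):
    """Return {entity: category} for all entities reachable from the terms."""
    ontology = {}
    queue = deque(entity for term in terms for entity in DENOTES[term])  # step 1
    while queue:
        entity = queue.popleft()
        if entity in ontology:
            continue  # already categorized
        category = CATEGORY[entity]  # step 2
        ontology[entity] = category
        if category in ("process", "dependent continuant"):  # step 3
            queue.extend(DEPENDS_ON.get(entity, []))  # step 4: back to step 2
    return ontology


result = build_ontology(["structure change", "structure integrity"])
print(sorted(result))
# ['anatomical structure', 'structure change', 'structure integrity']
```

The entity `anatomical structure` is never named by a term, yet it ends up in the result, which is the point of steps 3 and 4.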
Building the Ontology underlying a terminology (MDS)
[Diagram: MDS terms (MDS1 to MDS6, …) are mapped to the universals of the MDS Ontology (U1 to U14, U16, U17), which are linked by class-relations and placed under BFO]
Adding another terminology
[Diagram: the MDS terms (MDS1 to MDS6, …) and the universals they map to (U1 to U14, U16, U17, U…) are now part of the OPO Ontology (MDS + CARE + …) under BFO]
Adding another terminology
[Diagram: CARE terms are mapped into the same OPO Ontology (MDS + CARE + …) under BFO, next to the MDS terms (MDS1 to MDS6, …); the set of universals now also includes U15]
How to link to patient data ?
Semantic integration of data expressed in
distinct terminologies
• Purpose:
– Better comparability
– Statistical validation of the ontology
• Explanation of observed correlations between assessment data elements
• Finding patient subpopulations exhibiting correlations which are near-significant without the ontology, but significant with the ontology
• Two level integration:
– Type level : poor man’s linkage
– Particular level: rich man’s linkage
‘Poor man’s’ data linkage
[Diagram: patient data for pt3 and pt4, expressed in MDS terms (MDS1 to MDS6, …), are linked through those terms to the universals (U1 to U17, U…) of the MDS Ontology]
Data linkage using multiple instruments
[Diagram: data for Patient 1, Patient 2 and Patient 3, recorded with CARE terms and MDS terms (MDS1 to MDS6, …) and marked X, are linked through the universals (U1 to U17, U…) of the OPO Ontology (MDS + CARE + …) under BFO]
Problems with this level
• Exclusive focus on universals, ignoring that in
data collection (almost) everything is about
particulars.
• Therefore Referent Tracking must be brought into the picture.
Referent Tracking solves this problem:
• It is true that:
– (1) ‘All Americans have one mother’
– (2) ‘All Americans have one president’
• But:
– (1) ‘all Americans have a distinct mother’
– (2) ‘all Americans have a (numerically) identical
president’
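The contrast on this slide shows up as soon as the relations are recorded between identified particulars instead of between types. A minimal Python sketch follows. The IUI values and relation names are made up for the illustration.

```python
# Hypothetical instance-level statements: (person IUI, relation, other IUI).
STATEMENTS = [
    ("IUI-1", "has-mother", "IUI-11"),
    ("IUI-2", "has-mother", "IUI-12"),
    ("IUI-3", "has-mother", "IUI-13"),
    ("IUI-1", "has-president", "IUI-99"),
    ("IUI-2", "has-president", "IUI-99"),
    ("IUI-3", "has-president", "IUI-99"),
]


def everyone_has_one(relation: str) -> bool:
    """Type-level view: does each person have exactly one such relatum?"""
    people = {subject for subject, _, _ in STATEMENTS}
    return all(
        sum(1 for s, rel, _ in STATEMENTS if s == person and rel == relation) == 1
        for person in people
    )


def referents(relation: str) -> set:
    """Instance-level view: which particulars are actually referred to?"""
    return {obj for _, rel, obj in STATEMENTS if rel == relation}


print(everyone_has_one("has-mother"), len(referents("has-mother")))        # True 3
print(everyone_has_one("has-president"), len(referents("has-president")))  # True 1
```

Both relations look identical at the type level. Only the identifiers reveal that three mothers but a single president are involved.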
From ‘poor man’s’ to ‘rich man’s’ data linkage
[Diagram: patient data for pt3 and pt4, expressed in MDS terms (MDS1 to MDS6, …), linked to the universals (U1 to U17) of the MDS Ontology; the link between terms and universals is labelled ‘formula’]
Rich man’s data linkage: focus on particulars
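[Diagram: for the terms MDS3 and MDS4 and the universals U6 and U11, the particulars IUI-1 to IUI-5 are shown with their instance-of links to the universals and their particular relations to the patients pt3 and pt4]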
Many more combinations possible
[Diagram, two panels, each relating universals U6 and U11, particulars IUI-1 to IUI-5 and patients pt3 and pt4]
• The terms used in MDS4 denote distinct particulars related to both patients
• One of the terms used in MDS4 denotes the same particular for both patients
What has worked ?
How have disparate views been
accommodated?
Definitions for ‘Adverse Event’
D4 (BRIDG): an observation of a change in the state of a subject assessed as being untoward by one or more interested parties within the context of a protocol-driven research or public health.
D5 (IOM): an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient
D6 (NCI): any unfavorable and unintended sign (including an abnormal laboratory finding), symptom, or disease temporally associated with the use of a medical treatment or procedure that may or may not be considered related to the medical treatment or procedure
D7 (CDISC): any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have to have a causal relationship with this treatment
D8 (JTC): an untoward, undesirable, and usually unanticipated event, such as death of a patient, an employee, or a visitor in a health care organization. Incidents such as patient falls or improper administration of medications are also considered adverse events even if there is no permanent effect on the patient.
D9 (QUIC): an injury that was caused by medical management and that results in measurable disability.
At least one argument
• There is no entity which would be such that, were it placed before
these authors, they would each in turn be able to point to it and
respectively say – faithfully and honestly –
– “that is an observation” (definition D4),
– “that is an injury” (definition D9),
– “that is a laboratory finding” (definition D6).
• Clearly,
– nothing which is an injury can be a laboratory finding, although, of course,
laboratory findings can aid in diagnosing an injury or in monitoring its
evolution.
– nothing which is a laboratory finding, can be an observation, although, of
course, some observation must have been made if we are to arrive at a
laboratory finding.
Hypothesis
• Because …
– all the authors of the mentioned definitions use the term
‘adverse event’ in some context for a variety of distinct entities,
and
– these contexts look quite similar
• in each of them, more or less the same sort of entities seem to be involved
• … there is some common ground (some portion of
reality) which is such that the entities within it can be
used as referents for the various meanings of ‘adverse
event’.
Why does this matter ?
• Be precise about what representational units in
either an ontology or data repository stand for.
• Each such unit in an ontology should come with
additional information on whether it denotes:
– an entity at level 1, level 2 or level 3
and
– a universal, or a defined or composite class
Examples from our adverse event domain ontology
Denotation | Class Type | Particular Type | Description (role in adverse event scenario)
Level 1
C1 subject of care | DC | independent continuant | person to whom harm might have been done through an act under scrutiny
C2 act under scrutiny | DC | act of care | act of care that might have caused harm to the subject of care
C7 structure change | U | process | change in an anatomical structure of a person
C8 structure integrity | U | dependent continuant | aspect of an anatomical structure deviation from which would bring it about that the anatomical structure would either (1) itself become dysfunctional or (2) cause dysfunction in another anatomical structure
C12 subject investigation | DC | process | looking for a structure change in the subject of care
Level 2
C15 observation | DC | dependent continuant | cognitive representation of a structure change resulting from an act of perception within a subject investigation
C16 harm diagnosis | DC | dependent continuant | cognitive representation, resulting from a harm assessment, and involving an assertion to the effect that a structure change is or is not a harm
Level 3
C18 care reference | DC | information entity | concretized (through text, diagram, …) piece of knowledge drawn from state of the art principles that can be used to support the appropriateness of (or correctness with which) processes are performed involving a subject of care
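The annotation scheme in the table above can be sketched as a small data structure. This is only an illustrative sketch, not the format used by the authors: the names `Level`, `ClassType`, `Unit`, `codes_at` and `codes_of_type` are hypothetical, and reading "U" as universal and "DC" as defined class is an assumption based on the preceding slide.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    # The three levels named on the slides; member names are hypothetical.
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3


class ClassType(Enum):
    # Abbreviations copied from the table. Assumed reading:
    # U = universal, DC = defined class.
    U = "U"
    DC = "DC"


@dataclass(frozen=True)
class Unit:
    """One representational unit with the extra information the slide asks for."""
    code: str
    name: str
    level: Level
    class_type: ClassType
    particular_type: str


# Rows transcribed from the table (descriptions omitted for brevity).
UNITS = [
    Unit("C1", "subject of care", Level.LEVEL_1, ClassType.DC, "independent continuant"),
    Unit("C2", "act under scrutiny", Level.LEVEL_1, ClassType.DC, "act of care"),
    Unit("C7", "structure change", Level.LEVEL_1, ClassType.U, "process"),
    Unit("C8", "structure integrity", Level.LEVEL_1, ClassType.U, "dependent continuant"),
    Unit("C12", "subject investigation", Level.LEVEL_1, ClassType.DC, "process"),
    Unit("C15", "observation", Level.LEVEL_2, ClassType.DC, "dependent continuant"),
    Unit("C16", "harm diagnosis", Level.LEVEL_2, ClassType.DC, "dependent continuant"),
    Unit("C18", "care reference", Level.LEVEL_3, ClassType.DC, "information entity"),
]


def codes_at(level: Level) -> list[str]:
    """Return the codes of all units annotated with the given level."""
    return [u.code for u in UNITS if u.level is level]


def codes_of_type(class_type: ClassType) -> list[str]:
    """Return the codes of all units annotated with the given class type."""
    return [u.code for u in UNITS if u.class_type is class_type]
```

With the level and class type stored on every unit, a question such as "which units denote level 2 entities?" becomes a simple filter, e.g. `codes_at(Level.LEVEL_2)`.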
Representing particular cases
• Is the generic representation of the portion of
reality adequate enough for the description of
particular cases?
• Example: a patient
– born at time t0
– undergoing anti-inflammatory treatment and
physiotherapy since t2
– for an arthrosis present since t1
– develops a stomach ulcer at t3.
Anti-inflammatory treatment with ulcer development
IUI | Particular description | Properties
#1 | the patient who is treated | #1 member C1 since t2
#2 | #1’s treatment | #2 instance_of C3; #2 has_participant #1 since t2; #2 has_agent #3 since t2
#3 | the physician responsible for #2 | #3 member C4 since t2
#4 | #1’s arthrosis | #4 member C5 since t1
#5 | #1’s anti-inflammatory treatment | #5 part_of #2; #5 member C2 since t3
#6 | #1’s physiotherapy | #6 part_of #2
#7 | #1’s stomach | #7 member C6 since t2
#8 | #7’s structure integrity | #8 instance_of C8 since t0; #8 inheres_in #7 since t0
#9 | #1’s stomach ulcer | #9 part_of #7 since t3
#10 | coming into existence of #9 | #10 has_participant #9 at t3
#11 | change brought about by #9 | #11 has_agent #9 since t3; #11 instance_of C10 at t3; #11 has_participant #8 since t3
#12 | noticing the presence of #9 | #12 has_participant #9 at t3+x; #12 has_agent #3 at t3+x
#13 | cognitive representation in #3 about #9 | #13 is_about #9 since t3+x
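The properties listed above are, in effect, time-indexed assertions linking one IUI to another IUI or to a class code. A minimal sketch of how they could be held and queried follows. The `Assertion` tuple and the helpers `about` and `classified_as` are hypothetical names introduced only for illustration, not part of any referent tracking system described in the talk. The temporal part is kept as the literal text from the slide because t0 … t3 are symbolic.

```python
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    subject: str             # IUI of the particular the assertion is made of
    relation: str            # relation name as written on the slide
    target: str              # another IUI ("#7") or a class code ("C8")
    temporal: Optional[str]  # e.g. "since t2"; None where the slide gives none


# Transcribed from the Properties column of the table.
ASSERTIONS = [
    Assertion("#1", "member", "C1", "since t2"),
    Assertion("#2", "instance_of", "C3", None),
    Assertion("#2", "has_participant", "#1", "since t2"),
    Assertion("#2", "has_agent", "#3", "since t2"),
    Assertion("#3", "member", "C4", "since t2"),
    Assertion("#4", "member", "C5", "since t1"),
    Assertion("#5", "part_of", "#2", None),
    Assertion("#5", "member", "C2", "since t3"),
    Assertion("#6", "part_of", "#2", None),
    Assertion("#7", "member", "C6", "since t2"),
    Assertion("#8", "instance_of", "C8", "since t0"),
    Assertion("#8", "inheres_in", "#7", "since t0"),
    Assertion("#9", "part_of", "#7", "since t3"),
    Assertion("#10", "has_participant", "#9", "at t3"),
    Assertion("#11", "has_agent", "#9", "since t3"),
    Assertion("#11", "instance_of", "C10", "at t3"),
    Assertion("#11", "has_participant", "#8", "since t3"),
    Assertion("#12", "has_participant", "#9", "at t3+x"),
    Assertion("#12", "has_agent", "#3", "at t3+x"),
    Assertion("#13", "is_about", "#9", "since t3+x"),
]


def about(iui: str) -> list[Assertion]:
    """All assertions in which the given particular appears as subject or target."""
    return [a for a in ASSERTIONS if iui in (a.subject, a.target)]


def classified_as(code: str) -> list[str]:
    """IUIs asserted to be a member or an instance of the given class code."""
    return sorted(
        {a.subject for a in ASSERTIONS
         if a.relation in ("member", "instance_of") and a.target == code}
    )
```

For example, `classified_as("C2")` returns `["#5"]`: the anti-inflammatory treatment is the particular that becomes an act under scrutiny, and only since t3, when the ulcer appears.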
Advantage 1: reduce ambiguity in definitions
• E.g. ‘adverse drug reaction: an undesirable response associated
with use of a drug that either compromises therapeutic efficacy,
enhances toxicity, or both.’ (Joint Technical Committee)
– May denote something on level 1, e.g. a realizable entity which exists
objectively as an increased health risk; in this sense any event ‘that either
compromises therapeutic efficacy, enhances toxicity, or both’ is
undesirable;
– May denote something on level 2, so that, amongst all of those events
which influence therapeutic efficacy or toxicity, only some are considered
undesirable (for whatever reason) by either the patient, the caregiver or
both; or
– May denote something relating to level 3, so a particular event occurring
on level 1 is undesirable only when it is an instance of a type of event that
is listed in some guideline, good practice management handbook, i.e. in
some published statement of the state of the art in relevant matters.
Advantage 2: reveal hidden assumptions
• E.g.: ‘adverse event: an event that results in unintended
harm to the patient by an act of commission or omission
rather than by the underlying disease or condition of the
patient’ (IOM)
• But:
– Under the realist agenda, an ‘act of omission’ is not an entity that
exists at level 1, but rather a level 3 entity denoting a
configuration in which what good practice requires to be done
was not done,
– Something that does not exist at level 1 cannot cause harm by itself,
– Thus the harm must be caused by the underlying disease.
Conclusion
• Health data management involves many actors and IT
systems: semantic interoperability is thus a key issue.
• Ontologies (of the right sort) provide a deep level of
semantic interoperability between IT systems, thereby
keeping track:
– of what is the case;
– of what is known by some actor(s);
– of what has been and still needs to be done.
• Realism-based ontology, as a discipline, helps in creating
ontologies of the right sort.