Werner Ceusters - Buffalo Ontology Site


R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
Clinical Trial Ontology Meeting
How to build an Ontology ?
Some basic principles
NIH, May 16-17, 2007
Werner CEUSTERS, MD
Center of Excellence in Bioinformatics and Life Sciences
Department of Psychiatry, University at Buffalo, NY, USA
http://www.org.buffalo.edu/RTU
Short personal history
[Timeline figure; years shown, in extraction order: 1977, 1959 - 2006, 1989, 2004, 1992, 2002, 1998]
Mainstream interpretations of “ontology”
• An explicit specification of an agreed upon conceptualization of a domain
– Tom Gruber
• Anything that is given the name ‘ontology’ and that can be described in terms of 6 axes: expressiveness, structure, intended use, granularity, automated reasoning, prescriptive/descriptive
– Ontology Summit 2007
Problems with mainstream ontologies
• Based upon the confusing notion of “concept”
– Unit of thought or knowledge concerning anything
perceivable or conceivable
– The meaning of a term
–…
• Confuse information representation with domain representation
– “Information about X part_of information about Y” versus “X part of Y”
What I mean by the word “ontology”
• A representation of some pre-existing domain of reality (a
portion of reality) which
1. reflects the properties of the entities within its
domain in such a way that there obtains a systematic
correlation between reality and the representation
itself,
2. is intelligible to a domain expert
3. is formalized in a way that allows it to support
automatic information processing
Three levels of reality
1. The world exists ‘as it is’ prior to a cognitive
agent’s perception thereof;
2. Cognitive agents build up ‘in their minds’
cognitive representations of the world;
3. To make these representations publicly
accessible in some enduring fashion, they create
representational artifacts that are fixed in some
medium.
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and
Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
Represent what exists and is relevant
[Figure; labels: R, 1st level reality, B, B1, cognitive representation, concretization, O, RU1, RU1O1]
Some characteristics of representational units
1. each unit is assumed by the creators of the representation to be veridical, i.e. to conform to some relevant portion of reality (POR) as conceived on the best current scientific understanding;
2. several units may correspond to the same POR
by presenting different though still veridical
views or perspectives;
3. what is to be represented by the units in a
representation depends on the purposes which
the representation is designed to serve.
Some characteristics of an optimal ontology
• Each representational unit in such an ontology
would designate
– (1) a single portion of reality (POR), which is
– (2) relevant to the purposes of the ontology and such
that
– (3) the authors of the ontology intended to use this unit
to designate this POR, and
– (4) there would be no PORs objectively relevant to
these purposes that are not referred to in the ontology.
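Conditions (1), (2) and (4) can be read as mechanical checks over a mapping from representational units to PORs. The following is only a minimal sketch under that assumption: `audit_units`, the toy units and the toy PORs are hypothetical names introduced for illustration, not part of the talk. Condition (3), the intent of the authors, is not something code can verify.

```python
def audit_units(designates, relevant_pors):
    """Check a toy ontology against conditions (1), (2) and (4).

    designates: dict mapping each representational unit to the set of
        PORs it designates (hypothetical toy encoding).
    relevant_pors: set of PORs relevant to the purposes of the ontology.
    """
    # (1) each unit should designate exactly one POR
    ambiguous = {u for u, pors in designates.items() if len(pors) != 1}
    # (2) every designated POR should be relevant to the purposes
    irrelevant = {u for u, pors in designates.items() if not pors <= relevant_pors}
    # (4) no relevant POR should be left without a unit
    covered = set().union(*designates.values())
    missing = relevant_pors - covered
    return {"ambiguous": ambiguous, "irrelevant": irrelevant, "missing": missing}


report = audit_units(
    designates={
        "Person": {"human being"},
        "Gender": {"sex", "gender role"},  # one unit, two PORs
        "Weather": {"weather"},            # outside the stated purposes
    },
    relevant_pors={"human being", "sex", "gender role", "adverse event"},
)
print(report)
```

An optimal ontology in the sense of this slide would yield three empty sets.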
Three types of ontologies
• Upper level ontologies:
– (should) describe the most generic structure of reality
• Domain ontologies:
– (should) describe the portion of reality that is dealt
with in some domain
– Special case: reference ontologies
• Application ontologies:
– To be used in a specific context and to support some
specific application
Clinical trial ontologies
• As domain ontologies:
– Cover all entity types relevant in the clinical trial
domain
• As application ontologies:
– A subset of the above which is large enough to support all functions the application has to serve:
• CT protocol development
• Study management
• Data analysis
• …
Key question
How to build an optimal clinical trial
domain ontology ?
Rule 1:
Analyze the domain
Rule 2a:
Try to be lazy:
re-use what others have done.
The BRIDG (domain analysis) model
• NOT an ontology
• A computable clinical trials protocol
representation
– that supports the entire life-cycle of clinical trial
protocols, and
– that will serve as a foundation for caBIG modules
• that support all phases of the clinical trials life cycle,
(including protocol authoring) and
• be developed to meet user needs and requirements.
The BRIDG Project: Creating a model of the semantics of clinical trials research. Douglas B.
Fridsma. July 26, 2006
Reasons for selecting BRIDG
• BRIDG tries to solve an important problem
• Does not completely ignore reality as many other
initiatives do:
(although one has to search hard to find evidence and sometimes it looks as if some
contributors observed reality from outer space)
– If the tools and models don’t work with reality, it is
probably the tools and the models that need to change
• The BRIDG Project: Creating a model of the semantics of clinical trials research. Douglas B. Fridsma.
July 26, 2006
• Intended to become the next best thing on earth
(after HL7, I assume)
http://www.bridgproject.org/status.html
BRIDG_Model_V1_49
BRIDG model organization
Image from: The BRIDG Project: Creating a model of the semantics of clinical trials research.
Douglas B. Fridsma. July 26, 2006
Rule 2b:
Try to be lazy: re-use what others have done,
But… remain critical at all times!
Being critical ≠ being negative
RFQ-NCI-60001-NG: Review of NCI Thesaurus and
Development of Plan to Achieve OBO-Compliance
Grant to Apelon (H. Solbrig) to improve NCIT
Rule 3:
Don’t have a blind trust in the power
of representation and modeling
languages, and certainly not in UML
‘Death by UML Fever’
• It is important to emphasize that UML itself is not the direct cause
of any maladies described herein.
• Instead, UML is largely an innocent victim caught in the midst of
poor process, no process, or sheer incompetence of its users.
• UML sometimes does amplify the symptoms of some fevers as the
result of the often divine-like aura attached to it.
• For example, it is not uncommon for people to believe that no
matter what task they may be engaged in, mere usage of UML
somehow legitimizes their efforts or guarantees the value of the
artifacts produced.
Alex E. Bell. Death by UML Fever. Queue 2(1), March 2004, ACM Press, 72 – 80, 2004
Who would not be impressed ?
• Fig. 10: BRIDG Comprehensive Class and attribute diagram - (Logical diagram), p99
I’m not !
• I have come to appreciate domain modeling in
UML as an implementation-independent approach
which is more likely to uncover “the truth” about
the underlying semantics.
– Dr. Diane Wold. Modeling Trial Design with BRIDG. July 26, 2006
• The UML diagram helped us to keep separate an
activity, which exists independent of any schedule,
and an activity-at-a-visit, (the X), which is a
plan to perform that activity at a particular time.
Rule 4:
Limit the number of
developers/contributors
Contributors to the BRIDG model
A chain is as strong as its weakest link
Image from: The BRIDG Project: Creating a model of the semantics of clinical trials research.
Douglas B. Fridsma. July 26, 2006
Rule 5:
• Be consistent in what you describe:
– either representational units, or
– the entities represented by them.
• Thus: keep the levels of reality in mind at all times
LivingSubject (BRIDG logical model p1031)
• Type: Class
• Status: . Version . Phase .
• Package: Entities and Roles Keywords:
• Detail: Created on 02/09/2006. Last modified on 02/09/2006.
• GUID: {7C04F8D8-30B9-4942-B2A8-4CF93E8913D9}
• An object representing an organism or complex animal,
alive or not. Examples: person, dog, microorganism, plant
of any taxonomic group, tissue sample, bacteria, fungi,
and viruses.
SubstanceAdministration (BRIDG logical model p84)
• Type: Class PerformedActivity
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: CTOM Elements Keywords:
• Detail: Created on 01/05/2005. Last modified on 12/14/2006.
• GUID: {2289C0E8-855D-42e3-86FA2ECBE59D8982}
• The description of applying, dispensing or giving
agents or medications to subjects.
Person (BRIDG logical model p106 a.f., HE!)
• Type: Class
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Clinical Research Entities Keywords:
• Detail: Created on 06/09/2005. Last modified on 01/13/2007.
• GUID: {6F49F110-7B36-4c03-A7EAF456CE1E739D}
• A human being.
Some Person Attributes
• administrativeGenderCode (p107)
– The classification of the sex or gender role of the
patient. Values include: Female, Male, and Unknown.
• genderCode (p108)
– The text that describes the assemblage of physical
properties or qualities by which male is distinguished
from female; the physical difference between male and
female within a person. [Explanatory Comment:
Identification of sex is usually based upon self-report
and may come from a form, questionnaire, interview,
etc.]
A better example:
Clinical Trial Ontology under DOLCE
Crenguta Bogdan, Daniela Luzi, Fabrizio L. Ricci, Luca D. Serbanati. Towards a Clinical Trial
Ontology using a Concern-Oriented Approach. W.P. n. 10, October 2006.
Rule 6:
Use a Realism-based Upper Ontology
to classify the representational units
in your Domain Ontology
Realism in Basic Formal Ontology (BFO)
• The world consists of
– entities that are
• Either particulars or universals;
• Either occurrents or continuants;
• Either dependent or independent;
and,
– relationships between these entities of the form
• <particular , universal> e.g. is-instance-of,
• <particular , particular> e.g. is-member-of
• <universal , universal> e.g. isa (is-subtype-of)
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
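The three relationship forms can be treated as type signatures that a tool checks before accepting an assertion. This is a minimal sketch, not BFO tooling: the names `Kind`, `Entity`, `SIGNATURES` and `relate`, and the example entities, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    PARTICULAR = "particular"
    UNIVERSAL = "universal"


@dataclass(frozen=True)
class Entity:
    name: str
    kind: Kind


# Relation name -> required (subject kind, object kind), as listed on the slide.
SIGNATURES = {
    "is-instance-of": (Kind.PARTICULAR, Kind.UNIVERSAL),
    "is-member-of": (Kind.PARTICULAR, Kind.PARTICULAR),
    "isa": (Kind.UNIVERSAL, Kind.UNIVERSAL),
}


def relate(subject, relation, obj):
    """Return the assertion as a triple, or raise if the kinds do not fit."""
    expected = SIGNATURES[relation]
    actual = (subject.kind, obj.kind)
    if actual != expected:
        raise TypeError(
            f"{relation} expects {expected[0].value}/{expected[1].value}, "
            f"got {actual[0].value}/{actual[1].value}"
        )
    return (subject.name, relation, obj.name)


john = Entity("John Smith", Kind.PARTICULAR)
person = Entity("Person", Kind.UNIVERSAL)
organism = Entity("Organism", Kind.UNIVERSAL)

print(relate(john, "is-instance-of", person))
print(relate(person, "isa", organism))
```

Asserting `relate(person, "is-instance-of", organism)` raises a `TypeError`, because a universal is a subtype of another universal, not an instance of it.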
Only what exists (or existed) can be represented
• Anything else can be imagined
• Examples of what exists:
– Body parts
– Disorders
– Abortions
– Women with prevented abortions
– Plans about my future activities
• What does not exist
– Prevented abortions
– My future activities
PlannedActivity (BRIDG logical model p202, HE!)
Rule 7:
• Use formal ontological methods to:
– distinguish distinct entities
– assess in what way distinct entities are distinct
Organism (BRIDG logical model p160, HE!)
• Type: Class
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Clinical Research Roles Keywords:
• Detail: Created on 12/13/2006. Last modified on 01/19/2007.
• GUID: {B9F321DB-365F-4155-B8F6-3D….
• The role that a biological entity has, and that role
participates in a microbiology test in two ways: first, it
can be identified as the result of a microbiology test. It
can also participate as a specimen in the microbiology
test. [HL7 Perspective]
An example: ONTOCLEAN
• Identity, essence, unity, dependence
C. Welty, N. Guarino. "Supporting ontological analysis of taxonomic relationships", Data and Knowledge Engineering vol. 39, no. 1, pp. 51-74, 2001
Rule 8:
• Don’t confuse reality with our means to access that reality, for instance:
• Don’t confuse the observation of an entity with the entity observed
AdverseEvent (BRIDG logical model p168, HE!)
• Type: Class Assessment
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Clinical Research Activities Keywords:
• Detail: Created on 05/24/2006. Last modified on 01/26/2007.
• GUID: {CD620136-3CB9-4382-802B-F6CA82F98C10}
• An observation of a change in the state of a subject that is
assessed as being untoward by one or more interested
parties within the context of protocol-driven research or
public health.
Example: medical ‘findings’ and ‘observations’ (1)
• A particular pathological entity may at a certain time be undetectable by any observation method or technique available to an observer, including the person exhibiting the pathological entity itself.
• A particular observation (‘act of looking’) may produce false results and thus simulate the existence of a pathological entity.
• An observer may observe or fail to observe a detectable particular pathological entity.
On ‘findings’ and ‘observations’ (2)
• When an observer perceives a particular pathological
entity, he might judge it
– (1) to be an instance of the universal of which it is indeed an
instance in reality,
– (2) to be an instance of another universal (and thus be in error),
or
– (3) he might not be able to make an association with any universal at all.
• Distinct manifestations of ‘the same type’ may be
pathological or not:
– Singing naked under the shower versus in front of The White
House
• ...
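One way to keep the observation apart from the entity observed is to give each its own record and to compare them explicitly. The sketch below assumes a toy encoding: `PathologicalEntity`, `Observation`, `assess` and the outcome labels are hypothetical names chosen for illustration, not terms from BRIDG or from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathologicalEntity:
    ident: str
    actual_universal: str  # what the entity is an instance of in reality


@dataclass(frozen=True)
class Observation:
    observer: str
    about: Optional[PathologicalEntity]  # None: no entity is really there
    judged_universal: Optional[str]      # None: no association was made


def assess(obs):
    """Compare what the observer judged with what is the case in reality."""
    if obs.about is None:
        # The act of looking produced a result although nothing exists.
        return "false result" if obs.judged_universal else "nothing observed"
    if obs.judged_universal is None:
        return "no association"   # case (3)
    if obs.judged_universal == obs.about.actual_universal:
        return "correct"          # case (1)
    return "error"                # case (2)


lesion = PathologicalEntity("#1", "fracture")
print(assess(Observation("Dr. A", lesion, "fracture")))
print(assess(Observation("Dr. B", lesion, "sprain")))
print(assess(Observation("Dr. C", None, "fracture")))
```

An entity that exists but is never observed simply has no `Observation` record, which is the situation a model that defines the entity as an observation cannot express.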
Rule 9:
Do not accept silly suggestions,
whomever they come from
Device (BRIDG logical model, p100, HE!)
• Type: Class Material
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Clinical Research Entities Keywords:
• Detail: Created on 02/22/2006. Last modified on 01/04/2007.
• GUID: {3546A977-C51F-4860-A09A2ADAE896D74B}
• <PROPOSED> A therapeutic or diagnostic intervention
utilizing a piece of equipment or a mechanism designed to
serve a special purpose or perform a special function
whose basic characteristics are not altered in the course of
the intervention.
The latter could also go under other rules:
• Stop working when you are tired
• Be careful with cut and paste
• Proof-read your work
• …
Rule 10:
Use distinct names for distinct
representational units that denote
distinct entities
AdverseEvent (BRIDG logical model p504)
• Type: Class HealthProblem
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Adverse Event Keywords:
• Detail: Created on 05/01/2006. Last modified on 05/02/2006.
• GUID: {6783F6F2-8837-4b7d-B81BA25206D36689}
• A toxic reaction to a medical therapy, or to an
experience such as consuming a meal.
AdverseEvent (BRIDG logical model p91)
• Type: Class
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: SDTM Keywords:
• Detail: Created on 12/14/2005. Last modified on 12/28/2006.
• GUID: {F1786F01-F973-426d-B765-0107B5823A18}
• Any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have a causal relationship with this treatment. An adverse event (AE) can therefore be any unintended sign (including an abnormal laboratory finding), symptom, or disease temporally associated with the use of a medicinal (investigational) product, whether or not related to the medicinal (investigational) product.
AdverseEvent (BRIDG logical model p36)
• Type: Class Assessment
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: CTOM (imported package) Keywords:
• Detail: Created on 01/05/2005. Last modified on 09/26/2005.
• GUID: {C0F30FE6-EE1E-443e-A7AB-256342B193B3}
• An unfavorable and unintended reaction, symptom, syndrome, or disease encountered by a subject while on a clinical trial regardless of whether or not it is considered related to the product or procedure. The concept refers to assessments that could be medically related, dose related, route related, patient related, caused by an interaction with another therapy or procedure, or dose escalation.
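Violations of Rule 10 such as the AdverseEvent classes quoted in this talk can be found mechanically by grouping class entries by name. The following is a rough sketch: the function name `find_name_collisions` and the tuple layout are assumptions, and the definitions are shortened from the slides.

```python
from collections import defaultdict

# (class name, package, page in the BRIDG logical model, shortened definition)
ENTRIES = [
    ("AdverseEvent", "Clinical Research Activities", 168,
     "An observation of a change in the state of a subject"),
    ("AdverseEvent", "Adverse Event", 504,
     "A toxic reaction to a medical therapy"),
    ("AdverseEvent", "SDTM", 91,
     "Any untoward medical occurrence in a patient"),
    ("AdverseEvent", "CTOM (imported package)", 36,
     "An unfavorable and unintended reaction, symptom, syndrome, or disease"),
    ("Person", "Clinical Research Entities", 106, "A human being"),
]


def find_name_collisions(entries):
    """Group entries by class name; keep names that carry several definitions."""
    by_name = defaultdict(list)
    for name, package, page, definition in entries:
        by_name[name].append((package, page, definition))
    return {
        name: uses
        for name, uses in by_name.items()
        if len({definition for _, _, definition in uses}) > 1
    }


collisions = find_name_collisions(ENTRIES)
for name, uses in collisions.items():
    print(f"{name} is defined {len(uses)} times:")
    for package, page, definition in uses:
        print(f"  p{page} [{package}] {definition}")
```

A check like this only flags candidates; deciding whether the definitions denote distinct entities still takes ontological analysis.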
Rule 11:
Avoid contradictions
ObjectiveResult (BRIDG logical model p191, HE!)
• Type: Class InvestigativeResult, Observation
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Clinical Research Activities Keywords:
• Detail: Created on 01/20/2005. Last modified on 12/28/2006.
• GUID: {F388CFB0-77DE-4008-B222-EB…
• An act of monitoring, recognizing and noting
reproducible measurement of some magnitude with
suitable instruments or established scientific processes.
• <EXAMPLE> A laboratory test with standardized
instruments, ECG measurement or question on a validated
questionnaire such as SF36.
Some attributes of ObjectiveResult
• missedIndicator boolean
– This is an indicator flag that flags a performed observation as
"not done". (default: CDISC) …… p193
• missedReason
– This captures SDTM's ---REASND. In HL7, there is a list of
permissible missing value types, and we need to ensure that
HL7's list is a superset of what is needed by SDTM.
– <EXAMPLE> A planned observation was not done because the
equipment failed, so the corresponding "performed observation"
exists as a placeholder to describe why that performed
observation was not done…….……p193
– Default: [CDISC SDTM IG v3.1.1 = REASND ]
Rule 12:
Avoid circular definitions
Ingredient (BRIDG logical model p507)
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: Adverse Event Keywords:
• Detail: Created on 03/01/2006. Last modified on
03/01/2006.
• GUID: {7D53B2A1-CEC4-49ae-8BD6611E2CF4D862}
• A substance that acts as an ingredient within a
product. Note, that ingredients may also have
ingredients.
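A first, deliberately naive screen for circular definitions is to test whether a definition uses the very term it defines. The helper `is_circular` below is a hypothetical illustration; it catches only direct circularity on single-word names, not cycles that run through several definitions.

```python
import re


def is_circular(term, definition):
    """Return True if the definition uses the term it is defining.

    Matching is case-insensitive and tolerates a plural "s".
    """
    pattern = r"\b" + re.escape(term.lower()) + r"s?\b"
    return re.search(pattern, definition.lower()) is not None


print(is_circular("Ingredient",
                  "A substance that acts as an ingredient within a product."))
print(is_circular("Person", "A human being."))
```

The first call reports the Ingredient definition from this slide as circular; the second shows a definition that passes.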
Rule 13:
Do not use names with a precise meaning
in general language to designate entities
which are of a more specific or totally
different type in the context of a specific
application
Animal (BRIDG logical model p526)
• Type: Class InvestigatedParty
• Status: Proposed. Version 1.0. Phase 1.0.
• Package: InvestigatedSubject Keywords:
• Detail: Created on 03/10/2006. Last modified on 03/10/2006.
• GUID: {996CB91C-04EC-4b1d-9AFF-57B878D532D7}
• A non-person living entity which is chosen to be the subject of an investigation, or which is the subject of an implicated act.
Rule 14:
Provide a mechanism to let the ontology evolve in line with changes in reality and in our understanding thereof
Reality versus beliefs, both in evolution
[Figure; labels: t, U1, U2, Reality, p3, IUI-#3, O-#0, Belief, O-#2, O-#1]
Legend: [symbol] = “denotes” = what constitutes the meaning of representational units
…. Therefore: O-#0 is meaningless
Changes in reality, beliefs, representations
[Figure; labels: t, U1, U2, R, p3, IUI-#3, O-#0, B, O-#2, O-#1]
Relationships amongst universals (R) or beliefs therein (B)
Mistakes, discoveries, being lucky, having bad luck
[Figure shown over four successive slides; labels: t, U1, U2, R, p3, IUI-#3, O-#0, B, O-#2, O-#1]
Key requirement for versioning
Any change in an ontology or data repository should be associated with the reason for that change to be able to assess later what kind of mistake has been made!
Example: a person’s gender in the EHR
• In John Smith’s EHR:
– At t1: “male”; at t2: “female”
• What are the possibilities?
• Change in reality:
– transgender surgery
– change in legal self-identification
• Change in understanding: it was female from the very beginning but interpreted wrongly
• Correction of data entry mistake: it was understood as male, but wrongly transcribed
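The possibilities listed here suggest storing the reason alongside every change, so that a later reader can tell whether the earlier value was ever true. This is a minimal sketch of such a change record; `ChangeReason`, `Change` and `was_wrong_before` are hypothetical names, not part of any EHR standard.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeReason(Enum):
    REALITY_CHANGED = "change in reality"
    UNDERSTANDING_CHANGED = "change in understanding"
    ENTRY_CORRECTED = "correction of data entry mistake"


@dataclass(frozen=True)
class Change:
    time: str
    attribute: str
    old: str
    new: str
    reason: ChangeReason  # mandatory: no change without a stated reason


def was_wrong_before(change):
    """True if the old value misrepresented reality when it was recorded."""
    return change.reason is not ChangeReason.REALITY_CHANGED


surgery = Change("t2", "gender", "male", "female", ChangeReason.REALITY_CHANGED)
typo = Change("t2", "gender", "male", "female", ChangeReason.ENTRY_CORRECTED)

print(was_wrong_before(surgery))
print(was_wrong_before(typo))
```

Both records show the same values at t1 and t2; only the recorded reason tells a reader whether data collected before t2 should be reinterpreted.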
Conclusion
• Building high quality ontologies is hard.
• Experts in driving cars are not necessarily experts in car
mechanics (and the other way round).
– Good computer scientists are usually lousy ontologists
• Ontologies should represent the state of the art in a
domain, i.e. the science.
– Science is not a matter of consensus or democracy.
• Natural language relates more to how humans talk about
reality or perceive it, than to how reality is structured.
• No high quality ontology without the involvement of
ontologists.