Transcript Document

R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
The Role of Terminologies and
Ontologies in the Context of the
Electronic Health Record
Dagstuhl, May 23rd, 2006
Werner Ceusters, MD
Ontology Research Group
Center of Excellence in Bioinformatics & Life Sciences
SUNY at Buffalo, NY
Electronic Health Records
• ISO/TS 18308:2003
– Electronic Health Record (EHR):
• A repository of information regarding the health of a subject of care, in
computer processable form.
– EHR system:
• the set of components that form the mechanism by which electronic
health records are created, used, stored, and retrieved. It includes people,
data, rules and procedures, processing and storage devices, and
communication and support facilities.
• More common meaning of EHR system:
– only the “software being executed”
A replacement for this and that [images not preserved in this transcript]
Typical EHR screen [screenshot not preserved; source: www.comchart.com]
Current US GOV eHealth goals & strategies
• Goal 1: Inform Clinical Practice:
– S1. Provide incentives for EHR adoption.
– S2. Reduce risk of EHR investment.
– S3. Promote EHR diffusion in rural and underserved areas.
• Goal 2: Interconnect Clinicians.
– S1. Regional collaborations.
– S2. Develop a national health information network.
– S3. Coordinate federal health information systems.
• Goal 3: Personalize Care.
– S1. Encourage use of Personal Health Records.
– S2. Enhance informed consumer choice.
– S3. Promote use of telehealth systems.
• Goal 4: Improve Population Health.
– S1. Unify public health surveillance architectures.
– S2. Streamline quality and health status monitoring.
– S3. Accelerate research and dissemination of evidence.
Functions to be supported (HL7)
• Direct Care
– functions that enable hands-on delivery of health care and offer
clinical decision support.
• Care Support
– functions that are not used for direct care of patients, but assist
with the administrative, financial, research, public health, and
quality monitoring aspects of an EHR-S
• Information Infrastructure
– functions that provide the framework for proper operation of all
Direct Care and Supportive functions.
HL7 EHR System Functional Model. Draft May 2006
Direct Care Functions
• DC.1 Care Management
– ordering medications
– creating clinical documentation
• DC.2 Clinical Decision Support
– alerting the provider that immunizations are due or
drug interactions are indicated.
• DC.3 Operations Management and Communication
– ???
Care support functions
• S.1 Clinical Support
• S.2 Measurement, Analysis, Research and
Reports
• S.3 Administrative and Financial
– verifying insurance eligibility
– reporting encounter data to public health systems
Information Infrastructure Functions
• Information Infrastructure
– I.1 Security
– I.2 Health Record Information and Management
– I.3 Identity, Registry, & Directory Services
– I.4 Terminology Standards & Services
– I.5 Standards-based Interoperability
– I.6 Business Rules Management
– I.7 Workflow Management
‘Terminology’
1) The discipline of terminology management
– homonymous with terminology
– synonymous with terminology work (used in ISO 704)
2) The set of designations used in the special
language of a subject field, such as the
terminology of chemistry
– Used in both the singular and plural
– Used with an article in the singular: a terminology
Fundamental Activities of Terminology Work
• Identifying ‘concepts’ and ‘concept relations’;
– Analyzing and modeling concept systems on the basis of
identified concepts and concept relations;
– Establishing representations of concept systems through concept
diagrams;
– Crafting concept-oriented definitions;
– Attributing designations (predominantly terms) to each concept
in one or more languages; and,
– Recording and presenting terminological data, principally in
terminological entries stored in print and electronic media
(terminography).
This is not the right approach to ontology!
Reason for our rejection: The terminological View
• Objects
– perceived or conceived, concrete or abstract
– abstracted or conceptualized into concepts
• Concepts
– depict or correspond to a set of objects based on a defined set of
characteristics
– represented or expressed in language by designations or by definitions
– organized into concept systems
• Designations
– represented as terms, names (appellations) or symbols
– designate or represent a concept
– attributed to a concept by consensus within a special language
community
Terminology is a tool for dealing with language, not one for
representing reality.
Peirce, Ogden & Richards, …
[Diagram: semiotic triangle with three corners]
– Unit of Thinking (Concept) (Unit of Thought, Unit of Knowledge)
– Designation (Symbol, Sign, Term, Formula etc.)
– Referent (Concrete Object, Real Thing, Conceived Object)
[Annotations on the diagram: "~ Universal ???", "Universal", "Particular"]
Success of concept-based view in healthcare IT
Concept ‘dog’: Chien, Dog, Hond, Hund, …
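The concept-based view on this slide can be sketched as a small data structure: one language-independent concept, with one designation per language. This is only an illustrative sketch; the identifier `C-DOG`, the class name `Concept` and the helper `find_concept` are hypothetical and not taken from any terminology system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A language-independent concept carrying its designations per language."""
    concept_id: str  # hypothetical identifier, not from any real terminology
    terms: Dict[str, str] = field(default_factory=dict)  # language code -> term


def find_concept(term: str, concepts: List[Concept]) -> Optional[Concept]:
    """Return the first concept that has `term` as a designation in any language."""
    wanted = term.casefold()
    for concept in concepts:
        if any(t.casefold() == wanted for t in concept.terms.values()):
            return concept
    return None


# The four designations shown on the slide, keyed by language.
dog = Concept("C-DOG", {"fr": "Chien", "en": "Dog", "nl": "Hond", "de": "Hund"})
```

Looking up "Hund" or "Chien" with `find_concept` leads to the same `Concept` object, which is what makes this view attractive for translation and information exchange.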
Why terminologies ?
• As such ?
– Fixing/stabilizing the language within a domain and a
linguistic community;
– Unambiguous communication.
• In relation to EHRs ?
– Semantic Indexing;
– Information exchange and linking between
heterogeneous systems;
– Terminologies as basis for coding and classification
systems
Some systems and their purpose
• Remuneration
– ICD9/10-CM in the US for insurance and Medicare, for diseases
– Current Procedural Terminology (CPT) for surgical procedures
• Public Health Reporting
– ICD9/10
• Clinical Recording
– Read 1-3, SNOMED-CT, ICPC
• Indexing publications
– MeSH (MedLine/PubMed), EMTree (EMBASE)
• Support for applications and decision support
– GALEN, FMA
‘Traditional’ semantic indexing
• Statement:
– ‘ Joe Smith has a fracture of the left tibia ’
• Becomes indexed as:
– #12 M-2xg41 A-2t68
– M-2xg41 code in SnowMeat with terms:
• fracture, fractures, fracture NOS, broken, ...
– A-2t68 ibidem associated with:
• left tibia, left tibia NEC, ...
– Additional terms through
• hierarchy: bone, bones, os, ...
• associations: lower leg, limb, body part, ...
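The indexing step above can be sketched in a few lines. This is a minimal sketch under stated assumptions: it uses the slide's fictional "SnowMeat" codes, a naive substring match instead of real language processing, and it assumes the hierarchy and association terms belong to the tibia code (the slide does not say which code they attach to). The names `TERMS`, `EXPANSIONS`, `index_statement` and `search_terms` are hypothetical.

```python
from typing import Dict, List, Set

# Terms attached to each (fictional) code, as listed on the slide.
TERMS: Dict[str, List[str]] = {
    "M-2xg41": ["fracture", "fractures", "fracture NOS", "broken"],
    "A-2t68": ["left tibia", "left tibia NEC"],
}

# Additional retrieval terms reached through hierarchy and associations.
# Attaching them to A-2t68 is an assumption made for this sketch.
EXPANSIONS: Dict[str, List[str]] = {
    "A-2t68": ["bone", "bones", "os", "lower leg", "limb", "body part"],
}


def index_statement(statement: str) -> List[str]:
    """Return the codes whose terms occur in the statement (substring match)."""
    text = statement.lower()
    return [code for code, terms in TERMS.items()
            if any(term.lower() in text for term in terms)]


def search_terms(codes: List[str]) -> Set[str]:
    """All terms under which a record indexed with `codes` can be retrieved."""
    found: Set[str] = set()
    for code in codes:
        found.update(TERMS.get(code, []))
        found.update(EXPANSIONS.get(code, []))
    return found


codes = index_statement("Joe Smith has a fracture of the left tibia")
```

With these tables, `codes` contains both fictional codes, and a query for "limb" or "broken" would retrieve the record even though neither word appears in the original statement.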
Classification: ICD
• ...
• Chapter II: Neoplasms (C00-D48)
• Chapter III: Diseases of the Blood and Blood-forming organs and certain disorders
involving the immune mechanism (D50-D89)
• Excludes: auto-immune disease (systemic) NOS (M35.9)
• ....
• Nutritional Anemias (D50-D53)
• D50 Iron deficiency anaemia
• Includes: ...
• D50.0 Iron deficiency anaemia secondary to blood loss (chronic)
• Excludes: ...
• D50.1 ...
• D51 Vit B12 deficiency anaemia
• Haemolytic Anemias (D55-D59)
• ...
• Chapter IV: ...
Coding versus classification
• Coding:
– Annotate terms in the EHR with codes from a coding
system
•  synonyms, translations, hierarchies
• Classification:
– Assign patients exhibiting certain features to a
predefined class
•  purpose oriented, culture dependent
• Frequently mixed up !
Fractured nose = ??? Fracture of nose
Coding / classification confusion
• “patient with fractured nose” = “patient with fracture of nose”
• But therefore not “fractured nose” = “fracture of nose” !
Classification: culture dependent
Dyirbal classification of objects in the universe:
• Bayi: men, kangaroos, possums, bats, most snakes, most fishes,
some birds, most insects, the moon, storms, rainbows, boomerangs,
some spears, etc.
• Balan: women, anything connected with water or fire, bandicoots,
dogs, platypus, echidna, some snakes, some fishes, most birds,
fireflies, scorpions, crickets, the stars, shields, some spears, some
trees, etc.
• Balam: all edible fruit and the plants that bear them, tubers, ferns,
honey, cigarettes, wine, cake.
• Bala: parts of the body, meat, bees, wind, yamsticks, some spears,
most trees, grass, mud, stones, noises, language, etc.
Lakoff 1987. Women, fire and dangerous things
Categories derived through analysis of the structure of the language
used by these people.
→ Language is NOT a trustworthy basis for (realist) ontology development.
The “exploding bicycle” (J. Rogers)
• 10 things to hit…
– Pedestrian / cycle / motorbike / car / HGV / train / unpowered
vehicle / a tree / other
• 5 roles for the injured…
– Driving / passenger / cyclist / getting in / other
• 5 activities when injured…
– resting / at work / sporting / at leisure / other
• 2 contexts…
– In traffic / not in traffic
→ V12.24 Pedal cyclist injured in collision with two- or three-wheeled
motor vehicle, unspecified pedal cyclist, nontraffic accident, while
resting, sleeping, eating or engaging in other vital activities
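The "explosion" is plain multiplication: if every value on one axis can combine with every value on the others, the number of pre-coordinated codes is the product of the axis sizes. The sketch below uses the counts exactly as stated on the slide; that the axes combine freely is an assumption of the sketch, and the variable names are illustrative.

```python
from itertools import product
from math import prod

# Axis sizes as stated on the slide.
axes = {
    "things to hit": 10,
    "roles for the injured": 5,
    "activities when injured": 5,
    "contexts": 2,
}

# Number of pre-coordinated codes if every combination gets its own code.
total = prod(axes.values())

# The same number, obtained by actually enumerating the combinations.
combinations = list(product(*(range(size) for size in axes.values())))

print(total)  # 500
```

Adding one more five-valued axis would multiply the count by five again, which is why enumerating every combination as its own code does not scale.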
Borders’ classification of Medicine
• Medicine
– Mental health
– Internal medicine
• Endocrinology
– Oversized endocrinology
• Gastro-enterology
• ...
– Pediatrics
– ...
– Oversized medicine
Ambitious claims have been made …
• The Unified Medical Language System (UMLS) is
designed to “facilitate the development of
computer systems that behave as if they
‘understand’ the meaning of the language of
biomedicine and health”.
UMLS fact sheet, updated 7 May 2004
(http://www.nlm.nih.gov/pubs/factsheets/umls.html).
MeSH: Medical Subject Headings
MeSH: typing myocardial infarction
Hierarchical
MeSH: Different context, different meaning ?
???
???
???
MeSH Tree Structures - 2004
• Body Regions [A01]
– Extremities [A01.378]
• Lower Extremity [A01.378.610]
– Buttocks [A01.378.610.100]
– Foot [A01.378.610.250]
» Ankle [A01.378.610.250.149]
» Forefoot, Human [A01.378.610.250.300] +
» Heel [A01.378.610.250.510]
– Hip [A01.378.610.400]
– Knee [A01.378.610.450]
– Leg [A01.378.610.500]
– Thigh [A01.378.610.750]
The most abundant sort of mistakes if used as an ontology!
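A short sketch shows how such mistakes arise when the tree is read mechanically. In the tree numbers above, a parent is obtained by dropping the last dotted segment; if a program treats that parent link as "is a", it concludes things such as "Ankle is a Foot". Reading the slide's warning as being about this conflation of "is a" with "part of" is an interpretation, and the names `TREE`, `parent` and `ancestors` are hypothetical.

```python
from typing import Dict, List, Optional

# Tree numbers and headings copied from the slide (subset).
TREE: Dict[str, str] = {
    "A01": "Body Regions",
    "A01.378": "Extremities",
    "A01.378.610": "Lower Extremity",
    "A01.378.610.100": "Buttocks",
    "A01.378.610.250": "Foot",
    "A01.378.610.250.149": "Ankle",
    "A01.378.610.250.510": "Heel",
    "A01.378.610.400": "Hip",
}


def parent(tree_number: str) -> Optional[str]:
    """Drop the last dotted segment; a top-level number has no parent."""
    head, sep, _ = tree_number.rpartition(".")
    return head if sep else None


def ancestors(tree_number: str) -> List[str]:
    """Headings of all ancestors, nearest first."""
    names: List[str] = []
    current = parent(tree_number)
    while current is not None:
        names.append(TREE[current])
        current = parent(current)
    return names


# A naive "is a" reading of the hierarchy yields: Ankle is a Foot,
# a Lower Extremity, an Extremity and a Body Region.
ankle_ancestors = ancestors("A01.378.610.250.149")
```

The hierarchy works for browsing and retrieval, but the parent link carries no fixed relation, so a reasoner that assumes one will draw wrong conclusions.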
Intermediate conclusion (1)
• Concept-based terminology (and standardisation
thereof) is there as a mechanism to improve
understanding of messages by humans.
• It is NOT the right device
– to explain why reality is what it is, how it is organised,
etc., (although it is needed to allow communication),
– to reason about reality,
– to make machines understand what is real,
– to integrate across different views, languages,
conceptualisations, ...
Why not ?
• Does not take care of universals and particulars
appropriately
• Concepts do not necessarily correspond to something that
(will) exist(ed)
– Sorcerer, unicorn, leprechaun, ...
• Definitions set the conditions under which terms may be
used, and may not be abused as conditions an entity must
satisfy to be what it is
• Language can make strings of words look as if they were
terms
– “Middle lobe of left lung”
Ok, then Description Logics and OWL will save us ... ?
Description logics:
• A decidable fragment of FOL
• A propositional modal logic
• A classes and properties (concepts and roles) oriented KR
language
• Subsumption and satisfiability (consistency) are the key
inferences
• Most DLs are supersets of ALC
– Boolean operators on concepts
– Existential and Universal quantifiers
• OWL-DL is a large superset (SHOIN):
– Property hierarchies & Transitive roles (SH)
– Inverse (I)
– Nominals (O) (hasValue and one of)
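As a minimal illustration of the ALC constructors and of subsumption as the key inference, consider the following definitions. The concept and role names (TibiaFracture, Fracture, Tibia, Bone, hasLocation) are hypothetical and chosen only for this sketch; they are not taken from SNOMED or any other terminology.

```latex
\begin{align*}
\mathit{TibiaFracture} &\equiv \mathit{Fracture} \sqcap \exists\,\mathit{hasLocation}.\mathit{Tibia} \\
\mathit{Tibia} &\sqsubseteq \mathit{Bone} \\
\Rightarrow\quad \mathit{TibiaFracture} &\sqsubseteq \mathit{Fracture} \sqcap \exists\,\mathit{hasLocation}.\mathit{Bone}
\end{align*}
```

The third line is not asserted but inferred: a reasoner derives it from the first two because the existential restriction is monotone in its filler.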
SNOMED and DL
SNOMED-RT (2000)
SNOMED-CT (2003)
DLs don’t guarantee that you get parthood right !
NCI Thesaurus
• a biomedical thesaurus created
specifically to meet the needs of the
National Cancer Institute.
• semantically modeled cancer-related
terminology built using description logics
NCI Thesaurus Root concepts
[Screenshot of the root concepts not preserved; annotations on the slide:]
– Anatomic Structure, Anatomic System, or Anatomic Substance ? Or Substance ?
– Does the NCI not know to which category any item classified there belongs ?
– If yes, why is gene product not subsumed by it ? If no, why are
drugs and chemicals not subsumed by it ?
Definition of “cancer gene”
Terminologies and ontologies
for EHR use:
the quest for principles
Requirements for clinical vocabularies (1)
• Domain completeness: coverage of all possible
terms that lie within a vocabulary’s domain
• Non-vagueness: the term should represent the
concept behind it as closely as possible
• Non-ambiguity: the same term cannot refer to
more than one concept
• Non-redundancy: each concept must be
represented by one unique identifier
(Cimino, 1989)
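Two of these requirements, non-ambiguity and non-redundancy, can be checked mechanically. The following is a rough sketch only: the vocabulary is invented for illustration, and treating "two identifiers with exactly the same set of terms" as redundancy is a heuristic assumed here, not a definition from the cited work.

```python
from collections import defaultdict
from typing import Dict, List, Set

Vocabulary = Dict[str, List[str]]  # concept identifier -> its terms


def ambiguous_terms(vocabulary: Vocabulary) -> Dict[str, Set[str]]:
    """Terms that refer to more than one concept (violates non-ambiguity)."""
    concepts_by_term: Dict[str, Set[str]] = defaultdict(set)
    for concept_id, terms in vocabulary.items():
        for term in terms:
            concepts_by_term[term.casefold()].add(concept_id)
    return {t: ids for t, ids in concepts_by_term.items() if len(ids) > 1}


def redundant_identifiers(vocabulary: Vocabulary) -> List[List[str]]:
    """Groups of identifiers sharing exactly the same terms (heuristic)."""
    ids_by_terms = defaultdict(list)
    for concept_id, terms in vocabulary.items():
        ids_by_terms[frozenset(t.casefold() for t in terms)].append(concept_id)
    return [sorted(ids) for ids in ids_by_terms.values() if len(ids) > 1]


# Invented example vocabulary with one ambiguous term and one duplicate.
vocabulary: Vocabulary = {
    "C1": ["cold", "common cold"],
    "C2": ["cold", "low temperature"],
    "C3": ["fracture"],
    "C4": ["Fracture"],
}
```

Here "cold" is flagged as ambiguous because it points to C1 and C2, and C3/C4 are flagged as redundant. Domain completeness and non-vagueness, by contrast, cannot be checked from the vocabulary alone.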
Requirements for clinical vocabularies (2)
• Synonymy: multiple ways for expressing a word
(or concept) must be allowed
• Multiple classification: concepts must be allowed
to be classified in multiple hierarchies
• Consistency of view: concepts must have the
same relationships in all views
• Explicit relationships: all relationships (e.g. class,
synonymy,…) must be explicitly labelled.
The Desiderata Revisited
• Concept orientation - what is the alternative?
• Concept permanence and graceful evolution - version
control
• Formal definitions - add to knowledge vs. recognize
change
• Reject NEC - store what the patient has and classify later
• Multiple granularities - patient level vs. reuse
• Representing context - the implicit meaning in the EMR
design
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
New desiderata for biomedical terminologies
• Provide identifiers for meanings we want to apply
to the patient
• Make sure the semantics are universally
understood, separate from linguistics
• Make sure that, as our understanding changes,
original meaning is not forgotten
• Provide a bridge between what we record and how
we reason
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
Desiderata for Controlled Medical Data
I - Capture what is known about the patient
II - No information loss
III - No false implications
IV - Support retrieval
V - Support reuse
VI - Support aggregation
VII - Support inference
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
Take off of ontology in biomedical informatics
• Concept/terminology-based systems make implicit
knowledge explicit
• Ontologies aim to push explicitness further:
– reasoning by machines
• Classification
• Prediction
• Triggering of alerts
A practical example
• At <timestamp> lab reports <procedure> with id <ID>
and value <value> for <patient>
• At <timestamp> <clinician> interprets <ID> as
indicating <condition> for <patient>
• At <timestamp> <clinician> orders pharmacy item
<formulary item> for <patient> with order id <ID>
• At <timestamp> pharmacy delivers <inventory item>
with inventory id <ID> for order id <ID> for <patient>
• At <timestamp> decision support system suggests
<condition> for <patient>
However !
– Is this a procedure or the documentation of a procedure ?
– Is this condition really a patient condition or just an idea ?
– How are these related ?
Cimino 2003, Rome Ontology Workshop
The dispute between …
• “Practical engineers”:
– If it works for our purposes, it is ok
• Good philosophers:
– If it works always, it is ok,
and
– It can only always work if it represents the relevant
portion of reality faithfully.
Ontology desiderata (C. Goble) for engineers
• Precision: formal, unambiguous, high fidelity
• Flexibility: expressivity, evolution
• Explicitness: clarity, commitment, reuse
• Systematic: control, quality, clarity
Ontology description space (C. Goble)
• Coverage: upper, domain general, domain specific
• Knowledge representation languages and models: words, OO, frames, logics
• Inference mechanisms: classification, coherency
• Expressivity: taxonomy, relationships, axioms
But not to forget: change management
The reasons for changes in ontologies AND health
records should be explicitly motivated, possibilities
being
1. changes in the underlying reality (does the appearance or
disappearance of an entry relate to the appearance or disappearance
of entities or of relationships among entities in reality?);
2. changes in our scientific understanding;
3. reassessments of what is considered to be relevant for inclusion ;
4. corrections of encoding mistakes introduced during ontology
curation or data entry
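One way to make the motivation for a change explicit is to require every change record to carry one of the four reasons listed above. This is a sketch of the idea only; the names `ChangeReason`, `Change` and `reflects_change_in_reality`, and the example entry identifier, are hypothetical and not part of any existing system.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeReason(Enum):
    """The four possible motivations for a change, numbered as on the slide."""
    REALITY_CHANGED = 1        # change in the underlying reality
    UNDERSTANDING_CHANGED = 2  # change in our scientific understanding
    RELEVANCE_REASSESSED = 3   # reassessment of what is relevant to include
    MISTAKE_CORRECTED = 4      # correction of an encoding mistake


@dataclass(frozen=True)
class Change:
    """A change to an ontology entry or a health record entry."""
    entry_id: str          # hypothetical identifier of the changed entry
    reason: ChangeReason   # mandatory: no change without a stated reason
    note: str = ""


def reflects_change_in_reality(change: Change) -> bool:
    """True only when the entry changed because reality itself changed."""
    return change.reason is ChangeReason.REALITY_CHANGED


fix = Change("entry-42", ChangeReason.MISTAKE_CORRECTED, "wrong code entered")
```

With the reason recorded, a later user can tell whether a disappearing entry means that something changed in the patient or only that a data-entry error was corrected.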
Conclusions
• Main role of:
– Terminologies: standardise language use
– Ontologies: represent what is generic in reality
– EHR: document what is specifically related to particulars
(patients directly, (sub)populations indirectly)
• Role of terminologies in the context of the EHR:
– Make the documentation intelligible to humans other than those
who entered the data
• Role of ontologies in the context of the EHR:
– Ensure that the regimentation imposed by the EHR system does
not interfere with the re-usability of the data for a variety of
purposes, other than patient documentation.