
R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
The Role of Terminologies and
Ontologies in the Context of the
Electronic Health Record
Dagstuhl, May 23rd, 2006
Werner Ceusters, MD
Ontology Research Group
Center of Excellence in Bioinformatics & Life Sciences
SUNY at Buffalo, NY
Electronic Health Records
• ISO/TS 18308:2003
– Electronic Health Record (EHR):
• A repository of information regarding the health of a subject of care, in
computer processable form.
– EHR system:
• the set of components that form the mechanism by which electronic
health records are created, used, stored, and retrieved. It includes people,
data, rules and procedures, processing and storage devices, and
communication and support facilities.
• More common meaning of EHR system:
– only the “software being executed”
A replacement for
Current US GOV eHealth goals & strategies
• Goal 1: Inform Clinical Practice:
– S1. Provide incentives for EHR adoption.
– S2. Reduce risk of EHR investment.
– S3. Promote EHR diffusion in rural and underserved areas.
• Goal 2: Interconnect Clinicians.
– S1. Regional collaborations.
– S2. Develop a national health information network.
– S3. Coordinate federal health information systems.
• Goal 3: Personalize Care.
– S1. Encourage use of Personal Health Records.
– S2. Enhance informed consumer choice.
– S3. Promote use of telehealth systems.
• Goal 4: Improve Population Health.
– S1. Unify public health surveillance architectures.
– S2. Streamline quality and health status monitoring.
– S3. Accelerate research and dissemination of evidence.
Functions to be supported (HL7)
• Direct Care
– functions that enable hands-on delivery of health care and offer
clinical decision support.
• Care Support
– functions that are not used for direct care of patients, but assist
with the administrative, financial, research, public health, and
quality monitoring aspects of an EHR-S
• Information Infrastructure
– functions that provide the framework for proper operation of all
Direct Care and Supportive functions.
HL7 EHR System Functional Model. Draft May 2006
Direct Care Functions
• DC.1 Care Management
– ordering medications
– creating clinical documentation
• DC.2 Clinical Decision Support
– alerting the provider that immunizations are due or
drug interactions are indicated.
• DC.3 Operations Management and Communication
– ???
Care support functions
• S.1 Clinical Support
• S.2 Measurement, Analysis, Research and Reports
• S.3 Administrative and Financial
– verifying insurance eligibility
– reporting encounter data to public health systems
Information Infrastructure Functions
• Information Infrastructure
– Health Record Information and Management
– Identity, Registry, & Directory Services
– Terminology Standards & Services
– Standards-based Interoperability
– Business Rules Management
– Workflow Management
1) The discipline of terminology management
homonymous with terminology
synonymous with terminology work (used in ISO
2) The set of designations used in the special
language of a subject field, such as the
terminology of chemistry
Used in both the singular and plural
Used with an article in the singular: a terminology
Fundamental Activities of Terminology Work
• Identifying ‘concepts’ and ‘concept relations’;
• Analyzing and modeling concept systems on the basis of
identified concepts and concept relations;
• Establishing representations of concept systems through concept
diagrams;
• Crafting concept-oriented definitions;
• Attributing designations (predominantly terms) to each concept
in one or more languages; and,
• Recording and presenting terminological data, principally in
terminological entries stored in print and electronic media.
This is not the right approach to ontology !
Reason for our rejection: The terminological View
• Objects
• perceived or conceived, concrete or abstract
• abstracted or conceptualized into concepts
• Concepts
• depict or correspond to a set of objects based on a defined set of
characteristics
• represented or expressed in language by designations or by definitions
• organized into concept systems
• Terminology
Designations
• represented as terms, names (appellations) or symbols
• designate or represent a concept
• attributed to a concept by consensus within a special language
[Figure: the semiotic triangle (Peirce, Ogden & Richards, …). Corners: Unit of Thinking / Concept (Unit of Thought, Unit of Knowledge), annotated “~ Universal ???”; designation (Symbol, Sign, Term, Formula, …); object (Concrete Object, Real Thing, Conceived Object)]
Success of concept-based view in healthcare IT
Concept ‘dog’
Why terminologies ?
• As such ?
– Fixing/stabilizing the language within a domain and a
linguistic community;
– Unambiguous communication.
• In relation to EHRs ?
– Semantic Indexing;
– Information exchange and linking between
heterogeneous systems;
– Terminologies as basis for coding and classification
Some systems and their purpose
• Remuneration
– ICD9/10-CM in the US, for insurance and Medicare, for diseases
– Current Procedural Terminology (CPT) for surgical procedures
• Public Health Reporting
– ICD9/10
• Clinical Recording
– Read 1-3, SNOMED-CT, ICPC
• Indexing publications
– MeSH (MedLine/PubMed), EMTree (EMBASE)
• Support for applications and decision support
‘Traditional’ semantic indexing
• Statement:
– ‘ Joe Smith has a fracture of the left tibia ’
• Becomes indexed as :
– M-2xg41 code in SnowMeat with terms:
– fracture, fractures, fracture NOS, broken, ...
– A-2t68 ibidem associated with:
– left tibia, left tibia NEC, ...
– Additional terms through
– hierarchy: bone, bones, os, ...
– associations: lower leg, limb, body part, ...
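The indexing step above can be sketched in Python. This is a minimal illustration only: the two codes and the term lists come from the slide, while the lookup tables, the substring matching, and the function name `index_statement` are assumptions made for the example, not the behaviour of any real terminology server.

```python
# Toy terminology: code -> terms (codes and terms as shown on the slide).
TERMS = {
    "M-2xg41": ["fracture", "fractures", "fracture NOS", "broken"],
    "A-2t68": ["left tibia", "left tibia NEC"],
}
# Additional index terms reached through the hierarchy or through associations.
HIERARCHY = {"A-2t68": ["bone", "bones", "os"]}
ASSOCIATIONS = {"A-2t68": ["lower leg", "limb", "body part"]}


def index_statement(statement: str) -> dict[str, set[str]]:
    """Map each code whose term occurs in the statement to all its index terms."""
    text = statement.lower()
    index = {}
    for code, terms in TERMS.items():
        # Naive matching (assumption): a code applies if any term is a substring.
        if any(term.lower() in text for term in terms):
            index[code] = (
                set(terms)
                | set(HIERARCHY.get(code, []))
                | set(ASSOCIATIONS.get(code, []))
            )
    return index


result = index_statement("Joe Smith has a fracture of the left tibia")
print(sorted(result))
```

The statement ends up retrievable under every term attached to either code, including terms such as "limb" that never occur in the text.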
Classification: ICD
Chapter II:
Neoplasms (C00-D48)
Chapter III:
Diseases of the Blood and Blood-forming organs and certain disorders
involving the immune mechanism (D50-D89)
Excludes :
auto-immune disease (systemic) NOS (M35.9)
Nutritional Anemias (D50-D53)
Iron deficiency anaemia
Includes: ...
D50.0 Iron deficiency anaemia secondary to blood loss (chronic)
Excludes : ...
Vit B12 deficiency anaemia
Haemolytic Anemias (D55-D59)
Chapter IV:
Coding versus classification
• Coding:
– Annotate terms in the EHR with codes from a coding
system
• → synonyms, translations, hierarchies
• Classification:
– Assign patients exhibiting certain features to a
predefined class
• → purpose oriented, culture dependent
• Frequently mixed up !
= ???
Coding / classification confusion
• “patient with fractured nose” =
“patient with fracture of nose”
• But therefore not
“fractured nose” =
“fracture of nose” !
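The difference between coding a term and classifying a patient can be sketched in Python. Everything here is invented for illustration: the code strings, the patient ids, and the function names are assumptions and belong to no real coding system.

```python
# Coding: annotate a term with a code. The codes below are invented.
CODES = {
    "fractured nose": "ANAT-NOSE+fractured",    # denotes a nose (a body part)
    "fracture of nose": "MORPH-FRACTURE+nose",  # denotes a fracture (a lesion)
}

# Patient findings, recorded as (patient id, term used in the record).
RECORDS = [("p1", "fractured nose"), ("p2", "fracture of nose"), ("p3", "cough")]


def code_term(term: str) -> str:
    """Coding: look up the code for a term."""
    return CODES[term]


def classify(term: str) -> set[str]:
    """Classification: the patients belonging to the class 'patient with <term>'.

    Both nose phrasings select the same class of patients.
    """
    nose_terms = {"fractured nose", "fracture of nose"}
    wanted = nose_terms if term in nose_terms else {term}
    return {pid for pid, recorded in RECORDS if recorded in wanted}


same_class = classify("fractured nose") == classify("fracture of nose")
same_code = code_term("fractured nose") == code_term("fracture of nose")
print(same_class, same_code)
```

The two phrasings pick out the same patients, yet the terms themselves stand for different kinds of entity, so treating a classification result as a coding result loses that difference.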
Classification: culture dependent
Dyirbal classification of objects in the universe, derived through
analysis of the structure of the language used by these people.
• Bayi: men, kangaroos, possums, bats, most snakes, most fishes,
some birds, most insects, the moon, storms, rainbows, boomerangs,
spears, etc.
• Balan: women, anything connected with water or fire, bandicoots,
dogs, platypus, echidna, some snakes, some fishes, most birds,
fireflies, scorpions, crickets, the stars, shields, some spears, some
trees, etc.
• Balam: all edible fruit and the plants that bear them, tubers, ferns,
honey, cigarettes, wine
• Bala: parts of the body, meat, bees, wind, yamsticks, some spears,
most trees, grass, mud, stones, noises, language, etc.
→ Language is NOT a trustworthy basis for ontology
Lakoff 1987. Women, fire and dangerous things
The “exploding bicycle” (J. Rogers)
• 10 things to hit…
– Pedestrian / cycle / motorbike / car / HGV / train / unpowered
vehicle / a tree / other
• 5 roles for the injured…
– Driving / passenger / cyclist / getting in / other
• 5 activities when injured…
– resting / at work / sporting / at leisure / other
• 2 contexts…
– In traffic / not in traffic
→ V12.24 Pedal cyclist injured in collision with two- or three-wheeled
motor vehicle, unspecified pedal cyclist, nontraffic accident, while
resting, sleeping, eating or engaging in other vital activities
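The size of the explosion follows from multiplying the axes. The Python sketch below enumerates one pre-coordinated class per combination; the axis values are copied from the slide, and the variable names are assumptions made for the example. Note that the slide counts 10 things to hit but lists only nine, so both totals are computed.

```python
from itertools import product
from math import prod

# Axis values as listed on the slide (nine of the stated ten things to hit).
THINGS_HIT = ["pedestrian", "cycle", "motorbike", "car", "HGV", "train",
              "unpowered vehicle", "a tree", "other"]
ROLES = ["driving", "passenger", "cyclist", "getting in", "other"]
ACTIVITIES = ["resting", "at work", "sporting", "at leisure", "other"]
CONTEXTS = ["in traffic", "not in traffic"]

# One pre-coordinated class per combination of axis values.
combinations = list(product(THINGS_HIT, ROLES, ACTIVITIES, CONTEXTS))
listed_total = len(combinations)

# The same count using the axis sizes stated on the slide.
stated_total = prod([10, 5, 5, 2])

print(listed_total, stated_total)
```

Every extra axis multiplies the number of classes, which is why a single accident type ends up with hundreds of distinct codes.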
Border’s classification of Medicine
• Medicine
– Mental health
– Internal medicine
• Endocrinology
– Oversized endocrinology
• Gastro-enterology
• ...
– Pediatrics
– ...
– Oversized medicine
Ambitious claims have been made …
• The Unified Medical Language System (UMLS) is
designed to “facilitate the development of
computer systems that behave as if they
‘understand’ the meaning of the language of
biomedicine and health”.
UMLS fact sheet, updated 7 May 2004
MeSH: Medical Subject Headings
MeSH: typing myocardial infarction
MeSH: Different context, different meaning ?
MeSH Tree Structures - 2004
• Body Regions [A01]
– Extremities [A01.378]
• Lower Extremity [A01.378.610]
– Buttocks [A01.378.610.100]
– Foot [A01.378.610.250]
» Ankle [A01.378.610.250.149]
» Forefoot, Human [A01.378.610.250.300] +
» Heel [A01.378.610.250.510]
– Hip [A01.378.610.400]
– Knee [A01.378.610.450]
– Leg [A01.378.610.500]
– Thigh [A01.378.610.750]
The most abundant sort of mistakes if used as an ontology!
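MeSH tree numbers encode the hierarchy positionally, so the headings above an entry can be read off by dropping trailing segments. The Python sketch below does this for the tree numbers shown on the slide; the function name `ancestors` is an assumption made for the example. It also assumes that the warning refers to reading every link as is-a, which is one plausible reading of the remark.

```python
# Tree numbers copied from the slide (MeSH Tree Structures, 2004).
TREE = {
    "A01": "Body Regions",
    "A01.378": "Extremities",
    "A01.378.610": "Lower Extremity",
    "A01.378.610.100": "Buttocks",
    "A01.378.610.250": "Foot",
    "A01.378.610.250.149": "Ankle",
    "A01.378.610.250.300": "Forefoot, Human",
    "A01.378.610.250.510": "Heel",
    "A01.378.610.400": "Hip",
    "A01.378.610.450": "Knee",
    "A01.378.610.500": "Leg",
    "A01.378.610.750": "Thigh",
}


def ancestors(tree_number: str) -> list[str]:
    """Headings above a tree number, nearest first."""
    parts = tree_number.split(".")
    # Drop one trailing segment at a time to walk up the tree.
    return [TREE[".".join(parts[:i])] for i in range(len(parts) - 1, 0, -1)]


ankle_ancestors = ancestors("A01.378.610.250.149")
print(ankle_ancestors)
```

Read as is-a links, the result would assert that an ankle is a foot and a foot is a lower extremity; the tree supports retrieval of related headings, not that kind of inference.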
Intermediate conclusion (1)
• Concept-based terminology (and standardisation
thereof) is there as a mechanism to improve
understanding of messages by humans.
• It is NOT the right device
– to explain why reality is what it is, how it is organised,
etc., (although it is needed to allow communication),
– to reason about reality,
– to make machines understand what is real,
– to integrate across different views, languages,
conceptualisations, ...
Why not ?
• Does not take care of universals and particulars
• Concepts do not necessarily correspond to something that
(will) exist(ed)
– Sorcerer, unicorn, leprechaun, ...
• Definitions set the conditions under which terms may be
used, and may not be abused as conditions an entity must
satisfy to be what it is
• Language can make strings of words look as if it were
– “Middle lobe of left lung”
Ok, then Description Logics and OWL will save us ... ?
Description logics:
• A decidable fragment of FOL
• A propositional modal logic
• A classes and properties (concepts and roles) oriented KR
• Subsumption and satisfiability (consistency) are the key
• Most DLs are supersets of ALC
– Boolean operators on concepts
– Existential and Universal quantifiers
• OWL-DL is a large superset (SHOIN):
– Property hierarchies & Transitive roles (SH)
– Inverse (I)
– Nominals (O) (hasValue and one of)
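Subsumption and satisfiability can be illustrated with a Python sketch for a toy language far smaller than ALC: concepts are conjunctions of atomic names, names may have told necessary conditions, and pairs of names may be declared disjoint. The concept names and helper functions are assumptions made for the example; real DL reasoners use tableau or comparable algorithms, not this set comparison.

```python
# A concept is a conjunction of atomic names, modelled as a frozenset.
Concept = frozenset

# Told necessary conditions (primitive inclusions), e.g.
# Fracture is subsumed by (Lesion AND OfBone). Names are illustrative.
NECESSARY = {
    "Fracture": Concept({"Lesion", "OfBone"}),
    "TibiaFracture": Concept({"Fracture", "OfTibia"}),
}
# Pairs of atomic names declared disjoint.
DISJOINT = {frozenset({"Lesion", "BodyPart"})}


def expand(concept):
    """Close a conjunction under the told necessary conditions."""
    result, todo = set(), list(concept)
    while todo:
        name = todo.pop()
        if name not in result:
            result.add(name)
            todo.extend(NECESSARY.get(name, ()))
    return frozenset(result)


def subsumes(general, specific):
    """general subsumes specific if every conjunct of general follows from specific.

    Unsatisfiable concepts are not treated specially in this sketch.
    """
    return expand(general) <= expand(specific)


def satisfiable(concept):
    """A conjunction is unsatisfiable if it entails two disjoint names."""
    names = expand(concept)
    return not any(pair <= names for pair in DISJOINT)


print(subsumes(Concept({"Fracture"}), Concept({"TibiaFracture"})))
print(satisfiable(Concept({"TibiaFracture", "BodyPart"})))
```

The checker only verifies what follows from the stated axioms; it has no way to tell whether those axioms describe reality correctly.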
SNOMED-RT (2000)
SNOMED-CT (2003)
DLs don’t guarantee that you get parthood right !
NCI Thesaurus
• a biomedical thesaurus created
specifically to meet the needs of the
National Cancer Institute.
• semantically modeled cancer-related
terminology built using description logics
NCI Thesaurus Root concepts
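[Figure: NCI Thesaurus root concepts with overlaid questions; the text is only partly recoverable]
Does the NCI not know to […] gene belongs ?
Substance ? If yes, why is […] by it ? If no, why are
drugs and chemicals not subsumed by it ?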
Definition of “cancer gene”
Terminologies and ontologies
for EHR use:
the quest for principles
Requirements for clinical vocabularies (1)
• Domain completeness: coverage of all possible
terms that lie within a vocabulary’s domain
• Non-vagueness: the term should represent the
concept behind it as closely as possible
• Non-ambiguity: the same term cannot refer to
more than one concept
• Non-redundancy: each concept must be
represented by one unique identifier
(Cimino, 1989)
Requirements for clinical vocabularies (2)
• Synonymy: multiple ways for expressing a word
(or concept) must be allowed
• Multiple classification: concepts must be allowed
to be classified in multiple hierarchies
• Consistency of view: concepts must have the
same relationships in all views
• Explicit relationships: all relationships (e.g. class,
synonymy,…) must be explicitly labelled.
The Desiderata Revisited
• Concept orientation - what is the alternative?
• Concept permanence and graceful evolution - version
• Formal definitions - add to knowledge vs. recognize
• Reject NEC - store what the patient has and classify later
• Multiple granularities - patient level vs. reuse
• Representing context - the implicit meaning in the EMR
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
New desiderata for biomedical terminologies
• Provide identifiers for meanings we want to apply
to the patient
• Make sure the semantics are universally
understood, separate from linguistics
• Make sure that, as our understanding changes,
original meaning is not forgotten
• Provide a bridge between what we record and how
we reason
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
Desiderata for Controlled Medical Data
I - Capture what is known about the patient
II - No information loss
III - No false implications
IV - Support retrieval
V - Support reuse
VI - Support aggregation
VII - Support inference
Cimino 2003, Rome Ontology Workshop (pushed by Smith)
Take off of ontology in biomedical informatics
• Concept/terminology-based systems make implicit
knowledge explicit
• Ontologies aim to push explicitness further:
– reasoning by machines
• Classification
• Prediction
• Triggering of alerts
A practical example
• At <timestamp> lab reports <procedure> with id <ID>
and value <value> for <patient>
• At <timestamp> <clinician> interprets <ID> as
indicating <condition> for <patient>
• At <timestamp> <clinician> orders pharmacy item
[…] for <patient> with order id <ID>
• At <timestamp> pharmacy [… item>]
with inventory id <ID> for order id <ID> for <patient>
• At […] suggests […] <patient>
Cimino 2003, Rome Ontology Workshop
However !
Is this a procedure or […] of a […]
Is this condition really a […] or just […] ?
How are these […] ?
The dispute between …
• “Practical engineers”:
– If it works for our purposes, it is ok
• Good philosophers:
– If it works always, it is ok,
– It can only always work if it represents the relevant
portion of reality faithfully.
Ontology desiderata (C. Goble) for engineers
formal, unambiguous
high fidelity
expressivity, evolution
clarity, commitment,
control, quality, clarity
Ontology description space (C. Goble)
upper, domain general, domain specific
languages and
words, OO, frames,
Inference mechanisms
classification, coherency
taxonomy, relationships, axioms
But not to forget: change management
The reasons for changes in ontologies AND health
records should be explicitly motivated. Possibilities:
1. changes in the underlying reality (does the appearance or
disappearance of an entry relate to the appearance or disappearance
of entities or of relationships among entities in reality?);
2. changes in our scientific understanding;
3. reassessments of what is considered to be relevant for inclusion ;
4. corrections of encoding mistakes introduced during ontology
curation or data entry
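One way to make the motivation explicit is to refuse any change that does not carry one of the four reasons. The Python sketch below is a hypothetical illustration: the class names, field names, and `record_change` function are assumptions, not part of any existing ontology or EHR tool.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeReason(Enum):
    REALITY_CHANGED = 1        # change in the underlying reality
    UNDERSTANDING_CHANGED = 2  # change in our scientific understanding
    RELEVANCE_REASSESSED = 3   # reassessment of what is relevant for inclusion
    MISTAKE_CORRECTED = 4      # correction of an encoding or data-entry mistake


@dataclass(frozen=True)
class Change:
    entry_id: str
    action: str  # "add" or "remove"
    reason: ChangeReason


def record_change(log, entry_id, action, reason):
    """Append a change to the log, rejecting changes without an explicit reason."""
    if not isinstance(reason, ChangeReason):
        raise ValueError("every change needs an explicit reason")
    change = Change(entry_id, action, reason)
    log.append(change)
    return change


log = []
record_change(log, "entry-42", "remove", ChangeReason.MISTAKE_CORRECTED)
print(log[0].reason.name)
```

With the reason stored next to each change, the removal of an entry can later be read as a corrected mistake and not as something that ceased to exist.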
• Main role of:
– Terminologies: standardise language use
– Ontologies: represent what is generic in reality
– EHR: document what is specifically related to particulars
(patients directly, (sub)populations indirectly)
• Role of terminologies in the context of the EHR:
– Make the documentation intelligible to humans other than those
who entered the data
• Role of ontologies in the context of the EHR:
– Ensure that the regimentation imposed by the EHR system does
not interfere with the re-usability of the data for a variety of
purposes, other than patient documentation.