slides - Vrije Universiteit Brussel

R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
VUB Leerstoel 2009-2010
Theme: Ontology for Ontologies, theory and applications
Ontologies in healthcare and the vision of
personalized medicine
May 19, 2010; 17h00-19h00
Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels
Room D2.01
Prof. Werner CEUSTERS, MD
Ontology Research Group, Center of Excellence in Bioinformatics and Life Sciences
and
Department of Psychiatry, University at Buffalo, NY, USA
Context of this lecture series
[Diagram of the fields surrounding Ontology: Knowledge Representation, Informatics, Linguistics, Computational Linguistics, Medical Natural Language Understanding, Electronic Health Records, Translational Research, Medicine, Biology, Philosophy, Realism-Based Ontology, Referent Tracking, Pharmacogenomics, Pharmacology, Performing Arts, Defense & Intelligence]
Today’s topic
• May 19: ontologies in healthcare and the vision of personalized medicine
– Open Biomedical Ontologies Foundry
– Example ontologies for eHealth
– An ontologist’s view on data models
[Diagram, fields shown: Electronic Health Records, Translational Research, Medicine, Biology, Pharmacogenomics, Pharmacology, Realism-Based Ontology, Referent Tracking]
Data, information and what it is about
A general belief:
Better information → Better care
‘Information’ versus ‘informing’
Better information → Being better informed → Better care
Being better informed
• Concerns primarily the delivery of information:
– Timely,
– Where required (e.g. bed-side computing),
– What is permitted,
– What is needed.
• Involves:
– Connecting systems,
– Making systems interoperable:
• Syntactically,
• Semantically.
[Slide annotations: “pretty well covered”; “long way to go”]
Today’s data generation and use
[Cycle diagram; labels: observation & measurement, data organization, model development, application, use, outcome, Δ, =, add, verify, generic beliefs, further R&D (instrument and study optimization)]
Example 1: clinician
[Cycle diagram; labels: observation & measurement, data organization, diagnosis, treatment, use, outcome, Δ, =, add, verify, generic beliefs]
Example 2: researcher
[Cycle diagram; labels: observation & measurement, data organization, hypothesis, use, outcome, Δ, =, add, verify, generic beliefs, further R&D (instrument and study optimization)]
Example 3: device manufacturer / supplier
[Cycle diagram; labels: observation & measurement, data organization, model development, application, use, outcome, Δ, =, add, verify, generic beliefs, further R&D (instrument and study optimization)]
Slightly different: payer / health plan
[Cycle diagram; labels: data organization, model development, use, outcome, Δ, =, add, verify, $, window dressing, generic beliefs]
“Better Information” must cover …
Patient-specific information
• EHR-EMR-ENR-…
• PHR
• Various modality-related databases
– Lab, imaging, …
Scientific “knowledge”
• Textbooks
• Classification systems
• Terminologies
• Ontologies
Means to structure the available information
Key question: on what should the structure be based ?
What is the structure based on ? (1)
• Classification systems: on ‘properties’ of patients which are of importance for the purposes for which the system has been designed
http://www.who.int/classifications/en/
What is the structure based on ? (2)
• Terminologies:
– on ‘concepts’
• But terminologists fail to give a good answer on what a concept is
What is the structure based on ? (3)
• Ontologies (mainstream view):
– on ‘concepts’
• when designed by terminologists
– on ‘classes’
• when designed by software engineers and computer scientists
– a class is a construct that is used as a blueprint to create objects of that class. ?
– a class is a cohesive package that consists of a particular kind of metadata. ??
– a class usually represents a noun, such as a person ???
http://en.wikipedia.org/wiki/Class_(computer_science)
Patients become victims of bad IT design
• ‘The Data Model That Nearly Killed Me’:
– Joe Bugajski, http://tiny.cc/S1HWo
• “If data cannot be made reliably available across
silos in a single EHR, then this data cannot be
made reliably available to a huge, heterogeneous
collection of networked systems.”
• ‘Are Health IT designers, testers and
purchasers trying to kill people?’
– Scot M. Silverstein, http://tiny.cc/CKIW1
Our view: on realism-based ontology !
• In philosophy:
– Ontology (no plural) is the study of what entities exist and how they
relate to each other;
• In computer science and (biomedical) informatics applications:
– An ontology (plural: ontologies) is a shared and agreed upon
conceptualization of a domain;
• Our ‘realist’ view within the Ontology Research Group
combines the two:
– We use realism, a specific theory of ontology, as the basis for
building high quality ontologies, using reality as benchmark.
OBO and OBO-Foundry
A reaction to inadequacy
US National Centers for Biomedical Computing
OBO Website
The OBO Foundry
• a family of interoperable biomedical reference
ontologies built around the Gene Ontology at its
core and using the same principles
• a modular annotation catalogue of English phrases
• each module created by experts from the
corresponding scientific community
• http://obofoundry.org
OBO Foundry principles (1)
• The ontology must be open and available to be used by all
without any constraint other than
– (a) its origin must be acknowledged and
– (b) it is not to be altered and subsequently redistributed under
the original name or with the same identifiers.
• The ontology is in, or can be expressed in, a common
shared syntax. This may be either the OBO syntax,
extensions of this syntax, or OWL.
• Each Foundry ontology should be built on the basis of
BFO top-level distinctions
– cave: OWL-DL is not capable of representing all BFO aspects
OBO Foundry principles (2)
• The ontologies have a unique identifier space within the
OBO Foundry.
• The source of a representational unit (RU) from any
ontology can be immediately identified by the prefix of
the identifier of each term.
• The ontology provider has procedures for identifying
distinct successive versions.
• The ontology has a clearly specified and clearly
delineated content.
– The ontology must be orthogonal to other ontologies already
lodged within OBO.
– community acceptance of a single ontology for one domain
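The identifier principle above can be illustrated with a minimal sketch. It assumes identifiers written as `PREFIX:local-id` (for example `GO:0008150`); that syntax and the function name `source_ontology` are assumptions made for illustration, not something stated on the slide.

```python
def source_ontology(identifier: str) -> str:
    """Return the prefix (identifier space) of an identifier such as 'GO:0008150'.

    Assumes the 'PREFIX:local-id' shape; anything else is rejected.
    """
    prefix, separator, local_id = identifier.partition(":")
    if not separator or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {identifier!r}")
    return prefix


if __name__ == "__main__":
    # The source ontology is read off the identifier alone, without a lookup.
    print(source_ontology("GO:0008150"))
```

Because each ontology owns a unique identifier space, the prefix alone is enough to tell where a representational unit comes from.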
OBO Foundry principles (3)
• The ontologies include textual definitions for all RUs.
– RUs should be defined so that their precise meaning within the
context of a particular ontology is clear to a human reader.
– Textual definitions will use the genus-species form: An A =def. a
B which Cs, where B is the parent of the defined term A and C is
the defining characteristic of those Bs which are also As.
• The ontology uses relations which are unambiguously
defined following the pattern of definitions laid down in
the OBO Relation Ontology.
• The ontology is well documented.
• The ontology has a plurality of independent users.
• The ontology will be developed collaboratively with other
OBO Foundry members.
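The genus-species form "An A =def. a B which Cs" can be sketched as a small data structure. The class and field names below are hypothetical; the example term reuses the definition of 'pathological process' given later in this lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenusSpeciesDefinition:
    term: str         # A, the defined term
    parent: str       # B, the is_a parent of A
    differentia: str  # C, what marks out those Bs which are also As

    def render(self) -> str:
        """Render the definition in the 'A =def. a B which Cs' pattern."""
        return f"{self.term} =def. a {self.parent} which {self.differentia}"


pathological_process = GenusSpeciesDefinition(
    term="pathological process",
    parent="bodily process",
    differentia="is a manifestation of a disorder and is clinically abnormal",
)

if __name__ == "__main__":
    print(pathological_process.render())
```

Keeping parent and differentia as separate fields makes it possible to check that the parent named in the textual definition agrees with the asserted is_a parent.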
OBO Foundry principles (4)
• Single is_a inheritance: ontologies will distinguish
a backbone ('asserted') is_a hierarchy subject to
the principle of single inheritance (each term in the
ontology has maximally one is_a parent in this
asserted hierarchy).
• Instantiability: RUs in an ontology should
correspond to instances in reality.
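A single-inheritance check on the asserted is_a backbone could look roughly like the sketch below. The edge list is a toy example using terms from this lecture, and the second parent 'liver disease' is invented purely to trigger a violation.

```python
from collections import defaultdict


def multiple_parent_terms(asserted_is_a):
    """Return the terms that have more than one asserted is_a parent.

    asserted_is_a is an iterable of (child, parent) pairs.
    """
    parents = defaultdict(set)
    for child, parent in asserted_is_a:
        parents[child].add(parent)
    return {term: sorted(ps) for term, ps in parents.items() if len(ps) > 1}


# Toy backbone (assumption): each term has at most one asserted parent.
backbone = [
    ("cirrhosis", "disease"),
    ("disease", "disposition"),
    ("disposition", "realizable entity"),
]

if __name__ == "__main__":
    print(multiple_parent_terms(backbone))
    # Adding a hypothetical second parent breaks single inheritance.
    print(multiple_parent_terms(backbone + [("cirrhosis", "liver disease")]))
```

Only the asserted hierarchy is checked here; inferred is_a links produced by a reasoner are outside the scope of the principle.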
Idea grew out of the Gene Ontology
what cellular component?
what molecular function?
what biological process?
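OBO Foundry ontologies in BFO-dress
[Table: columns give the relation to time (Continuant, split into Independent and Dependent; Occurrent), rows give the granularity]
• Organ and organism
– Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– Dependent continuant: Organ Function (FMP, CPRO)
• Cell and cellular component
– Independent continuant: Cell (CL); Cellular Component (FMA, GO)
– Dependent continuant: Cellular Function (GO)
• Molecule
– Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– Dependent continuant: Molecular Function (GO)
– Occurrent: Molecular Process (GO)
• Also shown: Phenotypic Quality (PaTO) as dependent continuant; Biological Process (GO) as occurrent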
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of biomedical entities | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
http://ontologist.com
Ontology of General Medical Science
First ontology in which the
L1/L2/L3 distinction is used
Remember Basic Formal Ontology
• The world consists of
– entities that are
• Either particulars or universals;
• Either occurrents or continuants;
• Either dependent or independent;
and,
– relationships between these entities of the form
• <particular, universal>, e.g. is-instance-of,
• <particular, particular>, e.g. is-member-of,
• <universal, universal>, e.g. isa (is-subtype-of)
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and
Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
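The three relationship forms listed on this slide amount to type signatures for relations. A minimal sketch, assuming only the three example relations named on the slide (the enum and function names are hypothetical):

```python
from enum import Enum


class Kind(Enum):
    PARTICULAR = "particular"
    UNIVERSAL = "universal"


# Signature of each example relation: (kind of first term, kind of second term).
SIGNATURES = {
    "is-instance-of": (Kind.PARTICULAR, Kind.UNIVERSAL),
    "is-member-of": (Kind.PARTICULAR, Kind.PARTICULAR),
    "isa": (Kind.UNIVERSAL, Kind.UNIVERSAL),
}


def well_typed(relation: str, first: Kind, second: Kind) -> bool:
    """True if the relation is applied to terms of the kinds its signature requires."""
    return SIGNATURES[relation] == (first, second)


if __name__ == "__main__":
    # A particular patient is an instance of a universal: well typed.
    print(well_typed("is-instance-of", Kind.PARTICULAR, Kind.UNIVERSAL))
    # 'isa' between a particular and a universal is a category mistake.
    print(well_typed("isa", Kind.PARTICULAR, Kind.UNIVERSAL))
```

A check of this kind rejects statements such as "this patient isa human being", where instance-of is meant.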
Basic BFO distinctions
[Diagram: at the level of universals, Continuant, Independent Continuant (~ thing), Dependent Continuant (quality) and Occurrent (process, event), linked by isa, has_participant and inheres_in; particulars relate to universals by instance_of (at t)]
BFO Top-Level Ontology (partial)
[Diagram, read as a hierarchy:]
Continuant: Independent Continuant; Spatial Region; Dependent Continuant
Dependent Continuant: SDC (Quality; Realizable: Role, Disposition, Function); GDC (Information Content Entity)
Occurrent (always dependent on one or more independent continuants): Process (Functioning); Temporal Region
Three levels of reality in Realist Ontology
(3) Representational units in various forms about (1), (2) or (3)
(2) Cognitive entities which are our beliefs about (1)
(1) Entities with objective existence which are not about anything
[Diagram labels: Terminology; Representation and Reference; representational units; communicative units; cognitive units; universals; particulars; First Order Reality]
Universals and Defined Classes
[Diagram: the universal HUMAN BEING, its extension E (all human beings at t), the defined class DC-x (patients at t) and the particular I-y, linked by instance_of at t, extension_of at t, subclass_of at t and class_member_of at t; the universal side is labelled “Unconstrained reasoning”, the class side “OWL-DL reasoning”]
Goal of OGMS
• To be a consistent, logical and extensible framework (ontology) for the representation of
– features of disease
– clinical processes
– results
Motivation
• Clarity about:
– disease etiology and progression
– disease and the diagnostic process
– phenotype and signs/symptoms
Approach
• a disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.
[Diagram: etiological process produces disorder; disorder bears disposition; disposition realized_in pathological process; pathological process produces abnormal bodily features; abnormal bodily features recognized_as signs & symptoms; signs & symptoms participates_in interpretive process; interpretive process produces diagnosis]
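The chain from etiological process to diagnosis in the Approach diagram can be written down as subject-relation-object triples. This is only a sketch: the ordering of the links is a reading of the diagram (it matches the worked disease examples in this lecture), and the helper name `path` is hypothetical.

```python
# Triples read off the Approach diagram (ordering is an assumption).
OGMS_CHAIN = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition (disease)"),
    ("disposition (disease)", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "participates_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def path(start, triples):
    """Follow the chain from start and return the entities visited, in order."""
    next_entity = {subject: obj for subject, _, obj in triples}
    visited = [start]
    while visited[-1] in next_entity and next_entity[visited[-1]] not in visited:
        visited.append(next_entity[visited[-1]])
    return visited


if __name__ == "__main__":
    for entity in path("etiological process", OGMS_CHAIN):
        print(entity)
```

Starting the walk at 'disorder' instead shows that the disease, as a disposition, sits between the physical disorder and the pathological processes that realize it.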
Cirrhosis - environmental exposure
• Etiological process - phenobarbital-induced hepatic cell death
– produces
• Disorder - necrotic liver
– bears
• Disposition (disease) - cirrhosis
– realized_in
• Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - fatigue, anorexia
• Signs - jaundice, splenomegaly
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out cirrhosis
– suggests
• Laboratory tests
– produces
• Test results - documentation of elevated liver enzymes in serum
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease cirrhosis
Influenza - infectious
• Etiological process - infection of airway epithelial cells with influenza virus
– produces
• Disorder - viable cells with influenza virus
– bears
• Disposition (disease) - flu
– realized_in
• Pathological process - acute inflammation
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - weakness, dizziness
• Signs - fever
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out influenza
– suggests
• Laboratory tests
– produces
• Test results - documentation of elevated serum antibody titers
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease flu
But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).
Huntington’s Disease - genetic
• Etiological process - inheritance of >39 CAG repeats in the HTT gene
– produces
• Disorder - chromosome 4 with abnormal mHTT
– bears
• Disposition (disease) - Huntington’s disease
– realized_in
• Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - anxiety, depression
• Signs - difficulties in speaking and swallowing
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out Huntington’s
– suggests
• Laboratory tests
– produces
• Test results - molecular detection of the HTT gene with >39 CAG repeats
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease
HNPCC - genetic pre-disposition
hereditary non-polyposis colorectal cancer
• Etiological process - inheritance of a mutant mismatch repair gene
– produces
• Disorder - chromosome 3 with abnormal hMLH1
– bears
• Disposition (disease) - Lynch syndrome
– realized_in
• Pathological process - abnormal repair of DNA mismatches
– produces
• Disorder - mutations in proto-oncogenes and tumor suppressor genes
with microsatellite repeats (e.g. TGF-beta R2)
– bears
• Disposition (disease) – to acquire non-polyposis colon cancer
Big Picture
Primitive Terms
• ‘bodily feature’ – may denote a physical component, a
bodily quality, or a bodily process.
There are way more sorts of classes than universals
[Diagram: the universals Quality, Independent continuant and Process, with Fever, Edema, Rash and Tremor attached to them by isa; extension_of links universals to classes; one group of classes is marked “do not have corresponding universals”. Class labels shown: bodily features, fevers, edemas, rashes, tremors, infectious fevers, infectious rashes, allergic rashes, orthostatic tremors, signs of infectious disease, signs of Graves’ disease]
Primitive Terms
• clinically abnormal - some bodily feature that
– (1) is not part of the life plan for an organism of the
relevant type (unlike aging or pregnancy),
– (2) is causally linked to an elevated risk either of pain
or other feelings of illness, or of death or dysfunction,
and
– (3) is such that the elevated risk exceeds a certain
threshold level.
Disposition
• A disposition is a realizable entity which is such that
– (1) if it ceases to exist, then its bearer is physically changed, and
– (2) its realization occurs, in virtue of the bearer’s physical make-up, when this bearer is in some special physical circumstances
Foundational Terms
• Disorder =def. – A causally linked combination of
physical components that is (a) clinically abnormal and
(b) maximal, in the sense that it is not a part of some
larger such combination.
• Pathological Process =def. – A bodily process that is a
manifestation of a disorder and is clinically abnormal.
• Disease =def. – A disposition (i) to undergo pathological
processes that (ii) exists in an organism because of one or
more disorders in that organism.
Dispositions and Predispositions
• All diseases are dispositions; not all dispositions are diseases.
• A predisposition is a disposition.
• Predisposition to Disease of Type X =def. – A disposition in an
organism that constitutes an increased risk of the organism’s
subsequently developing the disease X.
• HNPCC is caused by a
– disorder (mutation) in a DNA mismatch repair gene that
– disposes to the acquisition of additional mutations from defective DNA repair processes, and thus
– constitutes a predisposition to the development of colon cancer.
Etiology
• Etiological Process =def. – A process in an organism that
leads to a subsequent disorder.
• Examples:
– toxic chemical exposure resulting in a mutation in the genomic DNA
of a cell;
– infection of a human with a pathogenic virus;
– inheritance of two defective copies of a metabolic gene
• The etiological process creates the physical basis of that
disposition to pathological processes which is the disease.
Evaluation related
• Sign =def. – A bodily feature of a patient that is observed in a physical
examination and is deemed by the clinician to be of clinical significance.
(Objectively observable features)
• Symptom =def. – A bodily feature of a patient that is observed by the
patient and is hypothesized by the patient to be a realization of a disease.
(a restricted family of phenomena (including pain, nausea, anger,
drowsiness), which are of their nature experienced in the first person)
• Laboratory Test =def. – A measurement assay that has as input a patient-derived specimen, and as output a result representing a quality of the specimen.
• Laboratory Finding =def. – A representation of a quality of a specimen
that is the output of a laboratory test and that can support an inference to
an assertion about some quality of the patient.
Definitions - Qualities
• Manifestation of a Disease =def. – A bodily feature of a patient that is (a) a
deviation from clinical normality that exists in virtue of the realization of a
disease and (b) is observable.
– Observability includes observable through elicitation of response or through
the use of special instruments.
• Preclinical Manifestation of a Disease =def. – A manifestation of a disease that
exists prior to its becoming detectable in a clinical history taking or physical
examination.
• Clinical Manifestation of a Disease =def. – A manifestation of a disease that is
detectable in a clinical history taking or physical examination.
• Phenotype =def. – A (combination of) bodily feature(s) of an organism
determined by the interaction of its genetic make-up and environment.
• Clinical Phenotype =def. – A clinically abnormal phenotype.
Diagnosis
• Clinical Picture =def. – A representation of a
clinical phenotype that is inferred from the
combination of laboratory, image and clinical
findings about a given patient.
• Diagnosis =def. – A conclusion of an interpretive
process that has as input a clinical picture of a
given patient and as output an assertion to the
effect that the patient has a disease of such and
such a type.
A well-formed diagnosis of ‘pneumococcal pneumonia’
• A configuration of representational units;
• Believed to mirror the person’s disease;
• Believed to mirror the disease’s cause;
• Refers to the universal of which the disease is believed to be an instance.
[Diagram: #56 (John’s Pneumonia) is Instance-of at t1 Pneumococcal pneumonia, which isa Disease; #56 is caused by #78 (John’s portion of pneumococs)]
Some motivations and consequences (1)
• No use of debatable or ambiguous notions such as
proposition, statement, assertion, fact, ...
• The same diagnosis can be expressed in various
forms.
[Diagram: alternative configurations expressing the same diagnosis, built from the universals Disease, Pneumonia, Pneumococcal pneumonia and Portion of pneumococs, the particulars #56 and #78, and the relations isa, Instance-of at t1 and caused by]
Some motivations and consequences (2)
• A diagnosis can be of level 2 or level 3, i.e. either
in the mind of a cognitive agent, or in some
physical form.
• Allows for a clean interpretation of assertions of the sort ‘these patients have the same diagnosis’:
– The configuration of representational units is such that the parts which do not refer to the particulars related to the respective patients, refer to the same portion of reality.
Distinct but similar diagnoses
[Diagram: #56 (John’s Pneumonia) is Instance-of at t1, and #956 (Bob’s pneumonia) is Instance-of at t2, of Pneumococcal pneumonia; #56 is caused by #78 (John’s portion of pneumococs) and #956 is caused by #2087 (Bob’s portion of pneumococs)]
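The idea that two patients "have the same diagnosis" when the non-particular parts of the two configurations refer to the same portion of reality can be sketched as follows. This is an illustrative simplification, not the Referent Tracking implementation: identifiers starting with `#` are taken to denote particulars, time indexes (t1, t2) are left out, and the triples are compared in the order given.

```python
def universal_part(configuration):
    """Replace each particular identifier ('#...') by a positional placeholder."""
    placeholders = {}
    abstracted = []
    for subject, relation, obj in configuration:
        terms = []
        for term in (subject, obj):
            if term.startswith("#"):
                # The first particular seen becomes ?p0, the next ?p1, and so on.
                term = placeholders.setdefault(term, f"?p{len(placeholders)}")
            terms.append(term)
        abstracted.append((terms[0], relation, terms[1]))
    return abstracted


def same_diagnosis(a, b):
    """True if two configurations differ only in the particulars they refer to."""
    return universal_part(a) == universal_part(b)


# Identifiers as used on the slides; the relation spellings are assumptions.
john = [("#56", "instance-of", "Pneumococcal pneumonia"), ("#56", "caused-by", "#78")]
bob = [("#956", "instance-of", "Pneumococcal pneumonia"), ("#956", "caused-by", "#2087")]

if __name__ == "__main__":
    print(same_diagnosis(john, bob))
```

John’s and Bob’s diagnoses are distinct configurations about distinct particulars, yet they come out as the same diagnosis because both refer to the universal Pneumococcal pneumonia in the same way.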
Some motivations and consequences (3)
• Allows equally clean interpretations for the wealth of ‘modified’ diagnoses:
– With respect to the author of the representation:
• ‘nursing diagnosis’, ‘referral diagnosis’
– When created:
• ‘post-operative diagnosis’, ‘admitting diagnosis’, ‘final
diagnosis’
– Degree of the belief:
• ‘uncertain diagnosis’, ‘preliminary diagnosis’
ReMINE Adverse Event Ontology
ReMINE Project
To Err is Human: Building a Safer Health
System
• 2000 Institute of Medicine Report:
• estimated 98,000 deaths a year caused by adverse
events. Other studies indicate that this is an
underestimation. Since then various agencies
started to fund projects to improve the quality of
healthcare; many devoted to detecting and
reporting adverse events. Need for a common
ontology.
What is an adverse event ?
• We asked Google
• We asked the experts
• And as so often in biomedical terminology, …
• … we obtained many distinct and mutually
incompatible answers
• … creating a silo problem
What is an adverse event ?
• a reaction …
• an effect …
• an event …
• a problem …
• an experience …
• an injury …
• a symptom …
• an illness …
• an occurrence …
• a change …
and also:
• something …
• an act …
• an observation …
as well as …
• a term !!!
The view of some experts
D4 (BRIDG): an observation of a change in the state of a subject assessed as being untoward by one or more interested parties within the context of a protocol-driven research or public health.
D5 (IOM): an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient
D6 (NCI): any unfavorable and unintended sign (including an abnormal laboratory finding), symptom, or disease temporally associated with the use of a medical treatment or procedure that may or may not be considered related to the medical treatment or procedure
D7 (CDISC): any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have to have a causal relationship with this treatment
D8 (JTC): an untoward, undesirable, and usually unanticipated event, such as death of a patient, an employee, or a visitor in a health care organization. Incidents such as patient falls or improper administration of medications are also considered adverse events even if there is no permanent effect on the patient.
D9 (QUIC): an injury that was caused by medical management and that results in measurable disability.
Clearly, confusion reigns …
The question “What are adverse events?” cannot be answered directly, but needs to be reformulated as “What might the author of a particular sentence containing the phrase ‘adverse event’ be referring to when he uses that phrase?”.
At least one argument
• There is no single entity which each of these authors
would be able to point to and say, faithfully and honestly,
– “that is an observation” (definition D4),
– “that is an injury” (definition D9),
– “that is a laboratory finding” (definition D6).
• Clearly,
– nothing which is an injury can be a laboratory finding, although,
of course, laboratory findings can aid in diagnosing an injury.
– nothing which is a laboratory finding, can be an observation,
although, of course, some observation must have been made if
we are to arrive at a laboratory finding.
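The argument can be made concrete by tabulating, for each definition D4 to D9 quoted in this lecture, the sort of entity it says an adverse event is. The genus labels below are a shorthand reading of those definitions and are an assumption of this sketch, as is the helper name.

```python
from collections import defaultdict

# Genus (the "an adverse event is a ..." part) of each definition, paraphrased.
GENUS = {
    "D4": "observation",
    "D5": "event",
    "D6": "sign, symptom or disease",
    "D7": "medical occurrence",
    "D8": "event",
    "D9": "injury",
}


def group_by_genus(genus_of):
    """Group definition identifiers by the sort of entity they name."""
    groups = defaultdict(list)
    for definition_id, genus in sorted(genus_of.items()):
        groups[genus].append(definition_id)
    return dict(groups)


if __name__ == "__main__":
    for genus, definitions in group_by_genus(GENUS).items():
        print(f"{genus}: {', '.join(definitions)}")
```

More than one group means that the definitions cannot all be pointing at a single sort of entity, which is the point the bullets above make for D4, D6 and D9.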
Current approaches to bringing clarity
• Building a consensus definition (and rejecting the
others):
– e.g. BRIDG (an observation of a change in a subject )
– the other definitions do not disappear and will still be
used
• Building an ontology of all of those things
relevant to understanding any given use of
‘adverse event’
– Done thus far, unfortunately, by using very weak
principles underlying ‘concept’-orientation, e.g.
• allowing ‘age’ and ‘gender’ to be subclasses of ‘patient’,
• not allowing adequate treatment of temporal sequence
Our research questions
• Can realism-based ontology be of value
– to identify the different sorts of entities that can be
denoted by the term ‘adverse event’ ?
– to establish how these entities relate to each other and
to use these relations to identify to what extent the
various definitions overlap ?
– to describe the portion of reality that is covered by all
entities denoted by the terms that appear in the various
definitions for ‘adverse event’ ?
Basic hypothesis
– all the authors of the mentioned definitions use the
term ‘adverse event’ in contexts which look quite
similar
– in each of them, more or less the same sorts of entities
seem to be involved
• … there is some common ground (some portion of
reality) which is such that the entities within it can be
used as referents for the various meanings of ‘adverse
event’.
Study design
• Goal:
– to bring clarity into the wilderness that grew from current efforts.
• Methods:
– analyze the literature and collect all relevant definitions.
– study a variety of relevant classification systems, taxonomies,
terminologies and concept-based ontologies,
– apply the realism-based principles advocated in
• Basic Formal Ontology (BFO)
• Referent Tracking (RT)
to build a representation for the relevant portion of reality
– assess whether the representation covers what is (or might be) expressed in the various definitions.
Ontology development in ReMINE
[Figure: three kinds of ReMINE artefacts; the taxonomies support annotation, the ontologies support reasoning.]
• ReMINE Taxonomies: description of specific adverse event domains (childbirth, patient transfer, ..) as cognized by human beings
• ReMINE Adverse Event Domain Ontology (higher order logic): realism-based, purpose independent representation of the portion of reality described in the taxonomies
• ReMINE Application Ontologies (description logic): purpose dependent reformulations of the parts of RAEDO which are relevant for a specific domain
[Figure: ReMINE Taxonomy, Annotated Events, Risk Manager’s Event Administration System]
The cognitive model underlying the taxonomies
[Figure: concept diagram.
Nodes: Risky Parameters, Situation, SHEL entity (Software, Hardware, Environment, Liveware), Time Interval, Contributing Factor, Mitigation Factor, Adverse_Event, Incident Type, Process, Problem, Patient, Primary Diagnosis, Impact on Patient, Impact on Organization.
Relations: has, has_quality, has_role, occurs_in, occurs_during, causes, prevents, results_in.]
ReMINE’s notion of adverse event
1. an ‘incident [that] occurred during the past
and [is] documented in a database of adverse
events’
– Stefano Arici, Paolo Bertele. ReMINE Deliverable D4.1 –
RAPS Taxonomy: approach and definition. V1.0 (Final)
August 8, 2008. (p21)
… which is a ‘perdurant’ - ibidem (p26)
… ‘that occurs to a patient’ - ibidem (p23)
2. an expectation of some future happening that
can be prevented - ibidem (p23)
Terminologists agree, ontologists think …
• Can something which is an incident be at the same
time an expectation ?
• Can something which is an incident at time t later
become an adverse event simply because it [?] has
been entered in a database ?
• Can adverse events really occur in software ?
• …
Intermediate conclusion
• The ReMINE taxonomy (and all concept-based terminologies and ‘ontologies’ in general)
provides a distorted view of reality.
• For good reasons: the distortion is such that
– it reflects a pragmatic view on what is relevant for the purposes
for which it is designed,
– it does away with complexities that do not help human beings in
doing a better job.
• But with some negative consequences:
– reusability out of the ReMINE context is hampered,
– integration with other descriptive systems becomes
cumbersome, and
– advanced reasoning turns out to be impossible.
[Figure: three levels linked by arrows labelled ‘interpretation’, ‘documentation’ and ‘guidance’; the figure also shows a Risk Management Ontology.]
• Level 1: primary care processes, secondary processes (management, research), patients, clinicians, drugs, disorders, …
• Level 2: diagnoses, interpretations, hypotheses, risk assessments, …
• Level 3: patient documentation, protocols, guidelines, event reports, scientific literature, …
Using the 3 levels and the particular/universal/class distinctions
• Level 1:
– #1: an incident that happened in the past;
• Level 2:
– #2: the interpretation by some cognitive agent that #1 is an
adverse event;
– #3: the expectation by some cognitive agent that similar
incidents might happen in the future;
• Level 3:
– #4: an entry in the adverse event database concerning #1;
– #5: an entry in some other system about #3 for mitigation or
prevention purposes.
Allows appropriate error management
• Some possibilities:
1. #1 with unjustified absence of #2:
• #1 was not perceived at all, or not assessed as being an
adverse event
2. Unjustified presence of #2:
• There was no #1 at all, or #1 was not an adverse event
3. Unjustified absence of #4
• Same reasons as under (1) above
• Justified presence of #2 but not reported in the database
– …
Ceusters W, Smith B. A Realism-Based Approach to the Evolution of Biomedical Ontologies.
Proceedings of AMIA 2006, Washington DC, 2006:121-125.
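The error cases listed above can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the `was_adverse` flag assumes a ground truth that, in practice, is established only by later review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Incident:
    """Level 1 entity (#1): something that happened in reality."""
    iui: str
    was_adverse: bool  # assumed ground truth, hypothetical for this sketch


def find_errors(incident: Optional[Incident],
                judged_adverse: bool,
                reported: bool) -> List[str]:
    """Compare reality (Level 1) with belief (#2) and database entry (#4)."""
    really_adverse = incident is not None and incident.was_adverse
    errors = []
    if really_adverse and not judged_adverse:
        errors.append("unjustified absence of #2")
    if judged_adverse and not really_adverse:
        errors.append("unjustified presence of #2")
    if really_adverse and not reported:
        errors.append("unjustified absence of #4")
    return errors
```

Keeping the three levels as separate inputs is what makes the mismatches detectable at all; a model that merges incident, belief and entry into one record cannot express them.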
Part of the ReMINE Domain Ontology
Higher order logical representation
• an incident (#1) that happened at time t2 to a patient (#2)
after some intervention (#3 at t1)
• is judged at t3 to be an adverse event, thereby giving rise
to a belief (#4) about #1 on the part of some person (#5, a
caregiver as of time t6).
• This requires the introduction (at t4) of an entry (#6) in
the adverse event database (#7, installed at t0).
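As a rough sketch, the scenario above can be written down as time-indexed assertions about numbered particulars. The tuple layout and the relation names (`instance_of`, `happens_to`, `is_about`, ...) are assumptions made for illustration; they are not the notation of the ReMINE ontology or of Referent Tracking.

```python
from typing import List, NamedTuple


class Assertion(NamedTuple):
    subject: str   # identifier of a particular, e.g. "#1"
    relation: str  # hypothetical relation name
    obj: str       # another particular, or the name of a universal
    time: str      # time at which the assertion holds


ASSERTIONS: List[Assertion] = [
    Assertion("#1", "instance_of", "incident", "t2"),
    Assertion("#2", "instance_of", "patient", "t2"),
    Assertion("#1", "happens_to", "#2", "t2"),
    Assertion("#3", "instance_of", "intervention", "t1"),
    Assertion("#1", "occurs_after", "#3", "t2"),
    Assertion("#4", "instance_of", "belief", "t3"),
    Assertion("#4", "is_about", "#1", "t3"),
    Assertion("#4", "held_by", "#5", "t3"),
    Assertion("#5", "instance_of", "caregiver", "t6"),
    Assertion("#6", "instance_of", "database entry", "t4"),
    Assertion("#6", "part_of", "#7", "t4"),
    Assertion("#7", "instance_of", "adverse event database", "t0"),
]


def involving(identifier: str) -> List[Assertion]:
    """Return every assertion in which the given particular takes part."""
    return [a for a in ASSERTIONS if identifier in (a.subject, a.obj)]
```

For example, `involving("#1")` gathers what is asserted about the incident itself, separately from the belief (#4) and the database entry (#6).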
Back-linking of the ontology to the taxonomies
• ‘ReM:Insufficient illumination’ is a ReMINE
term representing a defined class whose
members are all instances of the universal
represented by the ReMINE term
‘ReM:illumination’, that universal enjoying an
isa relation with the universal represented by
the BFO term ‘BFO:Quality’
• ‘ReM:international guideline’ is a ReMINE
term representing a defined class whose
members are all instances of the universal
represented by the UCore-SL term
‘UCore:Plan’, that universal enjoying an isa
relation with the universal represented by the
BFO term ‘BFO:InformationContentEntity’
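The two back-links above can be held in two small lookup tables, one for defined classes and one for isa links. The sketch below uses only the terms named on this slide; the function name and the table layout are hypothetical.

```python
from typing import Dict, Optional

# Defined class -> universal whose instances are its members (from the slide).
DEFINED_CLASS_OF: Dict[str, str] = {
    "ReM:Insufficient illumination": "ReM:illumination",
    "ReM:international guideline": "UCore:Plan",
}

# Universal -> parent universal via isa (from the slide).
ISA: Dict[str, str] = {
    "ReM:illumination": "BFO:Quality",
    "UCore:Plan": "BFO:InformationContentEntity",
}


def bfo_anchor(term: str) -> Optional[str]:
    """Follow the defined-class link, then isa links, until a BFO term is found."""
    current = DEFINED_CLASS_OF.get(term, term)
    seen = set()
    while current not in seen:
        if current.startswith("BFO:"):
            return current
        seen.add(current)
        if current not in ISA:
            return None
        current = ISA[current]
    return None
```

A taxonomy term that returns `None` here has no realist anchor yet, which is one way to spot gaps in the back-linking.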
Advantages
• Synchronisation of two distinct representations of the
same reality:
– taxonomies:
• user-oriented view
• data annotation
– ontologies:
• realism-based view
• unconstrained reasoning
• Domain ontology compatible with OBO-Foundry
ontologies:
– no overlap,
– easier to re-use.
• Not only tracking of incidents, but also:
– how well individual clinicians and organizations manage
adverse events,
– how well one learns from past experiences.
Application in pharmacogenomics 101
Pharmacogenomics
• What is it ?
– ‘The branch of pharmacology which deals with the
influence of genetic variation on drug response in
patients by correlating gene expression or single-nucleotide polymorphisms with a drug's efficacy or
toxicity’.
– ‘Pharmacogenomics is the whole genome application
of pharmacogenetics, which examines the single gene
interactions with drugs’.
» http://en.wikipedia.org/wiki/Pharmacogenomics
Typical approach (1)
• Building a huge matrix with patient cases in one dimension
and patient characteristics in the other dimension
[Figure: empty matrix with cases (case1 to case6, ...) along one axis and characteristics (ch1 to ch6, ...) along the other.]
Typical approach (2)
• Use statistical correlation techniques to find associations
between characteristics and (dis)similarities between cases
[Figure: the same matrix of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...).]
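A minimal sketch of this step, assuming a tiny numeric matrix: all values below are invented toy data, and Pearson correlation stands in for whatever statistical technique a real study would choose.

```python
from math import sqrt
from typing import Dict, List

# Toy matrix: rows are cases, columns are characteristics (values invented).
MATRIX: Dict[str, Dict[str, float]] = {
    "case1": {"ch1": 1, "ch2": 1, "ch3": 0},
    "case2": {"ch1": 0, "ch2": 0, "ch3": 1},
    "case3": {"ch1": 1, "ch2": 1, "ch3": 0},
    "case4": {"ch1": 0, "ch2": 0, "ch3": 1},
}


def column(matrix: Dict[str, Dict[str, float]], name: str) -> List[float]:
    """Values of one characteristic, in a fixed case order."""
    return [matrix[case][name] for case in sorted(matrix)]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)
```

The computation treats `ch1`, `ch2` and `ch3` as opaque column labels; it has no way of knowing what, if anything, each label stands for in reality.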
Fundamental questions
1. What is a characteristic ?
2. What (sorts of) (pharmacogenomically relevant) characteristics go in here ?
[Figure: matrix of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...).]
3. How can we make distinct pharmacogenomic studies comparable?
4. Because such matrices tend to become huge, how can we make analysis feasible ?
5. How can we make results re-usable?
Q1: what is a characteristic ?
– it is for sure not a category entities can belong to: there
is no generic entity for which the name ‘characteristic’
would be appropriate on an exclusive basis;
– there is also no particular entity that you could point to
and state ‘that over there is the only existing
characteristic’
– thus: there are no characteristics, there is just the term
‘characteristic’ which is used to describe that some
entities are (acknowledged to be) in some way of
interest in some context and for some purpose.
This requires rephrasing Q2
What (sorts of) (pharmacogenomically relevant)
characteristics go in here?
What entities described as being characteristic for
pharmacogenomic purposes should be represented
here?
Examples
Universals, each followed by an example particular:
• Independent Continuant (a Continuant)
– portion of C19H17ClN2O4: the portion of Glifanan in the tablet in front of me
– human being: me
– gene: the HTR2A gene on chromosome 13 of the most frontal cell in the tip of my nose
• Dependent Continuant (a Continuant)
– shape: the shape of my nose
– temperature: the temperature of the Glifanan tablet in front of me
– length: the length of that HTR2A gene
• Occurrent
– change in shape: unfolding of a DNA molecule
– motion: the circulation of a Glifanan molecule in my bloodstream
– rise in temperature: the rise of my body temperature while teaching this seminar
Two distinct (?) sorts of relevant entities
[Figure: matrix of cases (case1 to case6, ...) against characteristics (ch1 to ch6), carrying the labels ‘phenotypic’ and ‘genotypic’.]
Genotype / Phenotype
Gene Ontology
genes
Human Phenotype ‘Ontology’
gene products
features
The Gene Ontology components
• Molecular Function = elemental activity/task
– the tasks performed by individual gene products; examples are
carbohydrate binding and ATPase activity
• Biological Process = biological goal or objective
– broad biological goals, such as mitosis or purine metabolism,
that are accomplished by ordered assemblies of molecular
functions
• Cellular Component = location or complex
– subcellular structures, locations, and macromolecular
complexes; examples include nucleus, telomere, and RNA
polymerase II holoenzyme
Application of good ontological principles
Human Phenotype ‘Ontology’
http://www.humanphenotypeontology.org/index.php/hpo_home.html
Q3: How can we make distinct pharmacogenomic
studies comparable?
• Map any characteristic used to relevant, standard
and high quality ontologies
[Figure: matrix of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...).]
The positive effects of appropriate mappings
[Figure: several matrices of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...), shown side by side.]
The positive effects of appropriate mappings
[Figure: several matrices of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...), shown side by side.]
• identification of ontological relations prior to statistical correlation:
– ch1 and ch4
– ch1 and ch5
– ch1 and ch2
– …
• Contributes to answering ‘Q4: how can we make analysis feasible’
– this method allows for data reduction without information loss.
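One simple case of such data reduction can be sketched as follows: when two characteristics map to the same ontology term, their columns can be merged. The term identifiers (`ONT:0001`, ...) and the mapping itself are made up for illustration; the slides do not specify which relations hold between which characteristics.

```python
from typing import Dict

# Hypothetical mapping of study-specific characteristics to ontology terms.
MAPPING: Dict[str, str] = {
    "ch1": "ONT:0001",
    "ch2": "ONT:0002",
    "ch4": "ONT:0001",  # same term as ch1, so the two columns are redundant
}


def reduce_matrix(rows: Dict[str, Dict[str, int]],
                  mapping: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Re-key every case by ontology term, merging columns that share a term.

    Raises ValueError if merged columns disagree, since merging would then
    lose information.
    """
    reduced: Dict[str, Dict[str, int]] = {}
    for case, values in rows.items():
        merged: Dict[str, int] = {}
        for characteristic, value in values.items():
            term = mapping[characteristic]
            if term in merged and merged[term] != value:
                raise ValueError(f"{case}: conflicting values for {term}")
            merged[term] = value
        reduced[case] = merged
    return reduced
```

Because the columns are now keyed by shared terms rather than by local labels, matrices from distinct studies that use the same ontologies can also be compared column by column.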
Filling the grid
• We know now that here go labels from appropriate
ontologies
• But, what goes here?
[Figure: matrix of cases (case1 to case6, ...) against characteristics (ch1 to ch6, ...), to which both remarks point.]
Remember we had this …
[Table repeated from the ‘Examples’ slide: universals and corresponding particulars for Independent Continuant, Dependent Continuant and Occurrent.]
Or after transposition …
[Table: the same examples transposed. The universals (portion of C19H17ClN2O4, human being, gene, shape, temperature, length, change in shape, motion, rise in temperature) form the columns, grouped under Independent Continuant, Dependent Continuant and Occurrent; the corresponding particulars are listed in a row beneath them.]
… and for many patients
[Figure: the transposed table extended to many patients. The universals remain the columns; the rows are case1, case2, …, case8, …; dots in the cells stand for the particulars of each case.]
Referent Tracking
[Figure: grid of universals (columns) against the particulars of case1 to case8, … (rows).]
• Universals: unique identification by means of ‘codes’
• Particulars: unique identification by means of ‘instance unique identifiers’
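A small sketch of what such a grid might hold, assuming that each cell lists the identifiers of the particulars involved rather than a bare value. The `IUI-n` format and the `register` helper are hypothetical; real instance unique identifiers would be issued by a referent tracking system.

```python
from itertools import count
from typing import Dict, List

_counter = count(1)

# case -> code of a universal -> identifiers of that case's particulars
GRID: Dict[str, Dict[str, List[str]]] = {}


def new_iui() -> str:
    """Issue a fresh identifier that is never reused (format is made up)."""
    return f"IUI-{next(_counter)}"


def register(case: str, universal_code: str) -> str:
    """Record that a new particular of the given universal belongs to a case."""
    iui = new_iui()
    GRID.setdefault(case, {}).setdefault(universal_code, []).append(iui)
    return iui
```

Two cases can share the code `gene`, yet each gene instance keeps its own identifier, so later statements about one of them cannot be confused with statements about the other.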
Data and Information Models
An ontological view
This holds also for data and information models
1
Patient-specific information
3
Scientific “knowledge”
2
• EHR-EMR-ENR-…
• PHR
• Various modality related
databases
– Lab, imaging, …
• Textbooks
• Classification systems
• Terminologies
• Ontologies
What is an Information Model?
• An information model is:
– ‘a representation of concepts, relationships,
constraints, rules, and operations to specify data
semantics for a chosen domain of discourse that satisfy
some industry need’.
• A ‘quality’ information model is:
– ‘an information model that is complete, sharable,
stable, extensible, well-structured, precise, and
unambiguous’.
Y. Tina Lee. Information Modeling: From Design To Implementation.
http://www.mel.nist.gov/msidlibrary/doc/tina99im.pdf
Why are there so many IM but no ‘quality’ IM?
• An information model is:
– ‘a representation of concepts, relationships,
constraints, rules, and operations to specify data
semantics for a chosen domain of discourse that satisfy
some industry need’.
• many domains,
• different needs within the same domain,
• selection of ‘concepts’, ‘relationships’, … relevant for the
needs.
can never be complete
Why are there so many?
Blobel B, Pharow P: Analysis and Evaluation of EHR Approaches. MIE 2008, 26-28 May 2008, Göteborg, Sweden
Why are so many incompatible?
• An information model is:
– ‘a representation of concepts, relationships,
constraints, rules, and operations to specify data
semantics for a chosen domain of discourse that satisfy
some industry need’.
• confusion about:
– what ‘concepts’ and ‘relationships’ are,
– whether a ‘domain of discourse’ is:
» what is or can be said, versus,
» that about what something is or can be said,
– ‘semantics’.
can never be unambiguous and precise
Two major problems in information modeling (1)
• Tyranny of the use case:
– ‘if most people wrongly believe that crocodiles are a
kind of mammal, then most people would find it
easier to locate information about crocodiles if it were
located in a mammals grouping, rather than where it
factually belonged’. (p89)
Huhns MN, Stephens LM. Semantic Bridging of Independent
Enterprise Ontologies. In: Kosanke K, ed. Enterprise Inter- and Intra-Organizational Integration: Building International Consensus. Boston,
MA: Kluwer Academic Publishers; 2002:83 – 90.
Two major problems in information modeling (2)
• Assumption of inherent classification:
– we identify every thing by a specific class to which it
belongs; and
– there exists a preferred set of classes to describe a domain.
• Sad consequences:
– ‘the complexity of problems in schema integration, schema
evolution, and interoperability,
– violates philosophical and cognitive guidelines on
classification and is, therefore,
– inappropriate in view of the role of data modeling in
representing knowledge about application domains’.
Parsons, J. and Wand, Y. Emancipating instances from the tyranny of classes in information modeling. ACM Trans.
Database Syst. 25, 2 (June 2000), 228–268.
Both problems have a common ground
• Confusion brought about by the (dis)similarity between
information and what the information is about:
[Figure: anamnesis, clinical examination, diagnosis and therapeutic interventions laid out against ‘space’ and ‘time’ axes.]
The worst of all: Health Level 7 RIM
HL7 EHR structure
For HL7, a document is an act !
HL7 said for over 15 years …
• ‘The truth about the real world is constructed through a
combination (and arbitration) of such attributed
statements only, and there is no class in the RIM whose
objects represent "objective state of affairs" or "real
processes" independent from attributed statements.
As such, there is no distinction between an activity and its
documentation.’
Then what about this advice from the Surgeon General
• ‘The Nation must take an informed, sensitive
approach to communicate with and educate the
American people about health issues related to
overweight and obesity.’
• ‘ACTION: The Nation must take action to assist
Americans in balancing healthful eating with
regular physical activity.’
http://www.surgeongeneral.gov/topics/obesity/calltoaction/fact_vision.htm
Because of HL7: Americans think that watching
sports is as good as doing sports …
... or reading about sports
America’s future
www.sfpix.com/health_saturdays/Heal_sat1.html
http://hl7-watch.blogspot.com/
Not much better: Microsoft HealthVault
• an Allergic Episode = (a) a single piece of data,
that is (b) in a health record that is (c) accessible
through Microsoft Healthvault
Other Health Record Items: a blood pressure
measurement, an exercise session, an insurance
claim.
Even the very promising OpenEHR Model
switching between data structures and what the data are about
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
A bad ontology for this model
Martínez-Costa, Menárguez-Tortosa, Fernández-Breis, Maldonado. A model-driven approach for representing clinical archetypes for
Semantic Web environments. Journal of Biomedical Informatics 42(1), February 2009, 150-164
Not every term denotes (1)
• ‘A well-known problem in clinical information recording is the problem of
assigning “status”, including variants like “actual value of P” (P stands for
some phenomenon), “family history of P”, “risk of P”, “fear of P”, as well as
negation of any of these, i.e. “not/no P”, “no history of P” etc.
• A proper analysis of these so called statuses shows that they are not “statuses” at
all, …’
– this is so true !
• ‘… but different categories of information as per the ontology. The common
statement types mentioned here are mapped as follows:
• actual value of P ⇒ Observation (of P);
• no/not P ⇒ Observation (of excluded P or types of P, e.g. allergies).
• family history of P ⇒ Evaluation (that patient is at risk of P);
• no family history of P ⇒ Evaluation (that P is an excluded risk);
• risk of P ⇒ Evaluation (that patient is at risk of P);
• no risk of P ⇒ Evaluation (that patient is not at risk of P);
• fear of P ⇒ Observation (of FEAR, with P mentioned in the description);’
– some of these P’s do not exist at all !
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
Not every term denotes (2)
• ‘Another set of statement types that can be confused in systems that
do not properly model information categories concern
interventions, e.g. “hip replacement (5 years ago)”, “hip
replacement(planned)”, “hip replacement (ordered for next tuesday
10 am)”.’
– this is so true !
• ‘Ambiguity is removed here as well, with the use of the correct
information categories, e.g. (I stands for an intervention):
• I (distant past/unmanaged/passively documented) ⇒ Observation (of I present in patient);
• I (recent past) ⇒ Action (of I having been done to/for patient);
• I (proposed) ⇒ Evaluation, subtype Proposal (of I suggested/likely for patient);
• I (ordered) ⇒ Instruction (of I for patient for some date in the future).’
– some of these I’s do not exist at all !
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
Schemas like this need to be corrected
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
An appropriate view on reality …
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
A bit less appropriate view on reality …
K Bernstein, M Bruun-Rasmussen, S Vingtoft, SK Andersen, C Nøhr.
Modelling and implementing electronic health records in Denmark.
International Journal of Medical Informatics (2005) 74, 213—220.
… can still lead to an erroneous ‘ontology’
Clinical Investigator Recording (CIR) ontology
T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
… and to leaving observed distinctions implicit
‘not knowing’ or ‘not specifying’ something is not a property of that which is
not known, or of that about which a specification should be given, but a
property of the agent involved.
T Beale, S Heard, D Kalra, D Lloyd. The openEHR Architecture Support Terminology.
Revision: 1.0.1; 04 Aug 2008
Another example
Over the past 15 years, nearly 500 genes that contribute to
inherited eye diseases have been identified. Disease-causing mutations are associated with many ocular
diseases, including glaucoma, cataracts, strabismus,
corneal dystrophies and a number of forms of retinal
degenerations. This remarkable new genetic information
highlights the significant inroads that are being made in
understanding the medical basis of human ophthalmic
diseases. As a result, gene-based therapies are actively
being pursued to ameliorate ophthalmic genetic diseases
that were once considered untreatable.
Objectives of the Network
• provide easy access to genetic testing for patients
diagnosed with ocular diseases by screening for
these genes,
• collect and maintain relevant information in secure
databases
– to help speed the progress toward developing
treatments and
– to identify those who are most likely to benefit from
them,
• maintain a genetic specimen repository.
The eyeGENE™ Database System
• a repository of genotype and phenotype
information of patients with eye diseases,
• linked to a repository of DNA samples of patients
with inherited eye diseases,
• originally designed as a stand alone application,
• but now moving towards a system that can be
‘integrated’ with a variety of other health care IT
systems.
Core medical information in eyeGENE™
• Patient profile information:
– DOB, contact info, race, sex, etc.
• Family information:
– presence of ‘the same’ disease in family members
• Phenotype information:
– One or more diagnoses – currently there are 21 potential diagnoses
– clinical findings data obtained through structured questions for each
diagnosis.
• Genetic test results:
– Result rows organized by Gene (with unique GI#), exons screened, and lab
procedures
– For each gene, exons screened, and lab procedures, results are registered as
either ‘negative’, ‘mutation’ or ‘variant’
– For mutation or variant, results consist of exon, DNA changes, protein
changes, and genotype.
Enhancements to core system capabilities
[Figure: eyeGENE surrounded by its data stores, annotated with enhancements per version.]
• Data stores: Patient Information; Clinical Findings; Family Information; Blood and DNA specimens; Genetic test results; Anonymized data for analysis; Medical Images and other uploaded files; Consent Forms and other administrative data
• Enhancements: Dynamic Phenotype Content (v1/2); Can track at multiple clinics (v4); Unlimited blood/DNA flow (v2); Redesigned Interface (v3); Automated Email tracking (v3); New (v3); Prototype Done (v3); Can upload as PDFs (v3)
Objectives of our study
1. to understand the type of view embedded in the
eyeGENE™ database and,
2. in case this view would differ from the realist
one, to propose a migration path towards the
latter.
Materials & methods
• We studied
– the available documentation about eyeGENE™’s core medical information,
including parts of its information model and user interfaces.
– some of the clinical questions (and corresponding possible-answer sets) that
are asked to eyeGENE™ users when they enter data in the system,
– system generated reports about lab procedures performed on genes.
• We did not have access to a data-dictionary with data-definitions
and corresponding business rules
– thus had to do some guessing about the exact meaning of data-fields
• We checked
– for design choices in the system that would lead the information to be
collected not to match with the corresponding structure of reality;
– for structural and functional issues in eyeGENE™ that in absence of
sufficient background information for disambiguation would lead to
difficulties in interpreting data once entered.
Realist framework used
• Levels of reality:
– L1 entities: such as specific patients, their relatives, the
disorders they are suffering from, the lab tests that have been
conducted, and so forth;
– L2 entities: interpretations and opinions on the side of
clinicians, including hypotheses and diagnoses;
• thus being about entities in first-order reality, although not accessible
to third parties without additional L3 references;
– L3 entities: information-elements about L1 or L2 entities,
examples being entries in information systems such as the
eyeGENE™ database.
• The (type of) relationships that obtain between entities in
each of these levels and across levels.
Results
• the pragmatic design approach initially followed
by eyeGENE™ exhibits several limitations:
(1) conflating the three levels of reality as described
above,
(2) not representing faithfully the relevant portions of
reality at each level,
(3) forcing ‘data’ to be entered while there is nothing
the data can be data about.
• some examples …
‘Required fields’
• User must provide data for such fields, but what is
the relation with reality?
– country: each person for sure lives in some country
– postal code, state:
• not all countries use postal codes nor consist of states
– phone number
• not everybody has a phone, or at the time of data entry the
number might not be known.
• No other option than entering fake data.
Reductionism (1)
• Forced selection from incomplete list
– 22 ocular disease types
• aren’t there any more? are the others not of interest? Just not
now or never? …
• Forced structure of data-types
– Belgian phone numbers are not structured the way US
phone numbers are structured
eyeGENE™ core medical data schema
[Figure: schema diagram with the tables Patient, Clinical Encounter, Patient Clinical Finding, Patient Diagnosis, Diagnosis, Clinical Finding, Diagnosis Finding Link, Clinical Finding Unit Link, Units, Specimen and Lab Result.]
Reductionism (2)
• Where are the disorders ?
– diagnoses are in the heads of e.g. physicians
– disorders are in the body of the patient
• L1-L3 confusions
• The way clinical findings are linked to diagnoses
does not allow one to study how findings are related to
disorders.
Some recommendations (1)
• For each table, data field and associated allowed values, and for each
hard- or soft-coded business rule that restricts data input:
1. assess what (type of) entity in reality would be denoted by any
data instance,
– includes any ‘value’ from ‘value sets’, external terminologies, etc.
2. represent how these entities in reality relate to each other as
well as to other ontologically relevant entities that are not
explicitly addressed in the information model,
• the domain model proper,
– based on realism-based ontologies
3. describe formally how the information model has to be
interpreted in terms of the domain model.
– ‘interpretation model’
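A very reduced sketch of what an interpretation model could look like in code: each data field is paired with the level and kind of entity its values are about. The field names below are hypothetical and are not taken from the actual eyeGENE™ schema.

```python
from typing import Dict, Iterable, List, NamedTuple


class Interpretation(NamedTuple):
    level: str    # level of the entity a field value is about: "L1" or "L2"
    denotes: str  # plain-language description of that entity


# Hypothetical fields, for illustration only.
INTERPRETATION_MODEL: Dict[str, Interpretation] = {
    "patient.date_of_birth": Interpretation("L1", "the birth of the patient"),
    "patient.country": Interpretation("L1", "the country the patient lives in"),
    "patient_diagnosis.diagnosis": Interpretation(
        "L2", "a clinician's diagnosis, not the disorder itself"
    ),
}


def uninterpreted(fields: Iterable[str]) -> List[str]:
    """List the fields for which no interpretation has been given yet."""
    return sorted(f for f in fields if f not in INTERPRETATION_MODEL)
```

Running `uninterpreted` over all fields of a schema gives a work list for step 1; a full interpretation model would also state the relations of step 2 formally rather than as free text.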
Some recommendations (2)
• The (relevant parts of the) interpretation model should
be part of any information exchange.
• Change user interfaces and information model only
when no ‘realist interpretation’ is possible or faithful
data entry cannot be achieved.
– certain fields should not be ‘required’,
– formatting, e.g. phone numbers, is acceptable in a user-interface when it satisfies local situations (not ‘requirements’),
but not for exchange,
– ‘unknown’ and ‘null values’ are acceptable, if suitable
interpretations are provided in the interpretation model, not just
as text in data-dictionaries.
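The last point can be illustrated with a field value that carries either a real value or an explicit, machine-readable reason for its absence, so that nobody has to enter fake data. The type names and the two reasons below are assumptions for this sketch, not part of eyeGENE™.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Union


class Absence(Enum):
    NO_SUCH_ENTITY = "there is nothing in reality for the field to be about"
    NOT_KNOWN = "the entity exists but was not known at data entry"


@dataclass(frozen=True)
class FieldValue:
    value: Optional[str] = None
    absence: Optional[Absence] = None

    def __post_init__(self) -> None:
        # Exactly one of the two must be given.
        if (self.value is None) == (self.absence is None):
            raise ValueError("give either a value or a reason for its absence")


def for_exchange(record: Dict[str, FieldValue]) -> Dict[str, Union[str, Dict[str, str]]]:
    """Serialise a record so that the meaning of every gap travels with it."""
    out: Dict[str, Union[str, Dict[str, str]]] = {}
    for name, field_value in record.items():
        if field_value.value is not None:
            out[name] = field_value.value
        else:
            out[name] = {
                "absent": field_value.absence.name,
                "meaning": field_value.absence.value,
            }
    return out
```

A patient without a phone would then be recorded as `FieldValue(absence=Absence.NO_SUCH_ENTITY)`, which is a different statement from a phone number that was simply not known at the time of data entry.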
Conclusions
• eyeGENE™ is in successful use and by now processes
over 100 samples per month.
• the NIH roadmap goal to ‘require new ways to organize
how clinical research information is recorded, new
standards for clinical research protocols, modern
information technology’ has not yet been reached. (Does any
system reach it ?)
• making eyeGENE™ ‘reality-aware’ is feasible.
• the hope that at some future time relevant phenotypic data
can be automatically extracted from an electronic medical
record will remain a dream as long as these systems do
not change.