S36 VipulKashyap

Semantic Web Technologies for
Translational Medicine
Vipul Kashyap, PhD
Senior Medical Informatician, Clinical Knowledge Management and Decision Support
Clinical Informatics R&D, Partners Healthcare System
Panel on “Towards a Semantic Web for the Life Sciences?”
October 24, 2005
Outline
• Translational Medicine Use Case
  - Translation of Genomic Research Insights into Clinical Care
• Key Functionalities
  - Data Integration
  - Actionable Decision Support
  - Knowledge Update and Propagation
• Semantic Web Technologies
  - RDF: Resource Description Framework
  - OWL: Web Ontology Language
  - SWRL: Semantic Web Rule Language
• Conclusions
Translational Medicine Use Case*:
Dr. Genomus Meets Basketball Player Who Fainted at Practice
• Clinical exam reveals abnormal heart sounds
• Family History: Father with sudden death at 40
• 2 younger brothers apparently normal
• Ultrasound ordered based on clinical exam reveals cardiomyopathy
Structured Physical Exam; Structured Family History; Structured Imaging Study Reports
* Use Case provided by Dr. Tonya Hongsermeier
Actionable Decision Support in
the Workflow Context
Echo triggers guidance to screen for possible mutations:
- MYH7, MYBPC3, TNNT2, TNNI3, TPM1, ACTC, MYL2, MYL3
Knowledge-based
Decision Support
Connecting Dx, Rx, Outcomes and Prognosis Data to Genotypic Data for Cardiomyopathy
[Figure: population registry for cardiomyopathy. Recoverable table and labels follow.]
person   concept       date
Z5937X   Syncope       3/4
Z5937X   ER visit      3/4
Z5937X   Palpitations  3/4
Z5937X   Gene-Chips    3/4
Z5937X   Echocardio    4/6
Z5956X   Gene-Chips    5/2
Z5956X   Cardiomyop    5/2
Z5956X   Atrial Fib.   5/2
Z5956X   Echocardio    5/2
Z5956X   EKG           3/9
Z5956X   Cardiac Arr   3/9
Z5956X   ER Visit      3/9
Z5956X   Thalamus      3/9
raw value: microarray (encrypted)
Gene expression in HCM Test Results
Outcomes calculated every week: Myectomy, Atrial Arrhythmia, ER visits, Clinic visits, Ventricular Arrhythmia, ICD, Cong. Heart Failure
Component labels: statistics, application server, population registry, database, ownership manager, encryption
A one slide Introduction to RDF/OWL
What is RDF?
• Resource Description Framework: description of any resource
• Triples <resource, property, value>, e.g., <URI1, “name”, “Mr. X”>
  - Nodes: “URI1”, “Mr. X”
  - Edge: “name”
• Graph based Data Model
• RDF graphs are instances of ontological elements
What is OWL?
• Web Ontology Language: description of knowledge and ontologies of a given domain
• Axioms/constraints capture knowledge about a given domain, e.g.,
  - class(Patient), class(Person)
  - Patient ⊑ Person (Patient is a subclass of Person)
• Lattice Organization
• Axioms/constraints are imposed on underlying RDF Graph instances
URIs (URLs) are used as identifiers for:
• Resources, Properties, Values, Namespaces and Ontological Elements
Namespaces contain:
• Tags for RDF and OWL languages
• Ontological elements (classes, properties) that are instantiated by these RDF Graphs
• Ontological elements or XML Schema datatypes that are dimensions of identifiers such as LSIDs
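The triple model and the subclass axiom on this slide can be sketched in a few lines of plain Python. This is only an illustration: the "ex:" names are hypothetical, no RDF library is used, and the inference shown is limited to following subclass edges.

```python
# A graph is a set of (resource, property, value) triples.
# The "ex:" identifiers are hypothetical, chosen for illustration only.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

graph = {
    ("ex:URI1", "ex:name", "Mr. X"),           # the triple from the slide
    ("ex:URI1", RDF_TYPE, "ex:Patient"),       # URI1 is an instance of Patient
    ("ex:Patient", SUBCLASS_OF, "ex:Person"),  # axiom: Patient is a subclass of Person
}


def superclasses(graph, cls):
    """Return cls plus every class reachable through subclass edges."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in graph if s == current and p == SUBCLASS_OF)
    return seen


def types_of(graph, resource):
    """Asserted types of a resource, extended with all their superclasses."""
    inferred = set()
    for s, p, o in graph:
        if s == resource and p == RDF_TYPE:
            inferred |= superclasses(graph, o)
    return inferred


print(sorted(types_of(graph, "ex:URI1")))  # ['ex:Patient', 'ex:Person']
```

The point of the sketch is that the axiom lives in the same graph as the data, so the fact "URI1 is a Person" never has to be stored explicitly.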
A Strawman Ontology for Translational Medicine
OWL ontologies that blend knowledge from the Clinical and Genomic Domains
[Figure: ontology diagram with two regions, Clinical Knowledge and Genomic Knowledge. Figure reprinted with permission from Cerebra, Inc.]
Data Integration
[Figure: Domain Ontologies for Translational Medicine are instantiated by a Merged RDF Graph, which combines RDF Graph 1 (RDF Wrapper over LIMS Data) and RDF Graph 2 (RDF Wrapper over EMR Data).]
Use of RDF graphs that instantiate these ontologies:
- Rules/semantics-based integration independent of location, method of access or underlying data structures!
- Highly configurable, minimize software coding
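As a rough sketch of the wrapper idea, each source can be mapped to triples by a small function and the resulting graphs merged by set union. The record layouts and field names below are assumptions made for illustration; a real LIMS or EMR wrapper would follow that system's own schema.

```python
def lims_wrapper(record):
    """Map one LIMS row (hypothetical layout) to RDF-style triples."""
    pid, test = record["patient_uri"], record["test_uri"]
    return {
        (pid, "rdf:type", "Patient"),
        (pid, "has_structured_test_result", test),
        (test, "identifies_mutation", record["mutation_uri"]),
        (test, "indicates_disease", record["disease_uri"]),
    }


def emr_wrapper(record):
    """Map one EMR row (hypothetical layout) to RDF-style triples."""
    pid, history = record["patient_uri"], record["family_history_uri"]
    return {
        (pid, "rdf:type", "Patient"),
        (pid, "name", record["name"]),
        (pid, "has_family_history", history),
        (history, "problem", record["family_problem"]),
    }


# Hypothetical source rows; both refer to the same patient identifier.
lims_row = {"patient_uri": "URI1", "test_uri": "URI4",
            "mutation_uri": "URI5", "disease_uri": "URI6"}
emr_row = {"patient_uri": "URI1", "name": "Mr. X",
           "family_history_uri": "URI3", "family_problem": "Sudden Death"}

graph1 = lims_wrapper(lims_row)
graph2 = emr_wrapper(emr_row)

# Merging is a set union: triples about URI1 from both sources end up on
# one node because the two wrappers emit the same identifier.
merged = graph1 | graph2
patient_properties = sorted(p for s, p, o in merged if s == "URI1")
print(patient_properties)
```

Because the merge depends only on shared identifiers, neither wrapper needs to know where the other source lives or how it is stored.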
Bridging Clinical and Genomic Information
[Figure: two RDF graphs before integration, one from EMR Data and one from LIMS Data.
Nodes: Patient (id = URI1), shown in both graphs; Person (id = URI2); FamilyHistory (id = URI3); MolecularDiagnosticTestResult (id = URI4); MYH7 missense Ser532Pro (id = URI5); Dilated Cardiomyopathy (id = URI6).
Literal values: “Mr. X”, “Paternal”, 1, 90%, “Sudden Death”.
Edge labels: name, related_to, type, degree, has_family_history, associated_relative, problem, has_structured_test_result, identifies_mutation, indicates_disease, evidence.]
Rule/Semantics-based Integration:
- Match Nodes with same Ids
- Create new links: IF a patient’s structured test result indicates a disease THEN add a “suffers_from” link to that disease
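The link-creation rule above can be sketched as a single pass over a merged triple set. This is a minimal illustration in plain Python, assuming the graph is a set of (subject, property, object) tuples; the function name is hypothetical.

```python
def add_suffers_from_links(graph):
    """IF a patient's structured test result indicates a disease
    THEN add a suffers_from link from the patient to that disease."""
    new_links = set()
    for patient, prop, test in graph:
        if prop != "has_structured_test_result":
            continue
        for subject, prop2, disease in graph:
            if subject == test and prop2 == "indicates_disease":
                new_links.add((patient, "suffers_from", disease))
    # Return a new graph; the input is left unchanged.
    return graph | new_links


# Identifiers follow the slide: URI1 patient, URI4 test result, URI6 disease.
graph = {
    ("URI1", "has_structured_test_result", "URI4"),
    ("URI4", "identifies_mutation", "URI5"),
    ("URI4", "indicates_disease", "URI6"),
    ("URI1", "has_family_history", "URI3"),
}

integrated = add_suffers_from_links(graph)
print(("URI1", "suffers_from", "URI6") in integrated)  # True
```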
Bridging Clinical and Genomic Information
[Figure: merged RDF graph after integration.
Nodes: Patient (id = URI1); Person (id = URI2); FamilyHistory (id = URI3); StructuredTestResult (id = URI4); MYH7 missense Ser532Pro (id = URI5); Dilated Cardiomyopathy (id = URI6).
Literal values: “Mr. X”, “Paternal”, 1, 90%, “Sudden Death”.
Edge labels: name, related_to, type, degree, has_family_history, associated_relative, problem, has_structured_test_result, identifies_mutation, indicates_disease, evidence, plus the links suffers_from and has_gene.]
RDF Graphs provide a semantics-rich substrate for decision support that can be exploited by SWRL Rules.
Actionable Decision Support:
using SWRL
IF the Patient’s structured test result identifies the mutation MYH7
missense:Ser532Pro with confidence ≥ 90%
AND the structured test result is indicative of Dilated Cardiomyopathy
THEN
  Patient suffers from Dilated Cardiomyopathy
  Patient has gene MYH7 missense:Ser532Pro
  Perform DCM monitoring and management protocol on the Patient.
patient(?p) & molecular_diagnostic_test(?t) & has_structured_test_result(?p, ?t) &
identifies_mutation(?t, “MYH7 missense:Ser532Pro”) &
indicates_disease(?t, “Dilated Cardiomyopathy”)
→ suffers_from(?p, “Dilated Cardiomyopathy”) &
  has_gene(?p, “MYH7 missense:Ser532Pro”) &
  recommended_intervention(“DCM Monitoring and Management”)
Semantic Web Rule Language (SWRL)
• References to ontological concepts and relationships
  - Describe clinical and genomic information
• Can be used to infer patient state:
  - Patient has a particular gene/mutation
  - Patient suffers from a particular disease
• Can be used to recommend clinical care:
  - Order Monitoring and Management Protocol
patient(?p) & molecular_diagnostic_test(?t) & mutation(?m) & disease(?d) &
has_structured_test_result(?p, ?t) & identifies_mutation(?t, ?m) &
indicates_disease(?t, ?d) & suggested_protocol(?d, ?pro)
→ suffers_from(?p, ?d) &
  has_gene(?p, ?m) &
  order_protocol(?pro)
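To make the variable-based rule concrete, here is a small sketch of how such a rule could be evaluated over triples by binding ?-variables. It is plain Python, not a SWRL engine. Two encoding choices are assumptions: unary predicates such as patient(?p) are written as rdf:type triples, and order_protocol is attached to the patient so that it fits the triple shape.

```python
def match(pattern, triple, bindings):
    """Unify one (s, p, o) pattern with one triple.
    Returns the extended bindings, or None if they conflict."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(body, facts, bindings=None):
    """Yield every variable binding that satisfies all body patterns."""
    bindings = bindings or {}
    if not body:
        yield bindings
        return
    for fact in facts:
        extended = match(body[0], fact, bindings)
        if extended is not None:
            yield from solve(body[1:], facts, extended)


def apply_rule(body, head, facts):
    """Add the instantiated head triples for every solution of the body."""
    derived = set()
    for solution in solve(body, facts):
        for pattern in head:
            derived.add(tuple(solution.get(term, term) for term in pattern))
    return facts | derived


facts = {
    ("URI1", "rdf:type", "Patient"),
    ("URI4", "rdf:type", "MolecularDiagnosticTest"),
    ("URI1", "has_structured_test_result", "URI4"),
    ("URI4", "identifies_mutation", "URI5"),
    ("URI4", "indicates_disease", "URI6"),
    ("URI6", "suggested_protocol", "DCM Monitoring and Management"),
}
body = [
    ("?p", "rdf:type", "Patient"),
    ("?t", "rdf:type", "MolecularDiagnosticTest"),
    ("?p", "has_structured_test_result", "?t"),
    ("?t", "identifies_mutation", "?m"),
    ("?t", "indicates_disease", "?d"),
    ("?d", "suggested_protocol", "?pro"),
]
head = [
    ("?p", "suffers_from", "?d"),
    ("?p", "has_gene", "?m"),
    ("?p", "order_protocol", "?pro"),  # assumption: the order is tied to the patient
]

result = apply_rule(body, head, facts)
for triple in sorted(result - facts):
    print(triple)
```

Since the rule mentions no specific mutation or disease, the same body and head apply unchanged to any genotype and disease pair present in the graph.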
Knowledge Update and Propagation
IF Molecular Diagnostic reveals MYH7 missense: Ser532Pro or Phe764Leu
AND No Structural Heart Disease on Echocardiogram
THEN perform DCM monitoring and management protocol
Knowledge Update (Hypothetical)
IF Molecular Diagnostic reveals MYH7 missense: Ser532Pro
AND No Structural Heart Disease on Echocardiogram
THEN perform late onset of DCM monitoring protocol
IF Molecular Diagnostic reveals MYH7 missense: Phe764Leu
AND No Structural Heart Disease on Echocardiogram
THEN perform early onset of DCM monitoring protocol
• Discovery of New Genotypes
• Invention of New Monitoring Protocols
• Discovery of Associations between Genotype, Disease and Monitoring Protocols
• Modification of Decision Support Rules to Reflect This
→ Modifies resultant RDF graphs generated!
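One way to see why the update changes downstream results is to hold the rules as data and re-run them after the change. The sketch below does this in plain Python; the dictionary layout and the patient labels are assumptions for illustration, and the protocol names paraphrase the hypothetical rules above.

```python
def recommend(rules, mutation, structural_heart_disease):
    """Return the protocol of every rule whose conditions hold."""
    return [
        rule["protocol"]
        for rule in rules
        if mutation in rule["mutations"]
        and rule["structural_heart_disease"] == structural_heart_disease
    ]


# Before the update: one rule covers both mutations.
rules_v1 = [
    {"mutations": {"Ser532Pro", "Phe764Leu"},
     "structural_heart_disease": False,
     "protocol": "DCM monitoring and management"},
]

# After the (hypothetical) update: one rule per mutation.
rules_v2 = [
    {"mutations": {"Ser532Pro"},
     "structural_heart_disease": False,
     "protocol": "late onset DCM monitoring"},
    {"mutations": {"Phe764Leu"},
     "structural_heart_disease": False,
     "protocol": "early onset DCM monitoring"},
]

# Hypothetical patients, both without structural heart disease on echo.
patients = {"patient-A": "Ser532Pro", "patient-B": "Phe764Leu"}

for label, rules in (("before", rules_v1), ("after", rules_v2)):
    for patient, mutation in patients.items():
        print(label, patient, recommend(rules, mutation, False))
```

The patient data is identical in both runs; only the rule set differs, which is what makes the regenerated recommendations diverge.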
Knowledge Update and Propagation
[Figure: before the update, a single Rule (genotype_condition, indicates_disease, recommended_intervention) relates Genotype, Disease and Monitoring Protocol through “indicates” and “recommended_intervention” edges. After the update, Genotype1 and Genotype2, Monitoring Protocol1 and Monitoring Protocol2, and Rule1 and Rule2 (each with genotype_condition, indicates_disease, recommended_intervention) appear alongside Disease. Labels on the figure: Knowledge Update, Decision Support, Logic Update, Update, Propagation.]
Use of OWL Inferences for:
- Keeping knowledge internally consistent
- Propagating changes to Dependent Knowledge Artifacts
Updated RDF Graphs are generated from this point on!
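The propagation step can be approximated by a dependency check: after the set of known concepts changes, any rule that still refers to a retired concept is flagged and then replaced. The sketch below is a plain-Python stand-in for what the slide attributes to OWL inference; the pairing of Genotype1 with MonitoringProtocol1 and Genotype2 with MonitoringProtocol2 is an assumption.

```python
def affected_rules(rules, known_concepts):
    """Map each rule to the concepts it references that are no longer known."""
    affected = {}
    for name, slots in rules.items():
        missing = sorted(set(slots.values()) - known_concepts)
        if missing:
            affected[name] = missing
    return affected


concepts = {"Genotype", "Disease", "MonitoringProtocol"}
rules = {
    "Rule": {"genotype_condition": "Genotype",
             "indicates_disease": "Disease",
             "recommended_intervention": "MonitoringProtocol"},
}
consistent_before = affected_rules(rules, concepts)

# Knowledge update: the genotype and the protocol are each refined into two.
concepts = {"Genotype1", "Genotype2", "Disease",
            "MonitoringProtocol1", "MonitoringProtocol2"}
stale = affected_rules(rules, concepts)
print(stale)  # {'Rule': ['Genotype', 'MonitoringProtocol']}

# Propagation: retire the stale rule and add one rule per refined genotype.
for name in stale:
    del rules[name]
for i in (1, 2):
    rules[f"Rule{i}"] = {"genotype_condition": f"Genotype{i}",
                         "indicates_disease": "Disease",
                         "recommended_intervention": f"MonitoringProtocol{i}"}

consistent_after = affected_rules(rules, concepts)
print(sorted(rules), consistent_after)
```

An empty result from `affected_rules` after the change is the sketch's notion of the knowledge base being internally consistent again.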
Conclusions
• Translational Medicine is a knowledge-intensive field. The ability to capture the semantics of this knowledge is crucial for implementation.
• Personalized Medicine cannot be implemented in a scalable, efficient and extensible manner without Semantic Web technologies.
• The rate of Knowledge Updates will change drastically as Genomic knowledge explodes.
• Automated Semantics-based Knowledge Update and Propagation will be key in keeping the knowledge updated and current.