Building A Knowledge Base of Severe Adverse
Drug Events Based On AERS Reporting Data
Using Semantic Web Technologies
Guoqian Jiang, MD, PhD
Mayo Clinic College of Medicine, Rochester, MN, USA
MEDINFO 2013
Copenhagen, Denmark
August 21, 2013
©2013 MFMER | slide-1
Acknowledgements
• Co-authors
• Liwei Wang – Jilin University, China
• Hongfang Liu – Mayo Clinic, USA
• Harold R. Solbrig – Mayo Clinic, USA
• Christopher G. Chute – Mayo Clinic, USA
• This work was supported in part by the SHARP
Area 4: Secondary Use of EHR Data
(90TR000201).
Introduction
• Adverse Drug Events (ADEs) are a
well-recognized cause of patient morbidity
and increased health care costs.
• A semantically coded knowledge base of
ADEs with severity information is critical for
clinical decision support systems and
translational research applications.
In the Field of Translational Research
• Pharmacogenomics study of ADEs
• The genetic component of ADEs is considered a
significant contributing factor to drug response
variability and drug toxicity.
• PharmGKB – initiated by NIH
• To collect and disseminate human-curated information
about the impact of human genetic variation on drug
responses
• Canadian Pharmacogenomics Network for Drug Safety
• To identify novel predictive genomic markers of severe
ADEs in children and adults
ADEpedia Project
• A standardized knowledge base of ADEs that
intends to integrate existing known ADE
knowledge for drug safety surveillance from
disparate resources such as
• the FDA Structured Product Labeling (SPL),
• the FDA Adverse Event Reporting System
(AERS) and
• the Unified Medical Language System (UMLS).
• A framework of knowledge integration and
discovery that aims to support pharmacogenomic
target prediction of ADEs.
Severe ADEs
• Since the clinical applications of
pharmacogenomics on ADEs are usually focused
on the clinically severe ADEs, we designed a
module in the ADEpedia framework for extracting
severe ADE knowledge.
• However, few open-source ADE knowledge
resources with severity information are available,
and measuring and identifying the severity of ADEs
remains a challenging task.
Objective of the study
• To develop and evaluate a Semantic Web-based
approach for building a knowledge
base of severe ADEs from the FDA
AERS reporting data.
Semantic Web Technologies
• The W3C standards
• The Resource Description Framework (RDF)
• A model of directed, labeled graphs
• Using a set of triples (subject, predicate,
object)
• SPARQL
• A query language for RDF graphs
• The Web Ontology Language (OWL)
• A standard ontology language used for
ontology modeling
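To make the triple model concrete, here is a minimal sketch in plain Python (no RDF library). The `ex:` identifiers and the `match` helper are hypothetical, invented for illustration; they are not taken from the AERS data or from any real vocabulary. A pattern with `None` in a position behaves roughly like a SPARQL variable in that position.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical identifiers, for illustration only.
GRAPH: Set[Triple] = {
    ("ex:pair1", "ex:hasDrug", "ex:drugA"),
    ("ex:pair1", "ex:hasADE", "ex:adeX"),
    ("ex:pair2", "ex:hasDrug", "ex:drugB"),
    ("ex:pair2", "ex:hasADE", "ex:adeX"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None matches anything."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


# Roughly equivalent to: SELECT ?pair WHERE { ?pair ex:hasADE ex:adeX }
pairs_with_ade_x = sorted(t[0] for t in match(GRAPH, p="ex:hasADE", o="ex:adeX"))
```

A real deployment would use an RDF store and a SPARQL endpoint; the sketch only shows how a graph query reduces to pattern matching over triples.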
Materials (I)
• Normalized AERS Dataset – AERS-DM
• Reporting data from 2004 to 2011
• Drug names – RxNorm Codes by MedEx
• Mapped to the NDF-RT drug classes
• ADE names – MedDRA codes
• Aggregated to the System Organ Class
(SOC) codes
• Contains 4,639,613 putative Drug-ADE pairs
• Unique report ID number (ISR)
• Used to identify the outcome codes
Materials (II)
• Common Terminology Criteria for Adverse Events
(CTCAE) and Its Grading System
• We used the CTCAE version 4.0 rendered in the
Web Ontology Language (OWL) format that is
publicly available.
• This version contains 764 AE terms and 26 “Other,
specify” options for reporting text terms not listed in
CTCAE.
• Each adverse event (AE) term is associated with a
5-point severity scale. The AE terms are grouped by
MedDRA Primary SOC classes.
• In the CTCAE, “Grade” refers to the severity of the
adverse event.
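The 5-point scale can be sketched as a small enumeration. The grade labels follow the general CTCAE convention (Grade 1 mild through Grade 5 death). The `is_severe` helper and its Grade 3 threshold are assumptions made for this illustration, not definitions from the slides.

```python
from enum import IntEnum


class CtcaeGrade(IntEnum):
    """CTCAE severity grades; higher values mean greater severity."""
    MILD = 1
    MODERATE = 2
    SEVERE = 3
    LIFE_THREATENING = 4
    DEATH = 5


def is_severe(grade: CtcaeGrade,
              threshold: CtcaeGrade = CtcaeGrade.SEVERE) -> bool:
    # ASSUMPTION: "severe" is taken here to mean Grade 3 or higher.
    return grade >= threshold
```

Using `IntEnum` keeps the grades ordered, so a severity filter is a single comparison.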
Materials (III)
• ADE Datasets
• SIDER 2
• Released on October 17, 2012
• Contains 996 drugs, 4,192 side effects (SE), and
99,423 drug-SE pairs
• UMLS ADE dataset from ADEpedia
• Contains 266,832 drug-disorder concept pairs,
covering 14,256 (1.69%) distinct drug concepts
and 19,006 (3.53%) distinct disorder concepts.
• A total of 102 relationships link the
drug-disorder concept pairs, classified as:
1. Indications; 2. Contraindications;
3. Adverse drug effects; and 4. Other associations.
Methods
• Linking outcome codes with putative drug-ADE
pairs
• Validating the drug-ADE associations
• Data integration in a semantic web framework
• Classifying the AERS ADEs into the CTCAE in
OWL
• We asserted the mappings between AERS
outcome codes and CTCAE grades
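The linking and mapping steps above might look like the following sketch. The slides do not state which outcome code maps to which grade, so `OUTCOME_TO_GRADE` is an assumption chosen only for illustration (DE = death, LT = life-threatening, HO = hospitalization are AERS outcome codes; their grade assignments here are guesses). The sample rows and the `grade_pairs` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# ASSUMPTION: illustrative mapping; the actual mapping used in the
# study is not given in the slides.
OUTCOME_TO_GRADE: Dict[str, int] = {"DE": 5, "LT": 4, "HO": 3}

# Hypothetical sample rows keyed by the unique report ID (ISR).
pair_rows = [  # (isr, drug_code, ade_code)
    ("1001", "drugA", "adeX"),
    ("1002", "drugA", "adeX"),
    ("1003", "drugB", "adeY"),
]
outcome_rows = [  # (isr, outcome_code)
    ("1001", "DE"), ("1001", "HO"), ("1002", "LT"), ("1003", "OT"),
]


def grade_pairs(pairs: Iterable[Tuple[str, str, str]],
                outcomes: Iterable[Tuple[str, str]],
                mapping: Dict[str, int]) -> Dict[Tuple[str, str], Set[int]]:
    """Join pairs to outcome codes on ISR, then map codes to grades."""
    outcomes_by_isr: Dict[str, Set[str]] = defaultdict(set)
    for isr, code in outcomes:
        outcomes_by_isr[isr].add(code)

    grades: Dict[Tuple[str, str], Set[int]] = defaultdict(set)
    for isr, drug, ade in pairs:
        for code in outcomes_by_isr.get(isr, ()):
            if code in mapping:  # unmapped codes (e.g. "OT") are skipped
                grades[(drug, ade)].add(mapping[code])
    return dict(grades)


graded = grade_pairs(pair_rows, outcome_rows, OUTCOME_TO_GRADE)
```

In this sketch a pair collects every grade observed across its reports, so one drug-ADE pair can carry several grades.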
System Architecture
Results
• Produced a cardiac-AERS-DM dataset
• contains 164,895 entries with 21,757 unique putative
Drug-ADE pairs,
• covering 3,073 unique drug codes in RxNorm and
251 unique ADE codes in MedDRA.
For validated drug-ADE pairs
• We had 2,444 unique pairs, of which 760 pairs
are in Grade 5, 775 pairs in Grade 4, and 2,196
pairs in Grade 3.
• The drug-ADE pairs cover 821 unique drug
codes in RxNorm and 69 unique ADE codes in
MedDRA, and 20 of the 36 AE terms (55.6%)
under the Cardiac Disorders category in
CTCAE were covered.
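A quick arithmetic check of the figures above, using only the numbers on this slide. The reading in the final comment is an inference from the arithmetic, not a statement made in the slides.

```python
covered_terms, total_terms = 20, 36
coverage_pct = round(100 * covered_terms / total_terms, 1)

grade_counts = {5: 760, 4: 775, 3: 2196}
unique_pairs = 2444

grade_memberships = sum(grade_counts.values())
extra_memberships = grade_memberships - unique_pairs
# The per-grade counts sum to more than the number of unique pairs,
# which suggests that some pairs are counted under more than one grade.
```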
Severity Classification of ADEs in CTCAE
Discussion
• We utilized a normalized AERS dataset, in
which the drug names are normalized using
standard drug ontologies RxNorm and NDF-RT
and the ADEs are normalized using MedDRA.
• This facilitated interoperability between
ADE datasets (e.g., mappings to SIDER).
Validation Pipeline
• The SIDER dataset should be considered as a
“silver” standard rather than a “gold” standard
for the validation.
• Although the UMLS drug-disorder pairs only
covered a small portion of putative drug-ADE
pairs (1.4%), the validation illustrated the
usefulness of known ADE knowledge asserted
in the UMLS in discerning the indications from
the ADEs.
• For those new ADEs that have not been
recognized, a robust ADE detection algorithm
will be required in the future.
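One way to picture the validation idea is as set operations: keep putative pairs that a known side-effect resource confirms, and drop pairs that a known-indication resource marks as indications. The slides do not spell out the exact pipeline logic, so this is a sketch under that assumption. All pairs and the `validate` function are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (drug_code, disorder_code)

# Hypothetical sample data, for illustration only.
putative: Set[Pair] = {("drugA", "adeX"), ("drugA", "condY"), ("drugB", "adeZ")}
sider_pairs: Set[Pair] = {("drugA", "adeX"), ("drugA", "condY")}
umls_indications: Set[Pair] = {("drugA", "condY")}


def validate(candidates: Set[Pair],
             known_side_effects: Set[Pair],
             known_indications: Set[Pair]) -> Set[Pair]:
    """Keep confirmed side effects and remove known indications."""
    return (candidates & known_side_effects) - known_indications


validated = validate(putative, sider_pairs, umls_indications)
```

Pairs that appear in neither resource, such as `("drugB", "adeZ")` here, stay unvalidated, which matches the slide's point that unrecognized ADEs need a separate detection algorithm.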
Rationale for the use of CTCAE grading
system
• The CTCAE is a standard widely
used in clinical cancer studies for recording
AE severity;
• It has clear severity definitions using a 5-point
grading scale;
• It includes the most common AE terms, which
have been well classified and mapped to
MedDRA, a standard AE vocabulary;
• It contains well-defined conditions for grading
the severity of AE terms based on the domain
knowledge.
Leveraging Semantic Web Technologies
• We leveraged semantic web technologies that
provide a scalable framework for data
integration of heterogeneous ADE resources.
• In particular, we represented validated
drug-ADE pairs in the OWL format, which not only
provides seamless integration with the CTCAE,
but also enables a standard infrastructure for
automatic classification of ADEs based on the
severity conditions specified in the CTCAE.
Summary
• We developed a Semantic Web-based approach
for building a standard severe ADE knowledge
base using normalized FDA AERS reporting
data.
• The datasets produced in this study are publicly
available from our ADEpedia website
• http://adepedia.org
• Although we focused on the Cardiac
Disorders domain, we believe the approach can
be readily generalized to the other domains
available in the AERS reporting data.
Questions & Discussion