Metadata Driven Clinical Data Integration
Metadata Driven Clinical Data Integration – Integral to Clinical Analytics
April 11, 2016
Kalyan Gopalakrishnan, Priya Shetty – Intelent Inc.
Sudeep Pattnaik – Founder, Thoughtsphere
Agenda
Role of Dynamism and Automation in Integration
Integration – Two Approaches
Metadata Driven Data Processing
Metadata Driven Flows - Two Approaches
Copyright © 2016 Intelent Inc. All rights reserved.
Role of Dynamism and Automation in Integration
Dynamism
Drivers for Dynamism and Automation
• Source structure, transformation rules, and target structure are based on study analytical needs; most can vary across studies.
• This warrants a set of dynamic transformation rules to accommodate heterogeneous needs.
• In addition, the structure of the source, the physical storage, the maturity of the data transfer mechanism, and the relevant data dictionaries can vary widely as well.
• It is important to minimize, and where possible avoid, code changes to transformation and pre-processing services in the data ingestion layer.
• Such changes are costly and time consuming, and they discourage adoption within the enterprise.
• Storage structures at appropriate levels of hierarchy and stages of the data lifecycle need to be dynamic.
• Such dynamism needs to be planned, either by leveraging existing metadata or by manufacturing metadata.
• Alternatively, or in addition, a robust user interface or other means of configuration can address gaps.
• The key is to minimize code change.
Automation
• High availability of data to points of analysis
• From disparate sources: raw source data; integrated data across CTMS, IxRS, EDCs, and Labs; reconciled, cleansed/uncleansed, and aggregated data
• Based on use cases: interim analysis, submission, operational metrics, central monitoring, medical monitoring, etc.
• The key is to automate data delivery in an appropriately usable format with minimal manual intervention
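As a concrete illustration of "minimize code change", transformation logic can live in metadata while the engine stays fixed. A minimal sketch in Python (the field names and rule names are hypothetical, not from the deck):

```python
from datetime import datetime

# Rule library: small, reusable derivations the engine can look up by name.
def uppercase(value):
    return value.upper()

def iso_date(value):
    # Assumes incoming dates look like "01-Jan-2016" (DD-Mon-YYYY).
    return datetime.strptime(value, "%d-%b-%Y").date().isoformat()

RULE_LIBRARY = {"uppercase": uppercase, "iso_date": iso_date}

def apply_rules(record, rules):
    """Apply (field, rule_name) pairs taken from metadata to one record."""
    out = dict(record)
    for field, rule_name in rules:
        out[field] = RULE_LIBRARY[rule_name](out[field])
    return out

# Study-specific behavior is data, not code: a new study supplies a new
# rule list, and the engine above never changes.
study_rules = [("SEX", "uppercase"), ("VISITDT", "iso_date")]
record = {"SEX": "m", "VISITDT": "01-Jan-2016"}
print(apply_rules(record, study_rules))
# {'SEX': 'M', 'VISITDT': '2016-01-01'}
```

Heterogeneous studies then differ only in their `study_rules` metadata, which is the dynamism the slide argues for.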
Integration – Two Approaches
Integration Aspects

Storage and Modeling
Warehouse Approach
• Pre-modeling required; structure oriented
• Generic content model (schema) required based on storage technology, e.g. form/domain level storage
Hub Approach
• No pre-modeling; loosely coupled
• Storage granularity preserved as per the source system
• Data tagged at the appropriate level after reconciliation

Source Data Integration
Warehouse Approach
• Requires source system adapters and pre-formatting to the warehouse structure – ETL approach
• Requires source feeds to adhere to input descriptions, or requires setup/configuration
• Multiple adapter development, especially with external sources (Labs/partner data)
Hub Approach
• System agnostic integration; data is ingested at source-level granularity without pre-processing – ELT approach
• Enabling dynamism and automation for transformations requires:
• Availability of a repository of governed metadata – structural and transformational
• A robust mapping user interface at the study level that utilizes a mapping library with auto (machine) learning technologies – promotes mapping reuse across studies
• An interface that allows study-level mappings and leverages an existing library of rules
• Post-processing pipeline architecture

Data Processing
Warehouse Approach
• Heavy reliance on data pre-processing before loading into the warehouse
• Time consuming and costly
Hub Approach
• Transformations accomplished on an as-needed basis in a post-processing layer, based on business needs. For example:
• Operational review processes need subject-level data granularity
• Bio-statistical programming processes need SDTM +/- domain-level tabulated data
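The ETL-versus-ELT contrast above can be sketched in a few lines (Python, with hypothetical helper names): the warehouse approach transforms before loading, while the hub approach loads raw data and defers transformation to a post-processing layer.

```python
# ETL (warehouse): pre-format each row to the warehouse shape, then load.
def etl(rows, to_warehouse_shape, warehouse):
    warehouse.extend(to_warehouse_shape(r) for r in rows)

# ELT (hub): load rows exactly as received; transform later, on demand.
def elt(rows, hub):
    hub.extend(rows)

warehouse, hub = [], []
rows = [{"subj": "001", "sex": "f"}]

etl(rows, lambda r: {"USUBJID": r["subj"], "SEX": r["sex"].upper()}, warehouse)
elt(rows, hub)

print(warehouse)  # [{'USUBJID': '001', 'SEX': 'F'}]
print(hub)        # [{'subj': '001', 'sex': 'f'}]
```

The hub copy preserves source granularity untouched, which is what makes the loosely coupled, system-agnostic integration possible.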
Metadata Driven Data Processing
Business Issue
• How do we provide quicker access to source and analysis-ready data?
• How do we adapt rapidly to changes in regulatory standards and apply those changes to business and operational processes?
• How do we bring more efficiency to the source-to-target mapping and transformation processes?

Solution Overview
• A Data Ingestion Framework ingests data from diverse sources (clinical data, operational data, reference data)
• Structural metadata (source/target/reference) and transformational metadata (rules/derivations) are populated in a Metadata Repository
• A dynamic process applies transformation rules to source data to generate target datasets

Solution Impact
• High availability of data (source, integrated, standardized)
• Reusability of standard algorithms
• Dynamic, automated process
• Accelerated path for submissions
• Enhanced support for product defense, data sharing, and data mining
• Traceability
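A minimal sketch of the solution overview, with in-memory Python structures standing in for a real metadata repository (all names and shapes are illustrative):

```python
# Structural metadata: which variables each target dataset contains.
structural_md = {
    "DM": ["STUDYID", "USUBJID", "SEX"],
}

# Transformational metadata: source field and derivation per target variable.
transformational_md = {
    "STUDYID": ("study", lambda v: v),
    "USUBJID": ("subject_id", lambda v: v.strip()),
    "SEX": ("gender", lambda v: v[:1].upper()),
}

def generate_target(source_rows, domain):
    """Dynamic process: build the target dataset from metadata alone."""
    targets = []
    for row in source_rows:
        rec = {}
        for var in structural_md[domain]:
            src_field, derive = transformational_md[var]
            rec[var] = derive(row[src_field])
        targets.append(rec)
    return targets

source = [{"study": "ABC-101", "subject_id": " 001 ", "gender": "female"}]
print(generate_target(source, "DM"))
# [{'STUDYID': 'ABC-101', 'USUBJID': '001', 'SEX': 'F'}]
```

Because the generator only consults the two metadata structures, updating the repository changes the output of the next run with no code change, which is the traceable, reusable behavior the slide claims.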
Approach 1 - Metadata Driven Dynamic SAS Engine
Structural and transformational metadata extracted from the Metadata Repository drives a dynamic program for generating hybrid SDTM target datasets.
The dynamic SAS process leverages SAS macros corresponding to the transformational metadata.
Source-to-target transformations – updates in the metadata repository are applied in the next run; MedDRA merge; ISO formats.
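In spirit, the engine reads transformational-metadata rows and invokes the corresponding SAS macros. A Python sketch of that dispatch (the macro names `meddra_merge` and `iso_format` are assumptions; the deck names MedDRA merge and ISO formats as transformations, but not the macros themselves):

```python
# Hypothetical transformational-metadata rows, one per target variable.
METADATA_ROWS = [
    {"target": "AETERM", "macro": "meddra_merge", "args": {"version": "19.0"}},
    {"target": "AESTDTC", "macro": "iso_format", "args": {"informat": "date9."}},
]

def render_macro_call(row):
    """Render one metadata row as a SAS macro invocation."""
    args = ", ".join(f"{k}={v}" for k, v in row["args"].items())
    return f"%{row['macro']}(var={row['target']}, {args});"

# Editing METADATA_ROWS changes the generated program on the next run,
# with no change to the rendering code.
program = "\n".join(render_macro_call(r) for r in METADATA_ROWS)
print(program)
```

This mirrors the slide's point: the SAS program itself is dynamic output, so a metadata update in the repository is picked up on the next run.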
Approach 2 – ClinDAP – Thoughtsphere’s Metadata Driven, Source System Agnostic Clinical Data Aggregation Framework
ClinDAP – Next Generation Data Aggregation Platform
• Source system agnostic data aggregation framework
• Proprietary algorithms to aggregate disparate data sources (EDC, CTMS, IVRS, Labs, ePro, etc.)
• Document-oriented database readily assembles any structured or unstructured data
• Robust mapping engine with an extensible rule library reusable across studies (Hybrid SDTM)
• Interactive visualization-based data discovery
Robust Mapping Framework – reusable mapping library; leverages existing SAS libraries; supports complex study-level transformations; extensible targets (Hybrid SDTM, ADaM)

The ability to operationalize analytics becomes possible when automation and dynamism are enabled to integrate data and generate standardized datasets.
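A toy sketch of source-system-agnostic, document-oriented ingestion (an in-memory list stands in for the document database, and the record shapes are made up; ClinDAP's actual algorithms are proprietary):

```python
# Document store: every record is kept at its source granularity and
# tagged with provenance, rather than forced into a pre-modeled schema.
document_store = []

def ingest(source_system, study, records):
    """ELT-style load: store documents as-is, tagged with provenance."""
    for rec in records:
        document_store.append({
            "source": source_system,
            "study": study,
            "payload": rec,  # arbitrary structure, differs per source
        })

# Heterogeneous sources, no adapters or pre-formatting required:
ingest("EDC", "ABC-101", [{"form": "DM", "SUBJID": "001", "SEX": "F"}])
ingest("Lab", "ABC-101", [{"SampleID": "L9", "Test": "ALT", "Result": 31}])

# Query across sources after the fact, e.g. all EDC documents:
edc_docs = [d for d in document_store if d["source"] == "EDC"]
print(len(document_store), len(edc_docs))
# 2 1
```

Tagging at ingestion time is what lets reconciliation and mapping happen later, per study, without any pre-modeling of a shared schema.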