Metadata Driven Clinical Data Integration


Metadata Driven Clinical Data Integration – Integral to Clinical Analytics
April 11, 2016
Kalyan Gopalakrishnan, Priya Shetty – Intelent Inc.
Sudeep Pattnaik – Founder, Thoughtsphere
Agenda
• Role of Dynamism and Automation in Integration
• Integration – Two Approaches
• Metadata Driven Data Processing
• Metadata Driven Flows – Two Approaches
Copyright © 2016 Intelent Inc. All rights reserved.
Role of Dynamism and Automation in Integration
Dynamism

Drivers for Dynamism and Automation
• Source structure, transformation rules, and target structure are based on a study's analytical needs; most could vary across studies.
• This warrants a set of dynamic transformation rules to accommodate heterogeneous needs.
• In addition, the structure of the source, the physical storage, the maturity of the data transfer mechanism, and the relevant data dictionaries could vary vastly as well.
• It is important to minimize, and possibly avoid, any code change or transformation pre-processing services in the data ingestion layer; these are costly and time consuming, and discourage adoption within the enterprise.
• Storage structures at appropriate levels of hierarchy and stages of the data lifecycle need to be dynamic.
• Such dynamism needs to be planned by leveraging either existing metadata or manufactured metadata.
• Alternatively, or in addition, a robust user interface or other means of configuration can address gaps.
• The key is to minimize code change.
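The "minimize code change" driver can be illustrated with a small sketch (Python used for illustration; the field names and rule syntax are hypothetical assumptions, not a product's actual format): transformation rules live in study-level metadata, so accommodating a new study means editing configuration rather than code.

```python
# Study-specific transformation rules expressed as metadata (data, not code).
# Hypothetical field names: each rule maps a source field to a target field
# and names a generic operation to apply.
STUDY_RULES = {
    "STUDY-001": [
        {"source": "subj_id", "target": "USUBJID", "op": "upper"},
        {"source": "wt_kg",   "target": "WEIGHT",  "op": "float"},
    ],
    "STUDY-002": [  # different source layout, same generic engine
        {"source": "patient", "target": "USUBJID", "op": "upper"},
        {"source": "weight",  "target": "WEIGHT",  "op": "float"},
    ],
}

# Small library of generic operations shared by every study.
OPS = {"upper": lambda v: str(v).upper(), "float": float}

def transform(record, study_id):
    """Apply a study's metadata-driven rules to one source record."""
    rules = STUDY_RULES[study_id]
    return {r["target"]: OPS[r["op"]](record[r["source"]]) for r in rules}
```

For example, `transform({"subj_id": "abc-01", "wt_kg": "72.5"}, "STUDY-001")` yields `{"USUBJID": "ABC-01", "WEIGHT": 72.5}`; supporting STUDY-002's different layout required only a new metadata entry.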
Automation
• High availability of data to points of analysis
• From disparate sources: raw source data; data integrated across CTMS, IxRS, EDCs, and Labs; reconciled, cleansed/not-cleansed, and aggregated data
• Based on use cases: interim analysis, submission, operational metrics, central monitoring, medical monitoring, etc.
• The key is to automate data delivery in an appropriately usable format with minimal manual intervention
Integration – Two Approaches

Storage and Modeling
Warehouse Approach:
• Pre-modeling required; structure oriented
• Generic content model (schema) required based on storage technology, e.g. form/domain level storage
Hub Approach:
• No pre-modeling; loosely coupled
• Storage granularity preserved as per the source system
• Data tagged at the appropriate level after reconciliation

Source Data Integration
Warehouse Approach:
• Requires source system adapters and pre-formatting to the warehouse structure – ETL approach
• Requires source feeds to adhere to input descriptions, or requires setup/configuration
• Multiple adapter development efforts, especially with external sources (Labs/partner data)
Hub Approach:
• System agnostic integration; data is ingested at source-level granularity without pre-processing – ELT approach
• Enabling dynamism and automation for transformations requires:
  • Availability of a repository of governed metadata – structural and transformational
  • A robust mapping user interface at study level which utilizes a mapping library with auto (machine) learning technologies – promotes mapping reuse across studies
  • An interface that allows study-level mappings and leverages an existing library of rules
  • A post-processing pipeline architecture

Data Processing
Warehouse Approach:
• Heavy reliance on data pre-processing before loading into the warehouse
• Time consuming and costly
Hub Approach:
• Transformations accomplished on an as-needed basis, in a post-processing layer, based on business needs. For example:
  • Operational review processes need subject-level data granularity
  • Bio-statistical programming processes need SDTM +/- domain-level tabulated data
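The hub approach's ELT pattern can be sketched in a few lines (an illustration only; store layout, source names, and tagging logic are assumptions): records are loaded exactly as each source delivers them, preserving source granularity, and tagging happens after reconciliation rather than as a pre-load transformation.

```python
# Hub-style ELT sketch: ingest records untouched, preserving source
# granularity; tagging is a post-load step, not a pre-load transform.
# All names and the tagging heuristic are illustrative.
hub_store = []  # loosely coupled store; no pre-modeled schema

def ingest(source_system, records):
    """Load raw records as-is, remembering only where they came from."""
    for rec in records:
        hub_store.append({"source": source_system, "raw": dict(rec)})

def tag_after_reconciliation(match_key, tag):
    """Post-load step: tag records whose raw payload contains match_key."""
    for entry in hub_store:
        if match_key in entry["raw"]:
            entry.setdefault("tags", []).append(tag)

ingest("EDC",  [{"SubjectID": "001", "AETERM": "Headache"}])
ingest("Labs", [{"PID": "001", "TEST": "ALT", "RESULT": 42}])
tag_after_reconciliation("AETERM", "adverse-event")
```

Note that the two sources keep their own field names and shapes; a warehouse/ETL approach would instead have forced both feeds into one pre-modeled schema before loading.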
Metadata Driven Data Processing

Business Issue
• How do we provide quicker access to source and analysis-ready data?
• How do we adapt rapidly to changes in regulatory standards and apply these changes to business and operational processes?
• How do we bring more efficiency into the source-to-target mapping and transformation processes?

Solution Overview
• A data ingestion framework ingests data from diverse sources (clinical data, operational data, reference data)
• Structural metadata (source/target/reference) and transformational metadata (rules/derivations) are populated in a metadata repository
• A dynamic process applies transformation rules to the source data to generate target datasets
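The three solution steps above can be sketched end-to-end (a minimal illustration; the repository layout, column names, and derivations are assumptions): structural metadata describes the target layout, transformational metadata holds the rules and derivations, and a generic process applies them to source rows.

```python
# Minimal metadata-repository sketch: structural metadata describes the
# target layout; transformational metadata holds derivation rules.
# Layout and rule syntax are illustrative assumptions.
repository = {
    "structural": {"target_columns": ["USUBJID", "BMI"]},
    "transformational": {
        "USUBJID": lambda src: src["study"] + "-" + src["subject"],
        "BMI": lambda src: round(src["weight_kg"] / src["height_m"] ** 2, 1),
    },
}

def dynamic_process(source_rows):
    """Generic engine: applies repository rules; no study-specific code."""
    cols = repository["structural"]["target_columns"]
    rules = repository["transformational"]
    return [{c: rules[c](row) for c in cols} for row in source_rows]
```

Because the engine only reads the repository, adding a target column or changing a derivation is a metadata edit, which is the source of the reusability and traceability benefits listed under Solution Impact.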
Solution Impact
• High availability of data (source, integrated, standardized)
• Reusability of standard algorithms
• Dynamic automated process
• Accelerated path for submissions
• Enhanced support for product defense, data sharing, and data mining
• Traceability
Approach 1 – Metadata Driven Dynamic SAS Engine
• Structural and transformational metadata extracted from the metadata repository drive a dynamic program for generating hybrid SDTM target datasets
• The dynamic SAS process leverages SAS macros corresponding to the transformational metadata
• Source-to-target transformations: updates in the metadata repository are applied in the next run; MedDRA merge; ISO formats
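The "updates applied in the next run" behavior can be mimicked in a few lines (Python stands in for the SAS macro dispatch here; the rule and field names are hypothetical): the engine resolves transformation rules from the repository on every execution, so editing the metadata changes the output without touching the program.

```python
# The engine binds rules at run time, mimicking a dynamic SAS process that
# resolves macros from transformational metadata on each execution.
# Rule and field names are hypothetical.
MACROS = {
    "iso_date": lambda v: "%s-%s-%s" % (v[0:4], v[4:6], v[6:8]),
    "identity": lambda v: v,
}

metadata_repo = {"BRTHDTC": "identity"}  # current rule for the date field

def run_engine(record):
    """Re-read the repository on each run; apply the macro it names."""
    return {field: MACROS[rule](record[field])
            for field, rule in metadata_repo.items()}

raw = {"BRTHDTC": "19700131"}
first = run_engine(raw)                # rule "identity": value passes through
metadata_repo["BRTHDTC"] = "iso_date"  # update repository, no code change
second = run_engine(raw)               # next run applies ISO 8601 formatting
```

The second run emits `1970-01-31` for the same input, echoing the slide's point that repository updates (here, an ISO date format rule) take effect on the next run.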
Approach 2 – ClinDAP – Thoughtsphere's Metadata Driven Source System Agnostic Clinical Data Aggregation Framework

ClinDAP Next Generation Data Aggregation Platform:
• Source system agnostic data aggregation framework
• Proprietary algorithms to aggregate disparate data sources (EDC, CTMS, IVRS, Labs, ePro, etc.)
• Document-oriented database readily assembles any structured or unstructured data
• Robust mapping engine with an extensible rule library reusable across studies (Hybrid SDTM)
• Interactive visualization-based data discovery
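The document-oriented storage point can be illustrated with plain dicts (an analogy only, not ClinDAP's actual storage engine; the sources and field names are invented): differently shaped records coexist in one collection and are queried without a shared schema.

```python
# Document-store analogy: one collection holds differently shaped records
# from EDC, Labs, and ePro side by side; no common schema is required.
# (Illustrative only -- not ClinDAP's actual implementation.)
collection = [
    {"source": "EDC",  "subject": "001", "form": "AE", "term": "Nausea"},
    {"source": "Labs", "subject": "001", "test": "ALT", "value": 42},
    {"source": "ePro", "subject": "002", "diary": {"day": 3, "score": 7}},
]

def find(**criteria):
    """Return documents whose fields match all the given criteria."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]
```

For example, `find(subject="001")` returns both the EDC and Labs documents even though they share no structure beyond the subject field, which is what lets a hub ingest structured and semi-structured feeds without pre-modeling.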
Robust Mapping Framework – reusable mapping library; leverages existing SAS libraries; supports complex study-level transformations; extensible targets (Hybrid SDTM, ADaM)

The ability to operationalize analytics becomes possible when automation and dynamism are enabled to integrate data and generate standardized datasets.