downloading - University of Kansas Medical Center

KUMC Biomedical Informatics Resources
for your Research: a focus on HERON
Russ Waitman, PhD
Director of Medical Informatics,
Associate Professor, Department of Biostatistics
Director, Frontiers Biomedical Informatics
Assistant Vice Chancellor, Enterprise Analytics
University of Kansas Medical Center
Kansas City, Kansas
This project is supported in part by NIH grant UL1TR000001
and NSF Award CNS-1258315
Biomedical Informatics Can Help Your Research
• We have tools and expertise to manage data
and convert it into information
• REDCap and CRIS – enter and manage data
• HERON – fish for data from the hospital/clinic
• Biweekly Frontiers Clinical Informatics Clinics
– Tuesday 4-5 pm in 1028 Dykes Library.
– Next session April 30, 2013.
You’re that fisherman: wanting to land data to
answer your research hypothesis
Bennett Spring Trout Park, Lebanon Missouri
http://mdc.mo.gov/regions/southwest/bennett-spring
The Fish: Diagnoses, Demographics, Observations,
Treatments
Why so many fish?
Current Goal: Build Hatchery, Manage the Fishery
Second Goal: If you need help fishing, get a guide
Photo Credit: HuntFishGuide.com
http://www.flickr.com/photos/huntfishguide/5883317106/
Prepare and Analyze Data
Photo Credit: S. Klathill
http://www.flickr.com/photos/sklathill/505464990/
Our shared goal: a tasty publication
Photo Credit: Steve Velo
http://www.flickr.com/photos/juniorvelo/259888572/
Nightmare: looks like a nice river, but can’t catch fish
• I’ll just enter everything in Excel….
• What if I lose or accidentally sort my
spreadsheet?
• How do I let students only review deidentified data?
• Hospital/Clinic is making me use this
Electronic Medical Record and I get
nothing in return...
Little White Salmon River, Washington State, last Summer in July
Sometimes, You’re willing to enter data/buy fish:
REDCap: Research Electronic Data Capture
• https://redcap.kumc.edu
– It uses the same username and password as your KUMC email.
• Non-KUMC researchers can request an affiliate account through the Frontiers CTSA office
– Check out the training materials under videos
– Case Report Forms and Surveys
• For consultation and to move project to production: Register
your project with us so we can keep track of your request.
– http://biostatistics.kumc.edu/projectReg.aspx
– After you register your project, a CRIS team member, likely Kahlia Ford, will get in touch
with you.
• Check out other institutions using REDCap and possibly
borrow from the master library.
– http://www.project-redcap.org/
REDCap Case Report Form Example
REDCap Survey: Think SurveyMonkey
Option Two: CRIS
REDCap Disclaimer
• For clinical trials, CRIS (Velos) may be a better fit
– Multiple years of experience
– CRIS team builds for you with biostatistics review
– Budget for CRIS team and biostatistics explicitly
• “Investigator driven” REDCap only works if you, the Principal
Investigator, take responsibility for your data
– Scalability: informatics provides consultation and responsibility for
technical integrity; not your dictionary or data entry.
• Underwritten by CTSA, but you “feed and talk to your fish”
– Middle model where informatics can build for you in REDCap.
• Again, you budget for our team’s time
REDCap: think Fish Tank you manage
http://www.flickr.com/photos/wiccked/185270913/lightbox/
I want to go fishing, not fill a fish tank (REDCap)
Use HERON: a managed fishery
Bonneville Hatchery: Trout, Salmon, Sturgeon, Columbia River, Oregon
Central CTSA Informatics Aim: Create a data “fishing”
platform: HERON, https://heron.kumc.edu
• Get a License: Develop business
agreements, policies, data use agreements
and oversight.
• Get a Fishing Rod and Bass Boat:
Implement open source NIH funded (i.e.
i2b2 https://www.i2b2.org/) initiatives for
accessing data.
• Know what you’re catching: Transform data
into information using the NLM UMLS
Metathesaurus as our vocabulary source.
• Stock Different Tasty Fish: link clinical data
sources to enhance their research utility.
HERON: Getting a Fishing License
Single sign-on
using your email
username
Real-time check
for current human
subjects training
• Fill out System Access Agreements to sponsor students/staff
• Fill out Data Use Agreement to request data export
• No Limit!!! IRB Protocol Not Required to view or pull deidentified data
• Must be on campus or use VPN or https://access.kumed.com
• Check http://frontiersresearch.org/frontiers/HERON-Introduction
for more information, status, and training videos
The i2b2 “Fishing Rod”: build Diabetes cohort
Types of “fish” in folders
Drag concepts from upper left
into panels on the right
i2b2: AND in Frontiers Research Registry
Dragging over the second condition
i2b2: AND a high Hemoglobin A1C
When you add a numeric concept,
i2b2 asks if you want to set a constraint
i2b2 Result: 497 patients in Cohort
Run the Query
Query took 4 seconds
497 patients in cohort
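The query built above ANDs three panels together: a diagnosis, a registry flag, and a numeric constraint. A minimal Python sketch of that logic follows. It is not how i2b2 is implemented; the patient IDs, the HbA1c values, and the 8.0 threshold are all invented for illustration, and the function names are hypothetical.

```python
# Toy data only: patient IDs and lab values are invented for illustration.
diabetes_dx = {101, 102, 103, 104, 105}            # panel 1: diabetes diagnosis
research_registry = {102, 103, 105, 106}           # panel 2: research registry flag
hba1c_results = {102: 6.1, 103: 9.4, 105: 8.2, 107: 10.0}  # HbA1c (%) per patient


def patients_with_value_above(results, threshold):
    """Return the IDs whose numeric result exceeds the threshold."""
    return {pid for pid, value in results.items() if value > threshold}


def and_panels(*panels):
    """Intersect the panels: a patient must appear in every one of them."""
    return set.intersection(*panels)


# The 8.0 cut-off is an assumption; the slides only say "a high Hemoglobin A1C".
high_a1c = patients_with_value_above(hba1c_results, 8.0)
cohort = and_panels(diabetes_dx, research_registry, high_a1c)
print(sorted(cohort), len(cohort))
```

Each panel narrows the cohort, which is why adding a condition can only keep the count the same or reduce it.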
i2b2: Explore Cohort, Visualize
The dream: landing the big one
Catch the data
for JAMA, NEJM
publication
http://www.oregon.com/columbia_gorge_attractions/bonneville_hatchery
Without getting bit
How the team works: HERON Evolves Every Month
• Goal: stable monthly process, minimal downtime
• Complete rebuild of the repository, not incremental updates based on HL7 messaging.
• Two databases: create new DB while old DB is in use.
• When the new DB is ready, switch over i2b2 to serve customers fresh data.
• Initial Files from Clinical Organizations
• Export KUH Epic Clarity relational database instead of Cache/MUMPS.
• Monthly file from UKP clinic billing system (GE IDX). UHC CDB, NAACCR
• Demographics, services, diagnoses, procedures, and Frontiers research
participant flag.
• Extract Transform Load (ETL) processes are largely SQL (some Oracle
PL/SQL)
• Wrapped in python scripts.
• Goals for a monthly release (20 months in a row so far):
– Fresh data. Example: another month of visits = millions of facts
– New types of data. Example: family history
– New functionality: Example: link data by encounter across clinical and
financial sources; distinguish medication administration from prescription
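The two-database release process described above (rebuild the idle database, then switch i2b2 over) can be sketched in Python as follows. This is an illustration under stated assumptions, not the HERON ETL code: the `Repository` class, its method names, and the empty-build check are hypothetical, and in-memory lists stand in for real databases.

```python
class Repository:
    """Hypothetical stand-in for the two-database setup: one live, one being rebuilt."""

    def __init__(self):
        self.databases = {"A": [], "B": []}  # each value is a list of facts
        self.live = "A"                      # the database users currently query

    def standby(self):
        """Name of the database that is not serving users."""
        return "B" if self.live == "A" else "A"

    def monthly_release(self, source_facts):
        """Rebuild the standby database from scratch, then switch over."""
        target = self.standby()
        self.databases[target] = list(source_facts)  # complete rebuild, no incremental update
        if not self.databases[target]:
            # Assumed safety check: never switch users to an empty build.
            raise ValueError("refusing to switch to an empty build")
        self.live = target  # users now see the fresh data
        return target

    def query_count(self):
        """Number of facts a user of the live database would see."""
        return len(self.databases[self.live])


repo = Repository()
repo.monthly_release(["visit-1", "visit-2"])
repo.monthly_release(["visit-1", "visit-2", "visit-3"])
print(repo.live, repo.query_count())
```

Because the old database keeps serving queries until the switch, users see no downtime during the rebuild itself.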
Monthly Release
Blog highlights:
https://informatics.kumc.edu/work/blog
- Features
- Size
- Dates of sources
HERON’s Data Sources, Types of Data
https://informatics.kumc.edu/work/wiki/HeronProjectTimeline#Sep2012Planning
- contains current plan for next several monthly releases
“Who’s Using HERON” and collaboration approaches
• Find a colleague
• Talk with hospital, clinic to
understand workflow
• Attend bi-weekly clinics
• Watch the videos:
http://frontiersresearch.org/frontiers/informatics-training-videos
• Request a consult
http://frontiersresearch.org/frontiers/biomedicalinformatics
If you don’t see what you want, or you
really like things, let us know:
https://redcap.kumc.edu/surveys/?s=3SBkPg&tool=1
HERON De-identification: Remove HIPAA 18
identifiers -> non human subjects research
• HIPAA Safe Harbor De-identification
– Remove 18 identifiers and randomly shift dates by up to 365 days back
in time
• Downside: can’t do seasonal studies without IRB approval to go back and get actual
dates
• In general, tack on 7 months when wanting volume for the last year.
– Resulting in non-human subjects research data but treated as a limited
data set from a system access perspective. System users and data recipients
agree to treat as a limited data set (acknowledging re-identification risk)
• To be addressed:
– For now, we won’t add free text such as progress notes with text
scrubbers (DeID, MITRE Identification Scrubber toolkit)
• Date Shift example:
– Patient was born August 13, 1968, had their blood pressure measured on
November 28, 2012.
– Each month the dates are shifted, e.g. by -15 days for the January release: the new
birthday is July 29, 1968 and the blood pressure measurement occurred on
November 13, 2012.
• For another patient, their offset might be -278. Next month the Aug 13th patient’s offset
might be -192.
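The date-shift arithmetic in this example can be checked with a short Python sketch. It assumes one offset per patient is applied to all of that patient's dates within a release, as the example suggests. The function names and the 1-day lower bound on the random offset are assumptions, not details of the HERON implementation.

```python
from datetime import date, timedelta
import random

MAX_SHIFT_DAYS = 365  # from the slide: shift by up to 365 days back in time


def draw_offset(rng):
    """Pick a negative per-patient offset in days (1-day minimum is an assumption)."""
    return -rng.randint(1, MAX_SHIFT_DAYS)


def shift_date(original, offset_days):
    """Apply a patient's offset (negative means back in time) to one date."""
    return original + timedelta(days=offset_days)


# Reproduce the slide's example with an offset of -15 days.
offset = -15
birth = shift_date(date(1968, 8, 13), offset)
blood_pressure = shift_date(date(2012, 11, 28), offset)
print(birth, blood_pressure)  # 1968-07-29 2012-11-13

# The same offset on both dates keeps the interval between them unchanged.
print(blood_pressure - birth == date(2012, 11, 28) - date(1968, 8, 13))  # True

# A different patient, or the same patient next month, gets a new offset.
print(draw_offset(random.Random()))
```

Intervals within one patient's record survive the shift, while calendar position does not, which is why seasonal studies need the actual dates.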
Research Context: Medical Informatics Hypotheses
Hypothesis #2: Computer +
Clinical Process-> Better Health?
Hypothesis #1:
Admin + Clinical
-> Better Knowledge?
Emerging Functionality: From Data Aggregation to
Hospital Quality Preliminary Analysis
• Motivation: Build a way to go beyond counting and obtain insight
before you need a Data Use Agreement and release patient data.
– Grows out of Dan Connolly’s survival analysis tool for NCI site visit
– Intermediate step of a multi-cohort generalized survival plugin
– R Data Builder plugin in i2b2 and integration with RStudio Server
• (http://www.rstudio.com/ide/docs/server/getting_started)
• Test Case: Antibiotic Administration
for Septic patients in the Emergency
Room
– Past publication on bringing in flowsheet
data is an important foundation
– University HealthSystem Consortium
CDB “gold” standard for KU Hospital
– What can you solve in i2b2 “same
financial encounter” versus send to R?
Repurposing i2b2 Clinical Research Infrastructure for
Inpatient Quality Improvement
• i2b2 “largely” ambulatory or population/genomics focused
• Is i2b2 version 1.6 with same financial encounter and modifiers
now useful for inpatient research?
• Goal: understand medication
timing and antibiotic selection
• Suspect vancomycin preferred
• Validate HERON medications
– Especially administration timing
Systems Architecture
• Source System files (EMR dump, UHC CDB extract): secure FTP/ETL
• Identified data server: staged source data; monthly refresh ETL; i2b2 compatible star schema
• De-identification process
• De-identified server: i2b2 compatible star schema
• Application server: i2b2, Hive, rgate
• RStudio Server: R scripts, plots, statistics
• Investigator’s client: i2b2 web client (one tab in browser); RStudio IDE web client (another tab in browser)
R Data Builder Plugin and RStudio Server
Web based for user. Just
another tab in the browser
All data stays on the server
so there’s no data release
and risk of re-identification
due to a lost file
i2b2 Plugin invokes a program that creates
a Rda file in their directory on the server
UHC, Flowsheets, Medications data sources:
what i2b2 could answer versus R analysis
3513 patients had
a UHC-defined
septicemia
diagnosis
2912 patients were
an Emergency
Admission
2861 patients were
age 18 years or
older
2722 patients had
an exposure to an
Antibiotic in the
encounter
1839 had
ED Triage
documentation
during the
encounter
i2b2 could define cohort
1223 had 2 SIRS
criteria, organ
dysfunction and
suspicion/treatment
of infection
1836 had the
Sepsis Screen
Used during the
encounter
Cohorts above line
defined with i2b2
717 MD notified
Average time to sepsis
screening 2.9 hours,
median 49 minutes
Note: 28 patients who lacked an ED departure
time were excluded from further analysis
Cohorts below line
further refined with R
1244 patients had
1st antibiotic admin
within 24 hours
(1474 encounters)
Average time in
ED is 7.9 hours,
median 7.1
A
cohort refinement with R
261 had 1st antibiotic
admin before sepsis
screening (277
encounters)
D
1040 had 1st antibiotic admin
after sepsis screening
(1197 encounters)
993 had 1st antibiotic
admin given in ED
(1140 encounters)
316 had 1st antibiotic
admin not in ED
(334 encounters)
E
B
C
Average time
spent in ED is 8.7
hours, median 7.6
Average time
spent in ED is 6.7
hours, median 6.6
Density Plots: Time from Arrival to First Antibiotic
[Figure: four density plots. x-axis: Hours (0 to 25); y-axis: Proportion of Encounters]
1. Broad Spectrum versus Vancomycin (legend: Drug = broad, vanc)
2. Lag in Broad Spectrum after Vancomycin (legend: Drug = broad, vanc)
3. Lag when given outside Emergency Room (legend: When = in.ed, not.in)
4. Administration relative to RN Sepsis Screen (legend: Admin = before, after)
Aligning Clinical Research Informatics for Quality:
Registry Abstraction and Data Delivery
• REDCap registries into i2b2 allows intuitive exploration
– Researchers may need less abstraction as data is extracted from the EMR.
• i2b2 into REDCap: inherit security model, graphical/export tools
Next Steps
• Informatics Research and Systems for Hypothesis #1
– Administrative plus Clinical/Biomedical data provides better knowledge
– Current UHC models of administrative data based on linear regression
• Want to reproduce UHC models with our data in HERON
– Then develop systematic method to evaluate utility of clinical data
• Perhaps applicability of newer machine learning and statistical methods and methods for
validation (ex: bootstrapping)
• Engage with Clinical Researchers and Hospital Quality
– Continue to harvest valuable data: microbiology, discrete pathology results
– Advance streamlined methods for self service
• Recognize, though, that data driven research is non-trivial and sometimes the effort is
underestimated by investigators
• Harvest Epic alerts (best practice, drug interaction), Orderset
Utilization to evaluate Hypothesis #2
– Computer + Clinical Process -> Improved Decisions and Better Health
Questions?