Automating the Inpatient Chronic
Heart Failure Quality Measures in
VA
www.wordle.net
Jennifer Garvin PhD, MBA, RHIA, CPHQ, CCS, CTR, FAHIMA
Salt Lake City VA Healthcare System IDEAS Center
University of Utah 5/16/14
This study is supported by the VA HSR&D IBE- 09-069-1 grant
Overview
• Describe development of natural language
processing (NLP) tools
• Describe inpatient chronic heart failure (CHF)
quality measures
• Demonstrate the development of natural
language processing tools using the
case study of CHF
Applied Use Case
Purpose & Background
Decision
Support
Performance
Measures
Appropriateness
Measures
Clinical
Guidelines
Evidence
Clinical
Processes
Formulary
Clinical
Reminders
http://www.healthquality.va.gov/chf/
VA Informatics and Computing
Infrastructure- VINCI
• Research approvals
• Assigned a research folder on VINCI
• In VINCI all approved investigators and staff
can access data
• http://www.hsrd.research.va.gov/for_researchers/vinci/
Preparation for Research
• Workflow analysis- visitation and discussion with 2 echo
laboratories
• Understanding how the documents are developed: variety of approaches
• Initial review of document structure to inform our
sampling strategy- we planned to oversample free
text and semi-structured text
– Paragraph with no outline structure (free text)
– Outline with some free text (semi-structured text)
– Outline (highly structured text)
Document Format (Fake Texts using XXX
in Place of Letters)
EF Sampling Strategy
• We had a total of 765 documents available. We needed a minimum of
367 documents for the test set, and the remaining 398 documents
were available for training.
• However, if during system training, the performance of the
system reaches the pre-specified level of accuracy without
using all available training documents, the remaining unused
documents in the training set could be added to the test set.
• The 765 documents were randomly assigned to a training and
test set in preparation for annotation.
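The random assignment described above could be sketched as follows. This is a minimal illustration, not the study's actual procedure: the document identifiers, the fixed seed, and the function name `split_documents` are assumptions. Only the counts (765 total, 367 test, 398 training) come from the slide.

```python
import random


def split_documents(doc_ids, n_test, seed=0):
    """Randomly assign documents to a test set of size n_test.

    Returns (training, test). The seed is fixed only so the
    illustration is reproducible; it is not from the study.
    """
    if n_test > len(doc_ids):
        raise ValueError("n_test exceeds the number of available documents")
    shuffled = list(doc_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical identifiers standing in for the 765 echocardiogram reports.
doc_ids = [f"doc{i:03d}" for i in range(1, 766)]
training, test = split_documents(doc_ids, n_test=367)
print(len(training), len(test))  # 398 367
```

Documents left unused when training stops early could then be moved from `training` to `test`, as the second bullet describes.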
NLP Development Methods: Reference (Gold) Standard
Development
• All documents in the training and test sets
must have an accompanying reference (gold)
standard so that the accuracy of the system
can be measured during training and testing
• Software program called Knowtator was used
for annotation
• Two independent reviewers with a third
adjudicator when disagreement occurred
Ejection Fraction Annotation Schema
Classes
1. Ejection Fraction – annotate all mentions
of left ventricular ejection fraction.
2. Value – annotate all mentions of the
quantitative value associated with left ventricular
ejection fraction.
3. Qualitative assessment – annotate all
mentions of qualitative assessment of LV
ejection fraction and LV systolic function.
4. LV systolic function – annotate all
mentions of left ventricular systolic function.
5. Document level
a) EF Range (<40%, >=40%, undetermined)
b) Informativeness: The format of the document
was sufficiently predictable that I could skim it
rapidly to find what I wanted
c) Consistency: In order to verify that the
information in the document was internally
consistent, I found that I had to go “back and
forth”
NLP Development Methods: Training and Testing
• Training
– The system we used had already been developed but was not “trained” for this
specific use case
– Separated the training documents into batches and
ran the system on a set of the batches
– Evaluated false positives and false negatives based on comparison to
the reference (gold) standard
– Reprogrammed the system and ran it against the next batch
– When the pre-specified level of accuracy was reached, measured accuracy at
the last iteration
• Testing
– The system is run on the sequestered documents and the output
is received by the statistician
Automated Data Acquisition for Heart
Failure (ADAHF)
Diagram of Overall Classification and Sub-classifications
Summary of Development Steps
• Determine a use case
• Investigate if there are
existing tools that have
been used for a given
use case
• If none, an existing tool
may need to be
generalized or a new
one developed
• Determine data
elements
• Determine
development
environment
• Train and Test
• Assess accuracy via
sensitivity (recall),
specificity, and positive
predictive value
(precision)
Stakeholder Engagement
Theoretic Framework and Model
• The Promoting Action on Research Implementation in
Health Services (PARIHS) framework1-2
– Evidence
– Context
– Facilitation
• Socio-Technical Model (STM)3
Eight dimensions, of which we are using four:
– hardware and software
– clinical content
– workflow and communication
– internal organizational features
1 Stetler, 2011 http://www.implementationscience.com/content/6/1/99
2 Kitson, 2008 http://www.implementationscience.com/content/3/1/1
3 Sittig and Singh, A new sociotechnical model for studying
health information technology in complex adaptive
healthcare systems, Qual Saf Health Care 2010;19
Stakeholder Engagement: Semistructured Interview and Thematic
Analysis
• Approach is “applied” (to solve a problem)6, using a theoretical
thematic analysis7
• Two independent reviewers each create a summary and
organize and identify themes to answer the research questions
• Research group met to develop consensus codes on master
themes.
• Three documents resulted: two summaries, group-consensus codes, consensus codes with highlighted text
6 Guest et al, Applied Thematic Analysis, 2012
7 Braun et al, Using Thematic Analysis in Psychology, 2006
Stakeholder Interview Results:
Respondent Characteristics
• We interviewed 13 stakeholders. The interviewees included
among others: clinical quality specialists; directors of quality
management, clinical analysis and reporting; epidemiology;
clinicians and pharmacists; and program analysts
• The range for the number of years in VA of respondents is 2-35
• And similarly, the range for the number of years in
quality/patient safety is 2-33
Stakeholder Engagement Preliminary
Results - Internal Factors
• Internal factors that facilitate implementation
of an automated system include:
– Use of evidence-based care
– A culture of continuous quality improvement
coupled with measurement and accountability
processes
– Quality control reporting both within and external
to the VA.
Stakeholder Engagement (cont.)
Hardware and Software-Preliminary Findings
We have an Informatics-Rich Environment in
VA
• Informatics is used for:
– Communication between
Providers and Patients
• Secure messaging
• Blue Button download
• Kiosks
• Mobile technology
• MyHealtheVet
• Informatics is used for
(cont.):
– Clinical Care
• CPRS
• Clinical Decision Support
• Templates designed to
facilitate clinically
relevant content
• Smart forms
• CART-CL
• Primary Care Almanac
Stakeholder Engagement- Preliminary
Results- Informatics Applications Used
with Quality Metrics
• Quality Improvement
Functionality (current)
• Performance integrated
tracking application (PITA)
• Measure Master Report
• System extraction and
output specifics:
• Capture the concepts and
values as well as the
words around the
concepts
• Have the ability to adjust
the EF value captured
• In Development: growing number of
informatics tools
– Process chart notes using NLP
– Surveillance tools
– Analytic tools
• Meaningful Use
– Determine how we could
provide data for
meaningful use.
Formative Evaluation Process
• Initial stakeholder
engagement- went
through a couple cycles
of:
– Develop a prototype of
the tool with report
– Revise based on
feedback
• Developed a final
prototype of the table
• Develop an initial
functional HMP Module
• User-centered Design
Analysis
Thank you! Questions or
Comments?
• Please contact me at
[email protected]
• This study is undertaken as part
of the VA HSR&D IBE- 09-069-1
grant. The views expressed in this
article are those of the authors
and do not necessarily represent
the views of the Department of
Veterans Affairs or the University
of Utah School of Medicine.
• I thank the Department of
Veterans Affairs for my fellowship
and Gail Graham RHIA and Mark
Weiner MD for being my mentors
• ADAHF Team:
– Julia Heavirland/ Jenifer Williams
(Annotators)
– Youngjun Kim (Application Specialist)
– Stephane Meystre MD, PhD (Faculty
System Developer)
– Drs. Bruce Bray, Paul Heidenreich, Mary
Goldstein, Wendy Chapman, Michael
Matheny, Gobbel (Co-investigators)
– Andrew Redd PhD and Dan Bolton MS
(Statisticians)
– Megha Kalsy MS and Natalie Kelly MBA
(Stakeholder Engagement)
– Jennifer Garvin PhD, MBA (Principal
Investigator)
Extra Slides
EF Sampling Strategy
• To account for clustering, the sample size was increased by
the design effect. For ICC = 0.005, Deff = 1 + 25(0.005) =
1.125, so we require 179(1.125) = 201.375, rounded up to 202 positive
cases, or 29 positive cases per facility.
• Dividing by the prevalence estimate of EF in documents, we
have 29/0.80 = 36.25, rounded up to 37 documents required per facility.
We multiply that result by the number of sites, and the
product is 37(7) = 259.
• Doubling the sample size for the three facilities with free- or
semi-structured text resulted in a minimum of 367
documents in the test set.
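The arithmetic on this slide can be retraced with a short sketch. The numbers 179, 25, 0.005, 0.80, 7 sites, and 3 oversampled sites are from the slide; the function and parameter names are hypothetical. The slide does not say how the doubling was rounded: doubling the unrounded 36.25 and rounding up to 73 is an assumption, chosen because it reproduces the stated minimum of 367.

```python
from fractions import Fraction
from math import ceil


def ef_test_set_size(
    base_positive_cases=179,
    icc=Fraction("0.005"),
    icc_multiplier=25,
    prevalence=Fraction("0.80"),
    n_sites=7,
    n_oversampled_sites=3,
):
    """Return (baseline_total, oversampled_total) document counts."""
    # Exact fractions avoid floating-point surprises before rounding up.
    deff = 1 + icc_multiplier * icc                       # 1.125
    positives_needed = ceil(base_positive_cases * deff)   # 201.375 -> 202
    positives_per_site = ceil(Fraction(positives_needed, n_sites))  # 29
    docs_per_site_exact = positives_per_site / prevalence           # 36.25
    docs_per_site = ceil(docs_per_site_exact)                       # 37
    baseline_total = docs_per_site * n_sites                        # 259
    # ASSUMPTION: double the unrounded per-site figure, then round up.
    docs_per_oversampled_site = ceil(2 * docs_per_site_exact)       # 73
    oversampled_total = (
        (n_sites - n_oversampled_sites) * docs_per_site
        + n_oversampled_sites * docs_per_oversampled_site
    )
    return baseline_total, oversampled_total


baseline_total, oversampled_total = ef_test_set_size()
print(baseline_total, oversampled_total)  # 259 367
```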
Definitions
• Sensitivity is the proportion of patients with disease who test positive.
In probability notation: P(T+|D+) = TP / (TP+FN).
• Specificity is the proportion of patients without disease who test
negative. In probability notation: P(T-|D-) = TN / (TN + FP).
• Sensitivity and specificity describe how well the test discriminates
between patients with and without disease. They address a different
question than we want answered when evaluating a patient, however.
What we usually want to know is: given a certain test result, what is
the probability of disease? This is the predictive value of the test.
Predictive value of a positive test (PPV) is the proportion of patients
with positive tests who have disease. In probability notation: P(D+|T+) =
TP / (TP+FP).
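As a minimal sketch of these definitions, the three measures can be computed from the four cells of a confusion matrix. The function name and the example counts are hypothetical, not study results.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and PPV from confusion-matrix counts.

    A measure is None when its denominator is zero (undefined).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # P(T+|D+), also called recall
        "specificity": ratio(tn, tn + fp),  # P(T-|D-)
        "ppv": ratio(tp, tp + fp),          # P(D+|T+), also called precision
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=10, tn=80, fn=20)
print(metrics)
```

In the NLP setting, "disease" corresponds to a concept being present in the reference (gold) standard and "test positive" to the system extracting it.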
Definitions
• The weighted harmonic mean of precision and
recall is the F-measure
• F= 2*Precision*Recall/(Precision + Recall)
• Kappa- more informative than percent
agreement as it accounts for chance
agreement
• K = (Observed agreement - Hypothetical
probability of chance agreement) / (1 - Hypothetical probability of chance agreement)
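A short sketch of both measures, assuming the standard forms: the balanced F-measure, and Cohen's kappa computed as (observed agreement minus chance agreement) divided by (1 minus chance agreement). The function names and the example labels are hypothetical. Returning 1.0 when chance agreement equals 1 is a convention chosen here, since kappa is undefined in that case.

```python
from collections import Counter


def f_measure(precision, recall):
    """Balanced harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's label proportions.
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if chance == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - chance) / (1 - chance)


# Hypothetical annotations from two reviewers (1 = EF mention, 0 = none).
reviewer_1 = [1, 1, 0, 0]
reviewer_2 = [1, 0, 0, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(kappa, f_measure(0.5, 1.0))
```

Here percent agreement is 0.75 but kappa is lower, because half of that agreement would be expected by chance given how often each reviewer used each label.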
Definitions
• Regular expressions:
– A regular expression (regex or regexp for short) is
a special text string for describing a search
pattern.
– www.regular-expressions.info
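To make the idea concrete, here is a minimal sketch of a regular expression that finds an ejection fraction mention followed by a percentage value, then maps it to the document-level EF range classes (<40%, >=40%, undetermined). It is an illustration only, not the ADAHF system: the pattern, the function names, the use of the first mention, and the handling of ranges that straddle 40% are all assumptions.

```python
import re

# Concept mention, a short gap, then a value or range ending in "%".
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection\s+fraction)\b"    # the EF concept
    r"[^\d%]{0,30}?"                           # up to 30 non-digit characters
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?"   # single value or low-high range
    r"\s*%",
    re.IGNORECASE,
)


def extract_ef_values(text):
    """Return a list of (low, high) EF percentages found in the text."""
    values = []
    for match in EF_PATTERN.finditer(text):
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) else low
        values.append((low, high))
    return values


def ef_range(text):
    """Classify a document using its first EF mention (an assumption)."""
    values = extract_ef_values(text)
    if not values:
        return "undetermined"
    low, high = values[0]
    if high < 40:
        return "<40%"
    if low >= 40:
        return ">=40%"
    return "undetermined"  # range straddles 40%


print(ef_range("The left ventricular ejection fraction is estimated at 35%."))
print(ef_range("LVEF: 55-60%"))
print(ef_range("Normal LV systolic function."))
```

A pattern this simple misses qualitative statements such as "moderately reduced systolic function", which is why the annotation schema treats qualitative assessments as a separate class.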