Systematic Review Module 6:
Data Abstraction
Joseph Lau, MD
Thomas Trikalinos, MD, PhD
Tufts EPC
Melissa McPheeters, PhD, MPH
Jeff Seroogy, BS
Vanderbilt University EPC
CER Process Overview
Prepare topic:
· Refine key questions
· Develop analytic frameworks
Search for and select studies:
· Identify eligibility criteria
· Search for relevant studies
· Select evidence for inclusion
Abstract data:
· Extract evidence from studies
· Construct evidence tables
Analyze and synthesize data:
· Assess quality of studies
· Assess applicability of studies
· Apply qualitative methods
· Apply quantitative methods (meta-analyses)
· Rate the strength of a body of evidence
Present findings
Learning Objectives
What is data abstraction? Why do it?
What kind of data to collect?
How much data to collect?
How to collect data accurately and efficiently?
How many extractors? With what background?
What do abstraction forms look like?
What are some challenges in data abstraction?
Is it feasible to query original authors?
Aims of Data Abstraction
 Summarize studies to facilitate synthesis
 Identify numerical data for meta-
analyses
 Obtain information to assess the quality
of studies more objectively
 Identify future research needs
On Data Abstraction (I)
Abstracted data should
– accurately reflect information reported in the publication
– remain in a form close to the original reporting (so that disputes can be easily resolved)
– provide sufficient information to understand the studies and to perform analyses
Abstract what is needed (avoid overdoing it); data abstraction is labor intensive and can be costly and error prone
Different questions may have different data needs
On Data Abstraction (II)
Involves more than copying words and numbers from the publication to a form
Clinical domain, methodological, and statistical knowledge is needed to ensure the right information is captured
Interpretation of published data is often needed
Quality assessment of articles belongs in this step
Appreciate the fact that what is reported is sometimes not necessarily what was carried out
What Data to Collect?
Guided by key questions and eligibility criteria
Anticipate what data the summary tables should include and what data will be needed to answer the key questions and to conduct meta-analyses
Data extraction follows the PICO format and includes study design (see the record sketch after this list)
– Population
– Intervention or exposure
– Comparators (when applicable)
– Outcomes and numbers
– Study design
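A minimal sketch of how one abstracted study might be represented as a PICO-plus-design record. The field names and example values below are illustrative assumptions, not an EPC or AHRQ standard extraction form.

```python
# Illustrative sketch only: field names and values are hypothetical,
# not an EPC or AHRQ standard extraction form.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExtractionRecord:
    """One study's abstracted data, organized as PICO plus study design."""
    study_id: str                     # e.g., first author and publication year
    population: str                   # age, gender distribution, disease stage, etc.
    intervention: str                 # intervention or exposure
    comparator: Optional[str]         # when applicable
    outcomes: List[str] = field(default_factory=list)  # outcome definitions and numbers
    study_design: str = ""            # RCT, cohort, case-control, diagnostic study, etc.

# Hypothetical example record
record = ExtractionRecord(
    study_id="Smith 2005",
    population="Adults referred for suspected obstructive sleep apnea, mean age 52",
    intervention="Home portable monitor",
    comparator="In-laboratory polysomnography",
    outcomes=["AHI >= 15 events per hour"],
    study_design="Cross-sectional diagnostic accuracy study",
)
```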
Data Elements: P, I, C
Population: generic elements may include patient characteristics such as age, gender distribution, and disease stage; more specific items may be needed according to topic
Intervention or exposure and comparator items depend on the abstracted study
– RCT, observational study, diagnostic test study, prognostic factor study, family-based or population-based genetic studies, etc.
Data Elements: O
 Outcomes should be determined a priori
with Technical Expert Panel
 Criteria often are not clear as to which
outcomes to include and which to
discard
– For example, mean change in ejection fraction vs. the proportion with an increase in ejection fraction of >5%
 May be useful to record different
outcome definitions and consult content
experts before making a decision
Data Elements: O
Apart from data on outcome definitions, you need quantitative data for meta-analysis (a minimal computation sketch follows this list)
– Dichotomous (deaths, strokes, MI, etc.)
– Continuous variables (mmHg, pain score,
etc.)
– Survival curves
– Sensitivity, specificity, Receiver Operating
Characteristic (ROC)
– Correlations
– Slopes
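As one concrete illustration of the dichotomous case, here is a minimal sketch, with hypothetical counts, of turning abstracted event counts from two arms into an odds ratio and the standard error of its logarithm, the inputs a meta-analysis typically needs.

```python
# Minimal sketch with hypothetical counts: odds ratio and SE of the log odds ratio
# from abstracted dichotomous data (events and sample size in each arm).
import math

def odds_ratio_with_se(events_a, n_a, events_b, n_b):
    """Return (odds ratio, standard error of the log odds ratio)."""
    a, b = events_a, n_a - events_a   # arm A: events, non-events
    c, d = events_b, n_b - events_b   # arm B: events, non-events
    or_value = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # large-sample formula
    return or_value, se_log_or

# Hypothetical counts, not taken from any study discussed in this module
or_value, se_log_or = odds_ratio_with_se(events_a=12, n_a=100, events_b=20, n_b=100)
print(f"OR = {or_value:.2f}, SE(log OR) = {se_log_or:.2f}")
```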
Data Elements: Study Design
 Varies by type of study
 Some information to consider collecting
when recording study characteristics for
RCTs
– Number of centers (multi-center studies)
– Method of randomization (adequacy of
allocation concealment)
– Blinding
– Funding source
– Intention to treat (ITT), lack of standard
definition
Clarifying EPC Lingo
 In the EPC program, we often refer to
the following types of tables:
– Evidence tables are prettified data
extraction forms. Typically, each study is
abstracted to a set of evidence tables.
– Summary tables synthesize evidence
tables to summarize studies. They contain
context-relevant pieces of the information
included in the study-specific evidence
tables.
Developing Data Abstraction
Forms (Evidence Tables)
 No single generic form will fit all needs
 While there are common generic elements, in
general, form needs to be modified for each
topic or study design
 Organization of information in PICO format
highly desirable
 Well-structured form vs. flexible form
 Anticipate the need to capture “unanticipated”
data
 Iterative process, needs testing on multiple
studies by several individuals
Common Problems when Creating
Extraction Forms (Evidence Tables)
 Forms have to be constructed before
any serious data extraction is underway
– Original fields may turn out to be inefficient
or unusable
 In practice, reviewers have to
– Be as thorough as possible in the initial
set-up
– Reconfigure the tables as needed
– Dual review process helps fill in gaps
Example
[Figures: first draft and second draft of an extraction form]
Example
[Figure: final draft of an extraction form]
Common Problems when Creating
Extraction Forms (Evidence Tables)
 Lack of uniformity among outside
reviewers
– No matter how clear and detailed the
instructions, data will not be entered
identically from one reviewer to the next
 Solutions
– Evidence Table Guidance document—
instructions on how to input data
– Limit the number of core members
handling the evidence tables to avoid
discrepancies in presentation
Example
 From the Vanderbilt EPC Evidence
Table Guidance document
– The “Country, Setting” field provides a list of possible settings that could be encountered in the literature
 Academic medical center(s), community,
database, tertiary care hospital(s), specialty
care treatment center(s), substance abuse
center(s), level I trauma center(s), etc.
– The “Study design” field: cross-sectional,
longitudinal, case-control, RCT, etc.
Example
[Figures: the same study abstracted by Reviewer A and by Reviewer B]
Samples of Final Data Extraction
Forms (Evidence Tables)
 For evidence reports or technology
assessments with many key questions,
data extraction forms may become very
long (several pages)
 The next few slides are examples of data
extraction forms: do not study them, just fly
through them
 When you design your own extraction
forms, improvise: there are many possible
functional versions
[Technology Assessment on home monitoring of obstructive sleep apnea syndrome AHRQ,
2007, Tufts EPC]
Patient and Study Characteristics [figure]
Characteristics of Index Test and Reference Standard [figure]
Results (Concordance/Accuracy) [figure]
Results (Nonquantitative) [figure]
[Source: Technology Assessment, Home diagnosis of sleep apnea-hypopnea syndrome, www.ahrq.gov]
Considerations in Managing
Data Abstraction
 How to maximize scientific accuracy
under budgetary and logistical
constraints?
 How many people should extract data?
 Should data extraction be performed
“blinded” to the author, affiliations,
journal, results?
 How to resolve discrepancies?
Typical EPC Evidence Reports
Systematic review of a topic
5 key questions (e.g., prevalence, diagnosis, management, future research)
Analytic framework, evidence tables, summary tables, meta-analyses, decision models
12 months from start to completion of final report
Screen 5,000 to >10,000 abstracts
Examine several hundred full-text articles
Synthesize 100 to 300 articles
100 to >200 pages in length
Estimating Time to Conduct a Meta-analysis
from Number of Citations Retrieved
 Metaworks Inc. project summary (EPC I)
– 37 meta-analysis projects
– Mean total number of hours: 1,139 hours
– Median: 1,110 hours (216 to 2,518 hours)
– Pre-analysis activities (literature search, retrieval, screening, data extraction): 588 hours (standard deviation 337 hours)
– Statistical analysis: 144 hours (standard deviation 106 hours)
– Report and manuscript: 206 hours (standard deviation 125 hours)
– Administrative: 201 hours (standard deviation 193 hours)
– (These four components sum to the 1,139-hour mean total.)
Allen IE, Olkin I. JAMA 1999;282:634-35.
Fixed and Variable Costs Associated
with Systematic Reviews
JAMA 1999;282:634-35.
Tools Available for Data Abstraction
and Collection (Pros and Cons)
 Word processing software (MS Word)
 Spreadsheet (MS Excel)
 Database software (e.g., MS Access,
Epi-Info)
 Dedicated off-the-shelf commercial
software (e.g., TrialStat)
 Homegrown software
Who Should Abstract Data and How Many People?
Domain experts vs. methodologists
Single abstraction vs. double independent abstraction followed by reconciliation vs. single abstraction with independent verification
 Blinded (to authors, journal, results) data
abstraction?
Berlin J. Does blinding of readers affect the results of meta-analysis?
Lancet 1997;350:185-186.
Challenges in Data Extraction
 Problems in data reporting
 Inconsistencies in published papers
 Data reported in graphs
Examples of Data Reporting
Problems in the Literature
“Data for the 40 patients who were
given all 4 doses of medications were
considered evaluable for efficacy and
safety. The overall study population
consisted of 10 (44%) men and 24
(56%) women, with a racial
composition of 38 (88%) whites and 5
(12%) blacks.”
[Verbatim]
Examples of Data Reporting Problems
[Two further examples from published papers, shown as figures]
Inconsistencies in Published
Papers
 Let’s extract the number of deaths in two
arms, at 5 years of follow-up.
Results Text
Overall Mortality
[…] 24 deaths occurred in the PCI group, […] and 25 in the MT group […] [Verbatim]

          PCI (205)   MED (203)
Dead          24          25
Overall Mortality (Figure 2 in Manuscript)

                  PCI (205)   MT (203)
Dead (text)           24          25
Dead (Figure 2)       28          35

[The paper clearly states that there is no censoring]
Clinical Events (Table 2 in Manuscript)

                  PCI (205)   MT (203)
Dead (text)           24          25
Dead (Figure 2)       28          35
Dead (Table 2)        32          33
Digitizing Data Reported in Graphs
Data Are Often Presented in Graphical Form
We want to dichotomize measurements for a 2 x 2 table: the cutoff should be 15 (events per hour) in each axis. This information is not reported in the paper, but can be extracted from the graph: count the dots!
Ayappa I et al. Sleep. 2004 Sep 15;27(6):1171-9.
Using Digitizing Software
Engauge Digitizer, an open-source program (digitizer.sourceforge.net): each data point is marked with a red “X,” and the coordinates are given in a spreadsheet. (A counting sketch based on such coordinates follows.)
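A minimal counting sketch, assuming the digitized points have been exported to a CSV file with columns named x and y; the file name, column names, and the assignment of the reference standard to the x axis are all assumptions, not details given in this module. It tallies the 2 x 2 cells at the 15 events-per-hour cutoff and derives sensitivity and specificity.

```python
# Sketch under stated assumptions: digitized (x, y) points exported to a CSV with
# columns "x" and "y"; x = reference-standard AHI, y = index-test AHI (an assumption).
import csv

CUTOFF = 15.0  # events per hour, applied to both axes

def two_by_two_from_points(path, cutoff=CUTOFF):
    """Count digitized scatter-plot points in each quadrant at the given cutoff."""
    tp = fp = fn = tn = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            x, y = float(row["x"]), float(row["y"])
            ref_pos, test_pos = x >= cutoff, y >= cutoff
            if ref_pos and test_pos:
                tp += 1
            elif test_pos:
                fp += 1
            elif ref_pos:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

# "digitized_points.csv" is a hypothetical file name
tp, fp, fn, tn = two_by_two_from_points("digitized_points.csv")
print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")
```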
Reconstructing the Plot to Count Classification at Specific Cutoffs
Reconstructing a Bland-Altman Plot
Additional Common Issues
Missing information in published papers
Variable quality of studies
Publications with at least partially overlapping patient subgroups
Variable quality of study conduct and reporting
Potentially fraudulent data
Considerations When Contacting Authors for More Information
How important is the information likely to be?
How reliable are additional data?
How likely are you to be successful?
How much effort is required?
Where else should you look for more data?
– FDA website
– ClinicalTrials.gov - Results Database
Types of Missing Data
 Detailed PICOTS information (e.g.,
population demographics, background diet,
comorbidities, concurrent medications,
precise definitions of outcomes)
 Information to assess methodological
quality (e.g., randomization methods,
blinding)
 Necessary statistics for meta-analysis (e.g.,
standard error, sample size, confidence interval, exact p-value); a back-calculation sketch follows
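For example, when a paper reports a 95% confidence interval but not the standard error, the standard error of an approximately normal estimate can be back-calculated. A minimal sketch with hypothetical numbers:

```python
# Minimal sketch, hypothetical numbers: recover the standard error of an
# approximately normal estimate from a reported two-sided 95% confidence interval.
def se_from_95ci(lower, upper, z=1.96):
    """Standard error implied by a two-sided 95% CI of a normal estimate."""
    return (upper - lower) / (2 * z)

# e.g., a mean difference reported as 3.1 (95% CI 1.2 to 5.0)
print(f"SE ~= {se_from_95ci(1.2, 5.0):.2f}")  # about 0.97
```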
A Nonexhaustive List of Common Data Abstraction Problems
Non-uniform outcomes (e.g., different pain measurements in different studies)
Incomplete data (frequent problem: no standard error or confidence interval)
Discrepant data (different parts of the same report gave different numbers)
Confusing data (cannot figure out what the authors reported)
Nonnumeric format (reported as graphs)
Missing data (only the conclusion is reported)
Multiple (overlapping) publications of the same study, with or without discrepant data
Why Do Such Problems
Exist?
It is an eye-opening experience to
attempt to extract information from a
paper that you have read carefully and
thoroughly understood only to be
confronted with ambiguities, obscurities,
and gaps in the data that only an
attempt to quantify the results reveals.
Gurevitch J, Hedges LV. Chapter 17. Meta-analysis: Combining the
results of independent experiments. (pg 383). In: Design and analysis of
ecological experiments. Samuel M. Scheiner, Jessica Gurevitch, eds.
Chapman & Hall, New York, 1993.
Why Do Such Problems
Exist?
Because so few research reports give
effect size, standard normal deviates, or
exact p-values, the quantitative reviewer
must calculate almost all indices of study
outcomes . . . Little of this calculation is
automatic because results are presented in
a bewildering variety of forms and are often
obscure.
Green BF, Hall JA. Quantitative methods for literature reviews. Annual Review of Psychology 1984;35:37-53.
Closing Remarks
Laborious, tedious (could take an hour or more per article); nothing is automatic
To err is human
Interpretation and subjectivity are unavoidable
Data often are not reported in a uniform manner (e.g., quality, location in paper, metrics, outcomes, numerical value vs. graphs)