Statistics and Critical Thinking for the Interventionist
Michael J. Cowley MD, FSCAI
Nothing to disclose
The Thinker
Critical Thinking
• Skillful evaluation and synthesis of
information gathered from observation,
experience, or reasoning
8th Annual International Conference on Critical Thinking and Education Reform, 1987
Critical Thinking
Based on Universal Intellectual Values
• Clarity
• Accuracy
• Precision
• Consistency
• Relevance
• Sound evidence
Critical Thinking is the Transfer of Knowledge to Patient Care
How Do We Read the Literature in a
Wired Nation?
• The Abstract
• The Abstract & Summary
• The Abstract & Summary and some graphs
• The Results and Discussion
• The Entire Paper (including Methodology)
Statistics Vocabulary for Interventionists
Receiver operating characteristic (ROC) curves
Statistics and Critical Thinking
• Types of Statistical Tests
• Types of Clinical Studies
• Critical Analysis of Medical Literature
Common Statistical Methods
• Single variable, continuous function
• t tests (1-tailed and 2-tailed)
• Single variable, discrete (categorical)
• Chi square; Fisher exact test
• Multiple comparisons (ANOVA)
• Multiple variables: multivariate analysis
• Identifies “independent” predictors
• Time-dependent analysis
• Survival and hazard functions (Kaplan-Meier, etc.)
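As a rough illustration of the categorical tests listed above, the sketch below computes a chi-square test (1 degree of freedom, no continuity correction) and a two-sided Fisher exact test for a 2x2 table using only the Python standard library. The event counts are hypothetical, and the function names are illustrative rather than taken from any statistics package.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and p value (1 df, no continuity correction).

    Table layout: row 1 = (a, b), row 2 = (c, d).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p value for the same table layout."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 12 events in 200 patients vs 25 events in 200 patients.
chi2, p_chi2 = chi_square_2x2(12, 188, 25, 175)
p_fisher = fisher_exact_2x2(12, 188, 25, 175)
print(f"chi-square = {chi2:.2f}, p = {p_chi2:.3f}; Fisher exact p = {p_fisher:.3f}")
```

Fisher's exact test is usually preferred when any cell count is small, because the chi-square approximation becomes unreliable there.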
Two-sided Probability Value
• One-tailed (sided) test:
• Used when the difference can only go in one direction
• Example: Contrast effect on renal function
• Two-tailed (sided) test:
• Used when the difference can go in either direction
• Example: Drug effect on serum K+
• For a symmetric test statistic, the two-tailed p value is twice the one-tailed value
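A minimal sketch of the one-tailed and two-tailed relationship, assuming a test statistic that follows a standard normal distribution (a z test). The z value used is illustrative only.

```python
from math import erfc, sqrt


def one_tailed_p(z):
    """P(Z >= z) for a standard normal test statistic."""
    return 0.5 * erfc(z / sqrt(2))


def two_tailed_p(z):
    """P(|Z| >= |z|): a difference in either direction counts."""
    return erfc(abs(z) / sqrt(2))


z = 1.96  # illustrative test statistic
print(f"one-tailed p = {one_tailed_p(z):.3f}, two-tailed p = {two_tailed_p(z):.3f}")
```

The same observed difference therefore looks "more significant" under a one-tailed test, which is why the choice of test should be justified before the data are seen.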
Evolution of Evidence
Primary Evidence
Randomized controlled trial
Observational studies
Secondary Evidence
Synthesized quantitative data (meta-analyses)
Systematic reviews
Uncontrolled trials
Summary reviews
Descriptive studies
Case reports
Opinions of respected authorities
Basic Clinical Trial Designs
• Cohort vs Case-Control Studies
• Prospective vs Retrospective
• Randomized vs Observational Studies
• Superiority vs Non-Inferiority
Basic Clinical Trial Designs
Cohort vs Case-Control
• Cohort Studies:
• The vast majority of studies are cohort studies – a
group of patients followed until they have events
• Case Control Studies
• Events of interest (e.g. stent thrombosis) are identified
and then matched to patients without events to determine
predictors of events
- Can be useful for analyzing low-frequency events
Basic Clinical Trial Designs
Prospective vs Retrospective
• Prospective Studies:
• Patients are followed forward in time; data is
gathered at baseline and as events happen
• Retrospective Studies:
• All events have happened already, and patients’
information is collected from the past (through
chart review, phone calls, etc).
• Hybrid: for example, database queries from
prospectively conducted studies
Basic Clinical Trial Designs
Randomized vs Observational
• Randomized Studies:
• Patients are allocated to a treatment (e.g. DES
vs BMS) randomly, thus decreasing bias
• Observational Studies:
• Data is analyzed based upon what treatment
was received (e.g. DES vs BMS)
• The reasons for treatment received may be
subject to “confounding” or bias
Clinical Trial Types
• Superiority:
• goal is to show that the new treatment is better
than placebo or standard therapy
• Non-Inferiority:
• goal is to show the new therapy is not worse than
standard therapy by some tolerable margin
• (e.g., 30-day mortality difference <1%)
• Equivalence:
• goal is to show that the outcomes of 2 therapies are
within some acceptable range of one another
• (e.g., 30-day mortality within ± 1%)
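The three trial goals above can be sketched as checks on the confidence interval for the treatment difference. This is a simplified illustration under stated assumptions: the difference is defined as new therapy minus standard therapy in event rate (percentage points), so negative values favor the new therapy, and the margin is a positive number chosen in advance. The function names and numbers are hypothetical.

```python
def is_superior(ci_low, ci_high):
    """Superiority: the whole interval lies below zero (fewer events)."""
    return ci_high < 0


def is_non_inferior(ci_low, ci_high, margin):
    """Non-inferiority: the upper limit stays below the tolerable margin."""
    return ci_high < margin


def is_equivalent(ci_low, ci_high, margin):
    """Equivalence: the whole interval lies within plus or minus the margin."""
    return -margin < ci_low and ci_high < margin


# Hypothetical 95% CI for the 30-day mortality difference, in percentage points.
ci_low, ci_high, margin = -0.5, 0.8, 1.0
print("superior:", is_superior(ci_low, ci_high))
print("non-inferior:", is_non_inferior(ci_low, ci_high, margin))
print("equivalent:", is_equivalent(ci_low, ci_high, margin))
```

A result can be non-inferior without being superior, and an interval that lies entirely below zero is both.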
Statistical Concepts: Superiority Trial
[Figure: confidence intervals for the difference in primary endpoint (x-axis, -4 to 4), labeled Superior, Uncertain, Inferior, and Uncertain]
Δ is significant if the confidence interval does not cross the line of identity (difference = 0)
Non-Inferiority Trial
Interpretation of Results
[Figure: confidence intervals for the difference in primary endpoint vs active control (x-axis, -4 to 4), plotted against the non-inferiority margin. Each interval is labeled Non-Inferior or Not Non-Inferior; one interval is labeled both non-inferior and superior to standard therapy. 2-sided test, alpha = 5%]
There are 3 kinds of lies: lies, damned
lies and statistics
Mark Twain: “Chapters from My Autobiography” 1906
Statistics and Critical Analysis
• What is the question being asked?
• How was the study done?
• Who was included and excluded?
• Are the statistical methods described?
• Are the statistical methods appropriate?
• The authors are selling something:
Buyer beware!
Be skeptical; read carefully!
Bias in Clinical Trials
• Design bias
• Selection bias
• Endpoint bias
• Inclusion bias
• Ascertainment bias
• Reporting bias
Randomization addresses only one of these issues
Clinical Trial Design Considerations
Endpoints
• Primary Endpoints:
• Clinical Outcomes
• e.g., death, MI, stroke, TVR
• Composite endpoints often used
• Surrogate (non-clinical) Outcomes
• Late lumen loss, binary restenosis
Clinical Trial Analysis Issues
Generalizability
Who does the study apply to?
• Inclusion / exclusion criteria
• Proportion of eligible patients enrolled in the RCT
• Low inclusion rate greatly reduces
generalizability of results
Are Randomized Trials Generalizable?
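[Figure: bar chart of the % of screened patients randomized (axis 0% to 100%) in BARI (7%), EAST (8%), CABRI, GABI, ERACI, and RITA; the remaining bar labels read 5%, 4%, 17%, and 4%. Overall, 91,730 patients were screened and 5.2% were randomized.]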
Clinical Trial Analysis Issues
• Magnitude of p value
• Subgroups
• Power
Critical Analysis of Medical Literature
The Magic of p<0.05
• p=0.05: if there is no true difference, a result at least
this extreme would occur by chance 5% of the time
• p<0.05 designated to define significance
• p=0.06: the same result would occur by chance 6% of the time
• Is p=0.06 meaningfully different than p=0.05?
• p<0.00001 confers much greater certainty
• Statistical significance ≠ clinical relevance
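One way to see why the size of a p value needs context: the same absolute difference in event rates can give very different p values depending only on sample size. The sketch below uses a two-proportion z test with hypothetical event counts; the function name is illustrative.

```python
from math import erfc, sqrt


def two_proportion_p(events1, n1, events2, n2):
    """Two-sided p value for a difference between two event rates (z test)."""
    rate1, rate2 = events1 / n1, events2 / n2
    pooled = (events1 + events2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (rate1 - rate2) / se
    return erfc(abs(z) / sqrt(2))


# Hypothetical event rates of 10.0% vs 9.5% at two different trial sizes.
p_small_trial = two_proportion_p(100, 1_000, 95, 1_000)
p_large_trial = two_proportion_p(10_000, 100_000, 9_500, 100_000)
print(f"1,000 per arm:   p = {p_small_trial:.3f}")
print(f"100,000 per arm: p = {p_large_trial:.5f}")
```

The 0.5 percentage point difference is identical in both cases; whether it matters clinically is a separate question from whether it reaches p<0.05.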
Clinical Trial Design Issues
Methodology
• Inclusion
• Exclusion
• Sample size
• Concomitant therapy
• Agents, dose, timing
• Endpoint selection
Clinical Trial Design Issues
• What are the questions?
• Who do they apply to?
• Inclusion, exclusion
• How are they assessed?
• Endpoints: soft, hard, composite
• Sample size
• Estimate based on expected Rx difference
Clinical Trial Design Issues
Role of Estimates and Assumptions
• Event rate of treatment group
• Event rate of control group
• Δ (difference) considered important
• Power and sample size
Considerable guesswork involved!
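To show how strongly the sample size depends on these guesses, here is a minimal sketch using the common normal-approximation formula for comparing two proportions. The formula choice, the default alpha and power, and the event rates are assumptions for illustration, not values from any specific trial.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p_control vs p_treatment.

    Two-sided test, equal allocation, normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Hypothetical planning guess: 10% control event rate, 7% with treatment.
n_planned = sample_size_per_arm(0.10, 0.07)
# Same 30% relative reduction, but the control rate turns out to be 8%.
n_if_lower_rate = sample_size_per_arm(0.08, 0.056)
print(f"planned: {n_planned} per arm; if control rate is 8%: {n_if_lower_rate} per arm")
```

A modest overestimate of the control event rate leaves the trial underpowered, which is one reason the estimates deserve scrutiny when reading the methods section.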
Composite Endpoints
• Are they of equal weight?
• Death, MI, Stroke
• TVR, rehospitalization, recurrent angina
• Do they occur with equal frequency?
• Do they have the same clinical importance?
• Do they reflect similar treatment effect?
Clinical Trial Analysis Issues
Are the study results still relevant?
• BARI:
• PTCA vs CABG in stent era
• COURAGE:
• BMS vs Med Rx (OMT) in DES era
Subgroup Analysis
• Since patient groups are not homogeneous, subgroup
analysis for treatment response may be legitimate
• Pre-specified subgroups are most appropriate
• Most trials are not large enough and lack power to
detect subgroup differences
• Widespread subgroup testing is inappropriate and
may yield false positive results (on average 1 in 20 tests at a p=0.05 threshold)
• Do not rely on subgroup p values
• Use interaction tests instead
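A small sketch of both points above. The first function gives the chance of at least one false positive across several subgroup tests, assuming the tests are independent (real subgroups overlap, so this is only an approximation). The second is a simple z test for interaction between two subgroup hazard ratios; the hazard ratios and standard errors passed to it are hypothetical.

```python
from math import erfc, log, sqrt


def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of at least one p < alpha among n independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def interaction_p(hr_a, se_log_hr_a, hr_b, se_log_hr_b):
    """Two-sided p value for a difference between two subgroup hazard ratios."""
    z = (log(hr_a) - log(hr_b)) / sqrt(se_log_hr_a ** 2 + se_log_hr_b ** 2)
    return erfc(abs(z) / sqrt(2))


for n_tests in (1, 5, 10, 20):
    print(f"{n_tests:2d} subgroup tests: {chance_of_false_positive(n_tests):.0%}")

# Hypothetical subgroups: HR 0.70 (SE of log HR 0.10) vs HR 0.86 (SE 0.08).
print(f"interaction p = {interaction_p(0.70, 0.10, 0.86, 0.08):.2f}")
```

The interaction test asks whether the treatment effect differs between subgroups, which is the relevant question; a subgroup p value only asks whether the effect within one subgroup differs from zero.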
TRITON Primary EP (CV Death, MI, Stroke)
Major Subgroups
Subgroup      Reduction in risk (%)
UA/NSTEMI     18
STEMI         21
Male          21
Female        12
Age <65       25
Age 65-74     14
Age >75       6
No DM         14
DM            30
BMS           20
DES           18
GPI           21
No GPI        16
CrCl < 60     14
CrCl > 60     20
Overall       19
[Figure: forest plot of hazard ratios (HR; axis 0.5, 1, 2); values below 1 favor prasugrel, values above 1 favor clopidogrel. P for interaction = NS]
Meta-analysis
• Very Popular
• Pooling of similar (or somewhat similar) studies
• Enhances power for uncommon endpoints
• Useful for hypothesis generation
• Methodology differences can confound interpretation
• Analyses based on combining disparate studies
may provide misleading results
Meta-analysis
Clever Technique or Pseudoscience?
• Any one study is too small and not generalizable
• Informal literature reviews are too subjective
• By combining information from multiple trials, it can:
• Test an overall (summary) hypothesis
• Better estimate the average treatment effect
• Evaluate consistency of the trials
• Provide comparative data display
Enoxaparin vs UFH
Death or MI at 30 Days*
Trial       Enox (%)   UFH (%)   OR [95% CI]
ESSENCE     5.8        7.5       0.76 [0.58, 1.01]
TIMI 11B    6.4        7.8       0.81 [0.60, 1.10]
INTERACT    4.6        8.1       0.55 [0.28, 1.08]
A to Z      7.3        6.9       1.06 [0.68, 1.67]
SYNERGY     12.6       14.8      0.84 [0.68, 1.05]
Overall     8.0        9.4       0.81 [0.70, 0.94]
[Figure: forest plot of odds ratios with 95% CI (axis 0.2, 1.0, 2.0); values below 1.0 favor enoxaparin, values above 1.0 favor UFH]
Test for heterogeneity: χ2 = 2.86, df = 4, P = .58
* No prerandomization therapy
Petersen JL: JAMA 2004;292:89-96
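The pooling step behind a summary odds ratio can be sketched from the per-trial odds ratios and confidence intervals in the enoxaparin vs UFH data above. The slide does not state which pooling method the published analysis used, so the inverse-variance fixed-effect method here is an assumption; it should land close to, but not necessarily exactly on, the reported overall estimate and heterogeneity statistic.

```python
from math import exp, log, sqrt

# (trial, odds ratio, lower 95% limit, upper 95% limit), copied from the slide.
trials = [
    ("ESSENCE", 0.76, 0.58, 1.01),
    ("TIMI 11B", 0.81, 0.60, 1.10),
    ("INTERACT", 0.55, 0.28, 1.08),
    ("A to Z", 1.06, 0.68, 1.67),
    ("SYNERGY", 0.84, 0.68, 1.05),
]

Z95 = 1.96  # normal quantile for a 95% confidence interval


def pool_fixed_effect(rows):
    """Inverse-variance pooled OR, its 95% CI, and Cochran's Q."""
    log_ors, weights = [], []
    for _, odds_ratio, lower, upper in rows:
        # Recover the standard error of log(OR) from the CI width.
        se = (log(upper) - log(lower)) / (2 * Z95)
        log_ors.append(log(odds_ratio))
        weights.append(1 / se ** 2)
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_weight
    se_pooled = sqrt(1 / total_weight)
    # Q measures how far the trials scatter around the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    return (
        exp(pooled),
        exp(pooled - Z95 * se_pooled),
        exp(pooled + Z95 * se_pooled),
        q,
    )


pooled_or, ci_low, ci_high, q_stat = pool_fixed_effect(trials)
print(f"pooled OR = {pooled_or:.2f} [{ci_low:.2f}, {ci_high:.2f}], Q = {q_stat:.2f}")
```

Note that no single trial's interval excludes 1.0, yet the pooled interval does; this gain in power is the main attraction of meta-analysis, and it holds only if the trials are similar enough to combine.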
Clinical Judgment
• Process of making inferences or decisions
based on incomplete information
• An “educated guess”
• Excellent judgment is ability to make correct
decisions based on incomplete information
“Medicine is a science of uncertainty
and an art of probability”
“It is much simpler to buy books than to read them
and easier to read them than to absorb their content”
William Osler (1849 – 1919)
Statistics and Critical Thinking
• Evidence based medicine is ideal
• But the evidence base:
• is not always strong
• is often conflicting
• may not apply to your patient
• may no longer be relevant
Clinical Judgment is required
Statistics and Critical Thinking
Conclusion
• Statistics are powerful tools, but like any tool,
they can be misused
• Incomplete understanding and inappropriate
use of statistics can lead to faulty conclusions
• Always put the data in a clinical perspective
• The combination of great clinical skills with a
knowledge of statistical methodology (and
limitations) is the best way to achieve high
quality evidence-based practice