The Perils of Subgroups - American Statistical Association

The Perils of Subgroups –
Concerns, Examples, Alternatives
FDA/Industry Statistics Workshop 2006
Andreas Sashegyi, PhD
Eli Lilly and Company
Introduction
• A well-powered study to test a specific scientific
hypothesis defines clear limits on the information to be
gathered
• Subgroup analyses stretch these limits (to a greater or
lesser extent) and conclusions from such analyses are
prone to increased Type I and II error rates
Things are not always as they seem…
European Carotid Surgery Trial
• Carotid endarterectomy vs medical intervention in
patients with recently symptomatic carotid stenosis
• Consider subgroup analysis in patients with ≥ 70%
stenosis according to month of birth…
ECST Subgroup Analysis
Figure 3, Rothwell PM, Lancet 2005; 365:176-186
• Treatment-Birthmonth interaction p<.001!
Some Common Pitfalls…
• Lack of power to show subgroup effects
• Failure to adjust for multiplicity
• Potential imbalances in subgroup baseline
characteristics
• Inability to confirm effects
• Information distribution
– E.g. constant therapeutic effect on a relative scale implies
decreasing absolute risk reduction with decreasing disease
severity – limits information in subgroups of lower disease
severity
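The "information distribution" point above can be illustrated with a small calculation. This is a minimal sketch: the relative risk of 0.80 and the three baseline risks are assumed values chosen for illustration, not figures from any trial discussed here.

```python
# Illustrative only: RELATIVE_RISK and the baseline risks are assumed values.

def absolute_risk_reduction(baseline_risk: float, relative_risk: float) -> float:
    """ARR when treatment multiplies the control-arm risk by relative_risk."""
    return baseline_risk * (1.0 - relative_risk)


def number_needed_to_treat(baseline_risk: float, relative_risk: float) -> float:
    """Patients to treat to prevent one event (reciprocal of the ARR)."""
    return 1.0 / absolute_risk_reduction(baseline_risk, relative_risk)


RELATIVE_RISK = 0.80  # hypothetical constant relative effect

for baseline in (0.50, 0.30, 0.10):  # hypothetical control-arm risks, high to low severity
    arr = absolute_risk_reduction(baseline, RELATIVE_RISK)
    nnt = number_needed_to_treat(baseline, RELATIVE_RISK)
    print(f"baseline risk {baseline:.0%}: ARR = {arr:.1%}, NNT = {nnt:.0f}")
```

With the same relative effect, the absolute benefit shrinks in proportion to the baseline risk, so a low-severity subgroup of the same size carries less information about the treatment effect.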
The Conflict
Danger of subgroup
analysis to target
treatment
Applying overall results
of large trials to individual
patients without considering
determinants of
individual response
Example: Xigris
• Activated Protein C for the treatment of adults with
severe sepsis
• Therapeutic area with a history of failed trials
• PROWESS – global pivotal registration trial:
– Randomized, double-blind, placebo-controlled
– 24 µg/kg/hr Xigris vs placebo for 96 hours
– Planned sample size – 2280 patients
– Primary endpoint – 28-day all-cause mortality
– Trial stopped at 2nd interim analysis
             Xigris (n=850)   Placebo (n=840)   p-value
Mortality    24.7%            30.8%             0.005
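The PROWESS mortality figures above can be turned into a relative risk with an approximate confidence interval. This is a sketch under stated assumptions: the death counts are not given in the slides and are back-calculated here from the reported percentages and arm sizes, and the interval uses the usual large-sample log method rather than whatever analysis the trial itself specified.

```python
import math


def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a large-sample (log-scale) confidence interval."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# ASSUMPTION: counts reconstructed from the reported percentages, not quoted from the source.
deaths_xigris = round(0.247 * 850)
deaths_placebo = round(0.308 * 840)

rr, lower, upper = risk_ratio_ci(deaths_xigris, 850, deaths_placebo, 840)
arr = deaths_placebo / 840 - deaths_xigris / 850

print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}), ARR = {arr:.1%}")
```

The overall estimate computed this way is the reference point against which the subgroup estimates that follow are judged.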
Subgroup Analysis of APACHE II Score
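[Forest plot: Relative Risk of Death (axis marked at 0.5, 1.0, 2.0), with mortality by treatment arm]
Group                     N      Trt     Plc
Primary                   1690   24.7%   30.8%
APACHE II 1st quartile    433    15.1%   12.1%
APACHE II 2nd quartile    440    22.5%   25.7%
APACHE II 3rd quartile    366    23.5%   35.8%
APACHE II 4th quartile    451    38.1%   49.0%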
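The point estimates behind the forest plot can be recovered from the mortality percentages listed above for each APACHE II quartile (Xigris first, placebo second). This is only a sketch: per-arm sample sizes within each quartile are not given in the slides, so no confidence intervals are attempted, and the pairing of percentages to quartiles follows the order in which they are listed.

```python
# Mortality percentages (Xigris, placebo) per APACHE II quartile, in listed order.
QUARTILE_MORTALITY = {
    "1st quartile": (15.1, 12.1),
    "2nd quartile": (22.5, 25.7),
    "3rd quartile": (23.5, 35.8),
    "4th quartile": (38.1, 49.0),
}


def relative_risk(treated_pct: float, placebo_pct: float) -> float:
    """Ratio of treated to placebo mortality."""
    return treated_pct / placebo_pct


def absolute_risk_reduction(treated_pct: float, placebo_pct: float) -> float:
    """Placebo minus treated mortality, in percentage points."""
    return placebo_pct - treated_pct


summary = {
    label: (relative_risk(trt, plc), absolute_risk_reduction(trt, plc))
    for label, (trt, plc) in QUARTILE_MORTALITY.items()
}

for label, (rr, arr) in summary.items():
    print(f"{label}: RR = {rr:.2f}, ARR = {arr:+.1f} percentage points")
```

Only the first quartile has a point estimate above 1; whether that reflects a real difference in drug effect or chance variation is the question the next slides take up.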
Observations…
• Lower mortality observed for 68 of 70 subgroups
• Observed effect in lowest APACHE quartile consistent in
that the 95% CI included the overall point estimate
• Analysis of other disease severity measures showed
consistent survival benefit in less severe patients
– Even within the first APACHE quartile…
Was there evidence to support a differential drug
effect by disease severity?
Some Reasons for Restricted Label
Indication restricted to patients with greater disease
severity as assessed, e.g. by APACHE
• Pre-specification of APACHE II score as an important analysis
• If relative risk reduction is constant, absolute risk reduction (i.e.
benefit) is greatest in highest-risk patients
→ Xigris is associated with increased risk of serious bleeding
• APACHE II score was the best discriminator of mortality risk
• APACHE II score can be used at the bedside
• Acknowledgment – hypothesis that benefit is limited to high risk
patients is not proven…
With no proof that Xigris works in all subgroups, nor that it is ineffective
in some, indication focused on practicality and risk/benefit
considerations
Example: Tarceva
• HER1/EGFR tyrosine kinase inhibitor, indicated for
– patients with locally advanced or metastatic NSCLC, after failure
of ≥1 prior chemotherapy regimen
– (in combination with gemcitabine) first-line treatment of patients
with locally advanced, unresectable or metastatic pancreatic
cancer
• NSCLC – global registration trial:
– Randomized, double-blind, placebo-controlled
– Tarceva 150mg/day vs placebo, 2:1 randomization (488 vs 243)
– Primary endpoint – survival
Hazard ratio
N     HR     95% CI
731   0.76   (0.6, 0.9)
Subgroup Analysis Reported in PI
• Potential differential effects according to epidermal
growth factor receptor (EGFR) type and smoking status?
Subgroup             N     HR     95% CI
All patients         731   0.76   (0.6, 0.9)
Never smoked         146   0.42   (0.3, 0.6)
Current/ex-smoker    545   0.87   (0.7, 1.0)
EGFR +ve             185   0.68   (0.5, 0.9)
EGFR –ve             141   0.93   (0.6, 1.4)
EGFR unmeasured      405   0.77   (0.6, 1.0)
Comments
• [An apparently larger effect observed in two subsets]
• [Survival prolonged in EGFR +ve subgroup and
unmeasured subgroup but did not appear to have an
effect in the EGFR –ve subgroup; however CIs are wide
and overlap so that a survival benefit in the EGFR –ve
subgroup cannot be ruled out]
– Would argue for consistent effect
– Interpretation of “EGFR unmeasured” subgroup problematic
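A rough check of whether two subgroup hazard ratios actually differ is to compare them on the log scale rather than reading each confidence interval separately. The sketch below backs the standard errors out of the reported 95% CIs; because those limits are rounded to one decimal place, the results are approximate, and this is not the analysis reported in the PI. It also assumes the two subgroups are disjoint, so their estimates are independent.

```python
import math


def log_hr_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate SE of log(HR), backed out of a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_z(hr_a, ci_a, hr_b, ci_b):
    """z statistic and two-sided p-value for log(HR_a) - log(HR_b)."""
    diff = math.log(hr_a) - math.log(hr_b)
    se = math.sqrt(log_hr_se(*ci_a) ** 2 + log_hr_se(*ci_b) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hazard ratios and (rounded) CIs as reported for the Tarceva NSCLC trial.
z_egfr, p_egfr = interaction_z(0.68, (0.5, 0.9), 0.93, (0.6, 1.4))
z_smoke, p_smoke = interaction_z(0.42, (0.3, 0.6), 0.87, (0.7, 1.0))

print(f"EGFR +ve vs -ve:         z = {z_egfr:.2f}, p = {p_egfr:.3f}")
print(f"Never vs current/ex-smoker: z = {z_smoke:.2f}, p = {p_smoke:.4f}")
```

On these approximate inputs the EGFR comparison does not come close to conventional significance, which is in line with the argument for a consistent effect across EGFR status.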
Comments
• [For the subgroup who never smoked, EGFR status also
appeared to be predictive of Tarceva survival benefit.
EGFR +ve patients who never smoked had a large
survival benefit; there were too few EGFR –ve patients
who never smoked to reach a conclusion]
– Implicitly assumes that EGFR subgroup finding is real
– Multi-layered subgroup analysis compounds problems…
– Conclusion of a large survival benefit in EGFR +ve patients who
never smoked would imply a more moderate effect in EGFR –ve
non-smokers (or rather, too few EGFR –ve non-smokers imply
non-definitive findings in both subgroups of non-smokers…)
Incorrect conclusions from subgroup
analyses…
Table 1, Rothwell PM, Lancet 2005; 365:176-186
So…
What are we to believe?
And how should we proceed?
Some Safeguards
• Pre-specify limited numbers of analyses, supported by
scientific hypotheses
– Consider expected effects and implied power conditions
• Stratify randomization by subgroup factors
• Focus primarily on treatment-subgroup interaction rather
than effects within subgroups
• Consider multiplicity adjustments
• Minimize post-hoc analysis and interpretation
• Consider alternatives directly addressing risk/benefit
question…
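The case for pre-specifying a limited number of analyses and adjusting for multiplicity can be made concrete with a short calculation. This is a minimal sketch assuming independent tests, each run at a nominal 5% level; real subgroup analyses are usually correlated, so the numbers are indicative only. The counts 12 and 70 echo the month-of-birth example and the 70 PROWESS subgroups mentioned earlier.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


for k in (1, 5, 12, 70):
    fwer = familywise_error_rate(k)
    threshold = bonferroni_threshold(k)
    print(f"{k:>2} tests: familywise error = {fwer:.1%}, Bonferroni threshold = {threshold:.5f}")
```

Bonferroni is only one of several possible adjustments and is conservative when tests are correlated; it is used here because it is the simplest to state.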
Beyond Standard Subgroup Analysis…
Two alternatives may offer additional insight:
• Meta analysis (of trials or trial subgroups)
• Patient-level analyses (e.g.: GLMs)
→ Exploring the first may be helpful if used appropriately
→ Exploring the latter is essential
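One common form a patient-level GLM can take is a logistic regression with a treatment-by-severity interaction. The notation below is an assumption introduced for illustration (it is not the model used in any trial discussed here): Y_i indicates death by day 28, T_i is the treatment indicator, and x_i is a continuous severity measure such as the APACHE II score.

```latex
\[
\operatorname{logit}\,\Pr(Y_i = 1) = \beta_0 + \beta_1 T_i + \beta_2 x_i + \beta_3 T_i x_i
\]
\[
\text{OR}(x) = \exp(\beta_1 + \beta_3 x)
\]
```

Here OR(x) is the treatment odds ratio for a patient with severity x. A value of β_3 = 0 corresponds to a constant treatment effect on the odds scale, so the question of differential effect becomes a single test on one coefficient rather than a series of within-subgroup comparisons.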
Example: Xigris (revisited)
• Consider post-marketing commitment ADDRESS
– Xigris vs placebo in severe sepsis patients with lower disease
severity
– Global, randomized, double blind, placebo-controlled trial
– Planned sample size ~ 11000 patients
– Primary endpoint – 28-day all-cause mortality
• Due to label differences in EU vs US, ADDRESS also
enrolled some “high risk” patients
ADDRESS
• Study stopped at an interim analysis, due to futility
– Heterogeneous patient population
– Several patient-level factors complicated interpretation
• Diagnosis question for surgical patients with a single organ
failure
• Learning curve – site enrolment sequence effect
• Does ADDRESS still leave doubt about the effect in
lower-risk patients?
– Academically: Perhaps
– Practically: No
What About High-Risk Patients?
• Can we learn more about the effect in high-risk patients?
Consider recent meta analysis by Friedrich et al:
(Critical Care 2006, 10:145)
– If subgroup effect in PROWESS is real, should not
expect same results in PROWESS and ADDRESS
– Examine same subgroups in both trials for which effect is
expected to be the same and proceed with meta
analysis…
Seeking Confirmation…
• RR (95% CI) in patients with APACHE II ≥ 25:
– 0.71 (0.59, 0.85) PROWESS – 817 patients
– 1.19 (0.83, 1.71) ADDRESS – 321 patients
• Considerable evidence of heterogeneity…
• Meta analysis shows no mortality benefit overall…
But is this an appropriate use of meta analysis?
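The heterogeneity between the two high-risk subgroups can be quantified from the relative risks and confidence intervals quoted above. This is a sketch of a standard fixed-effect, inverse-variance calculation with Cochran's Q; it is not claimed to reproduce the method of Friedrich et al, and the standard errors are approximations backed out of the reported CIs.

```python
import math


def log_rr_and_se(rr, lower, upper, z=1.96):
    """Log relative risk and its approximate SE from a 95% confidence interval."""
    return math.log(rr), (math.log(upper) - math.log(lower)) / (2 * z)


def pooled_and_q(estimates):
    """Inverse-variance pooled log RR and Cochran's Q for (log_rr, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    return pooled, q


# RR (95% CI) in patients with APACHE II >= 25, as quoted in the slides.
STUDIES = {"PROWESS": (0.71, 0.59, 0.85), "ADDRESS": (1.19, 0.83, 1.71)}

estimates = [log_rr_and_se(*values) for values in STUDIES.values()]
pooled_log_rr, q = pooled_and_q(estimates)
df = len(estimates) - 1
i_squared = max(0.0, (q - df) / q)
# Chi-square tail probability; this closed form is valid only for 1 degree of freedom.
p_heterogeneity = math.erfc(math.sqrt(q / 2))

print(f"Pooled RR (fixed effect) = {math.exp(pooled_log_rr):.2f}")
print(f"Q = {q:.2f} on {df} df, p = {p_heterogeneity:.3f}, I^2 = {i_squared:.0%}")
```

With heterogeneity this pronounced, a single pooled estimate is hard to interpret, which is the concern raised on the next slide.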
Caution Advised!
• Subgroups are not necessarily comparable across trials
– Disease severity characteristics suggested “high-risk” patients
receiving Xigris in ADDRESS were significantly sicker than
placebo patients
– Mortality in the ADDRESS subgroup overall much lower than in
PROWESS
→ This analysis was not helpful in providing more insight
→ The attempt was well-motivated, but in general
subgroup analyses of any kind provide an insufficient
basis for facilitating decisions for individual patients
Subgroups vs Benefit/Risk Analysis
• The “single-objective/hypothesis” paradigm in clinical
trials is important for establishing drug effect in a
population
• Targeting therapy is the next step – best accomplished
by comprehensive risk-benefit analysis
• Patient treatment decisions revolve around individuals –
analyses to inform these decisions should accommodate
characteristics of individuals
Subgroups vs Benefit/Risk Analysis
• Elements of a unified framework for benefit/risk analysis:
– Patient outcomes (efficacy)
– Patient outcomes (adverse events)
– Patient characteristics
– Trade-off considerations – patient-specific
→ Should lead to systematic decision analysis
accounting for uncertainty
• Regression or other model-based approaches are well-suited for this effort
Finally, in the effort to match the right patient with the right
drug, no single analysis can suffice…