Evidence-Based Evaluation of Screening and Diagnostic Tests
Thomas B. Newman, MD, MPH
Andi Marmor, MD, MSEd
Outline
Overview and definitions
Observational studies of screening
Randomized trials of screening
Conclusion – ecologic view
What is screening?
Common definition:
“Testing to detect asymptomatic disease”
Better definition*:
“Application of a test to detect a potential disease
or condition in people with no known signs or
symptoms of that disease or condition”
“Condition” includes a risk factor for a disease…
*Common screening tests. David M. Eddy, editor. Philadelphia, PA: American
College of Physicians, 1991
Screening Spectrum
Risk factor → Presymptomatic disease → Unrecognized symptomatic disease → Recognized symptomatic disease
→ Fewer people recognized and treated
→ Easier to demonstrate benefit
→ Less potential for harm
Examples of Screening Along the
Spectrum
Risk factor for disease:
Hypercholesterolemia, hypertension
Presymptomatic disease:
Neonatal hypothyroidism, syphilis, HIV
Unrecognized symptomatic disease:
Vision and hearing problems in young children;
iron deficiency anemia, depression
Somewhere in between?:
Prostate cancer, breast carcinoma in situ, more
severe hypertension
Screening for risk factors
Relationship between risk factor, disease and
treatment difficult to establish
Does test predict disease?
Does treatment of risk factor reduce disease?
Does treatment reduce risk factor? (eg: CAST)
Measures of test accuracy apply to disease that
is prevalent at the time the test is done
With risk factors, trying to measure incidence of
disease over time
Potential for harm greatest when screening for
risk factors!
Goals of Screening for
Presymptomatic Disease
Detect disease in earlier stage than would
be detected by symptoms
Only possible if an early detectable phase is
present
Only beneficial if earlier treatment is more
effective than later treatment
Do this without incurring harm to the
patient
Net benefit must exceed net harm
Long follow up and randomized trial may be
needed to prove this
Screening for Cancer
Natural history heterogeneous
Screening test may pick up slower growing
or less aggressive cancers
Not all patients diagnosed with cancer will
become symptomatic
Diagnosis is subjective
There is no gold standard
“It’s just a simple blood test.”
How can screening
be bad???
Possible harms from screening
To all
To those with negative results
To those with positive results
To those not tested
Public Health Threats from
Excessive Screening
“When your only tool is a hammer, you
tend to see every problem as a nail.”
Abraham Maslow
Interventions aimed at individuals are
overemphasized
Biggest threats are public health threats
Biggest gains in longevity have been
PUBLIC HEALTH interventions
Top Ten Countries’ Per Capita Healthcare Spending, 1997 ($)
[Figure: bar chart for the United States, Switzerland, Luxembourg, Germany, Canada, France, Iceland, Denmark, Netherlands, and Norway]
Anderson GF and Poullier JP Health Affairs 18;178-88 May/June 1999
Potential Years of Life Lost*/100,000 population,
top 10 spending Countries, 1995
[Figure: bar chart by sex (male, female) for the United States, Switzerland, Luxembourg, Germany, Canada, France, Iceland, Denmark, Netherlands, and Norway]
*Before age 70. From Anderson GF and Poullier JP Health Affairs 18;178-88 May/June 1999
Economic and Political Forces
behind excessive screening
Companies selling machines to do the
test
Companies selling the test itself
Companies selling products to treat the
condition
Managed care organizations
Politicians who are (or want to appear)
sympathetic
Ad by company that makes the machines
Ad for: Frosted flakes! (no cholesterol)
Ad sponsored by the company that makes interferon.
Screening as an Obligation
Schwartz, L. M. et al. JAMA 2004;291:71-78.
Cultural characteristics
“We live in a wasteful, technology-driven, individualistic and death-denying culture.”
George Annas, New Engl J Med, 1995
E-mail Excerpt
PLEASE, PLEASE, PLEASE TELL ALL
YOUR FEMALE FRIENDS AND
RELATIVES TO INSIST ON A CA-125
BLOOD TEST EVERY YEAR AS PART OF
THEIR ANNUAL PHYSICAL EXAMS. Be
forewarned that their doctors might try to
talk them out of it, saying, "IT ISN'T
NECESSARY."
…Insist on the CA-125 BLOOD TEST; DO
NOT take "NO" for an answer!
Source: Funny Times. (1-888-Funnytimes x 476)
Evaluating Studies of Screening
Screening test
Detect disease early
Treat disease
Patient outcome
Evaluating Studies of Screening
Ideal Study:
Randomized to screen/control
Compares outcomes in ENTIRE screened group to
ENTIRE unscreened group
Observational studies
Compare outcomes in screened patients vs
unscreened (not randomized)
Among patients with disease, compare outcomes
among those dx by screening vs those dx by
symptoms
[Figure: three study designs
Randomized trial: randomized (R) to screened vs not screened; each arm contains D+ and D-; survival measured from randomization
Observational cohort: screened vs not screened (not randomized); each arm contains D+ and D-; survival measured from enrollment
Patients with disease: diagnosed by screening vs diagnosed by symptoms; survival measured after diagnosis]
Biases in Observational Studies
of Screening Tests
Volunteer bias
Lead time bias
Length bias
Stage migration bias
Pseudodisease
Volunteer Bias
People who volunteer for studies differ from
those who do not
Examples
HIP Mammography study:
○ Women who volunteered for mammography had lower
heart disease death rates
Coronary drug project:
○ RCT of medications for secondary prevention of CAD
○ Men who took their medicine (drug or placebo!) had
half the mortality of men who didn't
Can occur in any non-randomized trial of
screening
Avoiding Volunteer Bias
Randomize patients to screened and
unscreened groups
Control for factors which might be
associated with both receiving screening
AND the outcome
eg: family history, level of health concern,
other health behaviors
Lead Time Bias (zero-time bias)
Screening identifies disease during a
latent period before it becomes
symptomatic
If survival is measured from time of
diagnosis, screening will always appear to improve
survival even if treatment is ineffective
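Lead Time Bias
[Figure: timeline from biological onset, to detectable by screening, to onset of symptoms, to death. The latent phase runs from "detectable by screening" to onset of symptoms. Survival after diagnosis is shown for diagnosis at onset of symptoms and for disease detected by screening; the difference between the two is the lead time.]
Contribution of lead time to survival
measured from diagnosis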
Avoiding Lead Time Bias
Only present when survival from diagnosis
is compared between diseased persons
Screened vs not screened
Diagnosed by screening vs by symptoms
Avoiding lead time bias
Measure survival from time of randomization
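A minimal sketch of the point above, using hypothetical ages: the patient dies at the same age whether or not the disease is found by screening, yet survival measured from diagnosis is longer by exactly the lead time. The function name and all numbers are illustrative assumptions, not data from any study.

```python
def survival_after_diagnosis(age_at_death, age_at_symptoms, lead_time=0.0):
    """Years lived after diagnosis.

    lead_time is how many years before symptoms the disease is found by
    screening (0 means the disease is diagnosed at onset of symptoms).
    """
    age_at_diagnosis = age_at_symptoms - lead_time
    return age_at_death - age_at_diagnosis


# Hypothetical patient. Treatment is assumed ineffective, so age at death
# is the same with or without screening.
AGE_AT_SYMPTOMS = 65.0
AGE_AT_DEATH = 68.0

by_symptoms = survival_after_diagnosis(AGE_AT_DEATH, AGE_AT_SYMPTOMS)
by_screening = survival_after_diagnosis(AGE_AT_DEATH, AGE_AT_SYMPTOMS, lead_time=2.0)

print(by_symptoms)                 # survival from diagnosis by symptoms
print(by_screening)                # survival from diagnosis by screening
print(by_screening - by_symptoms)  # apparent gain, equal to the lead time
```

Measured from a common starting point such as randomization, the two survival times would be identical, which is why that is the recommended zero time.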
How Much Lead Time is Present?
Depends on relative lengths of latent phase
(LP) and screening interval (S)
Screening interval shorter than LP:
Maximum false increase in survival = LP
Minimum = LP – S
Screening interval longer than LP:
Max = LP
Proportion of disease dx by screening = LP/S
Figure 1: Maximum and minimum lead time bias possible when screening
interval is shorter than latent phase
Max = LP
Min = LP – S
Figure 2: Maximum lead time bias possible when screening interval is
longer than latent phase
Max = LP
Proportion of disease diagnosed by screening: P = LP/S
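The relationships between LP and S can be checked with a small numerical sketch. It rests on assumptions that are not stated in the slides: disease becomes detectable at a time spread evenly across the screening interval, and any screen that falls inside the latent phase detects the disease (perfect sensitivity). The function name `lead_time_summary` is hypothetical.

```python
def lead_time_summary(lp, s, n=10_000):
    """Return (min lead time, max lead time, proportion detected by screening).

    lp: length of the latent phase; s: screening interval (same time units).
    """
    lead_times = []
    for i in range(n):
        # Time from becoming detectable until the next scheduled screen,
        # spread evenly over one screening interval.
        wait = (i + 0.5) * s / n
        if wait < lp:
            # The screen occurs before symptoms: lead time is what is left of LP.
            lead_times.append(lp - wait)
    return min(lead_times), max(lead_times), len(lead_times) / n


short = lead_time_summary(lp=3.0, s=1.0)  # screening interval shorter than LP
long = lead_time_summary(lp=1.0, s=4.0)   # screening interval longer than LP
print(short)
print(long)
```

Under these assumptions the first case reproduces Max = LP and Min = LP – S with every case detected by screening, and the second reproduces Max = LP and P = LP/S. In the second case the minimum lead time among screen-detected cases approaches zero.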
Length Bias (Different Natural
History Bias)
If disease is heterogeneous:
Slowly progressive: more time in presymptomatic
phase
Cases picked up by screening are disproportionately
those that are slowly developing
Higher proportion of less aggressive disease
in group detected by screening creates
appearance of reduced mortality even if
treatment is ineffective
[Figure: timeline with Screen 1 and Screen 2, comparing mortality when cancer is detected by screening with mortality when cancer is detected by symptoms]
Avoiding Length Bias
Only present when survival from diagnosis is
compared between diseased persons
AND disease is heterogeneous
Lead time bias usually present as well
Avoiding length bias:
Compare mortality in the ENTIRE screened group
to the ENTIRE unscreened group
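A rough sketch of why screening over-samples slowly progressive disease. It assumes, for illustration only, that the chance a case is in its detectable presymptomatic phase on the day of a single screen is proportional to how often that type arises times how long the phase lasts. The disease types, durations, and function name are hypothetical.

```python
def screen_detected_mix(incidence, presymptomatic_years):
    """Share of each disease type among cases found at a single screen."""
    weights = {k: incidence[k] * presymptomatic_years[k] for k in incidence}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical: slow and fast disease arise equally often, but the slow
# type spends four times as long in the presymptomatic phase.
incidence = {"slow": 1.0, "fast": 1.0}
presymptomatic_years = {"slow": 4.0, "fast": 1.0}

mix = screen_detected_mix(incidence, presymptomatic_years)
print(mix)
```

With these inputs, slow disease makes up half of all cases but four fifths of screen-detected cases, so the screen-detected group has a better prognosis before any treatment is given.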
Stage Migration Bias
Also called the "Will Rogers Phenomenon"
"When the Okies left Oklahoma and moved to
California, they raised the average intelligence
level in both states."
Described by Feinstein and colleagues
(1985) as an explanation for lower stage-specific survival in a 1954 cohort of patients
with lung cancer in comparison to a 1977
cohort
New technologies resulted in patients in the 1977
group being classified as having more advanced lung
cancer
Stage Migration Bias
[Figure: patients distributed across Stage 0 through Stage 4 under the old test and under the new test]
A Non-Cancer Example
“Infants in each of 3 birthweight strata
(VLBW, LBW and NBW) who are exposed to
Factor X have decreased mortality compared
with unexposed weight-matched infants”
Is factor X beneficial?
Maybe not! Factor X could be cigarette
smoking!
Smoking moves otherwise healthy babies to
lower birthweight group, improving mortality in
each group
Other Examples Abound…
The more you look for disease, and the
more advanced the technology
the higher the prevalence, the higher the
stage, and the better the (apparent)
outcome for the stage
Beware of stage migration in any
stratified analysis
Check OVERALL survival in screened vs
unscreened group
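The Will Rogers phenomenon can be illustrated with made-up numbers. In this sketch one patient is restaged from the early group to the late group; no individual outcome changes, yet average survival rises in both stages while overall survival stays the same. All survival times are hypothetical.

```python
from statistics import mean

# Hypothetical survival times (years) under the old staging test.
old_early = [10, 9, 8, 4]
old_late = [3, 2, 1]

# A more sensitive test finds spread in the patient who survived 4 years,
# so that patient is restaged from early to late. No outcome changes.
new_early = [10, 9, 8]
new_late = [4, 3, 2, 1]

print(mean(old_early), mean(new_early))  # early-stage mean, before and after
print(mean(old_late), mean(new_late))    # late-stage mean, before and after
print(mean(old_early + old_late), mean(new_early + new_late))  # overall mean
```

This is why the overall comparison, not the stage-by-stage one, is the check that matters.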
Pseudodisease
A condition that looks just like the disease,
but never would have bothered the patient
Type I: Indolent forms of disease which would
never cause symptoms
Type II: Preclinical disease in people who will
die from another cause before disease presents
The Problem:
Treating pseudodisease can only cause harm
Analogy to Double Gold Standard
Bias
Screening (test) result negative
Clinical FU (first gold standard)
Screening (test) result positive
Biopsy (2nd gold standard)
If pseudodisease exists
Sensitivity (true positive rate) of screening
falsely increased
Screening will also prolong survival among
diseased individuals
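A small sketch of how pseudodisease can inflate apparent sensitivity under a double gold standard. It assumes that pseudodisease found by screening is confirmed on biopsy and counted as a true positive, while pseudodisease missed by screening never surfaces on clinical follow-up and so is never counted as a false negative. All counts are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of diseased people with a positive test."""
    return true_positives / (true_positives + false_negatives)


real_tp = 60    # progressive disease found by screening, confirmed on biopsy
real_fn = 40    # progressive disease missed by screening, found on clinical follow-up
pseudo_tp = 50  # pseudodisease found by screening, confirmed on biopsy

true_sens = sensitivity(real_tp, real_fn)
apparent_sens = sensitivity(real_tp + pseudo_tp, real_fn)

print(round(true_sens, 3))      # sensitivity for disease that matters
print(round(apparent_sens, 3))  # sensitivity as it would be reported
```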
Example: Mayo Lung Project
RCT of lung cancer screening
9,211 male smokers randomized to two
study arms
Intervention: CXR and sputum cytology every
4 months for 6 years (75% compliance)
Usual care: recommendation to receive
same tests annually
*Marcus et al., JNCI 2000;92:1308-16
MLP Extended Follow-up Results
Among those with lung cancer, intervention
group had more cancers diagnosed at early
stage and better survival
Marcus et al., JNCI 2000;92:1308-16
MLP Extended Follow-up Results
Intervention group: slight increase in lung-cancer
mortality (P=0.09 by 1996)
Marcus et al., JNCI 2000;92:1308-16
What happened?
After 20 years of follow up, there was a
significant increase (29%) in the total
number of lung cancers in the screened
group
Excess of tumors in early stage
No decrease in late stage tumors
Overdiagnosis (pseudodisease)
Black, cause of confusion and harm in cancer screening. JNCI
2000;92:1280-1
Looking for Pseudodisease
Impossible to distinguish from successfully
treated asymptomatic disease in individual
patient
Very few compelling stories describe patients’ or
physicians’ victories over pseudodisease…
Appreciate the varying natural history of
disease, and limits of diagnosis
Clues to pseudodisease:
Higher cumulative incidence of disease in screened
group
No difference in overall mortality between screened
and unscreened groups
Volunteer Bias: screened group → better health behaviors → prolonged survival
Lead Time Bias: early detection → earlier “zero time” → prolonged survival
Length Bias: early detection → slower growing tumor with better prognosis → higher cure rate
Stage Migration Bias: early detection → lower stage assignment → higher cure rate
Overdiagnosis: early detection → pseudodisease → higher cure rate
[Figure: study designs revisited
Observational cohort: screened (D+, D-) vs not screened (D+, D-); survival from enrollment
Patients with disease: diagnosed by screening vs diagnosed by symptoms; survival after diagnosis
Randomized trial (R): screened (D+, D-) vs not screened (D+, D-); survival from randomization]
Issues with RCTs of Cancer
Screening
Quality of randomization
Cause-specific vs total mortality
Poor Quality Randomization
Edinburgh mammography trial
Randomization by healthcare practice
7 practices changed allocation status
Highest SES
26% of women in control group
53% of women in screening group
26% reduction in cardiovascular
mortality in mammography group
Cause-Specific Mortality
Problems:
Assignment of cause of death is subjective
Screening or treatment may have important
effects on other causes of death
Bias introduced can make screening
appear better or worse!
Example
Meta-analysis of 40 RCTs of radiation
therapy for early breast cancer (N =
20,000)*
Breast cancer mortality reduced (20-yr ARR
4.8%; P = .0001)
BUT mortality from “other causes” increased
(20-yr ARR -4.3%; P = 0.003)
Were these additional deaths actually
due to screening?
*Early Breast Cancer Trialists Collaborative Group. Lancet
2000;355:1757
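A back-of-the-envelope check on the two figures quoted above. It assumes the two cause-specific effects simply add up to the effect on total mortality (same patients, same follow-up, every death placed in exactly one category); the result is illustrative arithmetic, not a figure reported by the meta-analysis.

```python
# 20-year absolute risk reductions (percentage points) quoted above.
arr_breast_cancer = 4.8
arr_other_causes = -4.3  # negative: mortality from other causes increased

# Assumption: cause-specific effects are additive.
arr_total = round(arr_breast_cancer + arr_other_causes, 1)
print(arr_total)  # net effect on total mortality, in percentage points
```

Under that assumption, most of the apparent benefit in cause-specific mortality is offset by the increase in deaths from other causes.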
Biases in Cause-Specific Mortality
“Sticky diagnosis” bias:
If cancer diagnosis made, deaths of unclear
cause more often attributed to cancer
Effect: overestimates cancer mortality in
screened group
“Slippery linkage” bias:
Linkage lost between death and cancer
diagnosis (eg: due to screening or treatment)
Death less likely counted in cause specific
mortality
Effect: underestimates cancer mortality in
screened group
The truth about total mortality
Mortality from
other causes
generally exceeds
screening or
cancer-related
mortality
Effect on condition
of interest more
difficult to detect
Total mortality
more important for
some screening
tests than
others…
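A rough sketch of why an effect on the condition of interest is hard to see in total mortality. It assumes screening changes only deaths from the screened-for cancer and leaves all other deaths untouched; the 5% share and 20% reduction are hypothetical inputs, and the function name is illustrative.

```python
def total_mortality_reduction(cancer_share_of_deaths, cancer_mortality_reduction):
    """Relative reduction in total mortality when only cancer deaths change."""
    return cancer_share_of_deaths * cancer_mortality_reduction


# Hypothetical: the cancer causes 5% of all deaths and screening lowers
# deaths from that cancer by 20%.
reduction = total_mortality_reduction(0.05, 0.20)
print(f"{reduction:.1%}")  # relative reduction in total mortality
```

The smaller the share of deaths due to the screened-for condition, the larger the trial needed to detect any effect on total mortality.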
Conclusions - 1
Promotion of screening by entities with a
vested interest and public enthusiasm
for screening are challenges to EBM
High-quality RCTs are needed
Attention to study design, size of effect
and unmeasured costs
Conclusions - 2
Dysfunctional metaphors for health care*
Military metaphor – battle disease, no cost too
high for victory, no room for uncertainty
Market metaphor – medicine as a business;
health care as a product; success measured
economically
Reframing of priorities is needed
*Annas G. Reframing the debate on health care reform by replacing our
metaphors. NEJM 1995;332:744-7
Reframing Priorities:
Ecology Metaphor
Sustainability
Limited resources
Interconnectedness
More critical of technology
Move away from domination, buying,
selling, exploiting
Focus on the big picture
Populations rather than individuals
Causes rather than symptoms