Transcript Week 10
Spring 2008
Bias, Confounding,
and Effect Modification
STAT 6395
Filardo and Ng
Confounding
Suppose we have observed an association
between an exposure and disease in a cohort
study or case-control study that:
• We are confident was not a biased result due
to a flaw in the design or execution of the
study
• We are confident was not a random
association due to chance variation (95%
confidence interval for the estimate does not
include 1.0)
How do we now distinguish between a
noncausal association due to confounding
and a causal association?
Hypothetical example of confounding: comparison of
prostate cancer mortality rates in 2 geographic areas
• The exposure of interest is geographic area
• Annual mortality rate from prostate cancer:
Region A : 50 per 100,000
Region B: 20 per 100,000
• Relative risk = 50/20 = 2.5
Do these data show that living in Region A is
a risk factor for prostate cancer?
Prostate cancer mortality rate in 2 geographic areas
                  Mid-year     Prostate cancer   Prostate cancer
Region   Age      population   deaths            mortality*
A        <65        200,000     10                5
A        >65        800,000    490               61.25
A        All      1,000,000    500               50
B        <65        800,000     40                5
B        >65        200,000    160               80
B        All      1,000,000    200               20
*per 100,000 per year
Unadjusted (crude) RR = 50/20 = 2.5
Age-adjusted RR = 66.25/85 = 0.78
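The crude and age-adjusted RRs above can be reproduced by direct standardization. The following is a minimal sketch, assuming the standard population is the two regions combined (1,000,000 people in each age group); the slides do not state which standard was used, and the function and variable names are hypothetical.

```python
# Age-specific data from the table: (mid-year population, prostate cancer deaths)
regions = {
    "A": {"<65": (200_000, 10), ">65": (800_000, 490)},
    "B": {"<65": (800_000, 40), ">65": (200_000, 160)},
}

def crude_rate(strata):
    """Deaths per 100,000 per year, ignoring age."""
    pop = sum(p for p, _ in strata.values())
    deaths = sum(d for _, d in strata.values())
    return deaths / pop * 100_000

def adjusted_rate(strata, standard):
    """Directly standardized rate per 100,000: a weighted average of the
    age-specific rates, weighted by the standard population's age mix."""
    total = sum(standard.values())
    return sum(
        (deaths / pop * 100_000) * standard[age] / total
        for age, (pop, deaths) in strata.items()
    )

# ASSUMPTION: standard population = Region A + Region B combined
standard = {
    age: regions["A"][age][0] + regions["B"][age][0]
    for age in regions["A"]
}

crude_rr = crude_rate(regions["A"]) / crude_rate(regions["B"])
adjusted_rr = (
    adjusted_rate(regions["A"], standard) / adjusted_rate(regions["B"], standard)
)

print(f"Crude RR        = {crude_rr:.2f}")     # 2.50
print(f"Age-adjusted RR = {adjusted_rr:.2f}")  # 0.78
```

With equal weights on the two age groups, the adjusted rates are 33.125 and 42.5 per 100,000, whose ratio equals 66.25/85.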
Age as a confounder
The large discrepancy between the age-adjusted RR
(0.78) and the unadjusted RR (2.5) means that age
confounded the observed association between
geographic area and prostate cancer mortality
Age as a confounder
Age was a confounder because:
Age is a risk factor for prostate cancer
Age was associated with geographic region
Age is not an intermediate step in a causal pathway
between residence in a geographic region and prostate
cancer mortality
Age as a confounder
• Age is a common confounder in observational
epidemiology because it is associated with
many diseases and many exposures
As distinct from a biased association, which is
erroneous, the confounded association between
geographic region and prostate cancer mortality,
though not causal, is real
[Diagram: Geographic area → Prostate cancer, causal association (?):
RR(unadj) = 2.5, RR(adj) = 0.78. Age has a positive (+) association
with geographic area and a positive (+) association with prostate cancer.]
Age confounded the relationship between
geographic area and prostate cancer
Case-control study: alcohol consumption and lung cancer
              Cases   Controls   Total
Drinkers        390        325     715
Nondrinkers     110        175     285
Total           500        500   1,000
OR(unadj) = (390x175)/(325x110) = 1.91
Note:
90% of the 500 cases in the study were smokers
25% of the 500 controls in the study were smokers
80% of the smokers drank
Case-control study: alcohol consumption and lung cancer –
table for Smokers
              Cases   Controls   Total
Drinkers        360        100     460
Nondrinkers      90         25     115
Total           450        125     575
OR = (360x25)/(100x90) = 1.00
Case-control study: alcohol consumption and lung cancer –
table for Nonsmokers
              Cases   Controls   Total
Drinkers         30        225     255
Nondrinkers      20        150     170
Total            50        375     425
OR = (30x150)/(225x20) = 1.00
Smoking and lung cancer
              Cases   Controls   Total
Smokers         450        125     575
Nonsmokers       50        375     425
Total           500        500   1,000
OR = (450x375)/(125x50) = 27.0
Alcohol consumption and smoking
              Smokers   Nonsmokers   Total
Drinkers          460          255     715
Nondrinkers       115          170     285
Total             575          425   1,000
OR = (460x170)/(255x115) = 2.67
Alcohol consumption and lung cancer (summary)
• Unadjusted OR = 1.91
• Stratify by smoking status (2 strata -- smokers and
nonsmokers)
OR = 1 for the relationship between alcohol consumption and lung
cancer among both smokers and non smokers
• Smoking-adjusted OR (weighted average of the
stratum-specific ORs) = 1.00
Smoking confounded the relationship between alcohol
consumption and lung cancer
Large discrepancy between the smoking-adjusted OR
(1.00) and the unadjusted OR (1.91) shows smoking was a
confounder
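The comparison summarized above can be checked numerically. The sketch below collapses the two smoking strata into the crude table and compares the crude OR with the stratum-specific ORs. It uses Mantel-Haenszel weights for the adjusted OR as one possible choice, since the slides say only "weighted average"; with both stratum ORs equal to 1.00, any weighting gives 1.00. Names are hypothetical.

```python
# Counts from the stratified tables:
# (drinker cases, drinker controls, nondrinker cases, nondrinker controls)
strata = {
    "smokers":    (360, 100, 90, 25),
    "nonsmokers": (30, 225, 20, 150),
}

def odds_ratio(a, b, c, d):
    """OR = (exposed cases x unexposed controls) /
            (exposed controls x unexposed cases)."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR across 2x2 tables (one assumed choice
    of weights; the slides do not name the method used here)."""
    tables = list(tables)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Summing cell by cell over the strata recovers the unadjusted table.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

crude_or = odds_ratio(*crude)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude OR    = {crude_or:.2f}")     # 1.91
for name, value in stratum_ors.items():
    print(f"OR among {name} = {value:.2f}")  # 1.00 in both strata
print(f"Adjusted OR = {adjusted_or:.2f}")  # 1.00
```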
Smoking confounded the relationship between alcohol
consumption and lung cancer
Smoking was a confounder because:
Smoking is a strong risk factor for lung cancer
Smoking is associated with alcohol consumption
Smoking is not an intermediate step in a causal pathway between
alcohol consumption and lung cancer
[Diagram: Alcohol consumption → Lung cancer, causal association = NO:
OR(unadj) = 1.91, OR(adj) = 1.00. Smoking has a positive (+) association
with alcohol consumption and a positive (+) association with lung cancer.]
Smoking confounded the relationship between
alcohol consumption and lung cancer
Confounding: definition
Confounding is a distortion of the association
between exposure and outcome brought about
by the association …of another, extraneous
exposure (confounder) with both the disease
and the exposure of interest
Confounding: definition
As distinct from a biased association, which is
erroneous, a confounded association, though
not causal, is real
Properties of confounders
A confounder must be associated with the
exposure under study
Properties of confounders
[Diagram: Alcohol consumption → Lung cancer, causal association (?):
RR(unadj) = RR(adj). Electromagnetic fields: one association drawn,
marked "?".]
Exposure to electromagnetic fields cannot confound the
relationship between alcohol consumption and lung cancer
Properties of confounders
For an extraneous exposure to be a
confounder, being associated with the exposure
of interest is necessary, but not sufficient
Properties of confounders
[Diagram: Alcohol consumption → Lung cancer, causal association (?):
RR(unadj) = RR(adj). Red meat: one positive (+) association drawn.]
Red meat consumption cannot confound the relationship
between alcohol consumption and lung cancer
Properties of confounders
A confounder must also be a risk factor for the
disease
Properties of confounders
[Diagram: Alcohol consumption → Lung cancer, causal association = NO:
OR(unadj) = 1.91, OR(adj) = 1.00. Smoking has a positive (+) association
with alcohol consumption and a positive (+) association with lung cancer.]
Smoking confounds the relationship between
alcohol consumption and lung cancer
Properties of confounders
A confounder cannot be an intermediate
variable in the causal pathway between the
exposure of interest and the disease
Properties of confounders
[Diagram: nodes "A: Predictors / Confounders", "HIV-related knowledge",
and "Willingness to get HIV testing"; arrows labeled "Direct effect on
HIV-related knowledge", "Direct effect on willingness to get HIV testing",
and "Mediated effect of A on willingness to get HIV testing".]
Properties of confounders
[Diagram: Exposure → Disease, causal association (?). The confounder has
a +/- association with the exposure and a +/- association with the disease.]
Avoiding confounding with appropriate study design
• Randomization
• Restriction
• Matching
Randomization
Done in experimental studies ONLY
Subjects are randomly allocated among n groups,
'ensuring' (on average, in sufficiently large samples)
that the distributions of known and unknown potential
confounders are similar across groups
Restriction
Restrict the selection criteria for subjects to a single
category of an exposure that is a potential confounder
…in the cohort study of alcohol consumption and lung
cancer, restrict the cohort to persons who have never
smoked.
Enhances internal validity, but could hurt external validity
Matching
In a case-control study, selection of controls who are
identical to, or nearly identical to, the cases with respect to
the distribution of one or more potential confounding factors
Matching is intuitively appealing, but its implications,
particularly in case-control studies, are much more
complicated than one might at first suppose
Assessing the presence of confounding during analysis
Is the potential confounder related to both the
exposure and the disease?
Stratification: Is the unadjusted OR or RR similar in
magnitude to the ORs or RRs observed within strata
of the potential confounder?
Adjustment: Is the unadjusted OR or RR similar in
magnitude to the OR or RR adjusted for the presence
of the potential confounder?
Confounding is judged to occur when the adjusted
and unadjusted values differ meaningfully.
Pandey DK et al. Dietary vitamin C and beta-carotene and risk of death in
middle-aged men. The Western Electric study.
• Concurrent cohort study
• Hypothesis: intakes of vitamin C and beta-carotene
(both antioxidants) are protective against all-cause
mortality
• Potential confounder: cigarette smoking
Unadjusted mortality rates and RRs according to vitamin
C/beta-carotene intake index
Index    Deaths   Person-years   Mortality rate*   RR
Low         195         10,707              18.2   1.00
Medium      163         10,852              15.0   0.82
High        164         11,376              14.4   0.79
*deaths per 1,000 person-years
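The rates and RRs in this table follow directly from the deaths and person-years. A minimal sketch of the arithmetic (variable names are hypothetical):

```python
# (deaths, person-years) by intake index, from the table above
cohort = {
    "Low":    (195, 10_707),
    "Medium": (163, 10_852),
    "High":   (164, 11_376),
}

# Mortality rate per 1,000 person-years
rates = {level: deaths / py * 1_000 for level, (deaths, py) in cohort.items()}

# Rate ratio relative to the "Low" intake group (the reference category)
rrs = {level: rate / rates["Low"] for level, rate in rates.items()}

for level in cohort:
    print(f"{level:<7} rate = {rates[level]:.1f}  RR = {rrs[level]:.2f}")
```

Rounded, this reproduces the table's rates (18.2, 15.0, 14.4) and RRs (1.00, 0.82, 0.79).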
Percentage distribution of vitamin C/beta-carotene intake
index by smoking status at baseline
                  Intake index (%)
Current smoking   Low    Medium   High
No                29.3   35.3     35.4
Yes               35.0   31.1     33.9
Mortality rates and RRs by current smoking at baseline
Current smoking   Deaths   Person-years   Mortality rate*   RR
No                   165         14,854              11.3   1.00
Yes                  357         18,401              19.4   1.72
*deaths per 1,000 person-years
Mortality rates and RRs for vitamin C/beta-carotene intake
index, stratified by current smoking at baseline
Current smoking   Intake index   Mortality rate*   RR
No                Low            13.4              1.00
No                Medium         10.7              0.80
No                High           10.3              0.77
Yes               Low            21.4              1.00
Yes               Medium         18.9              0.88
Yes               High           17.8              0.83
*deaths per 1,000 person-years
Unadjusted and smoking-adjusted mortality RRs according to
vitamin C/beta-carotene intake index
              Intake index
RR            Low    Medium   High
Unadjusted    1.00   0.82     0.79
Adjusted*     1.00   0.85     0.81
*Adjusted for smoking using the direct method with the total
cohort as the standard population
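The smoking-adjusted RRs can be approximated from the stratum-specific rates by direct adjustment. The sketch below assumes the "total cohort" standard is represented by the person-years of nonsmokers and smokers (14,854 and 18,401, from the table of mortality by current smoking). The slides do not say whether persons or person-years were used, so the weights are an assumption, and the names are hypothetical.

```python
# Stratum-specific mortality rates (per 1,000 person-years) from the
# stratified table: rates[current smoking][intake index]
rates = {
    "No":  {"Low": 13.4, "Medium": 10.7, "High": 10.3},
    "Yes": {"Low": 21.4, "Medium": 18.9, "High": 17.8},
}

# ASSUMPTION: weights are the shares of person-years contributed by
# nonsmokers and smokers in the total cohort.
person_years = {"No": 14_854, "Yes": 18_401}
total = sum(person_years.values())
weights = {smoking: py / total for smoking, py in person_years.items()}

def adjusted_rate(intake):
    """Weighted average of the smoking-specific rates for one intake level."""
    return sum(weights[smoking] * rates[smoking][intake] for smoking in rates)

adjusted_rr = {
    intake: adjusted_rate(intake) / adjusted_rate("Low")
    for intake in ("Low", "Medium", "High")
}

for intake, rr in adjusted_rr.items():
    print(f"{intake:<7} adjusted RR = {rr:.2f}")
```

Under this assumption the adjusted RRs round to 1.00, 0.85, and 0.81, matching the table; because every intake group is given the same smoking mix, remaining differences in the rates cannot be due to smoking.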
[Diagram: Vitamin C/beta-carotene → Mortality, causal association (?).
Medium intake: RR(unadj) = 0.82, RR(adj) = 0.85.
High intake: RR(unadj) = 0.79, RR(adj) = 0.81.
Smoking has an association with vitamin C/beta-carotene intake and a
positive (+) association with mortality.]
Smoking did not confound the association between
vitamin C/beta-carotene intake and all-cause mortality
Methods of adjusting for (controlling for) confounding in the
analysis
• Adjustment methods based on stratification
• Mathematical models (multivariable analysis)
Adjustment methods based on stratification
Stratify by the confounder
Calculate the RR or OR for the association between
the exposure and disease within each stratum of the
confounder
Calculate a single estimate of effect across the strata
(adjusted OR or adjusted RR), which is a weighted
average of the RRs or ORs across the strata
3 methods of obtaining a weighted average
• Direct adjustment (used in cohort studies) -- weights are
based on the distribution of the confounder in a standard
population
• Indirect adjustment (mainly used in occupational
retrospective cohort studies) -- weights are based on the
distribution of the confounder in the study population
• Mantel-Haenszel method (most common adjustment
method based on stratification; used in case-control or
cohort studies) -- weights are approximately proportional to
the reciprocals of the variances of the ORs or RRs within
each stratum
Shapiro S et al. Oral-contraceptive use in relation to
myocardial infarction – a case-control study
• Hypothesis: recent use of oral contraceptives is
associated with risk of myocardial infarction
• Cases: 234 premenopausal women with a definite
first myocardial infarction (median age 43)
• Controls: 1,742 premenopausal women admitted for
musculoskeletal conditions, trauma, abdominal
conditions, and many miscellaneous conditions
(median age 36)
Hospital-based case-control study
         Cases   Controls
OC          29        135
No OC      205      1,607
OR(unadj) = (29x1607)/(135x205) = 1.7
Age is a likely confounder
• Age is a risk factor for myocardial infarction
• Age is negatively associated with oral contraceptive
use
Assess for confounding by age
• Perform a stratified analysis by age
• Compare the Mantel-Haenszel adjusted OR with the
unadjusted OR
Mantel-Haenszel age-adjusted OR = 4.0
Unadjusted OR = 1.7
Limitations of adjustment methods based on stratification
• There is often more than one potential confounder
• Allow adjustment only for categorical variables;
continuous variables must be categorized
Stratification methods are usually limited to
adjustment for one or two confounders with a small
number of categories each
Multivariable models
• Simultaneous adjustment for multiple potential
confounders, including continuous variables
• Potential confounders are included as variables in the
model along with the exposure under study
• Commonly used models
Logistic regression: case-control and cohort studies
Cox proportional hazards model: cohort studies
Poisson regression: cohort studies
Effect Modification (Interaction) - Oral contraceptives and
myocardial infarction example
Definition: variation in the magnitude of the
association between an exposure and a disease
(variation in the RR or OR) across strata of another
exposure
Are the odds ratios regarding the association
between OC use and MI heterogeneous across the
smoking status strata?
Oral contraceptives and myocardial infarction: stratified
analysis by smoking
Smoking
(cigarettes per day)   OC    Cases   Controls   OR
0                      Yes       3         51   0.4
                       No       79        566
1-24                   Yes       4         52   1.7
                       No       34        754
25+                    Yes      22         32   2.1
                       No       92        287
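The stratum-specific ORs in this table can be recomputed from the cell counts, which makes the heterogeneity across smoking strata easy to inspect. A minimal sketch (names are hypothetical):

```python
# (OC cases, OC controls, non-OC cases, non-OC controls)
# for each smoking stratum (cigarettes per day)
strata = {
    "0":    (3, 51, 79, 566),
    "1-24": (4, 52, 34, 754),
    "25+":  (22, 32, 92, 287),
}

def odds_ratio(a, b, c, d):
    """OR = (exposed cases x unexposed controls) /
            (exposed controls x unexposed cases)."""
    return (a * d) / (b * c)

stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Summing the strata cell by cell recovers the unadjusted 2x2 table.
crude = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude)

for name, value in stratum_ors.items():
    print(f"{name:>4} cigarettes/day: OR = {value:.1f}")
print(f"Crude OR = {crude_or:.1f}")
```

The strata sum to the unadjusted table (29, 135, 205, 1,607), and the ORs round to 0.4, 1.7, and 2.1.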
Effect modification has an underlying biologic basis; it is
not merely a statistical phenomenon.
Other effect modification examples
• Menopausal status modifies the association between
obesity and breast cancer
• The association between gender and hip fracture is
modified by age
• Nutrition modifies the association between HIV infection
and progression of latent tuberculosis infection to active
tuberculosis
Effect modification example: Lyon et al. Smoking and
carcinoma in situ of the uterine cervix
              Cases   Controls
Smokers         130         45
Non-smokers      87        198
OR(unadj) = (130x198)/(45x87) = 6.6
Effect modification example: Lyon et al. Smoking and
carcinoma in situ of the uterine cervix
Age     Smoker   Cases   Controls   OR     95% CI
20-29   Yes         41          6   27.9   (11.1-70.2)
        No          13         53
30-39   Yes         66         25   5.9    (3.3-10.6)
        No          37         83
40+     Yes         23         14   2.8    (1.3-5.9)
        No          37         62
OR(unadj) = (130x198)/(45x87) = 6.6
Mantel-Haenszel age-adjusted OR = 6.3
p-value for heterogeneity <0.01
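The Mantel-Haenszel age-adjusted OR can be reproduced from the age-specific counts. The sketch below uses the standard Mantel-Haenszel estimator, OR_MH = sum(a_i d_i / n_i) / sum(b_i c_i / n_i), where n_i is the total in stratum i; function and variable names are hypothetical.

```python
# (smoker cases, smoker controls, non-smoker cases, non-smoker controls)
# for each age stratum
strata = {
    "20-29": (41, 6, 13, 53),
    "30-39": (66, 25, 37, 83),
    "40+":   (23, 14, 37, 62),
}

def odds_ratio(a, b, c, d):
    """OR = (exposed cases x unexposed controls) /
            (exposed controls x unexposed cases)."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """OR_MH = sum(a*d/n) / sum(b*c/n) over the strata."""
    tables = list(tables)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

stratum_ors = {age: odds_ratio(*cells) for age, cells in strata.items()}
crude_or = odds_ratio(*(sum(cells) for cells in zip(*strata.values())))
mh_or = mantel_haenszel_or(strata.values())

for age, value in stratum_ors.items():
    print(f"Age {age}: OR = {value:.1f}")
print(f"Crude OR = {crude_or:.1f}")            # 6.6
print(f"Mantel-Haenszel OR = {mh_or:.1f}")     # 6.3
```

The summary OR (6.3) is close to the crude OR (6.6), yet the stratum ORs range from 27.9 down to 2.8, so a single adjusted number hides the heterogeneity.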
Confounding vs. Effect Modification
• Confounding: Confounding is a distortion of the RR or
OR that should be adjusted for
• Effect modification: Effect modification is a property of
a putative causal association.
It is a finding to be detected and estimated, not a bias to be avoided or
confounding to be adjusted for
An effect modifier may or may not itself be a confounder
Confounding vs. Effect Modification –cohort study example
• The unadjusted RR for the association between
Exposure A and Disease X is 9.7
How does age affect the relationship between
Exposure A and Disease X? 4 hypothetical scenarios
Confounding vs. Effect Modification –cohort study example
Age     RR*
20-29    9.8
30-39   10.7
40-49    9.3
50-59   11.4
60-69   10.1
70-79    9.1
*Iexp(A)/Inonexp(A)
RR(unadj) = 9.7
RR(adj) = 10.1
Age is neither a confounder nor an effect modifier
Confounding vs. Effect Modification –cohort study example
Age     RR*
20-29   15.7
30-39   17.3
40-49   12.8
50-59    9.1
60-69    3.2
70-79    2.4
*Iexp(A)/Inonexp(A)
RR(unadj) = 9.7
RR(adj) = 10.1
Age is an effect modifier, but not a confounder
Note: When there is effect modification, we cannot summarize
the relationship between Exposure A and Disease X with a
single number [RR(adj)]
Confounding vs. Effect Modification –cohort study example
Age     RR*
20-29   5.4
30-39   2.8
40-49   4.7
50-59   3.5
60-69   4.1
70-79   4.5
*Iexp(A)/Inonexp(A)
RR(unadj) = 9.7
RR(adj) = 4.3
Age is a confounder, but not an effect modifier
Confounding vs. Effect Modification –cohort study example
Age     RR*
20-29   8.6
30-39   8.5
40-49   6.2
50-59   4.1
60-69   2.0
70-79   2.5
*Iexp(A)/Inonexp(A)
RR(unadj) = 9.7
RR(adj) = 4.3
Age is a confounder and an effect modifier
Case-control study of alcohol consumption, smoking, and
oral cancer
Alcohol   Cases   Controls
Yes          80         40
No           40        125
OR(unadj) = (80x125)/(40x40) = 6.25
Case-control study of alcohol consumption, smoking, and
oral cancer
Smoking   Cases   Controls
Yes          84         45
No           36        120
OR(unadj) = (84x120)/(45x36) = 6.22
Case-control study of alcohol consumption, smoking, and
oral cancer
Smoker   Alcohol   Cases   Controls   OR
No       No           20        100
         Yes          16         20   4.0
Yes      No           20         25
         Yes          64         20   4.0
Unadjusted OR = 6.25; smoking-adjusted OR = 4.0
Smoking is a confounder of the relationship between
alcohol consumption and oral cancer, but there is no
effect modification
Case-control study of alcohol consumption, smoking, and
oral cancer
Alcohol   Smoker   Cases   Controls   OR
No        No          20        100
          Yes         20         25   4.0
Yes       No          16         20
          Yes         64         20   4.0
Unadjusted OR = 6.22; alcohol-adjusted OR = 4.0
Alcohol consumption is a confounder of the relationship
between smoking and oral cancer, but there is no
effect modification
ORs for the joint effect of smoking and alcohol consumption
on risk of oral cancer
Smoker   Alcohol   Cases   Controls   OR
No       No           20        100   Ref
No       Yes          16         20   4.0
Yes      No           20         25   4.0
Yes      Yes          64         20   16.0
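The joint-effect ORs compare each exposure combination with the doubly unexposed reference group. A minimal sketch of that calculation (names are hypothetical):

```python
# (cases, controls) for each (smoker, alcohol) combination
cells = {
    ("No", "No"):   (20, 100),  # reference: nonsmoking nondrinkers
    ("No", "Yes"):  (16, 20),
    ("Yes", "No"):  (20, 25),
    ("Yes", "Yes"): (64, 20),
}

ref_cases, ref_controls = cells[("No", "No")]

def or_vs_reference(cases, controls):
    """Odds of being a case in this group relative to the reference group."""
    return (cases * ref_controls) / (controls * ref_cases)

joint_ors = {group: or_vs_reference(*counts) for group, counts in cells.items()}

# With no effect modification on the multiplicative (OR) scale, the OR for
# both exposures equals the product of the two single-exposure ORs.
product_of_single_ors = joint_ors[("No", "Yes")] * joint_ors[("Yes", "No")]

for (smoker, alcohol), value in joint_ors.items():
    print(f"Smoker={smoker:<3} Alcohol={alcohol:<3} OR = {value:.1f}")
print(f"Product of single-exposure ORs = {product_of_single_ors:.1f}")  # 16.0
```

Here 4.0 x 4.0 = 16.0, which agrees with the equal stratum-specific ORs of 4.0 in the stratified tables.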
Assessment of effect modification (summary)
• Stratify by the potential effect modifier
• Calculate the RR or OR for the association between
the exposure and disease within each stratum of the
potential effect modifier
• Assess the degree of heterogeneity of the RRs or
ORs across the strata by inspection
• Calculate a p-value for heterogeneity – however,
remember that formal tests for heterogeneity are
conservative (they have low power), so they might
fail to detect effect modification