Case control study


Issues in case-control studies
Kwang Hyuck Lee
M.D., Ph.D.
[email protected]
Division of gastroenterology
Department of Medicine
Samsung Medical Center
Sungkyunkwan University School of Medicine
• Pancreas and biliary tract
SMC pancreas biliary tract
Biliary tract and pancreas
Managing with specialized Endoscopy
Abdominal ultrasound vs. upper GI endoscopy
Biopsy using endoscopic ultrasound
(EUS: endoscopic ultrasound)
Endoscopic ultrasound in Pancreaticobiliary disease
ERCP (endoscopic retrograde cholangiopancreatography)
Why do medical doctors have
to learn epidemiology?
 Graduate school degree
 Investigation → journal
 Academic position
 Interest in research
Presenter’s Name
 Ability to
Date
 Do a case-control study
 Evaluate other papers properly
Case-control study –
historical synonyms
 Retrospective study
 Trohoc study
 Case comparison study
 Case compeer study
 Case history study
 Case referent study
Case Control Study
                 Disease
Exposed      Case (Yes)    Control (No)
Yes          A1            B1
No           A0            B0

$$\mathrm{OR} = \frac{A_1 B_0}{A_0 B_1} \quad \text{(cross-product ratio)}$$
Factors associated with early detection of biliary stricture in patients with elevated liver enzymes after living donor liver transplantation
오초롱, 이광혁, 이종균 , 이규택 , 권준혁*,조재원*, 조주희**
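Sungkyunkwan University School of Medicine, Samsung Medical Center, Division of Gastroenterology, Transplant Surgery*, Cancer Education Center**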
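Study objective
 Biliary complications occurring after living donor liver transplantation (LDLT)
 Success rate of endoscopic treatment, the best available treatment: around 50%
 Success rates are higher when biliary complications are detected early and endoscopic drainage is performed.
 We sought to identify factors that predict biliary complications among patients with abnormal liver function after LDLT.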
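Subjects and methods
 Study period and patients
 Patients who underwent LDLT from January 2006 through December 2008
 Patients whose liver function recovered after surgery and then deteriorated again
 Only patients with duct-to-duct anastomosis were included (hepaticojejunostomy patients were excluded)
 Items reviewed
 Underlying disease, symptoms
 Liver function tests
 Operative records
 Imaging studies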
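Analysis groups
 Patients whose liver enzymes rose again after LDLT were divided into groups
(Elevation criteria: AST>80, ALT>80, ALP>250 or bilirubin>2.2)
Group A
: Patients who needed ERCP vs. patients who did not need ERCP
Group B
: Patients with anastomotic biliary stricture vs. patients with rejection
Group C
: Among patients with no stricture on CT,
patients who needed ERCP vs. patients who did not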
LDLT patients during 3 years: n=213
Patients with LFT elevation: n=120
Analysis group A
 need ERCP: n=74
 not need ERCP: n=46
Analysis group B
 stricture: 58
 rejection: 23
 leakage: 13
 infection: 7
 stone: 3
 HCC: 5
 viral reactivation: 3
 vessel stenosis: 3
 etc: 5
Analysis group C
 CT(-) need ERCP: 32
 CT(-) not need ERCP: 40
Case-Control Study or not?
Brock MV, et al. N Engl J Med 2008;358:900-9
Conducting case-control studies
 Case and Control selection
 Exposure measurement
 Odds ratio
Research
 New Question ??
 Method
 Clinical study
 Translational study
 Laboratory study
 Clinical study
 Observational studies
• Case-control study Vs Cohort study
 Randomized controlled trial
Why case-control studies?
 New question of interest
 Cohort study with the appropriate outcome
or exposure ascertainment does NOT exist
 Need to initiate a new study
 Do you have the time and/or resources to establish and follow a new cohort?
Case control study ??
 High cholesterol → Myocardial infarction
 MI (+) case
 MI (-) control
 Cholesterol level
 Result
• Negative
• Positive
Impetus for case-control studies :
EFFICIENCY
 May not have sufficient time to observe the development of diseases with long latency periods.
 May not have a sufficiently large cohort to observe outcomes of low incidence.
NOTE: Rare outcomes are not necessary for a case-control study, but are often the motivation.
Efficiency of case-control study
 Do maternal exposures to estrogens around
time of conception cause an increase in
congenital heart defects?
 Assume RR = 2, 2-sided α = 0.05, 90% power
 Cohort study: If I0 = 8/1000, I1 = 16/1000, would need 3889 exposed and 3889 unexposed mothers
 Case-control study: If ~30% of women are
exposed to estrogens around time of conception,
would need 188 cases and 188 controls
Schlesselman, p. 17
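The contrast between the two designs can be checked with a short sketch. It assumes the usual normal-approximation formula for comparing two proportions, without continuity correction; whether the slide's figures were derived in exactly this way is an assumption, and the function names `n_per_group` and `exposure_in_cases` are made up for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(p0: float, p1: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Per-group sample size for comparing two proportions (two-sided test).

    Normal approximation without continuity correction. p0 and p1 are the
    proportions being compared: risks in a cohort design, exposure
    prevalences in a case-control design.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


def exposure_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Exposure prevalence among cases implied by the control prevalence and the OR."""
    return p_controls * odds_ratio / (1 + p_controls * (odds_ratio - 1))


# Cohort design: risk 8/1000 in unexposed vs 16/1000 in exposed (RR = 2)
n_cohort = n_per_group(0.008, 0.016)

# Case-control design: 30% of controls exposed, OR taken as 2 (assumed equal to the RR)
n_case_control = n_per_group(0.30, exposure_in_cases(0.30, 2.0))

print(n_cohort, n_case_control)
```

Under these assumptions the case-control design needs 188 subjects per group, and the cohort figure lands within a few subjects of 3889; the small gap depends on how the z-values are rounded.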
Strengths of case-control study
 Efficient – typically:
 Shorter period of time
 Not as many individuals needed
 Cases are selected, thus particularly good for
rare diseases
 Informative – may assess multiple
exposures and thus hypothesized causal
mechanisms
Learning objectives
 Exposure
 Selection of cases and controls
 Bias
 Selection, Recall, Interviewer, Information
 Odds ratios
 Matching
 Nested studies
 Conducting a case-control study
DCR Chapter 8
Exposure ascertainment – examples
 Active methods
 Questionnaire (self- or interviewer- administered)
 Biomarkers
 Passive methods
 Medical records
 Insurance records
 Employment records
 School records
Exposure ascertainment issues
 Establish biologically relevant period
 Measurement occurs once at current time
 Repeated exposure
 Previous exposure
 Measure of exposure occurs after outcome has developed
 Possibility of information bias
 Possibility of reverse causation (outcome influences the measure of exposure)
Is it possible in case-control
study? – relevant period
Yesterday’s smoking and radiation → Cancer risk
Information bias: recall bias
 Mothers of babies born with congenital
malformations more likely to recall
(accurately or “over-recall”) events during
pregnancy such as illnesses, diet, etc.
Possibility of reverse causation
 High cholesterol → Myocardial infarction
 MI (+) case
 MI (-) control
 Cholesterol level
 Result ?
 MI → Cholesterol level decrease
 Measure cholesterol after MI
Case selection – basic tenets
 Eligibility criteria
 Characteristics of the target and source population
 Diagnostic criteria
 Definition of a case: misclassification
 Feasibility
Source populations – samples
 Health providers: clinics, hospitals, insurers
 Occupations: work place, unions
 Surveillance/screening programs
 Laboratories, pathology records
 Birth records
 Existing cohorts
 Special interest groups: disease foundations or organizations
Incident versus prevalent cases
 Incident cases: All new cases of disease (that become diagnosed) in a certain period
 Prevalent cases: All current cases regardless of when the case was diagnosed
Incident Vs Prevalence
 Do the cases represent all incident cases in
the target population?
 Exposure–disease association
Vs
Exposure–survival association
Prevalence cases
 Disease
 only A (causal factor)
 A+B (protective factor)
 A+C (protective factor)
 Patient A: A → 1-month survival
 Patient B: A+B → 1-year survival
 Patient C: A+C → 10-year survival
Prevalent cases → A, B, C: Causes
→ Intervention on B or C
→ ↓↓ Survival
Disease severity
 Which stage is chosen for a case?
 Early stage only
 Late stage only
Progression not always
Influence of severity
 Increase sample size for stratification
Early stage only
 Finding risk factors of thyroid cancer
 Decrease risk factors → Prevent thyroid cancer
 Health promotion
 Case: small thyroid cancer
 Control: normal population
 Determined the differences of exposure
 Small thyroid cancer → no progression
 What is the clinical meaning of this study?
Late stage only
– difficult diagnosis
 Pancreatic cancer Vs. Weight
 Cases: pancreatic cancer (late stage)
 Low weight due to Cancer progression
 Conclusion
 low weight → pancreatic cancer
Increase sample size for stratification
Selection bias
 Selection of cases independent of exposure
status
 Related to severity
 Related to hospitalization or visiting
Example selection bias (1)
 Hypothesis
 Common cold → Asthma
 Setting
 Patients in Hospital
 Truth
 Common cold: aggravating factor, not causal factor
 No different incidence of asthma according to common cold
 Common cold (+) → aggravation → hospital visit
 Common cold (-) → no symptoms → no visit
Example selection bias (2)
            Total      Common cold in society    Patients in hospital    Common cold in hospital
Asthma      1,000      10                        50                      10
General     200,000    2,000                     1,000                   20 (10 + alpha)

                 Cause positive    Cause negative
Case (asthma)    10                40
Control          1                 49

Odds ratio = (10 × 49) / (40 × 1) = 12.25
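The numbers in this example can be replayed in a few lines. This is only a sketch using the counts shown on the slide; the helper name `cross_product_or` is made up for illustration. It contrasts the odds ratio in the whole society, where common cold is equally frequent with or without asthma, with the odds ratio seen among hospital patients.

```python
def cross_product_or(exp_cases: int, unexp_cases: int,
                     exp_controls: int, unexp_controls: int) -> float:
    """Cross-product odds ratio for a 2x2 table."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


# Whole society: 10 of 1,000 asthma patients and 2,000 of 200,000
# other people have a common cold (1% in both groups)
society_or = cross_product_or(10, 1_000 - 10, 2_000, 200_000 - 2_000)

# Hospital: 50 asthma patients (10 with a cold) versus 50 controls drawn
# from hospital patients, of whom about 2% (1 of 50) have a cold
hospital_or = cross_product_or(10, 40, 1, 49)

print(society_or, hospital_or)
```

The association is null in the source population and appears only because a cold drives asthma patients to the hospital, which is the selection mechanism the slide describes.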
Case and Control selection
Same distribution
of risk factors ??
Guallar E, et al. N Engl J Med 2002;347:1747-54
Selection of controls –
basic tenets
 Same target population as cases
 Selection needs to be independent of exposure
 Should have the same proportion of exposed to non-exposed persons as the underlying cohort (source population)
 Confirmation of lack of outcome/disease
 Should answer yes to: If they had developed the disease of interest during the study period, would they have been included as a case?
Selecting controls –
Same as case source
 Characteristics
1. Convenient
2. Most likely same target population
3. Rule out outcome – avoids misclassification
4. Similar factors leading to inclusion into source population
5. Sometimes impractical
 Examples
 Breast cancer screening program
• Confirmed breast cancer – cases
• No breast cancer – controls
 Same hospital as case series
• Similar referral pattern – examine by illness types
Source for controls
 Geographic population
 Roster needed
 Probability sampling
 Neighborhood controls
 Random sample of the neighborhood
 Friends and family members
 Hospital-based control
Selection of controls:
Friends or family members
 Friends or family members
 Ask each case for list of possible friends who meet
eligibility criteria
 Randomly select among list
 Type of matching - will be addressed later
 Concerns:
 May inadvertently select on exposure status, that is, friends because of engaging in similar activities or having similar characteristics/culture/tastes
 “over-matching”
Am J Epidemiol 2004;159:915-21
Selection of controls
Hospital or clinic-based
 Strengths
 Ease and accessibility
 Avoid recall bias
 Concerns
 Selection bias: exposure related to the hospitalization
 A mixture of the best defensible controls
 Referral pattern
 Same
 Or not
Diet pattern: Colon cancer
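 Study conducted at a hospital specializing in GI cancer (GI referral center)
 Case: colon cancer (+) at the GI clinic
 Control: colon cancer (-) at the pulmonary clinic
• GI clinic: information on diet related to GI cancer in the waiting room
• Pulmonary clinic
 Differences between the two groups may reflect differences between the clinics rather than differences in disease.
 Control: gastric cancer (+) at the GI clinic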
Guallar E, et al. N Engl J Med 2002;347:1747-54
Weakness of Case-Control
Studies
 Time period from which the cases arose
 Survival factor, Reverse causation
 Biologically relevant period
 Only one outcome measured
 Susceptibility to bias
 Separate sampling of the cases and controls
 Retrospective measurement of the predictor variables
Issues in case-control studies
Eliseo Guallar, MD, DrPH
[email protected]
Juhee Cho, M.A., Ph.D.
[email protected]
Case and Control selection
Same distribution
of risk factors ??
Selection of cases
 Case selection in hospitals/ Control selection in general population
 Alcohol → Hip fractures: All visit hospitals
 IUD → abortion
 1st abortion: Some visit but others not
 Women with IUD in general population more frequently visit clinics

Target population
               Disease    No disease
Exposed        A          B
Non-exposed    C          D

Study sample
               Disease    No disease
Exposed        a          b
Non-exposed    c          d
1st abortion: 3% rate and no relation of IUD
 General population
 IUD (+) 1000 → 30, 970
 IUD (-) 9000 → 270, 8730
 IUD: frequent visit
 Hospital population
 IUD (+) 90% → 27, 873
 IUD (-) 45% → 122, 3930

IUD      case    control
Yes      10      10
No       90      90
Total    100     100

IUD      case    control
Yes      27      15
No       122     134
Total    149     149
         Case     Control    Total
Yes      27       15         42
(%)      18.1     10.1       14.1
No       122      134        256
(%)      81.9     89.9       85.9
Total    149      149        298
%        100      100        100
Pearson chi2(1) = 3.9911 Pr=0.046
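The test statistic on this slide can be re-derived from the four cell counts alone. The sketch below uses only the standard library; it relies on the identity that, for 1 degree of freedom, the upper-tail chi-square probability equals `erfc(sqrt(chi2 / 2))`. The variable names are illustrative.

```python
import math

# Rows: IUD yes / IUD no; columns: hospital cases / general-population controls
table = [[27, 15],
         [122, 134]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

# Upper-tail probability for a chi-square with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

# Cross-product odds ratio for the same table
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

print(round(chi2, 4), round(p_value, 3), round(odds_ratio, 2))
```

The result is a nominally significant association (odds ratio close to 2) even though the example was built with no relation between IUD use and abortion, because cases were drawn from the hospital and controls from the general population.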
How to overcome….
 Control: general population → difference due to frequent visit
 Control: Hospital population
 theoretically same unless this control group has higher abortion rates due to other problems
 Control mixture: both
Criticisms from papers
Limited cases
Selection bias
from control selection
To make your paper better than
previous studies
Nomura A, et al. N Engl J Med 1991;325:1132-6
Selection bias
in nested case-control study
 Controls were excluded if they had had
gastrectomy or history of peptic ulcer disease
 Controls with a cardiovascular disease or
cancer at baseline or during follow-up were
excluded
Target population
              Disease    No disease
Exposed       A          B
Nonexposed    C          D

Study sample
              Disease    No disease
Exposed       a          b
Nonexposed    c          d
At GI clinic
Exclude other diseases
MacMahon B, et al. N Engl J Med 1981;304:630-3
Selection bias in case-control study
 Controls were largely patients with diseases of
the gastrointestinal tract
 Control patients may have reduced their coffee
intake as a consequence of GI symptoms
Target population
              Disease    No disease
Exposed       A          B
Nonexposed    C          D

Study sample
              Disease    No disease
Exposed       a          b
Nonexposed    c          d
Antunes CMF, et al. N Engl J Med 1979;300:9-13
Non-GY control: 6.0
GY control: 2.1
Criticisms
of prior case-control studies
 Diagnostic surveillance bias
 Women on estrogens are evaluated more
intensively – they are more likely to be diagnosed
and to be diagnosed at earlier stages
 Women with asymptomatic cancer who receive
estrogens are more likely to bleed and to be
diagnosed
Antunes CMF, et al. N Engl J Med 1979;300:9-13
To avoid selection bias
in case-control studies
 Selection of cases
 Types of cases selected (non-fatal, symptomatic, advanced)
 Response rates among cases
 Relation of selection to exposure – Are exposed cases more
(or less) likely to be included in the study?
 Selection of controls
 Type of controls (general population, hospital, friends and relatives)
 For hospital controls, diseases selected as control conditions
 Response rate among controls
 Relation of selection to exposure – Are exposed controls
more (or less) likely to be included in the study?
 Similar response rates in cases and controls do NOT
rule out selection bias
Recall issues
 All information in case-control studies is historic, so if
relying on reporting by participants, accuracy depends
on recall
Concerns:
 Do cases recall prior events differently from controls?
 Mindset of someone with disease: Is there something that I did that may have caused the disease?
→ Recall Bias (Information Bias)
Recall bias – example
 Mothers of babies born with congenital
malformations more likely to recall
(accurately or “over-recall”) events during
pregnancy such as illnesses, diet, etc.
Folic acid and neural tube defects
Figure 1: Features of neural tube development and neural tube defects. Botto et al. Neural tube defects. NEJM 1999. (28th day after fertilization)
Background and Aim
 A reduced recurrence risk of neural tube defects among women receiving multivitamin supplements containing folic acid.
 Most NTDs are de novo; less than 10% of NTDs are recurrent.
 First occurrence of NTDs only and periconceptional folate supplements
Study population
Pregnant women
Target
Source
Study
 Case
 NTDs
 Control
 Other major malformations due to recall bias
 Subjects with oral clefts were excluded because vitamin supplementation has been hypothesized to reduce the risk: selection bias
Overall data
Folate (+) OR = 0.6 (0.4 – 0.8)
Recall Bias: Previous knowledge
Recall Bias quantification
                Case    Control    OR       Recall rate      In this study
(total)         1000    1000
real            500     800        0.625    Control – 75%
all             400     600        0.667    Case – 80%       0.6
Prev known      450     600        0.750    Case – 90%       0.8
Prev unknown    375     600        0.625    Case – 75%       0.4
Recall bias –
assessment / avoidance
 Check with recorded information, if possible
 Use objective markers or surrogates for
exposure – careful of markers that are affected
by disease
 Ask participant to identify which factor(s) are important for disease
 Build in false risk factor to test for overreporting
 Use controls with another disease
Study population
Pregnant women
Target
Source
Study
 Case
 NTDs
 Control
 Other major malformations due to recall bias
 Subjects with oral clefts were excluded because vitamin supplementation has been hypothesized to reduce the risk: selection bias
Selection bias
 If subjects with oral clefts were included in the control group, the number of controls with exposure (lack of vitamin supplement or folate intake) would increase.
 As B increases, the probability of rejecting the null hypothesis decreases.
Cleft = ↓ intake of vitamin
                Case    Control
Exposure (+)    A       B
Exposure (-)    C       D
Exposure: lack of folate intake
Methods
 Periconceptional folic acid exposure was determined by
 Interview with study nurses
 Demographic
 Health behavior factors
 Reproductive history
 Family history of birth defects
 Occupation
 Illnesses (chronic and during pregnancy)
 Use of alcohol, cigarettes and medications
 Vitamin use during the 6 months before the LMP through the end of pregnancy
 Semi-quantitative food frequency questionnaire
 Knowledge of vitamins and birth defects
Confounding
Exposure: ↓ Folate intake
Confounding: Alcohol
Outcome: ↑ NTDs
Interviewer bias
 Differential interviewing of cases and controls,
i.e., may probe or interpret responses
differently
→ Interviewer Bias (Information Bias)
Interviewer bias –
avoidance / assessment
 Self-administered instruments (prone to more
non-response)
 Standardized instruments → Computerized instruments (CADI, ACASI)
 Avoid open-ended questions but rather use questions with each possible response elicited
 Training
 Masking interviewers to research question
 Masking interviewers to case/control status
 Same interviewers for cases and controls
Odds ratio
             Disease
Exposed    Yes    No
Yes        A1     B1
No         A0     B0

$$\mathrm{OR} = \frac{A_1 B_0}{A_0 B_1} \quad \text{(cross-product ratio)}$$
Example: CHD and Diabetes
               CHD
Diabetes    Yes    No
Yes         183    65
No          575    735

$$\mathrm{OR}_{\mathrm{CHD}} = \frac{183 / 65}{575 / 735} \approx 3.60$$
(3.62 if the two odds are first rounded to 2.82 and 0.78)
No units!
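The CHD and diabetes example can be reproduced directly. The sketch below is illustrative only, with variable names chosen here; it shows that the ratio of the two odds and the cross-product form are the same quantity, and that rounding the odds to two decimals before dividing moves the second decimal of the result.

```python
# 2x2 table from the CHD and diabetes example
chd_diabetic, no_chd_diabetic = 183, 65
chd_nondiabetic, no_chd_nondiabetic = 575, 735

# Odds of CHD within each diabetes stratum
odds_diabetic = chd_diabetic / no_chd_diabetic
odds_nondiabetic = chd_nondiabetic / no_chd_nondiabetic

# Two equivalent ways of writing the odds ratio
or_from_odds = odds_diabetic / odds_nondiabetic
or_cross_product = (chd_diabetic * no_chd_nondiabetic) / (no_chd_diabetic * chd_nondiabetic)

# Effect of rounding the odds to two decimals before dividing
or_rounded_inputs = round(odds_diabetic, 2) / round(odds_nondiabetic, 2)

print(round(or_from_odds, 2), round(or_cross_product, 2), round(or_rounded_inputs, 2))
```

Carrying full precision gives about 3.60, while dividing the rounded odds (2.82 / 0.78) gives 3.62. Either way the odds ratio is a dimensionless number.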
Some properties of odds ratios
 Null value: OR = 1
 OR >= 0 (cannot be negative)
 Multiplicative scale (be careful with plots)
 Use logistic regression to estimate multivariate adjusted odds ratios in case-control studies
Odds ratios and
the “rare disease assumption”
 With incidence density sampling (represents underlying cohort at time of case) and sampling of cases and controls independent of exposure:
 OR ≈ IR
 With outcomes of very low incidence in the underlying cohort and sampling of cases and controls independent of exposure:
 OR ≈ RR
 Higher incidence increases the bias away from the null
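A small numeric sketch makes the last point concrete. The relative risk of 2 and the baseline risks below are hypothetical values picked for illustration, as are the function names. With the relative risk held fixed, the odds ratio is computed at increasing baseline incidence.

```python
def odds(p: float) -> float:
    """Convert a risk (probability) into odds."""
    return p / (1 - p)


def odds_ratio_for(baseline_risk: float, relative_risk: float) -> float:
    """Odds ratio implied by a baseline risk and a fixed relative risk."""
    exposed_risk = relative_risk * baseline_risk
    return odds(exposed_risk) / odds(baseline_risk)


RELATIVE_RISK = 2.0  # hypothetical, held fixed across scenarios

for baseline_risk in (0.001, 0.01, 0.10, 0.30):
    or_value = odds_ratio_for(baseline_risk, RELATIVE_RISK)
    print(f"baseline risk {baseline_risk}: RR = {RELATIVE_RISK}, OR = {or_value:.2f}")
```

At a baseline risk of 0.1% the odds ratio is practically equal to the relative risk, while at 30% it reaches 3.5, further from the null than the true relative risk of 2.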
Matching
 Individual matching
 Up to 1:5
 Frequency matching
 Case selection → confounder frequency → matching
 Stratified sampling
 Stratification → selection of case and control
Odds ratio – matched pairs
Case    Control    # pairs
A1      B1         n11
A1      B0         n10
A0      B1         n01
A0      B0         n00
N = total # pairs
N pairs = N cases and N controls → 2N people
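The slide tabulates the four types of pairs, but the estimator itself is not shown in the transcript. The usual matched-pair estimate uses only the discordant pairs, OR = n10 / n01, and the sketch below assumes that is what was intended. The pair counts are hypothetical and the function name is made up for illustration.

```python
def matched_pair_odds_ratio(n10: int, n01: int) -> float:
    """Matched-pair odds ratio from the discordant pairs.

    n10: pairs with an exposed case and an unexposed control
    n01: pairs with an unexposed case and an exposed control
    """
    if n01 == 0:
        raise ValueError("no pairs with an unexposed case and an exposed control")
    return n10 / n01


# Hypothetical pair counts, for illustration only
pairs = {"n11": 20, "n10": 30, "n01": 10, "n00": 40}

total_pairs = sum(pairs.values())  # N pairs, i.e. 2N people
matched_or = matched_pair_odds_ratio(pairs["n10"], pairs["n01"])

print(total_pairs, matched_or)
```

Concordant pairs (n11 and n00) carry no information about the association in this estimator, which is why only the two discordant counts enter the function.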
Antunes CMF, et al. N Engl J Med 1979;300:9-13
Matching
 Cannot examine the independent effect of
matched variable on outcome
 May inadvertently match
 On the exposure itself or its surrogate
 On a factor in the causal pathway
 On a factor that is affected by the outcome
 Logistical complexity of matching
 Particularly useful when distribution of confounders is very different in cases and controls
Designing a case-control study
Overview I
 What is the research question?
 In what target population?
 What source(s) will be used?
 How long will recruitment take?
 What is the definition of the cases?
 What confirmation is needed? Is screening/additional testing necessary?
 Will prevalent cases be used? Does exposure influence the disease prognosis?
 What is the underlying cohort?
 How many cases are seen per year in the source?
Designing a case-control study
Overview II
 What are the eligibility criteria for controls?
 What source(s) will be used to identify controls?
 Do they represent the same underlying cohort as the
cases?
 What confirmation is needed? Is screening/additional
testing necessary?
 Sampling methods? Will the controls be selected throughout the study period? Can they be selected as cases if they later develop disease?
 Do additional sources need to be used?
 For both cases and controls, does exposure status affect: inclusion in source populations or participation?
Designing a case-control study
Overview III
 Are there known confounders? Should matching be
used?
 What methods will be used to recruit cases and controls?
 What methods will be used to obtain information about
exposures and potential confounders? Active / Passive?
 Are the methods of data collection objective and independent of case/control status?
 What methods are in place to avert and monitor differential recall by case/control status if interviewing is involved?
 If study involves personnel-administered data collection, are the personnel masked to case-control status?
Summary
 What is the study question?
 Appropriate
 Duration of recruitment
 Definition of cases
 Prevalent cases
 Eligibility of controls
 Represent the target population
 Other sources