Transcript - MediPIET
Measures of association
Tunisia, 29th October 2014
Prof. Nissaf Bouafif
Observatoire National des Maladies Nouvelles et Emergentes - Tunisia
[email protected]
Learning objectives
• Calculate and interpret relative risk (RR)
• Calculate and interpret an odds ratio (OR)
• Calculate confidence intervals for RR and OR
• Interpret CI and p-value
Observational Studies
• Descriptive studies
– To estimate the frequency - magnitude
– Distribution: who, where and when
– Useful for generating hypotheses
• Analytical studies
– To identify factors associated with disease
– Quantify the risk
Measures of association (i)
• Relative Risk (RR)
– Cohort study
• Odds ratio (OR)
– Case control study
• Association statistically significant:
– Confidence Interval 95% (CI95%)
– P value
Analytical study: 2x2 table
              Disease: Yes   Disease: No
Exposed            a              b
Not exposed        c              d
Measures of association
• Strength of association: RR / OR
• Significance of association: p value
• Precision of association: CI
Measures of association (ii)
• RR / OR < 1: protective factor
• RR / OR = 1: no association
• RR / OR > 1: risk factor
[Figure: CI 95% drawn on an RR / OR axis starting at 0, with the reference value at 1]
Measures of association
Cohort study
Case control study
Cohort study: measure
              Disease: Yes   Disease: No   Incidence
Exposed            a              b        Ie = a / (a + b)
Not exposed        c              d        Ine = c / (c + d)
Cohort study: measure
              Population at risk   Cases   Incidence
Exposed              Ne              a     Ie = a / Ne
Not exposed          Nne             c     Ine = c / Nne
Cohort study: measure
Level of exposure   Population at risk   Cases   Incidence
High                       N1             a1     Ie1 = a1/N1
Medium                     N2             a2     Ie2 = a2/N2
Low                        N3             a3     Ie3 = a3/N3
Not exposed                Nne            c      Ine = c/Nne
Cohort study: measure
Incidence Rate (or Attack Rate)
– Exposed group: Ie
– Not exposed group: Ine
How to compare these two risks?
What can we calculate?
Measures in a cohort study
Comparisons:
• Risk excess: RE = Ie – Ine
• Relative risk: RR = Ie / Ine = [a / (a + b)] / [c / (c + d)]
Example
• Risk of gastric ulcer and long-term aspirin
treatment
– RR= 3.5
– CI 95% (1.8 – 5.4)
What does it mean?
Relative Risk (RR)
Indicates how many times more likely people exposed to a
specific factor are to develop the disease than
non-exposed people
RR measures the strength of association
Interpretation of RR
• RR < 1: protective factor
• RR = 1: no association
• RR > 1: risk factor
[Figure: CI 95% drawn on an RR axis starting at 0, with the reference value at 1]
Example (i)
              Disease   No disease   Total
Exposed          35         10         45
Not exposed      15         40         55
Total            50         50
Incidence exp = 35/45 = 0.78
Incidence non exp = 15/55 = 0.27
RR = (35/45) / (15/55) = 2.85
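The calculation in this example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original course material; the function name `cohort_measures` is hypothetical, and the cell labels a, b, c, d follow the 2x2 table used throughout.

```python
def cohort_measures(a, b, c, d):
    """Return (Ie, Ine, RR, RE) for a cohort 2x2 table.

    a = exposed, diseased        b = exposed, not diseased
    c = not exposed, diseased    d = not exposed, not diseased
    """
    ie = a / (a + b)    # incidence among the exposed
    ine = c / (c + d)   # incidence among the not exposed
    rr = ie / ine       # relative risk
    re = ie - ine       # risk excess (risk difference)
    return ie, ine, rr, re


# Example (i): 35 / 10 among the exposed, 15 / 40 among the not exposed
ie, ine, rr, re = cohort_measures(35, 10, 15, 40)
print(f"Ie = {ie:.2f}, Ine = {ine:.2f}, RR = {rr:.2f}, RE = {re:.2f}")
# Ie = 0.78, Ine = 0.27, RR = 2.85, RE = 0.51
```

Working from the unrounded incidences avoids the small drift that appears when 0.78 is divided by 0.27 by hand.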
Example (ii)
What is the risk of developing lung cancer if
you smoke?
              Lung cancer   No lung cancer
Smoker            1000            300
Non smoker         200           2500
RR = (1000/1300) / (200/2700) = 10.4 ; CI95% = (9.06-11.91)
P value = 0.000001
Example (iii): foodborne outbreak in Mecca, 1979
Meat consumption   Disease: Yes   Disease: No   AR
Yes                     63             25       71.6%
No                       1              6       14.3%
RR = 71.6 / 14.3 = 5.0
Measures in a cohort study: relative risk
Exposure level   Population at risk   Cases   Incidence   RR
High                    N1              a1       Ie1      RR1
Medium                  N2              a2       Ie2      RR2
Low                     N3              a3       Ie3      RR3
Not exposed             Nne             c        Ine      reference
Example: Outbreak of gastroenteritis, Gourdon, 2000
Number of drinks    Population at risk   Cases   AR      RR    CI95%
> 7 drinks / day           98             56     57.1%   3.9   2.9-5.3
4-7 drinks / day          103             45     43.7%   3.0   2.2-4.2
1-3 drinks / day           99             30     30.3%   2.1   1.4-3.0
Not exposed               373             54     14.5%
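The attack rates and relative risks in the Gourdon table can be reproduced with a short sketch in which every exposure level is compared with the not exposed group. The function name `relative_risks_by_level` and the dictionary layout are illustrative assumptions, not something defined in the course.

```python
def relative_risks_by_level(levels, reference):
    """Attack rate and RR for each exposure level versus the not exposed group.

    levels:    dict mapping level name -> (cases, population at risk)
    reference: (cases, population at risk) of the not exposed group
    """
    ref_cases, ref_population = reference
    ine = ref_cases / ref_population          # attack rate in the not exposed
    results = {}
    for name, (cases, population) in levels.items():
        attack_rate = cases / population      # attack rate at this level
        results[name] = (attack_rate, attack_rate / ine)
    return results


gourdon = {
    "> 7 drinks / day": (56, 98),
    "4-7 drinks / day": (45, 103),
    "1-3 drinks / day": (30, 99),
}
for name, (ar, rr) in relative_risks_by_level(gourdon, (54, 373)).items():
    print(f"{name}: AR = {ar:.1%}, RR = {rr:.1f}")
# > 7 drinks / day: AR = 57.1%, RR = 3.9
# 4-7 drinks / day: AR = 43.7%, RR = 3.0
# 1-3 drinks / day: AR = 30.3%, RR = 2.1
```

The RR rising with the number of drinks is the dose-response pattern the table is meant to show.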
Measure of association in cohort study
In summary:
• A cohort study allows the calculation of indicators
that have a clear meaning
• Results are immediately intelligible
Measure of association
Cohort study
Case control study
Measure of association in case control study
• no direct calculation of risk
• proportion of exposure
• RR estimation
What is Odds
Odds = (probability that an event will happen) / (probability that the event will not happen)
Example of Odds calculation
                        Won   Lost   Total
Football game, team A    14     1     15
Odds of winning = (14 / 15) / (1 / 15) = 14 : 1 = 14
Odds of a rare event
is approximately equal to the risk of the event
The number of hepatitis cases during an outbreak
              Cases   Non cases   Population
Hepatitis A     30     49,970      50,000
Odds of disease = (30 / 50,000) / (49,970 / 50,000) = 0.0006
Risk of disease = 30 / 50,000 = 0.0006
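A small sketch makes the difference between odds and risk concrete for the two examples above. The function names `risk` and `odds` are illustrative only.

```python
def risk(cases, population):
    """Risk = cases / whole population."""
    return cases / population


def odds(cases, population):
    """Odds = cases / non cases."""
    return cases / (population - cases)


# Rare event (hepatitis A outbreak): odds and risk almost coincide
print(f"risk = {risk(30, 50_000):.6f}")   # risk = 0.000600
print(f"odds = {odds(30, 50_000):.6f}")   # odds = 0.000600

# Frequent event (team A won 14 of 15 games): odds and risk differ widely
print(f"risk = {risk(14, 15):.2f}")       # risk = 0.93
print(f"odds = {odds(14, 15):.2f}")       # odds = 14.00
```

The approximation "odds = risk" therefore holds only while the number of non cases is close to the whole population.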
Odds Ratio (OR)
              Cases   Controls
Exposed         a        b
Not exposed     c        d
Total          a+c      b+d
Odds of exposure in cases = [a / (a + c)] / [c / (a + c)] = a / c
Odds of exposure in controls = [b / (b + d)] / [d / (b + d)] = b / d
OR = (a / c) / (b / d) = ad / bc
Odds Ratio (OR)
              Cases   Controls
Exposed         a        b
Not exposed     c        d
Total          a+c      b+d
Odds ratio (OR) = (odds of exposure among cases) / (odds of exposure among controls)
                = (a / c) / (b / d) = (a * d) / (c * b)
Odds Ratio
• Indicates how many times higher the odds of exposure are
among cases than among controls
• Similar interpretation as RR
Interpretation of odds ratio
• OR < 1: protective factor
• OR = 1: no association
• OR > 1: risk factor
[Figure: CI 95% drawn on an OR axis starting at 0, with the reference value at 1]
OR = estimator of RR
Example: Foodborne outbreak in Mecca, 1979
Meat consumption   Cases    Controls
Yes                  63        25
No                    1         6
Total                64        31
Odds of exposure   63 / 1    25 / 6
OR = (63 / 1) / (25 / 6) = 15.1
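The Mecca odds ratio can be checked with the cross-product formula OR = ad / bc. This is a minimal sketch; the function name `odds_ratio` is an assumption, and a, b, c, d are the cells of the case control table (a = exposed cases, b = exposed controls, c = not exposed cases, d = not exposed controls).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio as the ratio of the odds of exposure in cases and controls."""
    odds_cases = a / c        # odds of exposure among cases
    odds_controls = b / d     # odds of exposure among controls
    return odds_cases / odds_controls


# Meat consumption in Mecca, 1979: 63 and 1 cases, 25 and 6 controls
print(f"OR = {odds_ratio(63, 25, 1, 6):.1f}")   # OR = 15.1
```

Note that the function fails with a division by zero when c or b is 0; a table with an empty cell needs a different approach.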
Different levels of exposure
Odds Ratio (OR)
Level of exposure   Cases   Controls   OR
High                 a1       b1       OR1
Medium               a2       b2       OR2
Low                  a3       b3       OR3
Not exposed          c        d        reference
Example: Relationship between exposure to hydrocarbons and bladder cancer, USA, 1984
Exposure time     Cases   Controls   OR    CI95%
> 19 years          18       27      7.2   2.4-24.3
10 to 19 years      24       50      5.2   1.9-16.6
1 to 9 years        37      113      3.6   1.4-10.8
< 1 year             6       65      reference
Relation between RR and OR
In a cohort study:
RR = [a / (a + b)] / [c / (c + d)]
If the disease is rare:
a << a + b, so a + b ≈ b
c << c + d, so c + d ≈ d
RR ≈ (a / b) / (c / d) = ad / bc = OR
RR and OR
in conclusion, if the disease is rare
OR = good estimator of RR
Is the OR a good estimator of the RR?
In a cohort study:
RR = [a / (a + b)] / [c / (c + d)]
If the disease is rare:
a << a + b, so a + b ≈ b
c << c + d, so c + d ≈ d
then
RR ≈ (a / b) / (c / d) = ad / bc = OR
But in the Mecca example, where the disease is not rare:
RR = 5.0 and OR = 15.1 !!!!
Is the OR a good estimator of the RR?
Mathematical relationship between OR and RR:
OR = RR (1 - R0) / (1 - (RR x R0))
R0 = frequency of the disease in the not exposed population
The rarer the disease and the lower the RR,
the better the OR estimates the relative risk
              RR = 1.5    RR = 2.0    RR = 5.0
R0 = 0.001    OR = 1.5    OR = 2.0    OR = 5.0
R0 = 0.01     OR = 1.5    OR = 2.0    OR = 5.2
R0 = 0.02     OR = 1.5    OR = 2.0    OR = 5.4
R0 = 0.05     OR = 1.5    OR = 2.1    OR = 6.3
Compared values of OR and RR according to disease risk
in the not exposed
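The relationship OR = RR (1 - R0) / (1 - RR x R0) is easy to tabulate. The sketch below regenerates the comparison for the four values of R0 and then applies the same formula to the Mecca outbreak, where the attack rate in the not exposed was 1/7. The function name `or_from_rr` is illustrative.

```python
def or_from_rr(rr, r0):
    """Odds ratio implied by a relative risk rr and a baseline risk r0."""
    if not 0 < r0 < 1 or rr * r0 >= 1:
        raise ValueError("r0 and rr * r0 must both be risks below 1")
    return rr * (1 - r0) / (1 - rr * r0)


for r0 in (0.001, 0.01, 0.02, 0.05):
    row = [round(or_from_rr(rr, r0), 1) for rr in (1.5, 2.0, 5.0)]
    print(f"R0 = {r0}: OR = {row}")
# R0 = 0.001: OR = [1.5, 2.0, 5.0]
# R0 = 0.01: OR = [1.5, 2.0, 5.2]
# R0 = 0.02: OR = [1.5, 2.0, 5.4]
# R0 = 0.05: OR = [1.5, 2.1, 6.3]

# Mecca, 1979: risk 63/88 in the exposed, 1/7 in the not exposed
r0 = 1 / 7
rr = (63 / 88) / r0
print(f"RR = {rr:.1f}, OR = {or_from_rr(rr, r0):.1f}")   # RR = 5.0, OR = 15.1
```

With a baseline risk of about 14%, the disease is far from rare, which accounts for the gap between RR = 5.0 and OR = 15.1 in that outbreak.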
Interpretation of measures
Association statistically significant
“The risk of developing lung cancer was about ten
times higher (RR=10.4) among smokers
compared to non-smokers”
How confident can we be in this result?
What is the precision of this estimate?
Interpretation of the p-value
Females report liking statistics more
frequently than men do (PR:1.3;
p=0.03)
• There is a 3% probability of observing this or a
greater difference between genders if there is
no true difference
Interpretation of the p-value
• P-values are a measure of probability
• Probability of observing this result
(OR/RR/RD) or a more extreme one, if there is
no true difference between groups
Interpretation of p-value
• The smaller the p-value the stronger the
evidence against the Null hypothesis
– Small p-value: Data are not very consistent with
the null hypothesis
– Large p-value: Reasonable chance that the
differences seen are due to sampling variation
(not much evidence against Null hypothesis)
Cut-off levels of significance
• 0.01, 0.05, 0.10
• P-value < 0.05: H0 rejected (significant)
• P-value > 0.05: H0 not rejected (non-significant)
• Degrades epidemiology into a simple
dichotomy: yes/no
– Reality is much more interesting
– The picture is rarely black or white
What information does the p-value
provide?
• The p-value does not tell us about the
– direction of association (protective/risk factor)
– magnitude of association
– precision around the point estimate
Confidence intervals
[Diagram: a sample drawn from the population gives a point estimate, a confidence interval and a p-value]
• The epidemiologist needs measurements
rather than probabilities
• The best estimate = point estimate
• Confidence interval = precision of the point estimate
Confidence interval definition
• Range of values, on the basis of the sample data,
in which the population value (or true value)
may lie.
• Formal definition: If the measurement of the
estimate could be replicated many times, the
correct value is inside the interval 95% (or 90% or
80%...) of the time
• Pragmatic definition: We can be reasonably
confident that the correct value is inside the
confidence interval
Confidence Interval of RR
Semi exact method:
CI95% RR = exp( ln(RR) ± 1.96 x sqrt( (1 - R1) / (L1 x R1) + (1 - R0) / (L0 x R0) ) )
calculation: exp(A) = e^A, with e = 2.7183
If the calculation of the incidence rate is possible:
CI95% RR = exp( ln(RR) ± 1.96 x sqrt( 1/a + 1/c ) )
Miettinen method:
CI95% RR = RR^(1 ± 1.96 / sqrt(χ²))
Confidence interval of OR
• Semi exact method:
CI95% OR = exp( ln(OR) ± 1.96 x sqrt( 1/a + 1/b + 1/c + 1/d ) )
• Miettinen method:
CI95% OR = OR^(1 ± 1.96 / sqrt(χ²))
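As a hedged illustration of how such intervals are computed in practice, the sketch below uses the usual log-transformation method: the standard error of ln(RR) is sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)) and that of ln(OR) is sqrt(1/a + 1/b + 1/c + 1/d). The function names `rr_ci` and `or_ci` are assumptions for this example, and the Miettinen (test-based) method is not covered.

```python
import math

Z95 = 1.96  # standard normal quantile for a 95% interval


def rr_ci(a, b, c, d, z=Z95):
    """Relative risk and its CI from a cohort 2x2 table (log method)."""
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


def or_ci(a, b, c, d, z=Z95):
    """Odds ratio and its CI from a 2x2 table (log method)."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)


# Smoking and lung cancer table: 1000 / 300 smokers, 200 / 2500 non smokers
print("RR = %.1f (%.2f - %.2f)" % rr_ci(1000, 300, 200, 2500))

# Mecca, 1979 as a case control table: 63 / 25 exposed, 1 / 6 not exposed
print("OR = %.1f (%.1f - %.1f)" % or_ci(63, 25, 1, 6))
```

For the lung cancer table this gives limits of about 9.06 and 11.91. The interval for the Mecca odds ratio comes out very wide, because one cell of the table contains a single person.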
CI terminology
Indicates amount of random error around the point estimate
RR = 1.45 (0.99 – 2.13)
• Point estimate: 1.45
• Lower confidence limit: 0.99
• Upper confidence limit: 2.13
Width of confidence interval depends on …
• amount of variability in the data
• size of the sample
• level of confidence (usually 90%, 95%, 99%)
Looking at the CI
[Figure: confidence intervals of two studies, A and B, on an axis running from RR = 1 to large RR]
A common way to use the CI for an OR/RR is:
• If 1.0 is included in the CI: non significant
• If 1.0 is not included in the CI: significant
Norovirus on a Greek island
• How confident can we be in the result?
• Relative risk = 21.5 (point estimate)
• 95% CI for the relative risk:
(8.9 - 51.8)
We can be 95% confident that the interval from 8.9 to 51.8
includes the true relative risk.
Norovirus on a Greek island
“The risk of illness was higher among people
who ate raw seafood (RR=21.5, 95% CI 8.9 to
51.8).”
Example: Chlordiazepoxide use and
congenital heart disease (n = 1 644)
            C use   No C use
Cases         4        386
Controls      4      1 250
OR = (4 x 1250) / (4 x 386) = 3.2
p = 0.080 ; 95% CI = 0.6 - 17.5
From Rothman K
Precision
• Wide confidence interval: low precision
– Small study
• Narrow confidence interval: high precision
– Large study
• Are all values in the confidence interval just as
likely?
P-value function
• A graph showing the p-value for all possible
values of the estimate (not just RR=1)
• All confidence intervals can be read from the
curve
• The stronger the association, the more to the
right is the peak
• The narrowness of the curve indicates the
precision
• The function can be constructed from the
confidence limits
[Figure: p-value function; the point estimate at the peak shows the strength of the association, the width of the 95% confidence interval shows the precision]
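One way to construct an approximate p-value function from a point estimate and its 95% confidence limits is sketched below. It assumes the estimate is a ratio (RR or OR) whose logarithm is roughly normally distributed; the function names are hypothetical, and the numbers are taken from the "CI terminology" slide, RR = 1.45 (0.99 – 2.13).

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def p_value_function(estimate, lower, upper, z=1.96):
    """Return p(theta): two-sided p-value for any hypothesised ratio theta.

    The standard error of ln(estimate) is recovered from the width of
    the 95% confidence interval on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)

    def p(theta):
        z_score = abs(math.log(estimate) - math.log(theta)) / se
        return 2 * (1 - normal_cdf(z_score))

    return p


p = p_value_function(1.45, 0.99, 2.13)
for theta in (0.99, 1.0, 1.45, 2.13):
    print(f"hypothesised RR = {theta}: p = {p(theta):.3f}")
```

The curve peaks at p = 1 at the point estimate and falls to about 0.05 at the two confidence limits. At RR = 1 it gives a p-value slightly above 0.05, in line with an interval that only just includes 1.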
Example: Chlordiazepoxide use and
congenital heart disease
            C use   No C use
Cases         4        386
Controls      4       1250
OR = (4 x 1250) / (4 x 386) = 3.2
p = 0.08 (not significant)
From Rothman K
So chlordiazepoxide use is safe?
OR = 3.2 (0.81 – 13)
”The confidence interval includes 1 so the
association is not significant”
[Figure: p-value function for the odds ratio, peaking at OR = 3.2, with 95% CI 0.81 - 13 and p = 0.08 at OR = 1]
Example: Chlordiazepoxide use and
congenital heart disease – large study
            C use   No C use
Cases        1090     14 910
Controls     1000     15 000
OR = (1090 x 15000) / (1000 x 14910) = 1.1
p = 0.04 (significant)
From Rothman K
[Figure: small study, OR = 3.2, p = 0.080, 95% CI 0.6 – 17.5]
Example: Chlordiazepoxide use and congenital
heart disease – large study (n = 17 151)
            C use   No C use
Cases         240     7 900
Controls      211     8 800
OR = (240 x 8800) / (211 x 7900) = 1.3
p = 0.013 ; 95% CI = 1.1 - 1.5
Precision and strength of association
[Figure: the position of the point estimate shows the strength, the width of the interval shows the precision]
So chlordiazepoxide use is safe?
OR = 1.1 (1.0– 1.2)
”The confidence interval does not include 1
so the association is significant”
COMMON ERRORS
WHEN INTERPRETING P-VALUES
Be careful!
Common error 1
• Not taking the point estimates and CIs into
account in small studies
• P-value > 0.05: reject hypothesis
[Figure: RR estimates from 20 studies on an axis starting at 0, with the reference value at 1]
Common error 2
• All findings with p<0.05 are assumed to be
due to a true association
• With a 0.05 significance level:
1 in 20 comparisons in which the null
hypothesis is true will give p<0.05
http://www.jerrydallal.com/LHSP/multtest.htm
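The "1 in 20" argument can be made concrete with a short calculation. The sketch assumes that the comparisons are independent and that the null hypothesis is true for every one of them; the function name is illustrative.

```python
def prob_at_least_one_false_positive(n_comparisons, alpha=0.05):
    """Chance that at least one of n independent true-null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_comparisons


for n in (1, 5, 20, 100):
    p_any = prob_at_least_one_false_positive(n)
    expected = n * 0.05   # expected number of false positives
    print(f"{n:>3} comparisons: P(at least one p<0.05) = {p_any:.2f}, "
          f"expected false positives = {expected:.2f}")
```

With 20 comparisons the chance of at least one spurious "significant" finding is about 64%, even though nothing real is going on.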
Common error 3
• All statistically significant (p<0.05) findings
have public health importance
Common error 3
• Does vaccine A prevent disease A?
• 100,000 participants
• RR: 0.92; 95% CI: 0.91 – 0.93;
p-value< 0.001
• But: VE only 8%
– Not much disease prevented through vaccine
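The vaccine effectiveness quoted in this example follows from VE = 1 - RR. A minimal sketch, with a hypothetical function name, also converts the confidence limits of the RR into limits for VE.

```python
def vaccine_effectiveness(rr):
    """Vaccine effectiveness in percent, from the RR of vaccinated vs unvaccinated."""
    return (1 - rr) * 100


rr, rr_low, rr_high = 0.92, 0.91, 0.93
ve = vaccine_effectiveness(rr)
# The upper RR limit gives the lower VE limit, and vice versa
ve_low = vaccine_effectiveness(rr_high)
ve_high = vaccine_effectiveness(rr_low)
print(f"VE = {ve:.0f}% (95% CI {ve_low:.0f}% to {ve_high:.0f}%)")
# VE = 8% (95% CI 7% to 9%)
```

The interval is very narrow and the p-value very small, yet the vaccine prevents only about 8% of the disease: precision and statistical significance say nothing about the size of the benefit.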
Confidence interval provides more
information than p value
• Magnitude of the effect
(strength of association)
• Direction of the effect
(RR > or < 1)
• Precision of the point estimate of the effect
(variability)
Comments on p-values and CIs
• Presence of significance does not prove clinical or
biological relevance of an effect
• A lack of significance is not necessarily a lack of
an effect:
“Absence of evidence is not evidence of absence”
Comments on p-values and CIs (ii)
• A huge effect in a small sample or a small effect in a
large sample can result in identical p values.
• A statistical test will always give a significant result if
the sample is big enough.
• p values and CIs do not provide any information on
the possibility that the observed association is due to
bias or confounding
Recommendations
• Always look at the raw data (2x2-table). How many
cases can be explained by the exposure?
• Interpret with caution associations that achieve
statistical significance
• Double caution if this statistical significance is not
expected
• Use confidence intervals to describe your results
What we have to evaluate the study
• p value
– Test of association, depends on sample size
– Probability that equal (or more extreme)
results can be observed by chance alone
• OR, RR
– Direction & strength of association
(independently of sample size)
– if > 1 risk factor
– if < 1 protective factor
• CI
– Magnitude and precision of effect
Suggested reading
• KJ Rothman, S Greenland, TL Lash, Modern Epidemiology,
Lippincott Williams & Wilkins, Philadelphia, PA, 2008
• SN Goodman, R Royall, Evidence and Scientific Research,
AJPH 78, 1568, 1988
• SN Goodman, Toward Evidence-Based Medical Statistics.
1: The P Value Fallacy, Ann Intern Med. 130, 995, 1999
• C Poole, Low P-Values or Narrow Confidence Intervals:
Which are more Durable? Epidemiology 12, 291, 2001
Presentations
• IntoEpi, Alain Moren, Preben Aavitsland,Esther
Kissling, EpiConcept
• CEC Méthodologie statistique et épidémiologie,
Faculté de Médecine Tunis
• EpiTun
• IDEA
• T Ancelle / A Bosman / D Coulombier / A Moren / P
Sudre / M Valenciano/ J Fitzner/ S Hahné, P
Penttinen/ P Kreidl / B Schimmer
Thank you!