Transcript - MediPIET

Measures of
association
Tunisia, 29th October 2014
Prof. Nissaf Bouafif
Observatoire National des Maladies Nouvelles et Emergentes - Tunisia
[email protected]
Learning objectives
• Calculate and interpret a relative risk (RR)
• Calculate and interpret an odds ratio (OR)
• Calculate confidence intervals for RR and OR
• Interpret the CI and the p-value
Observational Studies
• Descriptive studies
– To estimate the frequency - magnitude
– Distribution: who, where and when
– Useful for generating hypotheses
• Analytical studies
– To identify factors associated with disease
– Quantify the risk
Measures of association (i)
• Relative Risk (RR)
– Cohort study
• Odds ratio (OR)
– Case control study
• Association statistically significant:
– Confidence Interval 95% (CI95%)
– P value
Analytical study: 2x2 table
              Disease
              Yes    No
Exposed        a      b
Not exposed    c      d
Measures of association
• Strength of association: RR / OR
• Significance of association: p-value
• Precision of association: CI
Measures of association (ii)
Position of the RR / OR and its 95% CI relative to 1:
• RR / OR < 1: protective factor
• RR / OR = 1: no association
• RR / OR > 1: risk factor
Measures of association

Cohort study
Case control study
Cohort study: measure
              Disease
              Yes    No     Incidence
Exposed        a      b     Ie  = a / (a + b)
Not exposed    c      d     Ine = c / (c + d)
Cohort study: measure
              Population at risk   Cases   Incidence
Exposed       Ne                   a       Ie  = a / Ne
Not exposed   Nne                  c       Ine = c / Nne
Cohort study: measure
Level of exposure   Population at risk   Cases   Incidence
High                N1                   a1      Ie1 = a1 / N1
Medium              N2                   a2      Ie2 = a2 / N2
Low                 N3                   a3      Ie3 = a3 / N3
Not exposed         Nne                  c       Ine = c / Nne
Cohort study: measure
Incidence Rate (or Attack Rate)
– Exposed group: Ie
– Not exposed group: Ine
→ How can we compare these two risks?
What can we calculate?
Measures in a cohort study
Comparisons:
• Excess risk (risk difference): RE = Ie – Ine
• Relative risk: RR = Ie / Ine = [a / (a + b)] / [c / (c + d)]
              Disease
              Yes    No
Exposed        a      b
Not exposed    c      d
Example
• Risk of gastric ulcer and long-term aspirin
treatment
– RR= 3.5
– CI 95% (1.8 – 5.4)
What does it mean?
Relative Risk (RR)
Indicates how many times more likely exposed people are
to develop the disease than people who are
not exposed to the specific factor
→ RR measures the strength of association
Interpretation of RR
Position of the RR and its 95% CI relative to 1:
• RR < 1: protective factor
• RR = 1: no association
• RR > 1: risk factor
Example (i)
              Disease   No disease   Total
Exposed          35         10         45
Not exposed      15         40         55
Total            50         50        100
Incidence in exposed = 35/45 = 0.78
Incidence in not exposed = 15/55 = 0.27
RR = (35/45) / (15/55) = 2.85
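The arithmetic of Example (i) can be checked with a few lines of Python. This is only a minimal sketch: the function names are illustrative and not part of the course material, and the cell labels a, b, c, d follow the cohort 2x2 table used in these slides.

```python
def risk(cases: int, population: int) -> float:
    """Incidence (attack rate) = cases / population at risk."""
    return cases / population


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = [a / (a + b)] / [c / (c + d)] for a cohort 2x2 table."""
    return risk(a, a + b) / risk(c, c + d)


def excess_risk(a: int, b: int, c: int, d: int) -> float:
    """Excess risk (risk difference) = Ie - Ine."""
    return risk(a, a + b) - risk(c, c + d)


# Counts from Example (i): exposed 35 ill / 10 not ill, not exposed 15 / 40.
a, b, c, d = 35, 10, 15, 40
print(f"Ie  = {risk(a, a + b):.2f}")
print(f"Ine = {risk(c, c + d):.2f}")
print(f"RR  = {relative_risk(a, b, c, d):.2f}")
print(f"RE  = {excess_risk(a, b, c, d):.2f}")
```

Computing the ratio from the raw counts rather than from the rounded incidences avoids rounding drift in the RR.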
Example (ii)
What is the risk of developing lung cancer if
you smoke?
              Lung cancer   No lung cancer
Smoker           1000            300
Non-smoker        200           2500
RR = 10.4; 95% CI = (9.06 – 11.91)
P value = 0.000001
Example (iii): collective food poisoning
in Mecca, 1979
Meat consumption   Ill: Yes   Ill: No   AR
Yes                   63         25     71.6%
No                     1          6     14.3%
RR = 71.6 / 14.3 = 5.0
Measures in a cohort study: relative risk
Exposure level   Population at risk   Cases   Incidence   RR
High             N1                   a1      Ie1         RR1
Medium           N2                   a2      Ie2         RR2
Low              N3                   a3      Ie3         RR3
Not exposed      Nne                  c       Ine         reference
Example: Outbreak of gastroenteritis, Gourdon, 2000
Number of drinks    Population at risk   Cases   AR      RR    95% CI
> 7 drinks / day     98                  56      57.1%   3.9   2.9-5.3
4-7 drinks / day    103                  45      43.7%   3.0   2.2-4.2
1-3 drinks / day     99                  30      30.3%   2.1   1.4-3.0
Not exposed         373                  54      14.5%   (reference)
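The attack rates and relative risks in the Gourdon table can be recomputed by comparing each exposure level with the not exposed group. The sketch below is an illustration only: the dictionary layout and function names are assumptions, and the counts are copied from the table.

```python
# (cases, population at risk) for each exposure level, from the Gourdon table.
LEVELS = {
    "> 7 drinks / day": (56, 98),
    "4-7 drinks / day": (45, 103),
    "1-3 drinks / day": (30, 99),
}
REFERENCE = (54, 373)  # not exposed group


def attack_rate(cases: int, population: int) -> float:
    """Attack rate = cases / population at risk."""
    return cases / population


def rr_by_level(levels: dict, reference: tuple) -> dict:
    """RR of each exposure level against the same reference group."""
    ar_ref = attack_rate(*reference)
    return {name: attack_rate(*counts) / ar_ref for name, counts in levels.items()}


for name, rr in rr_by_level(LEVELS, REFERENCE).items():
    ar = attack_rate(*LEVELS[name])
    print(f"{name}: AR = {ar:.1%}, RR = {rr:.1f}")
```

An RR that rises with the level of exposure, as here, is what a dose-response relationship looks like in such a table.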
Measure of association in Cohort
study
In summary:
• A cohort study allows the calculation of indicators
that have a clear meaning
• results are immediately intelligible
Measure of association

Cohort study
Case control study
Measure of association in case
control study
• no direct calculation of risk
• proportion of exposure
• RR estimation
What is Odds
Odds = (probability that an event will happen) / (probability that the event will not happen)
Example of Odds calculation
                        Won   Lost   Total
Football game, team A    14     1     15
Odds of winning = (14 / 15) / (1 / 15) = 14 : 1 = 14
Odds of a rare event
are approximately equal to the risk of the event
The number of hepatitis cases during an outbreak
               Cases   Non cases   Population
Hepatitis A     30      49,970      50,000
Odds of disease = (30 / 50,000) / (49,970 / 50,000) = 0.0006
Risk of disease = 30 / 50,000 = 0.0006
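A short numerical check shows why odds and risk nearly coincide for a rare event but not for a frequent one. This is a minimal sketch with illustrative function names; the counts are those of the hepatitis A and football examples.

```python
def risk(events: int, total: int) -> float:
    """Risk = events / total."""
    return events / total


def odds(events: int, total: int) -> float:
    """Odds = P(event) / P(no event) = events / non-events."""
    return events / (total - events)


# Rare event: 30 hepatitis A cases in a population of 50,000.
print(f"risk = {risk(30, 50_000):.6f}, odds = {odds(30, 50_000):.6f}")

# Frequent event: team A won 14 of 15 games.
print(f"risk = {risk(14, 15):.3f}, odds = {odds(14, 15):.1f}")
```

For the outbreak both values are about 0.0006, whereas for the football team the risk of winning is about 0.93 and the odds are 14.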
Odds Ratio (OR)
              Cases   Controls
Exposed         a        b
Not exposed     c        d
Total          a+c      b+d
Odds of exposure in cases = [a / (a + c)] / [c / (a + c)] = a / c
Odds of exposure in controls = [b / (b + d)] / [d / (b + d)] = b / d
OR = (a / c) / (b / d) = ad / bc
Odds Ratio (OR)
              Cases   Controls
Exposed         a        b
Not exposed     c        d
Total          a+c      b+d
Odds ratio (OR) = (odds of exposure among cases) / (odds of exposure among controls)
                = (a / c) / (b / d) = (a × d) / (c × b)
Odds Ratio
• Indicates how many times higher the odds of exposure
are among cases than among controls
• Similar interpretation as RR
Interpretation of odds ratio
Position of the OR and its 95% CI relative to 1:
• OR < 1: protective factor
• OR = 1: no association
• OR > 1: risk factor
OR = estimator of RR
Example: Collective food poisoning in Mecca, 1979
Meat consumption    Cases    Controls
Yes                   63        25
No                     1         6
Total                 64        31
Odds of exposure    63 / 1    25 / 6
OR = (63 / 1) / (25 / 6) = 15.1
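The odds ratio of the Mecca example can be reproduced from the four cells of the case-control table. The sketch below assumes the cell labels used in these slides (a and b exposed, c and d not exposed; a and c cases, b and d controls); the function name is illustrative.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a / c) / (b / d) = (a * d) / (b * c) for a case-control 2x2 table."""
    odds_exposure_cases = a / c
    odds_exposure_controls = b / d
    return odds_exposure_cases / odds_exposure_controls


# Mecca, 1979: meat eaters 63 cases / 25 controls, non-eaters 1 case / 6 controls.
print(f"OR = {odds_ratio(63, 25, 1, 6):.1f}")
```

The function divides by c and d, so a table with an empty unexposed cell would need separate handling.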
Different levels of exposure
Odds Ratio (OR)
Level of exposure   Cases   Controls   OR
High                a1      b1         OR1
Medium              a2      b2         OR2
Low                 a3      b3         OR3
Not exposed         c       d          reference
Example: Relationship between exposure to
hydrocarbons and bladder cancer, USA, 1984
Exposure time    Cases   Controls   OR    95% CI
> 19 years        18       27       7.2   2.4-24.3
10 to 19 years    24       50       5.2   1.9-16.6
1 to 9 years      37      113       3.6   1.4-10.8
< 1 year           6       65       reference
Relation between RR and OR
In a cohort study:
RR = [a / (a + b)] / [c / (c + d)]
If the disease is rare:
a << a + b, so a + b ≈ b
c << c + d, so c + d ≈ d
RR ≈ (a / b) / (c / d) = ad / bc = OR
RR and OR
In conclusion, if the disease is rare,
OR = good estimator of RR
Is the OR a good estimator of the RR?
In a cohort study:
RR = [a / (a + b)] / [c / (c + d)]
If the disease is rare:
a << a + b
c << c + d
then
RR ≈ (a / b) / (c / d) = ad / bc = OR
But in the Mecca example the disease is not rare: RR = 5.0 whereas OR = 15.1!
Is the OR a good estimator of the RR?
Mathematical relationship between OR and RR:
OR = RR × (1 – R0) / (1 – RR × R0)
R0 = frequency of the disease in the not exposed population
The rarer the disease and the lower the RR,
the better the OR estimates the relative risk
Compared values of OR and RR according to disease risk in the not exposed (R0):
              RR = 1.5    RR = 2.0    RR = 5.0
R0 = 0.001    OR = 1.5    OR = 2.0    OR = 5.0
R0 = 0.01     OR = 1.5    OR = 2.0    OR = 5.2
R0 = 0.02     OR = 1.5    OR = 2.0    OR = 5.4
R0 = 0.05     OR = 1.5    OR = 2.1    OR = 6.3
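The relationship OR = RR (1 - R0) / (1 - RR x R0) is easy to explore numerically. The following sketch recomputes the comparison table and then applies the same formula to the Mecca data, where the risk in the not exposed (1 in 7) is far from rare. The function name and the validity check are assumptions made for this illustration.

```python
def or_from_rr(rr: float, r0: float) -> float:
    """OR implied by a relative risk rr and a risk r0 in the not exposed."""
    r1 = rr * r0  # risk in the exposed
    if not (0 < r0 < 1 and 0 < r1 < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    return rr * (1 - r0) / (1 - rr * r0)


# Recompute the table: one row per R0, one column per RR.
for r0 in (0.001, 0.01, 0.02, 0.05):
    row = ", ".join(f"OR = {or_from_rr(rr, r0):.1f}" for rr in (1.5, 2.0, 5.0))
    print(f"R0 = {r0}: {row}")

# Mecca example: attack rates 63/88 in meat eaters and 1/7 in non-eaters.
r0_mecca = 1 / 7
rr_mecca = (63 / 88) / r0_mecca
print(f"RR = {rr_mecca:.1f}, OR = {or_from_rr(rr_mecca, r0_mecca):.1f}")
```

With a baseline risk of about 14% and an RR of about 5, the formula returns an OR of about 15, which is the gap seen between the cohort and case-control analyses of the same outbreak.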
Interpretation of measures
Association statistically significant
“The risk of developing lung cancer was about ten
times higher (RR=10.4) among smokers
compared to non-smokers”
→ How confident can we be in this result?
→ What is the precision of this estimate?
Interpretation of the p-value
Females report liking statistics more
frequently than men do (PR:1.3;
p=0.03)
• There is a 3% probability of observing this or a
greater difference between genders if there is
no true difference
Interpretation of the p-value
• P-values are a measure of probability
• Probability of observing this result
(OR/RR/RD) or a more extreme one, if there is
no true difference between groups
Interpretation of p-value
• The smaller the p-value the stronger the
evidence against the Null hypothesis
– Small p-value: Data are not very consistent with
the null hypothesis
– Large p-value: Reasonable chance that the
differences seen are due to sampling variation
(not much evidence against Null hypothesis)
Cut-off levels of significance
• 0.01, 0.05, 0.10
• P-value < 0.05: H0 rejected (significant)
• P-value > 0.05: H0 not rejected (non-significant)
• Degrades epidemiology into a simple
dichotomy: yes/no
– Reality is much more interesting
– The picture is rarely black or white
What information does the p-value
provide?
• The p-value does not tell us about the
– direction of association (protective/risk factor)
– magnitude of association
– precision around the point estimate
Confidence intervals
[Figure: Population → Sample → point estimate, confidence interval, p-value]
The epidemiologist needs measurements
rather than probabilities
The best estimate = point estimate
Confidence interval → precision of the point estimate
Confidence interval definition
• Range of values, on the basis of the sample data,
in which the population value (or true value)
may lie.
• Formal definition: If the measurement of the
estimate could be replicated many times, the
correct value is inside the interval 95% (or 90% or
80%...) of the time
• Pragmatic definition: We can be reasonably
confident that the correct value is inside the
confidence interval
Confidence Interval of RR
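Semi-exact method:
\[ \mathrm{CI}_{95\%} = RR \times \exp\left( \pm 1.96 \sqrt{ \frac{1 - R_1}{L_1 R_1} + \frac{1 - R_0}{L_0 R_0} } \right) \]
where R1 and R0 are the risks in the exposed and not exposed groups, and L1 and L0 are the sizes of these groups
Calculation: exp(A) = e^A, with e = 2.7183
If the calculation of the incidence rate is possible:
\[ \mathrm{CI}_{95\%} = RR \times \exp\left( \pm 1.96 \sqrt{ \frac{1}{a} + \frac{1}{c} } \right) \]
Miettinen method:
\[ \mathrm{CI}_{95\%} = RR^{\, 1 \pm 1.96 / \sqrt{\chi^2}} \]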
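As an illustration of the semi-exact (logarithmic) method, the sketch below computes the 95% CI of the RR for the smoking and lung cancer table of Example (ii). It assumes that the variance of ln(RR) is (1 - R1)/(N1 R1) + (1 - R0)/(N0 R0), with N1 and N0 the sizes of the exposed and not exposed groups; the function name is illustrative.

```python
import math


def rr_confidence_interval(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (RR, lower, upper) using the log-transformation method."""
    n1, n0 = a + b, c + d          # sizes of exposed / not exposed groups
    r1, r0 = a / n1, c / n0        # risks in exposed / not exposed
    rr = r1 / r0
    se_log_rr = math.sqrt((1 - r1) / (n1 * r1) + (1 - r0) / (n0 * r0))
    lower = rr * math.exp(-z * se_log_rr)
    upper = rr * math.exp(z * se_log_rr)
    return rr, lower, upper


# Example (ii): smokers 1000 ill / 300 not ill, non-smokers 200 ill / 2500 not ill.
rr, lower, upper = rr_confidence_interval(1000, 300, 200, 2500)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f} - {upper:.2f})")
```

With these counts the limits come out close to the interval of 9.06 to 11.91 quoted with the example.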
Confidence interval of OR
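• Semi-exact method:
\[ \mathrm{CI}_{95\%} = OR \times \exp\left( \pm 1.96 \sqrt{ \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d} } \right) \]
• Miettinen method:
\[ \mathrm{CI}_{95\%} = OR^{\, 1 \pm 1.96 / \sqrt{\chi^2}} \]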
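The semi-exact formula for the OR can be tried on the small chlordiazepoxide case-control table used later in this session (4 exposed cases, 4 exposed controls, 386 unexposed cases, 1 250 unexposed controls). This is a sketch under the assumption that the standard error of ln(OR) is sqrt(1/a + 1/b + 1/c + 1/d); the function name is illustrative.

```python
import math


def or_confidence_interval(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (OR, lower, upper) using the log-transformation method."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * math.exp(-z * se_log_or)
    upper = odds_ratio * math.exp(z * se_log_or)
    return odds_ratio, lower, upper


# Chlordiazepoxide and congenital heart disease, small study.
odds_ratio, lower, upper = or_confidence_interval(4, 4, 386, 1250)
print(f"OR = {odds_ratio:.1f}, 95% CI = ({lower:.2f} - {upper:.1f})")
```

The two exposed cells of only 4 subjects dominate the standard error, which is why the interval is so wide. Other methods, such as exact limits, give somewhat different bounds for such sparse tables.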
CI terminology
Indicates the amount of random error around the point estimate
RR = 1.45 (0.99 – 2.13)
• Point estimate: 1.45
• Confidence interval: 0.99 – 2.13
• Lower confidence limit: 0.99
• Upper confidence limit: 2.13
Width of confidence interval
depends on …
• amount of variability in the data
• size of the sample
• level of confidence (usually 90%, 95%, 99%)
Looking at the CI
[Figure: two confidence intervals, A and B, on an axis running from RR = 1 to large RR]
A common way to use the CI of an OR/RR is:
If 1.0 is included in the CI → not significant
If 1.0 is not included in the CI → significant
Norovirus on a Greek island
• How confident can we be in the result?
• Relative risk = 21.5 (point estimate)
• 95% CI for the relative risk:
(8.9 - 51.8)
We can be 95% confident that the interval from 8.9 to 51.8
includes the true relative risk.
Norovirus on a Greek island
“The risk of illness was higher among people
who ate raw seafood (RR=21.5, 95% CI 8.9 to
51.8).”
Example: Chlordiazepoxide use and
congenital heart disease (n=1 644)
              Cases   Controls
C use            4         4
No C use       386     1 250
OR = (4 x 1250) / (4 x 386) = 3.2
p = 0.080 ; 95% CI = 0.6 - 17.5
From Rothman K
Precision
• Wide confidence interval: low precision
– Small study
• Narrow confidence interval: high precision
– Large study
• Are all values in the confidence interval just as
likely?
P-value function
• A graph showing the p-value for all possible
values of the estimate (not just RR=1)
• All confidence intervals can be read from the
curve
• The stronger the association, the more to the
right is the peak
• The narrowness of the curve indicates the
precision
• The function can be constructed from the
confidence limits
[Figure: p-value function; the point estimate indicates the strength of the association, the 95% confidence interval indicates its precision]
Example: Chlordiazepoxide use and
congenital heart disease
              Cases   Controls
C use            4         4
No C use       386      1250
OR = (4 x 1250) / (4 x 386) = 3.2
p = 0.08 (not significant)
From Rothman K
So chlordiazepoxide use is safe?
OR = 3.2 (0.81 – 13)
”The confidence interval includes 1 so the
association is not significant”
[Figure: p-value function for the odds ratio; point estimate 3.2, 95% CI 0.81 - 13, p = 0.08]
Example: Chlordiazepoxide use and
congenital heart disease – large study
              Cases    Controls
C use          1090       1000
No C use     14 910     15 000
OR = (1090 x 15000) / (1000 x 14910) = 1.1
p = 0.04 (significant)
From Rothman K
Example: Chlordiazepoxide use and congenital
heart disease – large study (n=17 151)
              Cases   Controls
C use           240      211
No C use      7 900    8 800
OR = (240 x 8800) / (211 x 7900) = 1.3
p = 0.013 ; 95% CI = 1.1 - 1.5
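The p-values quoted for the small and large chlordiazepoxide studies can be approximated with a Pearson chi-square test on the 2x2 table. The sketch below is an assumption about the test used, not a statement of how the original figures were obtained: it applies no continuity correction and uses the identity p = erfc(sqrt(chi2 / 2)) for one degree of freedom. With only 4 subjects in two cells of the small study, an exact test would normally be preferred.

```python
import math


def chi2_p_value(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 df, no continuity correction) and two-sided p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


studies = {
    "small study (n=1 644)": (4, 4, 386, 1250),
    "large study (n=17 151)": (240, 211, 7900, 8800),
}
for name, (a, b, c, d) in studies.items():
    chi2, p_value = chi2_p_value(a, b, c, d)
    odds_ratio = (a * d) / (b * c)
    print(f"{name}: OR = {odds_ratio:.1f}, chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

The stronger association (OR of about 3.2) has the larger p-value because the small study is imprecise, while the weaker one (OR of about 1.3) is significant because the large study is precise.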
Precision and strength of association
[Figure: strength and precision of the association]
So chlordiazepoxide use is safe?
OR = 1.1 (1.0– 1.2)
”The confidence interval does not include 1
so the association is significant”
COMMON ERRORS
WHEN INTERPRETING P-VALUES
Be careful!
Common error 1
• Not taking the point estimates and CIs into
account in small studies
• P-value > 0.05: reject hypothesis
[Figure: RR estimates from 20 studies, on an axis with 0 and 1 marked]
Common error 2
• All findings with p<0.05 are assumed to be
due to a true association
• With a 0.05 significance level:
1 in 20 comparisons in which the null
hypothesis is true will give p<0.05
http://www.jerrydallal.com/LHSP/multtest.htm
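The "1 in 20" rule can be made concrete with a small calculation. The sketch below assumes that the comparisons are independent and that the null hypothesis is true for every one of them; under those assumptions the chance of at least one p < 0.05 is 1 - (1 - 0.05)^k. The function names are illustrative.

```python
def expected_false_positives(k: int, alpha: float = 0.05) -> float:
    """Expected number of p < alpha results among k true-null comparisons."""
    return k * alpha


def prob_at_least_one_false_positive(k: int, alpha: float = 0.05) -> float:
    """P(at least one p < alpha) for k independent true-null comparisons."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 20, 100):
    print(
        f"{k:>3} comparisons: expected false positives = "
        f"{expected_false_positives(k):.2f}, "
        f"P(at least one) = {prob_at_least_one_false_positive(k):.2f}"
    )
```

With 20 comparisons, one false positive is expected on average and the chance of seeing at least one is about 64%.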
Common error 3
• All statistically significant (p<0.05) findings
have public health importance
Common error 3
• Does vaccine A prevent disease A?
• 100,000 participants
• RR: 0.92; 95% CI: 0.91 – 0.93;
p-value< 0.001
• But: VE only 8%
– Not much disease prevented through vaccine
Confidence interval provides more
information than p value
• Magnitude of the effect
(strength of association)
• Direction of the effect
(RR > or < 1)
• Precision of the point estimate of the effect
(variability)
Comments on p-values and CIs
• Presence of significance does not prove clinical or
biological relevance of an effect
• A lack of significance is not necessarily a lack of
an effect:
“Absence of evidence is not evidence of absence”
Comments on p-values and CIs (ii)
• A huge effect in a small sample or a small effect in a
large sample can result in identical p values.
• A statistical test will always give a significant result if
the sample is big enough.
• p values and CIs do not provide any information on
the possibility that the observed association is due to
bias or confounding
Recommendations
• Always look at the raw data (2x2-table). How many
cases can be explained by the exposure?
• Interpret with caution associations that achieve
statistical significance
• Double caution if this statistical significance is not
expected
• Use confidence intervals to describe your results
2
What we have to evaluate the study
• p value: test of association, depends on sample size
– Probability that equal (or more extreme)
results can be observed by chance alone
• OR, RR: direction & strength of association
(independently from sample size)
– if > 1 risk factor
– if < 1 protective factor
• CI: magnitude and precision of effect
Suggested reading
• KJ Rothman, S Greenland, TL Lash, Modern Epidemiology,
Lippincott Williams & Wilkins, Philadelphia, PA, 2008
• SN Goodman, R Royall, Evidence and Scientific Research,
AJPH 78, 1568, 1988
• SN Goodman, Toward Evidence-Based Medical Statistics.
1: The P Value Fallacy, Ann Intern Med. 130, 995, 1999
• C Poole, Low P-Values or Narrow Confidence Intervals:
Which are more Durable? Epidemiology 12, 291, 2001
Presentations
• IntoEpi, Alain Moren, Preben Aavitsland,Esther
Kissling, EpiConcept
• CEC Méthodologie statistique et épidémiologie,
Faculté de Médecine Tunis
• EpiTun
• IDEA
• T Ancelle / A Bosman / D Coulombier / A Moren / P
Sudre / M Valenciano/ J Fitzner/ S Hahné, P
Penttinen/ P Kreidl / B Schimmer
Thank you!
Prof. Nissaf Bouafif
Observatoire National des Maladies Nouvelles et Emergentes - Tunisia
[email protected]