Statistics 101
Why statistics?
• To understand studies in clinical journals.
• To design and analyze clinical research studies.
• To be better able to explain epidemiologic
research to patients.
• To answer questions on board examinations.
Types of Clinical Research Studies
• Cohort: all patients have some condition or something in
common (e.g., healthy and living in Framingham, MA)
• Case-Control: cases have some condition; controls do not
– Often an aspect of a cohort study, in which controls are ‘matched’ with cases for age, gender,
and sometimes other variables such as date of admission or date of encounter
• Randomized, placebo-controlled treatment trial: all patients
have the condition
• May be unblinded, single blinded or double blinded
• Randomized, active-treatment controlled trial: all patients
have the condition
• Often a phase 3 trial
• Meta-analysis: multiple studies of the same condition, although the
definition of the condition may vary from study to study
Types of Variables
CONTINUOUS
– AGE
– BP
– CRP
– AST, CK, glucose, etc.
– HEIGHT
– WEIGHT
– BMI
– Etc.
CATEGORICAL
– GENDER
– OBESE
– CURE
– MI
– RACE
– OLD vs YOUNG
– Etc.
Basic Statistical Terms
• Range: the two extreme values (min and max)
• Mean: the average value (uses all values)
• Median: the middle value (ignores extreme values), which
divides the population into two equal-sized subgroups
• Quartiles: divide all values into 4 equal-sized groups
– Tertiles, Quintiles, Percentiles
• Standard deviation (SD): measures the degree of
difference among all values (uses all values)
$SD = \sqrt{\sum d^2 / (n-1)}$, where d is each value's difference from the mean
A simple example of standard deviation
Values (n=5)    Difference from mean, d    d²
12              2                          4
10              0                          0
5               5                          25
15              5                          25
8               2                          4
Mean=10                                    Σd²=58
Median=?
$\sum d^2 / (n-1) = 58/4 = 14.5$
$\sqrt{14.5} = 3.8$
SD = 3.8
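The arithmetic in this example can be checked with a short Python sketch. It assumes only the five values shown on the slide; the variable names are illustrative and not part of the original slides.

```python
import math
import statistics

values = [12, 10, 5, 15, 8]  # the five values from the slide

mean = statistics.mean(values)
median = statistics.median(values)

# Sum the squared differences from the mean, divide by n - 1, take the square root.
sum_sq = sum((v - mean) ** 2 for v in values)
sd_manual = math.sqrt(sum_sq / (len(values) - 1))

# statistics.stdev applies the same n - 1 (sample) formula.
sd_library = statistics.stdev(values)

print("mean:", mean)
print("median:", median)
print("sum of squared differences:", sum_sq)
print("SD (by hand):", round(sd_manual, 1))
print("SD (library):", round(sd_library, 1))
```

Computing the SD both by hand and with the library call is a quick way to confirm that the n - 1 denominator is the one being used.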
Serum [Na+] in 135 normals
[Figure: serum Na (mM, y-axis) plotted against subject number (x-axis)]
Mean, 140; median, 140; range, 135-145 mM; standard deviation, 2
The normal (bell-shaped) distribution
[Figure: bell-shaped curve centered on the mean; x-axis, standard deviations (SD) from the mean]
• Imagine 2 curves with the same mean but different SDs (one
wider and less precise; the other narrower and more precise)
– Confidence intervals will differ
• Now imagine two curves whose means and standard
deviations differ from those of this curve
– Statistical tests are designed to tell us to what extent these
different curves could have occurred by chance
95% of values are within 1.96 SD of the mean
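The 1.96 figure can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `fraction_within` is an illustrative assumption, not something from the slides.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normally distributed population within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")
```

The same call works for any normal curve, because the fraction depends only on the number of SDs from the mean, not on the mean or SD themselves.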
Some important statistical concepts
• Confidence intervals (usually reported as 95% CI)
• Number needed to treat (or harm)
• Absolute and relative risk or benefit reductions (or increases)
• 2-by-2 tables (Chi square, Fisher exact, Mantel Haenszel, others)
• Odds or hazard ratios
• Type 1 and 2 errors (Statistics 102)
• Estimating sample size needed for a study (Statistics 102)
• Pre- and post-test probabilities and likelihood ratios (Statistics 102)
Ann Int Med 2009: 150: JC6-16
95% Confidence interval (CI): Example 1
H. pylori eradication/NSAID study with outcome
of ulcer or no ulcer (categorical outcome):
5 of 51 (10%, or .10) Hp+ pts. who received
antibiotics got ulcers when exposed to NSAID.
… and 15 of 49 (31%, or .31) Hp+ pts. who did
not receive antibiotics got ulcers when exposed
to NSAID.
How likely is it that this difference in outcome
occurred by chance rather than because of the antibiotics?
Lancet 2002; 359:9-13.
95% CIs
The proportions, p1 and p2, of patients who
got ulcers in the 2 groups are estimates of
the true rates. However, from each estimate
we can be 95% confident that the actual
rate lies between A and B, with p1 (or p2) in
the center of the interval from A to B. A and
B are the 95% confidence limits.
[Figure: number line with p1 centered between A and B; A→B is the 95% confidence interval]
95% Confidence interval (CI)
To calculate the 95% CI for p (i.e., A and B),
use this formula:
$p \pm 1.96 \sqrt{p(1-p)/n}$
The larger the n, which is in the denominator, the smaller (more precise) the CI
5 of 51 (p1=10%, or .10) of the antibiotic group got
ulcers when exposed to NSAID for a fixed time
– 95% CI = $.10 \pm 1.96 \sqrt{(.1)(.9)/51}$ = .10 ± .08 = [.02, .18], or [2%, 18%]
15 of 49 (p2=31%, or .31) of the placebo group got
ulcers when exposed to NSAID for a fixed time
– 95% CI = $.31 \pm 1.96 \sqrt{(.31)(.69)/49}$ = .31 ± .13 = [.18, .44], or [18%, 44%]
Note: the two 95% CIs meet at about 18% but otherwise do not overlap, which suggests
that the difference is unlikely to be due to chance. But is the ARR significant?
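The formula above can be sketched in a few lines of Python. The function name `proportion_ci` and its parameters are illustrative assumptions. The sketch uses the unrounded proportions (5/51 and 15/49), so its limits can differ slightly in the third decimal place from the slide, which rounds p to .10 and .31 first.

```python
import math

def proportion_ci(events, n, z=1.96):
    """Return (p, lower, upper) for p ± z * sqrt(p(1-p)/n)."""
    p = events / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

# Antibiotic group: 5 of 51 got ulcers. Placebo group: 15 of 49 got ulcers.
for label, events, n in (("antibiotics", 5, 51), ("placebo", 15, 49)):
    p, lower, upper = proportion_ci(events, n)
    print(f"{label}: p = {p:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Because n sits in the denominator under the square root, rerunning the function with a larger n and the same proportion gives a narrower interval.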
Absolute risk reduction (ARR)
(and its 95% CI)
• The ARR with antibiotics was 31% minus 10%, or 21%.
• The 95% CI of the ARR =
$21\% \pm 1.96 \sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$ =
21% ± 15%, or [6%, 36%].
• The ARR with antibiotics is somewhere between 6% and
36%, with 95% confidence.
• This CI does not include zero, so the risk reduction is unlikely to be due to
chance.
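As a rough check of the ARR interval, the sketch below applies the same formula to the H. pylori counts. The function name `arr_ci` is an illustrative assumption, and the unrounded proportions are used, so the limits agree with the slide only after rounding to whole percentages.

```python
import math

def arr_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Absolute risk reduction (control risk minus treated risk) with its CI."""
    p_c = events_control / n_control
    p_t = events_treated / n_treated
    arr = p_c - p_t
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treated)
    return arr, arr - z * se, arr + z * se

# No antibiotics: 15 of 49 got ulcers. Antibiotics: 5 of 51 got ulcers.
arr, lower, upper = arr_ci(15, 49, 5, 51)
print(f"ARR = {arr:.0%}, 95% CI = [{lower:.0%}, {upper:.0%}]")
print("CI excludes zero:", lower > 0)
```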
Number needed to treat (NNT)
• If Absolute Risk Reduction (ARR) = 31%-10% = 21%,
the number needed to treat = 1/ARR = 1/.21 = 4.8, or about 5.
• Number needed to harm is the same concept as number needed to
treat except that the intervention caused harm rather than good
– e.g.: how many patients needed to be treated with antibiotics to
produce one drug rash
• Easy to calculate 95% CI of NNT
• http://www.graphpad.com/quickcalcs/index.cfm
Example: A new protease inhibitor is tested in chronic
hepatitis C, genotype 1. The new therapy (added to the
standard therapy, interferon alpha/ribavirin) or standard
therapy is randomly given to 200 patients for 48 weeks.
Sustained viral response rates were as follows:
                         SVR    No SVR
STANDARD RX (n=101)      50     51
NEW + STANDARD (n=99)    83     16
What is the N needed to treat to achieve 1 additional SVR?
Number (n) needed to treat (NNT)
$NNT = \frac{1}{(\text{SVR, NEW} / \#\,\text{NEW}) - (\text{SVR, CONTROL} / \#\,\text{CONTROL})}$
$NNT = \frac{1}{(83/99) - (50/101)} = \frac{1}{.343} \approx 3$
Note that the denominator, .343 (34.3%), is the absolute risk reduction (ARR).
NNT = 1/ARR.
Using http://www.graphpad.com/quickcalcs/index.cfm
95% CI of ARR = 0.222 to 0.465.
95% CI of NNT = 2.2 to 4.5.
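The NNT and its interval can also be reproduced without the online calculator. This sketch assumes that the CI of the NNT is obtained by inverting the limits of the CI for the difference in SVR rates, using the same 1.96 formula as for the ARR. The function name `nnt_with_ci` is illustrative, and the slides do not state which method the online calculator uses.

```python
import math

def nnt_with_ci(events_new, n_new, events_std, n_std, z=1.96):
    """Return (difference, its CI limits, NNT, NNT CI limits) for a beneficial outcome."""
    p_new = events_new / n_new
    p_std = events_std / n_std
    diff = p_new - p_std
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    lower, upper = diff - z * se, diff + z * se
    if lower <= 0:
        # Inverting an interval that reaches zero does not give a usable NNT range.
        raise ValueError("CI of the difference includes zero")
    # The larger difference gives the smaller NNT, so the limits swap on inversion.
    return diff, lower, upper, 1 / diff, 1 / upper, 1 / lower

# New + standard: 83 of 99 with SVR. Standard alone: 50 of 101 with SVR.
diff, d_lo, d_hi, nnt, nnt_lo, nnt_hi = nnt_with_ci(83, 99, 50, 101)
print(f"difference = {diff:.3f}, 95% CI = {d_lo:.3f} to {d_hi:.3f}")
print(f"NNT = {nnt:.1f}, 95% CI = {nnt_lo:.1f} to {nnt_hi:.1f}")
```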
RRR
• Relative Risk Reduction (RRR) = ARR/risk with placebo.
• In this example, RRR = 21%/31% = 68%.
– Treat 1,000 pts. with NSAID → 310 ulcers (31%)
– Treat 1,000 pts. with NSAID + antibiotics → 100 ulcers (10%)
– Antibiotic use prevented 210 ulcers (210/310 = 68% = RRR)
– Antibiotic use reduced ulcers from 310 to 100, or to 32% of expected, an RRR of 68%.
• Note: Length of exposure to NSAID in this study was identical in the 2
groups. If two groups are not followed for an identical time, as is often
the case in trials, event counts may be higher in the group followed
longer, and thus events need to be expressed per unit of time (e.g.,
events per 100 patient-years)
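The RRR arithmetic is small enough to sketch directly. The function name `relative_risk_reduction` is an illustrative assumption, and the rounded risks from the slide (31% and 10%) are used as inputs.

```python
def relative_risk_reduction(risk_control, risk_treated):
    """RRR = ARR / risk in the control (placebo) group."""
    arr = risk_control - risk_treated
    return arr / risk_control

rrr = relative_risk_reduction(0.31, 0.10)
print(f"RRR = {rrr:.0%}")

# Same calculation expressed per 1,000 patients treated.
ulcers_nsaid_only = round(1000 * 0.31)
ulcers_with_antibiotics = round(1000 * 0.10)
prevented = ulcers_nsaid_only - ulcers_with_antibiotics
print(f"ulcers prevented per 1,000: {prevented}")
print(f"fraction of expected ulcers prevented: {prevented / ulcers_nsaid_only:.0%}")
```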
Example 2:
VTE or no VTE (categorical outcome)
14 of 255 (p1=5.5%, or .055) patients with VTE
switched to low-intensity warfarin developed
another VTE
– 95% CI = [2.6%, 8.4%]
… and 37 of 253 (p2=14.6%, or .146) switched to
placebo developed another VTE
– 95% CI = [10.3%, 18.9%]
Is this 9.1% difference in VTE likely to be due to chance?
New Engl. J. Med. 2003; 348: 1425-1434
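One way to approach the question on this slide is to compute the 95% CI of the difference between the two VTE rates, using the same formula as for the ARR in Example 1. The sketch below is an assumption about how the question might be answered, not the analysis reported in the cited trial, and the function name `risk_difference_ci` is illustrative.

```python
import math

def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in risk (group a minus group b) with its confidence limits."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Placebo: 37 of 253 had another VTE. Low-intensity warfarin: 14 of 255.
diff, lower, upper = risk_difference_ci(37, 253, 14, 255)
print(f"difference = {diff:.1%}, 95% CI = [{lower:.1%}, {upper:.1%}]")
print("CI excludes zero:", lower > 0)
```

If the interval excludes zero, the difference is unlikely to be due to chance at the 0.05 level.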
Example 3: Chi Square/Fisher Exact Tests
(used for categorical outcomes)
• A new treatment for colitis is compared to the standard
treatment in 245 patients.
• 120 patients are randomized to the new treatment and 125 to
the standard treatment.
• 90 patients given the new treatment go into remission (75%)
and 30 (25%) do not.
• 75 given the standard treatment go into remission (60%) and
50 (40%) do not.
• Is this a significant improvement in outcome, or to what
extent could this have been due to chance? Let’s vote!
Step 1: standard 2X2 table
               REMIT    NO REMIT
New Rx         a        b           a+b
Standard Rx    c        d           c+d
               a+c      b+d         a+b+c+d = n = total patients in study
Enter the data from our study
               REMIT        NO REMIT
New Rx:        90 (a)       30 (b)      120 (a+b)
Standard Rx:   75 (c)       50 (d)      125 (c+d)
               165 (a+c)    80 (b+d)    245 (a+b+c+d) = n
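Calculate chi square ($\chi^2$) by plugging
numbers into a handheld or online calculator
$\chi^2 = \frac{n(|ad-bc| - n/2)^2}{(a+b)(c+d)(a+c)(b+d)}$
$\chi^2$ = 6.264 (p=0.0123)
Note: 6.264 is the value obtained without the n/2 (continuity correction) term;
with that term included, the same data give $\chi^2$ of about 5.60.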
http://www.graphpad.com/quickcalcs/index.cfm
Fisher exact test, p=0.0143
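The chi-square statistic for the 2X2 table can be reproduced with a short sketch. The function name `chi_square_2x2` and the `yates` flag are illustrative assumptions. The p-value uses the identity p = erfc(√(χ²/2)), which holds for 1 degree of freedom, so no statistics library is required. The Fisher exact test is not covered here.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: the n/2 term in the slide's formula.
        diff = max(diff - n / 2, 0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# a=90, b=30 (new Rx); c=75, d=50 (standard Rx)
chi2_plain, p_plain = chi_square_2x2(90, 30, 75, 50)
chi2_yates, p_yates = chi_square_2x2(90, 30, 75, 50, yates=True)
print(f"without correction: chi2 = {chi2_plain:.3f}, p = {p_plain:.4f}")
print(f"with correction:    chi2 = {chi2_yates:.3f}, p = {p_yates:.4f}")
```

Printing both versions makes it easy to see which variant a handheld or online calculator is reporting.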
We could also have calculated the
odds ratio (OR) for a remission :
New Rx         a=90    b=30
Standard Rx    c=75    d=50
odds ratio = ad/bc
odds ratio = 4,500/2,250 = 2
But this odds ratio of 2 could have occurred by chance.
We can calculate the 95% CI of the odds ratio to see if the
CI overlaps 1 or not. If not, it favors the new treatment
with >95% confidence.
95% CI of the odds ratio (OR)
$\ln(95\%\ CI) = \ln OR \pm 1.96 \sqrt{1/a + 1/b + 1/c + 1/d}$
The OR = 2.00, and so ln 2.00 = 0.6931 (e = 2.72)
Thus ln 95% CI = 0.6931 ± 0.5466 = 0.1465, 1.2397.
To find the CI, we need the antiln of 0.1465 and of 1.2397.
Antiln 0.1465 = $e^{0.1465}$ = 1.16; antiln 1.2397 = $e^{1.2397}$ = 3.45.
∴ 95% CI = 1.16, 3.45.
Thus, the odds ratio for a remission with the new treatment
is 2.00 (95% CI = 1.16, 3.45).
• As this CI does not include 1.00, the difference is
unlikely due to chance and is significant at the 0.05 level.
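The odds ratio and its interval can be evaluated directly from the four cell counts. In this sketch the function name `odds_ratio_ci` is an illustrative assumption. For a=90, b=30, c=75, d=50, direct evaluation of the formula gives a log-scale margin of about 0.547 and a 95% CI of roughly 1.16 to 3.45.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with a CI computed on the natural-log scale."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR): square root of the summed reciprocal cell counts.
    se_ln = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    margin = z * se_ln
    ln_or = math.log(odds_ratio)
    # Exponentiate (antiln) the log-scale limits to return to the OR scale.
    return odds_ratio, math.exp(ln_or - margin), math.exp(ln_or + margin)

odds_ratio, lower, upper = odds_ratio_ci(90, 30, 75, 50)
print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
print("CI excludes 1:", lower > 1 or upper < 1)
```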