Transcript Slide 1

Randomized Clinical Trials (RCT)
Designed to compare two or more treatment
groups for a statistically significant difference
between them – i.e., beyond random chance –
often measured via a “p-value” (e.g., p < .05).
Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx
 Let X = decrease (–) in cholesterol level (mg/dL); possible expected distributions:
[Figure: Experiment. Patients satisfying inclusion criteria are RANDOMIZED into a Treatment Arm and a Control Arm, i.e., random samples from the treatment population (mean μ1) and the control population (mean μ2). At End of Study: is the difference in X significant?]
H0: μ1 = μ2
Tests: T-test, F-test (ANOVA)
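As a rough illustration of the T-test named on this slide, the following sketch computes the equal-variance (pooled) two-sample t statistic for H0: μ1 = μ2. The function name and the cholesterol values are hypothetical, made up for this example only. Equal population variances are an assumption of this particular form of the test.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(treatment, control):
    """Return (t, df) for the equal-variance two-sample t-test of H0: mu1 = mu2."""
    n1, n2 = len(treatment), len(control)
    # Pooled sample variance: assumes both populations share one variance.
    sp2 = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    # Difference in sample means divided by its estimated standard error.
    t = (mean(treatment) - mean(control)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical decreases in cholesterol (mg/dL); not data from any real trial.
treatment_arm = [30, 25, 35, 28, 32]
control_arm = [10, 15, 12, 8, 5]

t_stat, df = pooled_t_statistic(treatment_arm, control_arm)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

With 8 degrees of freedom, the two-sided critical value at the .05 level is about 2.306, so a t statistic larger than that in absolute value would lead to rejecting H0 for these made-up numbers.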
 Let T = Survival time (months); population survival curves:
S(t) = P(T > t)
[Figure: survival probability S(t), from 1 down toward 0, plotted against T, with Kaplan-Meier estimates of S1(t) (Treatment) and S2(t) (Control). At End of Study: is the AUC difference significant?]
H0: S1(t) = S2(t)
Tests: Log-Rank Test, Cox Proportional Hazards Model
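The Kaplan-Meier estimates mentioned on this slide can be sketched in a few lines. This is a minimal version for one arm that assumes right-censored data, with each patient recorded as a follow-up time plus a flag for whether the event was observed. The function name and the sample values are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g., death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical survival times in months (0 = censored); illustration only.
months = [2, 4, 4, 6, 8]
observed = [1, 1, 0, 1, 0]

km = kaplan_meier(months, observed)
for t, s in km:
    print(f"S({t}) = {s:.2f}")
```

Censored patients still count as at risk up to their censoring time, which is why they reduce the denominator for later steps without producing a drop in the curve themselves.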
Case-Control studies / Cohort studies
Observational study designs that test for a statistically significant association
between a disease D and exposure E to a potential risk (or protective) factor,
measured via “odds ratio,” “relative risk,” etc.
Example: D = lung cancer, E = smoking
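Case-Control studies (PRESENT, looking back to the PAST): start from cases (D+) and controls (D–, the reference group), then ask: E+ vs. E– ?
  Advantage: relatively easy and inexpensive
  Drawback: subject to faulty records, “recall bias”
Cohort studies (PRESENT, following into the FUTURE): start from E+ vs. E–, then ask: D+ vs. D– ?
  Advantage: measures direct effect of E on D
  Drawback: expensive, extremely lengthy…
  Example: Framingham, MA study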
Both types of study yield a 2 × 2 “contingency table” of data:

          D+     D–
  E+      a      b      a+b
  E–      c      d      c+d
          a+c    b+d    n

where a, b, c, d are the numbers of individuals in each cell.
H0: No association between D and E.
End of Study tests: Chi-squared Test, McNemar Test
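To make the 2 × 2 table concrete, here is a small sketch that computes the odds ratio, the relative risk, and the Pearson chi-squared statistic (without continuity correction) from the cell counts a, b, c, d. The function name and the counts are hypothetical. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal. Note that relative risk is only directly interpretable for a cohort design.

```python
from math import erfc, sqrt


def two_by_two(a, b, c, d):
    """Summarize a 2 x 2 table with rows E+/E- and columns D+/D-."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Risk of disease among the exposed divided by risk among the unexposed.
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Pearson chi-squared statistic, 1 degree of freedom, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return odds_ratio, relative_risk, chi2, p_value


# Hypothetical counts, chosen only to illustrate the arithmetic.
odds, risk, chi2, p = two_by_two(a=30, b=70, c=10, d=90)
print(f"OR = {odds:.2f}, RR = {risk:.2f}, chi2 = {chi2:.2f}, p = {p:.4f}")
```

A small p-value here would count as evidence against H0 (no association between D and E). For matched pairs, the McNemar Test would be used instead, which this sketch does not cover.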
Surveys, prevalence studies, etc.
The analytical techniques that apply to longitudinal studies
(i.e., observations over time) are also appropriate for these
cross-sectional studies (i.e., observations at a fixed time).
Many publicly available datasets from large federal studies on disease
prevalence / incidence rates, birth / mortality rates, population trends,
etc., are archived at the National Center for Health Statistics.
As seen, testing for association between categorical variables – such as
disease D and exposure E – can generally be done via a Chi-squared Test.
But what if the two variables, say X and Y, are numerical measurements?
Furthermore, if the sample data do suggest that an association exists, what is its
nature, and how can it be quantified, or modeled via Y = f(X)?
JAMA. 2003;290:1486-1493
correlation coefficient
regression methods
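For two numerical variables, the correlation coefficient and a simple regression line can be sketched as follows. This assumes the simplest case, Pearson's r and a least-squares straight line Y = intercept + slope · X. The function name and the data points are hypothetical.

```python
from math import sqrt


def correlation_and_line(x, y):
    """Return (r, slope, intercept) for paired numerical samples x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares slope
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return r, slope, intercept


# Hypothetical paired measurements, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, slope, intercept = correlation_and_line(xs, ys)
print(f"r = {r:.3f}, fitted line: Y = {intercept:.2f} + {slope:.2f} X")
```

The correlation coefficient measures the strength of a linear association, while the fitted line is one candidate for the model Y = f(X). A nonlinear relationship would need a different form of f.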
Other statistical issues along the way…
• BIAS - Sources? What can we do about it?
• How do we check if standard assumptions
are valid, and what do we do if they are
violated, or we can’t tell?
• Do these techniques generalize, and if so,
how? What are some other applications?