Randomization - Amstatphilly.org

Outline
• Preliminaries
- A typical scenario
- Why randomize?
  Bias
  Advantages
  Disadvantages
- Blinding
- Primary and secondary questions
- Clinical goals, objectives, hypotheses
- Control groups
• Clinical study design
- Parallel
- Crossover
- Fixed sequence rising dose
• Allocation procedures
- Fixed probability
- Variable probability
• Questions and comments
Where are we going?
The role of clinical study design, randomization, and allocation and component ID schedules in the scientific process
[Diagram linking: Scientific Questions; Clinical Study Design; Randomization Scheme; Allocation and Component ID Schedules; Study Results and Conclusions]
Some General Considerations
• Major question of interest?
• Clinical goals, objectives, hypotheses?
• Which hypertensive patients?
• How to measure/evaluate effectiveness of therapy?
• Which study design?
- Randomization scheme
- Blinding
• Sample size, power, detectable outcome
• Resources (time, money, manpower, drug availability)
• How to interpret results?
A Typical Scenario
• Does MK lower blood pressure in hypertensive patients?
• Some prior evidence that MK works
• MD wants to make “right” decision
• Experiment (clinical trial)
• MD not sure how to proceed
• How can we help him / her?
Primary Question
• What is the primary question?
Most interest; clearly defined; state in advance, do not modify; realistic; all study planners must agree
Example
Condition: Hypertension
Drug: MK
Primary question: Is MK effective in the treatment of hypertension?
Primary Question
• Difference
- Is our drug different from placebo? From active controls?
• Equivalency
- Is our drug equivalent in different formulations? To active controls?
Secondary Questions
• Two types
- Different response variable
- Patient subgroup investigations
• Methodological and statistical concerns
- Randomization
Secondary Questions
Example
Condition: Hypertension
Drug: MK
Primary question: Is MK effective in the treatment of hypertension?
Secondary questions:
1. Different response variable: Is MK associated with an increase in cough?
2. Subgroup investigation: Is MK equally effective in the treatment of Blacks and non-Blacks?
Clinical Goals, Objectives, Hypotheses
• Primary and secondary questions translated to study design
Example
• Goal
To observe the effect of maintenance level oral dosing of MK in patients with hypertension.
• Objective
To compare the reduction in supine diastolic blood pressure of MK to that of Drug A in patients with hypertension.
Clinical Goals, Objectives, Hypotheses
Example (cont’d)
• Primary Hypothesis
The proportion of hypertensive patients whose supine diastolic blood pressure is reduced to 80 mmHg following 12 weeks of oral dosing with MK is at least 20 percentage points greater than that of Drug A.
• Secondary Hypothesis
The proportion of Black hypertensive patients whose supine diastolic blood pressure is reduced to 80 mmHg following 12 weeks of oral dosing with MK is within 10 percentage points of the proportion for non-Blacks.
Questions of Interest
Study Basics
• What is my primary endpoint?
• What study design should I use?
Hypothesis Testing
• What sample size, n, do I need?
• What power, 1 − β, do I have?
• What is the minimal detectable difference, δ_MD?
Clinical Study Designs
12
Control Groups
• Controlled study
- Some patients receive control
- Compare study treatment vs. control
- Observed differences in treatments: patients, treatment, or chance?
- Examples: Placebo (negative control), vehicle control, active control
• Placebo control
- Like study therapy (shape, weight, color, taste, odor, packaging, route of administration, dosing regimen)
• Placebo effect
[Figure: Placebo Effect. Change in 24 Hour Trough Level of SuDBP (mm Hg), Placebo vs. Active Control, with group means marked.]
Random Sampling
Population of Changes in SuDBP
-5  -7  -6  -2  -6  -8  -4  -3  -8  -7
-5  -7  -5 -13 -14  -4  -2  -1  -7  -5
-6  -6  -8  -5  -9  -4 -10  -6  -6
μ = -6.2, σ² = 8.8

Sample (n=5)     1      2      3      4      5      6      7
x̄             -2.4  -10.8   -6.2   -6.6   -6.4   -6.0   -6.0
s²              1.3    6.7    0.2   40.3    7.3    2.5    0.0
Random Sampling
Sample                                 Sample Estimates
Number   {Sample Values}                x̄      s²     Sample Characteristics
1        {-1, -2, -2, -3, -4}          -2.4    1.3    5 smallest decreases
2        {-8, -9, -10, -13, -14}      -10.8    6.7    5 largest decreases
3        {-6, -6, -6, -6, -7}          -6.2    0.2    x̄ close to μ
4        {-1, -14, -2, -13, -3}        -6.6   40.3    Most extreme values
5        {-3, -5, -6, -8, -10}         -6.4    7.3    x̄, s² close to μ, σ²
6        {-5, -6, -7, -4, -8}          -6.0    2.5    “Highest” probability values
7        {-6, -6, -6, -6, -6}          -6.0    0.0    All same, x̄ close to μ
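The sample estimates in the table can be recomputed directly. The following is a minimal sketch, assuming sample values chosen to be consistent with the tabulated x̄ and s² (the dictionary name `samples` and the helper `summarize` are illustrative, not part of the original slides), and using only Python's standard `statistics` module:

```python
from statistics import mean, variance

# Seven samples of size n = 5 drawn from the population of SuDBP changes.
# Values are assumed to be those that reproduce the tabulated estimates.
samples = {
    1: [-1, -2, -2, -3, -4],      # 5 smallest decreases
    2: [-8, -9, -10, -13, -14],   # 5 largest decreases
    3: [-6, -6, -6, -6, -7],
    4: [-1, -14, -2, -13, -3],    # most extreme values
    5: [-3, -5, -6, -8, -10],
    6: [-5, -6, -7, -4, -8],
    7: [-6, -6, -6, -6, -6],      # all the same value
}

def summarize(values):
    """Return (sample mean, sample variance with divisor n - 1), rounded to 1 decimal."""
    return round(mean(values), 1), round(variance(values), 1)

if __name__ == "__main__":
    for number, values in samples.items():
        x_bar, s2 = summarize(values)
        print(f"Sample {number}: mean = {x_bar}, variance = {s2}")
```

Comparing the printed estimates with μ = -6.2 and σ² = 8.8 shows how much a sample of five can differ from the population depending on which patients happen to be selected.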
Random Sampling
In General                                      Example: Drug A             Example: Drug B
Target Population                               Hypertensive Patients       Hypertensive Patients
Response Variable: Week 12 change
  from baseline in SuDBP                        SuDBP12 − SuDBPBS           SuDBP12 − SuDBPBS
Parameter: Population mean (μ) change           μ_A                         μ_B
Statistic: Sample mean (X̄) change               X̄_A                         X̄_B
Observed: Calculated mean (x̄) change            x̄_A                         x̄_B
Overall Clinical Study Design Theme
[Timeline diagram. Study phases: Qualification, Baseline, Randomization, Treatment, Follow-up. Other labels: Eligibility, Untreated condition, Reference point, Study drug, Control drug, Blinding, Data collection, Drug discontinued.]
Example Clinical Trial
• Hypertensive patients
• Drug A vs. Drug B
• Difference of 5 mmHg in mean change from baseline SuDBP is important.
Examples (mean change from baseline, mmHg):
           Example 1   Example 2
Drug A        -15          -5
Drug B        -10         -10
• Clinical Hypothesis: The true mean change in SuDBP following treatment with Drug A IS 5 mmHg different from the true mean change following treatment with Drug B.
- Conjecture about truth (parameters)
- Present tense
• Naïve Statistical Analysis: Two sample pooled variance t-test.
20
SuDBP
Randomization
12 Weeks
Drug B
Drug A
Study Design
Example Clinical Trial
SuDBP
t-test
20
Crossover
Parallel
21
Qualification
Baseline
Drug
Washout Placebo
Placebo Washout
Drug
Placebo
Drug
Treatment
Parallel vs. Crossover Designs
Follow-up
21
Parallel vs. Crossover Designs
• Parallel
- Each patient receives one treatment
- Treatments (e.g., drug, dose, formulation) are indicated by the comparison(s) of interest
• Crossover
- Each patient receives more than one treatment
- May receive all treatments
- Washout period between the treatments
Parallel vs. Crossover Designs
Issue                                    Parallel                     Crossover
Precision (Positive Correlation)         Less                         More
Preference Evaluation                    No                           Yes
Duration, Dropouts, Cost (Trade-offs)    Shorter, Less, Less?         Longer, More, More?
Ethics                                   Many Patients (Phase III)    Few Patients (Phases I, II)
Disease Condition                        Most                         Not Curable, Chronic and Stable
Carryover Effects?                       Yes                          No
Drug Development Stage                   Confirmatory                 Exploratory
Crossover Treatment
Study Designs
24
25
12
0
4
8
Cmax
mg/mL
0
Tmax
10
RSYQ3WP.45 9/3/98 JOB#3466 BRADSTREET
5
Hours Post-Dosing
AUC
25
Randomization
50
50
25
100
25
50
100
25
50
100
Time
25
100
50
25
100
100
50
25
Design and Randomization Scheme
26
3,3,6 Crossover: 3 Treatments, 3 Periods, 6 Treatment Sequences
26
Randomization
Tablet
Capsule
I.V.
Caplet
Time
I.V.
Caplet
Tablet
I.V.
Capsule
Caplet
Capsule
Tablet
Caplet
Capsule
Tablet
I.V.
Design and Randomization Scheme
27
4,4,4 Crossover: 4 Treatments, 4 Periods, 4 Treatment Sequences
27
Randomization
C
F
R
N
NT
F
R
N
NT
C
NT
C
F
R
N
NT
C
F
R
N
Time
NT
N
R
F
C
F
C
NT
N
R
R
F
C
NT
N
R
F
C
NT
N
R
C
NT
N
C
F
R
F
NT
N
Design and Randomization Scheme
28
5,5,10 Crossover: 5 Treatments, 5 Periods, 10 Treatment Sequences
28
Other Complete Crossovers
Example Allocation Schemes
Crossover Trt Subjects/
Design
Seq Patients
2,2,2
3,3,6
4,4,4
1
Treatment Period
2
3
4
1
1,...,13
Pbo
MK
2
2,...,14
MK
Pbo
1
2,...,36
25
50
100
2
6,...,31
50
100
25
3
4,...,33
100
25
50
4
1,...,34
25
100
50
5
3,...,35
50
25
100
6
5,...,32
100
50
25
1
4,...,16
I.V.
Tablet
Caplet
Capsule
2
2,...,13
Tablet
Capsule
I.V.
Caplet
3
1,...,15
Capsule
Caplet
Tablet
I.V.
4
3,...,14
Caplet
I.V.
Capsule
Tablet
Randomization: To one sequence of treatments
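For a complete crossover that uses every possible treatment order (as in the 2,2,2 and 3,3,6 designs), randomizing subjects to sequences can be sketched as follows. This is an illustration only: the function name `crossover_allocation`, the equal number of subjects per sequence, and the seed are assumptions, and the resulting subject numbers will not match the example schedules above. Designs such as 4,4,4 or 5,5,10 use a selected subset of sequences rather than all permutations, which this sketch does not cover.

```python
import random
from itertools import permutations

def crossover_allocation(treatments, n_subjects, seed=None):
    """Randomly allocate subjects to all possible treatment sequences.

    Returns a dict mapping each sequence (a tuple of treatments, one per
    period) to a sorted list of subject numbers, with equal numbers of
    subjects per sequence.
    """
    sequences = list(permutations(treatments))
    if n_subjects % len(sequences) != 0:
        raise ValueError("n_subjects must be a multiple of the number of sequences")
    rng = random.Random(seed)
    subjects = list(range(1, n_subjects + 1))
    rng.shuffle(subjects)  # random order decides who gets which sequence
    per_sequence = n_subjects // len(sequences)
    return {
        seq: sorted(subjects[i * per_sequence:(i + 1) * per_sequence])
        for i, seq in enumerate(sequences)
    }

if __name__ == "__main__":
    # 3,3,6 crossover: three doses, three periods, six sequences, 36 subjects
    for seq, subj in crossover_allocation([25, 50, 100], 36, seed=2004).items():
        print(seq, subj)
```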
29
Other Complete Crossovers
Example Allocation Schemes
Crossover
Design
Trt
Seq
5,5,10
1
Subjects/
Patients
Treatment Period
1
2
3
4
5
4, 19
NT
C
R
N
F
2
9, 11
C
F
N
NT
R
3
1, 15
F
R
NT
C
N
4
8, 14
R
N
C
F
NT
5
7, 20
N
NT
F
R
C
6
3, 12
NT
F
C
N
R
7
10, 18
C
R
F
NT
N
8
2, 13
F
N
R
C
NT
9
6, 17
R
NT
N
F
C
10
5, 16
N
C
NT
R
F
Randomization: To one sequence of treatments
30
Bioequivalence Study
Film-Coated Tablet vs. Effervescent Tablet
AUC Ratio (F/E)
1.25
(1.08)
(1.03)
1.00
(0.98)
0.80
31
32
0
1
2.5
C/S Ratio
AUC
Drug A
Cmax
Cmax
Drug B
AUC
Cmax
RSYQ3WP.50 10/22/96 JOB#4248 BRADSTREET
Metabolite
AUC
32
Bioequivalence of Manufacturing
Processes A and B
Manufacturing Process A vs.
Manufacturing Process B
Drug 1
A/B Ratio
1.6
1.25
(1.08)
(1.00)
(0.92)
1.00
(0.94)
(0.84)
(0.75)
0.80
0.4
AUC
Cmax
Drug 2
1.3
A/B Ratio
1.25
(1.01)
1.00
0.80
0.5
(0.97)
(0.92)
(0.84)
(0.86)
(0.77)
AUC
Cmax
33
34
(A+C)/C Ratio
0
1
3
Cmax
ERC
Parent Compound
AUC
Metabolite
Cmax
ERC
RSQ3WP.52 10/22/96 JOB#4248 BRADSTREET
AUC
34
RSYQ3WP.53 10/22/96 JOB#4248 BRADSTREET
35
Fed/Fasted Ratio
0.0
0.5
1.0
1.5
2.0
Parent
Compound
AUC
Metabolite
Parent
Compound
Cmax
Metabolite
35
Drug/Placebo Ratios
36
0.8
1.0
1.25
1.5
AUC
0.7
0.8
1.0
1.25
1.3
0.8
Seq 1: Drug / Pbo
Cmax
1.0
1.25
1.9
EE
Seq 2: Pbo / Drug
AUC
0.3
0.8
1.0
1.3
1.25
NET
Oral Contraceptive Interaction Study
Cmax
36
37
RSYQ3WP.132 10/23/96 JOB4248 BRADSTREET
AUC Ratio (A/B)
0.50
0.80
1.00
1.25
1.80
2.32
2.70
Trial 1
(0.98)
(1.08)
(1.03)
Trial 2
(0.91)
(1.01)
(1.12)
Trial 3
(0.98)
(1.12)
(1.27)
37
Dose Proportionality Study of Carbidopa
AUC
1600
ng• hr/mL
1200
800
400
0
25
50
Carbidopa Dose (mg t.i.d.)
100
38
Dose Proportionality Study of Carbidopa
Cmax
400
ng/mL
300
200
100
0
25
50
Carbidopa Dose (mg t.i.d.)
100
39
Pilot Transdermal Patch Study
AUC
Geometric Mean
2000
1
5
pghr/mL
1600
9
4
8
5
6
2
1200
4
5
7
8
9
1
7
800
6
8
2 4 7
5
9
1
268
7
4
2
9
6
1
400
I.V.
A
B
C
Patch
RSYQ3WP.88 10/22/96 JOB#4248 BRADSTREET
tebGTN1 Dec. 3, 2003
40
0
10
20
30
40
50
60
41
NT
T1
T2
T3
NT
Treatment
T4
AUC (mghr/dl)
T1
T2
T3
Cmax (mg/dl)
Alcohol Interaction Study in Men
T4
Non-Asian
Asian
41
Alcohol Interaction Study in Men
Cmax: Active Treatments vs. No Treatment
All Subjects
Ratio
1.25
(1.29)
(1.21)
(1.20)
(1.03)
(1.02)
(1.24)
(1.09)
1
(0.92)
(0.86)
(0.86)
(1.05)
(0.88)
0.80
Without Subject #21
(1.29)
1.25
(1.27)
(1.27)
(1.10)
(1.10)
(0.95)
(0.95)
Ratio
(1.18)
(1.12)
1
(1.02)
(0.97)
(0.88)
0.80
T1
T2
T3
T4
42
4, 2, 12 Incomplete Crossover:
4 Treatments, 2 Periods,
12 Treatment Sequences
Design and Randomization Scheme
Randomization
10
20
30
40
10
30
20
40
10
40
20
20
30
10
40
30
30
40
10
20
40
10
30
20
Time
43
Dose strengths are mg.
43
4, 2, 12 Incomplete Crossover:
4 Treatments, 2 Periods,
12 Treatment Sequences
Example Allocation Scheme
Treatment Subjects/ Treatment Period
1
2
Sequence Patients
10
20
1
8, 20
30
40
2
1, 14
10
30
3
6, 22
20
40
4
10, 18
10
40
5
3, 15
20
30
6
12, 24
20
10
7
9, 17
40
30
8
2, 13
30
10
9
7, 21
40
20
10
11, 16
40
10
11
4, 19
30
20
12
5, 23
Randomization: To one sequence of two treatments
44
Dose strengths are mg.
44
Parallel Treatment
Study Designs
45
46
Randomization
Time
Active Control
150 mg MK
100 mg MK
50 mg MK
Placebo
Design and Randomization Scheme
Five Treatment Parallel
46
47
Active Control
3, 9, ..., 99
6
Randomization: To one multiple dose treatment
AC AC AC AC AC
150 150 150 150 150
150 mg MK
4,10,...,100
P
100 mg MK
50
P
5
2, 8, ..., 96
50
P
4
50 50
100 100 100 100 100
50
P
3
50 mg MK
P
Placebo
1, 7, ..., 97
2
5, 6, ..., 98
1
Treatment
Subject /
Patient
Dosing Period (Days)
Example Allocation Scheme
Five Treatment Parallel
7
47
Multiclinic Inpatient Hypertension Study
-20
-10
5
Day 0
-20
-10
Day 10
P
P
50
50
100
100
Hour 6
150
150
A
A
P
P
50
50
150
A
A
P
P
50
50
150
100 150
100
Hour 24
A
A
RSYQ3WP.97 10/23/96 JOB #4248 BRADSTREET
100 150
100
Hour 12
Mean Change from Baseline in Supine Diastolic Pressure
(N=14 to 20)
mmHg
mmHg
48
48
Multiclinic Inpatient Hypertension Study
Changes vs. Baseline at Each Time Point Over 24 Hours
Supine Diastolic Blood Pressure (mmHg)
Day 1
Day 5
Mean Area About Zero Over 24 Hours
mmHg * Hours
0
0
-100
-100
-200
-200
-300
-300
-400
-400
mmHg
Mean Maximum Decrease in First 12 Hours
0
0
-10
-10
-20
-20
-30
-30
-40
-40
Pbo 50mg 100mg 150mg AC
(n=23) (n=20) (n=18) (n=19) (n=8)
Pbo 50mg 100mg 150mg AC
(n=23) (n=20) (n=18) (n=19) (n=8)
49
Method Comparison Study
Angiotensin II Level (pg/mL)
HPLC RIA
18
50
Baseline
0
First Dose
0
0
80
18
0
30
Week 2
50
Week 6
0
0
0
80
0
30
Direct RIA
RSYQ3WP.99 10/23/96 JOB#4248 BRADSTREET
50
51
Placebo
Drug A
Placebo
Drug A
Placebo
Drug A
Congestive heart
failure (CHF)
Gastroesophageal
reflux disease
(GERD)
Elevated
intraocular
pressure (IOP)
IOP
Healing of
esophageal ulcers
Percent reflux time
Usually two eyes
to evaluate
Direct vs.
indirect endpoints
Consistency of
measurement
Blacks vs.
non Blacks
Drug L
Supine diastolic
HCTZ
blood pressure
Combination
(SuDBP)
Hypertension
Time on treadmill/
bicycle
NYHA cardiac status
Yale scale
Comments
Disease
Endpoints
Treatment
Groups
Some Phase III Studies
51
52
Some Phase III Studies
52
53
Randomization
Time
Armpit, Patch A
Armpit, Patch B
Armpit, Patch C
Abdomen, Patch A
Abdomen, Patch B
Abdomen, Patch C
Groin, Patch A
Groin, Patch B
Groin, Patch C
Design and Randomization Scheme
Parallel Design: Factorial Arrangement of Treatments
53
Factorial Arrangement of Treatments
• Each patient simultaneously receives one level of each of two or more treatment factors
• Treatment factor
- Basic treatment classification (e.g., body site, patch type)
• Treatment factor level
- Treatment factor classification scheme
(e.g., body site: armpit, abdomen, groin)
(e.g., patch type: A, B, C)
• Treatment
- Combinations of treatment factor levels (e.g., groin + patch B; armpit + patch C)
- Number of treatments equals product of numbers of treatment factor levels (e.g., 3x3=9)
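The rule that the number of treatments equals the product of the numbers of factor levels can be illustrated by enumerating the combinations. A minimal sketch, assuming the body-site and patch-type levels from the example (the names `factorial_treatments` and `randomize_to_treatments`, the seed, and two subjects per treatment are illustrative assumptions):

```python
import random
from itertools import product

def factorial_treatments(*factors):
    """Number the treatments 1..k, one per combination of factor levels."""
    return {i: combo for i, combo in enumerate(product(*factors), start=1)}

def randomize_to_treatments(treatments, subjects_per_treatment, seed=None):
    """Randomly assign an equal number of subjects to each treatment."""
    rng = random.Random(seed)
    n = len(treatments) * subjects_per_treatment
    subjects = list(range(1, n + 1))
    rng.shuffle(subjects)
    return {
        trt: sorted(subjects[k * subjects_per_treatment:(k + 1) * subjects_per_treatment])
        for k, trt in enumerate(treatments)
    }

if __name__ == "__main__":
    body_sites = ["Armpit", "Abdomen", "Groin"]
    patch_types = ["A", "B", "C"]
    treatments = factorial_treatments(body_sites, patch_types)  # 3 x 3 = 9
    allocation = randomize_to_treatments(treatments, 2, seed=7)
    for trt, combo in treatments.items():
        print(trt, combo, allocation[trt])
```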
3x3 Factorial Arrangement of Treatments
Alternative Design Scheme
               Patch Type
Body Site      A    B    C
Armpit         1    2    3
Abdomen        4    5    6
Groin          7    8    9
Listing of Treatments in 3x3 Factorial
Example Allocation Scheme
Treatment Factor
Level Combinations
Treatment Subject Body Site Patch Type
1
4, 12
Armpit
A
2
7, 11
Armpit
B
3
8, 14
Armpit
C
4
3, 17
Abdomen
A
5
6, 13
Abdomen
B
6
1, 16
Abdomen
C
7
5, 18
Groin
A
8
9, 15
Groin
B
9
2, 10
Groin
C
Randomization: To one of the nine treatments
(combination of treatment factor levels)
56
56
Panel 1
50
Pbo
Randomization
100
Pbo
Time
Panel 2
Randomization
Panel 3
Design and Randomization Scheme
150
Pbo
Time Lagged Parallel Panel Rising Dose
Randomization
57
57
Time Lagged Parallel Panel Rising Dose
Example Allocation Scheme
Panel
1
Dosing Day / Period
Subject
1
2
3
2, 3, 4
50
1
P
2
5, 6, 8
7
3
9, 11, 12
150
10
P
P = Placebo
50 = 50 Drug A
100
P
100 = 100 Drug A
150 = 150 Drug A
Randomization: Within a panel, to either placebo or
that panel’s dose. (Sometimes
randomization to a panel.)
58
59
Young
Middle Aged
Elderly
Panel
Time
100 mg MK
100 mg MK
100 mg MK
Treatment
Design and Randomization Scheme
Parallel Panel: Same Drug Regimen
59
Parallel Panel: Same Drug Regimen
Example Allocation Scheme
Panel
Subjects
Treatment
Elderly
1 - 10
100 mg MK
Middle Aged
11 - 20
100 mg MK
Young
21 - 30
100 mg MK
No randomization
60
Carryover Effects at Period Baseline
Supine Blood Pressure
Difference (Period 2 - Period 1)
in Period Baseline Values
Median
20
mm Hg
10
0
-10
-20
EN NE
Systolic
EN NE
EN NE
Diastolic
MAP
61
Fixed Sequence Rising
Dose Study Designs
62
Multiple (Variable) Dose Titration
80
40
20
15
5
10
Time
63
64
Total Daily Dose (mg)
0
6
18
30
42
1
7
21
28
35
RSYQ3WP.41 9/3/98 JOB#3466 BRADSTREET
Relative Study Day
14
Dose Titration In Parkinson's Patients (N=10)
64
Multiple (Variable) Dose Titration
Example Titrations
(Variable) Dosing Period
Subject 1 2 3 4 5
6
1
2
3
4
5
6
7
8
9
10
5
5
5
5
5
5
5
5
5
5
10
10
10
10
10
10
10
10
10
10
15 20
15
15
15
15
15
15
15
15
20
20 40 80
20
20
Randomization: All subjects assigned to the single
5, 10, 15, 20, 40, 80 sequence.
65
Randomization
Pbo
15
10
5
Time
20
Pbo
10
20
5
15
Pbo
5
20
10
Pbo
15
Design and Randomization Scheme
Fixed Sequence Rising Dose - Placebo Substitution
66
66
Fixed Sequence Rising Dose
Placebo Substitution
Example Allocation Scheme
Dosing Day / Period
Subject
1
2
3
4
2, 8
P
10
15
20
3, 5
5
P
15
20
1, 6
5
10
P
20
4, 7
5
10
15
P
P = Placebo
5 = 5 Drug A
10 = 10 Drug A
15 = 15 Drug A
20 = 20 Drug A
Randomization: To a fixed sequence of rising doses
with placebo substituted for one of
the doses.
67
68
Randomization
15
10
5
Time
Pbo
15
10
5
20
15
Pbo
15
10
10
Pbo
5
15
5
10
5
Pbo
Design and Randomization Scheme
Pbo
20
20
20
20
Fixed Sequence Rising Dose - Placebo Insertion
68
Fixed Sequence Rising Dose
Placebo Insertion
Example Allocation Scheme
Dosing Day / Period
Subject
1
2
3
4
5
1, 9
P
5
10
15
20
4, 10
5
P
10
15
20
2, 6
5
10
P
15
20
3, 8
5
10
15
P
20
5, 7
5
10
15
20
P
P = Placebo
5 = 5 Drug A
10 = 10 Drug A
15 = 15 Drug A
20 = 20 Drug A
Randomization: To a fixed sequence of rising doses,
with placebo inserted between two
of the doses.
69
Panel 1
Panel 2
Random.
Random.
Panel 1
Panel 2
20
10
40
Pbo
20
40
10
20
30
40
15
5
Pbo
Pbo
15
5
30
30
10
Pbo
5
Pbo
15
Pbo
Design and Randomization Scheme
60
60
60
Pbo
Alternating Panel Rising Dose
(Randomization)
70
Pbo
80
80
80
70
10
10
10
9,13
10,15
5
2, 7
11,16
5
4, 5
Pbo
5
1, 6
2
12,14
Pbo
3, 8
1
15
15
Pbo
15
3
20
20
Pbo
20
4
30
Pbo
30
30
5
Dosing Day / Period
40
40
6
40
Pbo
Example Allocation Scheme
Alternating Panel Rising Dose
Pbo
60
60
60
7
Pbo
80
80
80
8
Randomization: To one sequence of rising dose / placebo treatments within
each panel. (Could also randomize subjects to a panel.)
71
2
1
Panel Subj.
71
72
Panel 2
(Randomization)
Panel 1
Random.
Random.
Pbo
100
Pbo
100
20
Pbo
20
Pbo
20
Pbo
Panel 1
Panel 2
Pbo
50
Pbo
Pbo
Pbo
50
50
Pbo
20
10
Pbo
50
10
Pbo
10
Pbo
Pbo
10
Design and Randomization Scheme
Pbo
100
Pbo
100
Alternating Panel Rising Dose Crossover
72
73
2
15
14
16
13
1
10
P
10
P
P = Placebo
10 = 10 Drug A
20 = 20 Drug A
11,
9,
10,
12,
Subject
3, 5
2, 7
4, 6
1, 8
2
P
10
P
10
P
20
P
20
50 = 50 Drug A
100 = 100 Drug A
20
P
20
P
8
P
100
100
P
P
100
100 P
7
Randomization: To one sequence of rising dose/placebo treatments
within each panel. (Could also randomize subjects
to a panel).
Panel
1
Dosing Day / Period
4
5
6
3
50
P
50
P
50
P
P
50
Example Allocation Scheme
Alternating Panel Rising Dose Crossover
73
Randomization
and
Bias
74
Why Randomize?
• To support an unbiased comparison of the treatment regimens
• In concert with
- Blinding procedures
- Complete data collection
- Other bias reduction measures
Bias
• Prejudice in planning, conduct, analysis, interpretation, publication of clinical trials
- Conscious or subconscious factors
- All study points: design to publication
- Bias never completely controlled
• Common concerns
- Patient entrance into trial
- Allocation of patients to treatment
  Ethics
  Drug favoritism
- Measurement/evaluation of response
Bias - Allocation of Patients to Treatment
Conscious or unconscious bias in the allocation of patients to new therapy based upon ethics and/or drug favoritism
Situation: Compare MK to placebo
Belief: MK is effective
Ethics: Sicker patients receive MK; less severe patients receive placebo
Drug Favoritism: Less severe patients receive MK (“stacking the deck”); sicker patients receive placebo (“hopeless”)
Bias - Allocation of Patients to Treatment
•
“Beating” the randomization scheme
- Staggered patient entry
- Limited effectiveness of blinding
strategy or no blinding
- Tell-tale lab results or AEs
- Small block size in randomization
scheme
78
Randomization
• Advantages
- Decreases bias in treatment assignment
- Asymptotically (“really large samples”), tends to balance unknown baseline prognostic/concomitant variables among treatments
- Supports blinding procedures
  More difficult to break blind
  Lessens bias in assessment of response
- Supports statistical design
- Guides statistical analysis
Randomization
• Disadvantages
- For small or moderate sample sizes, may not balance unknown baseline prognostic/concomitant variables among treatments
- Most appropriate randomization scheme may be difficult or costly to administer
- Patient recruitment problems
  Ethics
  Odds of receiving a given treatment
Blinding
•
•
No knowledge of the patient’s treatment
Candidates for blinding
- Patient
- Evaluating Physician
- Other study site personnel
- Sponsor personnel
•
Blinding strategies
- Number of groups blinded
81
82
May Be
May Be
May Be
May Be
May Be
May Be
Blinded Blinded
May Be Blinded Blinded
May Be May Be
Partial
Partial
Partial
Investigator
Sponsor Monitor
Sponsor Statistician
Other Sponsor Personnel
Other Site Personnel
Person/Group
Usually
82
Combined
Strategy
Patient
Triple
Blind
Double
Blind
Single
Blind
OpenLabel
(Unblinded)
Blinding Strategies
83
Less
Less
More
Least
Least
Most
Complexity
Cost
Individual patient
management
Least
Some
Some
Less
More
Most
Bias
Quality
Double
Blind
Single
Blind
Open-Label
(Unblinded)
Least
More
More
Least
Triple
Blind
Some
Most
Most
Partial
Combined
Strategy
Comparison of Blinding Strategies
83
Package to Maintain Blinding Strategy
•
Placebo control
- More than one placebo entity?
•
Active control
- Double (or more) dummy scenario?
•
Vaccines
- Different vaccine (or placebo) vials
must be indistinguishable
- Dummy vaccinations often
unacceptable
•
Arrangement of clinical materials within
a package must follow the ordering
provided by either an allocation or
component ID schedule.
84
Intragastric Titration Study (OTC)
Cumulative Hourly Milliequivalents of
Maalox® Titrated
140
Placebo
Gastric Mean
120
100
80
60
X Drug B
Drug A
7
8
X
40
X
20
0
X
X
X
1
2
X
X
3
4
5
6
Hours From First Meal
85
Gastroesophageal Reflux Disease Study
% Reflux Time
Upright
30
20
10
Normal
0
# Episodes/Hr
30
20
10
0
# Episodes
>5 min/12 hrs
8
6
4
2
0
PBO
10
20
Dose (mg a.m.)
40
teb 37abc Apr. 1, 2003
86
Gastroesophageal Reflux Disease Study
Supine
% Reflux Time
50
40
30
20
10
Normal
0
# Episodes/Hr
20
15
10
5
0
# Episodes
>5 min / 12 Hrs
15
12
9
6
3
0
PBO
10
20
Dose (mg a.m.)
40
teb 38abc Apr. 2, 2003
87
Gastroesophageal Reflux Disease Study
24 Hours
% Reflux Time
30
20
10
Normal
0
# Episodes/Hr
20
15
10
5
0
# Episodes
>5 min / 12 Hrs
8
6
4
2
0
PBO
10
20
Dose (mg a.m.)
40
teb 39abc Apr. 2, 2003
88
Allocation Procedures
89
Fixed Probability Allocation Procedures
Example
Condition: Hypertension
Drug: MK, HCTZ, Combination
Study type: Efficacy
Allocation ratio: 2 to 2 to 1, respectively
Fixed probabilities: 2/5 to 2/5 to 1/5, respectively
Fixed Probability Allocation Procedures
                            Patient
Procedure                   1  2  3  4  5  6  7  8  9  10
Systematic                  M  M  M  M  H  H  H  H  C  C
Simple (a)                  M  H  M  C  M  H  C  H  H  M
Blocked                     M  C  M  H  H  C  H  M  M  H
Stratified Simple (a)
  Black                     H  H  H  H
  Non-black                 C  C  M  M  M  M
Stratified Blocked
  Black                     H  M  M  H  C
  Non-black                 H  M  C  H  M
(a) Worst case allocations not shown
M = MK
H = HCTZ
C = Combination
Fixed Probability Allocation Procedures
Systematic
•
Fixed assignment scheme; not
randomized
92
Fixed Probability Allocation Procedures
Simple (complete)
• Study design factors not considered
• Assignment based only upon chance
• Treatment assignment bias minimized
• May not get intended treatment ratios
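Simple randomization with fixed probabilities can be sketched in a few lines. This is a minimal illustration, assuming the 2:2:1 ratio of the hypertension example (M = MK, H = HCTZ, C = Combination); the function name `simple_randomization` and the seed are hypothetical:

```python
import random
from collections import Counter

def simple_randomization(n_patients, treatments=("M", "H", "C"),
                         weights=(2, 2, 1), seed=None):
    """Assign each patient independently with fixed probabilities 2/5, 2/5, 1/5."""
    rng = random.Random(seed)
    return rng.choices(treatments, weights=weights, k=n_patients)

if __name__ == "__main__":
    schedule = simple_randomization(10, seed=11)
    print(schedule)
    # Counts are random, so the realized ratio need not be 4:4:2.
    print(Counter(schedule))
```

Because every assignment is an independent draw, repeated runs with different seeds show how often a small study misses the intended ratio.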
Fixed Probability Allocation Procedures
Blocked (restricted)
• Randomization within block
• Block size must be large enough given design
• Constant vs. variable block size
• Allocations at end of block may be deterministic
• Need to use full blocks of allocation numbers
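Blocked randomization can be sketched as a random permutation within each block. The example below assumes a constant block size of 5 that matches the 2:2:1 ratio (M = MK, H = HCTZ, C = Combination); the function name `blocked_randomization` and the seed are illustrative assumptions:

```python
import random

def blocked_randomization(n_blocks, block=("M", "M", "H", "H", "C"), seed=None):
    """Return a schedule built from independently shuffled full blocks."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        shuffled = list(block)
        rng.shuffle(shuffled)  # randomization happens within the block
        schedule.extend(shuffled)
    return schedule

if __name__ == "__main__":
    schedule = blocked_randomization(2, seed=5)
    print(schedule[:5], schedule[5:])
```

Every completed block contains exactly two M, two H, and one C, so the intended ratio holds only when full blocks are used; the last assignment in a block is determined by the four before it.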
Fixed Probability Allocation Procedures
Stratified (by covariate levels)
• Accommodates prognostic factors
• Separate randomization within levels of prognostic factor
• May not achieve intended treatment group ratios
• In small samples, tends to improve efficiency of estimators and power of statistical tests; less so for large samples
• Need to limit number of strata and number of levels in each stratum; focus on most important covariates
• Better understanding of disease process
• Generate new clinical hypotheses
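Stratified blocked randomization keeps a separate schedule for each level of the prognostic factor. A minimal sketch, assuming two strata (Black, Non-black) and blocks of 5 in a 2:2:1 ratio; the function name `stratified_blocked_randomization` and the seed are hypothetical:

```python
import random

def stratified_blocked_randomization(strata, blocks_per_stratum,
                                     block=("M", "M", "H", "H", "C"), seed=None):
    """Build an independent blocked schedule within each stratum."""
    rng = random.Random(seed)
    schedules = {}
    for stratum in strata:
        schedule = []
        for _ in range(blocks_per_stratum):
            # rng.sample with k = len(block) returns a random permutation
            schedule.extend(rng.sample(block, len(block)))
        schedules[stratum] = schedule
    return schedules

if __name__ == "__main__":
    for stratum, schedule in stratified_blocked_randomization(
            ["Black", "Non-black"], 1, seed=3).items():
        print(stratum, schedule)
```

Patients are assigned the next unused entry of their own stratum's schedule, so if a stratum stops enrolling partway through a block, the overall treatment ratio may not be achieved.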
Comparison of Fixed Probability Randomization Procedures
96
96
Yes
Yes
No
No
Adaptable to most study designs
Randomization procedure implies
usual statistical analysis
Improves Statistical Procedures
No
Usually
Probably
No
Yes
Yes
Yes
Yes
Easy to implement
Probably
Typically
No
Harder
Harder
Hard
Harder
Easy
Stratified
Unblinding of study
Blocked
Asymptotic Asymptotic Asymptotic
Simple
No
Systematic
Accommodates unknown
prognostic factors
Quality
Comparison of Fixed Probability Randomization Procedures
97
97
Variable Probability Allocation Procedures
•
Response Adaptive
- Treatment assignment depends
upon previous patients’ responses
- Useful for allocating more (or fewer)
patients to treatments achieving
desired (not desired) outcomes
- Some use in Clinical Pharmacology
for panels of subjects
•
Covariate Adaptive
- Treatment assignment targeted at
minimizing covariate imbalances
within treatment groups
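One way to picture a covariate adaptive procedure is a simple minimization rule: the next patient goes to the treatment that currently has the fewest patients sharing that patient's covariate levels. The sketch below is an assumption-laden illustration of that idea, not a procedure described in the slides; the function name `minimization_assign`, the covariate names, and the random tie-breaking rule are all hypothetical.

```python
import random

def minimization_assign(new_patient, assigned, treatments=("MK", "Placebo"), rng=None):
    """Pick the treatment that minimizes covariate imbalance for the new patient.

    new_patient: dict of covariate -> level for the incoming patient
    assigned:    list of (covariate dict, treatment) for patients already allocated
    """
    rng = rng or random.Random()
    scores = {}
    for trt in treatments:
        # Count prior patients on this treatment who match each covariate level
        scores[trt] = sum(
            1
            for covariates, prior_trt in assigned
            if prior_trt == trt
            for factor, level in new_patient.items()
            if covariates.get(factor) == level
        )
    best = min(scores.values())
    candidates = [trt for trt in treatments if scores[trt] == best]
    return rng.choice(candidates)  # ties are broken at random

if __name__ == "__main__":
    history = [({"race": "Black", "sex": "F"}, "MK"),
               ({"race": "Non-black", "sex": "M"}, "Placebo")]
    print(minimization_assign({"race": "Black", "sex": "F"}, history,
                              rng=random.Random(1)))
```

In practice such rules are usually combined with a random element so that assignments are not fully predictable.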
98
General Reading
Cox, D.R., Planning of Experiments, New York: John Wiley
and Sons, 1958.
Friedman, L.M., Furberg, C.D., DeMets, D.L., Fundamentals of
Clinical Trials, Littleton, MA.: PSG Publishing Company,
1985.
Hicks, C.R., Fundamental Concepts in the Design of
Experiments, Fort Worth: Saunders College Publishing,
1982.
Peterson, R.G., Design and Analysis of Experiments, New
York: Marcel Dekker, 1985.
Piantadosi, S., Clinical Trials - A Methodological Perspective,
New York: John Wiley and Sons, 1997
Pocock, S.J., Clinical Trials: A Practical Approach, New York:
John Wiley and Sons, 1983.
Rosenberger, W.F., Lachin, J.M., Randomization in Clinical
Trials - Theory and Practice, New York: John Wiley and
Sons, 2002.
Shapiro, S.H., Louis, T.A., editors, Clinical Trials: Issues and
Approaches, New York: Marcel Dekker, 1983.
99
Acknowledgements
•
Cindy White
100
Insight Into Initial
Sample Size Calculations
Thomas E. Bradstreet, Ph.D.
Merck Research Labs
Reinventing Patient Recruitment
and Retention
PEA Conference
Philadelphia, PA
December 2-3, 2004
101
Outline
• The Question
• Example Clinical Trial
• Random Sampling
• Hypothesis Testing (Difference)
• Two Sample Pooled Variance t-test
• Experimental planning
- Sample size
- Power
- Detectable difference
• Starting and Completing Sample Sizes
• Blocking Factors and Covariates
• Dropouts, Missing Data, Patient Drift
• Inadequate Sample Sizes and Scientific Inference
Outline
• Estimation (Difference)
- Sample size
- Precision
- Confidence level
• Similarity
- Sample size
- Probability of concluding similarity
- True difference/ratio
• Simulation
• Multiplicity
• Summary
• General Reading
• Acknowledgements
• Questions and Comments
The Question
•
Once study design and hypotheses
have been chosen, what types of
clinical, PD, PK, statistical, and other
clinical trials information are required
to estimate starting and completing
sample sizes?
104
Hypothesis Testing
                                                        Experimental Outcome
Truth                                                   Fail to reject H0       Reject H0
No difference between Drugs A, B (H0: μ_A = μ_B)        Good (1 − α)            False Positive (α)
Difference between Drugs A, B (H1: μ_A ≠ μ_B)           False Negative (β)      Power (1 − β)
Caution: Failing to reject the null hypothesis does not mean that you have proven it to be true.
Two Sample Pooled Variance t-Test
Calculation
$$T_{st} = \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{\left[ S_p^2 \left( 1/n_A + 1/n_B \right) \right]^{1/2}}$$
where
$$S_p^2 = \frac{(n_A - 1) S_A^2 + (n_B - 1) S_B^2}{n_A + n_B - 2}$$
and
T_st ~ central t distribution with n_A + n_B − 2 degrees of freedom under H0,
μ_A, μ_B = population means,
X̄_A, X̄_B = sample means,
S_A², S_B² = sample variances,
S_p² = pooled sample variance,
n_A, n_B = sample sizes.
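The calculation can be followed step by step in code. A minimal sketch using only the Python standard library; the function name `pooled_t_statistic` and the data in the usage example are hypothetical, and the p-value is not computed because the standard library has no t distribution:

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b, mu_diff=0.0):
    """Return (T_st, degrees of freedom) for the two sample pooled variance t-test.

    mu_diff is the hypothesized value of mu_A - mu_B under H0 (0 for "no difference").
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled variance: weighted average of the two sample variances
    s2_pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    standard_error = sqrt(s2_pooled * (1 / n_a + 1 / n_b))
    t_stat = ((mean(sample_a) - mean(sample_b)) - mu_diff) / standard_error
    return t_stat, n_a + n_b - 2

if __name__ == "__main__":
    # Hypothetical changes from baseline in SuDBP (mmHg), for illustration only
    drug_a = [-15, -12, -9, -17, -11, -14]
    drug_b = [-10, -8, -12, -6, -9, -11]
    t_stat, df = pooled_t_statistic(drug_a, drug_b)
    print(f"T = {t_stat:.2f} on {df} d.f.")
```

The statistic would then be compared with the critical value of the central t distribution on n_A + n_B − 2 degrees of freedom.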
Experimental Planning
• Set Type I error rate: α
• Appropriate sample sizes: n_A, n_B
• Appropriate power: 1 − β
• Differences
δ = μ_A − μ_B = True population difference
δ_CM = Clinically meaningful difference
δ_MD = Minimally detectable difference
Ideally: δ_MD ≤ δ_CM ≤ δ
Clinically Meaningful Difference (δ_CM)
• Clinical difference of interest
- Would act upon, meaningful result
• Example
- Drug A (μ_A) vs. Drug B (μ_B)
- Mean change from baseline (SuDBP12 − SuDBPBS)
- Important difference: δ_CM = 5 mmHg
- Ideally δ_CM ≤ δ
Fix
NA
Calculate
NA
Minimal detectable difference (dMD)
Sample size (n)
Calculate
State
Power (1 - b)
Estimate
Calculate
State
False negative rate (b)
Estimate
State
State
False positive rate (a)
Variability (s2)
Define
Sample
Size (n)
Calculate
State
State
Estimate
State
State
Define
109
Minimal
Detectable
Power
(1 - b) Difference (dMD)
Three Questions of Interest
Clinically meaningful difference (dCM) Define
109
Sample Size (n)
• Number of patients per treatment group, n
- To find a clinically meaningful difference, δ_CM
- Statistically significant, p < α
- Accepted false positive rate, α
- Accepted false negative rate, β
- Accepted power, 1 − β
- Variability (estimated), σ² (s²)
- Real difference, δ = μ_A − μ_B, between Drug A and Drug B
• Completing patients with useable data
Sample Size (n)
Example
• Clinically meaningful difference: δ_CM = ±5 mmHg
• False positive rate: α = .05
• False negative rate: β = .20
• Power: 1 − β = .80
• Estimated standard deviation of mean change from baseline: s = 9 mmHg
• True difference δ = μ_A − μ_B = ±5 mmHg between Drugs A and B in population
• Sample size n = 52 (completing) patients per treatment group (with useable data)
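The slides do not state which formula produced n = 52. As a hedged sketch, the familiar normal approximation n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ², plus a small additive correction of z²₁₋α/₂/4 that is sometimes used to approximate the t-test result, reproduces the example. The function name `sample_size_per_group` and the choice of this particular approximation are assumptions, not the author's stated method:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate completing sample size per group for a two-tailed,
    two sample pooled variance t-test.

    Uses the normal approximation with an additive z**2 / 4 correction
    (an assumed approximation to the exact t-based calculation).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_beta = z.inv_cdf(power)            # quantile matching the desired power
    n_normal = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return ceil(n_normal + z_alpha ** 2 / 4)

if __name__ == "__main__":
    # delta_CM = 5 mmHg, s = 9 mmHg, alpha = .05, power = .80
    print(sample_size_per_group(delta=5, sigma=9))
```

With the example inputs the function returns 52, matching the slide; exact software based on the noncentral t distribution may differ by a patient or so for other inputs.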
Some Relationships: Sample Size (n)
Sample size calculations: Drug A (μ_A) vs. Drug B (μ_B)
Clinically meaningful difference between Drugs A (μ_A) and B (μ_B) in mean change from baseline δ_CM = 5 mmHg, two-tailed test

                                    False Positive                   Power      Sample Size (n)
                                    Rate (α)        Variability (σ)  (1 − β)    Per Treatment Group
Original calculation                .05             ±9 mmHg          .80        52
Increase power (1 − β)              .05             ±9 mmHg          .95        86
Increase variability (σ)            .05             ±12 mmHg         .80        92
Increase false positive rate (α)    .10             ±9 mmHg          .80        41
Power (1 − β)
• Probability, 1 − β,
- Of finding a clinically meaningful difference, δ_CM
- Statistically significant, p < α
- Accepted false positive rate, α
- Fixed sample size per treatment, n
- Variability (estimated), σ² (s²)
- Real difference, δ = μ_A − μ_B, between Drug A and Drug B
Some Relationships: Power (1 − β)
Sample size calculations: Drug A (μ_A) vs. Drug B (μ_B)
Clinically meaningful difference between Drugs A (μ_A) and B (μ_B) in mean change from baseline δ_CM = 5 mmHg, two-tailed test

                                    False Positive                   Power      Sample Size (n)
                                    Rate (α)        Variability (σ)  (1 − β)    Per Treatment Group
Original calculation                .05             ±9 mmHg          .80        52
Decrease sample size (n)            .05             ±9 mmHg          .40        20
Increase variability (σ)            .05             ±12 mmHg         .55        52
Increase false positive rate (α)    .10             ±9 mmHg          .87        52
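Power for a fixed sample size can be approximated with the normal distribution. This is a rough sketch rather than the exact noncentral t calculation presumably behind the slide values, so results run one or two percentage points higher than those tabulated; the function name `approx_power` is hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Normal-approximation power of a two-tailed two sample test
    with n patients per group (the negligible opposite tail is ignored)."""
    z = NormalDist()
    standard_error = sigma * sqrt(2 / n)      # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(delta) / standard_error - z_crit)

if __name__ == "__main__":
    scenarios = {
        "Original calculation": dict(delta=5, sigma=9, n=52),
        "Decrease sample size": dict(delta=5, sigma=9, n=20),
        "Increase variability": dict(delta=5, sigma=12, n=52),
        "Increase false positive rate": dict(delta=5, sigma=9, n=52, alpha=0.10),
    }
    for label, kwargs in scenarios.items():
        print(f"{label}: power is roughly {approx_power(**kwargs):.2f}")
```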
Minimal Detectable Difference (δ_MD)
• Smallest true difference, δ_MD, detectable
- Statistically significant, p < α
- Accepted false positive rate, α
- Accepted false negative rate, β
- Accepted power, 1 − β
- Variability (estimated), σ² (s²)
- Fixed sample size per treatment, n
Some Relationships: Minimal Detectable Difference (δ_MD)
Detectable outcome calculations: Drug A (μ_A) vs. Drug B (μ_B)
Clinically meaningful difference between Drugs A (μ_A) and B (μ_B) in mean change from baseline δ_CM = 5 mmHg, two-tailed test

                              False Positive                   Power      Sample Size (n)        Minimal Detectable
                              Rate (α)        Variability (σ)  (1 − β)    Per Treatment Group    Difference (δ_MD)
Original calculation          .05             ±9 mmHg          .80        52                     5 mmHg
Decrease sample size (n)      .05             ±9 mmHg          .80        20                     8.2 mmHg
Increase power (1 − β)        .05             ±9 mmHg          .95        52                     6.5 mmHg
Increase variability (σ)      .05             ±12 mmHg         .80        52                     6.7 mmHg
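The minimal detectable difference can be approximated by solving the normal-approximation power equation for δ: δ_MD ≈ (z₁₋α/₂ + z₁₋β)·σ·√(2/n). The sketch below assumes this approximation (the slides do not state the method), so its results are slightly smaller than the t-based values in the table; the function name `approx_min_detectable_difference` is hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def approx_min_detectable_difference(sigma, n, alpha=0.05, power=0.80):
    """Normal-approximation minimal detectable difference for a two-tailed
    two sample test with n patients per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) * sigma * sqrt(2 / n)

if __name__ == "__main__":
    scenarios = {
        "Original calculation": dict(sigma=9, n=52),
        "Decrease sample size": dict(sigma=9, n=20),
        "Increase power": dict(sigma=9, n=52, power=0.95),
        "Increase variability": dict(sigma=12, n=52),
    }
    for label, kwargs in scenarios.items():
        d_md = approx_min_detectable_difference(**kwargs)
        print(f"{label}: roughly {d_md:.1f} mmHg")
```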
Starting and Completing Sample Sizes
• Must accommodate and/or take advantage of:
- Final ANOVA model degrees-of-freedom (d.f.)
- Effectiveness of blocking factors and covariates
- Dropouts
- Other missing data
- Patient drift
Blocking Factors and Covariates
• Randomization, “Poolers” vs. “Non-poolers”
• Blocking factors
- Ineffective: use up d.f. without reducing MSE; can reduce power; may increase completing sample size.
- Effective: use up d.f. but reduce SSE (MSE) at a proportionally faster rate; can increase power; may reduce completing sample size.
• Covariates
- Reduce SSE (MSE) while spending few d.f.; can increase power; may reduce completing sample size.
• Carefully consider original source of variance estimate.
Blocking Factors and Covariates

              Sample Size Given
              In Most Books:           Ineffective              Effective Blocking
              t-test Only              Blocking Factors         Factors and Covariates
Factor        SS      d.f.   MS        SS      d.f.   MS        SS      d.f.   MS
Clinic        --      --     --        0       7      0         SSC     7      MSC
Gender        --      --     --        0       1      0         SSG     1      MSG
Race          --      --     --        0       5      0         SSR     5      MSR
Age           --      --     --        --      --     --        SSA     1      MSA
Weight        --      --     --        --      --     --        SSW     1      MSW
Treatment     SST     1      MST       SST     1      MST       SST     1      MST
Error         SSE     102    MSE       SSE     89     MSE       SSE     87     MSE
Total         SSTOT   103              SSTOT   103              SSTOT   103
# Patients    104                                               <104 (usually)

NOTE: Carefully consider original sources of variance estimate.
Dropouts, Missing Data, Patient Drift
•
•
Dropouts, other missing data
- Amount, type, pattern
- Increase starting sample size
- Competing statistical analyses
Per protocol (NP)
Dropout (ND)
Intention-to-treat, LVCF (NLVCF)
NP ND  NLVCF
- Missing at random?
- Competing covariance structures
Patient drift
- Between and within subject sample
variances may be artificially larger.
120
Inadequate Sample Sizes, Hypothesis
Testing, and Scientific Inference
• Failure to reject null hypothesis, H0: μA = μB
- No true difference (δ = 0) between therapies?
- Lack of power, 1 - β?
  Underestimation of variance, σ²
  Overestimation of true difference (δMD ≥ δCM ≥ δ)
  Dropouts, missing data, patient drift, other issues not accounted for
• Lack of power
- Beneficial therapies wrongfully eliminated
- Little research value, confuses issues
• Adequate power
- Probably no clinically important true difference between therapies (δMD ≤ δCM ≤ δ)
• Proper interpretation of failure to reject null hypothesis is based upon careful consideration of power, 1 - β
121
Estimation
Three Questions of Interest
• Sample Size: What sample size, n, per treatment group do I need to estimate the true difference, δ = μA - μB, in mean change from baseline SuDBP to within δP mmHg, given that I want 100(1 - α)% confidence?
• Precision: Given a fixed sample size, n, per treatment group, with what precision, δP mmHg, can I estimate the true difference, δ = μA - μB, in mean change from baseline SuDBP, given that I want 100(1 - α)% confidence?
122
Estimation
Three Questions of Interest
• Confidence Level: Given a fixed sample size, n, per treatment group, with what 100(1 - α)% confidence can I estimate the true difference, δ = μA - μB, in mean change from baseline SuDBP to within δP mmHg?
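Each of the three estimation questions solves the same confidence-interval half-width relation for a different unknown. The sketch below assumes a known common standard deviation and a normal approximation, so that the half-width is z × σ × √(2/n); the function names and the 3 mmHg precision are illustrative and not taken from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()


def n_for_precision(sigma, precision, confidence=0.95):
    """Sample size per group so the CI half-width is at most `precision`."""
    z = Z.inv_cdf(0.5 + confidence / 2.0)
    return ceil(2.0 * (z * sigma / precision) ** 2)


def precision_for_n(sigma, n_per_group, confidence=0.95):
    """CI half-width achieved with a fixed sample size per group."""
    z = Z.inv_cdf(0.5 + confidence / 2.0)
    return z * sigma * sqrt(2.0 / n_per_group)


def confidence_for_n(sigma, n_per_group, precision):
    """Confidence level at which the CI half-width equals `precision`."""
    z = precision / (sigma * sqrt(2.0 / n_per_group))
    return 2.0 * Z.cdf(z) - 1.0


# Illustration: sigma = 9 mmHg as elsewhere in the slides, precision of 3 mmHg (hypothetical)
n = n_for_precision(sigma=9, precision=3)
print("n per group:", n)
print("precision at that n:", round(precision_for_n(9, n), 2), "mmHg")
print("confidence at that n:", round(confidence_for_n(9, n, 3), 3))
```

In each case the other quantities are fixed, stated, or estimated, and only one is calculated.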
123
Estimation: Three Questions of Interest

                            Sample      Precision   Confidence      False Positive   Variability
Question                    Size (n)    (δP)        Level (1 - α)   Rate (α)         (σ²)
Sample Size (n)             Calculate   Define      State           State            Estimate
Precision (δP)              Fix         Calculate   State           State            Estimate
Confidence Level (1 - α)    Fix         Define      Calculate       Calculate        Estimate
Similarity
• Two active areas
- Bioequivalence (PK  Clinical)
- Clinical equivalence (Clinical or PD)
• Bioequivalence (and Drug Interaction)
- AUC and Cmax
- Criteria: 0.80 < μA/μB < 1.25
• Two one-sided test (TOST) approach
- NOT same as usual frequentist testing
  H0: μA/μB ≤ 0.80 or μA/μB ≥ 1.25
  H1: 0.80 < μA/μB < 1.25
- Typically α1 = .05 and α2 = .05
- Algebraically equivalent to 90% confidence interval (CI) approach
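The equivalence between the two one-sided tests and the 90% confidence interval can be checked directly. This is a minimal sketch that works on the log scale and assumes a normal approximation with a known standard error; in practice t quantiles would be used. The function names are illustrative.

```python
from math import log
from statistics import NormalDist

LOWER, UPPER = log(0.80), log(1.25)   # bioequivalence limits on the log scale


def tost_rejects(log_ratio, se, alpha=0.05):
    """Two one-sided tests, each at level alpha."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    t_lower = (log_ratio - LOWER) / se   # tests H0: ratio <= 0.80
    t_upper = (UPPER - log_ratio) / se   # tests H0: ratio >= 1.25
    return t_lower > z_crit and t_upper > z_crit


def ci_within_limits(log_ratio, se, alpha=0.05):
    """Is the 100(1 - 2*alpha)% CI (90% when alpha = .05) inside the limits?"""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    lo = log_ratio - z_crit * se
    hi = log_ratio + z_crit * se
    return LOWER < lo and hi < UPPER


# Hypothetical estimates: observed ratio and standard error of the log ratio
for ratio, se in [(1.00, 0.05), (1.10, 0.05), (1.20, 0.05), (1.00, 0.15)]:
    print(ratio, se, tost_rejects(log(ratio), se), ci_within_limits(log(ratio), se))
```

Rearranging each one-sided test statistic gives exactly one of the two interval bounds, which is why the two functions always agree.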
125
Similarity
• Sample size calculations
  Case 1: μA/μB = 1 (equality)
  Case 2: μA/μB ≠ 1, 0.80 < μA/μB < 1.25 (equivalence), larger sample size
• Three Questions:
  Sample Size: What total sample size, n, do I need to have .90 probability that the 90% CI for the ratio of the true means, μA/μB, is contained in [0.80, 1.25] given that μA/μB = 1 (or some other value specified between 0.80 and 1.25)?
126
Similarity
• Probability of Concluding Bioequivalence: Given a fixed total sample size, n, what probability do I have that the 90% CI for the ratio of the true means, μA/μB, is contained in [0.80, 1.25] given that μA/μB = 1 (or some other value specified between 0.80 and 1.25)?
• True Ratio: Given a fixed total sample size, n, what are the smallest and the largest ratios of the true means, μA/μB, such that the 90% CI for the true ratio falls within [0.80, 1.25]?
127
Simulation
• Closed form sample size formulas
- Straightforward
- Difficult
- Do not exist
- Asymptotic approximations
• Stochastic (vs. deterministic)
• Wide range of sophistication
- Hypothesis tests or estimation
- Virtual patients, virtual clinical trial
- Assumptions are everything
128
Simulation
Two Sample Pooled Variance t-Test
Statistical Assumptions
• Random sampling
• Independent random samples
• Additive treatment effects
- change: additive
- % change, ratio: multiplicative (use log)
• Normality: $N(\mu_A, \sigma_A^2)$, $N(\mu_B, \sigma_B^2)$
• Homogeneity of variance: $\sigma_A^2 = \sigma_B^2 = \sigma^2$, but $\sigma^2$ unknown
• Add in PD, PK, clinical information and assumptions.
129
Simulation
Mock Sample Size Simulation: t-test
Original Situation and Calculation:
Drug A (μA) vs. Drug B (μB), δCM = δ = μA - μB = ±5 mmHg, σ = σA = σB = ±9 mmHg, α = .05, 1 - β = .80, two-tailed pooled variance t-test gives 52 per treatment group (sample size formula).
Simulation Results:

                   # Rejections of H0      Empirical
Situation    n     per 1000 Simulations    Power (%)
1            10    218                     21.8
2            20    402                     40.2
3            30    562                     56.2
4            40    689                     68.9
5            50    785                     78.5
6            60    855                     85.5
7            55    823                     82.3
8            53    809                     80.9
9            51    793                     79.3
10           52    801                     80.1
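A simulation of this kind can be sketched in a few lines. The version below is an illustration only: it draws normal responses with a true difference of 5 mmHg and a common standard deviation of 9 mmHg, applies a two-tailed pooled variance t-test, and counts rejections. The critical value comes from a Cornish-Fisher expansion of the t quantile so that only the standard library is needed; that shortcut, the seed, and the function names are assumptions of this sketch, so the counts will not reproduce the slide exactly.

```python
import random
from math import sqrt
from statistics import NormalDist


def t_critical(df, alpha=0.05):
    """Two-sided t critical value via a Cornish-Fisher expansion (approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return (z + (z**3 + z) / (4.0 * df)
            + (5.0 * z**5 + 16.0 * z**3 + 3.0 * z) / (96.0 * df**2))


def pooled_t(a, b):
    """Two-sample pooled variance t statistic."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))


def empirical_power(n, delta=5.0, sigma=9.0, alpha=0.05, n_sims=1000, seed=2003):
    """Fraction of simulated trials that reject H0 with n patients per group."""
    rng = random.Random(seed)
    crit = t_critical(2 * n - 2, alpha)
    rejections = 0
    for _ in range(n_sims):
        drug_a = [rng.gauss(delta, sigma) for _ in range(n)]
        drug_b = [rng.gauss(0.0, sigma) for _ in range(n)]
        if abs(pooled_t(drug_a, drug_b)) > crit:
            rejections += 1
    return rejections / n_sims


for n in (20, 52):
    print(f"n = {n}: empirical power = {100 * empirical_power(n):.1f}%")
```

Stepping n up and down until the empirical power settles near the target mirrors the search shown in the table.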
Simulation
Sample Size: Virtual Clinical Trials
• Allows for evaluation of competing trial designs and their corresponding sample size requirements (power and detectable outcomes, too).
• Generate virtual patients with virtual responses according to a proposed trial design taking into account:
- Covariate distribution models
  Height, weight, age, gender
  Renal function
  Concomitant drug use
  Correlation between covariates
Simulation
Sample Size: Virtual Clinical Trials
- Input-output models
Disease progression
PK/PD relationships
- Execution models
Adherence to dosing regimen
Dropouts
Other patient missing data
Measurement error
Lost samples
• Sensitivity analyses
• Assumptions are everything.
132
Multiplicity
•
•
Sample size increase may be needed
Multiple hypothesis testing (estimation)
within a study
- Multiple endpoints
- Multiple timepoints
- Multiple treatment regimens
- Multiple covariates
- Subgroup analyses (planned or
exploratory)
- Interim analyses (planned or
“administrative”)
- Multiple statistical methods
133
Multiplicity
• Potential inflation of familywise false positive rate

Number of Tests (k)    Probability of at Least
at α = .05             One False Positive (a)
1                      .05
2                      .098
3                      .143
4                      .185
5                      .226

(a) 1 - (1 - α)^k, assuming independent tests
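The inflation can be computed directly from the footnote formula. This short sketch assumes the k tests are independent, which is the setting in which 1 - (1 - α)^k holds exactly; the function name is illustrative.

```python
def familywise_error_rate(k, alpha=0.05):
    """Probability of at least one false positive among k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


for k in range(1, 6):
    print(f"k = {k}: {familywise_error_rate(k):.4f}")
```

With correlated endpoints the inflation is smaller, so these values act as an upper reference point for positively dependent tests.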
Multiplicity
•
When to increase sample size?
- Multiple hypothesis tests (estimations)
are joined by “At Least” or “Or”
Example: As compared to Placebo,
Drug A is effective for at least one
of the five endpoints.
Example: Drug B is effective either as
compared to Placebo or as
compared to Drug A.
135
Multiplicity
•
When not to increase sample size?
- A single hypothesis test (estimation)
- Multiple hypothesis tests
(estimations) are joined by “All” or
“And”.
Example: As compared to
Placebo, all 5 doses of Drug
A are effective.
Example: Both Drug A and Drug B
are effective as compared to
Placebo.
- Closed testing procedures where
the multiplicity issues can be
ordered/ranked.
Example: Dose response studies 136
Summary
•
•
•
•
•
•
•
•
•
•
The Question
Example Clinical Trial
Random Sampling
Hypothesis Testing (Difference)
Two Sample Pooled Variance t-test
Experimental planning
- Sample size
- Power
- Detectable difference
Starting and Completing Sample Sizes
Blocking Factors and Covariates
Dropouts, Missing Data, Patient Drift
Inadequate Sample Sizes and
Scientific Inference
137
Summary
•
•
•
•
Estimation (Difference)
- Sample size
- Precision
- Confidence level
Similarity
- Sample size
- Probability of concluding similarity
- True difference/ratio
Simulation
Multiplicity
138
General Reading
•
•
•
Brush, G.G. (1988). How to Choose
the Proper Sample Size, Volume 12,
Milwaukee, WI: American Society for
Quality Control.
Cohen, J. (1988). Statistical Power
Analysis for the Behavioral Sciences,
2nd edition, Hillsdale, NJ: Lawrence
Erlbaum Associates.
Desu, M.M. and D. Raghavarao (1990).
Sample Size Methodology, New York:
Academic Press.
139
General Reading
Kimko, H.C. and S.B. Duffull (2003).
Simulation for Designing Clinical Trials
– A Pharmacokinetic-Pharmacodynamic Modeling Perspective, New
York: Marcel Dekker.
Machin, D. and M.J. Campbell (1987).
Statistical Tables for the Design of
Clinical Trials, Boston: Blackwell
Scientific Publications.
Odeh, R.E. and M. Fox (1991). Sample
Size Choice – Charts for Experiments
with Linear Models, 2nd edition, New
York: Marcel Dekker.
140
General Reading
Parker, R.A. and N.G. Berman (2003).
“Sample Size: More Than
Calculations”, The American
Statistician, 57(3):166-170.
Senn, S. (1997). Statistical Issues in Drug
Development, New York, John Wiley
and Sons, Chapter 13.
Zar, J.H. (1999). Biostatistical Analysis,
4th ed., Upper Saddle River, NJ:
Prentice-Hall, 122-136, and other
sections on sample size, power, and
detectable outcomes.
141
Acknowledgements
•
•
Cindy White
Laurie Rittle
142
Questions and Comments
143
Backup Slides
144
Random Sampling
Experimentation
•
Gather information
- Target population (truth)
- Random sample (experiment)
•
Reach decision about target population
- Uncertainty and variability
- Probability of correct decision: large
- Probability of wrong decision: small
145
Random Sampling
•
Response (random) variable
- Patient medical characteristic
- Example: Change in SuDBP from
baseline to Week 12
•
Parameter
- Describes medical characteristic in
target population (fixed true value)
Summary
Population
Parameter
Mean
m
Variance
s2
Standard Deviation
s
146
Random Sampling
•
Statistic
- Describes medical characteristic in
random sample
- Sample statistic estimates population
parameter (realizations observed in
study)
Population
Summary Parameter
Sample
Statistic
Observed
Mean
m
X
x
Variance
s2
S2
s2
Standard
Deviation
s
S
s
147
Null and Alternative Hypotheses
• Null hypothesis, H0:
- Statement about target population to reject
- Example:
  H0: μA = μB
  Mean change Drug A = mean change Drug B
- Reject / fail to reject null hypothesis
  Cannot prove true
• Alternative hypothesis, H1:
- Statement about target population to assume once null hypothesis rejected
- Example:
  H1: μA ≠ μB
  Mean change Drug A ≠ mean change Drug B
148
Experimentation vs. Truth
                                                   Experimental Outcome
Truth                                              Fail to reject H0     Reject H0
No difference between Drugs A, B (H0: μA = μB)
Difference between Drugs A, B (H1: μA ≠ μB)
149
False Positive Result
                                                   Experimental Outcome
Truth                                              Fail to reject H0     Reject H0
No difference between Drugs A, B (H0: μA = μB)                           False Positive (α)
Difference between Drugs A, B (H1: μA ≠ μB)
150
False Positive Result
• Sample results reject null hypothesis, H0: μA = μB
• In truth, no difference between Drug A (μA) and Drug B (μB) in target population, μA = μB
• Falsely assume true difference between Drug A (μA) and Drug B (μB)
• False positive (Type I error) rate, α
- Accepted risk of false positive
- Traditionally, α = .05
151
False Negative Result
                                                   Experimental Outcome
Truth                                              Fail to reject H0     Reject H0
No difference between Drugs A, B (H0: μA = μB)                           False Positive (α)
Difference between Drugs A, B (H1: μA ≠ μB)        False Negative (β)
152
False Negative Result
• Sample results fail to reject null hypothesis, H0: μA = μB
• In truth, difference between Drug A (μA) and Drug B (μB) in target population, μA ≠ μB
• Falsely assume no true difference between Drug A (μA) and Drug B (μB)
• False negative (Type II error) rate, β
- Accepted risk of false negative
- Traditionally, β = .05, .10, or .20
153
Power
                                                   Experimental Outcome
Truth                                              Fail to reject H0     Reject H0
No difference between Drugs A, B (H0: μA = μB)                           False Positive (α)
Difference between Drugs A, B (H1: μA ≠ μB)        False Negative (β)    Power (1 - β)
154
Power
• Probability of rejecting null hypothesis, H0: μA = μB
• In truth, difference between Drug A (μA) and Drug B (μB) in target population, μA ≠ μB
• Correctly conclude difference between Drug A (μA) and Drug B (μB)
• Power = complement of false negative rate, 1 - β
- Traditionally, 1 - β = .95, .90, .80
155
The Last Situation
                                                   Experimental Outcome
Truth                                              Fail to reject H0     Reject H0
No difference between Drugs A, B (H0: μA = μB)     Good (1 - α)          False Positive (α)
Difference between Drugs A, B (H1: μA ≠ μB)        False Negative (β)    Power (1 - β)

Caution: Failing to reject the null hypothesis does not mean that you have proven it to be true.
156
Two Sample Pooled Variance t-Test
Normality: $N(\mu_A, \sigma_A^2)$, $N(\mu_B, \sigma_B^2)$
• If nonnormally distributed populations:
• Type I Error
- Usually, not always, inflated
- Detection of differences between populations other than in means
- Should not use small (α ≤ .01) Type I error rates
• Power
- Skewed: one-tailed tests problematic; fewer problems with two-tailed tests
- Platykurtic: (less central) less power
- Leptokurtic: (more central) more power
157
Two Sample Pooled Variance t-Test
Normality: $N(\mu_A, \sigma_A^2)$, $N(\mu_B, \sigma_B^2)$
• In general, penalties for nonnormality decrease when sample sizes, nA and nB,
- Increase (larger nA and nB)
- Are (approximately) equal: nA = nB
• Distribution-free methodology
- Wilcoxon Rank Sum Test
- Does not require normality
- Different, but related hypothesis
- Still requires equality of scale (contrary to many introductory level statistics books and naïve opinion)
158
Two Sample Pooled Variance t-Test
Homogeneity of Variance: $\sigma_A^2 = \sigma_B^2 = \sigma^2$
• Heteroscedasticity: $\sigma_A^2 \neq \sigma_B^2$
- Variance population A ≠ variance population B
• Type I error rates not maintained at α
- Inflated (too high)
- Conservative (too low)
- Simulation study results follow
159
Observed Type I Error Rates, α̂: Smaller Sample Sizes
[Figure: four panels, (a) to (d), of observed Type I error rate (vertical axis from 0 to .1) against sample sizes nA, nB = (5, 10), (5, 5), (10, 5), for variances (σ²A, σ²B) = (9, 90) and (9, 18) at nominal α = .05 and α = .01. Legend: ST, STRK, AW, AWRK, WX, Nominal.]
160
Observed Type I Error Rates, α̂: More Moderate Sample Sizes
[Figure: four panels, (a) to (d), of observed Type I error rate (vertical axis from 0.00 to 0.10) against sample sizes nA, nB = (20, 40), (20, 20), (40, 20), for variances (σ²A, σ²B) = (9, 90) and (9, 18) at nominal α = .05 and α = .01. Legend: ST, STRK, AW, AWRK, WX, Nominal.]
161
Observed Type I Error Rates, α̂: Larger Sample Sizes
[Figure: four panels, (a) to (d), of observed Type I error rate (vertical axis from 0 to .1) against sample sizes nA, nB = (50, 100), (50, 50), (100, 50), for variances (σ²A, σ²B) = (9, 90) and (9, 18) at nominal α = .05 and α = .01. Legend: ST, STRK, AW, AWRK, WX, Nominal.]
Less
Less
More
Least
Least
Most
Complexity
Cost
Individual patient
management
Least
Some
Some
Less
More
Most
Bias
Quality
Double
Blind
Single
Blind
Open-Label
(Unblinded)
Least
More
More
Least
Triple
Blind
Comparison of Blinding Strategies
Some
Most
Most
Partial
Combined
Strategy
163
Questions and Comments
164
Allocation Procedures
Questions and Comments
167
Part II
Principles of Randomization
with Applications to
Allocation Schedules
and IVRS
168
Block Randomization
Block Randomized Trials
Block randomization: procedure whereby
randomization occurs within subsets of
the total number of allocation numbers
(ANs).
Block: range of consecutive ANs
guaranteed to contain the exact ratio of
treatment regimens specified in the
protocol.
Block Size*: number of ANs in a block.
*also called Blocking Factor
170
Block Randomized Trials
Example: 2 treatment regimens (A and B)
1:1 planned trt. regimen ratio
Block Size: 4
Total number of ANs: 16
Allocation Schedule
Block AN
1 00001
00002
00003
00004
2
00005
00006
00007
00008
Trt.
Trt.
Regimen Block AN Regimen
3 00009
A
A
00010
B
B
00011
B
A
00012
A
B
B
A
A
B
4
00013
00014
00015
00016
B
B
A
A
Note that each block has a 1:1 ratio of A to B.
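A block-randomized allocation schedule of this shape could be generated as follows. This is a minimal sketch, not the procedure used to produce the schedule above: the function name `blocked_schedule`, the five-digit AN format, and the seed are illustrative assumptions.

```python
import random


def blocked_schedule(n_blocks, block_size=4, ratio=None, seed=None):
    """Return (block, AN, treatment) rows with the exact ratio inside every block."""
    ratio = ratio or {"A": 1, "B": 1}
    unit = sum(ratio.values())
    if block_size % unit != 0:
        raise ValueError("block size must be a multiple of the sum of the ratio")
    rng = random.Random(seed)
    # One block's worth of treatments in the protocol-specified ratio
    template = [trt for trt, weight in ratio.items()
                for _ in range(weight * block_size // unit)]
    schedule, an = [], 1
    for block in range(1, n_blocks + 1):
        treatments = template[:]
        rng.shuffle(treatments)          # randomize order within the block only
        for trt in treatments:
            schedule.append((block, f"{an:05d}", trt))
            an += 1
    return schedule


# 2 regimens, 1:1 ratio, block size 4, 16 ANs in total
for block, an, trt in blocked_schedule(n_blocks=4, block_size=4, seed=1):
    print(block, an, trt)
```

Because shuffling happens inside each block, every run of four consecutive ANs that forms a block contains two A and two B assignments regardless of the seed.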
171
Why Use Block Randomization?
•
Ensure (approximate) proper treatment
regimen balance at any point in the
trial
– Allow valid interim analyses while
the trial is ongoing
– Account for patient drift in long trials
(patients who enroll early in the trial
often differ from those who enroll
late)
•
Support blinding
172
Why Use Block Randomization?
•
•
Ensure (approximate) proper treatment
regimen balance within
– each study center
– important subgroups, e.g.,
race
disease severity
gender
age
Ease logistics of supply management
and shipping
.
173
Why Use Block Randomization?
Note: Technically, all randomized studies
are block randomized. If no smaller
block size is specified, then the block
size = the total number of ANs.
In this case, there is a single block that
contains all of the allocation numbers.
174
How to Choose the Block Size
• Block size must be a multiple of the sum of the trt. regimen ratio (expressed in lowest terms), e.g.,

Number of Trt.   Ratio of Trt.   Sum of Trt.       Possible
Regimens         Regimens        Regimen Ratio     Block Sizes
2                1:1             1+1=2             2, 4, 6, ...
                 2:1             2+1=3             3, 6, 9, ...
                 3:1             3+1=4             4, 8, 12, ...
3                1:1:1           1+1+1=3           3, 6, 9, ...
                 1:2:1           1+2+1=4           4, 8, 12, ...
                 2:2:3           2+2+3=7           7, 14, 21, ...
                 5:1:1           5+1+1=7           7, 14, 21, ...
4                1:1:1:1         1+1+1+1=4         4, 8, 12, ...
                 1:2:2:1         1+2+2+1=6         6, 12, 18, ...
                 1:3:3:1         1+3+3+1=8         8, 16, 24, ...
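The rule in the table can be expressed as a small helper: reduce the ratio to lowest terms, sum it, and list multiples of that sum. The function name `possible_block_sizes` is illustrative.

```python
from functools import reduce
from math import gcd


def possible_block_sizes(ratio, how_many=3):
    """First few valid block sizes for a treatment regimen ratio such as (2, 2, 3)."""
    divisor = reduce(gcd, ratio)                   # reduce the ratio to lowest terms
    unit = sum(part // divisor for part in ratio)  # sum of the reduced ratio
    return [unit * m for m in range(1, how_many + 1)]


for ratio in [(1, 1), (2, 1), (2, 2, 3), (1, 3, 3, 1)]:
    print(":".join(map(str, ratio)), possible_block_sizes(ratio))
```

Reducing first matters: a ratio written as 2:4:2 is 1:2:1 in lowest terms, so its smallest block size is 4 rather than 8.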
175
How to Choose the Block Size
• If block size is too small, it can lead to unblinding

Example
Number of trt. regimens:   2
Trt. regimen ratio:        1:1
Block size:                2

Allocation Schedule
Block   AN      Trt. Regimen
1       00001   A
        00002   B
2       00003   B
        00004   A

If subject 00001 is unblinded (due to lab test, SAE, etc.), subject 00002 is unblinded as well!
176
How to Choose the Block Size
• If block size is too large, it can lead to trt. regimen imbalance.

Example
Number of trt. regimens:   2
Trt. regimen ratio:        1:1
Block size:                8

Allocation Schedule
Block   AN      Trt. Regimen
1       00001   B
        00002   B
        00003   A
        00004   B
        00005   B    <- If interim analysis occurs here, the ratio of A:B will be 1:4
        00006   A
        00007   A
        00008   A
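The imbalance at an interim look can be tracked with running counts over the block. A small sketch, using the treatment order from the example above; the function name `running_counts` is illustrative.

```python
def running_counts(treatments):
    """Cumulative (number on A, number on B) after each allocation."""
    counts = {"A": 0, "B": 0}
    history = []
    for trt in treatments:
        counts[trt] += 1
        history.append((counts["A"], counts["B"]))
    return history


block = ["B", "B", "A", "B", "B", "A", "A", "A"]   # block of size 8 from the example
for an, (n_a, n_b) in enumerate(running_counts(block), start=1):
    print(f"after AN {an:05d}: A = {n_a}, B = {n_b}")
```

Balance is only guaranteed at the end of the block, so the larger the block, the further the running counts can drift apart before then.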
177
Questions and Comments
178
Generating the
Allocation Schedule
Stratified Randomization
Stratified Randomization
Stratified randomization: A technique
used to ensure approximate trt. regimen
balance with respect to prospectively
identified factors that may be related to
the outcome.
181
Stratified Randomization
Why use stratified randomization?
•
•
•
Reduce variability due to major factors
that can influence the response
Strengthen the validity of the comparison
between trt. regimens
Simplify logistics of packaging and/or
shipping
182
Stratified Randomization
Terminology
Stratification factor: Variable potentially
related to the outcome of interest.
Two types of stratification factors:
•
•
Biological
- Race
- Medical Condition
- Gender
- Age
Non-biological
- Study Center
183
Stratified Randomization
Terminology
Clearly, biological factors may be related to a
biological outcome.
Why do we often stratify on study center?
•
•
•
Study conduct may differ among centers
Centers often enter and/or leave the
study at different times
Important biological factors that are either
unknown or difficult to measure may differ
in different populations
Thus, there may be differences among study
centers that are related to the outcome.
184
Stratified Randomization
Terminology
Stratification factor level: Specific value
of a stratification factor.
Examples

Stratification Factor   Stratification Factor Levels
Study Center            Centers 1, 2, 3
Race                    White, Hispanic, Black, Other
Medical Condition       Diabetic, Non-Diabetic
Gender                  Male, Female
Age                     0 to 18, 19 to 35, 36 to 50
185
Stratified Randomization
Terminology
Stratum
•
•
Specific combination of stratification
factor levels if there are multiple
stratification factors
Same as stratification factor level if
there is only one stratification factor
186
Stratified Randomization
Terminology
Example: 2 Stratification Factors
Factor 1: Study Center, with 3 levels:
- Center 1, Center 2, and Center 3
Factor 2: Medical condition, with 2 levels:
- Diabetic, Non-Diabetic
This produces 3 x 2 = 6 strata in the study:

                                   Study Center
Medical Condition     Center 1     Center 2     Center 3
Diabetic              Stratum 1    Stratum 3    Stratum 5
Non-Diabetic          Stratum 2    Stratum 4    Stratum 6
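Strata are simply the Cartesian product of the stratification factor levels, which can be enumerated as below. A minimal sketch; the numbering convention (center varying slowest) is chosen to match the example.

```python
from itertools import product

centers = ["Center 1", "Center 2", "Center 3"]
conditions = ["Diabetic", "Non-Diabetic"]

# Number the strata so that medical condition varies fastest within each center
strata = {number: combo
          for number, combo in enumerate(product(centers, conditions), start=1)}

for number, (center, condition) in strata.items():
    print(f"Stratum {number}: {center}, {condition}")
```

Adding a third factor multiplies the count again, which is why the number of strata grows quickly as factors are added.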
187
Stratified Randomization
How is stratified randomization performed?
•
•
One or more complete blocks of ANs are
assigned to each stratum
Thus, the treatment regimen ratio is correct
in each stratum
Note: It is not necessary for each stratum to
have the same number of blocks of
ANs
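Assigning complete blocks to each stratum can be sketched as follows. This is an illustration under assumptions: a 1:1 ratio of regimens A and B, ANs numbered consecutively across strata, and a hypothetical disease-severity factor whose strata receive 1, 1, and 2 blocks. The function name and seed are not from the slides.

```python
import random


def stratified_schedule(blocks_per_stratum, block_size=2, seed=None):
    """Return (stratum, block, AN, treatment) rows; each stratum gets whole blocks."""
    if block_size % 2 != 0:
        raise ValueError("a 1:1 ratio needs an even block size")
    rng = random.Random(seed)
    schedule, an, block = [], 1, 1
    for stratum, n_blocks in blocks_per_stratum.items():
        for _ in range(n_blocks):
            treatments = ["A", "B"] * (block_size // 2)
            rng.shuffle(treatments)      # randomize within the block
            for trt in treatments:
                schedule.append((stratum, block, f"{an:05d}", trt))
                an += 1
            block += 1
    return schedule


schedule = stratified_schedule({"Mild": 1, "Moderate": 1, "Severe": 2}, seed=7)
for row in schedule:
    print(*row)
```

Because only complete blocks are handed to a stratum, the treatment regimen ratio holds within every stratum even when strata receive different numbers of blocks.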
188
Example 3:
1:1 Randomization,
Single Stratification Factor
Number of trt. regimens              2
Trt. regimen ratio                   1:1
Block size                           2
Stratification factor 1              Disease severity
Levels                               Mild, Mod., Sev.
Ratio of blocks (Mild:Mod:Sev)       1:1:2
Number of ANs                        8
Range of ANs                         00001 to 00008
189
Example 3:
1:1 Randomization,
Single Stratification Factor
The allocation schedule you receive might contain the following randomization:

Stratum          Block   AN      Trt. Regimen
1 (Mild)         1       00001   B
                         00002   A
2 (Moderate)     2       00003   A
                         00004   B
3 (Severe)       3       00005   A
                         00006   B
                 4       00007   A
                         00008   B
190
Example 3:
1:1 Randomization,
Single Stratification Factor
All potential randomizations
[Table: the 16 potential randomizations (numbered 1 to 16) of ANs 00001 to 00008 across Stratum 1 (Mild, Block 1), Stratum 2 (Moderate, Block 2), and Stratum 3 (Severe, Blocks 3 and 4). Each block of two ANs is either A, B or B, A.]
* Potential randomization 9 is the one displayed on the previous slide.
191
Example 4
1:1 Randomization, Two Stratification Factors
Number of trt. regimens                  2
Trt. regimen ratio                       1:1
Block size                               2
Stratification factor 1*                 Study Center
Levels                                   Center 1, Center 2
Ratio of blocks (Center 1:Center 2)      2:1
Stratification factor 2*                 Disease Severity
Levels                                   Mild, Mod., Sev.
Ratio of blocks (Mild:Mod:Sev)           1:1:2
Number of ANs                            24
Range of ANs                             Center 1: 01001 to 01016
                                         Center 2: 02001 to 02008
*2 centers x 3 severity categories = 6 strata
There are 4096 possible randomizations!
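The count of possible randomizations follows from multiplying the number of orderings available in each block. A short sketch for a 1:1 ratio; the function name is illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def n_randomizations(n_blocks, block_size=2):
    """Number of distinct schedules when every block holds a 1:1 ratio of A and B."""
    per_block = comb(block_size, block_size // 2)   # ways to place the A's in one block
    return per_block ** n_blocks


print(n_randomizations(4))    # 8 ANs in 4 blocks of size 2
print(n_randomizations(12))   # 24 ANs in 12 blocks of size 2
```

With block size 2 each block contributes a factor of 2 (A, B or B, A), so 12 blocks give 2 to the 12th power.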
192
Example 4
1:1 Randomization, Two Stratification Factors
The allocation schedule you receive might contain the following randomization:

         Disease
Center   Severity   Stratum   Block   AN      Trt. Regimen
1        Mild       1         1       01001   A
                                      01002   B
                              2       01003   A
                                      01004   B
         Moderate   2         3       01005   A
                                      01006   B
                              4       01007   A
                                      01008   B
         Severe     3         5       01009   A
                                      01010   B
                              6       01011   A
                                      01012   B
                              7       01013   A
                                      01014   B
                              8       01015   A
                                      01016   B
2        Mild       4         9       02001   A
                                      02002   B
         Moderate   5         10      02003   A
                                      02004   B
         Severe     6         11      02005   A
                                      02006   B
                              12      02007   B
                                      02008   A
193
Example 4
1:1 Randomization, Two Stratification Factors
[Table: potential randomizations 1, 2*, 3, ..., 4096 for the same centers, disease severity strata, blocks, and ANs as on the previous slide. Each block of two ANs is either A, B or B, A; potential randomization 1 assigns A, B in every block and potential randomization 4096 assigns B, A in every block.]
194