Analysis of Stability Data with
Equivalence Testing for Comparing New
and Historical Processes Under Various
Treatment Conditions
Ben Ahlstrom, Rick Burdick, Laura Pack, Leslie Sidor
Amgen Colorado, Quality Engineering
May 19, 2009
Agenda
1. Purpose of comparability for stability data
2. Problems with the p-value approach
3. Equivalence approach and acceptance criteria methods
4. Example
Example Data
Packaging Data (Chow, Statistical Design and Analysis of Stability Studies, p. 116, Table 5.6)
– 2 package types (Bottle, Blister)
– 10 lots (5 for each package type)
– 6 time points (0 to 18 months)
[Figure: Percent vs. Time (Months) for the Blister and Bottle lots]
Comparability Analysis for Stability Data
Purpose
– Compare the rates of degradation
P-value Analysis Steps
– Fit the regression lines (process*time interaction)
– Calculate the p-value for process*time
– Compare the p-value to α = 0.05
– Draw a conclusion about comparability
• pass (comparable) if p-value > 0.05
• fail (not comparable) if p-value < 0.05
That is, evaluate the slopes of the treatment conditions (a minimal code sketch follows).
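To make the p-value workflow concrete, here is a minimal Python sketch using statsmodels; the synthetic DataFrame, column names, slopes, and noise levels are illustrative assumptions and are not the Chow packaging data.

```python
# Sketch of the p-value approach (illustrative synthetic data, not the Chow data set).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
rows = []
for package, slope in [("Bottle", -0.29), ("Blister", -0.28)]:   # assumed slopes
    for lot in range(5):                                          # 5 lots per package
        intercept = 102 + rng.normal(0, 0.5)                      # lot-to-lot variation
        for month in (0, 3, 6, 9, 12, 18):                        # assumed time points
            rows.append({"package": package, "lot": f"{package}-{lot}", "time": month,
                         "percent": intercept + slope * month + rng.normal(0, 0.3)})
df = pd.DataFrame(rows)

# Fit separate regression lines via a package*time interaction and test the interaction.
fit = smf.ols("percent ~ C(package) * time", data=df).fit()
print(anova_lm(fit))  # read the p-value on the C(package):time row
# "Pass" (comparable) if that p-value > 0.05 -- the practice questioned in this talk.
```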
P-value Analysis to Evaluate Comparability for Stability Data
[Figure: Percent label claim vs. time (months) for the Blister and Bottle lots]
Bottle vs. Blister: Are the processes comparable?
P-value Approach
Hypotheses
– H0: slopes are comparable
– HA: slopes are not comparable
If p-value < 0.05, reject H0
If p-value > 0.05, fail to reject H0
– This does not imply the slopes are comparable, only that there is not enough evidence to say the slopes are different
P-value Analysis to Evaluate Comparability for Stability Data
Packaging: Bottle vs. Blister
Do we pass or fail the p-value test?
Pass: p = 0.8453
[Figure: Percent label claim vs. time (months) for the Blister and Bottle lots]
We compare the slopes using p-values (pass if p-value > 0.05, fail if p-value < 0.05)
Problems with P-value Approach
Reporting a P-value only tells us something about
statistical significance.
– A statistically significant difference in slopes does not
necessarily have any practical importance relative to patient
safety or efficacy.
– P-values are non-informative because they do not quantify
the difference in slopes in a manner that allows scientific
interpretation of practical importance.
– A p-value approach provides a disincentive to collect more
data and learn more about a process.
Equivalence Testing Method
1. Fit the model with all historical and new process data (includes different storage conditions, orientations, SKUs, container types)
2. Compute the difference in slopes for the desired comparison (e.g., Bottle vs. Blister)
3. Compute the 95% one-sided confidence limits around the difference observed over the time frame of interest
4. If the confidence limits are enclosed by the equivalence acceptance criteria, conclude that the historical and new processes are comparable (see the sketch below)
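A minimal numeric sketch of steps 2-4, assuming a hypothetical slope-difference estimate and standard error (in practice both come from the mixed model on the next slide). The normal quantile is a simplification of the model-based one-sided limits, and the EAC value is the Method 1 goal post derived later in the deck.

```python
# Steps 2-4 with hypothetical inputs; delta_hat and se_hat would come from the step-1 model.
from scipy.stats import norm

delta_hat = 0.011   # hypothetical estimated slope difference (% label claim per month)
se_hat = 0.057      # hypothetical standard error of that difference
eac = 0.2722        # Method 1 acceptance criterion (% per month), from later slides

# 95% one-sided confidence limits (normal approximation for this sketch).
z = norm.ppf(0.95)
lower, upper = delta_hat - z * se_hat, delta_hat + z * se_hat

print(f"one-sided 95% limits: ({lower:.4f}, {upper:.4f})")
print("Comparable" if (lower > -eac and upper < eac) else "Equivalence not demonstrated")
```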
Statistical Model

\[ Y_{ijk} = \alpha_i + a_j + \beta_i X_{ijk} + \varepsilon_{ijk} \]

Parameters α_i and β_i are the overall regression parameters for the ith process
Random variables a_j allow the intercepts to vary for each lot
X_ijk is the time value for process i, lot j, and time k
Model can be extended to more levels
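One way to fit a model of this form is a linear mixed model with a random intercept for each lot. The sketch below uses statsmodels MixedLM and assumes the illustrative DataFrame `df` built in the earlier synthetic-data sketch; it is not the analysis code used by the authors.

```python
# Mixed model with fixed process intercepts/slopes and a random intercept per lot,
# reusing the illustrative DataFrame `df` from the earlier sketch.
import statsmodels.formula.api as smf

model = smf.mixedlm("percent ~ C(package) * time",  # alpha_i and beta_i per process
                    data=df,
                    groups=df["lot"])               # a_j: random intercept for each lot
result = model.fit(reml=True)
print(result.summary())
# The C(package)[T.Bottle]:time coefficient estimates the slope difference; its
# standard error feeds the one-sided confidence limits used for equivalence testing.
```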
Statistical Equivalence Acceptance Criteria (EAC)
[Figure: goal-post diagram]
– Goal post: the space of expected historical performance
– Football: 95% one-sided confidence limits around the difference between slopes over the time frame of interest
Methods to Calculate Equivalence Acceptance
Criteria (EAC)
Equivalence Acceptance Criteria (EAC) provide a
definition of practical importance
The scientific client has the responsibility to determine a definition of practical importance (based on science, safety, specifications, regulatory commitments, etc.)
Statistical methods can help establish a starting point for these decisions
Three statistical methods include:
– Method 1: Common cause variability
– Method 2: Excursion from product specification
– Method 3: Historic variability of slope estimates
3 Statistical Approaches for Defining EAC

Method 1: EAC based on common cause variability of the historic process
– Acceptable difference in slopes is θ = K/T
– EAC is expressed as average change in response per month
[Figure: response vs. time (0 to T months)]

Method 2: EAC based on product specification
– Acceptable difference in slopes is θ = K/E (E = expiry)
– Requires a specification
– EAC is expressed as average change in response per month
[Figure: distributions of the historic and new processes at expiry relative to the spec (LSL), showing the Pth lower percentile centered at the historic mean (P = probability of excursion), the mean of the historical process at expiry, the Pth lower percentile centered at the new mean, and the allowable shift K]

Method 3: EAC based on historic variability of slope estimates
– Requires at least 3 different lots in the historic data set
– EAC is expressed as change in response per month
[Figure: response vs. time (0 to T months)]
Comparability in Profile Data
[Figure: quality attribute vs. time (months), 0 to T, for reference condition A and new condition B]
– Difference between intercepts at t = 0
– B − A: total difference between conditions at time T (intercept and slope)
– Difference in response averages attributed to the difference in slopes: δ·T
EAC Method 1: Common Cause Variability
Criteria are based on historical performance at various conditions

\[ \theta_1 = \frac{2\sqrt{\sigma^2_{Lot} + \sigma^2_{e}}}{T} \]

– σ²_Lot is the lot-to-lot variability; σ²_e is the measurement variability
– The multiplier of 2 is aligned with other statistical limits used to separate random noise from a true signal
– The goal post is the space of expected historical performance
EAC Method 1: Common Cause Variability

\[ \theta_1 = \frac{2\sqrt{\sigma^2_{Lots} + \sigma^2_{Error}}}{T} \]

– The quantity √(σ²_Lots + σ²_Error) is unknown; replace it with a 95% upper bound on this quantity
– T = Expiry = 18 months

\[ \theta_1 = \frac{2(2.4498)}{18} = 0.2722\ \%\ \text{per month} \]
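As a quick check, the Method 1 arithmetic as a Python sketch using only the numbers on this slide:

```python
# Method 1 EAC from the slide's numbers.
upper_bound_sd = 2.4498   # 95% upper bound used in place of sqrt(var_Lots + var_Error)
T = 18                    # expiry, months

theta_1 = 2 * upper_bound_sd / T
print(f"theta_1 = {theta_1:.4f} % per month")   # 0.2722
```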
Percent Label Claim, P-value Approach vs. Equivalence Test

                                  P-value     Equivalence
Slope Bottle                      -0.2892     -0.2892
Slope Blister                     -0.2783     -0.2783
P-value                           0.8453      NA
Slope difference over 18 months   NA          -0.08267 to 0.1046
Goal Post                         NA          +/-0.2722
Result                            PASS        PASS

Key point: slope estimates are the same for both approaches.

[Figure: percent label claim vs. time (months) for the Blister and Bottle lots]
[Figure: equivalence graph of the difference in slopes, with goal posts at -0.2722 and +0.2722]
EAC Method 2:
Product Specification
Maximum allowable difference in slopes such that the new and historic processes have an excursion rate below P at expiry
Typically P = 0.01, 0.025, or 0.05
Uses historic data
Relates comparability to the specification
EAC Method 2: Product Specification
[Figure: distributions of the historic and new processes at expiry relative to the spec (LSL), showing the Pth lower percentile centered at the historic mean (P = probability of excursion), the mean of the historical process at expiry, the Pth lower percentile centered at the new mean, and the allowable shift K]
Acceptable difference in slopes is θ₂ = K/E (E = expiry)
EAC Method 2: Product Specification

\[ \theta_2 = \frac{K}{\text{Expiry}}, \qquad K = \left[\text{Predicted } Y \text{ at expiry} - Z_{1-P}\sqrt{\sigma^2_{Lots} + \sigma^2_{Error}}\right] - \text{LSL} \]

– K is unknown, so replace the term in brackets with a lower one-sided (1−P)×100% individual confidence bound based on the historical data (prediction bound)
– Assume Lower Spec Limit (LSL) = 95; Expiry = 18 months

\[ \theta_2 = \frac{97.403 - 95}{18} = 0.1335\ \%\ \text{per month} \]
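The Method 2 arithmetic as a Python sketch, again using only the numbers quoted above:

```python
# Method 2 EAC from the slide's numbers.
prediction_bound_at_expiry = 97.403   # lower one-sided prediction bound, historical data
LSL = 95                              # lower specification limit
expiry = 18                           # months

K = prediction_bound_at_expiry - LSL
theta_2 = K / expiry
print(f"theta_2 = {theta_2:.4f} % per month")   # 0.1335
```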
EAC Method 3:
Historic Slope Variability
Use historical data for calculation
Historical dataset provides nH independent estimates
of the common slope β
EAC based on 99.5th percentile of distribution of
difference in slopes from same lot.
If observed slope difference is consistent with this
variability, equivalence is demonstrated.
EAC Method 3: Historic Slope Variability
[Figure: response vs. time (months), 0 to T, showing the individual-lot slope estimates from the historic data]
EAC Method 3: Historic Slope Variability

\[ \theta_3 = 2.576\,U\sqrt{\frac{1}{n_H} + \frac{1}{n_N}} \]

– θ₃ is the 99.5th percentile of the distribution of β̂_H − β̂_N
– 2.576 is the 99.5th percentile of the standard normal distribution
– U is a 95% upper bound on the standard error for an estimate of β based on a single lot

\[ \theta_3 = 2.576 \times 0.09176 \times \sqrt{\frac{1}{5} + \frac{1}{5}} = 0.1495 \]
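The Method 3 arithmetic as a Python sketch with the slide's inputs:

```python
# Method 3 EAC from the slide's numbers.
from math import sqrt

U = 0.09176        # 95% upper bound on the standard error of a single-lot slope estimate
n_H, n_N = 5, 5    # historical and new lots
z_995 = 2.576      # 99.5th percentile of the standard normal distribution

theta_3 = z_995 * U * sqrt(1 / n_H + 1 / n_N)
print(f"theta_3 = {theta_3:.4f} % per month")   # approximately 0.1495
```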
Comparison of Equivalence Acceptance Criteria

Method   Theta        Slope Difference over 18 Months   Result
1        +/-0.2722    -0.08267 to 0.1046                Pass
2        +/-0.1335    -0.08267 to 0.1046                Pass
3        +/-0.1495    -0.08267 to 0.1046                Pass

– Hard for a client to know what a difference in slopes of, say, 0.1% looks like in a table
– Once the client sees a graph, they can get a feel for what a difference in slope means
– The client can visualize what the possible range of regression lines could be and still claim equivalence
Comparison of Equivalence Acceptance Criteria
EAC Based on Bottle
[Figure: percent label claim vs. time (months), 0 to 18, showing the Bottle regression line and the Method 1, Method 2, and Method 3 EAC lines]
– Based only on historical data
– Graph is created before data for the new process is collected
Results by Method
HA: Show δ is less than some amount deemed practically important
Equivalence is demonstrated by computing two one-sided tests (TOST)
If the 95% lower one-sided confidence bound on δ is greater than −θ and the 95% upper one-sided confidence bound is less than θ, then equivalence is demonstrated (see the sketch after the table)

Method   Theta        Slope Difference over 18 Months   Result
1        +/-0.2722    -0.08267 to 0.1046                Pass
2        +/-0.1335    -0.08267 to 0.1046                Pass
3        +/-0.1495    -0.08267 to 0.1046                Pass
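A small sketch of the TOST decision applied to the table above; the confidence limits and goal posts are the values reported on these slides, not an independent re-analysis.

```python
# TOST-style check: the slope-difference confidence limits must lie inside +/- theta.
ci_lower, ci_upper = -0.08267, 0.1046                      # limits from the slides
goal_posts = {"Method 1": 0.2722, "Method 2": 0.1335, "Method 3": 0.1495}

for method, theta in goal_posts.items():
    passes = (ci_lower > -theta) and (ci_upper < theta)
    print(f"{method}: theta = +/-{theta:.4f} -> {'Pass' if passes else 'Fail'}")
```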
P-value Approach vs. Equivalence Approach

P-value Approach
– H0: slopes are comparable
– HA: slopes are not comparable
– Decision rests on the p-value

Equivalence Approach
– H0: slopes are not comparable
– HA: slopes are comparable
– Equivalence acceptance criteria set a priori
– Based on interval estimates of the slope difference using a mixed regression model with random lots

Statistical convention is to have the research objective in HA
Summary
P-value approach to comparability has numerous
issues
– High p-values do NOT prove equivalence
– High p-values only indicate that there is NOT enough
evidence to conclude slopes are different
– At times, leads to ad hoc analysis requests when the p-value is small
– P-values are sensitive to sample size
Goal posts allow you to state equivalence
– Industry is moving in the direction of equivalence tests
Can be extended to accelerated studies
Move to Equivalence Testing for Comparability
References
Limentani, G. B., Ringo, M. C., Ye, F., Bergquist, M. L., and McSorley, E. O. (2005). Beyond the t-test: Statistical equivalence testing. Analytical Chemistry, June 2005, pages 1A-6A.
Chambers, D., Kelly, G., Limentani, G., Lister, A., Lung, K. R., and Warner, E. (2005). Analytical method equivalency: An acceptable analytical practice. Pharmaceutical Technology, Sept 2005, pages 64-80.
Richter, S., and Richter, C. (2002). A method for determining equivalence in industrial applications. Quality Engineering, 14(3), pages 375-380.
Park, D. J. and Burdick, R. K. (2004). Confidence intervals on total variance in a regression model with an unbalanced onefold nested error structure. Communications in Statistics, Theory and Methods, 33(11), pages 2735-2743.
Back up slides
EAC Method 2
Equal Difference Assumption:
δ(controlled room temperature) = δ(recommended temperature) = δ(any temperature)
This assumption may not always hold
– The p-value for the interaction between time, process, and temperature tests this assumption
Comparison of Equivalence Acceptance Criteria (Bottle vs. Blister)
– Plot the regression line for the historical process; at time = 0 the value is α̂
– Calculate ME = θ − 1.645 × (estimated standard error of δ̂)
– Plot 2 additional lines: the value at time = 0 is α̂, and the values at time = T are α̂ + (β̂ + ME)·T and α̂ + (β̂ − ME)·T
[Figure: EAC Based on Bottle; percent label claim vs. time (months), 0 to 18, showing the Bottle line and the Method 1, Method 2, and Method 3 lines]