A Statistical Approach to Method Validation and Out of Specification

A Statistical Approach to Method
Validation and Out of Specification
Data
Outline of talk
• Basic statistics
– Averaging, confidence intervals
• Fitness-for-purpose and analytical capability.
• Quantifying variability and producing a capable
method.
• Out-of-specification results.
• Conclusions.
Repeat measurements
994.765    996.8626   1000.665   1017.53    981.7084   998.3029
1003.802   998.3409   1002.779   1007.732   1008.048   1008.842
995.1794   1004.904   1002.433   1013.802   1008.136   998.0636
1004.67    1006.48    992.7641   988.0834   1002.151   1011.441
1005.991   993.7479   996.3199   997.8086   1005.854   997.1728
999.4718   1004.641   1002.325   996.136    1000.387   1005.081
[Figure: histogram of the repeat measurements. Occurrences (0 to 12) against Value (980 to 1020)]
Distribution of measurements
[Figure: probability (0 to 0.06) against measurement (960 to 1040), marking the average, the standard deviation and the 95% confidence interval, with 2.5% in each tail]
The 95% confidence interval is the range of values around the
mean in which 95% of the measurements are expected to lie.
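As a rough illustration, the average, standard deviation and 95% interval can be computed for the repeat measurements listed earlier. This is a sketch only: it assumes the values are normally distributed (so the interval is taken as the mean +/- 1.96 standard deviations) and that the value 1005.081, which appears among the histogram axis labels in the transcript, belongs to the data set.

```python
from statistics import mean, stdev

# The repeat measurements from the "Repeat measurements" slide.
# Assumption: 1005.081 (found among the histogram labels) is part of the data.
measurements = [
    994.765, 996.8626, 1000.665, 1017.53, 981.7084, 998.3029,
    1003.802, 998.3409, 1002.779, 1007.732, 1008.048, 1008.842,
    995.1794, 1004.904, 1002.433, 1013.802, 1008.136, 998.0636,
    1004.67, 1006.48, 992.7641, 988.0834, 1002.151, 1011.441,
    1005.991, 993.7479, 996.3199, 997.8086, 1005.854, 997.1728,
    999.4718, 1004.641, 1002.325, 996.136, 1000.387, 1005.081,
]

x_bar = mean(measurements)
s = stdev(measurements)  # sample standard deviation (n - 1 denominator)

# Assumption: normal distribution, so about 95% of values lie within 1.96 s.
lower = x_bar - 1.96 * s
upper = x_bar + 1.96 * s

inside = sum(lower <= m <= upper for m in measurements)
fraction_inside = inside / len(measurements)

print(f"mean = {x_bar:.1f}, s = {s:.1f}")
print(f"95% interval: {lower:.1f} to {upper:.1f}")
print(f"{inside} of {len(measurements)} measurements fall inside the interval")
```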
Relative standard deviation, RSD
$$ \mathrm{RSD}(\%) = 100 \times \frac{s}{\bar{x}} $$
For a strength of ~100%, a 0.7% RSD equates to a
standard deviation of ~0.7%. The range of values
encompassing roughly 99% of all possible measurements
(three standard deviations either side of the mean,
strictly 99.7% for a normal distribution) is therefore
approximately +/- 2.1%.
A 0.7% RSD at 100% strength therefore has a ~99%
confidence interval of 97.9% to 102.1%.
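The arithmetic behind these figures can be sketched as follows. The coverage factor of three standard deviations is an assumption, chosen because it reproduces the +/- 2.1% quoted above; the function name is illustrative.

```python
def rsd_percent(s: float, x_bar: float) -> float:
    """Relative standard deviation as a percentage of the mean."""
    return 100.0 * s / x_bar

strength = 100.0  # nominal strength, %
s = 0.7           # standard deviation, in the same units as strength
k = 3.0           # assumed coverage factor (three standard deviations)

rsd = rsd_percent(s, strength)
low = strength - k * s
high = strength + k * s

print(f"RSD = {rsd:.1f}%")
print(f"range = {low:.1f}% to {high:.1f}%")
```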
• The standard deviation is a
measure of variability.
• The effect of variability can be
reduced by taking the average
of a number of repeat
measures.
• The standard deviation
associated with the mean of n
measures is:
$$ s_{\bar{x}} = \frac{s}{\sqrt{n}} $$
(standard error of the mean)
Effect of averaging
[Figure: effect of averaging, plotted on a vertical scale of 0 to 1.2 against n (0 to 8)]
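A minimal sketch of the effect of averaging, assuming independent repeat measures so that the standard deviation of the mean is s divided by the square root of n. The unit standard deviation is an illustrative choice.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Standard deviation associated with the mean of n independent measures."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)

s = 1.0  # illustrative standard deviation of a single measure
for n in range(1, 9):
    print(n, round(standard_error(s, n), 3))
```

Most of the benefit comes from the first few repeats: going from one measure to four halves the standard error, while going from four to eight reduces it by less than a further third.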
Distribution of the mean
[Figure: probability distribution of the mean for n = 1, 2, 3 and 4, plotted over 970 to 1030 with the true value marked]
The confidence in the mean improves as the number of
measurements increases.
How many measurements should I
average?
• Depends upon:
– The amount of variability present in the
measurements.
– The degree of confidence I wish to achieve.
WHAT IS FITNESS FOR PURPOSE?
Capability of an analytical method
[Figure: two panels, "Incapable method" and "Capable method", each showing the probability distribution of the measured Concentration (90 to 110) against the lower and upper spec. limits]
How to measure capability?
Use measures from statistical process control:
$$ c_p = \frac{USL - LSL}{C.I.} $$
e.g., specification between 97 mg/l and 103 mg/l, width
of confidence interval of 12 mg/l:
$$ c_p = \frac{103 - 97}{12} = 0.5 $$
[Figure: probability distribution of Concentration (90 to 110) with the lower and upper spec. limits and the confidence interval marked]
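The capability calculation can be sketched as a small function. The function and parameter names are illustrative; `ci_width` is the full width of the confidence interval, as in the worked example.

```python
def analytical_capability(usl: float, lsl: float, ci_width: float) -> float:
    """cp = (USL - LSL) / C.I., where C.I. is the full confidence-interval width."""
    if ci_width <= 0:
        raise ValueError("confidence interval width must be positive")
    return (usl - lsl) / ci_width

# Worked example from the slide: specification 97 to 103 mg/l, C.I. width 12 mg/l.
cp_example = analytical_capability(usl=103.0, lsl=97.0, ci_width=12.0)
print(cp_example)
```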
Interpreting cp
Batch failure rate purely due to variability in analytical
method.
[Figure: batch failure rate due to analysis (%, 0 to 50) against cp (0.2 to 2)]
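One way to see how a curve of this kind arises is sketched below. It is not necessarily the calculation behind the figure and rests on three assumptions: the batch truly sits at the centre of the specification, analytical errors are normally distributed, and the confidence interval is +/- z standard deviations (z = 1.96 for 95%).

```python
import math

def failure_rate(cp: float, z: float = 1.96) -> float:
    """Probability that a result falls outside specification through
    analytical variability alone, for a batch at the centre of the spec.

    Assumes a C.I. of +/- z standard deviations, so the spec half-width
    equals cp * z standard deviations.
    """
    half_width_in_sd = cp * z
    # Two-sided tail probability of the normal distribution.
    return math.erfc(half_width_in_sd / math.sqrt(2.0))

for cp in (0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0):
    print(f"cp = {cp:.1f}: failure rate = {100 * failure_rate(cp):.2f}%")
```

Under these assumptions a cp of 1 corresponds to a failure rate equal to the tail probability left outside the confidence interval (about 5% for a 95% interval), and the rate falls quickly as cp rises above 1.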
One-sided specifications
$$ c_{p,l} = \frac{\bar{x} - LSL}{\tfrac{1}{2}\,C.I.} \qquad c_{p,u} = \frac{USL - \bar{x}}{\tfrac{1}{2}\,C.I.} $$
Where $\bar{x}$ is the expected average value of the
parameter.
[Figure: probability distribution of strength (90 to 110) with the lower spec. limit, the expected value and the half confidence interval marked]
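A short sketch of the one-sided capability indices, in which the distance from the expected value to the specification limit is divided by half the confidence-interval width. The numbers in the example (expected strength 100%, lower limit 97%, C.I. of +/- 2.1%) are illustrative values, not results from the talk.

```python
def cp_lower(x_bar: float, lsl: float, ci_width: float) -> float:
    """One-sided capability against a lower specification limit."""
    return (x_bar - lsl) / (0.5 * ci_width)

def cp_upper(usl: float, x_bar: float, ci_width: float) -> float:
    """One-sided capability against an upper specification limit."""
    return (usl - x_bar) / (0.5 * ci_width)

# Illustrative values: expected strength 100%, LSL 97%, C.I. of +/- 2.1%
# (full width 4.2%).
cp_l = cp_lower(x_bar=100.0, lsl=97.0, ci_width=4.2)
print(round(cp_l, 2))
```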
Method development/validation
• Determine the number of repeat measurements needed to
ensure that the analytical capability is acceptable, for
example cp > 1.
• Acceptance criteria are then product dependent, rather
than technique specific.
• How do I determine the amount of variability?
• How do I determine the number of repeat
measurements required?
Quantifying variability (e.g. HPLC)
• Need to assess two sources
of variability (repeatability):
– Between “weighings”
– Instrumental.
• Between weighings
quantifies variability due to
sample inhomogeneity and
the sample preparation
process.
• Instrumental quantifies the
variability associated with
the instrumental
measurement.
[Figure: Experimental Design. A sample is split into weighings, and each weighing is given repeat measures]
Quantify a source of variability by determining its standard deviation.
Example
Measure   Weighing 1   Weighing 2   Weighing 3   Weighing 4   Weighing 5   Weighing 6
1         975.20       928.77       992.30       1047.96      1036.10      1109.29
2         971.41       934.27       1035.73      1069.98      1064.50      1074.81
Can use Analysis of Variance (ANOVA) to determine:
Standard deviation for “weighing”, sw = 57.9
Standard deviation for instrument, s = 19.2
These values refer to the measured response (e.g. weight-corrected area).
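The two standard deviations can be reproduced with a one-way ANOVA on the example data. The sketch below assumes a balanced design in which each weighing is measured twice, and uses the usual variance-component estimates: the within-weighing mean square estimates the instrumental variance, and the between-weighing mean square estimates the instrumental variance plus n times the weighing variance.

```python
import math
from statistics import mean

# Two measures on each of six weighings, taken from the example table.
weighings = [
    (975.20, 971.41),
    (928.77, 934.27),
    (992.30, 1035.73),
    (1047.96, 1069.98),
    (1036.10, 1064.50),
    (1109.29, 1074.81),
]

k = len(weighings)     # number of weighings
n = len(weighings[0])  # measures per weighing (balanced design assumed)

group_means = [mean(w) for w in weighings]
grand_mean = mean(group_means)

# Within-weighing mean square: estimates the instrumental variance s^2.
ss_within = sum((x - m) ** 2 for w, m in zip(weighings, group_means) for x in w)
ms_within = ss_within / (k * (n - 1))

# Between-weighing mean square: expectation is s^2 + n * s_w^2.
ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
ms_between = ss_between / (k - 1)

s_instrument = math.sqrt(ms_within)
s_weighing = math.sqrt(max(ms_between - ms_within, 0.0) / n)

print(f"s_w = {s_weighing:.1f}, s = {s_instrument:.1f}")
```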
Confidence interval for analysis
Confidence interval for future number of weighings
(n1) and measurements per weighing (n) is given
by:
$$ \delta = \pm\, t_{\alpha/2,\,N} \sqrt{\frac{1}{n_1}\left(s_w^2 + \frac{s^2}{n}\right)} $$
α: degree of confidence (usually 0.05 for 95% confidence)
N: number of degrees of freedom to determine sw and s.
t: Student's t-value for α and N.
δ: confidence interval for measurement (area)
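A minimal sketch of this calculation, assuming the expression reads: half-width = t multiplied by the square root of (sw² + s²/n) / n1. The t-value is passed in rather than computed; the value 2.447 (two-sided 95%, 6 degrees of freedom) is used purely as an example, since the talk does not state N.

```python
import math

def ci_half_width(t_value: float, s_w: float, s: float, n1: int, n: int) -> float:
    """Half-width of the confidence interval for n1 weighings with
    n measurements per weighing."""
    if n1 < 1 or n < 1:
        raise ValueError("n1 and n must be at least 1")
    return t_value * math.sqrt((s_w ** 2 + s ** 2 / n) / n1)

s_w, s = 57.9, 19.2  # standard deviations from the ANOVA example
t_value = 2.447      # assumption: two-sided 95% t-value for N = 6

for n1, n in [(1, 1), (1, 2), (2, 1), (3, 2)]:
    half = ci_half_width(t_value, s_w, s, n1, n)
    print(f"n1 = {n1}, n = {n}: +/- {half:.1f}")
```

Because the weighing standard deviation dominates here, extra weighings narrow the interval far more than extra measurements on the same weighing.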
Analytical Capability
Number of measures     Number of weighings (n1)
per weighing (n)       1       2       3       4       5
1                      0.574   0.812   0.994   1.148   1.283
2                      0.589   0.833   1.020   1.177   1.316
3                      0.594   0.840   1.029   1.188   1.328
4                      0.596   0.844   1.033   1.193   1.334
5                      0.598   0.846   1.036   1.196   1.337
The analytical capability, cp, changes with n1 and n.
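The pattern in the table can be reproduced from the two standard deviations alone. The sketch below assumes that cp is inversely proportional to the confidence-interval width, so every entry is scaled from the tabulated value for one weighing and one measure (0.574). The specification width and t-value behind that anchor value are not given in the talk and are not needed here. Choosing the design with the fewest total measurements is an illustrative criterion, not one stated in the talk.

```python
import math

S_W, S = 57.9, 19.2  # standard deviations quoted from the ANOVA example
CP_11 = 0.574        # tabulated capability for n1 = 1, n = 1

def combined_sd(n1: int, n: int) -> float:
    """Standard deviation of the result for n1 weighings, n measures each."""
    return math.sqrt((S_W ** 2 + S ** 2 / n) / n1)

def capability(n1: int, n: int) -> float:
    """cp scaled from the tabulated (1, 1) entry; cp is proportional to 1 / C.I."""
    return CP_11 * combined_sd(1, 1) / combined_sd(n1, n)

# Rows: measures per weighing (n); columns: number of weighings (n1).
for n in range(1, 6):
    print(n, [round(capability(n1, n), 3) for n1 in range(1, 6)])

# Designs reaching cp > 1, and the one needing the fewest total measurements.
capable = [(n1, n) for n1 in range(1, 6) for n in range(1, 6) if capability(n1, n) > 1.0]
cheapest = min(capable, key=lambda d: (d[0] * d[1], d[0]))
print("fewest measurements with cp > 1 (n1, n):", cheapest)
```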
External Standards
$$ \hat{S} = \frac{\bar{x}}{\bar{x}_s}\, S_s $$
$S_s$: strength of external standard
$\bar{x}_s$: average measure for external standard
$\bar{x}$: average measure for sample
$\hat{S}$: estimated strength for sample
If δ is the confidence interval for $\bar{x}$ and $\bar{x}_s$, then the
confidence interval for $\hat{S} \approx \sqrt{2} \times \delta$, i.e. if $\bar{x}$ has an RSD of
0.7%, the RSD for the estimated strength is ~1.0%.
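The factor of the square root of two follows from first-order error propagation for a ratio of two independent quantities with equal relative variability. A short sketch, with illustrative function names and hypothetical responses:

```python
import math

def estimated_strength(x_bar: float, xs_bar: float, s_standard: float) -> float:
    """Estimated sample strength from the sample and external-standard averages."""
    return x_bar / xs_bar * s_standard

def combined_rsd(rsd_sample: float, rsd_standard: float) -> float:
    """RSD of a ratio of two independent quantities (first-order propagation)."""
    return math.sqrt(rsd_sample ** 2 + rsd_standard ** 2)

# Hypothetical responses: sample 990, standard 1000, standard strength 100%.
strength = estimated_strength(x_bar=990.0, xs_bar=1000.0, s_standard=100.0)

# Both averages carry an RSD of 0.7%, as in the text.
rsd_strength = combined_rsd(0.7, 0.7)

print(f"estimated strength = {strength:.1f}%")
print(f"RSD of estimated strength = {rsd_strength:.2f}%")
```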
Practical consequences: finding
result Out-of-Specification
[Figure: Probability of OOS result (0 to 1) against True strength (95 to 100), one curve for each of 1 to 5 measures, with the consumer's risk and producer's risk regions marked. Conditions: lower spec 97%, RSD of injection 0.7%, no weighing variability]
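Curves of this kind can be sketched under the stated conditions (lower spec 97%, RSD of injection 0.7%, no weighing variability). The sketch additionally assumes normally distributed errors and treats the 0.7% RSD as an absolute standard deviation of 0.7% strength, which is reasonable only near 100% strength.

```python
import math
from statistics import NormalDist

def prob_oos(true_strength: float, n_measures: int,
             lsl: float = 97.0, sd: float = 0.7) -> float:
    """Probability that the mean of n measures falls below the lower spec limit."""
    se = sd / math.sqrt(n_measures)  # standard error of the mean
    return NormalDist(mu=true_strength, sigma=se).cdf(lsl)

for strength in (95.0, 96.0, 97.0, 98.0, 99.0, 100.0):
    row = [round(prob_oos(strength, n), 3) for n in range(1, 6)]
    print(strength, row)
```

Above the limit, averaging more measures lowers the chance of a false OOS result (producer's risk); below the limit, it raises the chance that a genuinely failing batch is detected, which reduces the consumer's risk.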
Dealing with OOS results
• Can re-test samples.
• On re-testing, FDA guidelines for industry state “if no …errors
are identified in the first test, there is no scientific basis for
invalidating OOS results in favour of passing re-test results.”
• Scientifically, the issue of whether the re-test results “pass” or
“fail” is of little consequence. The issue is whether the re-test
results are statistically the same as the original OOS result.
• Can use the t-test to assess the similarity between the OOS and re-test results.
Example 1
• Specification >97.0%
• OOS result 96.5% with
confidence interval +/- 2.1%.
• Re-test 97.7% with
confidence interval +/- 2.1%.
• No evidence from the t-test
that the OOS and re-test are
different.
• Averaging the OOS and re-test
gives 97.1% with confidence
interval +/- 1.5%.
[Figure: probability distributions of the OOS and re-test results against Strength (90 to 100), with the LSL marked]
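The averaged figures in Example 1 follow from the standard error of the mean, assuming the OOS and re-test results are independent and carry the same confidence interval:

```latex
\bar{S} = \frac{96.5 + 97.7}{2} = 97.1\%,
\qquad
\text{C.I.}_{\bar{S}} = \pm\frac{2.1}{\sqrt{2}} \approx \pm 1.5\%
```

The averaged result sits only just above the 97.0% limit, and its interval of roughly 95.6% to 98.6% still extends below it.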
Example 2
• Specification >97.0%
• OOS 96.0% with confidence
interval +/- 0.9%.
• Re-test 98.0% with
confidence interval +/- 0.9%.
• No evidence that the OOS
and re-test are the same.
• Cannot average the OOS
and re-test result.
• Consequently must doubt
both results.
[Figure: probability distributions of the OOS and re-test results against Strength (90 to 100), with the LSL marked]
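The comparison in the two examples can be approximated with a simple check. This is a simplified stand-in for the t-test, not the test itself: it assumes the two results are independent, carry the same confidence interval at the same confidence level, and that the interval for their difference is the square root of two times the individual interval.

```python
import math

def results_differ(oos: float, retest: float, half_width: float) -> bool:
    """True if the difference exceeds the confidence interval of the difference."""
    diff_half_width = math.sqrt(2.0) * half_width
    return abs(oos - retest) > diff_half_width

def averaged(oos: float, retest: float, half_width: float) -> tuple[float, float]:
    """Mean of the two results and the half-width of its confidence interval."""
    return (oos + retest) / 2.0, half_width / math.sqrt(2.0)

# Example 1: 96.5% and 97.7%, each +/- 2.1%.
ex1_differ = results_differ(96.5, 97.7, 2.1)
ex1_mean, ex1_half = averaged(96.5, 97.7, 2.1)

# Example 2: 96.0% and 98.0%, each +/- 0.9%.
ex2_differ = results_differ(96.0, 98.0, 0.9)

print("Example 1 differ:", ex1_differ, "| average:", round(ex1_mean, 1), "+/-", round(ex1_half, 1))
print("Example 2 differ:", ex2_differ)
```

Averaging is only meaningful when the check finds no evidence of a difference, as in Example 1; in Example 2 the two results disagree by more than their combined variability allows.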
Conclusions
• Understanding and determining the confidence
interval associated with an analytical result is an
important part of method development/validation.
• The relationship between the confidence interval and
the product specification is an important aspect of
defining method fitness-for-purpose.
• The analytical capability is a quantifiable measure of
fitness-for-purpose for precision.
• Understanding the confidence interval is important
during out-of-specification investigations.