Getting an estimate of % of GM in a sample
2. Qualitative laboratory methods
May 8-10, 2006
Iowa State University, Ames – USA
Jean-Louis Laffont
Kirk Remund
Overview
• Impurity estimators and confidence intervals
• Quantitative information from a qualitative assay
• Limitations to quantification with a qualitative assay
ISTA Statistics Committee
Impurity Estimate
μ̂ = ?2%
Our best guess of
what the true lot
impurity/purity is
based on the sample…
µ= lot impurity/purity
(sometimes called p)
µ=1%
truth
μ̂ = impurity/purity estimate
Estimate based on sample
[Figure: a lot of 28 seeds, 4 of them deviant (+), so µ = 4/28 ≈ 14%; a sample of 4 seeds drawn from the lot contains 1 deviant seed, so μ̂ = 1/4 = 25%]
Confidence intervals are like nets…
Oops, looks like one got away!!
Then what are we trying to catch?
Answer: true level of impurity (µ) in the lot
[Figure: Lot 1 (µ1), Lot 2 (µ2), Lot 3 (µ3)]
Confidence level
• Net “interval” size is a function of sampling variability, assay errors and confidence level
• If we fix the sampling and assay variability then:
– lower confidence level: small net
– higher confidence level: large net
What does 95% confidence mean?
• Statement: “We are 95% confident that the true lot
impurity is contained within the interval (net)”
• Overall we expect that 95% of the time the interval will
catch the true lot impurity (µ)
[Figure: nets (intervals) from repeated samples around µ; expect that 5% of the time µ will fall out of the net]
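The "95% of the time" statement can be checked numerically. The sketch below is an illustration only, assuming individual seed testing with no assay error and a one-sided upper limit obtained from the binomial tail equation P(X ≤ d) = α. The function names and the example values (µ = 5%, n = 100) are hypothetical. It sums the probability of every outcome d whose upper limit still catches the true µ.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))

def upper_limit(d, n, alpha=0.05):
    """One-sided upper limit: the p at which P(X <= d) has dropped to alpha."""
    if d >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; binom_cdf is decreasing in p
        mid = (lo + hi) / 2
        if binom_cdf(d, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def coverage(mu, n, alpha=0.05):
    """Probability that the upper limit lies at or above the true mu."""
    total = 0.0
    for d in range(n + 1):
        prob_d = comb(n, d) * mu**d * (1 - mu)**(n - d)
        if upper_limit(d, n, alpha) >= mu:
            total += prob_d
    return total

# Hypothetical lot: true impurity 5%, 100 seeds tested individually.
print(f"coverage = {coverage(mu=0.05, n=100):.3f}")
```

Because the number of deviant seeds is discrete, the net catches µ at least 95% of the time rather than exactly 95%.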
Estimator of GM purity/impurity
(Individual Seed Testing)
• Estimator: $\hat{\mu} = \dfrac{d}{n} = \dfrac{\text{\# of deviant seeds}}{\text{total \# seeds sampled}}$
• UCL: $\hat{\mu}_{UL} = \dfrac{(d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}{(n-d) + (d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}$
where F is the 1-α quantile from an F-distribution with 2d+2 and 2n-2d degrees of freedom
• Individual seed testing used to test purity of GM material for proficiency test
• Used to test purity of GM variety seed
• Implemented in Seedcalc
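A minimal sketch of the estimator and upper confidence limit for individual seed testing. It relies on the standard equivalence between the F-quantile form of the limit and the binomial tail equation P(X ≤ d) = α, and solves that equation by bisection with the standard library only instead of calling an F quantile. The function names and the example counts are illustrative assumptions, and this is not Seedcalc's code.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))

def estimate(d, n):
    """Point estimate: deviant seeds / total seeds sampled."""
    return d / n

def upper_limit(d, n, alpha=0.05):
    """One-sided (1 - alpha) upper confidence limit for the lot impurity."""
    if d >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; binom_cdf is decreasing in p
        mid = (lo + hi) / 2
        if binom_cdf(d, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical example: 2 deviant seeds found among 400 tested.
d, n = 2, 400
print(f"estimate = {estimate(d, n):.4%}, 95% UCL = {upper_limit(d, n):.4%}")
```

When d = 0 the limit has the closed form 1 − α^(1/n), which gives a quick check of the bisection.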
Estimator of GM impurity
(Seed Pool Testing)
• Estimator: $\hat{\mu} = 1 - \left(1 - \dfrac{d}{n}\right)^{1/m}$
where m is the number of seeds per pool, n is the number of seed pools and d is the number of deviant seed pools
• UCL: $\hat{\mu}_{UL} = 1 - \left(1 - \dfrac{(d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}{(n-d) + (d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}\right)^{1/m}$
• Used to estimate AP levels of GM in conventional seed
• Used to estimate level of GM impurity in conventional seed for proficiency test
• Implemented in Seedcalc
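The pool formulas can be sketched in a few lines. This is an illustration under stated assumptions: seeds fall into pools independently, the assay has no false positives or negatives, and the pool-level upper limit is found by solving P(X ≤ d) = α by bisection, which is equivalent to the F-quantile form. The function names are hypothetical and this is not Seedcalc's implementation.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))

def pool_upper_limit(d, n, alpha=0.05):
    """Upper limit on the proportion of positive pools."""
    if d >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; binom_cdf is decreasing in p
        mid = (lo + hi) / 2
        if binom_cdf(d, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def seed_estimate(d, n, m):
    """Seed-level impurity estimate from d positive pools out of n pools of m seeds."""
    return 1 - (1 - d / n) ** (1 / m)

def seed_upper_limit(d, n, m, alpha=0.05):
    """Seed-level upper limit: the pool-level limit pushed through the same transform."""
    return 1 - (1 - pool_upper_limit(d, n, alpha)) ** (1 / m)

# 4 positive pools out of 10 pools of 150 seeds
print(f"estimate = {seed_estimate(4, 10, 150):.2%}")
# 0, 1 or 2 positive pools out of 4 pools of 300 seeds
for d in range(3):
    print(d, f"UCL = {seed_upper_limit(d, 4, 300):.2%}")
```

The same transform, 1 − (1 − p)^(1/m), turns any pool-level quantity (estimate or limit) into a seed-level one.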
1 & 2 sided Confidence limits
• Upper confidence limit (UCL)
– “95% confident that true impurity is below
upper confidence limit”
– Caution: do not use as estimate
• Two-sided confidence limit
– “95% confident that the true impurity is between the lower and upper limit”
– Similar in form to the formulas on the earlier slides and implemented in Seedcalc
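A two-sided interval can be sketched the same way as the one-sided limit, by putting α/2 in each tail. The code below is a hedged illustration for individual seed testing, assuming no assay error. It solves the two binomial tail equations by bisection rather than using F quantiles, and the helper names are hypothetical.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))

def solve_decreasing(f, target):
    """Find p in [0, 1] with f(p) = target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sided_interval(d, n, alpha=0.05):
    """(lower, upper) limits with alpha/2 in each tail."""
    # Upper limit: P(X <= d | p) = alpha/2
    upper = 1.0 if d == n else solve_decreasing(
        lambda p: binom_cdf(d, n, p), alpha / 2)
    # Lower limit: P(X >= d | p) = alpha/2, i.e. P(X <= d-1 | p) = 1 - alpha/2
    lower = 0.0 if d == 0 else solve_decreasing(
        lambda p: binom_cdf(d - 1, n, p), 1 - alpha / 2)
    return lower, upper

# Hypothetical example: 5 deviant seeds among 100 tested.
low, up = two_sided_interval(5, 100)
print(f"95% interval: {low:.4f} to {up:.4f}")
```

For pooled testing, both limits would then be converted to the seed level with 1 − (1 − p)^(1/m).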
Two-sided confidence interval
(put ½ of alpha in each tail)
1-α confidence that the interval contains the true purity of the lot
[Figure: distribution with α/2 in each tail and the 1-α confidence interval between the tails]
One-sided confidence interval
(alpha in one tail)
1-α confidence that the interval contains the true purity of the lot
[Figure: distribution with α in one tail and the 1-α confidence interval]
The following slides illustrate that a simple presence/absence answer per pool of seeds allows estimation of the % of GM seeds present
The statistical computation takes into account the fact that for a given level of GM presence, some sub-samples will “by chance” contain 0 GM seeds, others 1 GM seed, others 2 GM seeds, etc.
The formula is: % GM estimate = $1 - (1 - d/n)^{1/m}$
where d is the number of deviant sub-samples, n is the number of sub-samples, m is the number of seeds per sub-sample
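The formula follows from a short argument, sketched here under the assumptions that seeds fall into sub-samples independently and that the assay detects a single GM seed without error. A sub-sample is negative only if all m of its seeds are non-GM, so the observed fraction of positive sub-samples d/n is set equal to the probability that a sub-sample is positive and the equation is solved for µ:

```latex
\begin{align*}
P(\text{sub-sample negative}) &= (1-\mu)^m \\
P(\text{sub-sample positive}) &= 1-(1-\mu)^m \\
\frac{d}{n} = 1-(1-\hat{\mu})^m
  \quad&\Longrightarrow\quad
  \hat{\mu} = 1-\left(1-\frac{d}{n}\right)^{1/m} \\
\text{e.g. } d=4,\ n=10,\ m=150:\quad
  \hat{\mu} &= 1-0.6^{1/150} \approx 0.0034 = 0.34\%
\end{align*}
```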
From 1500 seeds, 10 pools of 150 seeds have been made
[Figure: pools numbered 1 to 10]
Each sub-sample is tested for presence/absence of GM seeds
4 sub-samples are positive
=> 0.34%
[Figure: test results for the sub-samples, with a positive control and a negative control]
% estimate can be obtained in Seedcalc or in ISTA documents
Number    Seeds     Number of positive pools
of pools  per pool  0  1      2      3      4      5      6      7      8      9      10
1         1500      0  ####
2         750       0  0.09%  ####
3         500       0  0.08%  0.22%  ####
4         375       0  0.08%  0.18%  0.37%  ####
5         300       0  0.07%  0.17%  0.30%  0.54%  ####
6         250       0  0.07%  0.16%  0.28%  0.44%  0.71%  ####
7         214       0  0.07%  0.16%  0.26%  0.40%  0.58%  0.91%  ####
8         187       0  0.07%  0.15%  0.25%  0.37%  0.52%  0.74%  1.11%  ####
9         166       0  0.07%  0.15%  0.24%  0.35%  0.49%  0.66%  0.90%  1.31%  ####
10        150       0  0.07%  0.15%  0.24%  0.34%  0.46%  0.61%  0.80%  1.07%  1.52%  ####
(####: every pool is positive)
4 positive pools from 10 pools of 150 seeds
=> 0.34% of GM seeds
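The table entries can be regenerated from the pool formula. The sketch below is an illustration, not Seedcalc's code. It assumes 1500 seeds split into equal pools with the pool size rounded down, which matches the 214, 187 and 166 seeds per pool shown above, and the function names are hypothetical.

```python
def seed_estimate(d, n, m):
    """Impurity estimate from d positive pools out of n pools of m seeds."""
    return 1 - (1 - d / n) ** (1 / m)

def table_row(n_pools, total_seeds=1500):
    """Seeds per pool and the % estimates for 0 .. n_pools-1 positive pools."""
    m = total_seeds // n_pools  # pool size rounded down
    cells = [round(100 * seed_estimate(d, n_pools, m), 2) for d in range(n_pools)]
    return m, cells

for n_pools in range(1, 11):
    m, cells = table_row(n_pools)
    print(f"{n_pools:>2} {m:>5}  " + "  ".join(f"{c:.2f}%" for c in cells))
```

The case where every pool is positive is left out of each row because the formula then returns 100%.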
The statistical computation takes into account that some sub-samples may contain more than one GM seed
[Figure: 5 GM seeds distributed among pools 1 to 10, shown four times: 3 times 5 pools are positive, 1 time 4 pools are positive]
Qualitative Test/Quantitative Information
Example of seed pool testing strategy:
(4 pools of 300 seeds)
- - - -   <0.25% seed
+ - - -   <0.46% seed
+ + - -   <0.77% seed
How confident are we that the qualitative data is
appropriate to describe a quantitative result?
Distribution of attribute within pooled samples: How many positive seeds
in 2 positive pools?
(4 pools of 500 seeds)
+ + - -   <0.46% seed
Best estimate = 0.14% seed
Two negative pools: all seeds negative (1000 seeds)
# positive seeds   Probability*
2-3                ~65.4%
4-5                ~24.9%
6-7                ~7.8%
8-9                ~1.4%
>9                 ~0.4%
* Probability of set number of positives given that two pools are negative
[Figure: Inputs and Outputs]
Threshold Testing VS UCL
[Figure: four estimates μ̂, each with its upper limit μ̂UL, compared with the LQL on an axis from 0.0% to 2.0%]
Threshold Testing VS UCL
Estimation limitations for small # of pools
Real-time PCR assays also have a problem estimating higher AP impurity levels, due to the asymptote of cycles at higher impurity
[Figure: % impurity (axis from 0 to 8) for the pooling schemes 2 of 1500, 3 of 1000, 5 of 600, 10 of 300, 15 of 200, 20 of 150, 30 of 100 and 60 of 50]
Limited information if all pools are positive
• Test 10 pools of 300 seeds and all are
positive
– Impurity estimate = 100% BUT
– 95% confident that impurity in lot is
between 0.45% and 100%!!!
• Test 3 pools of 1000 and all positive
– 95% confident that impurity in lot is
between 0.05% and 100%!!!
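When every pool is positive the upper limit is 100% by construction, so only the lower limit carries information. The sketch below is an illustration, assuming no assay error; the function name is hypothetical. Recomputing the two examples suggests that the slide's figures correspond to placing the whole 5% in the lower tail, which is an inference from the numbers rather than something the slide states. With d = n the tail equation P(X ≥ n) = p^n = α has a closed-form solution.

```python
def lower_limit_all_positive(n, m, alpha=0.05):
    """Seed-level lower limit when all n pools of m seeds test positive."""
    pool_lower = alpha ** (1 / n)           # solves p**n = alpha
    return 1 - (1 - pool_lower) ** (1 / m)  # convert pool level to seed level

print(f"10 pools of 300: {lower_limit_all_positive(10, 300):.2%} to 100%")
print(f"3 pools of 1000: {lower_limit_all_positive(3, 1000):.2%} to 100%")
```

Fewer, larger pools push the lower limit down, so an all-positive result says less about the lot.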
Demonstration and
Exercises in Seedcalc