National Research Council Canada / Conseil national de recherches
POOLED DATA DISTRIBUTIONS
GRAPHICAL AND STATISTICAL TOOLS FOR EXAMINING COMPARISON REFERENCE VALUES
Alan Steele, Ken Hill, and Rob Douglas
National Research Council of Canada
Measurement comparison data sets are generally summarized using a simple statistical reference value calculated
from the pool of the participants’ results. Consideration of the comparison data sets, particularly with regard to the
consequences and implications of such data pooling, can allow informed decisions regarding the appropriateness of
choosing a simple statistical reference value. Graphs of the relevant distributions provide insight into this problem.
Introduction
• Comparison data collection and analysis continue to grow in
importance among the tasks of international metrology
• Sample distributions and populations are routinely considered
when preparing the summary of the comparison
• Reference values (KCRVs) are often calculated from the
measurement data supplied by the participants
• We believe that graphical techniques are an aid to
understanding and communication in this field
Steele, Hill, and Douglas: Pooled Data Distributions
The Normal Approach
• Generally, the initial implicit assumption is that all of the
participants’ data, as x_i/u_i, represent individual samples from a
single (normal) population
• A coherent picture of the population mean and standard
deviation can be built from the comparison data set that is fully
consistent with the reported values and uncertainties
• Most outlier-test protocols rely on this assumption to identify
when and if a given laboratory result should be excluded, since
its inclusion would violate this internal consistency
Pooled Data Distributions
• Creating pooled data distributions tackles this problem from the
opposite direction
• The independent distributions reported by each participant
(through their value and uncertainty) are summed directly
• Result is taken as representative of the underlying population as
revealed in the comparison measurements
• Monte Carlo methods are useful when calculations involve
Student distributions or medians rather than means
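The direct summation described above can be sketched in a few lines. This is a minimal illustration, not the authors' toolkit: the names `LABS`, `student_pdf`, and `pooled_pdf` are hypothetical, the lab values are the CCAUV.UK1 figures quoted in this deck, each reported u is assumed to be the scale of a location-scale Student distribution, and the division by N (so that the pooled curve has unit area) is an added assumption.

```python
import math

# (value, standard uncertainty, degrees of freedom) per lab, in mW,
# taken from the CCAUV.UK1 example in this deck.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}


def student_pdf(x, mu, sigma, nu):
    """Location-scale Student PDF; sigma is treated as the scale (assumption)."""
    z = (x - mu) / sigma
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(z * z / nu)) / sigma


def pooled_pdf(x, labs=LABS):
    """Equal-weight sum of the lab PDFs, divided by N so the area is 1."""
    return sum(student_pdf(x, *params) for params in labs.values()) / len(labs)


# Evaluate on a grid and look for local maxima (modes) of the pooled curve.
STEP = 0.05
xs = [60.0 + STEP * i for i in range(2401)]          # 60 mW to 180 mW
ys = [pooled_pdf(x) for x in xs]
area = sum(0.5 * (ys[i] + ys[i + 1]) * STEP for i in range(len(xs) - 1))
modes = [xs[i] for i in range(1, len(xs) - 1) if ys[i - 1] < ys[i] > ys[i + 1]]

if __name__ == "__main__":
    print("area under pooled PDF:", round(area, 4))
    print("local maxima near (mW):", [round(m, 2) for m in modes])
```

Counting the local maxima is one simple way to check numerically whether the pooled population is multi-modal.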
Monte Carlo Calculations
• Example shows Student distribution transform
• Transformation from uniform to any distribution is done via the
cumulative distribution function (CDF)
• High quality linear congruential uniform random number
generators are easy to find
• Our Excel Toolkit includes an external DLL for doing fast
Monte Carlo simulations with multiple large arrays
[Figure: two panels plotted against x from -10 to 10, each showing the Student PDF (axis 0.00 to 0.20) and the Student CDF (axis 0.0 to 1.0); one panel adds a Student histogram (10 Events). Legend parameter values are -1, 2, and 4; the parameter symbols were lost in extraction.]
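The uniform-to-Student transform mentioned on this slide can be sketched as follows. This is an illustration under stated assumptions, not the Excel/DLL toolkit: the CDF is tabulated numerically on a truncated range and inverted by interpolation, Python's built-in generator (a Mersenne Twister, not a linear congruential generator) supplies the uniform numbers, and the names `make_inverse_cdf`, `inv_t`, and `draw_student` as well as the choice of 4 degrees of freedom are hypothetical.

```python
import bisect
import math
import random


def student_pdf(t, nu):
    """Standard Student PDF with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))


def make_inverse_cdf(pdf, lo, hi, n):
    """Tabulate the CDF of `pdf` on [lo, hi]; return its interpolated inverse."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    cdf = [0.0]
    for i in range(1, n + 1):
        # Trapezoid rule for the running integral of the PDF.
        cdf.append(cdf[-1] + 0.5 * h * (pdf(xs[i - 1]) + pdf(xs[i])))
    total = cdf[-1]
    cdf = [c / total for c in cdf]  # renormalise the truncated table to end at 1

    def inverse(u):
        """Map a uniform number u in [0, 1] to a value of x."""
        i = bisect.bisect_left(cdf, u)
        if i == 0:
            return xs[0]
        frac = (u - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
        return xs[i - 1] + frac * h

    return inverse


NU = 4  # assumed degrees of freedom for this illustration
inv_t = make_inverse_cdf(lambda t: student_pdf(t, NU), -100.0, 100.0, 20000)


def draw_student(mu, sigma, rng=random):
    """One Student variate with location mu and scale sigma."""
    return mu + sigma * inv_t(rng.random())


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(draw_student(-1.0, 2.0, rng), 3) for _ in range(5)])
```

The same `make_inverse_cdf` helper accepts any positive PDF, which is the sense in which the transform works for "any distribution"; the truncation range has to be wide enough for the tails that matter.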
Dealing with Student Distributions
[Figure: CDF(x, μ=0, σ=1, ν) for ν = 2…10. One panel shows the full curves, 0% to 100%, for x from -5 to 5; a second panel shows the region from 95% to 100% for x from 1.5 to 5.0, with an additional axis labelled k.]
• Student Cumulative Distribution Functions for different
Degrees of Freedom (ν = 2…10)
• Note that the line at 97.5% cumulative probability crosses each
curve at the coverage factor, k, appropriate for a 95%
confidence interval
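The crossing at 97.5% can be reproduced numerically. The sketch below is a plain-Python illustration (the function names are hypothetical): it integrates the standard Student PDF with Simpson's rule and finds the coverage factor k by bisection, assuming a symmetric two-sided 95% interval.

```python
import math


def student_pdf(t, nu):
    """Standard Student PDF with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))


def student_cdf(x, nu, steps=1000):
    """CDF by Simpson's rule on [0, |x|], using the symmetry of the PDF."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    s = student_pdf(0.0, nu) + student_pdf(a, nu)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * student_pdf(i * h, nu)
    half = s * h / 3
    return 0.5 + half if x > 0 else 0.5 - half


def coverage_factor(nu, p=0.95):
    """Solve CDF(k) = 0.5 + p/2 for k by bisection on [0, 20]."""
    target = 0.5 + p / 2  # 97.5% cumulative probability for a 95% interval
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if student_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for nu in range(2, 11):
        print(nu, round(coverage_factor(nu), 3))
```

The search bracket of [0, 20] is an assumption that holds for the degrees of freedom shown on the slide; very small ν or higher coverage would need a wider bracket.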
Example Data From KCDB
• Recent results for CCAUV.UK1
• Low power, 1.9 MHz: 5 Labs
• Finite degrees of freedom specified for all participants
• Data failed consistency check using weighted mean
• Median chosen as KCRV

Lab      PRef (mW)   u (mW)    ν
PTB        97.4       0.84      8.3
NIST       99         0.64      6.3
NPL        97.6       1.01     11
CSIRO     114.5       6.75      6.7
NIM        94         1.16     12.5

[Figure: PRef (mW) reported by PTB, NIST, NPL, CSIRO, and NIM; vertical axis from 90 to 140 mW]
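The weighted-mean consistency check mentioned on this slide can be illustrated with the tabulated values. This is a sketch of one common form of the test, assuming uncorrelated results and Gaussian statistics (the finite degrees of freedom are ignored here); the exact procedure used for the comparison is not given in the deck, and the function names are hypothetical.

```python
import statistics

# (value, standard uncertainty) per lab, in mW, from the table above.
LABS = {
    "PTB": (97.4, 0.84),
    "NIST": (99.0, 0.64),
    "NPL": (97.6, 1.01),
    "CSIRO": (114.5, 6.75),
    "NIM": (94.0, 1.16),
}

# 95th percentile of chi-squared with 4 degrees of freedom (N - 1 for 5 labs).
CHI2_CRIT_95_DOF4 = 9.488


def weighted_mean(values, uncerts):
    """Inverse-variance weighted mean and its standard uncertainty."""
    weights = [1.0 / u ** 2 for u in uncerts]
    wm = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return wm, sum(weights) ** -0.5


def chi_squared(values, uncerts, ref):
    """Sum of squared normalised deviations from the reference value."""
    return sum(((x - ref) / u) ** 2 for x, u in zip(values, uncerts))


values = [x for x, _ in LABS.values()]
uncerts = [u for _, u in LABS.values()]
wm, u_wm = weighted_mean(values, uncerts)
chi2 = chi_squared(values, uncerts, wm)
consistent = chi2 <= CHI2_CRIT_95_DOF4
median = statistics.median(values)

if __name__ == "__main__":
    print(f"weighted mean = {wm:.2f} mW (u = {u_wm:.2f} mW)")
    print(f"chi-squared = {chi2:.1f}, consistent at 95%: {consistent}")
    print(f"median = {median} mW")
```

Under these assumptions the observed chi-squared is well above the critical value, which agrees with the slide's statement that the data failed the check.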
Statistical Distributions
• Results of Monte Carlo simulation:
– lab distributions used to resample comparison
– pooled data histogram incremented once for each lab per event
– mean, weighted mean, and median calculated for each event
• Population revealed by measurement is multi-modal
and evidently not normal
[Figure: PDFs versus PRef (mW), 75 to 125 mW. One panel shows the individual lab distributions (CSIRO, NPL, NIST, PTB, NIM); the other shows the pooled data.]
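The three simulation steps listed on this slide can be sketched as below. It is a minimal stand-in for the authors' toolkit under stated assumptions: Student variates are generated as a normal deviate divided by the square root of a scaled chi-squared deviate (rather than by the CDF transform), each u is treated as the scale of the Student distribution, and the histogram range, bin count, and names (`simulate`, `student_t`) are hypothetical.

```python
import math
import random
import statistics

# (value, standard uncertainty, degrees of freedom) per lab, in mW.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}


def student_t(nu, rng):
    """Standard Student variate: normal / sqrt(chi-squared / nu)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-squared with nu dof
    return z / math.sqrt(chi2 / nu)


def simulate(n_events, seed=12345, lo=75.0, hi=125.0, n_bins=100):
    """Resample the comparison n_events times; return histogram and statistics."""
    rng = random.Random(seed)
    width = (hi - lo) / n_bins
    pooled = [0] * n_bins
    weights = [1.0 / u ** 2 for _, u, _ in LABS.values()]
    stats = {"mean": [], "weighted_mean": [], "median": []}
    for _ in range(n_events):
        # One event: one draw from every lab's reported distribution.
        draws = [x + u * student_t(nu, rng) for x, u, nu in LABS.values()]
        for d in draws:  # pooled histogram gets one entry per lab per event
            b = math.floor((d - lo) / width)
            if 0 <= b < n_bins:
                pooled[b] += 1
        stats["mean"].append(statistics.fmean(draws))
        stats["weighted_mean"].append(
            sum(w * d for w, d in zip(weights, draws)) / sum(weights))
        stats["median"].append(statistics.median(draws))
    return pooled, stats


if __name__ == "__main__":
    pooled, stats = simulate(20000)
    for name, vals in stats.items():
        print(name, round(statistics.fmean(vals), 2), round(statistics.stdev(vals), 2))
```

Histogramming each list in `stats` gives the estimator distributions that are overlaid on the pooled data in the next figure.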
Statistical Distributions
[Figure: PDFs versus PRef (mW), 75 to 125 mW. One panel shows the individual lab distributions (CSIRO, NPL, NIST, PTB, NIM); the other shows the pooled data with the distributions of the weighted mean, median, and simple mean overlaid.]
Advantages of Monte Carlo
• Technique is simple to implement
• Allows calculation of confidence intervals for statistics
• Covariances can be accommodated in straightforward manner
• Possible to include outlier rejection schemes
• Easy to track quantities of interest, such as probability of a given
participant being median laboratory
• Can consider other candidate reference values
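Two of the quantities listed above, a confidence interval for a statistic and the probability that a given participant is the median laboratory, can be tracked with very little extra bookkeeping. The sketch below reuses the CCAUV.UK1 values from this deck; the sampler, the percentile-based interval, and the function names are illustrative assumptions rather than the authors' implementation.

```python
import math
import random

# (value, standard uncertainty, degrees of freedom) per lab, in mW.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}


def student_t(nu, rng):
    """Standard Student variate: normal / sqrt(chi-squared / nu)."""
    z = rng.gauss(0.0, 1.0)
    return z / math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)


def median_lab_probabilities(n_events, seed=2024):
    """Return P(lab is the median lab) and a 95% interval for the median."""
    rng = random.Random(seed)
    counts = dict.fromkeys(LABS, 0)
    medians = []
    for _ in range(n_events):
        draws = sorted((x + u * student_t(nu, rng), name)
                       for name, (x, u, nu) in LABS.items())
        value, name = draws[len(draws) // 2]  # middle entry (odd number of labs)
        counts[name] += 1
        medians.append(value)
    medians.sort()
    # Simple percentile interval from the sorted Monte Carlo medians.
    lo = medians[int(0.025 * n_events)]
    hi = medians[int(0.975 * n_events) - 1]
    probs = {name: c / n_events for name, c in counts.items()}
    return probs, (lo, hi)


if __name__ == "__main__":
    probs, interval = median_lab_probabilities(20000)
    for name, p in probs.items():
        print(f"{name:6s} {p:.3f}")
    print("95% interval for the median (mW):", tuple(round(v, 2) for v in interval))
```

Taking the middle entry of the sorted draws works because the number of labs is odd; an even number of participants would need a rule for crediting the two central labs.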
Example: CCT-K3 Argon Point
• Another example from KCDB
• CCT-K3 Argon Triple Point
• Large variation in reported values
• Large variation in stated uncertainties
• No KCRV was assigned, based on data pooling analysis
[Figure: TLab - TPilot (mK) by laboratory; vertical axis from -4 to 2 mK]
Algorithmic Reference Values
• Linear combinations of simple estimators can be
used as robust estimators of location
• For CCT-K3, proposal to use simple average of mean,
weighted mean, and median
• Evaluation of any such algorithmic estimator is easy
to do with Monte Carlo
[Figure: PDFs of the weighted mean, ARV, simple mean, and median versus TLAB - TARV (mK), from -0.50 to 0.50 mK]
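An algorithmic reference value of the kind proposed here is straightforward to code and to evaluate by resampling. The CCT-K3 data are not reproduced in this deck, so the sketch below uses the CCAUV.UK1 values purely as stand-in inputs; the normal resampling (finite degrees of freedom ignored) and the function names are simplifying assumptions.

```python
import random
import statistics

# Stand-in inputs: CCAUV.UK1 values and uncertainties (mW), not CCT-K3 data.
VALUES = [97.4, 99.0, 97.6, 114.5, 94.0]
UNCERTS = [0.84, 0.64, 1.01, 6.75, 1.16]


def weighted_mean(values, uncerts):
    """Inverse-variance weighted mean."""
    weights = [1.0 / u ** 2 for u in uncerts]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def arv(values, uncerts):
    """Simple average of the mean, the weighted mean, and the median."""
    return (statistics.fmean(values)
            + weighted_mean(values, uncerts)
            + statistics.median(values)) / 3.0


def arv_distribution(n_events, seed=7):
    """Monte Carlo distribution of the ARV under normal resampling (assumption)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_events):
        draws = [rng.gauss(x, u) for x, u in zip(VALUES, UNCERTS)]
        out.append(arv(draws, UNCERTS))
    return out


if __name__ == "__main__":
    samples = arv_distribution(20000)
    print("ARV of reported values:", round(arv(VALUES, UNCERTS), 3))
    print("Monte Carlo spread (stdev):", round(statistics.stdev(samples), 3))
```

Any other estimator can be evaluated the same way by swapping the body of `arv`.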
Quantifying the Comparison
• Calculating a reference value, typically the variance-weighted
mean or the median, is a routine part of reporting comparisons
• The suitability of these statistics for representing the data set
can be checked using chi-squared testing
• It is also possible to perform such tests without invoking a
reference value by considering the data in pairwise fashion
• Advantages of pair statistics
– Always works, even before choosing a reference value
– More rigorous, since it can handle correlations exactly
– Explicit, following metrological chains of inference
Pair-Difference Distributions
• Similar to exclusive statistics
• Consider difference between one lab and “rest of world”
• Sum of per-lab differences is the all-pairs-difference (APD)
distribution; this is symmetric
• Width of APD is a measure of “global” quality assurance
for independent calibration of an artifact by two different
labs chosen at random
[Figure: pair-difference distributions Xj - PTB, Xj - NIST, Xj - NPL, Xj - CSIRO, and Xj - NIM, and their sum (APD) with median ± MAD marked, versus Measurement Difference (mW), from -40 to 40 mW]
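The per-lab and all-pairs difference distributions can be built by the same resampling approach. In this sketch the labs are assumed independent, MAD is read as the median absolute deviation (the deck does not expand the abbreviation), and the names `apd_samples` and `median_and_mad` are hypothetical.

```python
import math
import random
import statistics

# (value, standard uncertainty, degrees of freedom) per lab, in mW.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}


def student_t(nu, rng):
    """Standard Student variate: normal / sqrt(chi-squared / nu)."""
    z = rng.gauss(0.0, 1.0)
    return z / math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)


def apd_samples(n_events, seed=99):
    """Per-lab difference samples (X_j - X_i for j != i) and their pooled sum."""
    rng = random.Random(seed)
    names = list(LABS)
    per_lab = {name: [] for name in names}
    for _ in range(n_events):
        draws = {name: x + u * student_t(nu, rng)
                 for name, (x, u, nu) in LABS.items()}
        for i in names:
            for j in names:
                if i != j:
                    per_lab[i].append(draws[j] - draws[i])  # "rest of world" minus lab i
    apd = [d for diffs in per_lab.values() for d in diffs]
    return per_lab, apd


def median_and_mad(samples):
    """Median and median absolute deviation of a sample."""
    med = statistics.median(samples)
    mad = statistics.median(abs(s - med) for s in samples)
    return med, mad


if __name__ == "__main__":
    per_lab, apd = apd_samples(5000)
    print("APD median, MAD (mW):", median_and_mad(apd))
    for name, diffs in per_lab.items():
        print(name, round(statistics.median(diffs), 2))
```

Because every difference enters once with each sign, the pooled APD sample is symmetric about zero by construction, which matches the symmetry noted on the slide.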
Reduced Chi-Squared Testing
• Normalizing the pair differences by the pair uncertainties
allows us to build tests of the measurement capability claims
• This is still independent of any chosen reference value

\[ \chi^2_j = (N-1)^{-1} \sum_{i=1,\, i \neq j}^{N} \frac{(x_i - x_j)^2}{u_i^2 + u_j^2 - 2 r_{ij} u_i u_j} \]

\[ \text{APD: } \chi^2_r = N^{-1} \sum_{j=1}^{N} \chi^2_j \]

• This All Pairs Difference reduced χ² has N-1 degrees of freedom

Lab      χ²      Pr(χ² > χ²obs)
PTB      3.57    < 5.8×10⁻²
NIST     5.78    < 1.6×10⁻²
NPL      3.25    < 7.1×10⁻²
CSIRO    6.65    < 9.9×10⁻³
NIM      8.57    < 3.4×10⁻³
APD      5.57    = 1.8×10⁻⁴

• If a data set fails the APD χ² test, it will fail for every possible KCRV
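The pair statistics can be checked directly against the CCAUV.UK1 values given earlier in this deck. The sketch below assumes uncorrelated labs (a single correlation coefficient r, defaulting to 0, stands in for the general r_ij) and uses the closed-form chi-squared tail probability that exists for an even number of degrees of freedom; the function names are hypothetical. Under these assumptions the per-lab values and the APD value agree with the table to about 0.01.

```python
import math

# (value, standard uncertainty) per lab, in mW.
LABS = {
    "PTB": (97.4, 0.84),
    "NIST": (99.0, 0.64),
    "NPL": (97.6, 1.01),
    "CSIRO": (114.5, 6.75),
    "NIM": (94.0, 1.16),
}


def pair_chi2(labs, r=0.0):
    """Reduced chi-squared of each lab against all others (common r assumed)."""
    n = len(labs)
    out = {}
    for j, (xj, uj) in labs.items():
        total = 0.0
        for i, (xi, ui) in labs.items():
            if i != j:
                total += (xi - xj) ** 2 / (ui ** 2 + uj ** 2 - 2 * r * ui * uj)
        out[j] = total / (n - 1)
    return out


def apd_chi2(labs, r=0.0):
    """All-pairs-difference reduced chi-squared: mean of the per-lab values."""
    per_lab = pair_chi2(labs, r)
    return sum(per_lab.values()) / len(per_lab)


def chi2_sf_even_dof(x, dof):
    """Pr(chi-squared > x) for an even number of degrees of freedom."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(dof // 2))


per_lab = pair_chi2(LABS)
apd = apd_chi2(LABS)
dof = len(LABS) - 1
p_apd = chi2_sf_even_dof(dof * apd, dof)  # reduced value times dof gives chi-squared

if __name__ == "__main__":
    for name, value in per_lab.items():
        print(f"{name:6s} {value:.2f}")
    print(f"APD    {apd:.2f}  Pr = {p_apd:.1e}")
```

The closed-form tail probability only applies when N-1 is even, as it is for five labs; other cases would need a general incomplete gamma function.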
Conclusions
• Monte Carlo technique is fast and simple to implement
• Graphs provide a powerful tool for visual consideration of:
– Pooled data (sum distribution)
– Simple Estimators (mean, weighted mean, median)
– Other Estimators (any algorithm can be used)
• All-pairs reduced chi-squared statistic is egalitarian over
participants, and independent of choice of KCRV
• No single choice of KCRV can adequately represent a
comparison that fails the all-pairs-difference chi-squared test