Estimation of uncertainty
in status class assessment
for Wel waterbodies
Jannicke Moe (NIVA)
deWELopment project meeting
10.02.2011, Warsaw
Polsko-Norweski Fundusz Badań Naukowych / Polish-Norwegian Research Fund
From the deWELopment project description:
Third phase: from single metrics to BQE-level assessment
• We will test different methods of combination of these single
metric results to obtain a total result at the whole element
level, taking into account the uncertainty in the different single
metrics
• In our project we will test alternative approaches
[to the one-out-all-out principle] using different methods
– simple averaging, weighted averaging, multimetric approach
From the deWELopment project description:
Fourth phase: from BQE-level to waterbody-level assessment
• Testing different ways of combining the assessment results for
different BQEs into one final result for the whole waterbody.
– Here the recommended one-out-all-out rule will be compared
to other alternative methods.
• The risk of misclassification will be estimated
– software STARBUGS (WISERBUGS)
Outline
• Uncertainty and risk of misclassification at BQE level
• Integration of uncertainty from BQE level to waterbody level
• WISERBUGS tool: examples with deWELopment results
• Next steps
Uncertainty and risk of misclassification
at BQE level
Uncertainty: "accuracy" vs. "precision"
• High accuracy, but low precision
– "roughly right"
• High accuracy and high precision
– optimal result
• Low accuracy, but high precision
– "precisely wrong"
[Figure: three panels, one per case above, each showing measured metric values relative to the true value]
• We can never know the ”true value” of a BQE metric
– only the measured value
• Standard Deviation (SD) is a measure of precision, not accuracy
Uncertainty at BQE level
• If we can never know the ”true value” of a BQE metric
– how can we say something about uncertainty???
• We must assume that the measured mean value represents
the true value and the true status class
• We can assume that measured metric values follow
a normal distribution due to sampling uncertainty
• We can let the measured SD represent sampling uncertainty
Example:
Measured metric values: 3, 5, 5, 6, 6
Mean: 5
Standard Deviation: 1.22
[Figure: histogram of the example data; x-axis: BQE metric value, y-axis: density]
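The example values can be checked with a few lines of Python. This is only a sketch using the standard library. The SD of 1.22 on the slide corresponds to the sample standard deviation (n - 1 denominator); the population formula would give about 1.10 instead.

```python
import statistics

# Measured metric values from the example above
values = [3, 5, 5, 6, 6]

mean = statistics.mean(values)
# statistics.stdev uses the n - 1 denominator (sample SD)
sd = statistics.stdev(values)

print(f"mean = {mean}, SD = {sd:.2f}")
```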
Risk of misclassification - BQE level
• With the given mean and SD, we can test:
if we re-sample the BQE many times, how often will we get
the ”true” status class?
• Assuming that measured status class = true status class
Risk of misclassification:
= proportion of ”new samples” which result in wrong status class
= 0.7% + 20% + 20% + 0.7%
= 41.4%
[Figure: probability density function of the BQE metric value (0-10), mean marked, SD = 1.22; area per status class: p=0.7%, p=20%, p=58.6%, p=20%, p=0.7%]
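The percentages above can be reproduced with a short sketch. It assumes a normal distribution with mean 5 and the SD of the example data, and class boundaries at 2, 4, 6 and 8. The boundaries are not stated on the slide; they are an assumption chosen because they give the same areas. The function name is hypothetical.

```python
from statistics import NormalDist

# Assumed for illustration: five equal-width classes on a 0-10 scale
boundaries = [2.0, 4.0, 6.0, 8.0]

def class_probabilities(mean, sd, boundaries):
    """Probability that a new sample falls in each class interval."""
    dist = NormalDist(mu=mean, sigma=sd)
    cdf = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

mean, sd = 5.0, 1.5 ** 0.5   # SD of the example data, about 1.22
probs = class_probabilities(mean, sd, boundaries)

# The "true" class is taken to be the one containing the measured mean
true_class = sum(mean >= b for b in boundaries)
risk = 1.0 - probs[true_class]

print([f"{p:.1%}" for p in probs])
print(f"risk of misclassification = {risk:.1%}")
```

The risk is simply one minus the probability mass of the class that holds the mean, so it depends only on the mean, the SD and the two nearest boundaries.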
Risk of misclassification increases with SD
Higher SD gives flatter distribution
=> higher probability that new BQE values fall outside the true class
Example:
- SD increases from 1.0 to 1.5
- Probability of misclassification increases from 32% to 50%
- ”True class” is Moderate, but 25% probability that new samples will result in Good or High
[Figure, first panel: probability density function of BQE value (0-10), mean marked, SD = 1; area per status class: p=0.1%, p=15.7%, p=68.3%, p=15.7%, p=0.1%]
[Figure, second panel: probability density function of BQE value (0-10), mean marked, SD = 1.5; area per status class: p=2.3%, p=23%, p=49.5%, p=23%, p=2.2%]
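When the mean sits at the midpoint of its class, the effect of SD has a simple closed form. The derivation below is a sketch under assumptions not stated on the slide: a normal distribution and a class of half-width h = 1 around the mean (for example boundaries at 4 and 6 around a mean of 5). Φ denotes the standard normal cumulative distribution function.

```latex
\begin{align*}
P(\text{misclassification})
  &= 1 - P(\mu - h < X < \mu + h)
   = 2\left[1 - \Phi\!\left(\frac{h}{\sigma}\right)\right] \\
\sigma = 1.0:\quad
  & 2\,[1 - \Phi(1.00)] \approx 2\,(1 - 0.8413) \approx 31.7\% \\
\sigma = 1.5:\quad
  & 2\,[1 - \Phi(0.67)] \approx 2\,(1 - 0.7475) \approx 50.5\%
\end{align*}
```

These values agree with the rounded 32% and 50% quoted in the example.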
Risk of misclassification increases near class borders
If the measured mean BQE is close to a class border, then
new BQE values are more likely to fall into a neighbour class
Example:
- Mean BQE decreases from 5 to 4.2
- Probability of misclassification increases from 32% to 46%
- ”True class” is Moderate, but 42% probability that new samples will result in Good or High
[Figure, first panel: probability density function of BQE value (0-10), mean marked, SD = 1; area per status class: p=0.1%, p=15.7%, p=68.3%, p=15.7%, p=0.1%]
[Figure, second panel: probability density function of BQE value (0-10), mean marked closer to a class border, SD = 1; area per status class: p=1.4%, p=40.7%, p=54.3%, p=3.6%, p=0%]
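Near a border the risk becomes one-sided, which the following sketch makes explicit by splitting it into the part below and the part above the true class. Class boundaries at 2, 4, 6 and 8 are again an assumption, and the function name is hypothetical. In the example above, the lower side is the one labelled Good or High.

```python
from bisect import bisect_right
from statistics import NormalDist

BOUNDARIES = [2.0, 4.0, 6.0, 8.0]  # assumed class boundaries (not stated on the slide)

def misclassification_split(mean, sd, boundaries=BOUNDARIES):
    """Return (P(below true class), P(above true class)) for a new sample."""
    dist = NormalDist(mean, sd)
    k = bisect_right(boundaries, mean)  # index of the class holding the mean
    below = dist.cdf(boundaries[k - 1]) if k > 0 else 0.0
    above = 1.0 - dist.cdf(boundaries[k]) if k < len(boundaries) else 0.0
    return below, above

for mean in (5.0, 4.2):
    below, above = misclassification_split(mean, sd=1.0)
    print(f"mean={mean}: below={below:.1%}, above={above:.1%}, "
          f"total={below + above:.1%}")
```

With the mean at 4.2, almost all of the risk comes from the border at 4, while the contribution from the far border at 6 shrinks.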
How can we obtain SD for BQE level?
• Can we get reliable SD estimates from deWELopment data?
– Many samples are needed per BQE and waterbody –
we have ”only” 1-2 samples per BQE and waterbody (?)
– Metric values are not necessarily normally distributed
• Other distributions can be considered
• What are the alternatives?
– ”Use best-available information from replicated sampling studies
on environmentally similar waterbodies” (WISER data?)
– Use best guesses
– (Ask WISER WP6.1 for advice)
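Why 1-2 samples are too few can be illustrated with a small simulation. This is a sketch with made-up numbers, not an analysis of deWELopment data: it draws replicate samples from a normal distribution with a hypothetical true SD of 1.0 and reports how widely the estimated SD scatters for different numbers of samples.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_SD = 1.0    # hypothetical "true" sampling SD
N_TRIALS = 5000  # number of simulated sampling campaigns

def sd_estimate_range(n_samples):
    """5th and 95th percentile of SD estimates based on n_samples replicates."""
    estimates = sorted(
        statistics.stdev([random.gauss(5.0, TRUE_SD) for _ in range(n_samples)])
        for _ in range(N_TRIALS)
    )
    return estimates[int(0.05 * N_TRIALS)], estimates[int(0.95 * N_TRIALS)]

for n in (2, 5, 20):
    lo, hi = sd_estimate_range(n)
    print(f"n={n:2d}: 90% of SD estimates fall between {lo:.2f} and {hi:.2f}")
```

With two replicates the estimate can be almost anything between near zero and about twice the true value, which supports borrowing SD values from replicated studies on similar waterbodies.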
Integration of uncertainty from BQE level
to waterbody level
Combining metrics and BQEs
[Figure: diagram with the labels Macroinvertebrates, Phytobenthos, Hydrology, Acidification, Organic]
Combining metrics and BQEs
Two issues:
1) How to combine status classes for different metrics and BQEs
• Average; weighted average; one-out-all-out; etc.
• Will not be discussed here (see my presentation June 2010)
2) How to combine uncertainty from different metrics and BQEs
• Tool: WISERBUGS
WISERBUGS – brief introduction
• WISER Bioassessment Uncertainty Guidance Software
• Excel-based tool developed within EU project WISER
• Purpose: assist in quantifying uncertainty in the assessment of
ecological status of waterbodies
• Can be used for testing impact on classification of:
– Combination rules for metrics and for BQEs
– Class boundaries and reference conditions
– Sampling uncertainty – SD (per metric)
– Sorting/identification uncertainty – SD (per metric)
– Uncertainty in reference condition – SD (per metric)
• Cannot be used for
– estimating SD for metrics (must be done separately)
– estimating type I/II errors (because true status class is not known)
WISERBUGS – brief introduction
WISERBUGS output: probability of each status class for...
• each waterbody (overall assessment)
• each BQE within a waterbody
• each metric within a BQE
[Figure: bar chart of probability (y-axis) per status class (x-axis): High, Good, Moderate, Poor, Bad]
WISERBUGS – brief introduction
WISERBUGS input
• For each metric:
– Measured value for each sample in each waterbody
– SD representing sampling variation
– Class boundaries (H/G, G/M, M/P, P/B)
– ”E1”: metric value for which EQR = 1 (Reference value)
– ”E0”: metric value for which EQR = 0 (bottom of metric scale)
– (SD representing sorting/identification variation)
– (SD for reference value)
• For overall assessment:
– Combination rules for metrics within BQE
– Combination rules for BQEs within waterbody
– (Correlation between metrics)
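The roles of E0 and E1 can be illustrated with a small sketch. It assumes a linear rescaling between the two anchor values, which is an assumption here rather than something stated on the slide. The function names, class boundaries and metric values are all hypothetical.

```python
def eqr(value, e0, e1):
    """Linear rescaling: value == e0 gives EQR 0, value == e1 gives EQR 1."""
    if e1 == e0:
        raise ValueError("E0 and E1 must differ")
    return (value - e0) / (e1 - e0)

def status_class(eqr_value, boundaries):
    """Classify an EQR given boundaries in the order H/G, G/M, M/P, P/B."""
    labels = ["High", "Good", "Moderate", "Poor", "Bad"]
    for label, bound in zip(labels, boundaries):
        if eqr_value >= bound:
            return label
    return "Bad"

# Hypothetical metric where low values indicate good status (E1 < E0)
e1, e0 = 2.0, 10.0
boundaries = [0.8, 0.6, 0.4, 0.2]  # hypothetical boundaries on the EQR scale

value = 5.0
q = eqr(value, e0, e1)
print(f"EQR = {q:.3f}, class = {status_class(q, boundaries)}")
```

The sketch only works if the boundaries and the value passed to `status_class` are on the same scale; supplying boundaries on the original metric scale together with an EQR value would give a wrong class without any error message.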
WISERBUGS tool:
examples with deWELopment results
Data received from deWELopment
• Selected metrics:
– Rivers:
• MP: River Macrophyte Index
• PB: Multimetric Diatom Index for rivers
• MI: Benthic Macroinvertebrate Index
• FI: European Fish Index +
– Lakes:
• PP: Chlorophyll a; Phytoplankton Metric for Polish Lakes
• MP: Ecological State Macrophyte Index
• PB: Diatom Index for Lakes
• MI: Benthic Quality Index based on Chironomid Pupal Exuvial Technique
• FI: Fish Index 'Summ Best' (no class boundaries yet)
Data received from deWELopment
• Metric values:
– 13 rivers + 10 lakes
– Usually all BQEs for each waterbody
• Standard deviations per metric: not available
– Randomised values used for this exercise
– (To be discussed)
• Class boundaries and reference conditions (”E1”) per metric:
– Sometimes waterbody-specific (OK for WISERBUGS)
• ”E0” – given in correct scale?
Metric specification - 1 (Lakes)
[Figure annotations: metric names; other details; class boundaries]
Metric specification - 2 (Lakes)
[Figure annotations: SD from sampling variation; other types of variation]
NB: SD values are made up!
Metric specification - 3 (Lakes)
[Figure annotations: grouping of metrics by BQE; grouping of metrics by pressure within BQE]
Metric specification - 4 (Lakes)
[Figure annotations: rule for combining BQEs within waterbody (here: one-out-all-out); rule for combining metrics within BQE; weighting of each BQE]
Results: assessment for individual metrics
NB: Fake SD values - results must not be interpreted as real.
Results: assessment combined per BQE
and per waterbody
• Combination rule for total assessment: ”worst case” (all 4 BQEs)
Results: assessment for BQEs and waterbody
• Combination rule for total assessment: ”average” (all 4 BQEs)
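How the combination rule changes the class probabilities at waterbody level can be sketched with a small Monte Carlo simulation. This is not the WISERBUGS implementation. The BQE labels reuse the lake abbreviations, but all EQR means, SDs and class boundaries are made up, the BQEs are treated as independent, and "average" is taken here to mean classifying the mean EQR across BQEs.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch is repeatable

CLASSES = ["High", "Good", "Moderate", "Poor", "Bad"]
BOUNDARIES = [0.8, 0.6, 0.4, 0.2]  # hypothetical EQR boundaries H/G, G/M, M/P, P/B

# Hypothetical (mean EQR, sampling SD) per BQE; not deWELopment data
BQES = {"PP": (0.72, 0.08), "MP": (0.65, 0.10),
        "PB": (0.58, 0.06), "MI": (0.81, 0.12)}

def class_index(eqr):
    """0 = High ... 4 = Bad."""
    return sum(eqr < b for b in BOUNDARIES)

def simulate(rule, n_sim=20000):
    """Probability of each waterbody status class under a combination rule."""
    counts = Counter()
    for _ in range(n_sim):
        eqrs = [random.gauss(m, sd) for m, sd in BQES.values()]
        if rule == "worst":      # one-out-all-out: worst BQE decides
            combined = max(class_index(e) for e in eqrs)
        elif rule == "average":  # classify the mean EQR across BQEs
            combined = class_index(sum(eqrs) / len(eqrs))
        else:
            raise ValueError(f"unknown rule: {rule}")
        counts[CLASSES[combined]] += 1
    return {c: counts[c] / n_sim for c in CLASSES}

for rule in ("worst", "average"):
    print(rule, {c: round(p, 3) for c, p in simulate(rule).items()})
```

In this made-up case the worst-case rule shifts probability towards poorer classes, because a single uncertain BQE falling below a boundary is enough to downgrade the whole waterbody, whereas averaging dampens the sampling noise of the individual BQEs.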
Next steps
Next steps for data analysis
• Try to obtain SD estimates / best guesses from BQE groups
• Quality-check of class boundaries, E0, E1 etc.
– Confusion EQR scale vs. original metric scale?
• Explore impact on classification of...
– different levels of uncertainty (sampling SD)
– different combination rules
– etc.
– what is most useful for deWELopment?
• Estimate risk of misclassification for selected cases
– ”True class” will be determined by the given metric values,
and an agreed set of SD and combination rules
• Include physico-chemical parameters in assessment?
• Other suggestions?
Publication
• 1 manuscript on exploring risk of misclassification using the
metrics and classification system developed by
deWELopment and the WISERBUGS tool
• Potential co-authors
– BQE group leaders ?
– Gosia ?
– Coordinators: Hanna and Anne (or acknowledgement?)
– WISERBUGS author: Ralph Clarke (or acknowledgement?)