Scientific Maturity of Cloud Products


Preserving Cloud Information
Bruce R. Barkstrom & John J. Bates
NCDC
Outline
► Fundamental Preservation Commandments
► Questions
  ▪ Variability Quantification
  ▪ Error Analysis and Physics
► Costs
► What Can We Do Now?
Four Commandments for Preserving Information
1. Thou shalt not be forced to preserve information before it is ready
2. Thou shalt not lose information – if possible
3. Thou shalt not cost more than necessary
4. Thou must make data accessible and valuable
  ▪ To current users
  ▪ To future users
When is Data Ready for Preservation?
► When we have a good model of the underlying “natural variability” and “expected climate change” of the fields being measured
  ▪ Not just mean and standard deviation – current applications need description of extreme events
  ▪ Need regional time variations
► When we have a physical basis for estimating errors and their impact on climate change detectability
  ▪ Need more than just measurement statistics
  ▪ Must include probability distribution of possible biases
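To illustrate why measurement statistics alone can understate trend uncertainty, here is a minimal Monte Carlo sketch. It is not from the slides: the Gaussian model for an unknown linear calibration drift, the function name `trend_spread`, and every number used are assumptions chosen only for illustration.

```python
import numpy as np

def trend_spread(n_years, noise_std, drift_std, n_trials=2000, seed=0):
    """Spread (standard deviation) of fitted linear trends, in units per year.

    Each synthetic record has zero true trend, independent yearly noise,
    and one unknown linear calibration drift drawn from N(0, drift_std).
    The drift distribution is a hypothetical stand-in for a
    "probability distribution of possible biases".
    """
    rng = np.random.default_rng(seed)
    years = np.arange(n_years)
    noise = rng.normal(0.0, noise_std, size=(n_trials, n_years))
    drift = rng.normal(0.0, drift_std, size=(n_trials, 1)) * years
    records = noise + drift
    # np.polyfit fits each column of a 2-D y independently; row 0 holds the slopes.
    slopes = np.polyfit(years, records.T, 1)[0]
    return slopes.std()

# Illustrative values only: 20 years of data, unit noise, drift of 0.05 units/year (1 sigma).
print("noise only:      ", trend_spread(20, 1.0, 0.0))
print("noise plus drift:", trend_spread(20, 1.0, 0.05))
```

Under these assumptions the second spread is larger than the first, so a trend that looks detectable from the noise statistics alone may not be once plausible biases are included.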
Quantification of Field Variability
► The “variability Turing test”:
  ▪ Can you generate an ensemble of computer-generated fields with statistics that are indistinguishable from those of the real field?
► The “climate Turing test”:
  ▪ Can you generate a model of “trends” whose statistics are indistinguishable from those of the expected climate changes?
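One way to make the “variability Turing test” concrete is to compare summary statistics of a real ensemble against a generated one. The sketch below is a rough illustration, not the authors' method: the smoothed-noise stand-in for “real” fields, the choice of statistics, the crude 3-sigma criterion, and all function names are assumptions.

```python
import numpy as np

def field_stats(field):
    """Summary statistics of a 2-D field, including one spatial-structure measure."""
    anomalies = field - field.mean()
    # Lag-1 autocorrelation along the second axis (neighbouring columns).
    lag1 = (anomalies[:, :-1] * anomalies[:, 1:]).mean() / anomalies.var()
    return {
        "mean": field.mean(),
        "std": field.std(),
        "p99": np.percentile(field, 99),
        "lag1": lag1,
    }

def stand_in_real_field(rng, shape=(64, 128), width=8):
    """Hypothetical 'real' field: white noise with a periodic running mean along x."""
    noise = rng.standard_normal(shape)
    return sum(np.roll(noise, k, axis=1) for k in range(width)) / width

def white_noise_like(rng, field):
    """Candidate generator that matches only the field's mean and standard deviation."""
    return rng.normal(field.mean(), field.std(), size=field.shape)

def distinguishable(real_fields, synthetic_fields, n_sigma=3.0):
    """Flag each statistic whose ensemble means differ by more than n_sigma spreads."""
    real = [field_stats(f) for f in real_fields]
    synth = [field_stats(f) for f in synthetic_fields]
    verdict = {}
    for name in real[0]:
        r = np.array([stats[name] for stats in real])
        s = np.array([stats[name] for stats in synth])
        spread = np.sqrt(r.var() + s.var())
        verdict[name] = bool(abs(r.mean() - s.mean()) > n_sigma * spread)
    return verdict

rng = np.random.default_rng(0)
real = [stand_in_real_field(rng) for _ in range(20)]
fake = [white_noise_like(rng, f) for f in real]
print(distinguishable(real, fake))
```

In this setup the white-noise candidate matches the mean and standard deviation by construction but carries no spatial correlation, so a test that looks only at means and standard deviations cannot tell the two ensembles apart while the lag-1 statistic can.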
Current State
► Measurement “Requirements” for Climate are usually stated as global values of means and standard deviations
► Corresponding statistics can be generated by appropriate white noise
► Is this adequate?
  ▪ Probably not – cloud variations are more complex than a global mean and simple latitudinal variations
► Can we come up with a common basis for stating variability across Earth science?
  ▪ Regional?
  ▪ Regional with moving systems?
No Preservation Without Understandable Error Assessments
► Error assessments for climate data records are difficult
  ▪ Need physical basis for estimating uncertainties, not just internally consistent measurement statistics
  ▪ Error assessments must be tied to algorithm code – data editing is as important as coefficients or outlines of algorithms
  ▪ Error estimates are not believable if the entire data production process is not publicly understandable
Current State
► Algorithm Theoretical Basis Documents do not necessarily represent the “as-built” algorithms with their data editing
► EOS data production systems are “overwhelmingly complex”
  ▪ May need new documentation tools to provide understanding – 100,000 lines of code is not readable in a Sunday afternoon
► As Science Teams disperse, community knowledge will be lost unless we take steps to prevent it
  ▪ May need to develop “data scholars”
Action Items
1. Can this workshop produce an understandable, quantitative description of cloud variability – and of expected cloud property changes?
2. Is it possible to develop a community-accepted standard checklist of errors for cloud properties?
Sample Error Checklist
► Are the “as-built” instrument drawings available?
► Is the ground calibration data available?
► Is there a computational math model of the instrument that includes all of the physics of the measurement?
► How was the gain determined?
► How was the spectral response determined?
► How was the Point Spread Function measured?
►…
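A checklist like this could also be kept in machine-readable form next to a data set, so that open items are easy to report. The sketch below is only a suggestion: the names `CHECKLIST` and `open_items` are hypothetical, and the list holds just the six sample questions above, not a complete or community-accepted set.

```python
# The six sample questions from the slide; the slide's list is explicitly incomplete.
CHECKLIST = [
    "Are the as-built instrument drawings available?",
    "Is the ground calibration data available?",
    "Is there a computational math model of the instrument that includes "
    "all of the physics of the measurement?",
    "How was the gain determined?",
    "How was the spectral response determined?",
    "How was the Point Spread Function measured?",
]

def open_items(answers):
    """Return the checklist questions that have no (or an empty) answer recorded."""
    return [question for question in CHECKLIST if not answers.get(question)]

# Hypothetical record for one data product: only the first question is answered.
answers = {CHECKLIST[0]: "Yes, held by the designated archive"}
for question in open_items(answers):
    print("OPEN:", question)
```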
Models for Preservation Funding
► The Cemetery Model:
  ▪ Pay when the body is deposited; live off the interest
► The Advanced Cemetery Model:
  ▪ Pay for the previous bodies, as well as the one you’re depositing; make sure to add new bodies (the Cemetery as Pyramid)
► The Cemetery as Theme Park:
  ▪ Make the cemetery interesting to visit; charge admission
► The Public Broadcasting Approach:
  ▪ Beg for support annually – and ask for volunteers
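The arithmetic behind the Cemetery Model can be sketched as a simple perpetuity: the one-time deposit must be large enough that its interest covers the yearly cost of care. The function name and the cost and interest figures below are assumptions for illustration, not values from the presentation.

```python
def endowment_for_perpetual_care(annual_cost, real_interest_rate):
    """One-time deposit whose interest alone covers annual_cost every year.

    Assumes a constant annual cost and a constant interest rate net of
    inflation (both are simplifying assumptions).
    """
    if real_interest_rate <= 0:
        raise ValueError("interest rate must be positive for the deposit to sustain itself")
    return annual_cost / real_interest_rate

# Hypothetical numbers: 10,000 per year of archive cost at a 4% real return.
deposit = endowment_for_perpetual_care(annual_cost=10_000.0, real_interest_rate=0.04)
print(f"One-time deposit needed: {deposit:,.0f}")
```

The deposit scales inversely with the interest rate, so halving the assumed return doubles what must be paid when the data are deposited.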
Actions That Can Reduce Preservation Costs and Risk
► Arrange a “Submission Agreement” (data will) with your designated archive
► Gather required original documents and make sure your archive can accept them
  ▪ Drawings
  ▪ Calibration plans and procedures
  ▪ Science Team minutes
  ▪ Source Code
► Arrange peer review of documentation
Summary
► Our data will not survive without careful thought to ensure
  ▪ Physical insight into the measured variables and the measurement process
  ▪ Adequate public access to the measurement process
  ▪ Cost-effective archival
► Archives know less than you do about your data; if you don’t act to preserve that information, archives can’t preserve it!