Extracting Meaningful Data - University of Nebraska–Lincoln

Extracting Meaningful Data:
Distinguishing Signal from
Noise in Climate Change
Q. Steven Hu
School of Natural Resources
University of Nebraska-Lincoln
In general, noise is the part of information that we want to
tease out and signal is the part that we want to keep.
Noise is “stubborn” and always present and interferes with
signal, forcing us to identify it and find ways to separate it in
the data.
We don’t want to throw away all the data we have
collected because of the presence of noise, but we cannot keep
everything either. We want to keep the signal (the baby) after
clearing the noise (the bath water)!
Except for some “absolute noise,” noise and signal
are relative: they are determined by the interest
of the study.
Example 1: The atmosphere contains variations at rather
wide ranges of frequencies and spatial scales.
All those variations are signals to the climate. Yet, if we are
interested in studying a particular frequency variation, e.g.,
interannual variation – changes of rainfall from one summer
to the next, or variation at a specific spatial scale, e.g., the
synoptic scale – on the order of 1000 km, all the other signals
become “noises.” We must identify and remove them before
we can examine variations at the frequency and scale of
interest and understand their behavior and change.
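One simple way to picture this separation is with running means. The sketch below is an illustration only, not the method used in the talk: the rainfall record is synthetic, and the function names and window lengths are assumptions chosen for brevity. It splits a series into a slow part, a band at the scale of interest, and a fast part, so that whatever is outside the band can be treated as "noise" for that particular study.

```python
import math

def running_mean(series, window):
    """Centered running mean; windows are truncated at the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed

def split_scales(series, short_window, long_window):
    """Split a series into slow, band (scale of interest), and fast parts."""
    slow = running_mean(series, long_window)       # long-term drift
    smooth = running_mean(series, short_window)    # fastest wiggles removed
    band = [s - l for s, l in zip(smooth, slow)]   # what lies between the two scales
    fast = [x - s for x, s in zip(series, smooth)]
    return slow, band, fast

# Hypothetical 40-summer rainfall record (mm):
# a trend, a multi-year swing, and a year-to-year jitter.
years = list(range(40))
rain = [300 + 1.5 * y + 40 * math.sin(2 * math.pi * y / 8) + 15 * (-1) ** y
        for y in years]

slow, band, fast = split_scales(rain, short_window=3, long_window=15)
```

The three parts add back up to the original record, so nothing is thrown away; the study simply decides which part to call signal. Running means are a crude filter, and a real analysis would likely use a better-behaved one.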
Example 2: In phenological research, an
overwhelming number of studies have examined
long-term data on phenological patterns and life-events
of animals and plants, and found earlier
migration to breeding sites, birds laying eggs on
earlier dates, and plants flowering earlier. These
rather diverse yet consistent changes of phenology
are believed to be responses to a warming climate.
This phenological signal of change is an independent source of
information about climate change. To a great extent, this signal is free of
the errors and noises that result from gathering and processing
instrumental records, thus providing an independent check of
results derived from instrumental data.
However, the phenological change can only
tell us the direction of climate change and
cannot tell the rate or magnitude of the
change.
Can we know how many degrees the
temperature may have increased from how
many days earlier a bird or a butterfly has
migrated north to a certain latitude?
Not yet, because the correlations, which have been
exclusively used in connecting changes in evolution
of life-events of animals and plants with
environmental conditions, “do not allow us to discern
whether the earlier reproduction is a direct response
to warmer temperatures, or to other factors that
may also vary with climate, such as reproductive
resources and inter- and intra-specific competition,”
as elaborated in Post et al. (2001, in Proc. R. Soc.
London, B).
When we try to tease the information apart and single out
those responses, the one organism that biologically
connects a species’ reproductive behavior with
warmer temperature and dominates the other
organisms would be considered the signal, and the
rest would be noises.
This signal-noise relationship can change in different
analyses of varying aspects of the problem.
There are, of course, other ways to treat several
major organisms simultaneously, and even include
their nonlinear interactions (e.g., data mining).
To summarize the previous slides:
There are “absolute” noises and
erroneous information in data,
but more often the noises are
relative to the signal we want to
examine.
Now, let’s examine what the “absolute noises” are in
meteorological and climatic data:
► Instrumentation drift induced noise to data (ground sensor
and satellite drift)
► Instrumentation upgrading induced changes in data
(sensor differences)
► Station’s geographical location change induced shift to the
data (terrain and surface differences)
► Local surroundings change induced noise (a tree grows
into full canopy and cools the surroundings of the station,
inducing a cooling noise. Similarly the urban expansion
may warm a previously rural area, inducing a warming
noise to the rural station temperature data)
► Different ways observers read the instrument (low vs. high
angle)
► Observation time differences add noise in the data (for
precipitation in some frequencies)
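To see why a station relocation counts as noise, consider the sketch below. It uses a hypothetical station with invented numbers (none come from the talk): the true climate is flat, yet a move to a slightly warmer site creates an apparent warming trend. The adjustment at the end is deliberately simplistic and only works here because the true series is known to be flat and the date of the move is known.

```python
def linear_trend(values):
    """Ordinary least-squares slope of values against their index (units per step)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var

# Hypothetical station: the true climate is flat at 10.0 degC for 50 years,
# but in year 30 the station moves to a site that reads 0.6 degC warmer.
MOVE_YEAR = 30
true_temps = [10.0] * 50
recorded = [t + (0.6 if year >= MOVE_YEAR else 0.0)
            for year, t in enumerate(true_temps)]

true_trend = linear_trend(true_temps)      # no real change
spurious_trend = linear_trend(recorded)    # positive, created by the move alone

# With a documented move date, the shift can be estimated and removed.
# (Comparing before/after means is valid here only because the truth is flat.)
before = recorded[:MOVE_YEAR]
after = recorded[MOVE_YEAR:]
shift = sum(after) / len(after) - sum(before) / len(before)
adjusted = [r - (shift if year >= MOVE_YEAR else 0.0)
            for year, r in enumerate(recorded)]
```

Without station history the move date is unknown, and the same step could be mistaken for a real change in climate.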
How well have we extracted climate signal?
(How have we identified and treated the noises?)
Many of the noises have been identified and their effects
on signals minimized.
► Instrumentation drift induced noise to data (satellite position drift has been
calculated and included in retrieval schemes for surface temperature and precipitation)
► Instrumentation upgrading induced changes in data (may have been considered in
developing temperature data series)
► Observation time differences add noise in the data (observation time effects were
also estimated and included in finalizing the station-observed precipitation)
Others remain to be specified and their effects removed in
developing climate data (the lack of station history has made
the following noises very difficult to clarify).
► Station’s geographical location change induced shift to the data
► Local surroundings change induced noise
► Different ways observers read the instrument (can never be certain)
Those noises and biases in the climate data
have left uncertainties in the results derived
from the data.
They have made it particularly difficult to
examine local climate change: over large
regions the noise and biases may cancel
each other and reduce their effects on the
results, but at a single location they do not.
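The cancellation over large regions can be illustrated with a small simulation. This is a sketch under a strong assumption: the station biases are independent and centered on zero. The station count and the bias spread are invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N_STATIONS = 100
# Hypothetical, independent station biases (degC) from siting,
# observers, and instruments; assumed to be centered on zero.
biases = [random.gauss(0.0, 0.5) for _ in range(N_STATIONS)]

# Typical error when looking at one station at a time.
typical_local_error = sum(abs(b) for b in biases) / N_STATIONS

# Error left in the regional mean after the biases partly cancel.
regional_error = abs(sum(biases) / N_STATIONS)

print(f"typical single-station error: {typical_local_error:.3f} degC")
print(f"error of the regional mean:   {regional_error:.3f} degC")
```

A bias shared by many stations, such as widespread urban warming, would not cancel in this way; averaging only helps with the independent part of the noise.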
A very brief discussion on
Noise in climate models and their outputs
Let me use an example to show the noise resulting
from numerical treatment of the governing equations
of the atmospheric and oceanic motions.
The generalized linear system of governing
equations, describing a number of types of wave
motions in the atmosphere and ocean, can be
written as:
$$\frac{dU}{dt} = i\omega U, \qquad U = U(t).$$

Its solution is

$$U(t) = U(0)\, e^{i\omega t}.$$

In finite difference form (at time t = nΔt), this solution is written

$$U(n\Delta t) = U(0)\, e^{i\omega n \Delta t}.$$
So, the true solution to the equation should have an
invariant amplitude, U(0) = the initial amplitude.
In numerical models, various “finite differencing schemes” are
used to calculate the values of U at each time t and location.
These values are estimates of the true U’s and contain noises
intrinsic to those schemes.
To evaluate the effect of those “numerical noises” on the
solutions we use the von Neumann method. By defining a
variable λ (“distortion” of the solution from the true one)
$$U^{(n+1)} = \lambda\, U^{(n)}, \qquad \lambda = |\lambda|\, e^{i\theta},$$

we can get

$$U^{(n)} = |\lambda|^{n}\, U^{(0)}\, e^{i n \theta}.$$
Let’s focus on the amplitude of the modeled
solution, |λ|^n × U(0). It is different from the analytic (true)
solution, and the coefficient |λ|^n measures how different the
amplitude of U(nΔt) at time step n is from that of the true solution.
We have these possibilities:
1. |λ| > 1 → the numerical noise grows every time step and
quickly overwhelms the signal (the solution of the equation);
2. |λ| = 1 → neutral solution, noise is minimal (good); and
3. |λ| < 1 → damping solution, noise “erodes” the
signal and it will be gone during the model integration.
The value of |λ| for some popular time differencing
schemes is shown below.
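The table of values is not reproduced in this transcript. As an illustration, the sketch below computes |λ| for three textbook time differencing schemes applied to the oscillation equation above. The choice of schemes is an assumption and may not match the original slide; ω denotes the (real) frequency in the governing equation, and p = ωΔt is notation introduced here.

```python
def amplification_factors(p):
    """Return lambda for three schemes applied to dU/dt = i*omega*U, with p = omega*dt."""
    return {
        # U(n+1) = U(n) + i p U(n)
        "forward Euler": 1 + 1j * p,
        # U(n+1) = U(n) + i p U(n+1)
        "backward Euler": 1 / (1 - 1j * p),
        # U(n+1) = U(n) + i p (U(n) + U(n+1)) / 2
        "trapezoidal": (1 + 0.5j * p) / (1 - 0.5j * p),
    }

p = 0.1        # hypothetical value of omega * dt
n_steps = 200  # hypothetical number of time steps

for name, lam in amplification_factors(p).items():
    growth = abs(lam) ** n_steps  # amplitude after n steps relative to U(0)
    print(f"{name:15s} |lambda| = {abs(lam):.6f}   |lambda|^n = {growth:.4f}")
```

For any nonzero p the forward scheme has |λ| > 1 (case 1), the backward scheme has |λ| < 1 (case 3), and the trapezoidal scheme has |λ| = 1 (case 2), so the three schemes map directly onto the three possibilities listed above.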
To summarize: Although researchers have strived to
minimize the numerical noises resulting from the various
numerical schemes used in models, those noises
remain and leave numerical models and their
predictions of climate subject to uncertainties.
Concluding remarks:
► Except for the “absolute noise,” noise and signal are
relative and are determined by the nature of a problem.
► The sources of noise can be determined after the research
question is well defined.
► Various methods can be used to filter out or attenuate the
noises and minimize their effect on signal, although such
effect may always exist to varying magnitudes.
► Conventional climate data have many types of noises.
► Phenology data of life-events of animals and plants provide
an independent source of information to detect climate and
environmental change. A challenge for us to use the data
effectively is that the signals are biologically and chemically
intertwined with “noises.”
(“Sandhills Cranes in Flight” – photo by Michael Forsberg)