Uncertainty
-Uncertainty is always with us and can never
be eliminated from our lives. Our
understanding of the past and our
anticipation of the future will always be
obscured by uncertainty.
Image removed due to copyright: Book cover: Uncertain Science… Uncertain World by H.N. Pollack (see Google Books)
-All decisions about the future, big and small,
must be made in the absence of certainty.
Waiting until uncertainty is eliminated
before making decisions is always an
implicit endorsement of the status quo,
and often an excuse for maintaining it.
-As the future unfolds, “mid-course
corrections” can be made that take into
account new information and new
technology.
-Uncertainty, far from being a barrier to
progress, is actually a strong stimulus for, and an important ingredient of, creativity.
Uncertainty vs. Scientific
Uncertainty
We are going to distinguish uncertainty in
our everyday lives from scientific
uncertainty.
This is much like the way we distinguish a theory in our
everyday lives (the colloquial use of the word
“theory” to mean a guess or conjecture)
from a scientific theory (an explanation
supported by multiple lines of evidence
and accepted by the scientific community).
Uncertainty vs. Scientific
Uncertainty
Everyday uncertainty – “being unsettled or in
doubt or dependent on chance”
Scientific uncertainty – Uncertainty inherent
in the scientific process.
(Note: Skepticism is the viewpoint that
facilitates acceptance of scientific
uncertainty as a positive aspect of the
scientific process)
But, that does NOT mean that scientists do
not know anything, or that all scientific
knowledge is equally in question. We are
more certain about some knowledge than
others.
Science is the process of separating
the demonstrably false from the
probably true.
Types of scientific uncertainty
• Unpredictability because of human
interaction (unpredictability)
• Data (value uncertainty)
• Models (structural uncertainty)
Data uncertainty
What could have influenced the results from
rolling balls down the ramp?
As a note of interest, Galileo's experiment of rolling balls
down a ramp is often described as one of the first modern
scientific experiments.
Addressing scientific uncertainty
How scientists address these:
Unpredictability -> Projected scenarios
Data -> Statistics
Models -> Probabilities
Scientists also:
– Obtain review by peers and the scientific community
– Do the best they can given the methods they have, by
explicitly stating the strengths and weaknesses of their
data, models, and arguments.
Statistics vs. Probabilities
Statistics – empirical
It uses data to generate models
Probability – theoretical
It uses models to make predictions about
data
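The two directions can be illustrated with a minimal sketch. It assumes a fair six-sided die as the model and uses simulated rolls as the data; the die, the seed, and the variable names are illustrative choices, not part of the original slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Probability: start from a model (a fair six-sided die) and predict the data.
faces = [1, 2, 3, 4, 5, 6]
predicted_mean = sum(faces) / len(faces)  # every face is assumed equally likely

# Statistics: start from data (here, simulated rolls) and estimate the model.
rolls = [random.choice(faces) for _ in range(10_000)]
estimated_mean = statistics.mean(rolls)

print(f"The model predicts a mean roll of {predicted_mean}")
print(f"The data suggest a mean roll of about {estimated_mean:.2f}")
```

With enough rolls the estimate from the data should land close to the model's prediction, which is the sense in which the two approaches mirror each other.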
Uncertainty
Scientific uncertainty is important,
particularly when science interacts with
society.
Image removed due to copyright: Book cover: Uncertain Science… Uncertain World by H.N. Pollack (see Google Books)
Vested interests (typically non-scientists) will
argue against scientific findings that are
counter to their political or economic
interests by raising questions about scientific
uncertainty.
There are also aspects of name calling,
including “junk” science or “unsound”
science. These terms have no meaning
among scientists. As one cynical critic
noted, “junk” science is defined exclusively
by your political viewpoint.
Uncertainty
http://everythingscool.org/article.php?id=50
How do scientists deal with
uncertainty in data? Statistics!
A mathematical treatment of the data,
in particular the use of means,
medians, and standard errors.
Statistics uses empirical data and requires
an inductive approach.
Statistics have a bad reputation and can
be used incorrectly. BUT, they are a
consistent and objective way of dealing
with uncertainty.
There are three kinds of lies: lies,
damned lies, and statistics.
Image source: Wikipedia.
Twain
Statistics
• Accuracy vs. Precision – language to talk about
our ball rolling results
• Mean, Median, and Mode – ability to talk about
data sets, rather than individual results
• Graphing – ways of displaying data sets
• Standard Deviation – a quantitative way of
characterizing the spread of a single data set
(and making error bars)
• T-test - the simplest quantitative way of
comparing two different data sets
Accuracy - the degree of conformity of a measured or calculated quantity to its
actual (true) value.
Precision - the degree to which further measurements or calculations show the
same or similar results (related to reproducibility or repeatability).
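A small sketch of how the two terms can be separated numerically. The roll times and the "true" value below are made-up numbers, assumed only for illustration: accuracy is measured as the offset of the mean from the true value, and precision as the spread of the repeated measurements.

```python
import statistics

TRUE_TIME = 2.00  # seconds; hypothetical "true" roll time, assumed for illustration

# Hypothetical repeated timings from two students (made-up numbers).
student_a = [2.31, 2.29, 2.30, 2.32, 2.28]  # tightly grouped, but all too high
student_b = [1.80, 2.25, 1.95, 2.15, 1.85]  # scattered, but centered on the true value

def describe(timings):
    """Return (offset of the mean from the true value, spread of the timings)."""
    offset = statistics.mean(timings) - TRUE_TIME  # small offset = accurate
    spread = statistics.stdev(timings)             # small spread = precise
    return offset, spread

offset_a, spread_a = describe(student_a)
offset_b, spread_b = describe(student_b)

print(f"Student A: offset {offset_a:+.2f} s, spread {spread_a:.2f} s (precise, not accurate)")
print(f"Student B: offset {offset_b:+.2f} s, spread {spread_b:.2f} s (accurate, not precise)")
```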
Image courtesy of NOAA Magazine.
Let’s make our own probability distribution
Image courtesy of Wikipedia.
So, what’s a probability distribution good for?
1) Allows a scientist to compare different populations to each other, to
determine if they are similar or different, and the extent to which they
overlap (the t-test does this)
2) Allows a scientist to estimate what the “true” (accurate) answer is
Image courtesy of Wikipedia.
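As a rough sketch of the comparison a t-test makes, the code below computes a t statistic for two small data sets. It uses Welch's form of the statistic (which does not assume equal variances), chosen here for simplicity; the slides do not say which variant they have in mind, and the roll times are made-up numbers.

```python
import math
import statistics

def welch_t(sample_1, sample_2):
    """Difference between the two means, in units of its standard error."""
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    var_1, var_2 = statistics.variance(sample_1), statistics.variance(sample_2)
    standard_error = math.sqrt(var_1 / len(sample_1) + var_2 / len(sample_2))
    return (mean_1 - mean_2) / standard_error

# Hypothetical roll times in seconds for two ramp setups (made-up numbers).
steep_ramp = [1.52, 1.48, 1.50, 1.55, 1.45]
shallow_ramp = [2.02, 1.98, 2.05, 1.95, 2.00]

t = welch_t(steep_ramp, shallow_ramp)
print(f"t statistic: {t:.1f}")
```

A t statistic far from zero means the two means differ by many standard errors, so the distributions overlap very little. Turning it into a probability requires the t distribution, which a statistics package would supply and which is not shown here.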
So, what’s a probability distribution good for (con’t)?
3) Allows a scientist to know what they don’t know.
Image source: Wikipedia.
To know that we know what we
know, and that we do not know
what we do not know, that is
true knowledge.
Thoreau
Image source: Wikipedia.
There are known knowns. These are things we
know that we know. There are known unknowns.
That is to say, there are things that we now know
we don’t know. But there are also unknown
unknowns. These are things we do not know we
don’t know.
Rumsfeld
Averages
Mean – the sum of all the numbers in the
data set divided by the number of data
points
Median – the number in the middle of the
data set when the values are sorted
Mode - the number that occurs most often
From Huff
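The three averages can give quite different answers for the same data. A minimal sketch, using a made-up data set with one very large value (the numbers are illustrative only):

```python
import statistics

# Hypothetical data set with one very large value (made-up numbers).
data = [2, 3, 3, 4, 5, 6, 40]

# Mean: add up all the numbers, then divide by how many there are.
mean = sum(data) / len(data)

# Median: sort the values and take the middle one (this data set has an odd count).
ordered = sorted(data)
median = ordered[len(ordered) // 2]

# Mode: the value that occurs most often.
mode = statistics.mode(data)

print(f"mean = {mean}, median = {median}, mode = {mode}")
```

Here the single large value pulls the mean well above the median and the mode, which is why it matters which "average" is being reported.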
Graphs
Graphs are ways of displaying very complex
data sets in very simple ways. Let’s start
with a good graph
Napoleon’s March to Moscow: The War of 1812
Charles Minard's 1869 chart showing the losses in men, their movements, and the temperature of Napoleon's 1812 Russian campaign.
Image source: Wikipedia.
When graphs go bad
• Unclear labeling
• Breaks in the axes or non-linear axes
• Using volume for a linear measure
Or, put simply, data ambiguity, data
distortion, and data distraction
Data ambiguity
Data distraction
Image removed due to copyright: Income distribution infographic from the Los Angeles Times, Oct 21, 1984 issue (available on the “Bad Charts” website).
Image removed due to copyright: Los Angeles Times, Oct 21, 1984 issue (available on the “Bad Charts” website).
Data distraction
Image removed due to copyright: Chart of pre-tax income distribution (available on the “Bad Charts” website).
Data distraction
Data distortion
Two images removed due to copyright: Bad charts: The shrinking dollar and the shrinking doctor (images available at: http://lilt.ilstu.edu/gmklass/pos138/datadisplay/badchart.htm)
Standard Deviation
The standard deviation is a measure of
variation from the mean, for an entire data
set.
Its usual interpretation assumes a “normal” distribution. A normal
distribution of data means that most of the
examples in a set of data are close to the
"average," while relatively few examples
tend to one extreme or the other.
Think about doing a study on people's typical daily calorie
consumption. Like many kinds of data, the numbers for people's typical
consumption will probably turn out to be approximately normally distributed. For most
people, their consumption will be close to the mean. Fewer people eat
a lot more or a lot less than the mean.
y-axis: The likelihood of a person having a particular value for calorie consumption
x-axis: Quantity you care about (such as daily calorie consumption)
One standard deviation on either side of the mean (dark blue): Contains about 68% of the data
Two standard deviations (dark + medium blue): Contains about 95% of the data
Three standard deviations (all blues): Contains about 99.7% of the data
Image source: Wikipedia.
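These percentages can be checked with a short simulation. The sketch below assumes daily calorie consumption is exactly normally distributed with a mean of 2000 and a standard deviation of 300; both numbers are hypothetical and chosen only for illustration.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

MEAN, SD = 2000, 300  # hypothetical calorie mean and standard deviation (assumed)

# Simulate daily calorie consumption for many people.
sample = [random.gauss(MEAN, SD) for _ in range(100_000)]

def fraction_within(k):
    """Fraction of the sample lying within k standard deviations of the mean."""
    inside = sum(1 for x in sample if abs(x - MEAN) <= k * SD)
    return inside / len(sample)

for k in (1, 2, 3):
    exact = math.erf(k / math.sqrt(2))  # exact fraction for a normal distribution
    print(f"within {k} SD: simulated {fraction_within(k):.3f}, exact {exact:.4f}")
```

The simulated fractions should come out close to the exact values for a normal distribution, roughly 68%, 95%, and 99.7%.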
Standard Deviation
What do you do with it? Error bars!
The error bars represent a description of
how confident you are that the mean
represents the true value.
Error bars
Standard Deviation
How do you calculate it?
Really, scientists generally just use a
statistics computer software package,
because their data sets are huge. A
spreadsheet program, such as Microsoft
Excel, will do it for you. But, we will
calculate it the old fashioned way.
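The old-fashioned calculation can be written out step by step. This is a minimal sketch using made-up roll times; it divides by n - 1 (the sample standard deviation), which is an assumption, since the slides do not say whether the sample or population form is intended.

```python
import math
import statistics

# Hypothetical roll times in seconds (made-up numbers).
times = [2.1, 1.9, 2.0, 2.2, 1.8]

# Step 1: find the mean.
mean = sum(times) / len(times)

# Step 2: find how far each value is from the mean.
deviations = [x - mean for x in times]

# Step 3: square each deviation so positives and negatives do not cancel.
squared = [d * d for d in deviations]

# Step 4: average the squares, dividing by n - 1 (sample form, assumed here).
variance = sum(squared) / (len(times) - 1)

# Step 5: take the square root to get back to the original units.
sd = math.sqrt(variance)

print(f"mean = {mean:.2f} s, standard deviation = {sd:.3f} s")
print(f"library check: {statistics.stdev(times):.3f} s")
```

Dividing by n instead of n - 1 in step 4 gives the population standard deviation, which is slightly smaller for small data sets.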
Standard Deviation
What do you do with it? Error bars!
Most plots of experimental data (and of much
natural-observation data) use error
bars to tell you how certain the scientists
are of their data. An error bar is
plotted as an amount above and below
the value, typically one standard deviation.
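A minimal sketch of where the ends of an error bar come from, assuming one standard deviation above and below the mean. The two groups of roll times are made-up numbers, and the function name is a hypothetical helper.

```python
import statistics

# Hypothetical roll times in seconds for two ramp setups (made-up numbers).
groups = {
    "steep ramp": [1.52, 1.48, 1.50, 1.55, 1.45],
    "shallow ramp": [2.02, 1.98, 2.05, 1.95, 2.00],
}

def error_bar(values):
    """Return (lower end, plotted value, upper end) using one standard deviation."""
    center = statistics.mean(values)
    half_width = statistics.stdev(values)  # sample standard deviation
    return center - half_width, center, center + half_width

for name, values in groups.items():
    low, center, high = error_bar(values)
    print(f"{name}: {center:.2f} s, error bar from {low:.2f} to {high:.2f} s")
```

Some plots use the standard error of the mean (the standard deviation divided by the square root of the number of measurements) instead, so a figure should state which one its error bars show.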