inference - s3.amazonaws.com


INFERENCE
• Inference – extension of results obtained
from an experiment (sample) to the general
population
• Use of sample data to draw conclusions
about the entire population
• Parameter – number that describes a
population
– Value is not usually known
– We are unable to examine population
• Statistic – number computed from sample
data
– Computed to estimate unknown
parameters
• Mean, standard deviation, variability,
etc.
Notations
population mean: μ
sample mean: x̄
INFERENCE
• Sampling variability – natural
variation of outcomes from an
experiment that results from inherent
differences among samples
– no two samples are going to give same
statistics
• E.g. Assume that you have a population of 1000
point locations across a study area and you wish
to estimate the mean value of soil erosion across
the study area using a sample of 10. Choose a
simple random sample from this population. How many
different samples can be chosen?
1000C10 ≈ 2.6 × 10^23
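The count of possible samples can be checked directly. This is a minimal sketch, assuming Python 3.8 or later (for math.comb); the variable names are illustrative and not part of the original slides.

```python
import math

population_size = 1000  # point locations in the study area
sample_size = 10        # locations drawn in one simple random sample

# Number of distinct unordered samples of 10 from 1000 ("1000 choose 10")
n_samples = math.comb(population_size, sample_size)

print(f"{n_samples:.1e}")  # about 2.6e+23
```

With so many possible samples, it is unsurprising that two samples rarely give the same statistic.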
INFERENCE
• How can experimental results be
trusted? If x̄ is rarely exactly right
and varies from sample to sample,
why is it nonetheless a reasonable
estimate of the population mean μ?
• How can we describe the behavior of
the statistics from different samples?
– E.g. the mean value
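One way to see how the sample mean behaves across samples is to simulate it. The sketch below is only an illustration: the population is made-up data (1000 uniform random values), and the seed and the number of repeated samples are arbitrary assumptions.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1000 made-up measurements (not real data)
population = [random.uniform(0, 50) for _ in range(1000)]
mu = statistics.fmean(population)  # known here only because we simulated it

# Draw many simple random samples of size 10 and record each sample mean
sample_means = [
    statistics.fmean(random.sample(population, 10))
    for _ in range(1000)
]

print(f"population mean: {mu:.2f}")
print(f"first five sample means: {[round(m, 2) for m in sample_means[:5]]}")
print(f"mean of all sample means: {statistics.fmean(sample_means):.2f}")
print(f"smallest / largest: {min(sample_means):.2f} / {max(sample_means):.2f}")
```

Individual sample means differ from one another and from μ, but taken together they cluster around μ.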
ESTIMATION OF MEAN
Example: Sulfur compounds such as dimethyl sulfide
(DMS) are sometimes present in wine. DMS
causes “off-odors” in wine, so winemakers want to
know the odor threshold, the lowest concentration
of DMS that the human nose can detect. Different
people have different thresholds, so we start by
asking about the mean threshold in the
population of all adults. To estimate the
population mean, we present tasters with both
natural wine and the same wine spiked with DMS
at different concentrations to find the lowest
concentration at which they identify the spiked
wine.
• How can we estimate the mean threshold
value for the population?
LAW OF LARGE NUMBERS
1) If we keep taking larger and larger random samples,
the statistic (e.g. the sample mean x̄) is guaranteed
to get closer and closer to the parameter value
(the population mean μ).
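The law of large numbers can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is hypothetical (normal, with a made-up mean and standard deviation), and the sample sizes are arbitrary. In any single run the sample mean can wobble for small samples, and the closeness only becomes reliable as the sample grows.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 25.0, 7.0  # hypothetical population mean and standard deviation


def sample_mean(n):
    """Mean of n independent draws from the hypothetical population."""
    draws = [random.gauss(MU, SIGMA) for _ in range(n)]
    return statistics.fmean(draws)


sample_sizes = [10, 100, 1_000, 10_000, 100_000]
means = {n: sample_mean(n) for n in sample_sizes}

for n, x_bar in means.items():
    print(f"n = {n:>6}: x_bar = {x_bar:.3f}, |x_bar - mu| = {abs(x_bar - MU):.3f}")
```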
SAMPLING DISTRIBUTIONS
2) How else can we estimate the population
mean value, if we cannot take very large
samples for our study?
E.g. What can we say about the mean from,
say, 10 subjects as an estimate of μ?
Here are the odor thresholds (micrograms of DMS per
liter of wine) for 10 randomly chosen subjects:
28 40 28 33 20 31 29 27 17 21
• How well would our mean value from this
sample estimate the true parameter value?
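As a starting point, the sample mean and sample standard deviation of the ten thresholds can be computed as follows. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
import statistics

# Odor thresholds (micrograms of DMS per liter of wine) for the 10 subjects
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

x_bar = statistics.fmean(thresholds)  # sample mean, our estimate of mu
s = statistics.stdev(thresholds)      # sample standard deviation (n - 1 divisor)

print(f"n = {len(thresholds)}")
print(f"sample mean x_bar = {x_bar:.1f}")        # 27.4
print(f"sample standard deviation s = {s:.2f}")  # 6.75
```

A different set of 10 subjects would give a different x̄, so 27.4 is an estimate of μ rather than its exact value.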