Confidence Interval of a Mean
FPP 23
Confidence intervals for proportion review
Generic formula for a confidence interval
estimate ± multiplier*SE
Recall the multiplier depends on the level of confidence
For a population proportion we have
$\hat{p} \pm \text{multiplier} \times \sqrt{\hat{p}(1-\hat{p})/n}$
The multiplier here is found using the normal distribution
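As a minimal sketch of the proportion formula above, the following Python uses only the standard library. The function names `normal_multiplier` and `proportion_ci` and the counts in the usage lines (60 successes in 100 trials) are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_multiplier(level):
    """Two-sided multiplier from the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def proportion_ci(successes, n, level=0.95):
    """estimate ± multiplier*SE for a population proportion."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = normal_multiplier(level) * se
    return p_hat - margin, p_hat + margin


# The multiplier grows with the level of confidence.
for level in (0.90, 0.95, 0.99):
    print(level, round(normal_multiplier(level), 3))

# Hypothetical data: 60 successes in 100 trials.
low, high = proportion_ci(60, 100)
print(round(low, 3), round(high, 3))
```

The loop should show multipliers of roughly 1.645, 1.960 and 2.576, which is why a higher confidence level gives a wider interval for the same data.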
Confidence interval for a mean
Generic formula
estimate ± multiplier*SE
An estimate for a population mean μ is the sample mean (typically denoted by $\bar{x}$)
SE is given by σ/√n
Multiplier found using the normal distribution
But we don’t know σ. So what do we do?
Use the sample standard deviation
$s = \sqrt{s^2} = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n-1)}$
Thus
$SE = s/\sqrt{n}$
But since we use s instead of σ we must use a t-distribution with n – 1 degrees of freedom (d.f.) instead of a normal distribution to find the multiplier
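The steps above (compute s, form SE = s/√n, then look up a t multiplier with n – 1 d.f.) can be sketched in Python. This is an illustration under stated assumptions, not the method used in the slides: the t multiplier is found numerically (Simpson's rule plus bisection) as a stand-in for reading a t-table, and the function names and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_multiplier(level, df):
    """Two-sided multiplier: the (0.5 + level/2) quantile, by bisection."""
    target = 0.5 + level / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it holds the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_ci(data, level=0.95):
    """sample mean ± t multiplier * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)
    margin = t_multiplier(level, n - 1) * se
    return x_bar - margin, x_bar + margin


sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]  # hypothetical measurements
low, high = mean_ci(sample)
print(round(low, 3), round(high, 3))
```

For small samples the t multiplier is noticeably larger than the normal value of about 1.96, so the interval is wider; as n grows the two multipliers converge.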
t-table
CI of a mean recap
Equation for a confidence interval of a mean
sample mean ± multiplier*SE
$\bar{x} \pm \text{multiplier} \times s/\sqrt{n}$
The multiplier comes from the t-distribution with n – 1 d.f., s is
the sample standard deviation, n is the sample size
All the ideas of confidence intervals for a proportion carry
over to means.
Interpretations
The meaning of statistical confidence.
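One way to see what statistical confidence means is to simulate repeated sampling and count how often the interval captures the true mean. The sketch below assumes a hypothetical normal population (mean 10, standard deviation 2) and samples of size 16; the multiplier 2.131 is the standard t-table value for 95% confidence with 15 d.f.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)                  # fixed seed so the run is repeatable
MU, SIGMA, N = 10.0, 2.0, 16    # hypothetical population and sample size
T_MULT = 2.131                  # t-table value: 95% confidence, 15 d.f.


def covers_true_mean():
    """Draw one sample and report whether its 95% CI contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    margin = T_MULT * stdev(sample) / sqrt(N)
    return abs(mean(sample) - MU) <= margin


trials = 4000
coverage = sum(covers_true_mean() for _ in range(trials)) / trials
print(coverage)
```

The printed proportion should land close to 0.95: the confidence level describes how often the procedure succeeds over many samples, not the chance that any one computed interval is right.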
Application of CI’s: Mercury levels in NC rivers
Rivers in North Carolina contain small concentrations of mercury, which can accumulate
in fish over their lifetimes. Because mercury cannot be excreted from the body, it builds
up in the tissues. The concentration of mercury in fish tissues can be obtained at
considerable expense by catching fish and sending samples to a lab for analysis. Directly
measuring the mercury concentration in the water is impossible since it is almost always
below detectable limits.
A study was recently conducted by researchers at the Nicholas School of the
Environment at Duke in the Wacamaw and Lumber Rivers to investigate mercury levels
in tissues of largemouth bass. At several stations along each river, a group of fish was
caught, weighed and measured. In addition, a filet from each fish caught was sent to the
lab so that the tissue concentration of mercury (in parts per million) could be
determined for each fish.
Mercury concentrations greater than 1 part per million are considered unsafe for
humans to ingest. Are fish in the Lumber and Wacamaw Rivers too contaminated to eat?
EDA for mercury
The distribution of mercury is right-skewed in both rivers. There are a few outliers in
Lumber River, but the large sample size should allow us to use the Central Limit Theorem
for CI’s.
The sample average mercury level for both rivers is above 1.0 ppm.
[Figures: distributions of mercury for river=lumber and river=wacamaw, horizontal axis from 0 to 4]
95% CI’s for population average mercury levels in two rivers:
Moments          river=lumber   river=wacamaw
Mean             1.0780822      1.2764286
Std Dev          0.648611       0.8291484
Std Err Mean     0.0759142      0.0837566
upper 95% Mean   1.2294143      1.4426623
lower 95% Mean   0.92675        1.1101948
N                73             98
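The Moments output can be checked against the formulas from the earlier slides. The short sketch below recomputes Std Err Mean as s/√n and backs out the multiplier implied by the reported upper limit; the dictionary layout and variable names are illustrative only, while the numbers are copied from the output.

```python
from math import sqrt

# Summary statistics copied from the Moments output for each river.
rivers = {
    "lumber": {"mean": 1.0780822, "sd": 0.648611, "n": 73,
               "lower": 0.92675, "upper": 1.2294143},
    "wacamaw": {"mean": 1.2764286, "sd": 0.8291484, "n": 98,
                "lower": 1.1101948, "upper": 1.4426623},
}

results = {}
for name, r in rivers.items():
    se = r["sd"] / sqrt(r["n"])               # SE = s / sqrt(n)
    implied = (r["upper"] - r["mean"]) / se   # multiplier behind the upper limit
    results[name] = (se, implied)
    print(name, round(se, 7), round(implied, 4))
```

The recomputed standard errors match the Std Err Mean rows, and the implied multipliers (a little above 1.96) are consistent with a t-distribution with n – 1 = 72 and 97 d.f.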
Conclusions based on CI’s
We are 95% confident that the population average
mercury level in fish in the Lumber River is between 0.93
and 1.23 ppm. Since 1.0 ppm is inside the CI, we do
not feel confident that the average level is below or
above the danger level. More study is needed.
We are 95% confident that the population average
mercury level in fish in the Wacamaw River is between
1.11 and 1.44 ppm. It is likely that the average mercury
level is above 1.0 ppm and therefore unsafe. Don’t eat
Wacamaw bass!
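The reasoning in these conclusions amounts to checking where the 1.0 ppm danger level falls relative to each interval. A small sketch follows; the function name `compare_to_threshold` and its labels are hypothetical.

```python
def compare_to_threshold(lower, upper, threshold=1.0):
    """Classify a CI for the mean relative to a fixed danger level."""
    if lower > threshold:
        return "above"         # whole interval exceeds the threshold
    if upper < threshold:
        return "below"         # whole interval is under the threshold
    return "inconclusive"      # threshold lies inside the interval


lumber = compare_to_threshold(0.93, 1.23)
wacamaw = compare_to_threshold(1.11, 1.44)
print("Lumber:", lumber)
print("Wacamaw:", wacamaw)
```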
Interpretation of CI’s for averages
Wrong:
“95% of all fish in Wacamaw river have mercury levels between
1.11 and 1.44 ppm”
Right:
“We are 95% confident that the average mercury level of fish in
the Wacamaw river is between 1.11 and 1.44 ppm”
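A rough calculation shows how far off the wrong reading is. The sketch below treats individual fish as normally distributed with the sample mean and standard deviation from the Wacamaw output. That is a loud assumption, since the EDA describes the data as right-skewed, so the result is only an order-of-magnitude illustration.

```python
from statistics import NormalDist

# Assumption: individual fish ~ Normal(sample mean, sample SD) for Wacamaw.
# The real distribution is right-skewed, so this is only approximate.
fish = NormalDist(mu=1.2764286, sigma=0.8291484)

# Share of individual fish expected to fall inside the CI for the mean.
share = fish.cdf(1.44) - fish.cdf(1.11)
print(round(share, 3))
```

Under this approximation only a small fraction of individual fish fall between 1.11 and 1.44 ppm, nowhere near 95%. The interval describes the population average, not the spread of individual fish.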
Special consideration for CI’s of averages
Beware of outliers
Outliers can dramatically inflate estimates of the SE. This could
lead to CI’s so wide they aren’t useful.
What to do when you have outliers:
1. Check for data entry errors
2. Do analyses with and without outliers. When results differ
substantially, report both analyses. Otherwise, report original
analyses only.
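Step 2 can be sketched by computing the SE with and without a suspect point. The readings below are hypothetical, with the last value playing the role of an outlier; the helper name `standard_error` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(data):
    """SE of the sample mean: s / sqrt(n)."""
    return stdev(data) / sqrt(len(data))


# Hypothetical readings; the final value is a deliberate outlier.
readings = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2, 0.7, 1.1, 6.5]
without_outlier = readings[:-1]

se_with = standard_error(readings)
se_without = standard_error(without_outlier)
print("with outlier:   mean", round(mean(readings), 3), "SE", round(se_with, 3))
print("without outlier: mean", round(mean(without_outlier), 3), "SE", round(se_without, 3))
```

In this made-up sample a single extreme value inflates the SE several times over, which would widen the CI by about the same factor. When the two analyses differ this much, both should be reported.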
Example 1
Suppose Brent Matthews, manager of a Sam’s Club, wants to
know how much milk he should stock daily. Brent checked
the sales records for a random sample of 16 days and found that the
mean number of gallons sold is 150 gallons per day and the
sample standard deviation is 12 gallons. Determine the
number of gallons that Brent should stock daily with a 95%
confidence interval.
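One way to work this example is to apply sample mean ± multiplier*s/√n directly. The sketch assumes the multiplier 2.131, the standard t-table value for 95% confidence with 16 – 1 = 15 d.f. The interval is for the mean daily sales; how Brent turns it into a stocking rule is a separate judgment.

```python
from math import sqrt

n, x_bar, s = 16, 150.0, 12.0   # values given in the example
t_mult = 2.131                  # t-table value: 95% confidence, 15 d.f.

se = s / sqrt(n)                # 12 / 4 = 3 gallons
margin = t_mult * se
low, high = x_bar - margin, x_bar + margin
print(round(low, 1), round(high, 1))
```

This gives an interval of roughly 143.6 to 156.4 gallons per day for the average daily demand.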
Example 2
It is important for airlines to follow the published scheduled
departure times of flights. Suppose that one airline
recently sampled the records of 246 flights originating in
Orlando and found that 10 flights were delayed for severe
weather, 4 flights were delayed for maintenance concerns,
and all the other flights were on time. Determine the
percentage of on-time departures using a 95% confidence
interval.
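This example is a proportion problem, so the multiplier comes from the normal distribution. A sketch of the arithmetic, using only the counts given in the example (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

n = 246
delayed = 10 + 4                  # weather delays plus maintenance delays
on_time = n - delayed             # 232 flights

p_hat = on_time / n
se = sqrt(p_hat * (1 - p_hat) / n)
z_mult = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
low, high = p_hat - z_mult * se, p_hat + z_mult * se
print(round(100 * low, 1), round(100 * high, 1))
```

The resulting interval for the on-time percentage runs from roughly 91.4% to 97.2%.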