Transcript 10.1

10.1: Estimating with Confidence


Statistical inference uses the language of probability to
express the strength of our conclusions.
2 types of formal statistical inference: confidence
intervals (Ch. 10) and significance tests, which
assess evidence for a claim about a population (Ch.
11).

Suppose you want to estimate the mean SAT
Math score for the more than 350,000 high
school seniors in CA. Only about 49% of
these students take the SAT. These self-selected seniors are planning to attend
college and so are not representative of all
CA seniors. You know better than to make
inferences about the population based on just any
sample data. At considerable effort and
expense, you give the test to an SRS of 500
CA high school seniors. The mean for your
sample is 461. If the standard deviation of
SAT Math scores is 100, what can you say
about the mean score in the population of all
350,000 seniors?
What we know about mu:
…it is approximately…?
We want to know how good x-bar is as an estimate of mu.
What we need to ask is, “How would the
sample mean vary if we took many samples
of 500 seniors from this population?”
3 facts about x-bar
1) CLT: Shape of the distribution of the mean
of 500 scores = ?
2) Mean of this sampling distribution = ?
3) Standard deviation of x-bar for an SRS of
500 = ?


Take many samples of n = 500 and find the mean SAT Math
score for each sample.
If we collect all these sample means and display their
distribution, we get a normal distribution with
mean = mu and standard deviation = 4.5.
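The pattern of many sample means can be checked with a short simulation. This is only a sketch and not part of the original slides: it assumes SAT Math scores are normally distributed and uses a hypothetical population mean of 460, since the true mu is unknown. Only sigma = 100 and n = 500 come from the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU_ASSUMED = 460    # hypothetical population mean (the real mu is unknown)
SIGMA = 100         # population standard deviation, from the example
N = 500             # size of each SRS
NUM_SAMPLES = 1000  # how many samples to draw

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(NUM_SAMPLES):
    scores = [random.gauss(MU_ASSUMED, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(scores))

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)
theory = SIGMA / N ** 0.5  # sigma / sqrt(n), about 4.47

print(f"mean of sample means: {center:.1f}")
print(f"sd of sample means:   {spread:.2f} (theory: {theory:.2f})")
```

The simulated spread should land close to sigma / sqrt(n), which the slides round to 4.5.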
Steps to statistical estimation
1) To estimate mu, use x-bar of
our random sample.
2) Although x-bar is an unbiased
estimate of mu, it will
rarely be exactly equal to
mu, so our estimate has
some error.
3) In repeated samples, the
values of x-bar follow
(approximately) a normal
distribution.
4) The 68-95-99.7 rule says that in about
95% of all samples, x-bar is within
2 standard deviations (9 points) of mu.
5) Whenever x-bar is within 9
points of mu, mu is also
within 9 points of x-bar. This happens in
95% of samples.
6) Therefore, in 95% of all
samples, mu lies between x-bar - 9
and x-bar + 9.
2 possibilities
1) The interval between 452 and 470 contains
the true population mean
or
2) Our SRS was one of the few samples for
which x-bar was NOT within 9 points of the
true population mean.
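The arithmetic behind the interval 452 to 470 can be sketched as follows. The variable names are illustrative only; the numbers are the ones given in the SAT example.

```python
import math

x_bar = 461   # sample mean from the SRS of 500 seniors
sigma = 100   # population standard deviation of SAT Math scores
n = 500       # sample size

sd_x_bar = sigma / math.sqrt(n)   # about 4.47; the slides round to 4.5
margin = 2 * sd_x_bar             # 68-95-99.7 rule: 2 standard deviations

# Round to whole points, as the slides do.
low = round(x_bar - margin)
high = round(x_bar + margin)
print(f"95% confidence interval: {low} to {high}")
```

Using 2 standard deviations gives 461 plus or minus about 9 points, which matches the interval from 452 to 470.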


A confidence interval has 2 parts:
1) An interval calculated from the data
2) A confidence level C, which gives the
probability that the interval will capture the
true parameter value in repeated samples.
Users usually pick C of 90% or higher because they
want to be quite sure of their conclusions.



Figure: Result of drawing many SRSs from the same population and
calculating a 95% confidence interval from each sample. The sampling
distribution of x-bar (top) shows the long-term pattern of this variation.
Figure: 25 SRSs, 95% confidence.
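What "95% confidence" means in repeated sampling can be illustrated with a small simulation. This is a sketch under stated assumptions: the true mean of 460 is hypothetical (in practice mu is unknown), and each sample mean is drawn directly from the normal sampling distribution of x-bar instead of from 500 individual scores.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

MU_ASSUMED = 460              # hypothetical true mean, unknown in practice
SD_X_BAR = 100 / 500 ** 0.5   # sigma / sqrt(n) for the SAT example
MARGIN = 2 * SD_X_BAR         # 2 standard deviations, about 9 points

def captures_mu(x_bar):
    """Return True if the interval x_bar +/- MARGIN contains MU_ASSUMED."""
    return x_bar - MARGIN <= MU_ASSUMED <= x_bar + MARGIN

# Each draw stands for the mean of one SRS of 500 seniors.
num_intervals = 10_000
hits = sum(
    captures_mu(random.gauss(MU_ASSUMED, SD_X_BAR))
    for _ in range(num_intervals)
)
coverage = hits / num_intervals
print(f"{coverage:.1%} of the intervals captured mu")
```

About 95% of the simulated intervals should capture the assumed mean; the rest are the "few samples" for which x-bar fell more than 9 points away.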
Example 2

To find an 80% confidence interval, we
must catch the central 80% of the
normal sampling distribution of x-bar. In
catching the central 80%, we leave out
20%, or 10% in each tail.
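In general, if you catch the
central area C, you leave out area 1 - C, or (1 - C)/2 in each tail.
The point z* with area (1 - C)/2 to its right under the standard
normal curve is the critical value.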
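The critical value for any confidence level can be looked up in a standard normal table or computed. A minimal sketch using Python's standard library follows; the function name critical_value is made up for this illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the central area under the standard normal curve is `confidence`."""
    tail = (1 - confidence) / 2            # area left out in each tail
    return NormalDist().inv_cdf(1 - tail)  # point with `tail` area to its right

for c in (0.80, 0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {critical_value(c):.3f}")
```

For C = 80% this leaves 10% in each tail and gives z* of about 1.28.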
Recall: The sampling distribution of x-bar
is normal if the population is normal.
If the population isn’t normal and n is
sufficiently large, the sampling
distribution is still approximately normal
(CLT: n > 30).
A manufacturer of high-resolution video terminals
must control the tension on the mesh of fine wires
that lies behind the surface of the viewing screen.
Too much tension will tear the mesh and too little will
allow wrinkles. The tension is measured by an
electrical device with output readings in millivolts.
Some variation is inherent in the production
process. Careful study has shown that when the
process is operating properly, the standard deviation
of the tension readings is 43 mV. Here are the
tension readings from an SRS of 20 screens from a
single day’s production.
269.5 297 269.6 283.3 304.8 280.4 233.5 257.4 317.5 327.4
264.7 307.7 310 343.3 328.1 342.6 338.8 340.1 374.6 336.1
Construct a 90% confidence interval for the mean
tension of all the screens produced on this day.
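One way to carry out the calculation is sketched below. It assumes the conditions for a z interval hold (an SRS, a known sigma of 43 mV, and a sampling distribution of x-bar that is at least approximately normal); checking those conditions is a separate step that the code does not do.

```python
import math
from statistics import NormalDist, fmean

# Tension readings (mV) from the SRS of 20 screens.
readings = [
    269.5, 297, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 307.7, 310, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]
sigma = 43          # known standard deviation of the process (mV)
confidence = 0.90   # desired confidence level

x_bar = fmean(readings)
z_star = NormalDist().inv_cdf((1 + confidence) / 2)   # about 1.645
margin = z_star * sigma / math.sqrt(len(readings))

low, high = x_bar - margin, x_bar + margin
print(f"x-bar = {x_bar:.1f} mV, margin of error = {margin:.1f} mV")
print(f"90% confidence interval: {low:.1f} to {high:.1f} mV")
```

With these data x-bar is about 306.3 mV and the margin of error is about 15.8 mV, so the interval runs from roughly 290.5 to 322.1 mV.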
How confidence intervals behave
Larger samples give smaller margins of
error.
The margin of error gets smaller when:
1) z* gets smaller. A smaller z* is the
same as a lower confidence level.
2) The population standard deviation gets
smaller.
3) n gets larger.
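These effects can be seen by changing one input at a time in the margin of error formula, z* times sigma / sqrt(n). In the sketch below the baseline uses the screen tension numbers (sigma = 43, n = 20); the alternative values sigma = 20 and n = 80 are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a z interval."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / math.sqrt(n)

base = margin_of_error(0.95, sigma=43, n=20)
lower_conf = margin_of_error(0.90, sigma=43, n=20)     # smaller z*
smaller_sigma = margin_of_error(0.95, sigma=20, n=20)  # hypothetical sigma
larger_n = margin_of_error(0.95, sigma=43, n=80)       # hypothetical n

print(f"baseline (95%, sigma=43, n=20): {base:.1f}")
print(f"lower confidence (90%):         {lower_conf:.1f}")
print(f"smaller sigma (sigma=20):       {smaller_sigma:.1f}")
print(f"larger sample (n=80):           {larger_n:.1f}")
```

Because n sits under a square root, quadrupling the sample size from 20 to 80 only halves the margin of error.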

Choosing the Sample Size


To obtain a desired margin of error m,
substitute the value of z* for your desired
confidence level and solve the inequality
z*(sigma / sqrt(n)) <= m for n.
Ex: Company management wants a report of
the mean screen tension for the day’s
production accurate to within 5 mV with 95%
confidence. How large a sample of video
monitors must be measured to comply with
this request?
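The request can be worked through as a sketch, assuming the process standard deviation is still the 43 mV given in the earlier tension example. Solving z*(sigma / sqrt(n)) <= m for n gives n >= (z* sigma / m)^2, and n is always rounded up to the next whole number. The function name required_sample_size is made up for this illustration.

```python
import math
from statistics import NormalDist

def required_sample_size(margin, sigma, confidence):
    """Smallest whole n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)

n_needed = required_sample_size(margin=5, sigma=43, confidence=0.95)
print(f"Measure at least {n_needed} monitors")
```

Here (1.96 x 43 / 5)^2 is about 284.1, so at least 285 monitors must be measured.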