Confidence Interval for Estimating a Population Proportion p


Section 7.1
Review and Preview
Review
In Chapters 2 & 3 we used “descriptive statistics” when we
summarized data using tools such as graphs, and statistics
such as the mean and standard deviation.
In Chapter 6 we introduced critical values: zα denotes the z
score with an area of α to its right.
If α = 0.025, the critical value is z0.025 = 1.96.
That is, the critical value z0.025 = 1.96 has an area of 0.025
to its right.
Preview
This chapter presents the beginning of inferential statistics.
The two major activities of inferential statistics are (1)
to use sample data to estimate values of population
parameters, and (2) to test hypotheses or claims made
about population parameters.
We introduce methods for estimating values of these
important population parameters: proportions, means,
and variances.
We also present methods for determining sample sizes
necessary to estimate those parameters.
Section 7.2
Estimating a Population Proportion
Learning Targets:
In this section:
• We present methods for using a sample proportion to
estimate the value of a population proportion.
• The sample proportion is the best point estimate of the
population proportion.
• We can use a sample proportion to construct a
confidence interval to estimate the true value of a
population proportion, and we should know how to
interpret such confidence intervals.
• We should know how to find the sample size necessary
to estimate a population proportion.
A point estimator is a statistic that provides an
estimate of a population parameter.
Point estimators that we will be using are x̄ and p̂.
The value of that statistic from a sample is called a
point estimate. Ideally, a point estimate is our “best
guess” at the value of an unknown parameter.
The sample proportion p̂ is the best point estimate of
the population proportion p.
The sample mean x̄ is the best point estimate of the
population mean µ.
Example 1: Determine the point estimator you would use and
calculate the value of the point estimate.
a) Quality control inspectors want to estimate the mean lifetime
μ of the AA batteries produced in an hour at a factory. They
select a random sample of 30 batteries during each hour of
production and then drain them under conditions that mimic
normal use. Here are the lifetimes (in hours) of the batteries
from one such sample:
16.91 18.83 17.58 15.84 17.42 17.65 16.63 16.84 15.63 16.37 15.80
15.93 15.81 17.45 16.85 16.33 16.22 16.59 17.13 17.10 16.96 16.40
17.35 16.37 15.98 16.52 17.04 17.07 15.73 16.74
b) What proportion, p, of U.S. high school students smoke?
The 2007 Youth Risk Behavioral Survey questioned a
random sample of 14,041 students in grades 9 to 12. Of
these, 2808 said they had smoked cigarettes at least one day
in the past month.
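As a rough sketch of Example 1, both point estimates can be computed directly from the sample data. The variable names below are illustrative and are not part of the original slides.

```python
from statistics import mean

# Battery lifetimes (hours) from the sample in Example 1(a).
lifetimes = [
    16.91, 18.83, 17.58, 15.84, 17.42, 17.65, 16.63, 16.84, 15.63, 16.37, 15.80,
    15.93, 15.81, 17.45, 16.85, 16.33, 16.22, 16.59, 17.13, 17.10, 16.96, 16.40,
    17.35, 16.37, 15.98, 16.52, 17.04, 17.07, 15.73, 16.74,
]

# (a) The sample mean x-bar is the point estimate of the population mean mu.
x_bar = mean(lifetimes)

# (b) The sample proportion p-hat is the point estimate of the proportion p.
smokers, surveyed = 2808, 14041
p_hat = smokers / surveyed

print(f"x-bar = {x_bar:.2f} hours")
print(f"p-hat = {p_hat:.3f}")
```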
Example 2: In a Pew Research Center poll, 70% of 1501
randomly selected adults in the United States believed in
global warming, so the sample proportion is p̂ = 0.70. Find
the best point estimate of the proportion of all adults in the
United States who believe in global warming.
A confidence interval (or interval estimate) is a range
(or an interval) of values used to estimate the true value
of a population parameter. A confidence interval is
sometimes abbreviated as CI.
A confidence level is the probability 1 – α (often expressed
as the equivalent percentage value) that the confidence
interval actually does contain the population parameter,
assuming that the estimation process is repeated a large
number of times. (The confidence level is also called
degree of confidence, or the confidence coefficient.)
Most common choices are 90% (α = 10%), 95% (α = 5%), or 99% (α = 1%).
Interpreting a Confidence Interval
We must be careful to interpret confidence intervals correctly. There is a
correct interpretation and many different and creative incorrect
interpretations of the confidence interval 0.677 < p < 0.723.
“We are 95% confident that the interval from 0.677 to 0.723 actually
does contain the true proportion of [the topic].”
This means that if we were to select many different samples of size
1501 (from Example 2) and construct the corresponding confidence
intervals, about 95% of them would actually contain the value of the
population proportion p.
(Note that in this correct interpretation, the level of 95% refers to the
success rate of the process being used to estimate the proportion.)
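The "success rate of the process" idea can be checked with a small simulation. The sketch below assumes a true proportion of 0.70 (an assumption made only for illustration), draws many samples of size 1501, builds a 95% interval from each using the p̂ ± E formula given later in this section, and counts how often the interval captures the true value. The function name, seed, and number of repetitions are hypothetical choices.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(p, n, conf=0.95, repetitions=1000, seed=1):
    """Fraction of simulated confidence intervals that contain the true p."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # critical value z_{alpha/2}
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin < p < p_hat + margin:
            hits += 1
    return hits / repetitions

# Assumed true proportion 0.70; sample size 1501 as in Example 2.
rate = coverage_rate(p=0.70, n=1501)
print(f"{rate:.1%} of the simulated intervals contain p")
```

The printed rate should land close to 95%, though it varies a little with the seed and the number of repetitions.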
Example 3: Based on a sample of size 426, the 96% confidence
interval for the true proportion of all 17-year-old boys who
own a used car is (0.189, 0.251). Interpret this confidence
interval.
We are 96% confident that the interval from 0.189
to 0.251 actually does contain the true proportion
of all 17-year-old boys who own a used car.
Example 4: Based on a sample of size 300, the 95% confidence
interval for the true proportion of all New York State Union
members who favor the Republican candidate for governor is
(0.147, 0.321). Interpret this confidence interval.
We are 95% confident that the interval from 0.147
to 0.321 actually does contain the true proportion
of all New York State Union members who favor
the Republican candidate for governor.
Critical Values
A standard z score can be used to distinguish between sample
statistics that are likely to occur and those that are unlikely to
occur. Such a z score is called a critical value. Critical values
are based on the following observations:
1. Under certain conditions, the sampling distribution of
sample proportions can be approximated by a normal
distribution.
2. A z score associated with a sample proportion has a
probability of α/2 of falling in the right tail.
3. The z score separating the right-tail region is
commonly denoted by zα/2 (or z*, “z star”) and is referred
to as a critical value because it is on the borderline
separating z scores from sample proportions that are
likely to occur from those that are unlikely to occur.
Notation for Critical Value
The critical value zα/2 (z*) is the positive z value that is
at the vertical boundary separating an area of α/2 in the
right tail of the standard normal distribution. (The value
of –zα/2 is at the vertical boundary for the area of α/2 in
the left tail.) The subscript α/2 is simply a reminder that
the z score separates an area of α/2 in the right tail of the
standard normal distribution.
Finding zα/2 (z*) for a 95% Confidence Level
α = 5% = 0.05
α/2 = 2.5% = 0.025
[Figure: standard normal curve with –zα/2 and zα/2 marked]
zα/2 = 1.96, so the critical values are ±1.96
Example 5: Find the appropriate critical value for the given
confidence level. Round to three decimals.
a) 99.9%
b) 95%
c) 90%
d) 92%
e) 84%
f) 78%
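Critical values can also be found in software instead of a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is made up for this illustration. Because `inv_cdf` expects the area to the left of the z score, the area passed in is 1 – α/2.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z_{alpha/2} for a confidence level given as a decimal, e.g. 0.95."""
    alpha = 1 - confidence_level
    # inv_cdf takes the area to the LEFT of z, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(critical_value(0.95), 3))  # 1.96
```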
Margin of Error for Proportions
When data from a simple random sample are used to
estimate a population proportion p, the margin of error,
denoted by E, is the maximum likely difference (with
probability 1 – α, such as 0.95) between the observed
proportion p̂ and the true value of the population proportion
p.
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
E = margin of error
zα/2 = z* = critical value
p̂ = proportion of successes
q̂ = 1 – p̂ = proportion of failures
n = sample size
Example 6: Assume that a sample is used to estimate a
population proportion p. Find the margin of error, E, that
corresponds to the given statistics and confidence level.
Round the margin of error to three decimal places.
a) 95% confidence, sample size is 600, and 32% are successes.
b) 98% confidence, sample size is 1142, and 24% are successes.
Example 7: In a random sample of 203 college students, 75
had part-time jobs. Find the margin of error for the 90%
confidence interval used to estimate the population proportion.
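One possible way to carry out the margin-of-error calculation in code is sketched below, using the figures from Example 7 (75 of 203 students, 90% confidence). The function name `margin_of_error` is an assumption for illustration. It applies E = zα/2 · √(p̂q̂/n) with q̂ = 1 – p̂.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence_level=0.95):
    """E = z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    q_hat = 1 - p_hat  # proportion of failures
    return z * math.sqrt(p_hat * q_hat / n)

# Example 7: 75 of 203 college students had part-time jobs.
E = margin_of_error(75 / 203, 203, confidence_level=0.90)
print(round(E, 3))  # 0.056
```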
Confidence Interval for Estimating a Population Proportion p
p = population proportion
p̂ = sample proportion
n = number of sample values
E = margin of error
zα/2 (z*) = z score separating an area of α/2 in the
right tail of the standard normal distribution
The confidence interval is
$$\hat{p} - E < p < \hat{p} + E \quad \text{where} \quad E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
3 Different Ways of Writing a Confidence Interval for a Population Proportion p
$$\hat{p} - E < p < \hat{p} + E$$
$$\hat{p} \pm E$$
$$(\hat{p} - E,\ \hat{p} + E)$$
Example 8: A Pew Research Center poll of 1501 randomly
selected U.S. adults showed that 70% of the respondents believe in
global warming. The sample results are n = 1501 and p̂ = 0.70.
a) Find the margin of error E that corresponds to a 95%
confidence level.
b) Find the 95% confidence interval estimate of the population
proportion p.
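As a check on Example 8, the interval can be built in a few lines and printed in each of the three equivalent forms. This is a minimal sketch; the function name `proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence_level=0.95):
    """Return (lower, upper) limits of the interval p_hat - E < p < p_hat + E."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    E = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - E, p_hat + E

# Example 8: n = 1501, p-hat = 0.70, 95% confidence.
p_hat = 0.70
lower, upper = proportion_interval(p_hat, 1501)
E = (upper - lower) / 2

print(f"{lower:.3f} < p < {upper:.3f}")  # 0.677 < p < 0.723
print(f"{p_hat:.3f} ± {E:.3f}")
print(f"({lower:.3f}, {upper:.3f})")
```

The limits agree with the interval 0.677 < p < 0.723 quoted earlier in the interpretation slide.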
Calculator Setup
1st: On your graphing calculator, go to STAT → TESTS
2nd: Scroll up, then choose A: 1-PropZInt…
Example 9: Use the given degree of confidence and sample data
to construct a confidence interval for the population proportion p.
n = sample size, x = number of successes
a) n = 741, x = 274; 95% confidence
b) n = 267, x = 194; 88% confidence
Sample Size
Suppose we want to collect sample data in
order to estimate some population proportion.
The question is: how many sample items must
be obtained?
Determining Sample Size
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
(solve for n by algebra)
$$n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2}$$
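The "solve for n by algebra" step can be spelled out as follows, starting from the margin-of-error formula and using the same symbols:

```latex
\begin{align*}
E &= z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \\
E^2 &= (z_{\alpha/2})^2\,\frac{\hat{p}\hat{q}}{n} && \text{(square both sides)} \\
n E^2 &= (z_{\alpha/2})^2\,\hat{p}\hat{q} && \text{(multiply both sides by } n\text{)} \\
n &= \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2} && \text{(divide both sides by } E^2\text{)}
\end{align*}
```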
Sample Size for Estimating Proportion p
When an estimate of p̂ is known:
$$n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2}$$
When no estimate of p̂ is known:
$$n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$$
Round-Off Rule for Determining Sample Size
If the computed sample size n is not a whole
number, round the value of n up to the next
larger whole number.
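The two sample-size formulas and the round-up rule can be combined in one small helper. This is a sketch with a hypothetical function name. When no prior estimate is supplied it uses 0.25, which is the largest value p̂q̂ can take (reached at p̂ = 0.5), so the resulting n is large enough whatever the true proportion is. The numbers below come from Example 10 (95% confidence, margin of error of three percentage points).

```python
import math
from statistics import NormalDist

def required_sample_size(margin, confidence_level=0.95, p_hat=None):
    """Minimum n so that the margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # With no prior estimate, use p_hat * q_hat = 0.25, its largest possible value.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Round-off rule: always round UP to the next whole number.
    return math.ceil(z ** 2 * pq / margin ** 2)

print(required_sample_size(0.03, 0.95, p_hat=0.73))  # 842
print(required_sample_size(0.03, 0.95))              # 1068
```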
Example 10: The Internet is affecting us all in many different ways, so
there are many reasons for estimating the proportion of adults who use
it. Assume that a manager for E-Bay wants to determine the current
percentage of U.S. adults who now use the Internet. How many adults
must be surveyed in order to be 95% confident that the sample
percentage is in error by no more than three percentage points?
a) In 2006, 73% of adults used the Internet.
b) No known possible value of the proportion.
Example 11: Use the given data to find the minimum sample size
required to estimate the population proportion.
a) Margin of error: 0.006; confidence level: 90%; p̂ and q̂ unknown.
b) Margin of error: 0.02; confidence level: 94%; p̂ and q̂ unknown.
Example 12: Use the given data to find the minimum sample size
required to estimate the population proportion.
a) Margin of error: 0.08; confidence level: 96%; from a prior study, p̂ is
estimated by 0.24.
b) Margin of error: 0.004; confidence level: 92%; from a prior study, p̂
is estimated by 0.123.
Example 13: The Genetics and IVF Institute conducted a clinical trial of the
XSORT method designed to increase the probability of conceiving a girl. As of
this writing, 574 babies were born to parents using the XSORT method, and 525
of them were girls.
a) What is the best point estimate of the population proportion of girls born to
parents using the XSORT method?
b) Use the sample data to construct a 95% confidence interval estimate of the
percentage of girls born to parents using the XSORT method.
Example 14: An interesting and popular hypothesis is that individuals can
temporarily postpone their death to survive a major holiday or important event
such as a birthday. In a study of this phenomenon, it was found that in the week
before and the week after Thanksgiving, there were 12,000 total deaths, and 6,062
of them occurred in the week before Thanksgiving (based on data from “Holiday,
Birthdays, and Postponement of Cancer Death,” by Young and Hade, Journal of
the American Medical Association, Vol. 292, No. 24).
a) What is the best point estimate of the proportion of deaths in the week before
Thanksgiving to the total deaths in the week before and the week after
Thanksgiving?
b) Construct a 95% confidence interval estimate of the proportion of deaths in the
week before Thanksgiving to the total deaths in the week before and the week
after Thanksgiving.
Finding the Point Estimate and E from a Confidence Interval
When you already know the interval and are looking for p̂ or E:
Point estimate of p:
$$\hat{p} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$$
Margin of error:
$$E = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$$
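Working backwards from an interval takes only two lines of arithmetic: the point estimate is the midpoint of the limits and the margin of error is half the width. The sketch below applies this to the interval 0.677 < p < 0.723 discussed earlier; the function name is illustrative.

```python
def point_estimate_and_margin(lower, upper):
    """Recover p-hat (midpoint) and E (half-width) from confidence limits."""
    p_hat = (upper + lower) / 2
    E = (upper - lower) / 2
    return p_hat, E

p_hat, E = point_estimate_and_margin(0.677, 0.723)
print(f"p-hat = {p_hat:.3f}, E = {E:.3f}")  # p-hat = 0.700, E = 0.023
```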
Example 15: The following confidence interval is obtained for a
population proportion, p: (0.426, 0.612). Use these confidence interval
limits to find the point estimate, p̂.
Example 16: The following confidence interval is obtained for a
population proportion, p: 0.842 < p < 0.925. Use these confidence
interval limits to find the point estimate, p̂.
Example 17: The following confidence interval is obtained for a
population proportion, p: (0.647, 0.875). Use these confidence interval
limits to find the margin of error, E.
Example 18: The following confidence interval is obtained for a
population proportion, p: 0.542 < p < 0.714. Use these confidence
interval limits to find the margin of error, E.
Example 19: Express the confidence interval 0.052 < p < 0.428 in the
form p̂ ± E.
Example 20: Express the confidence interval (0.258, 0.789) in the
form p̂ ± E.