Confidence Intervals for p and mu


From the Data at Hand to the
World at Large
Chapters 19, 23
Confidence Intervals
Estimation of population parameters:
• an unknown population proportion p
• an unknown population mean μ
Concepts of Estimation
• The objective of estimation is to estimate
the unknown value of a population
parameter, like the mean μ, on the basis of a
sample statistic calculated from sample
data.
– e.g., the NCSU housing office may want to
estimate the mean distance μ from campus
to hometown of all students
• There are two types of estimates
– Point Estimate
– Interval estimate
What do we frequently need to
estimate?
• An unknown population proportion p
• An unknown population mean μ
Point Estimates
• The sample mean x̄ is the best point
estimate of the population mean μ
• p̂ = x/n, the sample proportion of x
successes in a sample of size n, is the best
point estimate of the population proportion
p
Example: Estimating an
unknown population proportion p
• Is Herb Sendek's departure good or bad
for State's men's basketball team?
(Technician opinion poll; not scientifically
valid!!)
• In a sample of 1000 students, 590 say that
Sendek’s departure is good for the basketball team.
• p̂ = 590/1000 = .59 is the point estimate of
the unknown population proportion p of students who
think Sendek’s departure is good.
Example: Estimating an
unknown mean μ
• In an effort to improve drive-through
service, a Burger King records the drive-through
service times of 52 randomly selected vehicles.
• The sample mean service time x̄ = 181.3
seconds is the point estimate of the
unknown mean service time μ
Shortcoming of Point Estimates
• x̄ = 181.3 seconds, best estimate of mean
service time μ
• p̂ = 590/1000 = .59, best estimate of
population proportion p
BUT
How good are these best estimates?
No measure of reliability
Another type of estimate
Interval Estimator
A confidence interval is a range (or an
interval) of values used to estimate the
unknown value of a population parameter.
http://abcnews.go.com/US/PollVault/
95% Confidence Interval for p
Use p̂ = x/n to construct a 95% confidence interval
for p:
$$\left(\hat{p} - 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$
written
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Standard Normal
P(-1.96 ≤ z ≤ 1.96) = .95
Sampling distribution of p̂
Confidence level .95
95% of the time p̂ will be in the interval
$$\left(p - 1.96\sqrt{\frac{pq}{n}},\ p + 1.96\sqrt{\frac{pq}{n}}\right)$$
Therefore, the interval
$$\left(\hat{p} - 1.96\sqrt{\frac{pq}{n}},\ \hat{p} + 1.96\sqrt{\frac{pq}{n}}\right)$$
will "capture" p 95% of the time
Example (Gallup Polls)
Voter preference polls typically sample
approximately 1600 voters; suppose p̂ = .52.
Then if we desire a 95% confidence interval
for p we calculate
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}\hat{q}}{n}} = .52 \pm 1.96\sqrt{\frac{(.52)(.48)}{1600}} = .52 \pm .024 \Rightarrow (.496, .544)$$
http://abcnews.go.com/US/PollVault/story?id=145373&page=1
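The Gallup calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `proportion_ci_95` is made up for illustration, and the inputs are the p̂ = .52 and n = 1600 quoted in the example.

```python
import math

def proportion_ci_95(p_hat, n):
    """Return (lower, upper) for p_hat ± 1.96 * sqrt(p_hat * q_hat / n)."""
    q_hat = 1.0 - p_hat
    margin = 1.96 * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Numbers from the Gallup example: p_hat = .52, n = 1600
lower, upper = proportion_ci_95(0.52, 1600)
print(f"({lower:.3f}, {upper:.3f})")  # prints (0.496, 0.544)
```

The printed interval agrees with the hand calculation to three decimal places.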
Medication side effects (confidence
interval for p)
Arthritis is a painful, chronic inflammation of the joints.
An experiment on the side effects of pain relievers
examined arthritis patients to find the proportion of
patients who suffer side effects.
What are some side effects of ibuprofen?
Serious side effects (seek medical attention immediately):
Allergic reaction (difficulty breathing, swelling, or hives),
Muscle cramps, numbness, or tingling,
Ulcers (open sores) in the mouth,
Rapid weight gain (fluid retention),
Seizures,
Black, bloody, or tarry stools,
Blood in your urine or vomit,
Decreased hearing or ringing in the ears,
Jaundice (yellowing of the skin or eyes), or
Abdominal cramping, indigestion, or heartburn,
Less serious side effects (discuss with your doctor):
Dizziness or headache,
Nausea, gaseousness, diarrhea, or constipation,
Depression,
Fatigue or weakness,
Dry mouth, or
Irregular menstrual periods
440 subjects with chronic arthritis were given ibuprofen for pain relief;
23 subjects suffered from adverse side effects.
Calculate a 90% confidence interval for the population proportion p of
arthritis patients who suffer some “adverse symptoms.”
$$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
What is the sample proportion p̂?
$$\hat{p} = \frac{23}{440} \approx 0.052$$
For a 90% confidence level, z* = 1.645.
90% CI for p:
$$.052 \pm 1.645\sqrt{\frac{.052(1-.052)}{440}} = .052 \pm 1.645(0.011) = .052 \pm .018 \Rightarrow (.034, .070)$$
We are 90% confident that the interval (.034, .070) contains the true
proportion of arthritis patients that experience some adverse symptoms when
taking ibuprofen.
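The same steps can be sketched in Python, this time passing the multiplier z* as an argument so that any confidence level can be used. The helper name `proportion_ci` is hypothetical, and the code keeps full precision instead of rounding p̂ and the standard error along the way, so its endpoints can differ from the hand calculation in the third decimal place.

```python
import math

def proportion_ci(successes, n, z_star):
    """Return (p_hat, margin, (lower, upper)) for p_hat ± z* sqrt(p_hat q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    margin = z_star * math.sqrt(p_hat * q_hat / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Ibuprofen example: 23 of 440 subjects had adverse side effects, z* = 1.645
p_hat, margin, (lower, upper) = proportion_ci(23, 440, z_star=1.645)
print(f"p_hat = {p_hat:.4f}, margin = {margin:.4f}")
print(f"90% CI: ({lower:.4f}, {upper:.4f})")
```

Without intermediate rounding the margin comes out a little under .018, so the lower endpoint is closer to .035 than to .034; the interpretation of the interval is unchanged.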
Tool for Constructing Confidence
Intervals for μ: The Central Limit
Theorem
• If a random sample of n observations is
selected from a population (any
population), then when n is sufficiently
large, the sampling distribution of x̄ will be
approximately normal.
(The larger the sample size, the better
the normal approximation to the sampling
distribution of x̄; we’ll use n ≥ 30)
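A small simulation can make the Central Limit Theorem concrete. The sketch below is illustrative only: the skewed exponential population, the seed, and the number of repetitions are all assumptions chosen for the demonstration, not part of the slides. It draws many samples of size n = 30 and checks how often x̄ lands within 1.96 standard deviations (σ/√n) of μ, which should be close to 95% if the normal approximation is reasonable.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30            # sample size (the rule of thumb n >= 30)
reps = 5000       # number of simulated samples (arbitrary choice)
mu = sigma = 1.0  # an exponential population with rate 1 has mean 1 and SD 1

# One sample mean per simulated sample of size n
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

half_width = 1.96 * sigma / math.sqrt(n)
inside = sum(abs(m - mu) <= half_width for m in sample_means) / reps

print(f"mean of sample means: {statistics.fmean(sample_means):.3f}")
print(f"SD of sample means:   {statistics.stdev(sample_means):.3f} "
      f"(theory: {sigma / math.sqrt(n):.3f})")
print(f"share of sample means within 1.96 SD of mu: {inside:.3f}")
```

Even though the population is strongly skewed, the share should come out near .95, which is what the confidence interval construction relies on.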
Estimating the Population
Mean μ when the Population
Standard Deviation is Known
• How is an interval estimator produced from a
sampling distribution?
– To estimate μ, a sample of size n is drawn from the
population, and its mean x̄ is calculated.
– Under certain conditions, x̄ is normally distributed
(or approximately normally distributed by the
CLT).
Confidence Interval for a
population mean μ
A 95% confidence interval for
a population mean μ:
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
usually written
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$$
EXAMPLE
n = 60, x̄ = 30.4, σ = 1.6
95% confidence interval for μ:
$$30.4 \pm 1.96\frac{1.6}{\sqrt{60}} = 30.4 \pm .405 \Rightarrow (29.995, 30.805)$$
We are 95% confident that the interval
from 29.995 to 30.805 contains
the true but unknown value of μ
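As a quick check of this example, here is a minimal Python sketch of the known-σ interval. The function name `mean_ci_known_sigma` and its default multiplier of 1.96 are illustrative choices; the inputs are the n, x̄ and σ given above.

```python
import math

def mean_ci_known_sigma(x_bar, sigma, n, z_star=1.96):
    """Return (lower, upper) for x_bar ± z* * sigma / sqrt(n), with sigma known."""
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Example values: n = 60, x_bar = 30.4, sigma = 1.6
lower, upper = mean_ci_known_sigma(30.4, 1.6, 60)
print(f"({lower:.3f}, {upper:.3f})")  # prints (29.995, 30.805)
```

Passing a different `z_star` (for example 1.645 or 2.58) gives the interval for another confidence level.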
Sampling distribution of x̄
Confidence level .95
95% of the time x̄ will be in the interval
$$\left(\mu - 1.96\frac{\sigma}{\sqrt{n}},\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
Therefore, the interval
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
will "capture" μ 95% of the time
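98% Confidence Intervals
For μ:
$$\left(\bar{x} - 2.33\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.33\frac{\sigma}{\sqrt{n}}\right)$$
written
$$\bar{x} \pm 2.33\frac{\sigma}{\sqrt{n}}$$
For p:
$$\left(\hat{p} - 2.33\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 2.33\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$
written
$$\hat{p} \pm 2.33\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$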
Four Commonly Used
Confidence Levels
Confidence Level    Multiplier
.90                 1.645
.95                 1.96
.98                 2.33
.99                 2.58
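These multipliers are standard normal quantiles: a central area C leaves (1 − C)/2 in each tail, so the multiplier is the (1 + C)/2 quantile. The sketch below recomputes them with `statistics.NormalDist` from the Python standard library; the helper name `multiplier` is just an illustrative choice.

```python
from statistics import NormalDist

def multiplier(confidence_level):
    """z* such that P(-z* <= Z <= z*) equals the confidence level."""
    # Central area C leaves (1 - C) / 2 in each tail,
    # so z* is the (1 + C) / 2 quantile of the standard normal.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.2f}  {multiplier(level):.3f}")
```

To three decimals the last two values are 2.326 and 2.576, which the table rounds to 2.33 and 2.58.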
Example (cont.)
n = 60, x̄ = 30.4, σ = 1.6;
95% CI: (29.995, 30.805)
90% CI: multiplier = 1.645
$$30.4 \pm 1.645\left(\frac{1.6}{\sqrt{60}}\right) = 30.4 \pm .34 \Rightarrow (30.06, 30.74)$$
98% CI: multiplier = 2.33
$$30.4 \pm 2.33\left(\frac{1.6}{\sqrt{60}}\right) = 30.4 \pm .481 \Rightarrow (29.919, 30.881)$$
Example (cont.)
99% CI: multiplier = 2.58
$$30.4 \pm 2.58\left(\frac{1.6}{\sqrt{60}}\right) = 30.4 \pm .533 \Rightarrow (29.867, 30.933)$$
Example Summary
• 90% (30.06, 30.74)
• 95% (29.995, 30.805)
• 98% (29.919, 30.881)
• 99% (29.867, 30.933)
• The higher the confidence level, the wider
the interval
• Increasing the sample size n will make a
confidence interval with the same
confidence level narrower (i.e., more
precise)
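Both summary points follow from the margin of error z*·σ/√n: it grows with the multiplier and shrinks with √n. The sketch below tabulates the margin for the running example (σ = 1.6) using the four multipliers from the slides. The sample sizes 100 and 240 are assumed values picked for illustration; 240 is four times 60, which halves the margin.

```python
import math

def margin(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    return z_star * sigma / math.sqrt(n)

sigma = 1.6
multipliers = {0.90: 1.645, 0.95: 1.96, 0.98: 2.33, 0.99: 2.58}

# Effect of the confidence level at fixed n = 60
by_level = {level: margin(z, sigma, 60) for level, z in multipliers.items()}
for level, m in by_level.items():
    print(f"n = 60, level {level:.2f}: margin = {m:.3f}")

# Effect of the sample size at a fixed 95% level (n values are illustrative)
by_n = {n: margin(1.96, sigma, n) for n in (60, 100, 240)}
for n, m in by_n.items():
    print(f"95% level, n = {n}: margin = {m:.3f}")
```

Because the margin depends on 1/√n, halving the width of an interval requires roughly four times as many observations.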
Example (cont.)
n = 60, x̄ = 30.4, σ = 1.6
95% CI: (29.995, 30.805)
n = 100, x̄ = 30.4, σ = 1.6
95% CI:
$$30.4 \pm 1.96\left(\frac{1.6}{\sqrt{100}}\right) = 30.4 \pm .314 \Rightarrow (30.086, 30.714)$$
(narrower, more precise)
Example
• Find a 95% confidence interval for p, the
proportion of small businesses in favor of a
tax increase to decrease the national debt, if
a random sample of 1000 found the number
of businesses in favor of increased taxes
was 50.
Example (solution)
p̂ = 50/1000 = .05, so q̂ = .95 and the confidence
interval is
$$.05 \pm 1.96\sqrt{\frac{(.05)(.95)}{1000}} = .05 \pm .014 \Rightarrow (.036, .064)$$
Interpreting Confidence Intervals
• Previous example: .05 ± .014 → (.036, .064)
• Correct: We are 95% confident that the interval from
.036 to .064 actually does contain the true value of p.
This means that if we were to select many different
samples of size 1000 and construct a 95% CI from each
sample, 95% of the resulting intervals would contain the
value of the population proportion p. (.036, .064) is one
such interval. (Note that 95% refers to the procedure we
used to construct the interval; it does not refer to the
population proportion p)
• Wrong: There is a 95% chance that the population
proportion p falls between .036 and .064. (Note that p is
not random, it is a fixed but unknown number)
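The "many different samples" interpretation can be illustrated by simulation. In the sketch below the true proportion `true_p = 0.05` is an assumption made purely for the demonstration (in practice p is unknown), as are the seed and the number of repetitions. Each repetition draws a sample of size 1000, builds a 95% interval from that sample's p̂, and records whether the interval contains the fixed value p.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

true_p = 0.05   # assumed "true" proportion, for illustration only
n = 1000        # sample size, as in the example
reps = 2000     # number of repeated samples (arbitrary choice)

def ci_95(p_hat, n):
    """95% interval p_hat ± 1.96 * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

hits = 0
for _ in range(reps):
    # Count successes in one simulated sample of size n
    successes = sum(random.random() < true_p for _ in range(n))
    lower, upper = ci_95(successes / n, n)
    if lower <= true_p <= upper:
        hits += 1

coverage = hits / reps
print(f"share of intervals containing p: {coverage:.3f}")
```

The share should come out close to .95. The intervals vary from sample to sample while p stays fixed, which is why the 95% describes the procedure and not any single interval.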