Introduction to Hypothesis Testing

Chapter 23 Inference for Means: Part 2
Required Sample Size, Type II Error Probabilities
Required Sample Size To Estimate a Population Mean μ
• If you desire a C% confidence interval for a population mean μ with an accuracy specified by you, how large does the sample size need to be?
• We will denote the accuracy by ME, which stands for Margin of Error.
Example: Sample Size to Estimate a Population Mean μ
• Suppose we want to estimate the unknown mean height μ of male students at NC State with a confidence interval.
• We want to be 95% confident that our estimate is within .5 inch of μ.
• How large does our sample size need to be?
Confidence Interval for μ
In terms of the margin of error ME, the CI for μ can be expressed as
$\bar{x} \pm ME$
The confidence interval for μ is
$\bar{x} \pm t^*_{n-1} \frac{s}{\sqrt{n}}$, so $ME = t^*_{n-1} \frac{s}{\sqrt{n}}$
So we can find the sample size by solving this equation for n:
$ME = t^*_{n-1} \frac{s}{\sqrt{n}}$
which gives $n = \left( \frac{t^*_{n-1}\, s}{ME} \right)^2$
• Good news: we have an equation
• Bad news:
1. We need to know s
2. We don't know n, so we don't know the degrees of freedom needed to find $t^*_{n-1}$
A Way Around this Problem: Approximate by Using the Standard Normal
Use the corresponding z* from the standard normal to form the equation
$ME = z^* \frac{s}{\sqrt{n}}$
Solve for n:
$n = \left( \frac{z^* s}{ME} \right)^2$
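[Figure: sampling distribution of $\bar{x}$; at confidence level .95, the central area extends a distance ME on each side of μ, from $\mu - 1.96 \frac{\sigma}{\sqrt{n}}$ to $\mu + 1.96 \frac{\sigma}{\sqrt{n}}$.]
Set $ME = 1.96 \frac{\sigma}{\sqrt{n}}$ and solve for n:
$n = \left( \frac{1.96\, \sigma}{ME} \right)^2$ (estimate σ with s)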
Estimating s
• Previously collected data or prior knowledge of the population
• If the population is normal or near-normal, then s can be conservatively estimated by
$s \approx \frac{\text{range}}{6}$
• This works because 99.7% of observations lie within 3σ of the mean, so the range covers about 6σ
Example: sample size to estimate mean height µ of NCSU undergrad. male students
$n = \left( \frac{z^* s}{ME} \right)^2$
We want to be 95% confident that we are within .5 inch of µ, so
• ME = .5; z* = 1.96
• Suppose previous data indicate that s is about 2 inches.
• n = [(1.96)(2)/(.5)]^2 = 61.47
• We should sample 62 male students
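The height calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the helper names `z_star` and `sample_size_for_mean` are made up for this example and are not part of the course material. It uses the normal approximation and always rounds n up.

```python
import math
from statistics import NormalDist


def z_star(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(confidence, s, me):
    """Smallest n with z* * s / sqrt(n) <= me (normal approximation)."""
    n_exact = (z_star(confidence) * s / me) ** 2
    return math.ceil(n_exact)  # always round up


# Height example: 95% confidence, s about 2 inches, ME = 0.5 inch
n_heights = sample_size_for_mean(0.95, s=2.0, me=0.5)
print(n_heights)
```

Rounding up rather than to the nearest integer guarantees that the margin of error is no larger than the one requested.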
Example: Sample Size to Estimate a Population Mean μ - Textbooks
• Suppose the financial aid office wants to estimate the mean NCSU semester textbook cost μ within ME = $25 with 98% confidence. How many students should be sampled? Previous data show σ is about $85.
$n = \left( \frac{z^* \sigma}{ME} \right)^2 = \left( \frac{(2.33)(85)}{25} \right)^2 = 62.76$
Round up to n = 63
Example: Sample Size to Estimate a Population Mean - NFL footballs
• The manufacturer of NFL footballs uses a machine to inflate new
footballs
• The mean inflation pressure is 13.5 psi, but uncontrollable
factors cause the pressures of individual footballs to vary from
13.3 psi to 13.7 psi
• After throwing 6 interceptions in a recent game, Peyton Manning
complains that the balls are not properly inflated.
The manufacturer wishes to estimate the
mean inflation pressure to within .025
psi with a 99% confidence interval. How
many footballs should be sampled?
Example: Sample Size to Estimate a Population Mean μ
$n = \left( \frac{z^* \sigma}{ME} \right)^2$
• The manufacturer wishes to estimate the mean inflation pressure to within .025 psi with a 99% confidence interval. How many footballs should be sampled?
• 99% confidence ⇒ z* = 2.576; ME = .025
• σ = ? Inflation pressures range from 13.3 to 13.7 psi
• So range = 13.7 – 13.3 = .4; σ ≈ range/6 = .4/6 = .067
$n = \left( \frac{(2.576)(.067)}{.025} \right)^2 = 47.66$, so round up to n = 48
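As a rough check of the football example, the sketch below applies the range/6 rule and the normal-approximation formula in Python. The variable names are illustrative only. Because the code does not round σ to .067, the unrounded n comes out near 47.2 rather than 47.66, but it still rounds up to the same sample size.

```python
import math
from statistics import NormalDist

# Observed spread of individual inflation pressures (psi), from the example
low, high = 13.3, 13.7

# Range/6 rule: about 99.7% of a near-normal population lies within 3 sigma
sigma_guess = (high - low) / 6

# 99% confidence leaves 0.5% in each tail
z = NormalDist().inv_cdf(0.995)

me = 0.025  # desired margin of error in psi
n_exact = (z * sigma_guess / me) ** 2
n_footballs = math.ceil(n_exact)
print(round(n_exact, 2), n_footballs)
```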
Significance Levels and Rejection Regions
Hypothesis Tests for μ
α Levels and Rejection Regions, Right-Tail; n=26 (df=25)
$H_0: \mu = \mu_0$, test statistic $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
α      Rej Region
.10    t > 1.316
.05    t > 1.708
.01    t > 2.485
If HA: μ > μ0 and α = .10, then RR = {t: t > 1.316}
If HA: μ > μ0 and α = .05, then RR = {t: t > 1.708}
If HA: μ > μ0 and α = .01, then RR = {t: t > 2.485}
Hypothesis Testing for μ, Type II Error Probabilities (Right-tail example)
• Example
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 401 accounts has a mean of $174 and s = $65.
– Can we conclude that the new system will be cost-effective?
Right-tail example: hypotheses, significance level
• Hypotheses
– The population of interest is the credit accounts at the store.
– We want to know whether the mean account for all customers is greater than $170.
H0: μ = 170
HA: μ > 170
– Where μ is the mean account value for all customers
– We will choose significance level α = .05
A Right - Tail Test: Rejection Region
• The rejection region: reject H0 if the test statistic t satisfies $t > t_{.05,n-1} = t_{.05,400} = 1.649$
• We will reject H0 if the value of the test statistic t is greater than 1.649
• Results from the n = 401 randomly selected customers:
$\bar{x} = \$174$, $s = \$65$
Right-tail example: test statistic and conclusion
– Hypotheses:
H0: μ = 170
HA: μ > 170
data: $\bar{x} = 174$, $s = 65$
test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{174 - 170}{65/\sqrt{401}} = 1.23$
Recall that the rejection region is $t > t_{\alpha,n-1} = t_{.05,400} = 1.649$
Since the test statistic t = 1.23, and 1.23 < 1.649, we do not reject the null hypothesis H0: μ = 170.
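The test statistic and decision for the billing example can be reproduced with a short Python sketch. The function name `t_statistic` is hypothetical, and the critical value 1.649 is copied from the slide rather than computed, since the standard library has no t distribution.

```python
import math


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Billing example: n = 401 accounts, sample mean $174, s = $65, H0: mu = 170
t = t_statistic(xbar=174, mu0=170, s=65, n=401)

t_crit = 1.649  # t_{.05,400}, taken from the slide (not computed here)
reject_h0 = t > t_crit  # right-tail test

print(round(t, 2), reject_h0)
```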
Right-tail example: P-value and conclusion
P-value: the probability of observing a value of the test statistic as extreme or more extreme than t = 1.23, given that μ = 170, is
$P\text{-value} = P(t_{400} > 1.23) = .1097$
[Figure: $t_{400}$ density centered at 0, with t = 1.23 marked.]
Since the P-value > .05, we conclude that there is not sufficient evidence to reject H0: μ = 170.
Type II error is possible
Calculating β, the Probability of a Type II Error
• Calculating β for the t test is not at all straightforward and is beyond the level of this course
– The distribution of the test statistic t is quite complicated when H0 is false and HA is true
– However, we can obtain very good approximate values for β using z (the standard normal) in place of t.
Calculating β, the Probability of a Type II Error (cont.)
• We need to
1. Specify an appropriate significance level α
2. Determine the rejection region in terms of z
3. Then calculate the probability of not being in the rejection region when μ = μ1, where μ1 is a value of μ that makes HA true.
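Example (cont.) calculating β
– Test statistic:
H0: μ = 170
HA: μ > 170
Choose α = .05
Rejection region in terms of z: z > z.05 = 1.645
Rejection region in terms of $\bar{x}$:
$z = \frac{\bar{x} - 170}{65/\sqrt{400}} > 1.645$
$\bar{x} > 170 + 1.645 \frac{65}{\sqrt{400}} = 175.34$
[Figure: sampling distribution of $\bar{x}$ under H0, centered at 170, with right-tail area α = 0.05 beyond 175.34.]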
Example (cont.) calculating β
– The rejection region with α = .05. Express the rejection region directly, not in standardized terms:
$\bar{x} > 175.34$
– Let the alternative value be μ = 180 (rather than just μ > 170). Specify the alternative value under HA.
H0: μ = 170
HA: μ = 180
[Figure: sampling distributions of $\bar{x}$ centered at μ = 170 and μ = 180; H0 is not rejected to the left of 175.34, and the area α = .05 lies to its right under H0.]
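Example (cont.) calculating β
– A Type II error occurs when a false H0 is not rejected. Suppose μ = 180, that is, H0 is false.
H0: μ = 170
HA: μ = 180
A false H0 is not rejected when $\bar{x} < 175.34$.
[Figure: sampling distributions of $\bar{x}$ centered at μ = 170 and μ = 180, with the cutoff 175.34 and α = .05 marked.]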
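Example (cont.) calculating β
$\beta(180) = P(\bar{x} < 175.34 \text{ given that } H_0 \text{ is false})$
$= P(\bar{x} < 175.34 \text{ given that } \mu = 180)$
$= P\left( z < \frac{175.34 - 180}{65/\sqrt{400}} \right) = .0764$
Power when μ = 180 = 1 − β(180) = .9236
[Figure: sampling distributions of $\bar{x}$ under H0: μ = 170 and HA: μ = 180, with the cutoff 175.34 marked.]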
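The two-step calculation (cutoff for $\bar{x}$, then the probability of falling below it when μ = 180) can be sketched in Python as follows. The function name `beta_right_tail` is made up for this illustration, and n = 400 is used in the standard error as in the slides. Because the code keeps unrounded values, the cutoff and β differ slightly in the last digit from the hand calculation.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal


def beta_right_tail(mu0, mu_a, sigma, n, alpha):
    """Cutoff for x-bar and Type II error probability for HA: mu > mu0."""
    se = sigma / math.sqrt(n)
    cutoff = mu0 + Z.inv_cdf(1 - alpha) * se  # reject H0 when x-bar > cutoff
    beta = Z.cdf((cutoff - mu_a) / se)        # P(x-bar < cutoff | mu = mu_a)
    return cutoff, beta


cutoff, beta = beta_right_tail(mu0=170, mu_a=180, sigma=65, n=400, alpha=0.05)
power = 1 - beta
print(round(cutoff, 2), round(beta, 3), round(power, 3))
```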
Effects on β of changing α
• Increasing the significance level α decreases the value of β, and vice versa.
[Figure: sampling distributions of $\bar{x}$ centered at μ = 170 and μ = 180; when α2 > α1, then β2 < β1.]
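The trade-off between α and β can be seen numerically. The sketch below reuses the billing example's numbers (μ0 = 170, alternative 180, s = 65, n = 400) and evaluates β for three significance levels; the chosen α values and the function name are only illustrative.

```python
import math
from statistics import NormalDist

Z = NormalDist()
se = 65 / math.sqrt(400)  # standard error from the billing example


def beta_for_alpha(alpha, mu0=170.0, mu_a=180.0):
    """Type II error probability of the right-tail z test at level alpha."""
    cutoff = mu0 + Z.inv_cdf(1 - alpha) * se
    return Z.cdf((cutoff - mu_a) / se)


alphas = [0.01, 0.05, 0.10]
betas = [beta_for_alpha(a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  beta = {b:.4f}")
```

A larger α moves the cutoff toward μ0, so less of the alternative distribution falls in the "do not reject" region.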
Judging the Test
• A hypothesis test is effectively defined by the significance level α and by the sample size n.
• If the probability of a Type II error β is judged to be too large, we can reduce it by
– increasing α, and/or
– increasing the sample size.
Judging the Test
• Increasing the sample size reduces β
Recall RR: $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} > z_\alpha$, or $\bar{x} > \mu_0 + z_\alpha \frac{s}{\sqrt{n}}$
By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, the cutoff value of $\bar{x}$ for the rejection region decreases.
Judging the Test
• Increasing the sample size reduces β
Note what happens when n increases: α does not change, but β becomes smaller.
[Figure: sampling distributions of $\bar{x}$ centered at μ = 170 and μ = 180.]
Judging the Test
• Increasing the sample size reduces β
• In the example, suppose n increases from 400 to 1000.
$\bar{x} > \mu_0 + z_\alpha \frac{s}{\sqrt{n}} = 170 + 1.645 \frac{65}{\sqrt{1000}} = 173.38$
$\beta = P\left( Z < \frac{173.38 - 180}{65/\sqrt{1000}} \right) = P(Z < -3.22) \approx 0$
• α remains 5%, but the probability of a Type II error drops dramatically.
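The effect of the sample size can be checked with a small Python sketch that recomputes β at n = 400 and n = 1000 for the same test (α = .05, μ0 = 170, alternative μ = 180, s = 65). The function name `beta_for_n` is hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def beta_for_n(n, mu0=170.0, mu_a=180.0, sigma=65.0, alpha=0.05):
    """Type II error probability of the right-tail z test with sample size n."""
    se = sigma / math.sqrt(n)
    cutoff = mu0 + Z.inv_cdf(1 - alpha) * se  # shrinks toward mu0 as n grows
    return Z.cdf((cutoff - mu_a) / se)


beta_400 = beta_for_n(400)
beta_1000 = beta_for_n(1000)
print(f"n = 400:  beta = {beta_400:.4f}")
print(f"n = 1000: beta = {beta_1000:.4f}")
```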
A Left - Tail Test
• Self-Addressed Stamped Envelopes.
– The chief financial officer at FedEx believes that including a stamped self-addressed (SSA) envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills.
– Currently, customers return their payments in 24 days on average, with a standard deviation of 6 days.
– Stamped self-addressed envelopes are included with the bills for 76 randomly selected customers. The number of days until they return their payment is recorded.
A Left - Tail Test: Hypotheses
• The parameter tested is the population mean payment period (μ) for customers who receive self-addressed stamped envelopes with their bill.
• The hypotheses are:
H0: μ = 24
H1: μ < 24
• Use α = .05; n = 76.
A Left - Tail Test: Rejection Region
• The rejection region: reject H0 if the test statistic t satisfies $t < -t_{.05,75} = -1.665$
• We will reject H0 if the value of the test statistic t is less than −1.665
• Results from the 76 randomly selected customers:
$\bar{x} = 22.95$ days, $s = 6$ days
A Left -Tail Test: Test Statistic
• The value of the test statistic t is:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{22.95 - 24}{6/\sqrt{76}} = -1.52$
The rejection region is $t < -t_{\alpha} = -t_{.05} = -1.665$.
Since the test statistic t = −1.52, and −1.52 > −1.665, we do not reject the null hypothesis.
Note that the P-value = P(t75 < −1.52) = .066 > .05.
Since our decision is to not reject the null hypothesis, a Type II error is possible.
Left-Tail Test: Calculating β, the Probability of a Type II Error
• The CFO thinks that a decrease of one day in the average payment return time will cover the costs of the envelopes since customer checks can be deposited earlier.
• What is β(23), the probability of a Type II error when the true mean payment return time μ is 23 days?
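Left-tail test: calculating β (cont.)
– Test statistic:
H0: μ = 24
HA: μ < 24
Choose α = .05
Rejection region in terms of z: z < −z.05 = −1.645
Rejection region in terms of $\bar{x}$:
$z = \frac{\bar{x} - 24}{6/\sqrt{75}} < -1.645$
$\bar{x} < 24 - 1.645 \frac{6}{\sqrt{75}} = 22.86$
[Figure: sampling distribution of $\bar{x}$ under H0, centered at 24, with left-tail area α = 0.05 below 22.86.]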
Left-tail test: calculating β (cont.)
– The rejection region with α = .05. Express the rejection region directly, not in standardized terms:
$\bar{x} < 22.86$
– Let the alternative value be μ = 23 (rather than just μ < 24). Specify the alternative value under HA.
H0: μ = 24
HA: μ = 23
[Figure: sampling distributions of $\bar{x}$ centered at μ = 23 and μ = 24; H0 is not rejected to the right of 22.86, and the area α = .05 lies to its left under H0.]
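Left-tail test: calculating β (cont.)
$\beta(23) = P(\bar{x} > 22.86 \text{ given that } H_0 \text{ is false})$
$= P(\bar{x} > 22.86 \text{ given that } \mu = 23)$
$= P\left( z > \frac{22.86 - 23}{6/\sqrt{75}} \right) = P(z > -.202) \approx .58$
Power when μ = 23 = 1 − β(23) ≈ .42
[Figure: sampling distributions of $\bar{x}$ under H0: μ = 24 and H1: μ = 23, with the cutoff 22.86 and α = .05 marked.]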
A Two - Tail Test for μ
• The Federal Communications Commission (FCC) wants competition between phone companies. The FCC wants to investigate whether AT&T rates differ from their competitor’s rates.
• According to data from the FCC, the mean monthly long-distance bill for all AT&T residential customers is $17.09.
A Two - Tail Test (cont.)
• A random sample of 100 AT&T customers is selected and their bills are recalculated using a leading competitor’s rates.
• The mean and standard deviation of the bills using the competitor’s rates are
$\bar{x} = \$17.55$, $s = \$3.87$
• Can we infer that there is a difference between AT&T’s bills and the competitor’s bills (on the average)?
A Two - Tail Test (cont.)
• Is the mean different from 17.09?
H0: μ = 17.09
HA: μ ≠ 17.09
• n = 100; use α = .05
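A Two – Tail Test (cont.)
Rejection region:
$t < -t_{.025,99}$ or $t > t_{.025,99}$
$t < -1.9842$ or $t > 1.9842$
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19$
[Figure: $t_{99}$ density centered at 0, with rejection regions of area α/2 = 0.025 below −t(α/2) = −1.9842 and above t(α/2) = 1.9842.]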
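A Two – Tail Test: Conclusion
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19$
Since −1.9842 < 1.19 < 1.9842, the test statistic is not in the rejection region.
There is insufficient evidence to conclude that there is a difference between the bills of AT&T and the competitor.
Also, by the P-value approach:
The P-value = P(t < −1.19) + P(t > 1.19) = 2(.1184) = .2368 > .05
A Type II error is possible
[Figure: $t_{99}$ density with −1.19 and 1.19 marked inside the cutoffs −t(α/2) = −1.9842 and t(α/2) = 1.9842; each rejection tail has area α/2 = 0.025.]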
Two-Tail Test: Calculating β, the Probability of a Type II Error
• The FCC would like to detect a decrease of $1.50 in the average competitor’s bill. (17.09 − 1.50 = 15.59)
• What is β(15.59), the probability of a Type II error when the true mean competitor’s bill μ is $15.59?
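Two – Tail Test: Calculating β (cont.)
Rejection region: $z < -z_{.025}$ or $z > z_{.025}$, that is, z < −1.96 or z > 1.96
Rejection region in terms of $\bar{x}$:
$z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} < -1.96$, so $\bar{x} < 17.09 - 1.96 \frac{3.87}{\sqrt{100}}$, that is, $\bar{x} < 16.33$
$z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} > 1.96$, so $\bar{x} > 17.09 + 1.96 \frac{3.87}{\sqrt{100}}$, that is, $\bar{x} > 17.85$
[Figure: sampling distribution of $\bar{x}$ centered at 17.09; reject H0 below 16.33 or above 17.85 (area α/2 = 0.025 in each tail), do not reject H0 in between.]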
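Two – Tail Test: Calculating β (cont.)
$\beta(15.59) = P(16.33 < \bar{x} < 17.85 \text{ given that } \mu = 15.59)$
$= P\left( \frac{16.33 - 15.59}{3.87/\sqrt{100}} < \frac{\bar{x} - 15.59}{3.87/\sqrt{100}} < \frac{17.85 - 15.59}{3.87/\sqrt{100}} \right)$
$= P(1.912 < z < 5.84) = .028$
Power when μ = 15.59 = 1 − β(15.59) = .972
[Figure: sampling distributions of $\bar{x}$ under H0: μ = 17.09 and HA: μ = 15.59, with the cutoffs 16.33 and 17.85 and α = .05 marked.]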
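The two-tail calculation can be reproduced step by step in Python. This is only a sketch with illustrative variable names: it finds the two cutoffs for $\bar{x}$ under H0 and then the probability that $\bar{x}$ lands between them when μ = 15.59.

```python
import math
from statistics import NormalDist

Z = NormalDist()

mu0, mu_a = 17.09, 15.59     # null value and alternative of interest
se = 3.87 / math.sqrt(100)   # standard error of x-bar
z = Z.inv_cdf(1 - 0.05 / 2)  # z_{.025}, about 1.96

# "Do not reject H0" region for x-bar
lower = mu0 - z * se
upper = mu0 + z * se

# Probability of landing in that region when the true mean is mu_a
beta = Z.cdf((upper - mu_a) / se) - Z.cdf((lower - mu_a) / se)
power = 1 - beta
print(round(lower, 2), round(upper, 2), round(beta, 3), round(power, 3))
```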
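General formula: Type II Error Probability β(μA) for a Level α Test
HA: μ > μ0:  $\beta(\mu_A) = P\left( z < z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}} \right)$
HA: μ < μ0:  $\beta(\mu_A) = 1 - P\left( z < -z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}} \right)$
HA: μ ≠ μ0:  $\beta(\mu_A) = P\left( z < z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}} \right) - P\left( z < -z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}} \right)$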
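The three cases of the general formula can be collected into one Python function. This is a minimal sketch; the function name `type2_error` and the `tail` argument are made up for illustration. The usage lines plug in the right-tail billing example and the two-tail phone bill example.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def type2_error(mu0, mu_a, sigma, n, alpha, tail):
    """beta(mu_a) for a level-alpha z test; tail is 'right', 'left' or 'two'."""
    shift = (mu0 - mu_a) / (sigma / math.sqrt(n))
    if tail == "right":   # HA: mu > mu0
        return Z.cdf(Z.inv_cdf(1 - alpha) + shift)
    if tail == "left":    # HA: mu < mu0
        return 1 - Z.cdf(-Z.inv_cdf(1 - alpha) + shift)
    if tail == "two":     # HA: mu != mu0
        z_half = Z.inv_cdf(1 - alpha / 2)
        return Z.cdf(z_half + shift) - Z.cdf(-z_half + shift)
    raise ValueError("tail must be 'right', 'left' or 'two'")


beta_billing = type2_error(170, 180, 65, 400, 0.05, "right")
beta_phone = type2_error(17.09, 15.59, 3.87, 100, 0.05, "two")
print(round(beta_billing, 3), round(beta_phone, 3))
```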
Sample Size n for which a level α test also has β(μA) = β
$n = \left[ \frac{\sigma (z_\alpha + z_\beta)}{\mu_0 - \mu_A} \right]^2$ for a 1-tailed (right or left) test
$n = \left[ \frac{\sigma (z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_A} \right]^2$ for a 2-tailed test (approx. solution)
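As an illustration of the sample size formulas, the sketch below computes n for the billing example's values (μ0 = 170, μA = 180, σ ≈ 65, α = .05). The target β = .05 is a hypothetical choice made for this example, not a value from the slides, and the function name is likewise made up.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def sample_size_for_power(mu0, mu_a, sigma, alpha, beta, two_tailed=False):
    """Smallest n for which a level-alpha z test has Type II error <= beta."""
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = Z.inv_cdf(1 - tail_area)
    z_beta = Z.inv_cdf(1 - beta)
    n_exact = (sigma * (z_alpha + z_beta) / (mu0 - mu_a)) ** 2
    return math.ceil(n_exact)  # round up so beta is not exceeded


# Hypothetical target: beta = .05 at mu_A = 180
n_one_tail = sample_size_for_power(170, 180, 65, alpha=0.05, beta=0.05)
n_two_tail = sample_size_for_power(170, 180, 65, alpha=0.05, beta=0.05,
                                   two_tailed=True)
print(n_one_tail, n_two_tail)
```

The two-tailed version needs a larger n because each tail only gets α/2, which pushes the cutoffs farther from μ0.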