Power and Sample Size Calculations for

Chapter 23 Inference for Means: Part 2
Type II Error, Power and Sample Size Calculations
1
Hypothesis Testing for μ, Type II Error Probabilities (Right-tail example)
• Example
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 400 accounts has a mean of $174 and s = $65.
– Can we conclude that the new system will be cost-effective?
2
Example (cont.)
• Hypotheses
– The population of interest is the credit accounts at the store.
– We want to know whether the mean account for all customers is greater than $170.
H0: μ = 170
HA: μ > 170
– where μ is the mean account value for all customers.
3
Example (cont.)
– Test statistic:
H0: μ = 170
HA: μ > 170
x̄ = 174, s = 65
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{174 - 170}{65/\sqrt{400}} = 1.23 \]
4
Example (cont.)
P-value: the probability of observing a value of the test statistic as extreme as or more extreme than t = 1.23, given that μ = 170, is
\[ \text{P-value} = P(t_{399} > 1.23) = .1097 \]
[Figure: t399 density curve with 0 and t = 1.23 marked]
Since the P-value > .05, we conclude that there is not sufficient evidence to reject H0: μ = 170.
A Type II error is possible.
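The test statistic above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library. Because the standard library has no t distribution, the P-value here is approximated with the standard normal (an assumption of this sketch, reasonable at 399 degrees of freedom), so it differs slightly from the .1097 quoted for t399. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample summary from the billing example
x_bar, s, n = 174.0, 65.0, 400
mu_0 = 170.0  # value of the mean under H0

# t statistic: (x_bar - mu_0) / (s / sqrt(n))
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Right-tail P-value, approximated with the standard normal
# (assumption: z used in place of t because df = 399 is large)
p_approx = 1.0 - NormalDist().cdf(t_stat)

print(f"t = {t_stat:.2f}, approximate P-value = {p_approx:.4f}")
```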
5
Calculating β, the Probability of a Type II Error
• Calculating β for the t test is not at all straightforward and is beyond the level of this course.
– The distribution of the test statistic t is quite complicated when H0 is false and HA is true.
– However, we can obtain very good approximate values for β using z (the standard normal) in place of t.
6
Calculating β, the Probability of a Type II Error (cont.)
• We need to
1. Specify an appropriate significance level α.
2. Determine the rejection region in terms of z.
3. Calculate the probability of not being in the rejection region when μ = μ1, where μ1 is a value of μ that makes HA true.
7
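Example (cont.) calculating β
– Test statistic:
H0: μ = 170
HA: μ > 170
Choose α = .05
Rejection region in terms of z: z > z.05 = 1.645
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 170}{65/\sqrt{400}} > 1.645 \quad\Longrightarrow\quad \bar{x} > 170 + 1.645\,\frac{65}{\sqrt{400}} = 175.34 \]
[Figure: sampling distribution of x̄ centered at 170, with area α = 0.05 to the right of 175.34]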
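The conversion of the rejection region from z to x̄ can be sketched as follows. The helper name `right_tail_cutoff` is hypothetical, and the sketch uses the exact normal quantile rather than the tabled 1.645, so the result agrees with the slide's 175.34 only to within rounding.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_cutoff(mu_0, sigma, n, alpha):
    """Cutoff for x_bar: H0: mu = mu_0 is rejected (vs. mu > mu_0) above this value."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # upper-alpha critical value
    return mu_0 + z_alpha * sigma / sqrt(n)

# Billing example: mu_0 = 170, s = 65 used in place of sigma, n = 400
cutoff = right_tail_cutoff(mu_0=170.0, sigma=65.0, n=400, alpha=0.05)
print(f"reject H0 when x_bar exceeds roughly {cutoff:.2f}")
```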
8
Example (cont.) calculating β
– The rejection region with α = .05, expressed directly rather than in standardized terms:
\[ \bar{x} > 175.34 \]
– Specify the alternative value under HA: let it be μ = 180 (rather than just μ > 170).
H0: μ = 170
HA: μ = 180
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180; the "Do not reject H0" region lies to the left of 175.34, with α = .05 to its right under the H0 curve]
9
Example (cont.) calculating β
– A Type II error occurs when a false H0 is not rejected. Suppose μ = 180, that is, H0 is false.
H0: μ = 170
H1: μ = 180
A false H0 is not rejected when x̄ ≤ 175.34.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, with the cutoff 175.34 and α = .05 marked]
10
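Example (cont.) calculating β
\[ \begin{aligned} \beta(180) &= P(\bar{x} \le 175.34 \text{ given that } H_0 \text{ is false}) \\ &= P(\bar{x} \le 175.34 \text{ given that } \mu = 180) \\ &= P\!\left(z \le \frac{175.34 - 180}{65/\sqrt{400}}\right) = .0764 \end{aligned} \]
Power when μ = 180: 1 − β(180) = .9236
H0: μ = 170
H1: μ = 180
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, with the cutoff 175.34 marked]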
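The calculation of β(180) and the power can be checked numerically. This is a minimal sketch with the standard normal in place of t, as the slides do. It uses the exact normal quantile and CDF rather than table lookups, so the last digit may differ from .0764 and .9236.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a = 170.0, 180.0          # null value and the specified alternative
sigma, n, alpha = 65.0, 400, 0.05  # s = 65 is used in place of sigma
std_err = sigma / sqrt(n)

z = NormalDist()
cutoff = mu_0 + z.inv_cdf(1.0 - alpha) * std_err  # reject H0 above this value

# beta = P(x_bar <= cutoff) when the true mean is mu_a
beta = z.cdf((cutoff - mu_a) / std_err)
power = 1.0 - beta

print(f"beta(180) = {beta:.3f}, power = {power:.3f}")
```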
11
Effects on β of changing α
• Increasing the significance level α decreases the value of β, and vice versa.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180; when α2 > α1, β2 < β1]
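The trade-off between α and β can be illustrated with the billing example (μ0 = 170, alternative μ = 180, s = 65, n = 400). In this sketch the function name `beta_right_tail` is hypothetical, and the levels .01 and .10 are illustrative values that do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tail(mu_0, mu_a, sigma, n, alpha):
    """Approximate Type II error probability for a right-tail z test."""
    z = NormalDist()
    shift = (mu_0 - mu_a) / (sigma / sqrt(n))
    return z.cdf(z.inv_cdf(1.0 - alpha) + shift)

alphas = [0.01, 0.05, 0.10]  # 0.01 and 0.10 are illustrative choices
betas = [beta_right_tail(170.0, 180.0, 65.0, 400, a) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f} -> beta = {b:.3f}")
```

Each increase in α moves the cutoff toward 170, so less of the μ = 180 curve falls in the non-rejection region.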
12
Judging the Test
• A hypothesis test is effectively defined by the significance level α and by the sample size n.
• If the probability of a Type II error β is judged to be too large, we can reduce it by
– increasing α, and/or
– increasing the sample size.
13
Judging the Test
• Increasing the sample size reduces β
Recall the rejection region (RR):
\[ z = \frac{\bar{x} - \mu}{s/\sqrt{n}} > z_\alpha, \quad\text{or}\quad \bar{x} > \mu + z_\alpha\,\frac{s}{\sqrt{n}} \]
By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, the cutoff value of x̄ for the rejection region decreases.
14
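Judging the Test
• Increasing the sample size reduces β
Recall the rejection region (RR):
\[ z = \frac{\bar{x} - \mu}{s/\sqrt{n}} > z_\alpha, \quad\text{or}\quad \bar{x} > \mu + z_\alpha\,\frac{s}{\sqrt{n}} \]
Note what happens when n increases: α does not change, but β becomes smaller.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, narrowing as n increases]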
15
Judging the Test
• Increasing the sample size reduces β
• In the example, suppose n increases from 400 to 1000.
\[ \bar{x} > \mu + z_\alpha\,\frac{s}{\sqrt{n}} = 170 + 1.645\,\frac{65}{\sqrt{1000}} = 173.38 \]
\[ \beta = P\!\left(Z \le \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z \le -3.22) \approx 0 \]
• α remains 5%, but the probability of a Type II error decreases dramatically.
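The effect of the sample size can be seen by evaluating β at both values of n from the example. This is a minimal sketch under the same z approximation; `beta_for_n` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def beta_for_n(n, mu_0=170.0, mu_a=180.0, sigma=65.0, alpha=0.05):
    """Approximate beta for the right-tail billing example at sample size n."""
    z = NormalDist()
    std_err = sigma / sqrt(n)
    cutoff = mu_0 + z.inv_cdf(1.0 - alpha) * std_err  # shrinks toward mu_0 as n grows
    return z.cdf((cutoff - mu_a) / std_err)

beta_400 = beta_for_n(400)
beta_1000 = beta_for_n(1000)
print(f"n = 400:  beta = {beta_400:.4f}")
print(f"n = 1000: beta = {beta_1000:.4f}")
```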
16
A Left-Tail Test
• Self-Addressed Stamped Envelopes.
– The chief financial officer at FedEx believes that including a stamped self-addressed (SSA) envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills.
– Currently, customers return their payments in 24 days on average, with a standard deviation of 6 days.
– Stamped self-addressed envelopes are included with the bills for 75 randomly selected customers. The number of days until they return their payment is recorded.
17
A Left-Tail Test: Hypotheses
• The parameter tested is the population mean payment period (μ) for customers who receive self-addressed stamped envelopes with their bill.
• The hypotheses are:
H0: μ = 24
H1: μ < 24
• Use α = .05; n = 75.
18
A Left-Tail Test: Rejection Region
• The rejection region: t < -t.05,74 = -1.666
• Results from the 75 randomly selected customers:
x̄ = 22.95 days, s = 6 days
19
A Left-Tail Test: Test Statistic
• The test statistic is:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{22.95 - 24}{6/\sqrt{75}} = -1.52 \]
Since the rejection region is t < -tα = -t.05,74 = -1.666 and -1.52 is not below -1.666, we do not reject the null hypothesis.
Note that the P-value = P(t74 < -1.52) = .066.
Since our decision is to not reject the null hypothesis, a Type II error is possible.
20
Left-Tail Test: Calculating β, the Probability of a Type II Error
• The CFO thinks that a decrease of one day in the average payment return time will cover the costs of the envelopes since customer checks can be deposited earlier.
• What is β(23), the probability of a Type II error when the true mean payment return time μ is 23 days?
21
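Left-tail test: calculating β (cont.)
– Test statistic:
H0: μ = 24
HA: μ < 24
Choose α = .05
Rejection region in terms of z: z < -z.05 = -1.645
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 24}{6/\sqrt{75}} < -1.645 \quad\Longrightarrow\quad \bar{x} < 24 - 1.645\,\frac{6}{\sqrt{75}} = 22.86 \]
[Figure: sampling distribution of x̄ centered at 24, with area α = 0.05 to the left of 22.86]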
22
Left-tail test: calculating β (cont.)
– The rejection region with α = .05, expressed directly rather than in standardized terms:
\[ \bar{x} < 22.86 \]
– Specify the alternative value under HA: let it be μ = 23 (rather than just μ < 24).
H0: μ = 24
HA: μ = 23
[Figure: sampling distributions of x̄ centered at μ = 23 and μ = 24; the "Do not reject H0" region lies to the right of 22.86, with α = .05 to its left under the H0 curve]
23
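Left-tail test: calculating β (cont.)
\[ \begin{aligned} \beta(23) &= P(\text{fail to reject } H_0: \mu = 24 \text{ when } \mu = 23) \\ &= P(\bar{x} \ge 22.86 \text{ when } \mu = 23) \\ &= P\!\left(z \ge \frac{22.86 - 23}{6/\sqrt{75}}\right) = .58 \end{aligned} \]
Power when μ = 23 is 1 − β(23) = 1 − .58 = .42
H0: μ = 24
HA: μ = 23
[Figure: sampling distributions of x̄ centered at μ = 23 and μ = 24, with the cutoff 22.86 and α = .05 marked]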
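The left-tail calculation mirrors the right-tail one, with the cutoff below μ0 and β taken as the area to the right of it. This is a minimal sketch under the z approximation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a = 24.0, 23.0          # null value and the specified alternative
sigma, n, alpha = 6.0, 75, 0.05
std_err = sigma / sqrt(n)

z = NormalDist()
cutoff = mu_0 - z.inv_cdf(1.0 - alpha) * std_err  # reject H0 below this value

# beta = P(x_bar >= cutoff) when the true mean is mu_a
beta = 1.0 - z.cdf((cutoff - mu_a) / std_err)
power = 1.0 - beta

print(f"cutoff = {cutoff:.2f}, beta(23) = {beta:.2f}, power = {power:.2f}")
```

With only a one-day difference between the two means and n = 75, the two sampling distributions overlap heavily, which is why the power is below one half.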
24
A Two-Tail Test for μ
• The Federal Communications Commission (FCC) wants competition between phone companies. The FCC wants to investigate whether AT&T rates differ from their competitor’s rates.
• According to data from the FCC, the mean monthly long-distance bill for all AT&T residential customers is $17.09.
25
A Two-Tail Test (cont.)
• A random sample of 100 AT&T customers is selected and their bills are recalculated using a leading competitor’s rates.
• The mean and standard deviation of the bills using the competitor’s rates are
x̄ = $17.55, s = $3.87
• Can we infer that there is a difference between AT&T’s bills and the competitor’s bills (on average)?
26
A Two-Tail Test (cont.)
• Is the mean different from 17.09?
H0: μ = 17.09
HA: μ ≠ 17.09
• n = 100; use α = .05
27
A Two-Tail Test (cont.)
Rejection region:
t < -t.025,99 or t > t.025,99
t < -1.9842 or t > 1.9842
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19 \]
[Figure: t99 density with rejection regions of area α/2 = 0.025 in each tail, beyond -tα/2 = -1.9842 and tα/2 = 1.9842]
28
A Two-Tail Test: Conclusion
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19 \]
Since t = 1.19 lies between -1.9842 and 1.9842, H0 is not rejected. There is insufficient evidence to conclude that there is a difference between the bills of AT&T and the competitor.
Also, by the P-value approach:
P-value = P(t < -1.19) + P(t > 1.19) = 2(.1184) = .2368 > .05
A Type II error is possible.
[Figure: t99 density with -1.19, 0, and 1.19 marked inside the critical values -tα/2 = -1.9842 and tα/2 = 1.9842; α/2 = 0.025 in each tail]
29
Two-Tail Test: Calculating β, the Probability of a Type II Error
• The FCC would like to detect a decrease of $1.50 in the average competitor’s bill (17.09 - 1.50 = 15.59).
• What is β(15.59), the probability of a Type II error when the true mean competitor’s bill μ is $15.59?
30
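Two-Tail Test: Calculating β (cont.)
Rejection region: z < -z.025 or z > z.025, that is, z < -1.96 or z > 1.96
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} < -1.96 \quad\Longrightarrow\quad \bar{x} < 17.09 - 1.96\,\frac{3.87}{\sqrt{100}} = 16.33 \]
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} > 1.96 \quad\Longrightarrow\quad \bar{x} > 17.09 + 1.96\,\frac{3.87}{\sqrt{100}} = 17.85 \]
Reject H0 if x̄ < 16.33 or x̄ > 17.85; do not reject H0 if 16.33 ≤ x̄ ≤ 17.85.
[Figure: sampling distribution of x̄ centered at 17.09, with area α/2 = 0.025 in each tail beyond 16.33 and 17.85]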
31
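Two-Tail Test: Calculating β (cont.)
\[ \begin{aligned} \beta(15.59) &= P(16.33 \le \bar{x} \le 17.85 \text{ given that } \mu = 15.59) \\ &= P\!\left(\frac{16.33 - 15.59}{3.87/\sqrt{100}} \le \frac{\bar{x} - 15.59}{3.87/\sqrt{100}} \le \frac{17.85 - 15.59}{3.87/\sqrt{100}}\right) \\ &= P(1.912 \le z \le 5.84) = .028 \end{aligned} \]
H0: μ = 17.09
HA: μ = 15.59
Power when μ = 15.59 is 1 − β(15.59) = .972
[Figure: sampling distributions of x̄ centered at μ = 15.59 and μ = 17.09, with the cutoffs 16.33 and 17.85 marked]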
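For the two-tail case, β is the probability that x̄ lands between the two cutoffs when the true mean is the alternative value. This is a minimal sketch under the z approximation. It uses exact normal quantiles, so the cutoffs and β match the slide values to within rounding. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a = 17.09, 15.59         # null value and the specified alternative
sigma, n, alpha = 3.87, 100, 0.05
std_err = sigma / sqrt(n)

z = NormalDist()
z_half = z.inv_cdf(1.0 - alpha / 2)  # about 1.96
lower = mu_0 - z_half * std_err      # reject H0 below this value
upper = mu_0 + z_half * std_err      # reject H0 above this value

# beta = P(lower <= x_bar <= upper) when the true mean is mu_a
beta = z.cdf((upper - mu_a) / std_err) - z.cdf((lower - mu_a) / std_err)
power = 1.0 - beta

print(f"do not reject H0 for {lower:.2f} <= x_bar <= {upper:.2f}")
print(f"beta(15.59) = {beta:.3f}, power = {power:.3f}")
```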
32
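General formula: Type II Error Probability β(μA) for a Level α Test
HA: μ > μ0
\[ \beta(\mu_A) = P\!\left(z \le z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
HA: μ < μ0
\[ \beta(\mu_A) = 1 - P\!\left(z \le -z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
HA: μ ≠ μ0
\[ \beta(\mu_A) = P\!\left(z \le z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) - P\!\left(z \le -z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]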
33
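The three general formulas for β(μA) can be collected into one function and checked against the three worked examples. This is a sketch only: the function name `type2_error` and the labels "greater", "less", and "two-sided" are choices made for this illustration, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(mu_0, mu_a, sigma, n, alpha, alternative):
    """Approximate beta(mu_a) for a level-alpha z test of H0: mu = mu_0."""
    z = NormalDist()
    shift = (mu_0 - mu_a) / (sigma / sqrt(n))
    if alternative == "greater":      # HA: mu > mu_0
        return z.cdf(z.inv_cdf(1.0 - alpha) + shift)
    if alternative == "less":         # HA: mu < mu_0
        return 1.0 - z.cdf(-z.inv_cdf(1.0 - alpha) + shift)
    if alternative == "two-sided":    # HA: mu != mu_0
        z_half = z.inv_cdf(1.0 - alpha / 2)
        return z.cdf(z_half + shift) - z.cdf(-z_half + shift)
    raise ValueError(f"unknown alternative: {alternative!r}")

# The three worked examples from the slides
beta_billing = type2_error(170.0, 180.0, 65.0, 400, 0.05, "greater")
beta_envelopes = type2_error(24.0, 23.0, 6.0, 75, 0.05, "less")
beta_fcc = type2_error(17.09, 15.59, 3.87, 100, 0.05, "two-sided")

for name, b in [("billing", beta_billing),
                ("envelopes", beta_envelopes),
                ("FCC", beta_fcc)]:
    print(f"{name}: beta = {b:.3f}, power = {1.0 - b:.3f}")
```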
Sample Size n for which a level α test also has β(μA) = β
\[ n = \begin{cases} \left[\dfrac{\sigma\,(z_\alpha + z_\beta)}{\mu_0 - \mu_A}\right]^2 & \text{for a 1-tailed (right or left) test} \\[2ex] \left[\dfrac{\sigma\,(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_A}\right]^2 & \text{for a 2-tailed test (approx. solution)} \end{cases} \]
34
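The sample size formulas can be applied to the earlier examples as follows. This is a sketch under stated assumptions: the target β = .10 is a hypothetical value chosen for illustration (the slides do not specify one), the helper name `sample_size` is made up, and the result is rounded up to the next whole observation.

```python
from math import ceil
from statistics import NormalDist

def sample_size(mu_0, mu_a, sigma, alpha, beta, two_tailed=False):
    """Smallest integer n from the formula for a level-alpha test with beta(mu_a) = beta."""
    z = NormalDist()
    # z_alpha for a 1-tailed test, z_{alpha/2} for a 2-tailed test
    z_alpha = z.inv_cdf(1.0 - (alpha / 2 if two_tailed else alpha))
    z_beta = z.inv_cdf(1.0 - beta)
    n_exact = (sigma * (z_alpha + z_beta) / (mu_0 - mu_a)) ** 2
    return ceil(n_exact)

# Hypothetical target: beta = 0.10 (power 0.90) at the alternatives used earlier
n_billing = sample_size(170.0, 180.0, 65.0, alpha=0.05, beta=0.10)
n_fcc = sample_size(17.09, 15.59, 3.87, alpha=0.05, beta=0.10, two_tailed=True)

print(f"billing example (1-tailed): n = {n_billing}")
print(f"FCC example (2-tailed):     n = {n_fcc}")
```

Because the difference μ0 − μA is squared, the sign of the difference does not matter, so the same 1-tailed formula serves both right-tail and left-tail tests.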
