Introduction to Hypothesis Testing

Chapter 8 Hypothesis Testing for Means: Part 2
Type II Error Probabilities, Required Sample Size for Specified Power
Hypothesis Testing for μ, Type II Error Probabilities (Right-tail example)
• Example
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 400 accounts has a mean of $174 and s = $65.
– Can we conclude that the new system will be cost-effective?
Example (cont.)
• Hypotheses
– The population of interest is the credit accounts at the store.
– We want to know whether the mean account for all customers is greater than $170.
H0: μ = 170
HA: μ > 170
– where μ is the mean account value for all customers.
Example (cont.)
– Test statistic:
H0: μ = 170
HA: μ > 170
x̄ = 174, s = 65, n = 400
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{174 - 170}{65/\sqrt{400}} = 1.23 \]
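The arithmetic behind the test statistic can be checked with a few lines of Python. This is only a sketch: the function name `t_statistic` is a hypothetical helper, not something from the slides, and the inputs are the billing-example values.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)   # estimated std. dev. of the sample mean
    return (xbar - mu0) / standard_error

# Billing example: xbar = 174, hypothesized mean 170, s = 65, n = 400.
t = t_statistic(xbar=174, mu0=170, s=65, n=400)
print(round(t, 2))  # 1.23
```

The standard error here is 65/20 = 3.25, so the sample mean sits 4/3.25 standard errors above the hypothesized mean.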
Example (cont.)
P-value: the probability of observing a value of the test statistic as extreme as or more extreme than t = 1.23, given that μ = 170, is:
P-value = P(t399 > 1.23) = .1097
Since the P-value > .05, we conclude that there is not sufficient evidence to reject H0: μ = 170.
Because H0 is not rejected, a Type II error is possible.
[Figure: t399 distribution centered at 0 with t = 1.23 marked]
Calculating β, the Probability of a Type II Error
• Calculating β for the t test is not at all straightforward and is beyond the level of this course
– The distribution of the test statistic t is quite complicated when H0 is false and HA is true
– However, we can obtain very good approximate values for β using z (the standard normal) in place of t.
Calculating β, the Probability of a Type II Error (cont.)
• We need to
1. Specify an appropriate significance level α;
2. Determine the rejection region in terms of z;
3. Then calculate the probability of not being in the rejection region when μ = μ1, where μ1 is a value of μ that makes HA true.
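The three steps above can be sketched in Python for a right-tail test. This is an illustrative sketch, not the course's own code: the function name `beta_right_tail` is hypothetical, the normal approximation stands in for the t distribution as the slides suggest, and `statistics.NormalDist` from the standard library (Python 3.8+) is assumed.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tail(mu0, mu1, sigma, n, alpha=0.05):
    """Approximate P(Type II error) for H0: mu = mu0 vs HA: mu > mu0, at mu = mu1."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                      # std. dev. of the sample mean
    # Steps 1-2: the significance level fixes the rejection region z > z_alpha.
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # The same region on the scale of the sample mean: xbar > cutoff.
    cutoff = mu0 + z_alpha * se
    # Step 3: probability of NOT landing in the rejection region when mu = mu1.
    return std_normal.cdf((cutoff - mu1) / se)

# Billing example values: mu0 = 170, alternative mu1 = 180, s = 65, n = 400.
beta = beta_right_tail(mu0=170, mu1=180, sigma=65, n=400)
print(f"beta(180) ~ {beta:.4f}, power ~ {1 - beta:.4f}")
```

Because the code keeps full precision in the cutoff and in z, its result can differ in the third or fourth decimal place from a hand calculation that rounds z to two decimals before using a table.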
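Example (cont.): calculating β
– Test statistic:
H0: μ = 170
HA: μ > 170
Choose α = .05
Rejection region in terms of z: z > z.05 = 1.645
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 170}{65/\sqrt{400}} > 1.645 \quad\Longleftrightarrow\quad \bar{x} > 170 + 1.645\,\frac{65}{\sqrt{400}} = 175.34 \]
[Figure: sampling distribution of x̄ centered at 170, with upper-tail area α = 0.05 beyond 175.34]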
Example (cont.): calculating β
– The rejection region with α = .05. Express the rejection region directly, not in standardized terms:
x̄ > 175.34
– Let the alternative value be μ = 180 (rather than just μ > 170). Specify the alternative value under HA.
H0: μ = 170
HA: μ = 180
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180; "Do not reject H0" region to the left of 175.34, α = .05 to its right]
Example (cont.): calculating β
– A Type II error occurs when a false H0 is not rejected. Suppose μ = 180, that is, H0 is false.
– A false H0 (H0: μ = 170, when in fact μ = 180) is not rejected whenever x̄ ≤ 175.34.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, with the cutoff 175.34 and α = .05 marked]
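Example (cont.): calculating β
\[
\begin{aligned}
\beta(180) &= P(\bar{x} \le 175.34 \mid H_0 \text{ is false}) \\
           &= P(\bar{x} \le 175.34 \mid \mu = 180) \\
           &= P\!\left(z \le \frac{175.34 - 180}{65/\sqrt{400}}\right) = P(z \le -1.43) = .0764
\end{aligned}
\]
Power when μ = 180: 1 − β(180) = .9236
[Figure: sampling distributions of x̄ centered at μ = 170 (H0) and μ = 180 (HA), with the cutoff 175.34 marked]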
Effects on β of changing α
• Increasing the significance level α decreases the value of β, and vice versa.
[Figure: sampling distributions centered at μ = 170 and μ = 180; with α2 > α1, β2 < β1]
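The trade-off between α and β can be illustrated numerically. The sketch below reuses the billing-example values (μ0 = 170, alternative 180, s = 65, n = 400) and the normal approximation; the helper name `beta_for_alpha` and the particular α values tried are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
mu0, mu1, sigma, n = 170, 180, 65, 400      # billing-example values
se = sigma / sqrt(n)

def beta_for_alpha(alpha):
    """Right-tail test: beta at mu1 for a given significance level."""
    cutoff = mu0 + Z.inv_cdf(1 - alpha) * se   # reject H0 when xbar > cutoff
    return Z.cdf((cutoff - mu1) / se)

betas = {alpha: beta_for_alpha(alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta ~ {beta:.4f}")
```

A larger α moves the cutoff closer to μ0, which enlarges the rejection region and leaves less probability for a Type II error under the alternative.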
Judging the Test
• A hypothesis test is effectively defined by the significance level α and by the sample size n.
• If the probability of a Type II error β is judged to be too large, we can reduce it by
– increasing α, and/or
– increasing the sample size.
Judging the Test
• Increasing the sample size reduces β
Recall the rejection region (RR):
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha, \quad\text{or}\quad \bar{x} > \mu_0 + z_\alpha\,\frac{\sigma}{\sqrt{n}} \]
By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, the cutoff value of x̄ for the rejection region decreases.
Judging the Test
• Increasing the sample size reduces β
Note what happens when n increases: α does not change, but β becomes smaller.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, with the cutoff x_L marked]
Judging the Test
• Increasing the sample size reduces β
• In the example, suppose n increases from 400 to 1000.
\[ \bar{x} > \mu_0 + z_\alpha\,\frac{\sigma}{\sqrt{n}} = 170 + 1.645\,\frac{65}{\sqrt{1000}} = 173.38 \]
\[ \beta = P\!\left(Z \le \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z \le -3.22) \approx 0 \]
• α remains 5%, but the probability of a Type II error drops dramatically.
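The effect of the sample size can be sketched the same way. The code below compares n = 400 with n = 1000 for the billing example under the normal approximation; the helper name `cutoff_and_beta` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
mu0, mu1, sigma, alpha = 170, 180, 65, 0.05   # billing-example values
z_alpha = Z.inv_cdf(1 - alpha)                # alpha stays fixed at 5%

def cutoff_and_beta(n):
    """Return (rejection cutoff for xbar, beta at mu1) for sample size n."""
    se = sigma / sqrt(n)                      # shrinks as n grows
    cutoff = mu0 + z_alpha * se
    return cutoff, Z.cdf((cutoff - mu1) / se)

results = {n: cutoff_and_beta(n) for n in (400, 1000)}
for n, (cutoff, beta) in results.items():
    print(f"n = {n}: cutoff ~ {cutoff:.2f}, beta ~ {beta:.4f}")
```

Two things happen as n grows: the cutoff moves toward μ0, and the sampling distribution under the alternative tightens around 180, so very little of it remains below the cutoff.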
A Left-Tail Test
• Self-Addressed Stamped Envelopes.
– The chief financial officer at FedEx believes that including a stamped self-addressed (SSA) envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills.
– Currently, customers return their payments in 24 days on average, with a standard deviation of 6 days.
– Stamped self-addressed envelopes are included with the bills for 75 randomly selected customers. The number of days until they return their payment is recorded.
A Left-Tail Test: Hypotheses
• The parameter tested is the population mean payment period (μ) for customers who receive self-addressed stamped envelopes with their bill.
• The hypotheses are:
H0: μ = 24
HA: μ < 24
• Use α = .05; n = 75.
A Left-Tail Test: Rejection Region
• The rejection region:
• t < −t.05,74 = −1.666
• Results from the 75 randomly selected customers:
x̄ = 22.95 days, s = 6 days
A Left-Tail Test: Test Statistic
• The test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{22.95 - 24}{6/\sqrt{75}} = -1.52 \]
Since the rejection region is t < −t_α = −t.05 = −1.666 and −1.52 > −1.666, we do not reject the null hypothesis.
Note that the P-value = P(t74 < −1.52) = .066.
Since our decision is to not reject the null hypothesis, a Type II error is possible.
Left-Tail Test: Calculating β, the Probability of a Type II Error
• The CFO thinks that a decrease of one day in the average payment return time will cover the costs of the envelopes since customer checks can be deposited earlier.
• What is β(23), the probability of a Type II error when the true mean payment return time μ is 23 days?
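Left-tail test: calculating β (cont.)
– Test statistic:
H0: μ = 24
HA: μ < 24
Choose α = .05
Rejection region in terms of z: z < −z.05 = −1.645
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 24}{6/\sqrt{75}} < -1.645 \quad\Longleftrightarrow\quad \bar{x} < 24 - 1.645\,\frac{6}{\sqrt{75}} = 22.86 \]
[Figure: sampling distribution of x̄ centered at 24, with lower-tail area α = 0.05 below 22.86]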
Left-tail test: calculating β (cont.)
– The rejection region with α = .05. Express the rejection region directly, not in standardized terms:
x̄ < 22.86
– Let the alternative value be μ = 23 (rather than just μ < 24). Specify the alternative value under HA.
H0: μ = 24
HA: μ = 23
[Figure: sampling distributions of x̄ centered at μ = 23 and μ = 24; "Do not reject H0" region to the right of 22.86, α = .05 to its left]
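Left-tail test: calculating β (cont.)
\[
\begin{aligned}
\beta(23) &= P(\bar{x} \ge 22.86 \mid H_0 \text{ is false}) \\
          &= P(\bar{x} \ge 22.86 \mid \mu = 23) \\
          &= P\!\left(z \ge \frac{22.86 - 23}{6/\sqrt{75}}\right) = P(z \ge -0.20) = .579
\end{aligned}
\]
Power when μ = 23: 1 − β(23) = .421
[Figure: sampling distributions of x̄ centered at μ = 23 (HA) and μ = 24 (H0), with the cutoff 22.86 and α = .05 marked]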
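For the left-tail case the inequalities flip, which is easy to get wrong by hand. The following sketch recomputes the cutoff and β(23) for the envelope example under the normal approximation; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
mu0, mu1, sigma, n, alpha = 24, 23, 6, 75, 0.05   # envelope-example values
se = sigma / sqrt(n)

# Left-tail test: reject H0 when xbar falls BELOW the cutoff.
cutoff = mu0 - Z.inv_cdf(1 - alpha) * se
# Type II error: xbar lands at or above the cutoff even though mu = mu1.
beta = 1 - Z.cdf((cutoff - mu1) / se)

print(f"cutoff ~ {cutoff:.2f}, beta(23) ~ {beta:.3f}, power ~ {1 - beta:.3f}")
```

The printed β should agree, up to rounding, with P(z ≥ −0.20) read from a standard normal table, since (22.86 − 23)/(6/√75) ≈ −0.20. The alternative 23 lies close to the cutoff, so the test has limited power to detect a one-day decrease with n = 75.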
A Two-Tail Test for μ
• The Federal Communications Commission (FCC) wants competition between phone companies. The FCC wants to investigate whether AT&T rates differ from its competitors’ rates.
• According to data from the FCC, the mean monthly long-distance bill for all AT&T residential customers is $17.09.
A Two - Tail Test (cont.)
• A random sample of 100 AT&T customers is
selected and their bills are recalculated using a
leading competitor’s rates.
• The mean and standard deviation of the bills using the competitor’s rates are
x̄ = $17.55, s = $3.87
• Can we infer that there is a difference between AT&T’s bills and the competitor’s bills (on average)?
A Two-Tail Test (cont.)
• Is the mean different from 17.09?
H0: μ = 17.09
HA: μ ≠ 17.09
• n = 100; use α = .05
A Two-Tail Test (cont.)
Rejection region:
t < −t.025,99 or t > t.025,99
t < −1.9842 or t > 1.9842
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19 \]
[Figure: t99 distribution with rejection regions of area α/2 = 0.025 below −t_{α/2} = −1.9842 and above t_{α/2} = 1.9842]
A Two-Tail Test: Conclusion
Since t = 1.19 does not fall in the rejection region, there is insufficient evidence to conclude that there is a difference between the bills of AT&T and the competitor.
Also, by the P-value approach:
The P-value = P(t < −1.19) + P(t > 1.19) = 2(.1184) = .2368 > .05
A Type II error is possible.
[Figure: t99 distribution with −1.19 and 1.19 marked inside the cutoffs −t_{α/2} = −1.9842 and t_{α/2} = 1.9842; each tail has area α/2 = 0.025]
Two-Tail Test: Calculating β, the Probability of a Type II Error
• The FCC would like to detect a decrease of $1.50 in the average competitor’s bill. (17.09 − 1.50 = 15.59)
• What is β(15.59), the probability of a Type II error when the true mean competitor’s bill μ is $15.59?
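Two-Tail Test: Calculating β (cont.)
Rejection region:
z < −z.025 or z > z.025
z < −1.96 or z > 1.96
Rejection region in terms of x̄:
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} < -1.96 \quad\Longleftrightarrow\quad \bar{x} < 17.09 - 1.96\,\frac{3.87}{\sqrt{100}} = 16.33 \]
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} > 1.96 \quad\Longleftrightarrow\quad \bar{x} > 17.09 + 1.96\,\frac{3.87}{\sqrt{100}} = 17.85 \]
Reject H0 if x̄ < 16.33 or x̄ > 17.85; do not reject H0 if 16.33 ≤ x̄ ≤ 17.85.
[Figure: sampling distribution of x̄ centered at 17.09, with area α/2 = 0.025 in each tail beyond 16.33 and 17.85]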
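Two-Tail Test: Calculating β (cont.)
\[
\begin{aligned}
\beta(15.59) &= P(16.33 \le \bar{x} \le 17.85 \mid \mu = 15.59) \\
             &= P\!\left(\frac{16.33 - 15.59}{3.87/\sqrt{100}} \le \frac{\bar{x} - 15.59}{3.87/\sqrt{100}} \le \frac{17.85 - 15.59}{3.87/\sqrt{100}}\right) \\
             &= P(1.912 \le z \le 5.84) = .028
\end{aligned}
\]
Power when μ = 15.59: 1 − β(15.59) = .972
[Figure: sampling distributions of x̄ centered at μ = 15.59 (HA) and μ = 17.09 (H0), with the cutoffs 16.33 and 17.85 and α = .05 marked]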
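In the two-tail case, β is the probability that x̄ falls between the two cutoffs when the alternative is true. A minimal sketch for the FCC example, again using the normal approximation and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
mu0, mu1, sigma, n, alpha = 17.09, 15.59, 3.87, 100, 0.05   # FCC-example values
se = sigma / sqrt(n)
z_half = Z.inv_cdf(1 - alpha / 2)          # about 1.96 for alpha = .05

# Do not reject H0 when lower <= xbar <= upper.
lower = mu0 - z_half * se
upper = mu0 + z_half * se

# Probability of landing between the cutoffs when the true mean is mu1.
beta = Z.cdf((upper - mu1) / se) - Z.cdf((lower - mu1) / se)

print(f"cutoffs ~ ({lower:.2f}, {upper:.2f}), beta ~ {beta:.3f}, power ~ {1 - beta:.3f}")
```

Almost all of β comes from the lower cutoff here: the upper cutoff is nearly six standard errors above 15.59, so the second term is essentially 1.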
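General formula: Type II Error Probability β(μA) for a Level α Test
HA: μ > μ0
\[ \beta(\mu_A) = P\!\left(z < z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
HA: μ < μ0
\[ \beta(\mu_A) = 1 - P\!\left(z < -z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
HA: μ ≠ μ0
\[ \beta(\mu_A) = P\!\left(z < z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) - P\!\left(z < -z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]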
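The three closed-form expressions can be collected into one function. This is a sketch under the same normal approximation; the function name `type2_error` and its `tail` argument are hypothetical, chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(mu0, mu_a, sigma, n, alpha=0.05, tail="right"):
    """beta(mu_a) for a level-alpha z test of H0: mu = mu0."""
    Z = NormalDist()
    shift = (mu0 - mu_a) / (sigma / sqrt(n))   # (mu0 - muA) / (sigma / sqrt(n))
    if tail == "right":                        # HA: mu > mu0
        return Z.cdf(Z.inv_cdf(1 - alpha) + shift)
    if tail == "left":                         # HA: mu < mu0
        return 1 - Z.cdf(-Z.inv_cdf(1 - alpha) + shift)
    if tail == "two":                          # HA: mu != mu0
        z_half = Z.inv_cdf(1 - alpha / 2)
        return Z.cdf(z_half + shift) - Z.cdf(-z_half + shift)
    raise ValueError("tail must be 'right', 'left', or 'two'")

# The three worked examples from the slides.
print(type2_error(170, 180, 65, 400, tail="right"))       # billing system
print(type2_error(24, 23, 6, 75, tail="left"))            # stamped envelopes
print(type2_error(17.09, 15.59, 3.87, 100, tail="two"))   # FCC long-distance bills
```

These formulas are the cutoff calculations with the algebra already done, so the outputs should match a step-by-step computation through the x̄ cutoffs up to rounding.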
Sample Size n for which a level α test also has β(μA) = β
\[
n =
\begin{cases}
\left[\dfrac{\sigma\,(z_\alpha + z_\beta)}{\mu_0 - \mu_A}\right]^2 & \text{for a 1-tailed (right or left) test} \\[2ex]
\left[\dfrac{\sigma\,(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_A}\right]^2 & \text{for a 2-tailed test (approx. solution)}
\end{cases}
\]
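The sample-size formula translates directly into code. In the sketch below, the function name `required_sample_size` is hypothetical, and rounding n up to the next integer is an added assumption (the slides give only the formula); the target β = .10 in the usage line is likewise an illustrative choice, not a value from the slides.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(mu0, mu_a, sigma, alpha, beta, two_tailed=False):
    """Smallest n for which a level-alpha z test has Type II error about beta at mu_a."""
    Z = NormalDist()
    # Use z_{alpha/2} for a 2-tailed test, z_alpha for a 1-tailed test.
    z_alpha = Z.inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_beta = Z.inv_cdf(1 - beta)
    n = (sigma * (z_alpha + z_beta) / (mu0 - mu_a)) ** 2
    return ceil(n)   # assumption: round up so beta is not exceeded

# Billing example: how many accounts for beta(180) = .10 at alpha = .05?
n_needed = required_sample_size(170, 180, 65, alpha=0.05, beta=0.10)
print(n_needed)
```

As a consistency check, asking for β = .0764 in the billing example returns 400, the sample size for which that β was computed earlier.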