Transcript Document

Chapter 11
Introduction to
Hypothesis Testing
Developing Null and Alternative
Hypotheses
• Hypothesis testing can be used to determine whether a
statement about the value of a population parameter should or
should not be rejected.
• The null hypothesis, denoted by H0, is a tentative assumption
about a population parameter.
• The alternative hypothesis, denoted by H1, is the opposite of
what is stated in the null hypothesis.
• Hypothesis testing is similar to a criminal trial. The hypotheses
are:
H0: The defendant is innocent
H1: The defendant is guilty
A Summary of Forms for Null and
Alternative Hypotheses about a
Population Mean
• The equality part of the hypotheses always appears in the
null hypothesis.
• In general, a hypothesis test about the value of a
population mean μ must take one of the following three
forms (where μ0 is the hypothesized value of the
population mean).
H0: μ ≥ μ0    H0: μ ≤ μ0    H0: μ = μ0
H1: μ < μ0    H1: μ > μ0    H1: μ ≠ μ0
Concepts of Hypothesis Testing
• The critical concepts of hypothesis testing.
– Example:
• An operations manager needs to determine if the mean
demand during lead time is greater than 350.
• If so, changes in the ordering policy are needed.
– There are two hypotheses about a population mean:
• H0: The null hypothesis μ = 350
• H1: The alternative hypothesis μ > 350
Concepts of Hypothesis Testing
• Assume the null hypothesis is true (μ = 350).
– Sample from the demand population, and build a statistic
related to the parameter hypothesized (the sample mean).
– Pose the question: How probable is it to obtain a
sample mean at least as extreme as the one observed
from the sample, if H0 is correct?
Concepts of Hypothesis Testing
• Assume the null hypothesis is true (μ = 350).
– If x̄ = 450: since x̄ is much larger than 350, the mean μ is likely
to be greater than 350. Reject the null hypothesis.
– If x̄ = 355: in this case the mean μ is not likely to be greater than
350. Do not reject the null hypothesis.
Types of Errors
• Two types of errors may occur when deciding whether to
reject H0 based on the statistic value.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
• Example continued
– Type I error: Reject H0 (μ = 350) in favor of H1 (μ > 350)
when the real value of μ is 350.
– Type II error: Believe that H0 is correct (μ = 350) when
the real value of μ is greater than 350.
Type I and Type II Errors
• Since hypothesis tests are based on sample data, we
must allow for the possibility of errors.
• The person conducting the hypothesis test specifies the
maximum allowable probability of making a
Type I error, denoted by α and called the level of
significance.
• Generally, we do not directly control the probability of
making a Type II error, denoted by β.
• Statisticians avoid the risk of making a Type II error by
using “do not reject H0” and not “accept H0”.
Decision Table for Hypothesis Testing
                       Null True           Null False
Fail to reject null    Correct decision    Type II error (β)
Reject null            Type I error (α)    Correct decision (Power)
Controlling the probability of
committing a type I error
• Recall:
– H0: μ = 350 and H1: μ > 350.
– H0 is rejected if x̄ is sufficiently large.
• Thus, a type I error is made if x̄ > critical value
when μ = 350.
• By properly selecting the critical value we can limit the
probability of committing a type I error to an acceptable
level.
[Figure: sampling distribution of x̄ centered at μ = 350, with the critical value marked in the right tail]
Steps in Hypothesis Testing
1. Establish hypotheses: state the null and
alternative hypotheses.
2. Determine the appropriate statistical test and sampling
distribution.
3. Specify the Type I error rate (α).
4. State the decision rule.
5. Gather sample data.
6. Calculate the value of the test statistic.
7. State the statistical conclusion.
8. Make a managerial decision.
Testing the Population Mean When the
Population Standard Deviation is Known
• Example
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than
$170.
– A sample of 400 accounts has a mean of $178.
– If accounts are approximately normally distributed with
σ = $65, can we conclude that the new system will be
cost-effective?
Testing the Population Mean (σ is Known)
• Example 11.1 – Solution
– The population of interest is the credit accounts at
the store.
– We want to know whether the mean account for all
customers is greater than $170.
H1: μ > 170
– The null hypothesis must specify a single value of
the parameter μ:
H0: μ = 170
Approaches to Testing
• There are two approaches to test whether the
sample mean supports the alternative
hypothesis (H1):
– The rejection region method, which is required for
manual testing (and can also be used when testing is
supported by statistical software).
– The p-value method, which is mostly used when
statistical software is available.
The Rejection Region Method
The rejection region is a range of values such
that if the test statistic falls into that range, the
null hypothesis is rejected in favor of the
alternative hypothesis.
The Rejection Region Method
for a Right-Tail Test
Example 11.1 – solution continued
• Define a critical value x̄L for x̄ that is just large enough
to reject the null hypothesis.
• Reject the null hypothesis if
x̄ > x̄L
Determining the Critical Value for the
Rejection Region
• Let the probability of committing a Type I error
be α (also called the significance level).
• Find the value of the sample mean that is just
large enough so that the actual probability of
committing a Type I error does not exceed α.
Determining the Critical Value –
for a Right – Tail Test
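Example – solution continued
[Figure: sampling distribution of x̄ centered at μ = 170, with the critical value x̄L cutting off an area α in the right tail]
P(commit a Type I error) = P(reject H0 given that H0 is true)
= P(x̄ > x̄L given that H0 is true), which is allowed to be α.
Since P(Z > zα) = α, we have:
zα = (x̄L − 170) / (65 / √400)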
Determining the Critical Value –
for a Right – Tail Test
Example – solution continued
α = 0.05
Solving (x̄L − 170) / (65 / √400) = zα for x̄L gives
x̄L = 170 + zα (65 / √400).
If we select α = 0.05, z.05 = 1.645, so
x̄L = 170 + 1.645 (65 / √400) = 175.34.
Determining the Critical value
for a Right - Tail Test
Reject the null hypothesis if
x̄ > 175.34
Conclusion
Since the sample mean (178) is greater than
the critical value of 175.34, there is sufficient
evidence to infer that the mean monthly
balance is greater than $170 at the 5%
significance level.
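The critical-value calculation above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name right_tail_critical_mean is hypothetical, and the standard library's statistics.NormalDist stands in for a z table.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_critical_mean(mu0, sigma, n, alpha):
    """Critical sample mean for a right-tail z test of H0: mu = mu0 (hypothetical helper)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical z value
    return mu0 + z_alpha * sigma / sqrt(n)

# Example 11.1: mu0 = 170, sigma = 65, n = 400, alpha = 0.05
x_crit = right_tail_critical_mean(mu0=170, sigma=65, n=400, alpha=0.05)
reject = 178 > x_crit  # the observed sample mean is 178
print(f"critical value = {x_crit:.3f}, reject H0: {reject}")
```

The computed value may differ from the slide's 175.34 in the second decimal place because the code does not round the critical z value to 1.645 first.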
The standardized test statistic
– Instead of using the statistic x̄, we can use the
standardized value z:
z = (x̄ − μ) / (σ / √n)
– Then, the rejection region for a one-tail test becomes
z > zα
The standardized test statistic
• Example - continued
– We redo this example using the standardized test
statistic.
Recall: H0: μ = 170
H1: μ > 170
– Test statistic:
z = (x̄ − μ) / (σ / √n) = (178 − 170) / (65 / √400) = 2.46
– Rejection region: z > z.05 = 1.645.
The standardized test statistic
• Example - continued
Reject the null hypothesis if
Z > 1.645
Conclusion
Since Z = 2.46 > 1.645, reject the null
hypothesis in favor of the alternative
hypothesis.
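As a rough check of the standardized version of the test, the following sketch recomputes z for Example 11.1 and applies the decision rule. The helper z_statistic is a hypothetical name introduced for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """Standardized test statistic when the population sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

z = z_statistic(x_bar=178, mu0=170, sigma=65, n=400)
z_alpha = NormalDist().inv_cdf(1 - 0.05)  # right-tail critical value at alpha = 0.05
decision = "reject H0" if z > z_alpha else "do not reject H0"
print(f"z = {z:.2f}, critical z = {z_alpha:.3f}, decision: {decision}")
```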
P-value Method
– The p-value provides information about the amount of
statistical evidence that supports the alternative
hypothesis.
– The p-value of a test is the probability of observing a
test statistic at least as extreme as the one computed,
given that the null hypothesis is true.
– Let us demonstrate the concept on Example 11.1.
P-value Method
The probability of observing a
test statistic at least as extreme as 178,
given that μ = 170, is
P(x̄ ≥ 178 when μ = 170)
= P(z ≥ (178 − 170) / (65 / √400))
= P(z ≥ 2.4615) = .0069
[Figure: sampling distribution of x̄ centered at μ = 170; the p-value is the area to the right of x̄ = 178]
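The p-value for Example 11.1 can be reproduced with a short Python sketch. It assumes the standard library's statistics.NormalDist as a substitute for the normal table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 178, 170, 65, 400
z = (x_bar - mu0) / (sigma / sqrt(n))

# Right-tail p-value: P(Z >= z) when H0 is true
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

A p-value this small is below α = 0.05, which agrees with the rejection region result.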
Example 1
• Calculate the value of the test statistic, set up the
rejection region, determine the p-value, interpret
the result, and draw the sampling distribution.
H0: μ = 15
H1: μ < 15
σ = 2, n = 25, x̄ = 14.3, α = 0.1
Solution
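How to look up standard normal probabilities in Excel
• Standard normal distribution
-> Use Insert > fx Function > select category (Statistical) > NORMSDIST
to get the cumulative probability of the standard normal distribution
-> Use Insert > fx Function > select category (Statistical) > NORMSINV
to find the z value for a given cumulative probability under the standard normal distribution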
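For readers working outside Excel, the same two lookups can be sketched in Python. This is an assumed equivalent using the standard library, not something the slides prescribe: NormalDist().cdf plays the role of NORMSDIST and NormalDist().inv_cdf the role of NORMSINV.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Like NORMSDIST(1.96): cumulative probability P(Z <= 1.96)
cumulative = std_normal.cdf(1.96)

# Like NORMSINV(0.95): the z value with cumulative probability 0.95
z_value = std_normal.inv_cdf(0.95)

print(f"P(Z <= 1.96) = {cumulative:.4f}")
print(f"z with cumulative probability 0.95 = {z_value:.3f}")
```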
Example 2
• A random sample of 18 young adult men (20-30 years
old) was drawn. Each person was asked how many
minutes of sports they watched on television daily. The
responses are listed here. It is known that σ = 12. Test to
determine at the 5% significance level whether there is
enough statistical evidence to infer that the mean amount
of television watched daily by all young adult men is
greater than 50 minutes.
55 60 65 74 66 37 45 68 64
65 58 55 52 63 59 57 74 65
Solution
z = (x̄ − μ) / (σ / √n) = (60.11 − 50) / (12 / √18) = 3.57
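The test statistic for Example 2 can be recomputed directly from the listed responses. The sketch below assumes a right-tail test of H0: μ = 50 against H1: μ > 50, as the example wording suggests; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

minutes = [55, 60, 65, 74, 66, 37, 45, 68, 64,
           65, 58, 55, 52, 63, 59, 57, 74, 65]
mu0, sigma, alpha = 50, 12, 0.05

x_bar = fmean(minutes)                        # sample mean of the 18 responses
z = (x_bar - mu0) / (sigma / sqrt(len(minutes)))
z_alpha = NormalDist().inv_cdf(1 - alpha)     # right-tail critical value
p_value = 1 - NormalDist().cdf(z)

print(f"x_bar = {x_bar:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if z > z_alpha else "do not reject H0")
```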
A Two - Tail Test
• Example 11.2
– AT&T has been challenged by competitors who
argued that their rates resulted in lower bills.
– A statistics practitioner determines that the mean
and standard deviation of monthly long-distance bills
for all AT&T residential customers are $17.09 and
$3.87 respectively.
A Two - Tail Test
• Example 11.2 - continued
– A random sample of 100 customers is selected and
customers’ bills recalculated using a leading
competitor’s rates (see Xm11-02).
– Assuming the standard deviation is the same (3.87),
can we infer that there is a difference between
AT&T’s bills and the competitor’s bills (on the
average)?
A Two - Tail Test
• Solution
– Is the mean different from 17.09?
H0: μ = 17.09
H1: μ ≠ 17.09
– Define the rejection region
z < −zα/2 or z > zα/2
A Two – Tail Test
Solution - continued
[Figure: sampling distribution of x̄ centered at 17.09, with an area of α/2 = 0.025 in each tail]
If H0 is true (μ = 17.09), x̄ can still fall far
above or far below 17.09, in which case
we erroneously reject H0 in favor of H1
(μ ≠ 17.09).
We want this erroneous
rejection of H0 to be a
rare event, say 5%
chance.
A Two – Tail Test
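Solution - continued
From the sample we have: x̄ = 17.55
z = (x̄ − μ) / (σ / √n) = (17.55 − 17.09) / (3.87 / √100) = 1.19
Rejection region: z < −zα/2 = −1.96 or z > zα/2 = 1.96
[Figure: sampling distribution of x̄ centered at 17.09 with x̄ = 17.55 marked, and the matching standard normal curve with α/2 = 0.025 in each tail]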
A Two – Tail Test
There is insufficient evidence to infer that there is a
difference between the bills of AT&T and the competitor.
Also, by the p-value approach:
The p-value = P(Z < −1.19) + P(Z > 1.19)
= 2(.1173) = .2346 > .05
where z = (x̄ − μ) / (σ / √n) = (17.55 − 17.09) / (3.87 / √100) = 1.19.
[Figure: standard normal curve with z = ±1.19 lying inside the critical values −zα/2 = −1.96 and zα/2 = 1.96; α/2 = 0.025 in each tail]
Example 3
Solution
Calculation of the Probability of a
Type II Error
• To calculate the probability of a Type II error we need to…
– express the rejection region directly, in terms of the
parameter hypothesized (not standardized).
– specify the alternative value under H1.
• Let us revisit Example 11.1
Calculation of the Probability of a
Type II Error
• Let us revisit Example 11.1
– Express the rejection region directly, not in standardized terms:
the rejection region was x̄ > 175.34 with α = .05.
– Specify the alternative value under H1:
let the alternative value be μ = 180 (rather than just μ > 170).
H0: μ = 170
H1: μ = 180
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180; values below x̄L = 175.34 fall in the "do not reject H0" region, and the area above x̄L under the H0 curve is α = .05]
Calculation of the Probability of a
Type II Error
– A Type II error occurs when a false H0 is not
rejected.
H0:  = 170
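A false H0 (H0: μ = 170, when in fact H1: μ = 180 holds) is not rejected when x̄ < 175.34.
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180, with x̄L = 175.34 and α = .05 marked]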
Calculation of the Probability of a
Type II Error
β = P(x̄ < 175.34 given that H0 is false)
= P(x̄ < 175.34 given that μ = 180)
= P(z < (175.34 − 180) / (65 / √400)) = .0764
[Figure: sampling distributions under H0: μ = 170 and H1: μ = 180; β is the area under the H1 curve to the left of x̄L = 175.34]
Example 4
Solution
Example 5
Solution
β = P( (48.04 − 52) / (10 / √100) < (x̄ − μ) / (σ / √n) < (51.96 − 52) / (10 / √100) )
= P(−3.96 < z < −0.04)
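The probability in this solution can be evaluated with a short sketch. It uses only the numbers that survive in the solution (bounds 48.04 and 51.96, true mean 52, σ = 10, n = 100) and assumes they describe a two-sided "do not reject" region; the problem statement itself is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

lower, upper = 48.04, 51.96      # bounds of the assumed "do not reject H0" region
mu_true, sigma, n = 52, 10, 100

# Sampling distribution of x_bar when the true mean is 52
sampling_dist = NormalDist(mu=mu_true, sigma=sigma / sqrt(n))

# beta = P(48.04 < x_bar < 51.96 given mu = 52)
beta = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)
print(f"beta = {beta:.4f}")
```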
Effects on β of changing α
• Decreasing the significance level α increases
the value of β, and vice versa.
[Figure: sampling distributions centered at μ = 170 and μ = 180; moving the critical value to the right gives α2 < α1 and β2 > β1]
Judging the Test
• A hypothesis test is effectively defined by the
significance level α and by the sample size n.
• If the probability of a Type II error β is judged to be
too large, we can reduce it by
– increasing α, and/or
– increasing the sample size.
Judging the Test
• Increasing the sample size reduces β.
Recall: zα = (x̄L − μ) / (σ / √n), thus x̄L = μ + zα σ / √n.
By increasing the sample size the
standard deviation of the sampling
distribution of the mean decreases.
Thus, x̄L decreases.
Judging the Test
• Increasing the sample size reduces β.
Note what happens when n increases:
α does not change,
but β becomes smaller.
[Figure: sampling distributions centered at μ = 170 and μ = 180 narrowing as n increases, with x̄L moving left toward 170]
Judging the Test
• Power of a test
– The power of a test is defined as 1 − β.
– It represents the probability of rejecting the null
hypothesis when it is false.
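Power can be computed as 1 − β for several sample sizes to see how it responds to n. The sketch below reuses the Example 11.1 setting (μ0 = 170, alternative μ = 180, σ = 65, α = 0.05) as an assumption; the function power and the chosen sample sizes are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power(n, alpha=0.05, mu0=170, mu_alt=180, sigma=65):
    """Power (1 - beta) of a right-tail z test when the true mean is mu_alt."""
    se = sigma / sqrt(n)
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    beta = NormalDist(mu_alt, se).cdf(x_crit)
    return 1 - beta

sample_sizes = (100, 400, 1000)
powers = [power(n) for n in sample_sizes]
for n, pw in zip(sample_sizes, powers):
    print(f"n = {n:4d}  power = {pw:.4f}")
```

With α held fixed, larger samples give higher power, so the test is more likely to detect a false null hypothesis.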
Example 6
Solution