Chapter 9: Hypothesis Testing
9.1 Introduction to Hypothesis Testing
Hypothesis testing is a tool you use to make decisions
from data. The usual procedure is:
1. You make a statement about something.
2. You collect sample data relating to the statement.
3. If the sample outcome would be unlikely given that
the statement is true, you conclude that the statement
is probably not true.
For example, assume you want to know if a particular
coin is fair. Your hypothesis is that it is a fair coin. You
toss the coin 20 times and get heads 18 times. Since
that is an unlikely outcome given that it is a fair coin,
you reject the hypothesis that it is a fair coin.
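The coin example can be checked directly. The following is a
minimal sketch, assuming the tosses are independent and each has
probability 0.5 of heads; the function name prob_at_least is made
up for this illustration. It computes the chance that a fair coin
gives 18 or more heads in 20 tosses.

```python
from math import comb

def prob_at_least(k_min, n, p=0.5):
    """Probability of k_min or more successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Chance of 18, 19, or 20 heads in 20 tosses of a fair coin
p_18_or_more = prob_at_least(18, 20)
print(f"P(18 or more heads in 20 tosses) = {p_18_or_more:.6f}")
```

The result is roughly 2 in 10,000, which is why 18 heads counts as
an unlikely outcome for a fair coin.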
9.2 Steps of Hypothesis Testing
1. Specify the null hypothesis (H0) and the
alternative hypothesis (H1).
2. What level of significance (α)?
3. Which test and test statistic?
4. State the decision rule.
5. Use the sample data to calculate the test statistic.
6. Use the test statistic to make a decision.
7. Interpret the decision in the context of the
original question.
Assume that a teacher has a class of 28 students.
She wants to use IQ scores to demonstrate that her
students are “above average.” The IQ scores are
standardized to have a population mean of 100 and a
standard deviation of 16.
Step 1: Specify the Null Hypothesis and the Alternative Hypothesis
The null hypothesis, H0, is the statement we are interested
in testing. The word null implies “nothing” or “nonexistent.”
It indicates what would happen by chance or what would
happen if there was no difference or no treatment effect.
The alternative hypothesis, H1, is the statement that we
accept if our sample outcome leads us to reject the null
hypothesis.
For the classroom example, the statement is as follows:
H0: μ = 100 (the students have an average IQ)
H1: μ > 100 (the students have an above-average IQ)
The null hypothesis always includes the equal condition.
The teacher is interested in the condition that the
students are above average, so this condition goes in the
alternative hypothesis.
The hypotheses are written with a population
parameter on the left and a numeric value on the right:
H0: μ = μ0
H1: μ > μ0
μ0 = the hypothesized value, 100 in this example.
Step 2: What Level of Significance?
The level of significance, α, is the probability of rejecting the
null hypothesis when it is actually true. (What does α = 0.05
mean?) Such a rejection results from sampling error. Occasionally
we get a sample, just by chance, that leads us to
reject the null hypothesis. It is possible to get 18 out of 20
heads with a perfectly fair coin, but the probability is very
low. The level of significance is our definition of
unlikely. The traditional definition of unlikely is 5% of the
time or less. (In other words, α is the probability of happening
to draw a sample extreme enough to overturn a correct H0.)
What significance level should you use?
If you want to be more certain that you are not falsely rejecting
the null hypothesis, you can reduce the significance level to 0.01
or even lower.
When in doubt, use the standard 5% level.
Type I and Type II errors
The significance level is also the probability of a type
I error. A type I error occurs when you falsely reject the null
hypothesis on the basis of sampling error. A type II error
occurs when you fail to reject the null hypothesis when it
is false.
Step 3: Which Test and Test Statistic?
The test statistic is the value calculated from the
sample to determine whether to reject the null hypothesis.
If we can assume a normal sampling distribution of means,
we can calculate a z-value for a sampling distribution.
For our IQ problem, a sample of 28 is enough for the
central limit theorem to be valid, especially since we have
reason to believe that the population distribution of IQs is
close to normal.
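For our test of a mean vs. a hypothesized value, a z-test,
the test statistic is
    z = (X̄ - μ0) / (σ / √n)
where X̄ is the sample mean, μ0 is the hypothesized value,
σ is the population standard deviation, and n is the sample size.
To calculate this test statistic, we need to know the population
standard deviation, σ. In this case, we know σ = 16.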
Step 4: State the Decision Rule
We reject the null hypothesis if the test statistic is larger than
a critical value corresponding to the significance level in step
2. For α = 0.05, the z-value corresponding to 0.05 in the
upper tail of the normal curve is 1.645. The decision rule is
Reject H0 if z > 1.645
(1) We were testing whether the mean IQ score is greater than
100, a one-tailed test.
We can also test:
(2) Whether the mean is less than 100, a one-tailed test.
(3) Whether the mean is either greater than or less than 100,
a two-tailed test.
Critical values for α = 0.05:
5% one-tail upper: z = 1.645
5% one-tail lower: z = -1.645
5% two-tail (2.5% in each tail): z = -1.96 and z = 1.96
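The critical values above do not have to come from a table. The
following is a minimal sketch using Python's standard library; the
function names and the tail labels "upper", "lower", and "two" are
assumptions made for this illustration, not part of the chapter.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the critical z-value(s) for significance level alpha.

    tail is "upper", "lower", or "two" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between tails
        return (-upper, upper)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

def reject_h0(z_value, alpha, tail):
    """Apply the decision rule for the chosen kind of test."""
    crit = critical_values(alpha, tail)
    if tail == "upper":
        return z_value > crit[0]
    if tail == "lower":
        return z_value < crit[0]
    return z_value < crit[0] or z_value > crit[1]

for tail in ("upper", "lower", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

Rounded to three decimals, these agree with the table values
1.645, -1.645, and ±1.96.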
Step 5: Use the Sample Data to Calculate the Test Statistic
Assume that the mean IQ of the students is 105.6. Our test
statistic is
    z = (X̄ - μ0) / (σ / √n) = (105.6 - 100) / (16 / √28) = 1.85
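The same calculation can be sketched in a few lines of Python. The
function name z_statistic and its parameter names are chosen for
this illustration; the numbers are the ones from the classroom
example.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z-value for a test of a sample mean against a hypothesized value."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu0) / standard_error

# Classroom example: 28 students, sample mean 105.6, sigma 16, mu0 100
z = z_statistic(sample_mean=105.6, mu0=100, sigma=16, n=28)
print(f"z = {z:.2f}")
```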
Step 6: Use the Test Statistic to Make a Decision
We see that our z-value of 1.85 is greater than the critical
value of 1.645, and so we reject the null hypothesis.
Step 7: Interpret the Decision in the Context of the Original Question
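To say that a result is “statistically significant” means that
it is more than we would expect from chance alone. For the
classroom example, rejecting H0 means the sample supports the
teacher's claim that her students are above average.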
The Concept of a p-value
In our example, the probability of a z-value greater than 1.85
is 0.0322. This is called the p-value of the test. The p-value is
the probability of getting a sample result at least this extreme
by chance alone if the null hypothesis is actually true. In our
case, the p-value is smaller than the level of significance.
The decision rule can also be
Reject H0 if the p-value is less than α
where α can be any significance level.
Note: For a two-tail test, you double the one-tail p-value before
comparing it to α.
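The p-value for the example can be reproduced without a table. This
is a minimal sketch using the standard normal distribution from
Python's standard library; the function name p_value and the tail
labels are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value(z_value, tail="upper"):
    """p-value of a z-test for an upper-, lower-, or two-tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return 1 - std_normal.cdf(z_value)
    if tail == "lower":
        return std_normal.cdf(z_value)
    if tail == "two":
        # Double the one-tail probability of the more extreme side
        return 2 * (1 - std_normal.cdf(abs(z_value)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

alpha = 0.05
p_one_tail = p_value(1.85, "upper")
p_two_tail = p_value(1.85, "two")
print(f"one-tail p = {p_one_tail:.4f}, reject H0: {p_one_tail < alpha}")
print(f"two-tail p = {p_two_tail:.4f}, reject H0: {p_two_tail < alpha}")
```

With z = 1.85, the one-tail p-value is below 0.05 but the doubled
two-tail p-value is not, so the same data would not reject H0 in a
two-tailed test at the 5% level.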
Test Statistic versus p-value
Both methods require the calculation of a test statistic. The
test statistic approach compares the value of the calculated
test statistic to a critical value from a table; the p-value
approach calculates the probability of a test statistic at least
as extreme as the one observed and compares it to the
significance level, α.
If you are using statistical software, you will get a p-value
and you do not need to look up a critical value in a table.
With the p-value approach, instead of just rejecting the null
hypothesis (or not rejecting it), you will get a sense of how
significant (or not significant) the results are. For example,
if a test has a p-value of 0.000035, it is very unlikely to have
happened by chance alone.
Learning Activity 9.1-2 Hypothesis Test Calculations
Use Excel to calculate the z-value in the text.
Determine the p-value by looking it up in Tables.xls.
Replicate the z-value and p-value by using
MegaStat | Hypothesis Tests | Mean vs. Hypothesized Value
Click “Summary Data” and select B3:B6 as the input
Use MegaStat | Probability | Normal Distribution to replicate
the figures below.
9.A Hypothesis Testing Simulation (Normal random numbers)
This simulates the example we looked at in this chapter.
We rarely reject H0 when the population mean is 100.
When we do reject H0, it is an example of a type I error
(rejecting based on a lucky sample).
You can change the population mean to 110 and see that the
null hypothesis is rejected most of the time. When it is not
rejected, it is a type II error (failing to reject when you should
have, since the population mean is greater than 100).
Our example is
H0: μ = μ0
H1: μ > μ0
μ0 = the hypothesized value, 100 in this example.
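The simulation described above can be approximated in Python. This
is only a sketch, not the course workbook: it assumes IQ scores are
normally distributed with standard deviation 16, draws repeated
samples of 28, and applies the rule "Reject H0 if z > 1.645". The
function name rejection_rate, the number of trials, and the seed
are arbitrary choices made for this illustration.

```python
import random
from math import sqrt

def rejection_rate(pop_mean, trials=2000, n=28, mu0=100, sigma=16,
                   z_crit=1.645, seed=1):
    """Fraction of simulated samples for which H0: mu = mu0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(pop_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if z > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type I error
rate_h0_true = rejection_rate(pop_mean=100)
# H0 is false: every failure to reject is a type II error
rate_h0_false = rejection_rate(pop_mean=110)

print(f"Population mean 100: rejected in {rate_h0_true:.1%} of samples")
print(f"Population mean 110: rejected in {rate_h0_false:.1%} of samples")
```

When the population mean is 100, the rejection rate should come
out near the 5% significance level. When it is 110, most samples
reject H0, and the remainder are type II errors.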