Statistical Analysis – Chapter 6
“Hypothesis Testing”
Dr. Roderick Graham
Fashion Institute of Technology
Basic Concepts of Hypothesis Testing
What do we mean by a hypothesis?
A “proposed” explanation for a phenomenon.
In statistics, usually a hypothesis centers around explaining
sample means.
We usually hypothesize that a particular sample collected
is or is not like the population from which it is drawn.
Why the Central Limit Theorem is
important for hypothesis testing…
Remember that the central limit theorem states that:
With a sample size over 30, the means of repeated samples are
approximately normally distributed
And these sample means are centered on the mean of the population,
with a standard deviation of σ / √n
This means that with any sample, we can EXPECT that
the mean of this sample will be close to the mean of the population
from which it is drawn
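A small simulation can make this concrete. The sketch below is only an illustration: the population is made up (uniform scores between 1 and 5, which is clearly not normal), and the sample size of 38 and the number of samples are arbitrary choices. It draws many samples and checks that the sample means center on the population mean with a spread close to σ / √n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, clearly non-normal population: uniform scores between 1 and 5.
population = [random.uniform(1, 5) for _ in range(100_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 38             # size of each sample (arbitrary choice, over 30)
n_samples = 5_000  # how many samples we draw

# Mean of each sample, drawn without replacement from the population.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = sigma / n ** 0.5  # what the central limit theorem predicts

print(f"population mean      = {mu:.3f}")
print(f"mean of sample means = {mean_of_means:.3f}")
print(f"sd of sample means   = {sd_of_means:.3f}")
print(f"sigma / sqrt(n)      = {expected_se:.3f}")
```

Any single sample mean can still miss the population mean; it is the collection of sample means that behaves predictably.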
Conceptual Background for Hypothesis
Testing
But when we take a sample whose mean is far from the
population mean, it may be an indication that this sample is
not like the population.
But how can we be sure?
We can never be exactly sure…but we can use the
normal table to tell us the chances that a sample value is
like or unlike the population from which it is drawn.
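The normal table lookup can also be done in code. This is a minimal sketch using Python's standard-library `NormalDist`; the helper names `area_within` and `area_beyond` are made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(z):
    """Share of the normal curve that lies between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


def area_beyond(z):
    """Share of the normal curve that lies outside -z to +z (both tails)."""
    return 1 - area_within(z)


print(f"area within +/-1.96: {area_within(1.96):.4f}")
print(f"area beyond +/-1.96: {area_beyond(1.96):.4f}")
```

The area beyond a z-score is the chance of seeing a sample value at least that far from the mean when the sample really is like the population.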
Basics of Hypothesis Testing (using a
class example)
At the beginning of one of our
classes, you completed a small
survey about the types of music
you like. This survey is turned
into a scale. Social scientists use
this survey as a measure of
cultural openness. Americans
score around 3.21 on this scale,
with a standard deviation of .95.
We will test to see if the sample
we drew from this class is
significantly different than the
population.
Step 1
All statistical tests start with an assumption. For this
example, the assumption is that the mean of the population,
µ, is 3.21, and the standard deviation of the population, σ, is
.95. The population is Americans. We also assume that our
sample comes from the American population. Thus…
$\mu = \bar{X}$
In other words, because of the central limit theorem, we
expect the mean of the sample to be close to the mean of the
population. So any sample we take should have a mean of
approximately 3.21.
Step 2
We will set a “cut-off” for accepting this assumption. Our
cut-off will be the middle 95% of the normal distribution area
(z = +/- 1.96). If we calculate a z-score for our sample, and it
falls outside of this range, we reject our assumption that
the sample is the same as the population.
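The +/- 1.96 cut-off comes from the normal table, and it can be recovered in code. A minimal sketch, assuming a two-tailed cut-off; the function name `critical_z` is made up for illustration.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-tailed z cut-off that keeps `confidence` of the area in the middle."""
    alpha = 1 - confidence  # area left over for the two tails combined
    return NormalDist().inv_cdf(1 - alpha / 2)


print(f"95% cut-off: +/-{critical_z(0.95):.2f}")
print(f"99% cut-off: +/-{critical_z(0.99):.2f}")
```

A stricter criterion, such as 99%, pushes the cut-off further from zero, so a sample has to be more extreme before we reject the assumption.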
Step 2 (cont’d)
Let’s make the necessary calculations. We need a mean and
a standard deviation in order to use this formula:
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
1. The mean of the population, µ, is 3.21. The population
standard deviation, σ, is .95.
2. What is the mean of our sample? This is $\bar{X}$. The mean is
3.50.
3. We are testing a sample mean, not a single score, so we cannot
use the population standard deviation directly. Thus, we
need to know the N of the sample in order to compute
the standard deviation of the sample mean (the standard error).
The N = 38. We compute it using this: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
Step 2 (cont’d)
Population                     Sample (FIT Students)
µ = 3.21                       N = 38
σ = .95                        $\bar{X}$ = 3.50
z-score (test) = +/-1.96       $\sigma_{\bar{X}}$ = .154
                               z-score (sample) = 1.88
Equations used…
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
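The two calculations for the class example can be checked in a few lines. This is just a sketch of the arithmetic, using the numbers from the slides; the variable names are chosen for illustration.

```python
import math

mu = 3.21      # population mean
sigma = 0.95   # population standard deviation
n = 38         # size of the FIT sample
x_bar = 3.50   # mean of the FIT sample

# Standard deviation of the sample mean (standard error): sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# z-score of the sample mean: (x_bar - mu) / standard error
z = (x_bar - mu) / standard_error

print(f"standard error = {standard_error:.3f}")
print(f"z-score        = {z:.2f}")
```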
Step 3
Now we evaluate our findings.
We decided that any z-score beyond +/- 1.96 (below -1.96 or above
+1.96) is not like the population.
The z-score for our sample is 1.88, which falls inside the cut-off.
So, do we reject the initial assumption?
NO. We do not reject the assumption. The FIT sample is not
significantly different from the American population at the 95% level.
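The decision rule can be written as a short function. This is a sketch, not a full testing routine: the function names are made up, the cut-off defaults to the 1.96 used in this example, and the second helper simply reports the chance of a z-score at least this far from zero in either direction.

```python
from statistics import NormalDist


def z_test_decision(z, cutoff=1.96):
    """Apply the cut-off rule: reject only if z falls outside +/- cutoff."""
    return "reject" if abs(z) > cutoff else "do not reject"


def two_tailed_chance(z):
    """Chance of a z-score at least this far from 0, in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z_sample = 1.88  # z-score of the FIT sample from Step 2
print(z_test_decision(z_sample))
print(f"chance of a z-score this extreme: {two_tailed_chance(z_sample):.3f}")
```

Because 1.88 is close to 1.96, the chance is only slightly above .05. The sample narrowly misses the cut-off, which is why "not significantly different" is a safer summary than "exactly the same".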
Errors with hypothesis testing
Type I error – you have rejected an assumption when
you should not have. We call this “the α risk” (alpha risk).
The alpha is set by the cut-off at which we reject or accept
an assumption. For example, we set our cut-off
for the FIT sample at 95% (+/- 1.96).
For the FIT example, the α is .05 (1 - .95).
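One way to see what α = .05 means is to simulate it. The sketch below assumes, purely for illustration, that individual scores are normally distributed with µ = 3.21 and σ = .95. Every simulated sample really does come from the population, so every rejection is a Type I error.

```python
import math
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable

mu, sigma, n = 3.21, 0.95, 38
cutoff = 1.96
standard_error = sigma / math.sqrt(n)

trials = 10_000
rejections = 0
for _ in range(trials):
    # Assumption for illustration: scores are normal with mean mu, sd sigma.
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    z = (fmean(sample) - mu) / standard_error
    if abs(z) > cutoff:
        rejections += 1  # rejected even though the assumption is true

type_i_rate = rejections / trials
print(f"share of true assumptions rejected: {type_i_rate:.3f}")
```

The share of rejections should land near .05, which is the α risk we accepted when we chose the 95% cut-off.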
Type II error – You accept the assumption when you
shouldn’t have. We call this “the β risk” (beta risk). In
other words, the sample is really different from the
population, but we did not identify it.
Power = 100% - β, or 1 - β
Power is the probability of correctly rejecting the assumption
when the sample really is different, that is, of avoiding Type II error.
Let’s go over a problem…(p. 170)
NIH agreed to supply immunizations for viruses. A
process is set up to fill test tubes to an average of 9.00 ml,
with a standard deviation of .35 ml. Now let’s say we
took a sample of 49 test tubes, and 99% of all sample
averages fall between 8.87 ml and 9.13 ml. We will use
this 99% criterion to accept µ = 9.00 ml.
a. What is the probability of Type I error?
b. What is the probability of Type II error if the process
shifts to µ = 9.20 ml?
c. What is the power of the test in part b?
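One way to work parts a to c is sketched below. It assumes the sample averages are normally distributed with standard deviation σ / √n, and it treats 8.87 ml to 9.13 ml as the acceptance range. The helper name `prob_accept` is made up for illustration, and this may not match the textbook's presentation step for step.

```python
import math
from statistics import NormalDist

mu0 = 9.00       # mean the process is set to (ml)
sigma = 0.35     # process standard deviation (ml)
n = 49           # number of test tubes in the sample
lower, upper = 8.87, 9.13  # acceptance range for the sample average (ml)

standard_error = sigma / math.sqrt(n)  # 0.35 / 7 = 0.05 ml


def prob_accept(true_mu):
    """Chance the sample average lands inside the acceptance range."""
    dist = NormalDist(true_mu, standard_error)
    return dist.cdf(upper) - dist.cdf(lower)


alpha = 1 - prob_accept(mu0)  # a. reject although mu really is 9.00
beta = prob_accept(9.20)      # b. accept although mu has shifted to 9.20
power = 1 - beta              # c. chance of catching the shift

print(f"a. Type I error (alpha) = {alpha:.3f}")
print(f"b. Type II error (beta) = {beta:.3f}")
print(f"c. power                = {power:.3f}")
```

The computed α comes out just under the nominal .01 because the limits 8.87 and 9.13 are rounded to two decimals. With the shift to 9.20 ml, β is about .08, so the power is about .92.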
Sample Questions for Review
Questions 6.6 and Questions 6.8 (work in groups if you
like)
We may finish these in class….if not…answers will be
posted on the website
END