Hypothesis testing - Lyndhurst Schools
Chapter 12:
Hypothesis Testing
• Remember that our ultimate goal is to take
information obtained in a sample and use it to
come to some conclusion about the population.
• To determine whether or not we can reach a
certain conclusion about the population, we need
to perform a hypothesis test.
• The purpose of hypothesis testing is to determine
whether or not a claim about the true value of the
population characteristic is valid.
• Hypothesis testing is a process for evaluating
hypotheses about population parameters using
sample statistics.
Writing Hypotheses
• The null hypothesis describes our initial belief
about a population parameter.
• The null hypothesis must contain a condition
of equality (no difference in values).
• For example, let’s say that a candy company believes
each of their chocolate bars contains an average of 70
grams of peanuts. Their null hypothesis would be:
$H_0: \mu = 70$
• The alternative hypothesis describes what we
want to establish or what we suspect is true.
• Sticking with the same chocolate bar example, let’s say we took
five samples of chocolate bars, and found the mean amount of
peanuts to be 40 grams, 51 grams, 35 grams, 55 grams, and 46
grams. Now we start to suspect the company is lying and the
mean amount of peanuts is less than 70 grams. Here’s how we
would write both our null and alternative:
$H_0: \mu = 70$
$H_a: \mu < 70$
Example: Write a null and alternative hypothesis for
each situation.
a) Scores on a psychological survey measuring students’
attitudes toward school range from 0 to 200. The
mean score for US college students is about 115. A
teacher suspects that older students have better
attitudes toward school.
$H_0: \mu = 115$
$H_a: \mu > 115$
b) Buddy read a newspaper report claiming that 12% of all
adults in the US are left-handed. He believes that the
proportion of lefties at Seton Hall is not equal to 12%.
$H_0: p = 0.12$
$H_a: p \neq 0.12$
One-tail vs Two-tail
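• A hypothesis test is classified as either a
one-tailed test or a two-tailed test, based on
the alternative hypothesis.
• If our alternative was that the mean was not equal to 70,
then we would care about both tails (a two-tailed test).
• If our alternative was that the mean was less than 70,
then we would only care about the left tail (a one-tailed test).
• If our alternative was that the mean was greater than 70,
then we would only care about the right tail (still a one-tailed test).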
Terminology
• The significance level of a test is a predetermined
level of evidence that is required to reject the null
hypothesis.
– Common levels of significance are 1%, 5%, and 10%.
• Critical values are the values we read either from
the z-table or t-table that separate the “reject the
null hypothesis” region from the “do not reject the
null hypothesis” region.
• The test statistic measures how far a sample
statistic diverges from what we would expect if the
null were true, in standardized units.
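• The test statistic when performing a
hypothesis test for a single population mean
will be:
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom
(if the population standard deviation $\sigma$ is known, use $\sigma$ in place of $s$ and the z-table)
• The test statistic when performing a
hypothesis test for a single population
proportion will be:
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$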
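The test statistics for a single mean, (x̄ − μ0) / (s / √n), and for a single proportion, (p̂ − p0) / √(p0(1 − p0) / n), can be sketched in Python. This is only an illustration: the function names and the inputs in the usage lines are made up and do not come from the slides.

```python
import math


def t_statistic(sample_mean, null_mean, sample_sd, n):
    """Test statistic for a single population mean: (x-bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


def z_statistic_proportion(sample_prop, null_prop, n):
    """Test statistic for a single population proportion.

    The standard error uses the null value p_0, not the sample proportion.
    """
    standard_error = math.sqrt(null_prop * (1 - null_prop) / n)
    return (sample_prop - null_prop) / standard_error


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
t_demo = t_statistic(sample_mean=52, null_mean=50, sample_sd=4, n=16)
z_demo = z_statistic_proportion(sample_prop=0.6, null_prop=0.5, n=100)
print(round(t_demo, 2), round(z_demo, 2))
```

In both demo cases the sample statistic sits two standard errors above the null value, so both test statistics come out at about 2.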
Two Approaches
• There are two approaches to hypothesis testing:
the classical approach and the p-value approach.
• In the classical approach, the computed value of
the test statistic is compared to the critical
value(s). If the test statistic falls within the critical
region, we will reject the null hypothesis. If it
does not, we will fail to reject the null hypothesis.
• In the p-value approach, we reject the null
hypothesis if the p-value is less than the level of
significance.
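As a rough sketch of the two approaches for a z-test, the code below applies both decision rules to the same statistic. The function names and the three z values are hypothetical. Critical values and tail areas come from Python's `statistics.NormalDist` rather than a printed z-table.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # z-distribution: mean 0, standard deviation 1


def reject_classical(z, alpha, tail):
    """Classical approach: does z fall in the critical region?"""
    if tail == "left":
        return z < STD_NORMAL.inv_cdf(alpha)
    if tail == "right":
        return z > STD_NORMAL.inv_cdf(1 - alpha)
    if tail == "two":
        return abs(z) > STD_NORMAL.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


def p_value(z, tail):
    """Area at least as extreme as z, in the direction(s) of the alternative."""
    if tail == "left":
        return STD_NORMAL.cdf(z)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_p_value(z, alpha, tail):
    """P-value approach: reject when the p-value is below alpha."""
    return p_value(z, tail) < alpha


# Hypothetical test statistics, one per kind of alternative.
cases = [(-2.10, 0.05, "left"), (1.50, 0.05, "right"), (2.20, 0.01, "two")]
for z, alpha, tail in cases:
    print(z, tail, reject_classical(z, alpha, tail), reject_p_value(z, alpha, tail))
```

The two approaches give the same decision in every case, because the critical value is the point where the tail area equals the significance level.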
Step 1: Does the mean amount owed on delinquent credit cards
usually exceed $2000?
Step 5:
Step 8: At the 1% level of significance, there is not
convincing evidence to support the claim that the
mean amount owed on delinquent credit cards is
more than $2000.
Example: A pineapple company is interested in the sizes of
pineapples grown in their fields. Last year, the mean weight of the
pineapples was 31 ounces. This year, the company is using a
different irrigation system, and management is wondering how this
change will affect the mean weight. They will be concerned if the
mean weight is not equal to 31 ounces. A sample of 50 pineapples
was taken that produced a mean weight of 31.935 ounces, with a
standard deviation of 2.394 ounces. Is there convincing evidence
that the mean weight of pineapples produced in the field has
changed this year? Use a 5% level of significance.
Step 1: Has the mean weight of pineapples changed this year?
Use df=49 and area in two-tails 0.05.
Step 5:
Step 8: At the 5% level of significance, there is
convincing evidence to support the claim that
the mean weight of pineapples has changed this
year.
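The arithmetic behind this conclusion can be checked with a short sketch of the classical approach, using the numbers from the pineapple example. The critical value is an assumption: it is the usual t-table entry for df = 49 with 0.05 in two tails, about 2.010, and is not computed here.

```python
import math

# Values given in the pineapple example.
n = 50
sample_mean = 31.935
sample_sd = 2.394
null_mean = 31

# Test statistic for a single mean: (x-bar - mu_0) / (s / sqrt(n)).
t_stat = (sample_mean - null_mean) / (sample_sd / math.sqrt(n))

# Assumption: two-tailed critical value for df = 49 and alpha = 0.05,
# read from a standard t-table (about 2.010).
t_critical = 2.010

# Two-tailed test: reject when the statistic is beyond either critical value.
reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

The statistic is about 2.76, which lies beyond the critical value, so the null hypothesis is rejected.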
Step 1: Is the mean etch-a-sketch production for South Pole Elves
lower than 165?
Step 5:
Step 6:
Step 8: At the 5% level of significance, there is
convincing evidence to support the claim that the
mean etch-a-sketch production is less than 165.
P-Value
• The p-value (probability value) is the probability of
getting a test statistic at least as extreme as the
observed test statistic, assuming the null
hypothesis is true.
• If the p-value is smaller than our significance
level (alpha), we will reject the null hypothesis.
• If the p-value is larger than our significance level,
we will fail to reject the null hypothesis.
Let’s revisit some previous examples.
Example: A pineapple company is interested in the sizes of
pineapples grown in their fields. Last year, the mean weight of the
pineapples was 31 ounces. This year, the company is using a
different irrigation system, and management is wondering how this
change will affect the mean weight. They will be concerned if the
mean weight is not equal to 31 ounces. A sample of 50 pineapples
was taken that produced a mean weight of 31.935 ounces, with a
standard deviation of 2.394 ounces. Is there convincing evidence
that the mean weight of pineapples produced in the field has
changed this year? Use a 5% level of significance.
We know that our hypotheses are:
$H_0: \mu = 31$
$H_a: \mu \neq 31$
Suppose this test yielded a p-value of 0.0081. This is lower
than our significance level of 0.05. Therefore, we would reject our null
hypothesis.
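For readers curious where a p-value like this comes from, here is a sketch that approximates the two-tailed p-value for the pineapple data. It is not the method used in the slides, which presumably relied on a calculator or table. It integrates the t density numerically with Simpson's rule, and the helper names are made up for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log(1 + x * x / df))


def two_tailed_p_value(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Each tail holds 0.5 minus the area between 0 and |t|.
    return 2 * (0.5 - area_0_to_t)


# Pineapple example: n = 50, x-bar = 31.935, s = 2.394, mu_0 = 31.
t_stat = (31.935 - 31) / (2.394 / math.sqrt(50))
p = two_tailed_p_value(t_stat, df=49)
print(f"p-value = {p:.4f}, reject H0 at 5%: {p < 0.05}")
```

The result should land close to the 0.0081 quoted above, well below the 0.05 significance level.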
Now suppose a test run at the 1% level of significance yielded a
p-value of 0.02. This is NOT lower than our significance level of
0.01. Therefore, we would fail to reject our null hypothesis.
However, what would happen if we had a different significance
level, say 5%? Would our conclusion be different? Yes: 0.02 is lower
than 0.05, so at the 5% level we would reject the null hypothesis.
This leads us to the fact that the conclusions we reach in
hypothesis testing can be wrong.
Type I and Type II Errors
• It is possible to perform all of the correct procedures in a
hypothesis test, but still make an error with our conclusion.
• The different kinds of errors we can make are known as Type I
and Type II errors.
• A Type I error occurs when we reject a null hypothesis
given that the null hypothesis is actually true.
• A Type I error is also known as a “false positive.”
• Examples of Type I errors:
– You go to the doctor for an exam. You have a perfect bill of
health, but the doctor tells you that you have cancer.
– In a courtroom, the null hypothesis is that the defendant is
innocent and this null is actually true. A Type I error would
be convicting the defendant.
• A Type II error occurs when we fail to reject the null
hypothesis given that the null hypothesis is false.
• A Type II error is also known as a “false negative.”
• Examples of Type II errors:
– You go to the doctor for an exam. You have cancer, but the
doctor tells you that you are perfectly fine.
– In a courtroom, a Type II error would be failing to convict a
guilty person.
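The definitions of the two error types amount to a small lookup on two yes/no facts: whether the null is actually true, and whether we rejected it. A minimal sketch, with a made-up function name:

```python
def classify_outcome(null_is_true, rejected_null):
    """Label the result of a test, given the (normally unknown) truth."""
    if null_is_true and rejected_null:
        return "Type I error (false positive)"
    if not null_is_true and not rejected_null:
        return "Type II error (false negative)"
    return "correct decision"


# Courtroom example: the null hypothesis is that the defendant is innocent.
print(classify_outcome(null_is_true=True, rejected_null=True))    # innocent, convicted
print(classify_outcome(null_is_true=False, rejected_null=False))  # guilty, not convicted
print(classify_outcome(null_is_true=True, rejected_null=False))   # innocent, not convicted
```

In practice we never know `null_is_true`, which is why a test can follow every procedure correctly and still end in an error.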
Example: Suppose Jackie Moon says he is a 65% free throw
shooter. A sample of his recent free throws was taken, and
his sample free-throw percentage was 50%.
a) State the hypotheses.
b) A test was run and it yielded a p-value of 0.02. Using a
significance level of 5%, what conclusion would you make?
c) If your conclusion was an error, what type of error did you
make?
Type I. We would have rejected a true null.
Example: John and Jeremy are venture capitalists who invested in a
new company called Holy Shirts and Pants. The company claims their
mean daily profits are $370. However, a recent sample was taken and
the average daily sample profits were $290.
a) State the hypotheses.
b) A test was run and it yielded a p-value of 0.07. Using a
significance level of 5%, what conclusion would you make?
c) If your conclusion was an error, what type of error did you
make?
Type II. We would have failed to reject a false null.
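Parts (b) and (c) of the two examples above follow one pattern: the decision depends only on the p-value and the significance level, and each decision can only go wrong in one way. A small sketch (the function name is hypothetical):

```python
def decide(p_value, alpha):
    """Return the decision and the only type of error that decision could be."""
    if p_value < alpha:
        # Rejecting can only be wrong if the null was actually true.
        return "reject H0", "Type I"
    # Failing to reject can only be wrong if the null was actually false.
    return "fail to reject H0", "Type II"


print(decide(p_value=0.02, alpha=0.05))  # Jackie Moon example
print(decide(p_value=0.07, alpha=0.05))  # Holy Shirts and Pants example
```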
Hypothesis Testing for a Population Proportion
• Recall that the test statistic for a population
proportion will be calculated by:
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
• Also remember we are always using the z-distribution
when testing for a population proportion.
Example: Prestige Worldwide manufactures liquid paper. The
machine used in making the liquid paper is known to produce
5% defective bottles of liquid paper. A random sample of 200
bottles was taken recently and showed that 17 of the bottles
were defective. Using a 1% level of significance, can we
conclude that the machine is producing more than 5%
defective bottles of liquid paper?
Step 1: Is the machine producing more than 5% defective bottles
of liquid paper?
Step 5:
Step 6:
Step 8: At the 1% level of significance, there is not
convincing evidence to support the claim that the
machine is producing more than 5% defective
bottles of liquid paper.
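The numbers behind this conclusion can be reproduced with a short sketch of the classical approach. It uses the values from the example and takes the right-tail critical value from Python's `statistics.NormalDist` instead of a printed z-table.

```python
import math
from statistics import NormalDist

# Values given in the liquid paper example.
n = 200
defective = 17
null_prop = 0.05
alpha = 0.01

sample_prop = defective / n  # 17 / 200 = 0.085
z_stat = (sample_prop - null_prop) / math.sqrt(null_prop * (1 - null_prop) / n)

# Right-tailed test: the critical value cuts off the top 1% of the z-distribution.
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_null = z_stat > z_critical
print(f"z = {z_stat:.2f}, critical value = {z_critical:.3f}, reject H0: {reject_null}")
```

The test statistic (about 2.27) falls just short of the critical value (about 2.326), so we fail to reject the null hypothesis at the 1% level.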
Example: Johnny Karate claims to be an 80% free throw
shooter. In a recent sample of 50 free-throws, he made
32. Is there convincing evidence to claim that Johnny
Karate is less than an 80% free throw shooter? Use a
5% level of significance to test.
Step 1: Is Johnny Karate less than an 80% free throw shooter?
Step 5:
Step 6:
Step 8: At the 5% level of significance, there is convincing
evidence to support the claim that Johnny Karate is
less than an 80% free-throw shooter.
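The same conclusion can be reached with the p-value approach. The sketch below uses the numbers from the example and computes the left-tail area with `statistics.NormalDist`. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Values given in the Johnny Karate example.
n = 50
made = 32
null_prop = 0.80
alpha = 0.05

sample_prop = made / n  # 32 / 50 = 0.64
z_stat = (sample_prop - null_prop) / math.sqrt(null_prop * (1 - null_prop) / n)

# Left-tailed test: the p-value is the area to the left of the test statistic.
p_value = NormalDist().cdf(z_stat)

print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The test statistic is about -2.83 and the p-value is roughly 0.002, far below 0.05, which matches the decision to reject the null hypothesis.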