More About Tests and Intervals

Ch. 21
AP Statistics
Null Hypothesis
It is a statement about the value of a parameter
for a model
p. 482 Example
In Florida, before the change in helmet law 60% of
youths involved in a motorcycle accident had been
wearing their helmets. Three years after the law
change, it was observed that 781 youths were
involved in a motorcycle accident and only 396 were
wearing helmets. Has helmet use in Florida declined
among riders under the age of 21 subsequent to the
change in helmet laws?
Steps
1. Hypotheses (and define parameter, p)
2. Model (conditions, and list test used)
3. Mechanics
4. Conclusion
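As a rough sketch, the four steps can be applied to the helmet example above with a one-proportion z-test. The function name `one_prop_z_test` is illustrative, not from the textbook, and the code assumes the usual conditions (random sample, at least 10 expected successes and failures) are met.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative="less"):
    """One-proportion z-test; returns (p_hat, z, p_value)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)      # standard deviation assuming H0 is true
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)
    if alternative == "less":
        p_value = cdf                 # lower-tail area
    elif alternative == "greater":
        p_value = 1 - cdf             # upper-tail area
    else:
        p_value = 2 * min(cdf, 1 - cdf)  # two-sided
    return p_hat, z, p_value

# Step 1. Hypotheses: H0: p = 0.60, HA: p < 0.60 (p = proportion wearing helmets)
# Step 2. Model: one-proportion z-test (conditions assumed to hold)
# Step 3. Mechanics
p_hat, z, p_value = one_prop_z_test(396, 781, 0.60, alternative="less")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.2g}")
# Step 4. Conclusion: compare the p-value with the chosen alpha level
```

Here z comes out near -5.3, so the p-value is far below any common alpha level.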
P-value
• It is the conditional probability of getting
results at least as unusual as the observed
statistics given the null is true
• The lower the p-value, the more comfortable
we feel about our decision to reject the null
hypothesis, but the null doesn’t get any more
false
Ex p. 484
A New England Journal of Medicine paper reported
that the 7-year risk of heart attack in diabetes
patients taking Avandia was increased from the
baseline of 20.2% to an estimated risk of 28.9% and
said the p-value was 0.03. How should the p-value be
interpreted in this context?
Ex. P. 485 The question of whether the diabetes
drug Avandia increased the risk of heart attack
was raised by a study in the New England
Journal of Medicine. This study estimated the 7
year risk of heart attack to be 28.9% and
reported a p-value of 0.03 for a test of whether
this risk was higher than the baseline risk of
20.2%. An earlier study (the ADOPT study) had
estimated the 7 year risk to be 26.9% and
reported a p-value of 0.27. Why did the
researchers in the ADOPT study not express
alarm about the increased risk they had seen?
Alpha Level
• Also called the significance level
• Common alphas = 0.1, 0.05, 0.01
What does it mean to be statistically
significant?
• In large samples, small deviations can be
statistically significant
• In small samples, large deviations may not be
statistically significant
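A small sketch of why sample size matters: the same observed deviation is tested with two different sample sizes. The numbers (a sample proportion of 0.52 against a null value of 0.50, with n = 100 and n = 10,000) are made up for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(p_hat, p0, n):
    """Two-sided p-value for a one-proportion z-test."""
    se = sqrt(p0 * (1 - p0) / n)   # shrinks as n grows
    z = (p_hat - p0) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same 2-percentage-point deviation, hypothetical sample sizes
p_small = two_sided_p_value(0.52, 0.50, 100)
p_large = two_sided_p_value(0.52, 0.50, 10_000)
print(f"n = 100:    p-value = {p_small:.3f}")
print(f"n = 10000:  p-value = {p_large:.2g}")
```

With n = 100 the p-value is about 0.69 (not significant); with n = 10,000 it is below 0.001, even though the deviation is identical.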
You can approximate a hypothesis test by using
a confidence interval…
A CI is two-sided, so it matches a two-sided HT:
a 95% CI corresponds to a two-sided test at
alpha = 0.05 (or a one-sided test at alpha = 0.025)
Ex. P. 488 The baseline 7 year risk of heart
attacks for diabetics is 20.2%. In 2007 a NEJM
study reported a 95% confidence interval
equivalent to 20.8% to 40% for the risk among
patients taking the drug Avandia. What did this
confidence interval suggest to the FDA about
the safety of the drug?
JC P. 488
1. An experiment to test the fairness of a
roulette wheel gives a z-score of 0.62. What
would you conclude?
2. In the last chapter we encountered a bank
that wondered if it could get more customers
to make payments on delinquent balances by
sending them a DVD urging them to set up a
payment plan. Well, the bank just got back
the results on their test of this strategy. A
90% CI for the success rate is (0.29, 0.45).
Their old send-a-letter method had worked
30% of the time. Can you reject the null that
the proportion is still 30% at alpha = 0.05? Explain.
3. Given the CI the bank found, what would you
recommend that they do? Should they scrap
the DVD strategy?
Example P. 488
Teens are at the greatest risk of being killed or
injured in traffic crashes. According to the National
Highway Traffic Safety Administration, 65% of young
people killed were not wearing a safety belt. In
2001, a total of 3322 teens were killed in car
accidents, an average of 9 teens a day. Because
many of these deaths could have been easily
prevented by the use of safety belts, many states
have begun “Click It or Ticket” campaigns in which
increased enforcement and publicity have resulted
in significantly higher seatbelt use.
Overall use in Massachusetts quickly increased
from 51% in 2002 to 64.8% in 2006, with a goal
of surpassing the national average of 82%.
Recently, a local newspaper reported that a
roadblock resulted in 23 tickets to drivers who
were unbelted out of 134 stopped for
inspection. Does this provide evidence that the
goal of over 82% compliance was met?
Use a CI and a HT.
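One possible way to set up both approaches for the roadblock data is sketched below. It assumes the 134 stopped drivers can be treated as a random sample (a roadblock may not be one) and takes the 111 drivers who were not ticketed as belted.

```python
from math import sqrt
from statistics import NormalDist

n = 134
belted = n - 23            # 111 drivers were wearing seatbelts
p_hat = belted / n
p0 = 0.82                  # goal: compliance above 82%

# Hypothesis test: H0: p = 0.82, HA: p > 0.82
se_null = sqrt(p0 * (1 - p0) / n)        # uses the null value p0
z = (p_hat - p0) / se_null
p_value = 1 - NormalDist().cdf(z)        # upper-tail area

# 95% confidence interval for p
z_star = NormalDist().inv_cdf(0.975)     # about 1.96
se_hat = sqrt(p_hat * (1 - p_hat) / n)   # uses the sample proportion
ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.2f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions z is about 0.25 with a p-value near 0.40, and the interval runs from roughly 0.76 to 0.89. Both views agree: 82% is a plausible value, so this sample alone does not show that compliance exceeds 82%.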
Type I Error
Reject the null hypothesis when it is true.
The probability of a Type I error is alpha. When
alpha is not given, find the probability of
rejecting a true null from the information in the
question.
This is like getting a false positive test result (you
are really healthy but diagnosed with a
disease)
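To see that the chance of a Type I error is close to alpha, the sketch below adds up the exact binomial probability of every sample that would lead an upper-tail z-test to reject a true null. The values n = 200 and p0 = 0.30 are hypothetical, and `type_i_error_rate` is an illustrative name.

```python
from math import comb, sqrt
from statistics import NormalDist

def type_i_error_rate(n, p0, alpha):
    """Exact chance an upper-tail z-test rejects when H0: p = p0 is true."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    se = sqrt(p0 * (1 - p0) / n)
    rate = 0.0
    for x in range(n + 1):
        z = (x / n - p0) / se
        if z > z_crit:  # this count of successes would reject H0
            rate += comb(n, x) * p0**x * (1 - p0)**(n - x)
    return rate

rate = type_i_error_rate(n=200, p0=0.30, alpha=0.05)
print(f"Chance of rejecting a true null: {rate:.3f}")
```

The result is near 0.05 but not exactly equal to it, because the count of successes is discrete and the z-test relies on a Normal approximation.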
Type II Error
Failing to reject the null hypothesis when
it is false
If the probability of a Type II error is too large,
take a larger sample to reduce the chance of error.
This is like getting a false negative test result (you
are really sick but are not diagnosed with the
disease)
Power
The probability of rejecting the null hypothesis
when it is false
Power is the complement of the probability of a
Type II error: Power = 1 - P(Type II error).
If you increase n, you increase the power.
A published study found the risk of heart attack
to be increased in patients taking Avandia for
diabetes. An article said, “A few events either
way might have changed the findings for heart
attack or for death from cardiovascular causes.
In this setting, the possibility that the findings
were due to chance cannot be excluded.” What
kind of error would the researchers have made
if, in fact, their findings were due to chance?
What could be the consequences of this error?
The study of Avandia published in NEJM
combined results from 47 different trials, a
method called meta-analysis. The drug’s
manufacturer issued a statement saying, “Each
study is designed differently and looks at unique
questions: For example, individual studies vary
in size and length, in the type of patients who
participated, and in the outcomes they
investigate.” Nevertheless, by combining data
from many studies, meta-analyses can achieve a
much larger sample size. How could this larger
sample size help?
JC P. 494
1. Remember our bank that’s sending out DVDs
to try to get customers to make payments on
delinquent loans? It is looking for evidence that
the costlier DVD strategy produces a higher
success rate than the letters it has been sending.
Explain what a type I error is in this context and
what the consequences would be to the bank.
2. What’s a type II error in the bank experiment
context and what would the consequences
be?
3. For the bank, which situation has higher
power: a strategy that works really well,
actually getting 60% of people to pay off
their balances, or a strategy that barely
increases the payoff rate to 32%? Explain.
• A larger sample size decreases the probability
of a Type II error and increases power
• Power is the complement of the probability of a
Type II error
• If you reduce alpha, you increase the probability
of a Type II error and decrease the power
• The larger the real difference between a
hypothesized value and the true population
proportion, the smaller the chance of making
a type II error and the greater the power of
the test
• To reduce the standard deviation of the
sampling distribution, use a larger sample size
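The relationships in the bullets above can be checked numerically. This is a minimal sketch using the Normal approximation for an upper-tail one-proportion z-test in the bank setting (null value 30%). The sample sizes are hypothetical, since the slides do not give one, and `power_upper_tail` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(p0, p_true, n, alpha=0.05):
    """Approximate power of an upper-tail one-proportion z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0: p = p0
    cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)
    # Chance of exceeding that cutoff when the true proportion is p_true
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - NormalDist().cdf((cutoff - p_true) / se_true)

base = power_upper_tail(0.30, 0.32, n=100)            # small true difference
big_effect = power_upper_tail(0.30, 0.60, n=100)      # large true difference
big_sample = power_upper_tail(0.30, 0.32, n=400)      # larger sample
small_alpha = power_upper_tail(0.30, 0.32, n=100, alpha=0.01)

print(f"baseline (p = 0.32, n = 100, alpha = 0.05): {base:.3f}")
print(f"larger true difference (p = 0.60):          {big_effect:.3f}")
print(f"larger sample (n = 400):                    {big_sample:.3f}")
print(f"smaller alpha (alpha = 0.01):               {small_alpha:.3f}")
```

In this sketch power rises with a larger true difference or a larger sample and falls when alpha is reduced, matching the bullets. The probability of a Type II error in each case is 1 minus the printed power.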
#32 You are in charge of shipping computers to
customers. You learn that a faulty disk drive was
put into some of the machines. There’s a simple
test you can perform, but it’s not perfect. All
but 4% of the time, a good disk drive passes the
test, but unfortunately, 35% of the bad disk
drives pass the test too. You have to decide on
the basis of one test whether the disk drive is
good or bad. Make this a HT.
1. What are the hypotheses?
2. Given that a computer fails the test, what
would you decide? What if it passes the
test?
3. How large is alpha for this test?
4. What is the power of this test?
#25 A company is sued for job discrimination
because only 19% of the newly hired candidates
were minorities when 27% of all applicants were
minorities. Is this strong evidence that the
company’s hiring practices are discriminatory?
1. Is this a one sided or two sided test? Why?
2. What would a Type I error be?
3. What would a Type II error be?
4. What is meant by the power of the test for
this problem?
5. If the hypothesis is tested at the 5% level
instead of the 1% level, how will this affect
the power of the test?
6. The lawsuit is based on the hiring of 37
employees. Is the power of the test higher
than, lower than, or the same as it would be
if it were based on 87 hires?