Error Types - highlandstatistics


The Fine Art of Knowing How Wrong You Might Be
To Err Is Human
 Humans make an infinitude of mistakes.
 I figure safety in numbers makes it a little more okay.
 When we do a hypothesis test, there are only so many
options for what we decide to do.
 More specifically, there are only four possible
outcomes.
Four?!?
 Some of you might be thinking that our only two
possible options are to reject H0 or to not reject H0.
 You are correct.
 In each of these cases, though, we might be right or we
might be wrong.
 That is where the number four comes from.
Four
 We might find sufficient evidence, and it turns out the
evidence is telling us the truth.
 We might find sufficient evidence, but it turns out the
evidence was naughty, bad evidence that was
misleading us. (A frame job.)
 We might not find sufficient evidence, but it turns out
that there really is the difference we were looking for.
 We might not find sufficient evidence, and it turns out
that is correct and that nothing is going on.
So Then What?
 No matter what decision we make, we might be wrong.
 When you’re right, you’re right.
 When you’re wrong, you don’t know it, so act like
you’re right…it projects an attractive level of
confidence.
 In statistics we have no real way to find the truth
without taking a census.
 Do not administer a census if you can help it.
 A census is dumb.
Where Am I Going With This?
 If we reject H0 (which means there was sufficient
evidence), but we were wrong, this is called a Type I
Error (Roman Numeral 1).
 The probability of making this kind of error is alpha.
 So, normally, 5%.
 We call this percentage the significance level, and use
the letter α for it.
 We can set it as high or low as we want.
 Normally we go with 5%.
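The claim that α is the probability of a Type I error can be checked by simulation: run many tests in a world where H0 really is true and count how often we (wrongly) reject. Here is a minimal sketch, using a made-up fair-coin example and a one-proportion z-test (the coin scenario and all the numbers are illustrative, not from the slides):

```python
import math
import random

random.seed(1)
alpha = 0.05
p0 = 0.5            # H0: the coin is fair -- and in this simulation it truly is
n = 100             # flips per sample
trials = 10000      # number of simulated hypothesis tests
se = math.sqrt(p0 * (1 - p0) / n)

rejections = 0
for _ in range(trials):
    heads = sum(random.random() < p0 for _ in range(n))  # H0 is actually true
    z = (heads / n - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal p-value
    if p_value < alpha:
        rejections += 1                                  # a Type I error

print(rejections / trials)  # should land near alpha
```

The printed rejection rate hovers near 5% (not exactly, since the flip counts are discrete), which is exactly what "the probability of making this kind of error is alpha" means.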
Why Not Make It 0%?
 If we do not reject H0 (which means there was not
sufficient evidence to draw a conclusion), but
something really is going on, this is called a Type II
Error (Roman Numeral 2).
 The probability of this kind of error is called β and it is
a real pain to calculate, so we are not going to.
 What you need to know about it is that the lower α is,
the higher β is.
5% Is A Good Balance
 There are really only 2 things we can do to make β
smaller.
 First, we can make α higher.
 This is usually counterproductive, as it just makes us
more likely to be wrong the other direction.
 Second, we can make n (the sample size) larger.
 Larger sample sizes make everything better.
 Make the sample size too large, though, and you violate
the less than 10% condition.
 Also, if the sample is too large, you are pretty much
doing a census.
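The "larger n makes β smaller" claim can also be demonstrated by simulation: fix a world where H0 is false, run the test at several sample sizes, and count how often we fail to reject. A hedged sketch, with an assumed true proportion of 0.56 against a null of 0.5 (all values made up for illustration):

```python
import math
import random

random.seed(2)
crit = 1.96               # critical z for a two-sided test at alpha = 0.05
p0, p_true = 0.5, 0.56    # H0 assumes 0.5, but the truth is 0.56
trials = 2000
beta = {}                 # estimated Type II error rate at each sample size

for n in (50, 200, 800):
    se = math.sqrt(p0 * (1 - p0) / n)
    misses = 0            # times we fail to reject even though H0 is false
    for _ in range(trials):
        heads = sum(random.random() < p_true for _ in range(n))
        z = (heads / n - p0) / se
        if abs(z) <= crit:
            misses += 1   # a Type II error
    beta[n] = misses / trials
    print(n, beta[n])
```

Each time n goes up, the estimated β drops, without α ever having to move.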
A Pair of Complements
 For a 2-sided test (one where the alternative is a ≠
kind), the complement of α is the confidence level.
 And the complement of the confidence level is the
significance level.
 So, 95% confidence matches with 5% significance (for a
≠ test).
 The complement of β is called statistical power.
 You don’t need the formula for this, and only need to
know the term in a general way.
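Since all four quantities are tied together by complements, they can be laid out in a few lines (the β value below is a made-up illustrative number, since the slides say we will not calculate it):

```python
alpha = 0.05                   # significance level (two-sided test)
beta = 0.20                    # Type II error rate -- illustrative value only

confidence_level = 1 - alpha   # complement of alpha: 0.95
power = 1 - beta               # complement of beta: statistical power
print(confidence_level, power)
```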
Significance
 Remember how we discussed the difference between
statistical significance and practical significance?
 I remember it.
 Hopefully we can all remember it.
 If the p-value is lower than α, we found sufficient
evidence.
 Another way to say this is that a “statistically
significant difference was found”.
 Statistically significant means the H0 got rejected.
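The decision rule in those bullets can be written as a tiny helper (the function name and example p-values are hypothetical, chosen just to show the rule):

```python
def decision(p_value, alpha=0.05):
    """The decision rule from the slides: compare the p-value to alpha."""
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "fail to reject H0"

print(decision(0.03))   # below alpha, so reject
print(decision(0.12))   # not below alpha, so fail to reject
```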
Nothing To Prove
 Remember how I said earlier that Statistics is never
used to prove anything ever?
 I remember it.
 Hopefully we can all remember it.
 We never prove the null or the alternative hypothesis.
 We only support the alternative strongly enough or do
not support it strongly enough.
A Truckload of Apples
 On a recent homework problem you were asked to find
out how likely it was that a sample of 150 apples taken
from a whole truckload would have less than 5%
damaged.
 We were able to do this math because we were told
that the true percentage of damaged apples was 8%.
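That homework calculation can be reproduced with the normal approximation to the sampling distribution of a proportion (the slides give n = 150 and a true damage rate of 8%; the code below just carries out that standard computation):

```python
import math

p, n = 0.08, 150                        # true damage rate and sample size
se = math.sqrt(p * (1 - p) / n)         # standard error of the sample proportion
z = (0.05 - p) / se                     # z-score for a sample proportion of 5%

# Normal-approximation probability via the error function
prob = 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(round(prob, 3))                   # roughly 0.088
```

So under the normal model there is roughly a 9% chance a sample of 150 apples shows less than 5% damaged, even though the truck is really 8% damaged.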
A Truckload of Apples
 In the real world we would not know that 8% of the
whole truckload was damaged.
 They just told us that so we could do the math.
 So what should we do in the real world?
 Since we don’t know what the true percent is, we will
do the next best thing…we will make a wild
assumption about the truth.
Wild Assumption
 In practice we will make a sensible assumption,
actually.
 Our assumption will usually be based on something
that is known.
 This could be past data.
 This could be the value from a similar group.
 This could be .5 in the case of fair coins.
 The purpose in making this assumption is, primarily,
to allow us to do the math.
Alternatives
 The alternative hypothesis will be based on whatever
we are trying to prove.
 We use the null hypothesis so we can do the math, but
really the alternative hypothesis is what we are trying
to focus on when we run a hypothesis test.
Foreshadowing FTW
 Tomorrow we will discuss inferior ways to run a
hypothesis test.
 I say inferior, but they are perfectly acceptable.
 Mr. J.P. Damron, in fact, almost always favored a
confidence interval method for running hypothesis
tests.
 It is worth noting that he never once got a hypothesis
test wrong with this method, and seriously did over 30
of them.
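As a preview of that confidence-interval method: for a two-sided test, you can build a 95% interval from the sample and reject H0 at the 5% level whenever the null value falls outside it. A hedged sketch with made-up sample numbers (150 trials, 85 successes, H0: p = 0.5):

```python
import math

n, successes = 150, 85        # illustrative sample, not from the slides
p0 = 0.5                      # H0: p = 0.5
p_hat = successes / n

# 95% confidence interval for a proportion (SE uses p-hat for intervals)
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(low, 3), round(high, 3))

if low <= p0 <= high:
    print("fail to reject H0 at the 5% level")
else:
    print("reject H0 at the 5% level")
```

Here the interval still captures 0.5, so this particular (made-up) sample would not be statistically significant.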
Assignments
 You can read chapter 21, but I do not think it is
necessary, especially if you have limited free time.
 Test 4 is next week on Thursday.
 There will be another Quiz Retake Bonanza on
February 19th (the day after the test).