Error Types - highlandstatistics
The Fine Art of Knowing How Wrong You Might Be
To Err Is Human
Humans make an infinitude of mistakes.
I figure safety in numbers makes it a little more okay.
When we do a hypothesis test, there are only so many
options for what we decide to do.
More specifically, there are only four possible
outcomes.
Four?!?
Some of you might be thinking that our only two
possible options are to reject H0 or to not reject H0.
You are correct.
In each of these cases, though, we might be right or we
might be wrong.
That is where the number four comes from.
Four
We might find sufficient evidence, and it turns out the
evidence is telling us the truth.
We might find sufficient evidence, but it turns out the
evidence was naughty, bad evidence that was
misleading us. (A frame job.)
We might not find sufficient evidence, but it turns out
that there really is the difference we were looking for.
We might not find sufficient evidence, and it turns out
that is correct and that nothing is going on.
So Then What?
No matter what decision we make, we might be wrong.
When you’re right, you’re right.
When you’re wrong, you don’t know it, so act like
you’re right…it projects an attractive level of
confidence.
In statistics we have no real way to find the truth
without taking a census.
Do not administer a census if you can help it.
A census is dumb.
Where Am I Going With This?
If we reject H0 (which means there was sufficient
evidence), but we were wrong, this is called a Type I
Error (Roman Numeral 1).
The probability of making this kind of error, when H0
really is true, is alpha.
So, normally, 5%.
We call this percentage the significance level, and use
the letter α for it.
We can set it as high or low as we want.
Normally we go with 5%.
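If you want to see α behave like an error rate, here is a rough simulation sketch. It assumes a fair coin (so H0: p = 0.5 really is true), a two-sided one-proportion z-test, and made-up settings for the number of flips and the number of repeated tests. The function names are just for illustration.

```python
import math
import random
from statistics import NormalDist


def two_sided_p_value(successes, n, p0):
    """P-value for a one-proportion z-test of H0: p = p0 against p != p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def type_i_error_rate(p0=0.5, n=100, alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated tests that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the data really are generated with p = p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        if two_sided_p_value(successes, n, p0) < alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Rejected a true H0 in {rate:.1%} of simulated tests")
```

The printed rate should land in the neighborhood of 5%, not exactly on it, because the number of heads can only be a whole number and the simulation is random.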
Why Not Make It 0%?
If we do not reject H0 (which means there was not
sufficient evidence to draw a conclusion), but
something really is going on, this is called a Type II
Error (Roman Numeral 2).
The probability of this kind of error is called β and it is
a real pain to calculate, so we are not going to.
What you need to know about it is that, for a given
sample size, the lower α is, the higher β is.
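Now that both errors have names, the four outcomes from earlier can be written as a tiny lookup. This is only a sketch; the function name is made up, and in real life we never get to see the second argument.

```python
def classify_outcome(rejected_h0, h0_is_true):
    """Name the outcome of a test, given the decision and the (usually unknown) truth."""
    if rejected_h0 and h0_is_true:
        return "Type I error"
    if not rejected_h0 and not h0_is_true:
        return "Type II error"
    return "correct decision"


# Walk through all four combinations of decision and truth.
for rejected in (True, False):
    for h0_true in (True, False):
        print(f"rejected H0: {rejected!s:5}  H0 true: {h0_true!s:5}  ->  "
              f"{classify_outcome(rejected, h0_true)}")
```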
5% Is A Good Balance
There are really only 2 things we can do to make β
smaller.
First, we can make α higher.
This is usually counterproductive, as it just makes us
more likely to be wrong the other direction.
Second, we can make n (the sample size) larger.
Larger sample sizes make everything better.
Make the sample size too large, though, and you violate
the less than 10% condition.
Also, if the sample is too large, you are pretty much
doing a census.
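You will not be asked to calculate β, but for the curious, here is a sketch of how the two levers above move it. It assumes a two-sided one-proportion z-test, a normal model for the sample proportion, and hypothetical values (H0 says p = 0.50 while the truth is p = 0.60). The function name `beta` is just for illustration.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def beta(p0, p_true, n, alpha):
    """Approximate P(Type II error) for a two-sided one-proportion z-test.

    Assumes the normal model for the sample proportion is reasonable.
    """
    z_star = STANDARD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    se_null = math.sqrt(p0 * (1.0 - p0) / n)
    se_true = math.sqrt(p_true * (1.0 - p_true) / n)
    # We fail to reject H0 when the sample proportion lands between these cutoffs.
    lower = p0 - z_star * se_null
    upper = p0 + z_star * se_null
    # Where the sample proportion actually lives when p = p_true.
    sampling_dist = NormalDist(mu=p_true, sigma=se_true)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Lever 1: change alpha, keep n = 100.
for alpha in (0.10, 0.05, 0.01):
    print(f"n = 100, alpha = {alpha:.2f}: beta is about {beta(0.5, 0.6, 100, alpha):.3f}")

# Lever 2: change n, keep alpha = 0.05.
for n in (50, 100, 400):
    print(f"alpha = 0.05, n = {n}: beta is about {beta(0.5, 0.6, n, 0.05):.3f}")
```

In the first loop β climbs as α shrinks; in the second, β drops as n grows.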
A Pair of Complements
For a 2-sided test (one where the alternative is a ≠
kind), the complement of α is the confidence level.
And the complement of the confidence level is the
significance level.
So, 95% confidence matches with 5% (for a ≠ test).
The complement of β is called statistical power.
You don’t need the formula for this, and only need to
know the term in a general way.
Significance
Remember how we discussed the difference between
statistical significance and practical significance?
I remember it.
Hopefully we can all remember it.
If the p-value is lower than α, we found sufficient
evidence.
Another way to say this is that a “statistically
significant difference was found”.
Statistically significant means H0 got rejected.
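The decision rule itself is one comparison. Here is a minimal sketch, assuming a two-sided one-proportion z-test and a made-up example of 60 heads in 100 flips of a supposedly fair coin. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def one_proportion_test(successes, n, p0, alpha=0.05):
    """Two-sided one-proportion z-test; returns (p_value, reject_h0)."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # "Statistically significant" is just this comparison coming out True.
    return p_value, p_value < alpha


for alpha in (0.05, 0.01):
    p_value, reject = one_proportion_test(60, 100, 0.5, alpha=alpha)
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha:.2f}: p-value = {p_value:.4f}, {verdict}")
```

With 60 heads the z-score works out to 2, so the p-value is a little under 5%. The same data are statistically significant at α = 5% and not at α = 1%.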
Nothing To Prove
Remember how I said earlier that Statistics is never
used to prove anything ever?
I remember it.
Hopefully we can all remember it.
We never prove the null or the alternative hypothesis.
We only support the alternative strongly enough or do
not support it strongly enough.
A Truckload of Apples
On a recent homework problem you were asked to find
out how likely it was that a sample of 150 apples
taken from a whole truckload would have less than 5%
damaged.
We were able to do this math because we were told
that the true percentage of damaged apples was 8%.
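For reference, the homework calculation can be sketched like this. It assumes the normal model for the sample proportion was the intended method (np = 12 and nq = 138 are both at least 10, so the model is reasonable). The function name is made up.

```python
import math
from statistics import NormalDist


def prob_sample_proportion_below(cutoff, p, n):
    """Normal-model probability that a sample proportion falls below cutoff."""
    standard_deviation = math.sqrt(p * (1 - p) / n)
    z = (cutoff - p) / standard_deviation
    return z, NormalDist().cdf(z)


# Values from the problem: true rate 8%, sample of 150, cutoff 5%.
z, probability = prob_sample_proportion_below(cutoff=0.05, p=0.08, n=150)
print(f"z = {z:.2f}, P(less than 5% damaged) is about {probability:.3f}")
```

The answer comes out a bit under 9%.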
A Truckload of Apples
In the real world we would not know that 8% of the
whole truckload was damaged.
They just told us that so we could do the math.
So what should we do in the real world?
Since we don’t know what the true percent is, we will
do the next best thing…we will make a wild
assumption about the truth.
Wild Assumption
In practice we will make a sensible assumption,
actually.
Our assumption will usually be based on something
that is known.
This could be past data.
This could be the value from a similar group.
This could be .5 in the case of fair coins.
The purpose in making this assumption is, primarily,
to allow us to do the math.
Alternatives
The alternative hypothesis will be based on whatever
we are trying to find evidence for.
We use the null hypothesis so we can do the math, but
really the alternative hypothesis is what we are trying
to focus on when we run a hypothesis test.
Foreshadowing FTW
Tomorrow we will discuss inferior ways to run a
hypothesis test.
I say inferior, but they are perfectly acceptable.
Mr. J.P. Damron, in fact, almost always favored a
confidence interval method for running hypothesis
tests.
It is worth noting that he never once got a hypothesis
test wrong with this method, and seriously did over 30
of them.
Assignments
You can read chapter 21, but I do not think it is
necessary, especially if you have limited free time.
Test 4 is next week on Thursday.
There will be another Quiz Retake Bonanza on
February 19th (the day after the test).