Hypothesis Testing
Basic Problem
We are interested in deciding whether
some data credits or discredits some
“hypothesis” (often a statement about
the value of a parameter or the
relationship among parameters).
1
Suppose we consider the value
of μ = (true) average lifetime of
some battery of a certain cell size
and for a specified usage,
and hypothesize:
H0 : μ = 160
H1 : μ ≠ 160

2
This would usually involve a
scenario of either
(1) 160 is a “standard”
(2) the previous μ was 160
(has there been a change?)
or something analogous.
H0 : μ = 160
H1 : μ ≠ 160
H0 is called the “Null Hypothesis”
H1 is called the “Alternate
Hypothesis”
3
We must make a decision whether to
ACCEPT H0 or REJECT H0
(ACCEPT H0 same as REJECT H1)
(REJECT H0 same as ACCEPT H1)
We decide by looking at X̄ (the sample mean)
from a sample of size n.
4
Basic Logic:
H0 : μ = 160
H1 : μ ≠ 160

(1) Assume (for the moment) that H0 is true.
(2) Find the probability that X̄ would be
“that far away” from 160 if, indeed, H0 is true.
(3) If the probability is small, we reject H0; if
it isn’t too small, we give the benefit of
the doubt to H0 and accept H0.
5
BUT — what’s “too small” or “not too
small”? Essentially — you decide!
You pick (somewhat arbitrarily) a value,
α, usually .01 ≤ α ≤ .10, and most often
α = .05, called the SIGNIFICANCE
LEVEL.
6
If the probability of getting “as far
away” from the H0 alleged value as we
indeed got is greater than or equal to
α, we say “the chance of getting what
we got isn’t that small and the
difference could well be due to sampling
error,” and hence we accept H0 (or, at
least, do not reject H0).
7
If the probability is < α, we say that the
chance of getting the result we got is
too small (beyond a reasonable doubt)
to have been simply “sampling error,”
and hence, we REJECT H0.
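The decision rule on the last few slides can be sketched in Python. This is only an illustration under stated assumptions: the observed sample mean (171) and the standard deviation of X̄ (5) are made-up values that do not come from the slides, and X̄ is assumed to follow a normal distribution.

```python
from statistics import NormalDist

def two_sided_p_value(x_bar, mu0, std_err):
    """P(X-bar lands at least this far from mu0, in either direction | H0 true)."""
    z = abs(x_bar - mu0) / std_err
    return 2 * (1 - NormalDist().cdf(z))

def decide(p_value, alpha=0.05):
    """Reject H0 only when the probability is below the significance level."""
    return "reject H0" if p_value < alpha else "accept H0"

mu0 = 160      # value alleged by H0 (from the slides)
x_bar = 171    # hypothetical observed sample mean
std_err = 5    # hypothetical std. dev. of X-bar

p = two_sided_p_value(x_bar, mu0, std_err)
print(round(p, 4), decide(p))
```

With these made-up inputs the probability falls below .05 but above .01, so the same data would lead to rejecting H0 at α = .05 and accepting it at α = .01.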
8
Suppose we want to decide if a coin is
fair. We flip it 100 times.
H0 : p = 1/2, coin is fair
H1 : p ≠ 1/2, coin is not fair

9
Let X = number of heads
Case 1) X = 49
Perfectly consistent with H0.
Could easily happen if p = 1/2;
ACCEPT H0
Case 2) X = 81
Are you kiddin’? If p = 1/2, the
chance of gettin’ what we got is
one in a billion!
REJECT H0
Case 3) X = 60
NOT CLEAR!
10
What is the chance that if p = 1/2 we’d
get “as much as” 10 away from the
ideal (of 50 out of 100)?
If this chance < α, reject H0
If this chance ≥ α, accept H0
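For the coin example, this chance can be computed exactly from the binomial distribution. The sketch below is one way to do it; the function name is ours, and α = .05 is assumed only because the slides call it the most common choice.

```python
from math import comb

def prob_at_least_this_far(n, k_away, p=0.5):
    """P(|X - n*p| >= k_away) for X ~ Binomial(n, p), summing exact terms."""
    center = n * p
    return sum(
        comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(n + 1)
        if abs(x - center) >= k_away
    )

# Chance of 60 or more heads, or 40 or fewer, in 100 flips of a fair coin
chance = prob_at_least_this_far(100, 10)
alpha = 0.05  # assumed significance level
print(f"chance = {chance:.4f}")
print("reject H0" if chance < alpha else "accept H0")
```

The chance comes out a little above .05, so with α = .05 a result of X = 60 would not be enough to reject H0.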
11
Important logic:
H0 gets a huge
“Favor from the Error”;
H1 has the
“Burden of Proof”;
We reject H0 only if the results are
“overwhelming”.
12
To tie together the α value chosen and the X̄
values which lead to accepting (or rejecting)
H0, we must figure out the probability law of
X̄ if H0 is true.
Assuming a NORMAL distribution (and the
Central Limit Theorem suggests that this is
overwhelmingly likely to be true), the answer
is:
[Figure: normal curve for X̄, centered at μ = 160]
13
We can find (using normal distribution
tables) a region such that α = the
probability of being outside the region:
[Figure: normal curve for X̄, centered at μ = 160, with area α/2 below 150.2 and area α/2 above 169.8]
(I made up the values of 150.2 and 169.8)
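A region like this can also be found without tables. The sketch below assumes α = .05 and a standard deviation of 5 for X̄; neither number is stated on the slide, and they are used here only because they happen to reproduce the made-up endpoints.

```python
from statistics import NormalDist

def acceptance_region(mu0, std_err, alpha):
    """Two-sided region around mu0 with probability alpha/2 in each tail."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mu0 - z * std_err, mu0 + z * std_err

# Assumed values: std. dev. of X-bar = 5, alpha = .05
low, high = acceptance_region(160, 5, 0.05)
print(round(low, 1), round(high, 1))  # 150.2 169.8
```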
14
Note: logic suggests (in this example) a
“rejection” region which is 2-sided; in experimental design,
most regions are 1-sided.
150.2 ≤ X̄ ≤ 169.8 is called the
Acceptance Region (AR)
X̄ < 150.2 and X̄ > 169.8
is called the
Critical Region (CR)
15
[Figure: normal curve for X̄, centered at μ = 160, with area α/2 below 150.2 and area α/2 above 169.8]
Decision Rule:
If X̄ in AR, accept H0
If X̄ in CR, reject H0
16
X̄ is called the “TEST STATISTIC” (that
function of the data whose value we
examine to see if it’s in AR or CR).
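ONE-SIDED LOWER TAIL
H0 : μ ≥ 20
H1 : μ < 20
[Figure: curve for X̄ centered at 20, with area α in the lower tail below the Critical Value C]

ONE-SIDED UPPER TAIL
H0 : μ ≤ 10
H1 : μ > 10
[Figure: curve for X̄ centered at 10, with area α in the upper tail above the Critical Value C]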
17
α has another meaning, which in
many contexts is important:

            we accept H0                  we reject H0
H0 true     Good! (Correct!)              Type I Error, or “α Error”
H0 false    Type II Error, or “β Error”   Good! (Correct)
18
α = Probability of Type I error
= P(rej. H0 | H0 true)
β = Probability of Type II error
= P(acc. H0 | H0 false)
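Both probabilities can be estimated by simulation. The sketch below reuses the acceptance region 150.2 to 169.8 from the battery example; the standard deviation of 5 for X̄ and the alternative value μ = 170 are assumptions made for illustration, not values given in the slides.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

LOW, HIGH = 150.2, 169.8   # acceptance region from the battery example
STD_ERR = 5                # assumed std. dev. of X-bar
TRIALS = 100_000

def accept_rate(true_mu):
    """Fraction of simulated sample means that land in the acceptance region."""
    hits = sum(LOW <= random.gauss(true_mu, STD_ERR) <= HIGH for _ in range(TRIALS))
    return hits / TRIALS

alpha_hat = 1 - accept_rate(160)  # reject H0 although mu = 160 (Type I)
beta_hat = accept_rate(170)       # accept H0 although mu = 170 (Type II, hypothetical mu)
print(alpha_hat, beta_hat)
```

Under these assumptions the estimate of α should land near .05, while β for μ = 170 is close to one half: a true mean sitting right at the edge of the acceptance region is missed about as often as it is caught.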
19
We often preset α. The value of β
depends on the specifics of the H1
(and most often in the real world, we
don’t know these specifics).
20
EXAMPLE: H0 : μ ≤ 100
H1 : μ > 100
Suppose the Critical Value = 141:
[Figure: normal curve for X̄, centered at μ = 100, with area α above the critical value C = 141]
21
What is β?
β = P (X̄ < 141 | H0 false)
μ = 150: β = P (X̄ < 141 | μ = 150) = .3594
μ = 160: β = P (X̄ < 141 | μ = 160) = .2236
μ = 170: β = P (X̄ < 141 | μ = 170) = .1230
μ = 180: β = P (X̄ < 141 | μ = 180) = .0594
These are values corresponding to a
value of 25 for the Std. Dev. of X̄.
Note: Had α been preset at .025
(instead of .05), C would have
been 149 (and β would be
larger); had α been preset at
.10, C would have been 132
and β would be smaller.
α and β “trade off”.
23
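The β values and the trade-off in the example above can be checked with a few lines of Python. This sketch assumes X̄ is normal with the standard deviation of 25 stated in the example; the function names are ours, and μ = 150 is used for the trade-off only as one illustrative alternative.

```python
from statistics import NormalDist

STD_DEV = 25   # std. dev. of X-bar, as stated in the example
MU0 = 100      # boundary value of H0

def critical_value(alpha):
    """Upper-tail critical value C with P(X-bar > C | mu = MU0) = alpha."""
    return MU0 + NormalDist().inv_cdf(1 - alpha) * STD_DEV

def beta(c, true_mu):
    """P(accept H0) = P(X-bar < C) when the true mean is true_mu."""
    return NormalDist(true_mu, STD_DEV).cdf(c)

# Beta at the slide's critical value, C = 141
for mu in (150, 160, 170, 180):
    print(mu, round(beta(141, mu), 4))

# Trade-off: as alpha goes up, C comes down and beta comes down with it
for alpha in (0.025, 0.05, 0.10):
    c = critical_value(alpha)
    print(alpha, round(c), round(beta(c, 150), 4))
```

The first loop reproduces .3594, .2236, .1230 and .0594; the second gives critical values that round to 149, 141 and 132, with β shrinking as α grows.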
In ANOVA, we have
H0 : μ1 = μ2 = • • • = μc
H1 : not all (column) means are =.
The probability law of “Fcalc” in the ANOVA
table is an F distribution with appropriate
degrees of freedom values, assuming H0 is
true:
[Figure: F distribution density, starting at 0, with area α in the upper tail above the Critical Value C]
24
Fcalc = MSBcol / MSWError
E(MSBcol) = σ² + Vcol
E(MSWError) = σ²
(The larger the ratio, Fcalc, the more suggestive
that H0 is false.)
C is the value so that if Vcol = 0 (all μ’s =),
P (Fcalc > C) = α
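One way to see what C means is to simulate Fcalc many times with all column means equal and take the value that is exceeded a fraction α of the time. This is a sketch, not how C is normally obtained (in practice it is read from F tables); the layout of 3 columns with 5 observations each, σ = 1, and the function names are all assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def f_calc(columns):
    """One-way ANOVA F ratio: MSB (between columns) / MSW (within columns)."""
    n_total = sum(len(col) for col in columns)
    grand_mean = sum(sum(col) for col in columns) / n_total
    means = [sum(col) / len(col) for col in columns]
    ssb = sum(len(col) * (m - grand_mean) ** 2 for col, m in zip(columns, means))
    ssw = sum((x - m) ** 2 for col, m in zip(columns, means) for x in col)
    msb = ssb / (len(columns) - 1)
    msw = ssw / (n_total - len(columns))
    return msb / msw

def simulated_critical_value(n_cols=3, n_per_col=5, alpha=0.05, trials=20_000):
    """Estimate C by simulating Fcalc when all column means are equal (Vcol = 0)."""
    f_values = sorted(
        f_calc([[random.gauss(0, 1) for _ in range(n_per_col)] for _ in range(n_cols)])
        for _ in range(trials)
    )
    return f_values[int((1 - alpha) * trials)]

C = simulated_critical_value()
print(round(C, 2))
```

For this assumed layout the degrees of freedom are 2 and 12, and the estimate should land close to the tabled F value of about 3.89 at α = .05.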
25
Note: What is β?
β = P (Fcalc < C | the μ’s are not all =)
(i.e., the level of the factor does matter!!)
Answer:
Unable to be determined because we
would need exact specification of the
“non-equality”.
[Hardly ever known, in practice!]
26
HOWEVER —
The fact that we cannot compute
the numerical value of β in no way
means it doesn’t exist!
And we can prove that whatever
β is, it still “trades off” with α.
27