Transcript Water Test

Water Test
• Take 1 cup from each sleeve
– See numbers on bottom of cup
– Numbers should be a # < 100 and 500 + that number
– For small # (<100), 1’s place tells which bottle to pour
– For large # (>500), 1’s place + 1 tells which bottle to pour
– Labels do not matter. (Water has been transferred, etc.)
• According to Evian’s website, their H2O has a:
– “pleasant taste and silky texture”
– Further, it is “exceptionally drinkable”
• Decide which is which, and give me the cup that you
think is tap water. (One is Evian and one is tap.)
Water Test
• Experiment estimates overall ability of
population to tell difference between tap
and Evian water:
– Xi = 0 if person i can’t tell difference
     = 1 if person i can
– p = (x1 + … + xn)/n
• Note that this is different from the
probability that an individual person can
tell the difference.
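As a minimal sketch of the estimate on this slide, the sample proportion (x1 + … + xn)/n can be computed from a list of 0/1 outcomes. The function name `sample_proportion` and the example data are hypothetical, made up only to illustrate the arithmetic; they are not results from the class experiment.

```python
def sample_proportion(outcomes):
    """Return (x1 + ... + xn) / n for a list of 0/1 outcomes."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    if any(x not in (0, 1) for x in outcomes):
        raise ValueError("each outcome must be 0 or 1")
    return sum(outcomes) / len(outcomes)


# Hypothetical data for illustration only:
# 1 = person could tell the difference, 0 = person could not.
example_outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
p_hat = sample_proportion(example_outcomes)
print(p_hat)
```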
Picture – idea for what p means
[Figure: density of tasting ability (vertical axis, 0.0 to 0.4) against tasting abilities (horizontal axis, -4 to 4); regions labeled “all water tastes the same”, average joe / jane, and water connoisseurs]
Picture – idea for what p means
[Figure: density of tasting ability (vertical axis, 0.0 to 0.4) against tasting abilities (horizontal axis, -4 to 4), with a certain ability or threshold marked]
If a person’s ability is greater than a certain number (“threshold”), then the person can tell the difference.
p is: ½ + area under curve to right of the “ability cutoff” (or 1 if that sum is greater than 1)
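The rule on this slide (½ plus the area to the right of the cutoff, capped at 1) can be sketched numerically. This sketch assumes the tasting-ability density is a standard normal curve, which the picture suggests but the slide does not state; `p_from_cutoff` and `standard_normal_cdf` are hypothetical helper names.

```python
import math


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_from_cutoff(cutoff):
    """Slide's rule: 1/2 + area to the right of the cutoff, capped at 1.

    Assumption: abilities follow a standard normal density.
    """
    area_right = 1.0 - standard_normal_cdf(cutoff)
    return min(1.0, 0.5 + area_right)


# A few hypothetical cutoffs on the tasting-ability axis.
for cutoff in (0.0, 1.0, 2.0):
    print(cutoff, round(p_from_cutoff(cutoff), 3))
```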
Water Test
• Experiment estimates overall ability of
population to tell difference between tap
and Evian water:
– Xi = 0 if person i can’t tell difference
     = 1 if person i can
– p = (x1+ …+xn)/n
• Note that this is different from the
probability that an individual person can
tell the difference.
How could we estimate this?
Hypothesis test
• Suppose we want to know if people can do
better than “just guessing”.
• What’s the null hypothesis?
Type 1 and Type 2 Errors
                 Action
Truth      | Fail to Reject H0 | Reject H0
H0 True    | correct           | Type 1 error
HA True    | Type 2 error      | correct
Significance level = α = Pr(Making type 1 error)
Power = 1 – Pr(Making type 2 error)
In terms of our water example,
suppose we repeated the experiment
and sampled 50 new people:
Pr( Type 1 error )
= Pr( reject H0 when mean is 50% )
= Pr( |Z| > z_0.025 )
= Pr( Z > 1.96 ) + Pr( Z < -1.96 ) = 0.05 = α
When p is 0.5, Z, the test statistic, has a standard normal distribution.
Note that the test is designed to have Pr( Type 1 error ) = α.
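The two tail probabilities above can be checked numerically. This is a small sketch using only the standard library; `standard_normal_cdf` and `two_sided_tail_probability` are hypothetical helper names.

```python
import math


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_tail_probability(critical_value):
    """Pr(Z > c) + Pr(Z < -c) for Z ~ N(0, 1)."""
    upper_tail = 1.0 - standard_normal_cdf(critical_value)
    lower_tail = standard_normal_cdf(-critical_value)
    return upper_tail + lower_tail


# With the critical value 1.96 the two tails add up to about 0.05.
alpha = two_sided_tail_probability(1.96)
print(round(alpha, 3))
```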
Power
= Pr( reject H0 when p is not 0.5 )
For instance, suppose a true p of 0.75 is a difference from 0.5 that is important
to detect. Then, with P the sample proportion,
Power
= Pr( reject H0 when p is 0.75 )
= Pr( |(P-0.5)/sqrt((0.75*0.25)/35)| > 1.96 )
= Pr( (P-0.5)/sqrt((0.75*0.25)/35) > 1.96 )
+ Pr( (P-0.5)/sqrt((0.75*0.25)/35) < -1.96 )
P ~ N(0.75, (0.75*0.25)/35) by CLT, so
(P-0.5)/sqrt((0.75*0.25)/35) ~ N(3.42, 1). This means we want
Pr( Z > 1.96-3.42 ) + Pr( Z < -1.96-3.42 ) where Z ~ N(0,1)
= Pr( Z > -1.46 ) + Pr( Z < -5.38 ) = 0.93
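The power calculation above can be reproduced in a few lines. The sketch follows the slide's own arithmetic, including its choice of standard error sqrt((0.75*0.25)/35) computed at the true p; the function names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_two_sided(p_true, n, p_null=0.5, z_crit=1.96):
    """Power of the two-sided test, following the slide's calculation.

    The standard error uses p_true, as on the slide, so the test
    statistic is N(shift, 1) when the true proportion is p_true.
    Requires 0 < p_true < 1.
    """
    standard_error = math.sqrt(p_true * (1.0 - p_true) / n)
    shift = (p_true - p_null) / standard_error  # 3.42 on the slide
    upper = 1.0 - standard_normal_cdf(z_crit - shift)
    lower = standard_normal_cdf(-z_crit - shift)
    return upper + lower


print(round(power_two_sided(0.75, 35), 2))
```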
Power calculations are an integral
part of planning any experiment:
• Given:
– a certain level of α
– difference that is of interest
– (need preliminary estimate of std dev of x’s
that go into x̄ when the test is about a mean)
• Compute required n in order for power to
be at least 85% (or some other
percentage...)
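One way to sketch this planning step is a simple search over n until the power reaches the target. The example reuses the water-test setup (null value 0.5, α = 0.05 two-sided, standard error computed at the true p as in the earlier power calculation); `required_sample_size`, `power_two_sided`, and the `max_n` search limit are hypothetical names and choices, not a description of what any statistical package does internally.

```python
import math


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_two_sided(p_true, n, p_null=0.5, z_crit=1.96):
    """Power of the two-sided test when the true proportion is p_true."""
    standard_error = math.sqrt(p_true * (1.0 - p_true) / n)
    shift = (p_true - p_null) / standard_error
    upper = 1.0 - standard_normal_cdf(z_crit - shift)
    lower = standard_normal_cdf(-z_crit - shift)
    return upper + lower


def required_sample_size(p_true, target_power=0.85, max_n=100000):
    """Smallest n (up to max_n) whose power is at least target_power."""
    for n in range(1, max_n + 1):
        if power_two_sided(p_true, n) >= target_power:
            return n
    raise ValueError("target power not reached within max_n")


# Difference of interest: true p of 0.75 versus the null value 0.5.
n_needed = required_sample_size(0.75, target_power=0.85)
print(n_needed, round(power_two_sided(0.75, n_needed), 3))
```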
Power calculations are an integral
part of planning any experiment:
• Bad News: Algebraically messy (but you should know
how to do them)
• Good News: Minitab (or other statistical software) can
be used to do them:
• Stat: Power and Sample Size…
– Choose type of test you’re doing (“Z test” = large sample test
for a mean)
– Inputs:
1. required power
2. difference of interest
– Output:
Result = required sample size
– Options: Change α, one sided versus 2 sided tests
Picture for Power
[Figure: power curves; vertical axis Power, “Pr(Reject H0 when it’s false)”, from 0.0 to 1.0; horizontal axis True p, from 0.0 to 1.0]
As n increases and/or α increases and/or std dev decreases, these curves become steeper
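The shape of such curves can be tabulated rather than plotted. The sketch below evaluates power at a few values of the true p for several sample sizes, again for the water-test setup with null value 0.5 and the standard error computed at the true p; the sample sizes and the helper name `power_at` are hypothetical choices for illustration.

```python
import math


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_at(p_true, n, z_crit=1.96, p_null=0.5):
    """Two-sided power at true proportion p_true (requires 0 < p_true < 1)."""
    standard_error = math.sqrt(p_true * (1.0 - p_true) / n)
    shift = (p_true - p_null) / standard_error
    upper = 1.0 - standard_normal_cdf(z_crit - shift)
    lower = standard_normal_cdf(-z_crit - shift)
    return upper + lower


true_p_values = [0.5, 0.6, 0.7, 0.8, 0.9]
for n in (10, 35, 100):
    # Each row is one power curve sampled at the true p values above.
    row = ", ".join(f"{power_at(p, n):.2f}" for p in true_p_values)
    print(f"n={n:>3}: {row}")
```

At true p = 0.5 every row starts near α; larger n makes the values climb toward 1 more quickly as the true p moves away from 0.5.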