biol.582.f2011.lec.4

BIOL 582
Lecture Set 4
Two Sample Hypothesis Tests
Review
• We have already done a few two-sample hypothesis tests,
where we have generated distributions like this one
[Figure: histogram of mean 1 - mean 2 (x-axis, -3 to 3) against Frequency (y-axis, 0 to 200). Title: Two samples (n1 = 20), (n2 = 30), mean 1 = mean 2, sd 1 = sd 2, 1000 random permutations.]
Review
• And we learned about theoretical probability distributions
• We know that both empirical and theoretical distributions
are used as proxies for distributions of test statistics under
a null condition.
[Figure: density plot of mean 1 - mean 2 (x-axis, -3 to 3) against Density (y-axis, 0.0 to 0.4). Title: Two samples (n1 = 20), (n2 = 30), mean 1 = mean 2, sd 1 = sd 2, 1000 random permutations.]
Review
• Finally, we know how hypothesis testing works
• State Null hypothesis
• State Alternative hypothesis
• Define acceptable type I error rate
• Determine P-value of observed statistic
• Reject/Accept (or fail to reject) null hypothesis
• Arrive at a reasonable conclusion
• One thing we have not emphasized is that the process
above requires use of a proper test statistic!
• For example, if our null hypothesis is no difference in the
proportion of individuals with trait y less than x, comparing
means would not make sense.
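The steps above can be sketched as a permutation test on the difference in means. This is a minimal illustration in Python (the course itself uses R), not the lecture's own code. The data are simulated, the function name `permutation_test` is hypothetical, and counting the observed arrangement as one of the permutations is a common convention assumed here.

```python
import random
import statistics


def permutation_test(sample1, sample2, n_perm=1000, seed=0):
    """Two-tailed permutation test for a difference in means.

    Returns the observed difference and its empirical P-value.
    """
    rng = random.Random(seed)
    observed = statistics.mean(sample1) - statistics.mean(sample2)
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    # ASSUMPTION: the observed arrangement counts as one permutation
    count = 1
    for _ in range(n_perm - 1):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n1]) - statistics.mean(pooled[n1:])
        if abs(diff) >= abs(observed):
            count += 1
    return observed, count / n_perm


# Hypothetical data: both samples come from the same population,
# so the null hypothesis (mean 1 = mean 2) is true by construction.
rng = random.Random(1)
sample1 = [rng.gauss(0, 1) for _ in range(20)]
sample2 = [rng.gauss(0, 1) for _ in range(30)]

alpha = 0.05  # acceptable type I error rate
observed, p_value = permutation_test(sample1, sample2)
decision = "reject" if p_value < alpha else "fail to reject"
print(f"d12 = {observed:.3f}, P = {p_value:.3f}: {decision} the null hypothesis")
```

The test statistic here is the difference in means because the null hypothesis concerns means; a null hypothesis about proportions would need a different statistic inside the loop.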
Two-Sample approaches for comparison of means
• Here are some example test statistics. The first is most
intuitive; the rest build on the first.

$$d_{12} = \bar{x}_1 - \bar{x}_2$$

$$d_{12} = |\bar{x}_1 - \bar{x}_2|$$

These are the same, but the absolute value indicates that a two-tailed assessment of a
distribution is used (i.e., the alternative hypothesis is that means are not equal). As we
have seen, these stats can be evaluated with empirical distributions.

$$z_{12} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}}$$

The z stat scales the difference between sample means by the square root of the pooled
population variance (stuff in denominator). This stat converts the difference in means to
a “standard deviate”, which can be evaluated with a standard normal distribution. It
requires that population variances are known (rather unlikely).

$$t_{12} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

The t stat scales the difference between sample means by the square root of
the pooled sample variance (first stuff in denominator), which is converted
to a standard error (by multiplying by the second stuff).
This stat converts the difference in means to a “t stat”, which can be
evaluated with a t distribution. Recall that the t distribution is like a
standard normal distribution, corrected for small sample sizes. There are
different t distributions for different sample sizes (degrees of freedom). The
degrees of freedom are always the sum of the sample sizes, minus 2, unless
a “correction” is made (more later).
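As a rough sketch of how these statistics are computed, the Python below (standard library only) evaluates d, z, and the pooled-variance t on two small made-up samples. The data values, the function names, and the "known" population variances passed to `z_stat` are all illustrative assumptions.

```python
import math
import statistics


def mean_difference(x1, x2):
    """d12: difference between the two sample means."""
    return statistics.mean(x1) - statistics.mean(x2)


def z_stat(x1, x2, var1, var2):
    """z12, where var1 and var2 are population variances assumed known."""
    return mean_difference(x1, x2) / math.sqrt(var1 / len(x1) + var2 / len(x2))


def t_stat(x1, x2):
    """Pooled-variance t12 and its degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(x1), len(x2)
    s1_sq, s2_sq = statistics.variance(x1), statistics.variance(x2)
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / ((n1 - 1) + (n2 - 1))
    # Square root of pooled variance, converted to a standard error
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return mean_difference(x1, x2) / se, n1 + n2 - 2


# Hypothetical measurements, for illustration only
x1 = [5.1, 4.9, 6.2, 5.8, 6.0]
x2 = [4.2, 4.8, 5.0, 4.4]

d12 = mean_difference(x1, x2)
z12 = z_stat(x1, x2, var1=0.3, var2=0.3)  # ASSUMPTION: variances known to be 0.3
t12, df = t_stat(x1, x2)
print(f"d12 = {d12:.3f}, |d12| = {abs(d12):.3f}")
print(f"z12 = {z12:.3f}")
print(f"t12 = {t12:.3f} on {df} df")
```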
Two-Sample approaches for comparison of means
• Here are some more test statistics
$SS_{12}$

The “Sums of Squares” between samples 1 and 2. This stat needs some explanation.
Recall that sample variance is equal to

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

The numerator is the sums of squares. Thus

$$s^2 = \frac{SS}{n - 1}$$

However, the subscript numbers indicate that two samples are used. So in this case, the sums of
squares are calculated from sample means, not for them. See below:

$$SS_{12} = (\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2$$

Where the double bars indicate that this value is the “grand” mean (calculated from
values of both samples).
(See proof for interchangeability between SS and contrast in means on last page.)
One must use an empirical distribution to evaluate this stat!
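A minimal Python sketch of this statistic and its empirical evaluation follows. It assumes the grand mean is the mean of all pooled observations and uses the unweighted form of SS12 given above. The data and the names `ss_between` and `empirical_p` are hypothetical.

```python
import random
import statistics


def ss_between(x1, x2):
    """SS12: squared deviations of the two sample means from the grand mean."""
    # ASSUMPTION: grand mean = mean of all observations from both samples
    grand_mean = statistics.mean(list(x1) + list(x2))
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    return (m1 - grand_mean) ** 2 + (m2 - grand_mean) ** 2


def empirical_p(x1, x2, n_perm=1000, seed=0):
    """Evaluate SS12 against a permutation (empirical) distribution."""
    rng = random.Random(seed)
    observed = ss_between(x1, x2)
    pooled = list(x1) + list(x2)
    n1 = len(x1)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm - 1):
        rng.shuffle(pooled)
        if ss_between(pooled[:n1], pooled[n1:]) >= observed:
            count += 1
    return observed, count / n_perm


# Hypothetical measurements, for illustration only
x1 = [5.1, 4.9, 6.2, 5.8, 6.0]
x2 = [4.2, 4.8, 5.0, 4.4]
ss12, p_value = empirical_p(x1, x2)
print(f"SS12 = {ss12:.4f}, empirical P = {p_value:.3f}")
```

Because SS12 is built from squared deviations, large values in either direction count as extreme, so no absolute value is needed in the comparison.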
Two-Sample approaches for comparison of means
• Here are some more test statistics
$$F = \frac{SS_{12}}{SSE / (n_1 + n_2 - 2)}$$

The F value is a ratio of “between group” variance to “within group” variance.
The sums of squares are first divided by the degrees of freedom (1 for between
groups, n1 + n2 - 2 for within groups) to create variances. Then the between-group
variance is “standardized” by the within-group variance. The sum of
squared error, SSE, is found as

$$SSE = \sum (x_{ij} - \bar{x}_i)^2$$

Meaning the mean of each group is subtracted from every subject within the
groups, squared, and summed. This is essentially finding the variance of each
sample, but before dividing the SS by the df, the components are first added
(pooled).
This stat can be evaluated with an F distribution, with 1 and n1 + n2 - 2 df.
When the null hypothesis compares two groups, F = t².
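The relationship F = t² can be checked numerically. The sketch below is an assumption-laden illustration, not the lecture's own calculation: it uses the conventional one-way ANOVA between-group sum of squares, in which each squared deviation is weighted by its group's sample size. That differs from the unweighted SS12 written earlier, and it is the weighted form for which F matches the square of the pooled-variance t. Data and function names are hypothetical.

```python
import math
import statistics


def f_stat(x1, x2):
    """Two-group F ratio with (1, n1 + n2 - 2) degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    grand_mean = statistics.mean(list(x1) + list(x2))
    # ASSUMPTION: conventional ANOVA weighting of each group by its n
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    # SSE: deviations of every subject from its own group mean, squared, summed
    sse = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
    df_between, df_within = 1, n1 + n2 - 2
    return (ss_between / df_between) / (sse / df_within), df_between, df_within


def pooled_t(x1, x2):
    """Pooled-variance t stat, for comparison with F."""
    n1, n2 = len(x1), len(x2)
    pooled_var = ((n1 - 1) * statistics.variance(x1)
                  + (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x1) - statistics.mean(x2)) / se


# Hypothetical measurements, for illustration only
x1 = [5.1, 4.9, 6.2, 5.8, 6.0]
x2 = [4.2, 4.8, 5.0, 4.4]
f_value, df1, df2 = f_stat(x1, x2)
t_value = pooled_t(x1, x2)
print(f"F = {f_value:.4f} on ({df1}, {df2}) df; t squared = {t_value ** 2:.4f}")
```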
Two-Sample approaches for comparison of means
• Thought-provoking question
• d12, SS require generating empirical distributions
• z, t, F do not
• Why would one bother using d12, SS?
• The other test stats have rigid assumptions about the data
1. Data are sampled from normally distributed populations
2. Populations have equal variances (although there are some ways to deal with this)
3. All data are independent observations (there are also ways to deal with this)
Two-Sample approaches for comparison of means
• Thought-provoking question
• d12, SS require generating empirical distributions
• z, t, F do not
• Why would one bother using stats that have strict
assumptions?
• Parametric stats have a multitude of uses whereas
empirical distributions have to take into account the
appropriate (and often unique) method to generate them.
Two-Sample approaches for comparison of means
• For example, t and F stats can be used to
• test the slopes of linear regressions
• measure effects in “paired” designs
• test differences in proportions between populations
• compare linear models
• We will investigate the following “by hand” and with “canned” functions in R
• Two sample means contrast
• Paired designs
• The t stat has different variants for different tests, and for violations of
assumptions. For example, when variances are unequal, one can use an
alternative calculation of t, which penalizes the degrees of freedom. This means
using a t distribution with fatter tails, which makes a type II error more likely
in exchange for a lower risk of a type I error. We will not dwell on these, but
realize they exist. We will spend time, however, learning to concern ourselves
with assumptions.
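One widely used unequal-variance variant is Welch's t with the Welch-Satterthwaite degrees of freedom. The sketch below assumes that is the kind of "alternative calculation" meant here; the lecture does not name it. The data and the function name `welch_t` are hypothetical.

```python
import math
import statistics


def welch_t(x1, x2):
    """Welch's t stat and Welch-Satterthwaite (penalized) degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    a = statistics.variance(x1) / n1  # squared standard error of mean 1
    b = statistics.variance(x2) / n2  # squared standard error of mean 2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical measurements, for illustration only
x1 = [5.1, 4.9, 6.2, 5.8, 6.0]
x2 = [4.2, 4.8, 5.0, 4.4]
t_value, df = welch_t(x1, x2)
print(f"Welch t = {t_value:.3f} on {df:.2f} df (pooled test: {len(x1) + len(x2) - 2} df)")
```

The penalized df never exceeds n1 + n2 - 2, so the reference t distribution has tails at least as fat as in the pooled test.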
Dealing with assumptions
• Normality
• How to evaluate? Goodness of fit tests, normal probability plots
[Figure: two normal quantile plots, one labeled “Good” and one labeled “Bad”, each plotting sd observed (y-axis) against sd expected (x-axis).]
Two-sample Kolmogorov-Smirnov test
data: a and b
D = 0.04, p-value = 0.4005
alternative hypothesis: two-sided
Two-sample Kolmogorov-Smirnov test
data: a and b
D = 0.078, p-value = 0.004558
alternative hypothesis: two-sided
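Both diagnostics can be sketched with the Python standard library. The first function returns the coordinates behind a normal quantile plot (sd expected against sd observed). The second computes the two-sample Kolmogorov-Smirnov D statistic only, without a p-value. The plotting positions (i - 0.5) / n are one common convention assumed here, and the function names and simulated data are hypothetical.

```python
import bisect
import random
import statistics


def normal_quantile_points(data):
    """Return (sd expected, sd observed) pairs for a normal quantile plot."""
    n = len(data)
    mean, sd = statistics.mean(data), statistics.stdev(data)
    observed = sorted((x - mean) / sd for x in data)
    std_normal = statistics.NormalDist()
    # ASSUMPTION: plotting positions (i - 0.5) / n; other conventions exist
    expected = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))


def ks_two_sample_d(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical data: a normal sample and a right-skewed sample
rng = random.Random(0)
normal_sample = [rng.gauss(0, 1) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]

points = normal_quantile_points(skewed_sample)
print("first and last plot points:", points[0], points[-1])
print(f"KS D = {ks_two_sample_d(normal_sample, skewed_sample):.3f}")
```

Points that fall close to the line y = x suggest approximate normality; systematic curvature away from it does not.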
Dealing with assumptions
• Equal variance (and a hint of normality)
• How to evaluate? Box plots
[Figure: three box plots of standard deviates (y-axis) by Group (x-axis, groups 1 and 2), with panels labeled “Good”, “Bad”, and “Really Bad”.]
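The numbers behind a box plot can be compared directly. The sketch below computes a five-number summary and the interquartile range for each group, plus a crude ratio of sample variances. It is only an informal screen; the simulated data, the function name `box_summary`, and the use of the default quartile method in `statistics.quantiles` are assumptions.

```python
import random
import statistics


def box_summary(values):
    """Five-number summary and IQR, i.e. what a box plot draws."""
    # ASSUMPTION: default ("exclusive") quartile method of statistics.quantiles
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values), "iqr": q3 - q1}


# Hypothetical data: group 2 is far more variable than group 1
rng = random.Random(0)
group1 = [rng.gauss(0, 1) for _ in range(50)]
group2 = [rng.gauss(0, 3) for _ in range(50)]

for name, group in (("Group 1", group1), ("Group 2", group2)):
    summary = box_summary(group)
    print(name, {key: round(value, 2) for key, value in summary.items()})

variances = [statistics.variance(group1), statistics.variance(group2)]
variance_ratio = max(variances) / min(variances)
print(f"larger variance / smaller variance = {variance_ratio:.2f}")
```

Boxes of very different heights, or a variance ratio far from 1, hint that the equal-variance assumption is in trouble.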
Proof of interchangeability between squared mean contrast and SS
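$$(\bar{x}_1 - \bar{x}_2)^2 = \bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2$$

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2)^2 &= \left[(\bar{x}_1 - \bar{\bar{x}}) - (\bar{x}_2 - \bar{\bar{x}})\right]^2 \\
&= (\bar{x}_1 - \bar{\bar{x}})^2 - 2(\bar{x}_1 - \bar{\bar{x}})(\bar{x}_2 - \bar{\bar{x}}) + (\bar{x}_2 - \bar{\bar{x}})^2 \\
&= SS - 2(\bar{x}_1 - \bar{\bar{x}})(\bar{x}_2 - \bar{\bar{x}}) \\
&= SS - 2\bar{x}_1\bar{x}_2 + 2\bar{x}_1\bar{\bar{x}} + 2\bar{x}_2\bar{\bar{x}} - 2\bar{\bar{x}}^2
\end{aligned}$$

Thus

$$\begin{aligned}
\bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2 &= SS - 2\bar{x}_1\bar{x}_2 + 2\bar{x}_1\bar{\bar{x}} + 2\bar{x}_2\bar{\bar{x}} - 2\bar{\bar{x}}^2 \\
\bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2 + 2\bar{x}_1\bar{x}_2 - 2\bar{x}_1\bar{\bar{x}} - 2\bar{x}_2\bar{\bar{x}} + 2\bar{\bar{x}}^2 &= SS \\
\bar{x}_1^2 + \bar{x}_2^2 - 2\bar{x}_1\bar{\bar{x}} - 2\bar{x}_2\bar{\bar{x}} + 2\bar{\bar{x}}^2 &= SS \\
\bar{x}_1^2 - 2\bar{x}_1\bar{\bar{x}} + \bar{\bar{x}}^2 + \bar{x}_2^2 - 2\bar{x}_2\bar{\bar{x}} + \bar{\bar{x}}^2 &= SS \\
(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 &= SS
\end{aligned}$$

[Figure: ratio of squared mean contrast to SS (y-axis, “mean contrast/SS”, 1.0 to 2.0) plotted against n1/n2 (x-axis, 0 to 100).]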
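The identity can be checked numerically. The sketch below assumes the grand mean is the mean of all pooled observations. Under that assumption the ratio of squared mean contrast to SS works out to (n1 + n2)² / (n1² + n2²), which is 2 for equal sample sizes and approaches 1 as n1/n2 grows. The simulated data and the function name `contrast_and_ss` are hypothetical.

```python
import math
import random
import statistics


def contrast_and_ss(x1, x2):
    """Return squared mean contrast, SS, and the cross term 2(m1 - g)(m2 - g)."""
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    # ASSUMPTION: grand mean g = mean of all observations from both samples
    g = statistics.mean(list(x1) + list(x2))
    contrast_sq = (m1 - m2) ** 2
    ss = (m1 - g) ** 2 + (m2 - g) ** 2
    cross = 2 * (m1 - g) * (m2 - g)
    return contrast_sq, ss, cross


rng = random.Random(0)
n2 = 10
ratios = {}
for size_ratio in (1, 2, 10, 100):
    n1 = size_ratio * n2
    x1 = [rng.gauss(1, 1) for _ in range(n1)]
    x2 = [rng.gauss(0, 1) for _ in range(n2)]
    contrast_sq, ss, cross = contrast_and_ss(x1, x2)
    # Squared contrast equals SS minus the cross term
    assert math.isclose(contrast_sq, ss - cross, rel_tol=1e-9, abs_tol=1e-12)
    ratios[size_ratio] = contrast_sq / ss
    expected = (n1 + n2) ** 2 / (n1 ** 2 + n2 ** 2)
    print(f"n1/n2 = {size_ratio:>3}: contrast/SS = {ratios[size_ratio]:.4f}, "
          f"(n1+n2)^2/(n1^2+n2^2) = {expected:.4f}")
```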