PH1690: Foundations of
Biostatistics
L8(2): Two Sample Hypothesis Testing
Objective
• Understand how to conduct appropriate t tests for two independent samples
Two Independent Samples
• Numerator of the t statistic: the difference of the sample means
• Denominator: depends on whether the population variances are considered equal.
– Uses the pooled variance estimate if the assumption of equal variances
cannot be rejected.
– Otherwise uses an approximate method that does not assume equal variances.
To be pooled, or not to be pooled? Equal S.D. or
not equal?
• We could base our decision on an informal rule
(usually works and is much less complicated).
– If neither sample standard deviation is more than twice the other
(i.e., 0.5 < s1/s2 < 2), then the assumption of equal standard deviations
should be OK.
• We could perform a graphical analysis and look at the
box-plots for the samples to informally assess the
equal standard deviations assumption.
• There are formal tests to assess the evidence against
equal population standard deviations
– Variance Ratio Test
– Levene’s Test
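The informal rule above can be sketched in a few lines of Python. This is only an illustration: the function name `sd_rule_ok` is made up for this sketch, and the two standard deviations are the ones reported for the Vitamin C example in these slides.

```python
def sd_rule_ok(s1: float, s2: float) -> bool:
    """Informal rule: neither SD is more than twice the other (0.5 < s1/s2 < 2)."""
    ratio = s1 / s2
    return 0.5 < ratio < 2


# Sample standard deviations from the Vitamin C example (Vitamin C, Placebo).
vitc_sd, placebo_sd = 1.57, 1.34
print(round(vitc_sd / placebo_sd, 2), sd_rule_ok(vitc_sd, placebo_sd))
```

The rule is only a screening device. A formal test can still reject equal variances when the ratio falls inside the (0.5, 2) band, so it is worth running one of the formal tests when the ratio is close to either limit.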
Variance Ratio Test
• Ho: $\sigma_1^2 = \sigma_2^2$ vs. Ha: $\sigma_1^2 \neq \sigma_2^2$
• Test statistic: $F = s_1^2 / s_2^2 \sim F_{df_1 = n_1 - 1,\; df_2 = n_2 - 1}$
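• p-value: from the F distribution with (n1 − 1, n2 − 1) degrees of freedom; for the two-sided alternative, double the smaller tail probability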
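As a rough sketch of the arithmetic behind the test statistic, the snippet below computes F and its degrees of freedom from summary statistics. The helper name `variance_ratio` is hypothetical, and the inputs are the standard deviations and sample sizes from the Vitamin C example in these slides. The p-value is not computed here because it requires the F distribution, which is not in the Python standard library.

```python
def variance_ratio(s1: float, n1: int, s2: float, n2: int):
    """Return (F, df1, df2) for testing H0: sigma1^2 = sigma2^2."""
    f_stat = s1**2 / s2**2          # ratio of the two sample variances
    return f_stat, n1 - 1, n2 - 1   # numerator and denominator df


# Vitamin C group (sd, n) followed by Placebo group (sd, n).
f_stat, df1, df2 = variance_ratio(1.567021, 10, 1.337494, 10)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) df")
```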
A note on the effect of unequal variances on
Student's t test with equal variances
• When the sample sizes are approximately
equal (what we call a balanced design),
unequal variances have little effect on p-values and CIs.
T Test with Equal Variances
• H0: μ1 = μ2
• H1: μ1 < μ2 or H1: μ1 > μ2 or H1: μ1 ≠ μ2
• Test statistic: t
• Numerator: difference in sample means
• Denominator: square root of pooled variance
estimate, times a quantity involving sample sizes of
both groups
• df: n1+n2-2
• Critical value: depends on H1 (one-sided or two-sided) and on α
T Test with Equal Variances
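• t test statistic:
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$
• Rejection region:
– One-sided H1: $t > t_{n_1+n_2-2,\,1-\alpha}$ (for H1: μ1 > μ2) or $t < -t_{n_1+n_2-2,\,1-\alpha}$ (for H1: μ1 < μ2)
– Two-sided H1: $t > t_{n_1+n_2-2,\,1-\alpha/2}$ or $t < -t_{n_1+n_2-2,\,1-\alpha/2}$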
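A minimal sketch of the pooled t statistic, computed from summary statistics only. The function name `pooled_t` is an assumption of this sketch, and the inputs are the group means, standard deviations, and sample sizes from the Vitamin C example in these slides.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: df-weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, df


# Vitamin C group first, Placebo group second.
t_stat, df = pooled_t(3.3, 1.567021, 10, 5.7, 1.337494, 10)
print(f"t = {t_stat:.4f}, df = {df}")
```

With these inputs the result should agree, up to rounding, with the t statistic and degrees of freedom that Stata reports for the same example.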
Vitamin C Example
• Goal: test whether Vitamin C reduces the occurrence of colds
compared with a placebo.
• 20 subjects are randomly assigned to the Vitamin C
group or the placebo group.
Vitamin C Data
• The Excel data sheet:

Subject   Vitamin C   Placebo
1         4           7
2         0           8
3         3           4
4         4           6
5         4           6
6         3           4
7         4           6
8         3           4
9         2           6
10        6           6

Group       Mean   SD
Vitamin C   3.3    1.57
Placebo     5.7    1.34
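The group means and standard deviations in the summary table can be checked directly from the raw data. The sketch below uses Python's standard `statistics` module; the two lists are the Vitamin C and Placebo columns transcribed from the data sheet above.

```python
from statistics import mean, stdev

# Columns of the data sheet, in subject order 1..10.
vitamin_c = [4, 0, 3, 4, 4, 3, 4, 3, 2, 6]
placebo = [7, 8, 4, 6, 6, 4, 6, 4, 6, 6]

for name, values in (("Vitamin C", vitamin_c), ("Placebo", placebo)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: n={len(values)}, mean={mean(values):.1f}, sd={stdev(values):.2f}")
```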
• Stata examples are provided. We will walk
through the steps in class.
Stata Checking Equal Variance
Notice that the standard deviations pass the 0.5 < s1/s2 < 2 rule. Formally, using the
variance ratio test, ratio = 1.37, two-sided p-value = 0.6446. Decision: fail to reject the
null hypothesis that the variances are equal. Conclusion: we can use the t test with equal
variances.
Stata T Test Output
. ttest VitaminC=Placebo, unpaired

Two-sample t test with equal variances

Variable    Obs    Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
VitaminC     10     3.3    .4955356    1.567021     2.179021    4.420979
Placebo      10     5.7    .4229526    1.337494     4.743215    6.656785
combined     20     4.5    .4198997    1.877849      3.62114     5.37886
diff               -2.4     .651494                -3.768738   -1.031262

    diff = mean(VitaminC) - mean(Placebo)                  t = -3.6838
Ho: diff = 0                              degrees of freedom =      18

Ha: diff < 0             Ha: diff != 0               Ha: diff > 0
Pr(T < t) = 0.0008       Pr(|T| > |t|) = 0.0017      Pr(T > t) = 0.9992
Another Scenario
• What happens if we reject the equal-variance
hypothesis? In other words, the intermediate
null hypothesis $\sigma_1^2 = \sigma_2^2$ has been rejected.
T Test with Unequal Variances
• Hypotheses: same as t test with equal
variances
• Test statistic
• Numerator: same as t test with equal variances
• Denominator: different
• df: different
Satterthwaite’s Method
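• t test statistic
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
• The approximate degrees of freedom, d'
$$ d' = \frac{\left( s_1^2 / n_1 + s_2^2 / n_2 \right)^2}{\left( s_1^2 / n_1 \right)^2 / (n_1 - 1) + \left( s_2^2 / n_2 \right)^2 / (n_2 - 1)} $$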
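The following sketch evaluates Satterthwaite's formulas from summary statistics. The function name `welch_t` is an assumption of this sketch, and the inputs are the means, standard deviations, and sample sizes from the smoker and non-smoker example in these slides.

```python
import math


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unequal-variance t statistic with Satterthwaite's approximate df."""
    v1, v2 = s1**2 / n1, s2**2 / n2   # squared standard errors of each mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df


# Smoker group first, NonSmoker group second.
t_stat, df = welch_t(54.4, 25.95624, 20, 115.9, 46.69487, 20)
print(f"t = {t_stat:.4f}, d' = {df:.4f}")
```

Note that d' is generally not an integer and never exceeds n1 + n2 − 2, so the unequal-variance test uses a critical value at least as large as the pooled test would.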
T Test for Unequal Variances
• Rejection Region
– One-sided H1: $t > t_{d',\,1-\alpha}$ or $t < -t_{d',\,1-\alpha}$
– Two-sided H1: $t > t_{d',\,1-\alpha/2}$ or $t < -t_{d',\,1-\alpha/2}$
Another Example Related to Vitamin C
• The data set was collected to answer the
research question: does Vitamin C intake
differ between smokers and non-smokers?
• 20 smokers and 20 non-smokers were
included in this study.
Equal Variance Rejection
. sdtest Smoker=NonSmoker

Variance ratio test

Variable    Obs     Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
Smoker       20     54.4    5.803991    25.95624    42.25211    66.54789
NonSmo~r     20    115.9    10.44129    46.69487    94.04613    137.7539
combined     40    85.15    7.681609    48.58276    69.61248    100.6875

    ratio = sd(Smoker) / sd(NonSmoker)                     f =  0.3090
Ho: ratio = 1                             degrees of freedom =  19, 19

Ha: ratio < 1            Ha: ratio != 1              Ha: ratio > 1
Pr(F < f) = 0.0070       2*Pr(F < f) = 0.0139        Pr(F > f) = 0.9930
T Test with Unequal Variance
. ttest Smoker=NonSmoker, unpaired unequal

Two-sample t test with unequal variances

Variable    Obs     Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
Smoker       20     54.4    5.803991    25.95624     42.25211    66.54789
NonSmo~r     20    115.9    10.44129    46.69487     94.04613    137.7539
combined     40    85.15    7.681609    48.58276     69.61248    100.6875
diff               -61.5      11.946                -85.90668   -37.09332

    diff = mean(Smoker) - mean(NonSmoker)                  t = -5.1482
Ho: diff = 0              Satterthwaite's degrees of freedom = 29.7183

Ha: diff < 0             Ha: diff != 0               Ha: diff > 0
Pr(T < t) = 0.0000       Pr(|T| > |t|) = 0.0000      Pr(T > t) = 1.0000
CI for Paired Data
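• Apply the confidence interval methods
illustrated for the one-sample t test in Chapter 6
to the differences d to find a confidence interval for Δ, as
shown in the following equation:
$$ \bar{d} - t_{n-1,\,1-\alpha/2}\,\frac{s_d}{\sqrt{n}} \;\le\; \Delta \;\le\; \bar{d} + t_{n-1,\,1-\alpha/2}\,\frac{s_d}{\sqrt{n}} $$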
CI for Independent Data
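• If equal variances,
$$ \bar{x}_1 - \bar{x}_2 - t_{n_1+n_2-2,\,1-\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{n_1+n_2-2,\,1-\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
• If unequal variances,
$$ \bar{x}_1 - \bar{x}_2 - t_{d',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{d',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$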
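As an illustration of the equal-variance interval, the sketch below computes a 95% CI for μ1 − μ2 from the Vitamin C summary statistics in these slides. The function name `pooled_ci` is hypothetical. Because the Python standard library has no t distribution, the critical value is passed in as an argument; 2.100922 is the 0.975 quantile of the t distribution with 18 degrees of freedom, taken from a t table.

```python
import math


def pooled_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """CI for mu1 - mu2 assuming equal variances; t_crit is supplied by the caller."""
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    half_width = t_crit * pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - half_width, diff + half_width


T_CRIT_18_975 = 2.100922  # t quantile for df = 18, 1 - alpha/2 = 0.975 (table value)
low, high = pooled_ci(3.3, 1.567021, 10, 5.7, 1.337494, 10, T_CRIT_18_975)
print(f"95% CI for mu1 - mu2: ({low:.4f}, {high:.4f})")
```

The interval should match, up to rounding, the confidence limits on the `diff` row of the Stata equal-variance output. The unequal-variance interval follows the same pattern, with the standard error and critical value replaced by their Satterthwaite counterparts.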
Summary
• You have learned:
– How to conduct a variance ratio test
– How to conduct a two-sample independent t-test
• Assuming equal variances
• Assuming unequal variances