Transcript sample

Two Independent Means
Unit 8
HS 167
8: Comparing Two Means
1
Sampling Considerations
One sample or two?
If two samples, paired or independent?
Is the response variable quantitative or categorical?
Am I interested in the mean difference?
This chapter → two independent samples → quantitative response → interest in mean difference
One sample
SRS from one population
Comparisons made to an external reference population
Paired Sample
Two samples, with each observation in sample 1 matched to a unique observation in sample 2
Just like a one-sample problem, except inferences are directed toward the within-pair differences (DELTA)
Independent sample inference
Independent samples from two populations
No matching or pairing
What type of sampling method?
1. Measure vitamin content in loaves of bread and see if the average meets national standards.
2. Compare vitamin content of bread immediately after baking versus 3 days later (same loaves are used on day one and 3 days later).
3. Compare vitamin content of bread immediately after baking versus loaves that have been on the shelf for 3 days.
1 = single sample
2 = paired samples
3 = independent samples
Illustrative example: independent samples
Goal: compare response variable in two groups
Fasting cholesterol (mg/dl)
Group 1 (type A personality):
233, 291, 312, 250, 246, 197, 268,
224, 239, 239, 254, 276, 234, 181,
248, 252, 202, 218, 212, 325
Group 2 (type B personality)
344, 185, 263, 246, 224, 212, 188,
250, 148, 169, 226, 175, 242, 252,
153, 183, 137, 202, 194, 213
Data setup for independent samples
Two columns
Response variable in one column
Explanatory variable in other column
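As a rough sketch of this two-column layout, the cholesterol data from the illustrative example can be arranged as one row per subject in Python. The names `rows`, `group`, and `chol` are illustrative choices, not part of the course materials.

```python
# Fasting cholesterol (mg/dl) from the illustrative example.
group1 = [233, 291, 312, 250, 246, 197, 268, 224, 239, 239,
          254, 276, 234, 181, 248, 252, 202, 218, 212, 325]
group2 = [344, 185, 263, 246, 224, 212, 188, 250, 148, 169,
          226, 175, 242, 252, 153, 183, 137, 202, 194, 213]

# One row per subject: the explanatory variable (group) in one column,
# the response variable (cholesterol) in the other.
rows = [(1, chol) for chol in group1] + [(2, chol) for chol in group2]

# Show the first and last few rows of the two-column layout.
print("group  chol")
for group, chol in rows[:3] + rows[-3:]:
    print(f"{group:5d}  {chol:4d}")
```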
Side-by-side boxplots
Compare locations, spreads, and shapes
{FIGURE: side-by-side boxplots by GROUP (1 and 2), N = 20 in each group}
Interpretation:
(1) Different locations (group 1 > group 2)
(2) Different spreads (group 1 < group 2)
(3) Shape: fairly symmetrical (but both with outside values)
Summary statistics by group
If no major departures from Normality, report means and standard deviations (and sample sizes)
Group   n    mean     std dev
1       20   245.05   36.64
2       20   210.30   48.34
Take time to look at your results.
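One way to check these summary statistics is to recompute them from the raw data. The sketch below uses Python's standard `statistics` module; the helper name `summarize` is an illustrative choice.

```python
from statistics import mean, stdev

# Fasting cholesterol (mg/dl) from the illustrative example.
group1 = [233, 291, 312, 250, 246, 197, 268, 224, 239, 239,
          254, 276, 234, 181, 248, 252, 202, 218, 212, 325]
group2 = [344, 185, 263, 246, 224, 212, 188, 250, 148, 169,
          226, 175, 242, 252, 153, 183, 137, 202, 194, 213]

def summarize(values):
    """Return (n, sample mean, sample standard deviation)."""
    return len(values), mean(values), stdev(values)

summary = {1: summarize(group1), 2: summarize(group2)}

print("Group   n    mean     std dev")
for group, (n, xbar, s) in summary.items():
    print(f"{group:<7d} {n:<4d} {xbar:<8.2f} {s:.2f}")
```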
Notation for independent samples
Parameters (population)
Group     size   mean    std dev
Group 1   N1     µ1      σ1
Group 2   N2     µ2      σ2
Statistics (sample)
Group     size   mean    std dev
Group 1   n1     xbar1   s1
Group 2   n2     xbar2   s2
xbar1 − xbar2 is the sample mean difference
xbar1 − xbar2 estimates µ1 − µ2
Sampling distribution of mean difference
The sampling distribution of the mean difference is key to inference
$\bar{x}_1 - \bar{x}_2 \sim N(\mu_1 - \mu_2,\ SE_{\bar{x}_1 - \bar{x}_2})$
{FIGURE DRAWN ON BOARD}
The sampling distribution of the mean difference tends to be Normal with expectation μ1 − μ2
and standard deviation SE (SE is discussed on the next slide)
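A small simulation can make this sampling distribution concrete. The sketch below repeatedly draws two independent Normal samples and records the difference in sample means. The population values (`MU1`, `MU2`, `SIGMA`), the sample size, and the number of repetitions are hypothetical choices for illustration only, not values from the course.

```python
import random
from statistics import mean, stdev

random.seed(1)

# Hypothetical population values, chosen only for illustration.
MU1, MU2, SIGMA, N = 245.0, 210.0, 40.0, 20
REPS = 5000

def sample_mean(mu):
    """Mean of one simple random sample of size N from Normal(mu, SIGMA)."""
    return mean([random.gauss(mu, SIGMA) for _ in range(N)])

# Each repetition: two independent samples, one mean difference.
diffs = [sample_mean(MU1) - sample_mean(MU2) for _ in range(REPS)]

# With a common sigma, the theoretical standard error of the difference
# is sqrt(sigma^2/n1 + sigma^2/n2).
theoretical_se = (SIGMA**2 / N + SIGMA**2 / N) ** 0.5

print(f"mean of simulated differences: {mean(diffs):.2f} (expected {MU1 - MU2:.2f})")
print(f"SD of simulated differences:   {stdev(diffs):.2f} (theoretical SE {theoretical_se:.2f})")
```

The simulated differences should center near MU1 − MU2, with a spread close to the theoretical standard error.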
Pooled Standard Error
Illustrative data (summary statistics)
Group   ni   si      xbari
1       20   36.64   245.05
2       20   48.34   210.30
$df_1 = n_1 - 1 = 20 - 1 = 19$
$df_2 = n_2 - 1 = 20 - 1 = 19$
$df = df_1 + df_2 = 19 + 19 = 38$
$s^2_{pooled} = \frac{(df_1)(s_1^2) + (df_2)(s_2^2)}{df} = \frac{(19)(36.64^2) + (19)(48.34^2)}{38} = 1839.623$
$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{1839.623\left(\frac{1}{20} + \frac{1}{20}\right)} = 13.56$
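The pooled standard error calculation can be sketched in a few lines of Python, working from the summary statistics alone. The function name `pooled_se` is an illustrative choice.

```python
from math import sqrt

def pooled_se(n1, s1, n2, s2):
    """Return (df, pooled variance, pooled SE of the mean difference)."""
    df1, df2 = n1 - 1, n2 - 1
    df = df1 + df2
    # Weighted average of the two sample variances, weighted by df.
    s2_pooled = (df1 * s1**2 + df2 * s2**2) / df
    se = sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return df, s2_pooled, se

# Summary statistics from the illustrative cholesterol data.
df, s2_pooled, se = pooled_se(n1=20, s1=36.64, n2=20, s2=48.34)
print(f"df = {df}, pooled variance = {s2_pooled:.3f}, SE = {se:.2f}")
```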
Confidence interval for µ1 – µ2
(1 − α)100% confidence interval for µ1 – µ2
$(\bar{x}_1 - \bar{x}_2) \pm t_{df,\,1-\alpha/2} \cdot SE_{\bar{x}_1 - \bar{x}_2}$
Illustrative example (Cholesterol in type A and B men)
$(\bar{x}_1 - \bar{x}_2) \pm (t_{38,\,.975})(SE_{\bar{x}_1 - \bar{x}_2})$
$= (245.05 - 210.30) \pm (2.02)(13.56)$
$= 34.75 \pm 27.39$
$= (7.36,\ 62.14)$
Comparison of CI formulas
(point estimate) ± (t*)(SE)
Type of sample   point estimate   df for t*             SE
single           xbar             n − 1                 s / sqrt(n)
paired           xbar_d           n_d − 1               s_delta / sqrt(n)
independent      xbar1 − xbar2    (n1 − 1) + (n2 − 1)   sqrt(s^2_pooled (1/n1 + 1/n2))
Independent t test
A. H0: µ1 = µ2 vs. H1: µ1 > µ2 or H1: µ1 < µ2 or H1: µ1 ≠ µ2
B. Independent t statistic (pooled t statistic):
$t_{stat} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$ with $df = df_1 + df_2 = (n_1 - 1) + (n_2 - 1)$
C. P-value – use t table or software utility to convert tstat to P-value
D. Significance level
Illustrative example
$SE_{\bar{x}_1 - \bar{x}_2} = 13.56$
$\bar{x}_1 - \bar{x}_2 = 245.05 - 210.30 = 34.75$
$df = 19 + 19 = 38$
$t_{stat} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} = \frac{34.75}{13.56} = 2.56$
One-sided P between 0.005 and 0.01
Two-sided P between 0.01 and 0.02
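As a rough check on the table lookup, the P-value can also be approximated numerically. The sketch below integrates the Student t density with Simpson's rule using only Python's standard library. The helper names `t_pdf` and `upper_tail_p` are illustrative, and this stands in for the "software utility" mentioned above rather than reproducing any particular package.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 via Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the area lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

# Values from the illustrative example.
tstat = (245.05 - 210.30) / 13.56
p_one = upper_tail_p(abs(tstat), df=38)
p_two = 2 * p_one
print(f"tstat = {tstat:.2f}, one-sided P = {p_one:.4f}, two-sided P = {p_two:.4f}")
```

The computed values should fall inside the ranges read from the t table.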
SPSS output
These are the pooled (equal variance) statistics calculated in HS 167
Conditions necessary for t procedures
Validity assumptions
- good information (no information bias)
- good sample (“no selection bias”)
- good comparison (“no confounding” – no lurking variables)
Distributional assumptions
- Sampling independence
- Normality
- Equal variance
Sample size requirements for
confidence intervals
 1.96   
n

 d 
2
This will restrict the margin of error to no bigger than plus or minus d
Sample size requirement for CI
Suppose you have a variable with σ = 15 (the formulas below round 1.96² to 4)
For d = 5, use $n = \frac{4 \cdot 15^2}{5^2} = 36$
For d = 2.5, use $n = \frac{4 \cdot 15^2}{2.5^2} = 144$
For d = 1, use $n = \frac{4 \cdot 15^2}{1^2} = 900$
Sample size requirements increase when you need precision
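The sample size formula for a confidence interval can be sketched as a small Python function. The name `n_for_margin` and the `z` parameter are illustrative. Passing `z=2` reproduces the rounded arithmetic in the examples, while the default `z=1.96` gives slightly smaller values.

```python
from math import ceil

def n_for_margin(sigma, d, z=1.96):
    """Sample size so the margin of error is no bigger than +/- d."""
    return ceil((z * sigma / d) ** 2)

# Examples with sigma = 15, using z = 2 to match the rounded 4 * sigma^2 / d^2.
for d in (5, 2.5, 1):
    print(f"d = {d}: n = {n_for_margin(15, d, z=2)}")
```

Halving d quadruples the required n, which is why precision is expensive.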
Sample size for significance test
Goal: to conduct a significance test with adequate power to detect “a difference worth detecting”
The difference worth detecting is a difference large enough to be worth finding.
- In a study of an anti-hypertensive, for instance, a drop of 10 mm Hg might be worth detecting, while a drop of 1 mm Hg might not be worth detecting.
- In a study on weight loss, a drop of 5 pounds might be meaningful in a population of runway models, but may be meaningless in a morbidly obese population.
Determinants of sample size requirements
“Difference worth detecting” (Δ)
Standard deviation of data (σ)
Type I error rate (α)
- We consider only α = .05 two-sided
Power of test (we consider only 80% power)
Sample size requirements for test
Approx. sample size needed for 80% power at alpha = .05 (two-sided) to detect a difference of Δ:
$n = \frac{16\,\sigma^2}{\Delta^2} + 1$
Illustrative example: Suppose Δ = 25 and σ = 45 …
$n = \frac{16 \cdot 45^2}{25^2} + 1 = 52.8 \approx 53$
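The approximate sample size formula for the test can be sketched in Python as follows. The function name `n_for_test` is an illustrative choice, and the formula applies only under the stated conditions (80% power, alpha = .05 two-sided).

```python
from math import ceil

def n_for_test(sigma, delta):
    """Approximate n for 80% power at alpha = .05 (two-sided)."""
    return ceil(16 * sigma**2 / delta**2 + 1)

# Illustrative example: delta = 25 and sigma = 45.
print(n_for_test(sigma=45, delta=25))  # 53

# Smaller differences worth detecting require larger samples.
for delta in (10, 25, 50):
    print(f"delta = {delta}: n = {n_for_test(45, delta)}")
```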