
Tests of Hypotheses Involving Two Populations
12.1 Tests for the Differences of Means
Comparison of two means: $\mu_1$ and $\mu_2$
The method of comparison depends on the design of the
experiment
The samples will either be Independent or Dependent
Independent Samples – implies data values obtained from
one sample are unrelated to values from the other sample
Dependent Samples – implies subjects are paired so that
they are as much alike as possible
The purpose of pairing is to account for subject-to-subject
variability. (In some studies both treatments are applied to
the same subject.)
If subject-to-subject variability is large relative to the expected
treatment differences, then a dependent sample design
should be considered
12.3 Tests for Differences of Means of
Paired Samples
Observe n pairs of observations
$x_{ij}$ = observation for treatment i in pair j
$d_j = x_{1j} - x_{2j}$ = difference between treatments 1 and 2 in pair j
Pair                        Treatment 1    Treatment 2    Difference
1                           $x_{11}$       $x_{21}$       $d_1 = x_{11} - x_{21}$
2                           $x_{12}$       $x_{22}$       $d_2 = x_{12} - x_{22}$
3                           $x_{13}$       $x_{23}$       $d_3 = x_{13} - x_{23}$
...                         ...            ...            ...
n                           $x_{1n}$       $x_{2n}$       $d_n = x_{1n} - x_{2n}$
Sample Mean                 $\bar{x}_1$    $\bar{x}_2$    $\bar{d} = \bar{x}_1 - \bar{x}_2$
Sample Variance             $s_1^2$        $s_2^2$        $s_d^2$
Sample Standard Deviation   $s_1$          $s_2$          $s_d$
For dependent sample design all analysis is based on
differences: $d_1, d_2, d_3, \ldots, d_n$
The differences are a sample from a distribution with
mean $\mu_d = \mu_1 - \mu_2$ and an unknown variance $\sigma_d^2$
We can calculate Confidence Intervals and perform
Hypothesis Tests in the same way as with one sample
Point Estimate for $\mu_d = \mu_1 - \mu_2$ is $\bar{d} = \bar{x}_1 - \bar{x}_2$
The Test Statistic is $T = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
which has a t-distribution with n-1 degrees of freedom
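Confidence Interval for $\mu_d = \mu_1 - \mu_2$
$\bar{d} \pm E$, where $E = t_{\alpha/2} \cdot \dfrac{s_d}{\sqrt{n}}$
and $t_{\alpha/2}$ is the critical value of the t-distribution with n-1 degrees of freedom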
Example
A marketing expert wishes to prove that a new product
display will increase sales over a traditional display
Treatment 1: New Method
Treatment 2: Traditional Method
• 12 stores are available for the study
• There is considerable variability from store to store
• We will divide the 12 stores into six pairs such that
within each pair the stores are as alike as possible
• Measurement will be cases sold in a one-month period
Pair   New Method (Treatment 1)   Traditional Method (Treatment 2)   Difference
1      13                         11                                 2
2      31                         29                                 2
3      20                         21                                 -1
4      19                         17                                 2
5      42                         39                                 3
6      26                         22                                 4
Sample Mean: $\bar{d} = 2$
Sample Variance: $s_d^2 = 2.8$
Sample Standard Deviation: $s_d = 1.6733$

$n = 6$, $\bar{d} = 2$, $s_d = 1.6733$
Perform a Hypothesis test using $\alpha = .05$
Conclude with 95% confidence that the new method
produces larger sales.
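The slide states only the conclusion, so the arithmetic behind it can be sketched as follows. This is a minimal Python sketch using only the standard library. The hypotheses $H_0: \mu_d = 0$ versus $H_1: \mu_d > 0$ and the critical value 2.015 (the upper 5% point of a t-distribution with 5 degrees of freedom, read from a standard t table) are assumptions, since neither appears on the slide.

```python
import math
import statistics

# Cases sold in each store pair, taken from the example table.
new_method = [13, 31, 20, 19, 42, 26]    # Treatment 1
traditional = [11, 29, 21, 17, 39, 22]   # Treatment 2

# In a dependent-sample design all analysis is based on the differences.
diffs = [x1 - x2 for x1, x2 in zip(new_method, traditional)]
n = len(diffs)
d_bar = statistics.mean(diffs)    # sample mean of the differences
s_d = statistics.stdev(diffs)     # sample standard deviation (n - 1 divisor)

# Test statistic T = (d_bar - mu_d) / (s_d / sqrt(n)) with mu_d = 0 under the
# assumed null hypothesis.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

# Assumed one-sided critical value t_{0.05, 5} from a standard t table.
T_CRIT = 2.015
reject_h0 = t_stat > T_CRIT

print(f"d_bar = {d_bar}, s_d = {s_d:.4f}, T = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is a little above 2.9, which exceeds the critical value, in line with the conclusion that the new display produces larger sales.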
Perform and interpret a 95% confidence interval
We are 95% confident that the mean increase in sales is
between 0.244 cases and 3.756 cases using the new
product display.
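The interval above can be reproduced from the six differences. The sketch below is illustrative only; the two-sided critical value 2.571 (the upper 2.5% point of a t-distribution with 5 degrees of freedom, from a standard t table) is an assumption not shown on the slide.

```python
import math
import statistics

# Within-pair differences (new method minus traditional), from the example.
diffs = [2, 2, -1, 2, 3, 4]
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)     # n - 1 divisor

# Assumed two-sided critical value t_{0.025, 5} from a standard t table.
T_CRIT = 2.571

# Margin of error E = t * s_d / sqrt(n); the interval is d_bar +/- E.
margin = T_CRIT * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"95% CI for mu_d: ({lower:.3f}, {upper:.3f})")
```

Because the whole interval lies above zero, it agrees with the one-sided test: the data favour a positive mean difference.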
12.2 Tests of Differences of Means Using Small
Samples from Normal Populations When the
Population Variances Are Equal but Unknown
Notation for Two Independent Samples
                            Sample 1 (Treatment 1)                        Sample 2 (Treatment 2)
Sample Size                 $n_1$                                         $n_2$
Data                        $x_{11}, x_{12}, x_{13}, \ldots, x_{1n_1}$    $x_{21}, x_{22}, x_{23}, \ldots, x_{2n_2}$
Sample Mean                 $\bar{x}_1$                                   $\bar{x}_2$
Sample Variance             $s_1^2$                                       $s_2^2$
Sample Standard Deviation   $s_1$                                         $s_2$
The point estimate for $\mu_1 - \mu_2$ (Independent Samples)
is $\bar{x}_1 - \bar{x}_2$
This is the same as what we did with dependent samples
The sampling distribution for $\bar{x}_1 - \bar{x}_2$
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
$\sigma^2_{\bar{x}_1 - \bar{x}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$
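The variance of the difference follows from the independence of the two samples. As a short sketch of that step, assuming each sample mean has variance $\sigma_i^2 / n_i$:

```latex
\begin{aligned}
\operatorname{Var}(\bar{x}_1 - \bar{x}_2)
  &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
     && \text{(independent samples, so the covariance term is zero)} \\
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
\end{aligned}
```

The variances add even though the means are subtracted, which is why the standard error of a difference is larger than that of either mean alone.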
If we sample from populations with means of $\mu_1$ and $\mu_2$,
and standard deviations of $\sigma_1$ and $\sigma_2$ respectively, then
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
is approximately standard normal for large n.
The form of a confidence interval and a hypothesis test
for two independent samples depends on what we
assume
If we sample from populations 1 and 2, the samples are
independent, and $\sigma_1^2 = \sigma_2^2$, then
$T = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$
has a t-distribution with d.f. $= n_1 + n_2 - 2$
where $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
This means $s_p = \sqrt{s_p^2}$, and $s_p^2$ is the pooled estimate of $\sigma_1^2 = \sigma_2^2$
Thus for a Hypothesis Test the Test Statistic is
$T = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$
and a Confidence Interval is calculated using
$E = t_{\alpha/2} \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$
$(\bar{x}_1 - \bar{x}_2) \pm E$
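Because the same pooled variance appears in both terms under the square root, the margin of error can be factored into a form that is often easier to compute by hand. A brief sketch of that algebra, using the same symbols as above:

```latex
E = t_{\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}
  = t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```

When $n_1 = n_2 = n$ this reduces further to $t_{\alpha/2} \, s_p \sqrt{2/n}$.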
Example
A study is designed to compare gas mileage with a fuel
additive to gas mileage without the additive.
A group of 10 Ford Mustangs are randomly divided into
two groups and the gas mileage is recorded for one tank
of gas.
                  Treatment 1 (with additive)     Treatment 2 (without additive)
Sample Size       $n_1 = 5$                       $n_2 = 5$
Data              26.3, 27.4, 25.1, 26.8, 27.1    24.5, 25.4, 23.7, 25.9, 25.7
Sample Mean       $\bar{x}_1 = 26.54$             $\bar{x}_2 = 25.04$
Sample Variance   $s_1^2 = 0.813$                 $s_2^2 = 0.848$
Do we have enough evidence at $\alpha = .05$ to prove that the
additive increases gas mileage? Assume $\sigma_1^2 = \sigma_2^2$
Conclude with 95% confidence that the additive
improves gas mileage
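The pooled test behind this conclusion can be worked through from the raw data. The following is a minimal Python sketch using only the standard library. The hypotheses $H_0: \mu_1 - \mu_2 = 0$ versus $H_1: \mu_1 - \mu_2 > 0$ and the critical value 1.860 (the upper 5% point of a t-distribution with 8 degrees of freedom, from a standard t table) are assumptions, since the slide does not state them.

```python
import math
import statistics

# Gas mileage for one tank, from the example.
with_additive = [26.3, 27.4, 25.1, 26.8, 27.1]      # Treatment 1
without_additive = [24.5, 25.4, 23.7, 25.9, 25.7]   # Treatment 2

n1, n2 = len(with_additive), len(without_additive)
x1_bar = statistics.mean(with_additive)
x2_bar = statistics.mean(without_additive)
s1_sq = statistics.variance(with_additive)      # sample variance, n - 1 divisor
s2_sq = statistics.variance(without_additive)

# Pooled estimate of the common variance (equal variances assumed).
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Test statistic with mu_1 - mu_2 = 0 under the assumed null hypothesis.
std_err = math.sqrt(sp_sq / n1 + sp_sq / n2)
t_stat = ((x1_bar - x2_bar) - 0) / std_err

# Assumed one-sided critical value t_{0.05, 8} from a standard t table.
T_CRIT = 1.860
reject_h0 = t_stat > T_CRIT

print(f"sp^2 = {sp_sq:.4f}, T = {t_stat:.3f}, d.f. = {n1 + n2 - 2}, reject H0: {reject_h0}")
```

With these inputs the pooled variance is about 0.83 and the statistic is about 2.6, which exceeds the assumed critical value.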
Compute and interpret a 95% Confidence Interval
for $\mu_1 - \mu_2$
We are 95% confident that the additive will increase gas
mileage by an amount between 0.177 and 2.823 miles.
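As a check on the interval, the sketch below recomputes it from the summary statistics in the example. The two-sided critical value 2.306 (the upper 2.5% point of a t-distribution with 8 degrees of freedom, from a standard t table) is an assumption. Computed this way the margin comes out near 1.329, giving roughly 0.171 to 2.829; hand calculations that round intermediate values can differ from this in the third decimal place.

```python
import math

# Summary statistics from the example table.
n1, n2 = 5, 5
x1_bar, x2_bar = 26.54, 25.04
s1_sq, s2_sq = 0.813, 0.848      # sample variances

# Pooled estimate of the common variance (equal variances assumed).
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Assumed two-sided critical value t_{0.025, 8} from a standard t table.
T_CRIT = 2.306

# Margin of error E = t * sqrt(sp^2/n1 + sp^2/n2); interval is the difference +/- E.
margin = T_CRIT * math.sqrt(sp_sq / n1 + sp_sq / n2)
diff = x1_bar - x2_bar
lower, upper = diff - margin, diff + margin

print(f"95% CI for mu_1 - mu_2: ({lower:.3f}, {upper:.3f})")
```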
What if we do not assume $\sigma_1^2 = \sigma_2^2$?
It then makes no sense to compute $s_p^2$
A reasonable variable is
$T = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Unfortunately the distribution of T is unknown.
For large $n_1$ and $n_2$, T is approximately standard normal.
For small $n_1$ and $n_2$, the distribution of T can be
approximated by a t-distribution using a complicated
formula for the degrees of freedom.
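One commonly used choice for that degrees-of-freedom formula is the Welch-Satterthwaite approximation, $\nu \approx \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$. The slide does not say which formula it has in mind, so using this one is an assumption. The sketch below applies it to the summary statistics of the fuel-additive example, purely as an illustration.

```python
import math

# Summary statistics from the fuel-additive example.
n1, n2 = 5, 5
x1_bar, x2_bar = 26.54, 25.04
s1_sq, s2_sq = 0.813, 0.848      # sample variances

# Per-sample variance contributions s_i^2 / n_i.
v1, v2 = s1_sq / n1, s2_sq / n2

# Unpooled test statistic with mu_1 - mu_2 = 0 (assumed null value).
t_stat = ((x1_bar - x2_bar) - 0) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximate degrees of freedom (assumed formula).
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"T = {t_stat:.3f}, approximate d.f. = {df:.2f}")
```

Here the two sample variances are nearly equal and the sample sizes match, so the approximate degrees of freedom land just under the pooled value of 8 and the statistic is the same as in the pooled test.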