Chapter 9. Comparing Two Population Means

9.1 Introduction
9.2 Analysis of Paired Samples
9.3 Analysis of Independent Samples
9.4 Summary
9.5 Supplementary Problems
NIPRL
9.1 Introduction
9.1.1 Two Sample Problems(1/7)
• A set of data observations $x_1, \ldots, x_n$ from a population A with cumulative distribution function $F_A(x)$, and a set of data observations $y_1, \ldots, y_m$ from another population B with cumulative distribution function $F_B(x)$.
• How can the means of the two populations, $\mu_A$ and $\mu_B$, be compared? (Fig. 9.1)
• What if the variances of the two populations are not the same? (Fig. 9.2)
9.1.1 Two Sample Problems (2/7)
Fig. 9.1 Comparison of the means of two probability distributions (populations A and B): is $\mu_A = \mu_B$?
Fig. 9.2 Comparison of the variances of two probability distributions (populations A and B): is $\sigma_A^2 = \sigma_B^2$?
9.1.1 Two Sample Problems (3/7)
• Example 49. Acrophobia Treatments
- In an experiment to investigate whether
the new treatment is effective or not,
a group of 30 patients suffering from
acrophobia are randomly assigned to
one of the two treatment methods.
- 15 patients undergo the standard treatment, say A, and 15 patients undergo
the proposed new treatment B.
Fig.9.3 Treating acrophobia.
9.1.1 Two Sample Problems (4/7)
- observations x1,… , x15 ~ A (standard treatment),
observations y1,… , y15 ~ B (new treatment).
- For this example, a comparison of the population means $\mu_A$ and $\mu_B$ provides an indication of whether the new treatment is any better or any worse than the standard treatment.
9.1.1 Two Sample Problems (5/7)
- It is good experimental practice to randomize
the allocation of subjects or experimental objects
between standard treatment and the new treatment,
as shown in Figure 9.4.
- Randomization helps to eliminate any bias
that may otherwise arise if certain kinds of subject
are “favored” and given a particular treatment.
• Related terms: placebo, blind experiment, double-blind experiment
Fig. 9.4 Randomization of experimental subjects between two treatments
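The random allocation described on this slide can be sketched in a few lines of Python. This is only an illustration: the patient IDs 1 to 30, the function name `randomize_allocation`, and the seed are assumptions introduced here, not part of the original example.

```python
import random

def randomize_allocation(subjects, seed=None):
    """Randomly split the subjects into two equal-sized treatment groups."""
    rng = random.Random(seed)          # a fixed seed makes the allocation reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)              # every ordering is equally likely
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])

patients = list(range(1, 31))          # hypothetical patient IDs 1..30
group_a, group_b = randomize_allocation(patients, seed=2024)
print("Standard treatment A:", group_a)
print("New treatment B:", group_b)
```

Because the split is driven by a shuffle rather than by the experimenter's judgment, no kind of subject is systematically favored for either treatment.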
9.1.1 Two Sample Problems (6/7)
• Example 44. Fabric Absorption Properties
- If the rollers rotate at 24 revolutions per minute, how does changing the
pressure from 10 pounds per square inch (type A) to 20 pounds per square
inch (type B) influence the water pickup of the fabric?
- data observations xi of the fabric water pickup with type A pressure
and observations yi with type B pressure.
- A comparison of the population means $\mu_A$ and $\mu_B$ shows how the average fabric water pickup is influenced by the change in pressure.
9.1.1 Two Sample Problems (7/7)
• Consider testing $H_0: \mu_A = \mu_B$.
• What if a confidence interval for $\mu_A - \mu_B$ contains zero? Then $\mu_A = \mu_B$ is a plausible statement at the corresponding level.
• Small p-values indicate that the null hypothesis is not a plausible statement, and that there is sufficient evidence that the two population means are different.
• How is the p-value found? In the same way as for one-sample problems.
9.1.2 Paired Samples Versus Independent Samples (1/2)
• Example 53. Heart Rate Reduction
- A new drug for inducing a temporary reduction in a patient’s heart rate
is to be compared with a standard drug.
- Since the drug efficacy is expected to depend heavily on the particular
patient involved, a paired experiment is run whereby each of 40 patients is
administered one drug on one day and the other drug on the following day.
- blocking: it is important to block out unwanted sources of variation that
otherwise might cloud the comparisons of real interest
9.1.2 Paired Samples Versus Independent Samples (2/2)
• Data from paired samples are of the form (x1, y1), (x2, y2), …, (xn, yn) which
arise from each of n experimental subjects being subjected to both “treatments”
• The comparison between the two treatments is then based upon the pairwise
differences zi = xi – yi , 1 ≤ i ≤ n
Fig. 9.9 Paired and independent samples
9.2 Analysis of Paired Samples
9.2.1 Methodology
• Data observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
• One-sample techniques can be applied to the data set $z_i = x_i - y_i$, $1 \le i \le n$, in order to make inferences about the unknown mean $\mu$ (the average difference).
• The observations can be modeled as
  $x_i = \mu_A + \gamma_i + \epsilon_i^A$,
  $y_i = \mu_B + \gamma_i + \epsilon_i^B$,
  where the observation obtained when treatment A is applied to the $i$th experimental subject can be thought of as a treatment A effect $\mu_A$, together with a subject $i$ effect $\gamma_i$ and some random error $\epsilon_i^A$ (and similarly for treatment B).
• The differences are then
  $z_i = \mu_A - \mu_B + \epsilon_i^{AB}$, where $\epsilon_i^{AB} = \epsilon_i^A - \epsilon_i^B$.
• Since this error term is an observation from a distribution with zero expectation, the $z_i$ may be regarded as observations from a distribution with mean $\mu = \mu_A - \mu_B$, which does not depend on the subject effect $\gamma_i$.
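A small numerical check can make the cancellation of the subject effect concrete. The sketch below is an illustration under assumed values: the treatment effects 10.0 and 12.0, the normal errors, and the function name `paired_differences` are all hypothetical. It builds the paired data twice with the same errors but very different subject effects and confirms that the differences $z_i$ agree.

```python
import math
import random

def paired_differences(mu_a, mu_b, subject_effects, errors_a, errors_b):
    """Build x_i and y_i from the additive model and return z_i = x_i - y_i."""
    x = [mu_a + g + ea for g, ea in zip(subject_effects, errors_a)]
    y = [mu_b + g + eb for g, eb in zip(subject_effects, errors_b)]
    return [xi - yi for xi, yi in zip(x, y)]

rng = random.Random(0)
n = 8
errors_a = [rng.gauss(0, 1) for _ in range(n)]       # random errors under treatment A
errors_b = [rng.gauss(0, 1) for _ in range(n)]       # random errors under treatment B
small_effects = [rng.gauss(0, 1) for _ in range(n)]  # mild subject-to-subject variation
large_effects = [rng.gauss(0, 50) for _ in range(n)] # strong subject-to-subject variation

z_small = paired_differences(10.0, 12.0, small_effects, errors_a, errors_b)
z_large = paired_differences(10.0, 12.0, large_effects, errors_a, errors_b)

# The subject effects cancel, so both sets of differences agree up to rounding.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(z_small, z_large))
print(same)
```

This is the sense in which pairing "blocks out" subject-to-subject variation: however large the $\gamma_i$ are, they never reach the differences.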
9.2.2 Examples(1/2)
• Example 53. Heart Rate Reduction
- An initial investigation of the data observations $z_i$ reveals that 30 of 40 are negative. This suggests that $\mu = \mu_A - \mu_B < 0$.
- The sample average is $\bar{z} = -2.655$ and the sample standard deviation is $s = 3.730$, so that with $\mu = 0$ the t-statistic is
  $t = \frac{\sqrt{n}\,(\bar{z} - \mu)}{s} = \frac{\sqrt{40} \times (-2.655)}{3.730} = -4.50$
- The two-sided hypotheses are $H_0: \mu = 0$ versus $H_A: \mu \ne 0$, with
  p-value $= 2 \times P(X \ge 4.50) \approx 0.0001$, where $X \sim t_{39}$.
9.2.2 Examples(2/2)
- Since it is not plausible that $\mu = 0$, we can conclude that there is evidence that the new drug has a different effect from the standard drug.
- With $t_{0.005,39} = 2.7079$, a 99% two-sided confidence interval for the difference between the average effects of the drugs is
  $\mu = \mu_A - \mu_B \in \left( \bar{z} - \frac{t_{0.005,39}\, s}{\sqrt{n}},\ \bar{z} + \frac{t_{0.005,39}\, s}{\sqrt{n}} \right)$
  $= \left( -2.655 - \frac{2.7079 \times 3.730}{\sqrt{40}},\ -2.655 + \frac{2.7079 \times 3.730}{\sqrt{40}} \right) = (-4.252, -1.058)$
- Consequently, the new drug provides a reduction in a patient's heart rate of somewhere between 1% and 4.25% more on average than the standard drug.
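The arithmetic of this example can be reproduced from the summary statistics alone. The following is a minimal sketch, not the method used to prepare the slides: the function names are hypothetical, and the critical point 2.7079 is taken from the slide rather than computed.

```python
import math

def paired_t_statistic(z_bar, s, n, mu0=0.0):
    """One-sample t-statistic applied to the paired differences z_i."""
    return math.sqrt(n) * (z_bar - mu0) / s

def paired_confidence_interval(z_bar, s, n, critical_point):
    """Two-sided interval z_bar -/+ critical_point * s / sqrt(n)."""
    margin = critical_point * s / math.sqrt(n)
    return z_bar - margin, z_bar + margin

# Summary statistics from the heart rate reduction example.
t_stat = paired_t_statistic(z_bar=-2.655, s=3.730, n=40)
lower, upper = paired_confidence_interval(-2.655, 3.730, 40, critical_point=2.7079)
print(round(t_stat, 2), round(lower, 3), round(upper, 3))
```

Keeping the critical point as an argument means the same two functions cover other confidence levels, provided the appropriate $t_{\alpha/2, n-1}$ value is supplied.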
9.3 Analysis of Independent Samples
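                      Population A               Population B
Sample                $x_1, x_2, \ldots, x_n$    $y_1, y_2, \ldots, y_m$
Size                  $n$                        $m$
Mean                  $\bar{x}$                  $\bar{y}$
Standard deviation    $s_x$                      $s_y$

• The point estimate of $\mu_A - \mu_B$ is $\bar{x} - \bar{y}$.
• $\mathrm{Var}(\bar{x}) = \sigma_A^2 / n$ and $\mathrm{Var}(\bar{y}) = \sigma_B^2 / m$.
• The standard error is $\mathrm{s.e.}(\bar{x} - \bar{y}) = \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}}$.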
Assumethatthe population variance  2 isunknown.
Thestandard errors for two-sample t - testsareasfollows:
i ) inthegeneral procesure: 
s.e.( x  y ) 
2
sx 2 s y

,
n
m
 ii ) inthepooled variance procesure:
s.e.( x  y )  s p
1 1

n m
When  A 2 and  B 2 are assumedtotake "known" values, weusea two-sample z -test.
NIPRL
16
9.3.1 General Procedure (Smith-Satterthwaite test)
• Inference about $\mu_A - \mu_B$ uses the point estimate $\bar{x} - \bar{y}$, whose standard error is estimated by
  $\mathrm{s.e.}(\bar{x} - \bar{y}) = \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$
• In this case, p-values and critical points are calculated from a t-distribution with degrees of freedom
  $\nu = \frac{\left( s_x^2/n + s_y^2/m \right)^2}{\frac{s_x^4}{n^2 (n-1)} + \frac{s_y^4}{m^2 (m-1)}}$,
  which in practice is rounded down to the nearest integer.
• The simpler choice $\nu = \min\{n, m\} - 1$ can be used, although it is a little less powerful.
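The degrees-of-freedom formula is easy to mistype by hand, so a short sketch may help. The function names below are hypothetical, and the inputs ($n = 24$, $s_x = 3.438$, $m = 34$, $s_y = 3.305$) are the summary values used in this section's worked example.

```python
import math

def welch_degrees_of_freedom(s_x, n, s_y, m):
    """Degrees of freedom for the general (unpooled) two-sample t procedure."""
    a = s_x**2 / n                     # estimated variance of x-bar
    b = s_y**2 / m                     # estimated variance of y-bar
    # a**2 / (n - 1) equals s_x^4 / (n^2 (n - 1)), matching the slide formula.
    return (a + b) ** 2 / (a**2 / (n - 1) + b**2 / (m - 1))

def conservative_degrees_of_freedom(n, m):
    """The simpler, slightly less powerful choice min{n, m} - 1."""
    return min(n, m) - 1

nu = welch_degrees_of_freedom(3.438, 24, 3.305, 34)
nu_integer = math.floor(nu)            # rounded down to the nearest integer
nu_simple = conservative_degrees_of_freedom(24, 34)
print(round(nu, 2), nu_integer, nu_simple)
```

The simpler choice gives far fewer degrees of freedom here, which means a larger critical point and a wider interval; that is the loss of power the slide refers to.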
• A two-sided $1 - \alpha$ level confidence interval for $\mu_A - \mu_B$ is
  $\mu_A - \mu_B \in \left( \bar{x} - \bar{y} - t_{\alpha/2,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}},\ \bar{x} - \bar{y} + t_{\alpha/2,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}} \right)$
• The standard format is
  $\mu \in \left( \hat{\mu} - \text{critical point} \times \mathrm{s.e.}(\hat{\mu}),\ \hat{\mu} + \text{critical point} \times \mathrm{s.e.}(\hat{\mu}) \right)$
• One-sided confidence intervals are
  $\mu_A - \mu_B \in \left( -\infty,\ \bar{x} - \bar{y} + t_{\alpha,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}} \right)$ and
  $\mu_A - \mu_B \in \left( \bar{x} - \bar{y} - t_{\alpha,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}},\ \infty \right)$
• For the two-sided hypothesis testing problem
  $H_0: \mu_A - \mu_B = \delta$ versus $H_A: \mu_A - \mu_B \ne \delta$, for a fixed value $\delta$ of interest (usually $\delta = 0$),
  the appropriate t-statistic is
  $t = \frac{\bar{x} - \bar{y} - \delta}{\sqrt{s_x^2/n + s_y^2/m}}$
• A two-sided p-value is calculated as $2 \times P(X \ge |t|)$, where the random variable $X$ has a t-distribution with $\nu$ degrees of freedom, and one-sided p-values are $P(X \ge t)$ and $P(X \le t)$.
• A size $\alpha$ two-sided hypothesis test accepts $H_0$ if $|t| \le t_{\alpha/2,\nu}$ and rejects $H_0$ when $|t| > t_{\alpha/2,\nu}$. For one-sided hypothesis tests, the rejection regions are $t \ge t_{\alpha,\nu}$ or $t \le -t_{\alpha,\nu}$.
• Suppose $n = 24$, $\bar{x} = 9.005$, $s_x = 3.438$ and $m = 34$, $\bar{y} = 11.864$, $s_y = 3.305$. The hypotheses $H_0: \mu_A = \mu_B$ versus $H_A: \mu_A \ne \mu_B$ are tested with
  $t = \frac{9.005 - 11.864}{\sqrt{3.438^2/24 + 3.305^2/34}} = -3.169$
• The two-sided p-value is therefore $2 \times P(X \ge 3.169)$, where $X$ has a t-distribution with degrees of freedom
  $\nu = \frac{\left( 3.438^2/24 + 3.305^2/34 \right)^2}{3.438^4/(24^2 \times 23) + 3.305^4/(34^2 \times 33)} = 48.43$
  Using the integer value $\nu = 48$ gives p-value $\approx 2 \times 0.00135 = 0.0027$ (Fig. 9.22).
• So there is very strong evidence that the null hypothesis is not a plausible statement, and the experimenter can conclude that $\mu_A \ne \mu_B$.
• A 99% two-sided confidence interval for the difference in population means is
  $\mu_A - \mu_B \in \left( 9.005 - 11.864 - t_{0.005,48} \sqrt{\frac{3.438^2}{24} + \frac{3.305^2}{34}},\ 9.005 - 11.864 + t_{0.005,48} \sqrt{\frac{3.438^2}{24} + \frac{3.305^2}{34}} \right) = (-5.28, -0.44)$
• The fact that zero is not contained within this confidence interval implies that the null hypothesis $H_0: \mu_A = \mu_B$ has a two-sided p-value smaller than 0.01, which is consistent with the previous analysis.
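The t-statistic and the 99% interval for this example can be checked with a short sketch. Note the assumption: the slides do not state the numerical value of $t_{0.005,48}$, so the value 2.682 used below is an approximate figure of the kind found in a standard t table, and the function names are hypothetical.

```python
import math

def unpooled_standard_error(s_x, n, s_y, m):
    """Estimated standard error of x-bar minus y-bar in the general procedure."""
    return math.sqrt(s_x**2 / n + s_y**2 / m)

def unpooled_t_statistic(x_bar, s_x, n, y_bar, s_y, m, delta=0.0):
    """t-statistic for H0: mu_A - mu_B = delta."""
    return (x_bar - y_bar - delta) / unpooled_standard_error(s_x, n, s_y, m)

def two_sided_interval(x_bar, y_bar, standard_error, critical_point):
    """Point estimate -/+ critical point times standard error."""
    estimate = x_bar - y_bar
    margin = critical_point * standard_error
    return estimate - margin, estimate + margin

T_CRIT = 2.682  # ASSUMPTION: approximate t_{0.005,48}, not given on the slides

se = unpooled_standard_error(3.438, 24, 3.305, 34)
t_stat = unpooled_t_statistic(9.005, 3.438, 24, 11.864, 3.305, 34)
lower, upper = two_sided_interval(9.005, 11.864, se, T_CRIT)
print(round(t_stat, 3), round(lower, 2), round(upper, 2))
```

Both steps reuse the same standard error, which is why an interval that excludes zero and a small two-sided p-value always go together.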
9.3.2 Pooled Variance Procedure
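• Assume $\sigma_A^2 = \sigma_B^2 = \sigma^2$. Then the estimate
  $\hat{\sigma}^2 = s_p^2 = \frac{(n-1) s_x^2 + (m-1) s_y^2}{n + m - 2}$
  is known as the pooled variance estimator.
• In this case, the standard error of $\bar{x} - \bar{y}$ is
  $\mathrm{s.e.}(\bar{x} - \bar{y}) = \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}} = \sigma \sqrt{\frac{1}{n} + \frac{1}{m}}$,
  which can be estimated by $s_p \sqrt{\frac{1}{n} + \frac{1}{m}}$.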
• When a pooled variance estimate is employed, p-values and critical points are calculated from a t-distribution with $n + m - 2$ degrees of freedom.
• So, a two-sided $1 - \alpha$ level confidence interval for $\mu_A - \mu_B$ is
  $\mu_A - \mu_B \in \left( \bar{x} - \bar{y} - t_{\alpha/2,\,n+m-2}\, s_p \sqrt{\frac{1}{n} + \frac{1}{m}},\ \bar{x} - \bar{y} + t_{\alpha/2,\,n+m-2}\, s_p \sqrt{\frac{1}{n} + \frac{1}{m}} \right)$,
  which again is constructed using the standard format
  $\mu \in \left( \hat{\mu} - \text{critical point} \times \mathrm{s.e.}(\hat{\mu}),\ \hat{\mu} + \text{critical point} \times \mathrm{s.e.}(\hat{\mu}) \right)$ with $\mu = \mu_A - \mu_B$.
• Consider again the data obtained with $n = 24$, $\bar{x} = 9.005$, $s_x = 3.438$ and $m = 34$, $\bar{y} = 11.864$, $s_y = 3.305$. The hypotheses $H_0: \mu_A = \mu_B$ versus $H_A: \mu_A \ne \mu_B$ are tested with
  $t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{1/n + 1/m}} = -3.192$, where $s_p = \sqrt{\frac{23 \times 3.438^2 + 33 \times 3.305^2}{24 + 34 - 2}} = 3.360$
• The two-sided p-value is $2 \times P(X \ge 3.192) \approx 2 \times 0.00115 = 0.0023$, where the random variable $X$ has a t-distribution with $n + m - 2 = 56$ degrees of freedom (Fig. 9.24).
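The pooled calculation for this data set can be sketched as follows. The function names are hypothetical. Carrying full precision gives a t-statistic that differs from the slide's $-3.192$ only in the third decimal place, because the slide rounds $s_p$ to 3.360 before dividing.

```python
import math

def pooled_standard_deviation(s_x, n, s_y, m):
    """Square root of the pooled variance estimator s_p^2."""
    pooled_variance = ((n - 1) * s_x**2 + (m - 1) * s_y**2) / (n + m - 2)
    return math.sqrt(pooled_variance)

def pooled_t_statistic(x_bar, s_x, n, y_bar, s_y, m, delta=0.0):
    """t-statistic for H0: mu_A - mu_B = delta under equal variances."""
    s_p = pooled_standard_deviation(s_x, n, s_y, m)
    return (x_bar - y_bar - delta) / (s_p * math.sqrt(1 / n + 1 / m))

s_p = pooled_standard_deviation(3.438, 24, 3.305, 34)
t_stat = pooled_t_statistic(9.005, 3.438, 24, 11.864, 3.305, 34)
degrees_of_freedom = 24 + 34 - 2       # n + m - 2 for the pooled procedure
print(round(s_p, 3), round(t_stat, 3), degrees_of_freedom)
```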
9.3.3. z-Procedure
• Two-sample z-tests are used when an experimenter wishes to use "known" values of the population standard deviations $\sigma_A$ and $\sigma_B$ in place of $s_x$ and $s_y$.
• In this case, p-values and critical points are calculated from the standard normal distribution.
• Consider a sample of size $n$ from population A with $\bar{x}$, $\sigma_A^2$ and a sample of size $m$ from population B with $\bar{y}$, $\sigma_B^2$.
- A two-sided $1 - \alpha$ level confidence interval for the difference in population means $\mu_A - \mu_B$ is
  $\left( \bar{x} - \bar{y} - z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}},\ \bar{x} - \bar{y} + z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}} \right)$
- One-sided confidence intervals use the critical point $z_{\alpha}$ rather than $z_{\alpha/2}$:
  $\left( -\infty,\ \bar{x} - \bar{y} + z_{\alpha} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}} \right)$ and $\left( \bar{x} - \bar{y} - z_{\alpha} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}},\ \infty \right)$
• The appropriate z-statistic for the null hypothesis $H_0: \mu_A - \mu_B = \delta$ is
  $z = \frac{\bar{x} - \bar{y} - \delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}$
• A two-sided p-value is calculated as $2 \times \Phi(-|z|)$, and one-sided p-values are $1 - \Phi(z)$ and $\Phi(z)$.
• A size $\alpha$ two-sided hypothesis test accepts the null hypothesis if $|z| \le z_{\alpha/2}$ and rejects the null hypothesis when $|z| > z_{\alpha/2}$. Size $\alpha$ one-sided hypothesis tests have rejection regions $z \ge z_{\alpha}$ or $z \le -z_{\alpha}$.
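Since the standard normal distribution is available in Python's standard library, the z-procedure can be sketched end to end. The slides give no numerical z-test example, so the inputs below are hypothetical: they reuse the earlier summary values and simply pretend that 3.438 and 3.305 are known population standard deviations.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()         # mean 0, standard deviation 1

def two_sample_z(x_bar, sigma_a, n, y_bar, sigma_b, m, delta=0.0):
    """Return the z-statistic with its two-sided and one-sided p-values."""
    z = (x_bar - y_bar - delta) / math.sqrt(sigma_a**2 / n + sigma_b**2 / m)
    phi = STANDARD_NORMAL.cdf
    p_two_sided = 2 * phi(-abs(z))     # for H_A: mu_A - mu_B != delta
    p_upper = 1 - phi(z)               # for H_A: mu_A - mu_B > delta
    p_lower = phi(z)                   # for H_A: mu_A - mu_B < delta
    return z, p_two_sided, p_upper, p_lower

# Hypothetical inputs: sample standard deviations treated as "known" sigmas.
z, p_two_sided, p_upper, p_lower = two_sample_z(9.005, 3.438, 24, 11.864, 3.305, 34)

# Critical point z_{alpha/2} for a size 0.05 two-sided test.
z_critical = STANDARD_NORMAL.inv_cdf(1 - 0.05 / 2)
reject_two_sided = abs(z) > z_critical
print(round(z, 3), p_two_sided, round(z_critical, 2), reject_two_sided)
```

For a one-sided test at the same size, the critical point would instead be `STANDARD_NORMAL.inv_cdf(1 - 0.05)`, which corresponds to $z_{\alpha}$ rather than $z_{\alpha/2}$.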
9.3.4. Examples(1/2)
• Example 49. Acrophobia Treatments
- One-sided hypotheses: $H_0: \mu_A \ge \mu_B$ versus $H_A: \mu_A < \mu_B$.
- Data (Fig. 9.25, acrophobia treatment data set, improvement scores): $n = m = 15$, $\bar{x} = 47.47$, $\bar{y} = 51.20$, $s_x = 11.40$, $s_y = 10.09$.
- Unpooled analysis (from Minitab):
  $\nu = \frac{\left( 11.40^2/15 + 10.09^2/15 \right)^2}{11.40^4/(15^2 \times 14) + 10.09^4/(15^2 \times 14)} = 27.59$ (DF),
  $t = \frac{47.47 - 51.20}{\sqrt{11.40^2/15 + 10.09^2/15}} = -0.949$, with p-value $P(X \le -0.949) = 0.175$.
9.3.4. Examples(2/2)
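- Analysis using pooled variance:
  $s_p^2 = \frac{14 \times 11.40^2 + 14 \times 10.09^2}{28} = 115.88$, so $s_p = \sqrt{115.88} \approx 10.8$,
  $t = \frac{47.47 - 51.20}{10.8 \sqrt{1/15 + 1/15}} = -0.946$, with $\nu = n + m - 2 = 28$.
- This is almost the same as in the unpooled case. The small difference in $t$ comes only from rounding $s_p$ to 10.8: when $n = m$, the pooled and unpooled standard errors coincide.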
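The agreement between the two analyses of the acrophobia data can be verified numerically. This is a sketch with hypothetical function names. It computes both t-statistics at full precision, which shows that they are equal when the sample sizes are equal and that only the degrees of freedom differ.

```python
import math

def unpooled_analysis(x_bar, s_x, n, y_bar, s_y, m):
    """t-statistic and degrees of freedom for the general procedure."""
    a, b = s_x**2 / n, s_y**2 / m
    t = (x_bar - y_bar) / math.sqrt(a + b)
    nu = (a + b) ** 2 / (a**2 / (n - 1) + b**2 / (m - 1))
    return t, nu

def pooled_analysis(x_bar, s_x, n, y_bar, s_y, m):
    """t-statistic, degrees of freedom, and pooled variance."""
    pooled_variance = ((n - 1) * s_x**2 + (m - 1) * s_y**2) / (n + m - 2)
    t = (x_bar - y_bar) / math.sqrt(pooled_variance * (1 / n + 1 / m))
    return t, n + m - 2, pooled_variance

# Summary statistics from the acrophobia treatments example.
data = dict(x_bar=47.47, s_x=11.40, n=15, y_bar=51.20, s_y=10.09, m=15)
t_unpooled, nu_unpooled = unpooled_analysis(**data)
t_pooled, nu_pooled, pooled_variance = pooled_analysis(**data)
print(round(t_unpooled, 3), round(nu_unpooled, 2))
print(round(t_pooled, 3), nu_pooled, round(pooled_variance, 2))
```

With unequal sample sizes the two t-statistics would generally differ, so the choice between the procedures matters more in that situation.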
9.3.5. Sample Size Calculations
• Interest: determination of appropriate sample sizes $n$ and $m$, or an assessment of the precision afforded by given sample sizes.
• For the general procedure, the confidence interval length is
  $L = 2\, t_{\alpha/2,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$
• If we know that the standard deviations of the two populations A and B are not larger than $\sigma_A$ and $\sigma_B$, respectively, we can then estimate that the sample sizes $n$ and $m$ are adequate as long as
  $L_0 \ge 2\, t_{\alpha/2,\nu} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}}$,
  where a suitable value of the critical point $t_{\alpha/2,\nu}$ can be used.
• If an experimenter wishes to have equal sample sizes $n = m$, then
  $n = m \ge \frac{4\, t_{\alpha/2,\nu}^2 \left( \sigma_A^2 + \sigma_B^2 \right)}{L_0^2}$
• Example 44. Fabric Absorption Properties
  Based upon samples with $n = m = 15$, a 99% confidence interval for the mean difference was calculated as $\mu_A - \mu_B \in (6.24, 16.26)$.
  This interval has a length of over 10%.
  How much additional sampling is required if the experimenter wants a 99% confidence interval with a length no larger than $L_0 = 4\%$?
- The initial samples have $s_x = 4.943$ and $s_y = 4.991$, and the (pooled) analysis of the initial samples uses a critical point $t_{0.005,28} = 2.763$.
- Using these values in the formula gives
  $n = m \ge \frac{4 \times 2.763^2 \times (4.943^2 + 4.991^2)}{4.0^2} = 94.2$
- To meet the specified goal, the experimenter can estimate that sample sizes of $n = m = 95$ will suffice, that is, 80 additional observations from each population.
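The sample size calculation can be scripted so that other target lengths are easy to try. The sketch below uses a hypothetical function name and takes the critical point 2.763 from the slide. Reusing that critical point for the larger sample is a simplification that the slide also makes, and it errs slightly on the safe side because the critical point shrinks as the degrees of freedom grow.

```python
import math

def equal_sample_size(critical_point, sigma_a, sigma_b, target_length):
    """Lower bound on n = m so that the interval length is at most target_length."""
    return 4 * critical_point**2 * (sigma_a**2 + sigma_b**2) / target_length**2

# Values from the fabric absorption example.
required = equal_sample_size(2.763, 4.943, 4.991, target_length=4.0)
n_needed = math.ceil(required)         # sample sizes must be whole numbers
additional = n_needed - 15             # the initial samples had n = m = 15
print(round(required, 1), n_needed, additional)
```

Because the required size scales with $1 / L_0^2$, halving the target length roughly quadruples the sample size needed from each population.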
Summary problems
(1) In a one-sample testing problem of means, the rejection region is in the
same direction as the alternative hypothesis.
(yes)
(2) The p-value of a test can be computed without regard to the
significance level.
(yes)
(3) The length of a t-interval is larger than that of a z-interval with the same
confidence level.
(no)
(4) If we know the p-value of a two-sided testing problem of the mean, we can always see whether the mean ($\mu_0$) is contained in a two-sided confidence interval.
(yes)
(5) Independent sample problems may be handled as paired sample
problems.
(no)