Transcript 11-2 Day 1
AP STATISTICS
LESSON 11 – 2
(DAY 1)
Comparing Two Means
ESSENTIAL QUESTION:
When can procedures for
comparing two means be used
and what are those procedures?
Objectives:
• To determine if procedures for comparing two
means should be used.
• To construct significance tests for comparing two means.
• To construct confidence intervals to make
inferences when comparing two samples.
Comparing Two Means
Comparing two populations or two treatments is one of the
most common situations encountered in statistical practice.
We call such situations two-sample problems.
A two-sample problem can arise from a randomized
comparative experiment that randomly divides subjects into
two groups and exposes each group to a different
treatment.
Comparing random samples separately selected from two
populations is also a two-sample problem. Unlike the
matched pairs designs studied earlier, there is no
matching of the units in the two samples and the
two samples can be of different sizes.
Two-Sample Problems
• The goal of inference is to compare the
responses to two treatments or to
compare the characteristics of two
populations.
• We have a separate sample from each
treatment or each population.
Example 11.9
Page 648
Two-Sample Problems
1. A medical researcher is interested in the effect on blood pressure
of added calcium in our diet. She conducts a randomized
comparative experiment in which one group of subjects receives a
calcium supplement and a control group receives a placebo.
2. A psychologist develops a test that measures social insight. He
compares the social insight of male college students with that of
female college students by giving the test to a sample of students
of each gender.
3. A bank wants to know which of two incentive plans will most
increase the use of its credit cards. It offers each incentive to a
random sample of credit card customers and compares the
amount charged during the following six months.
Conditions for Comparing Two Means
• We have two SRSs from two distinct
populations. The samples are independent;
that is, one sample has no influence on the
other. Matching, for example, violates
independence. We measure the same variable
for both samples.
• Both populations are normally distributed.
The means and standard deviations of the
populations are unknown.
Organizing the Data
Call the variable we measure x1 in the first
population and x2 in the second. The
parameters in this situation are:
Population   Variable   Mean   Standard deviation
1            x1         μ1     σ1
2            x2         μ2     σ2
Organizing Data (part 2)
There are four unknown parameters, the
two means and the two standard
deviations.
Population   Sample size   Sample mean   Sample standard deviation
1            n1            x̄1            s1
2            n2            x̄2            s2
Example 11.10
Page 650
Calcium and Blood Pressure
Does increasing the amount of calcium
in our diet reduce blood pressure? A
randomized comparative experiment
was designed.
• Subjects: 21 healthy black men
• A randomly chosen group of 10 of the men
received a calcium supplement for 12 weeks.
• The control group of 11 men received a
placebo.
• The experiment was double-blind.
The Sampling Distribution of x̄1 – x̄2
• The mean of x̄1 – x̄2 is μ1 – μ2. That is,
the difference of sample means is an
unbiased estimator of the difference of
population means.
• The variance of the difference is the
sum of the variances of x̄1 and x̄2, which is
$$\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
Note that the variances add. The standard
deviations do not.
• If the two populations are both normal,
then the distribution of x̄1 – x̄2 is also
normal.
The Sampling Distribution of x̄1 – x̄2
(continued…)
The two-sample z statistic standardizes
x̄1 – x̄2:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Standard Deviation of Two-Sample Means
Whether an observed difference between
two samples is surprising depends on the
spread of the observations as well as on
the two means. The standard deviation of
x̄1 – x̄2 is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
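The point that variances add while standard deviations do not can be checked numerically. The following is a minimal sketch; the function names and the input values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def sd_of_difference(sigma1, n1, sigma2, n2):
    """Standard deviation of (sample mean 1 - sample mean 2) for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def two_sample_z(xbar1, xbar2, sigma1, n1, sigma2, n2, hypothesized_diff=0.0):
    """Two-sample z statistic; requires KNOWN population standard deviations."""
    sd = sd_of_difference(sigma1, n1, sigma2, n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / sd

# Hypothetical population values (not from the lesson).
sigma1, n1 = 3.0, 9    # variance of sample mean 1: 9/9 = 1
sigma2, n2 = 4.0, 16   # variance of sample mean 2: 16/16 = 1

sd_diff = sd_of_difference(sigma1, n1, sigma2, n2)              # sqrt(1 + 1)
sum_of_sds = sigma1 / math.sqrt(n1) + sigma2 / math.sqrt(n2)    # 1 + 1

print(f"sd of difference = {sd_diff:.4f}")
print(f"sum of the two sds = {sum_of_sds:.4f} (not the right quantity)")
```

Adding the two standard deviations directly overstates the spread of the difference, which is why the formula works with variances under the square root.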
Standard Error
• Because we don’t know the population
standard deviations, we estimate them
by the sample standard deviations from
our two samples.
$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
The two-sample t statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
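The standard error and the two-sample t statistic can be computed directly from summary statistics. This is a minimal sketch; the function names are hypothetical and the numbers are made-up illustrative values, not the calcium data.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Estimated standard deviation of the difference of sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic for a hypothesized value of mu1 - mu2."""
    se = standard_error(s1, n1, s2, n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / se

# Hypothetical summary statistics (illustrative only).
se = standard_error(s1=6.0, n1=9, s2=8.0, n2=16)   # sqrt(36/9 + 64/16) = sqrt(8)
t = two_sample_t(xbar1=20.0, s1=6.0, n1=9, xbar2=15.0, s2=8.0, n2=16)

print(f"SE = {se:.4f}, t = {t:.4f}")
```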
Two-Sample t Distributions
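• The statistic t has the same
interpretation as any z or t statistic: it
says how far x̄1 – x̄2 is from its mean in
standard deviation units.
• When we replace just one standard
deviation in a z statistic by a standard
error, we must replace the z distribution
with the t distribution. The two-sample t
statistic replaces both standard deviations,
so it has only approximately a t
distribution, and the degrees of freedom
must be chosen with care.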
Degrees of Freedom for
Two-Sample Problems
Two methods for calculating degrees of
freedom:
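Option 1: Use procedures based on the statistic t
with critical values from a t distribution whose
degrees of freedom are computed from the data
(used by calculator).
Option 2: Use procedures based on the statistic t
with critical values from the t distribution with
degrees of freedom equal to the smaller of n1 – 1
and n2 – 1.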
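The two options can be compared side by side. The slides do not state the formula behind Option 1; the sketch below assumes it is the Welch-Satterthwaite approximation, which is the usual choice in software. Function names and inputs are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Option 1 (assumed): Welch-Satterthwaite approximate degrees of freedom."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """Option 2: the smaller of n1 - 1 and n2 - 1."""
    return min(n1, n2) - 1

# Hypothetical summary statistics (illustrative only).
df_option1 = welch_df(s1=6.0, n1=9, s2=8.0, n2=16)
df_option2 = conservative_df(9, 16)

print(f"Option 1 df = {df_option1:.2f} (generally not a whole number)")
print(f"Option 2 df = {df_option2}")
```

The approximate value always falls between the conservative value and n1 + n2 – 2, so Option 2 gives wider intervals and larger P-values than Option 1.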
Confidence Interval for a
Two-Sample t
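A confidence interval for μ1 – μ2 is
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
To test the hypothesis H0: μ1 = μ2, compute the two-sample t statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$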
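The interval is simple to compute once t* has been read from a t table. This is a minimal sketch: the function name and summary statistics are hypothetical, and t* = 2.306 is the 95% table value for 8 degrees of freedom, which is the conservative choice for sample sizes 9 and 16.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2; t_star is supplied from a t table."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    margin = t_star * se
    return diff - margin, diff + margin

# Hypothetical summary statistics (illustrative only).
low, high = two_sample_t_interval(
    xbar1=20.0, s1=6.0, n1=9,
    xbar2=15.0, s2=8.0, n2=16,
    t_star=2.306,  # 95% confidence, df = min(9, 16) - 1 = 8
)
print(f"95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

With these made-up numbers the interval contains 0, so the data would not rule out equal population means at the 5% level.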
Example 11.11 Page 655
Calcium and Blood Pressure,
continued
The P-value. This example uses the conservative method, which leads to the t
distribution with 9 degrees of freedom (the smaller of 10 – 1 and 11 – 1).
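A one-sided P-value for a t distribution with 9 degrees of freedom can be approximated without a table by integrating the t density numerically. The sketch below uses Simpson's rule and only the standard library; the function names are hypothetical, and the transcript does not give the example's t value, so the check uses the standard table critical values for 9 degrees of freedom instead.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|]; steps must be even."""
    if t < 0:
        # By symmetry, P(T >= t) = 1 - P(T >= -t).
        return 1.0 - upper_tail_p(-t, df, steps)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

# Table critical values for df = 9: upper-tail areas of about 0.05 and 0.025.
print(f"P(T >= 1.833) = {upper_tail_p(1.833, 9):.4f}")
print(f"P(T >= 2.262) = {upper_tail_p(2.262, 9):.4f}")
```

For a two-sided alternative, the one-sided value would be doubled.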
Example 11.12
Page 656
Two-Sample t Confidence
Interval
• Sample size strongly influences the P-value of a test.
• An effect that fails to be significant at a
specified level α in a small sample may
be significant in a larger sample.
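The influence of sample size can be seen by holding the observed difference and the sample standard deviations fixed while the group sizes grow. That is an idealization, since real samples would vary, and all values below are hypothetical.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Two-sample t statistic for H0: mu1 = mu2."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / se

# Same hypothetical means and standard deviations, larger and larger groups.
sample_sizes = [10, 40, 160]
t_values = [two_sample_t(20.0, 8.0, n, 15.0, 8.0, n) for n in sample_sizes]

for n, t in zip(sample_sizes, t_values):
    print(f"n per group = {n:3d}  t = {t:.3f}")
```

Quadrupling both sample sizes halves the standard error and so doubles t, which pushes the P-value down even though the observed difference is unchanged.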
Robustness Again
• The two-sample t procedures are more robust
than the one-sample t methods, particularly when
the distributions are not symmetric.
• When the sizes of the two samples are equal and
the two populations being compared have
distributions with similar shapes, probability
values from the t table are quite accurate.
• When the two population distributions have
different shapes, larger samples are needed.