Transcript PPT Notes

Chapters 22, 24, 25
Inference for Two Samples
Confidence Intervals for
2 Proportions
Categorical Data
We use the 2 Proportion Z
Interval when we have two
independent samples of
categorical data and we want
an interval to estimate the true
difference in those proportions.
Comparing Two Proportions
Conditions:
*Randomness for both samples
*Independence within both samples
*n1p1, n1(1-p1), n2p2, n2(1-p2) are all ≥ 10
(ensures both samples are large enough to
be approximately normal)
*Independence between both samples
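
The large-counts condition above can be checked with a few lines of Python. This is a minimal sketch: the function name `large_counts_ok` and the counts are hypothetical, and it assumes the check is done with the observed successes and failures (n·p̂ and n·(1 - p̂)), since the true proportions are unknown.

```python
def large_counts_ok(x1, n1, x2, n2, threshold=10):
    """Return True when successes and failures in both samples are all >= threshold."""
    # x is n * p-hat (observed successes); n - x is n * (1 - p-hat) (observed failures)
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(c >= threshold for c in counts)


# Hypothetical counts: 45 of 100 successes in sample 1, 30 of 80 in sample 2
print(large_counts_ok(45, 100, 30, 80))
# Hypothetical counts with only 4 successes in sample 1
print(large_counts_ok(4, 50, 30, 80))
```

Randomness and independence cannot be checked by arithmetic; they come from how the data were collected.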
Pooling
Pooling means we are combining our
two samples into one overall estimate to reduce variability.
We can only pool under certain conditions.
The formula chart reminds you when you
are allowed to pool.
For Confidence Intervals:
(we do not assume that p1 = p2 therefore
we do not pool for CI's)
$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
*choose the non-pooled formula from the
chart; the calculator will make the correct
choice automatically
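
For anyone working without the calculator, the non-pooled interval can be sketched in Python using only the standard library. The function name `two_prop_z_interval` and the counts below are hypothetical, chosen only to illustrate the formula.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Non-pooled z interval for p1 - p2 from success counts x and sample sizes n."""
    p1, p2 = x1 / n1, x2 / n2
    # z* is the critical value that leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Non-pooled standard error: each sample keeps its own p-hat
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Hypothetical data: 45 of 100 successes vs. 30 of 80 successes
low, high = two_prop_z_interval(45, 100, 30, 80, confidence=0.90)
print(f"90% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

If the interval contains 0, the data are consistent with no difference between the two proportions.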
For Confidence Intervals:
Our confidence interval statements reflect the true difference in the two proportions (in context).
i.e. I am 90% confident that the true difference in ….
Comparing Two Means
When our data is quantitative, we are looking for an interval that contains either the true difference in the means (independent samples) or the true mean difference (paired samples).
Conditions:
*Randomness for both samples
*Independence within both samples.
*Sample size restriction must be met:
If n1 + n2 < 15, do not use if outliers or severe
skewness are present
If 15 ≤ n1 + n2 < 30, use except in presence
of outliers
If n1 + n2 ≥ 30, sample is large enough to use
regardless of outliers or skewness by CLT
*The two samples we are comparing must
be independent from each other.
Comparing Two Independent Means
Two-Sample t Interval: (Quantitative Data)
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
*choose the non-pooled formula from the formula sheet
*df = there is a nasty formula – ick!
But … the calculator gives you this.
When we interpret confidence intervals for independent samples we say “the true difference between two means” (not the mean difference).
i.e. We are 95% confident that the true difference in the mean ….
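
The non-pooled two-sample t interval, including the "nasty" degrees-of-freedom formula (the Welch-Satterthwaite approximation), can be sketched as follows. The function names and summary statistics are hypothetical, and this sketch assumes t* is looked up from a t table or calculator and passed in, since the Python standard library has no t distribution.

```python
from math import sqrt


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom for the non-pooled procedure."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Non-pooled t interval for mu1 - mu2; t_star is supplied by the user."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - t_star * se, diff + t_star * se


# Hypothetical summary statistics for two independent samples
df = welch_df(4.0, 16, 3.0, 9)
print(f"df = {df:.2f}")

# t* = 2.086 is the 95% table value for df = 20 (df rounded down to be conservative)
low, high = two_sample_t_interval(52.1, 4.0, 16, 48.3, 3.0, 9, t_star=2.086)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

The calculator uses the unrounded df, so its interval may be slightly narrower than one built from a table value.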
Paired Samples
Often we have samples of data that are drawn from populations that are not independent. We have to watch carefully for those! We cannot treat these samples as two independent samples; we must consider the fact that they are related.
So … what do we do?
Paired Samples
If we have two samples of data that are drawn from populations that are not independent, we use that data to create a list of differences. That list of differences then becomes our data and we will not use the two individual lists of data again.
Paired Samples
Once we have the list of differences, everything else is like the 1-sample procedures from the last unit.
Paired Samples
We will treat that list of differences as our data.
That list must meet the conditions for a single-sample t-distribution. We will use the t-interval or the t-test on this list of differences, depending on the question.
When we interpret our confidence interval or make our conclusion we will be talking about the “mean of the differences.”
The One-Sample t-Procedure:
A level C confidence interval for μ is:
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
where t* is the critical value from the t
distribution with n – 1 degrees of freedom.
Just remember that all the
variables represent the
“mean difference” for your
populations.
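
The paired procedure (build the list of differences, then run a one-sample t interval on it) can be sketched in Python. The function name `paired_t_interval` and the before/after scores are hypothetical; as an assumption of this sketch, t* comes from a t table with n - 1 degrees of freedom and is passed in.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_interval(first, second, t_star):
    """One-sample t interval for the mean of the paired differences (second - first)."""
    diffs = [b - a for a, b in zip(first, second)]  # the list of differences is the data
    n = len(diffs)
    d_bar = mean(diffs)   # sample mean of the differences
    s_d = stdev(diffs)    # sample standard deviation of the differences
    margin = t_star * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical paired scores for the same five subjects
before = [72, 65, 80, 70, 68]
after = [75, 70, 82, 71, 74]

# t* = 2.776 is the 95% table value for df = n - 1 = 4
low, high = paired_t_interval(before, after, t_star=2.776)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

Once `diffs` is built, the two original lists are never used again, which matches the procedure described above.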
Hypothesis Testing
with Two Samples
Categorical Data
Ho: p1 = p2
Ha: p1 ≠ p2 or p1 < p2 or p1 > p2
(Since our null states that p1 = p2, we
assume this to be true until proven
otherwise, therefore we automatically
pool here)
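$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where the pooled proportion is
$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$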
*choose the pooled formula from
chart, the calculator will
automatically pool
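
The pooled z test can be sketched in Python with the standard library. The function name `two_prop_z_test`, the `alternative` labels, and the counts are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z test of Ho: p1 = p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under Ho the proportions are equal, so both samples are pooled into one p-hat
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    cdf = NormalDist().cdf(z)
    if alternative == "less":        # Ha: p1 < p2
        p_value = cdf
    elif alternative == "greater":   # Ha: p1 > p2
        p_value = 1 - cdf
    else:                            # Ha: p1 != p2
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


# Hypothetical data: 45 of 100 successes vs. 30 of 80 successes
z, p_value = two_prop_z_test(45, 100, 30, 80)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

Compare the p-value to alpha in the usual way: reject Ho when the p-value is below alpha.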
We rely on the same p-values and alphas to make
our conclusions.
Quantitative Data
Ho: μ1 = μ2
Ha: μ1 ≠ μ2 or μ1 < μ2 or μ1 > μ2
Since there is no assumption that σ1 = σ2, we do not pool.
Two-Sample t-test
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
*choose the non-pooled formula from the
formula sheet
*df = there is a nasty formula – ick!
But … the calculator gives you this.
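
As a rough sketch of what the calculator does for the non-pooled test, the Python below computes the t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics. The function name `two_sample_t_stat` and the numbers are hypothetical; the p-value is left to a table or the calculator, because the standard library has no t distribution.

```python
from math import sqrt


def two_sample_t_stat(xbar1, s1, n1, xbar2, s2, n2):
    """Non-pooled two-sample t statistic for Ho: mu1 = mu2. Returns (t, df)."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # squared standard error from each sample
    t = (xbar1 - xbar2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (the "nasty formula")
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two independent samples
t, df = two_sample_t_stat(52.1, 4.0, 16, 48.3, 3.0, 9)
print(f"t = {t:.3f}, df = {df:.2f}")
```

If SciPy happens to be installed, `scipy.stats.ttest_ind_from_stats` with `equal_var=False` runs the same non-pooled test and also returns a p-value.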
Two-Sample t Procedures:
We do not pool t-tests.
Pooling assumes equal σ values. Since σ1 and σ2 are both unknown, why would we assume they are equal?
You must tell the calculator you do not want to pool. “Just say No”.
Paired Samples
Remember: when your two samples are not independent of each other, you must create a list of differences and use that set of data with single-sample procedures.
Single Sample Reminder
To test the hypothesis Ho: μ = μ0 against
Ha: μ > μ0
Ha: μ < μ0
Ha: μ ≠ μ0
Calculate the test statistic t and the p-value.
We make the same conclusions based on p-values and alpha.
Single Sample Reminder
The one-sample t statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
has the t distribution with n – 1 degrees of freedom.
Just remember that all the variables represent the “mean difference” for your populations.
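
Applied to paired data, the one-sample t statistic is computed on the list of differences. The sketch below uses a hypothetical function name `paired_t_stat` and hypothetical before/after scores; it assumes the null value for the mean difference is 0 unless another `mu0` is given.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_stat(first, second, mu0=0.0):
    """One-sample t statistic on the paired differences (second - first). Returns (t, df)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # x-bar and s in the formula are the mean and standard deviation of the differences
    t = (mean(diffs) - mu0) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical paired scores for the same five subjects
before = [72, 65, 80, 70, 68]
after = [75, 70, 82, 71, 74]
t, df = paired_t_stat(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The p-value then comes from the t distribution with `df` degrees of freedom, using a table or the calculator.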