Two-sample t
Chapter 24
COMPARING TWO MEANS
Objective: To make inferences and test claims about the difference
between two means, under specific conditions
COMPARING TWO MEANS
• When comparing two means, our parameter of interest is the difference between
the two means, $\mu_1 - \mu_2$.
• The catch is that we must be confident that our two samples are completely
independent of one another.
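• Therefore, the standard deviation of the difference can be found by adding the
two variances (to account for the extra variability from two different samples)
and taking the square root:
$$SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$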
• We still don’t know the true standard deviations of the two groups, so we need to
estimate them from the samples and use the standard error:
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
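As a rough illustration of the standard error formula, here is a minimal Python sketch. The function name `standard_error` and the summary statistics passed to it are hypothetical values chosen only for illustration; they do not come from the slides.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of (x1bar - x2bar) for two independent samples."""
    # Variances, not standard deviations, add for independent groups,
    # so square each s, divide by its sample size, add, then take the root.
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics (illustration only, not from the slides).
se = standard_error(s1=8.0, n1=10, s2=6.0, n2=12)
print(round(se, 3))  # 3.066
```

Note that the result is larger than either group's own standard error (8/√10 ≈ 2.53 and 6/√12 ≈ 1.73), which reflects the extra variability from using two samples.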
COMPARING TWO MEANS (CONT.)
• Because we are working with means and estimating the standard
error of their difference using the data, we shouldn’t be surprised
that the sampling model is a Student’s t.
• The confidence interval we build is called a two-sample t-interval
(for the difference in means).
• The corresponding hypothesis test is called a two-sample t-test.
TWO-SAMPLE T-INTERVAL AND T-TEST CONDITIONS
• Independence Assumption (Each condition needs to be checked
for both groups.):
• Randomization Condition: Were the data collected with suitable
randomization (representative random samples or a randomized
experiment)?
• 10% Condition: We don’t usually check this condition for
differences of means. We will check it for means only if we have a
very small population or an extremely large sample.
TWO-SAMPLE T-INTERVAL AND T-TEST CONDITIONS (CONT.)
• Normal Population Assumption:
• Nearly Normal Condition: This must be checked for both groups.
A violation by either one violates the condition.
• Independent Groups Assumption: The two groups we are
comparing must be independent of each other. (See 8.3 if the
groups are not independent of one another…)
TWO-SAMPLE T-INTERVAL
• When the conditions are met, we are ready to find the confidence interval
for the difference between means of two independent groups, $\mu_1 - \mu_2$.
The confidence interval is
$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \times SE(\bar{x}_1 - \bar{x}_2)$$
where the standard error of the difference of the means is
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• The critical value $t^*_{df}$ depends on the particular confidence level, C, that
you specify and on the number of degrees of freedom, which we get from
the sample sizes and a special formula.
DEGREES OF FREEDOM
• The special formula for the degrees of freedom for our t critical
value is a bear:
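$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$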
• Because of this, we will let technology calculate degrees of freedom
for us!
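For readers curious what the technology is doing, the degrees-of-freedom formula can be sketched in a few lines of Python. The function name `welch_df` and the input values are hypothetical, used only for illustration.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t procedures."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical summary statistics (illustration only, not from the slides).
df = welch_df(s1=8.0, n1=10, s2=6.0, n2=12)
print(round(df, 2))
```

The result is usually not a whole number. It always falls between the smaller of n1 − 1 and n2 − 1 and the total n1 + n2 − 2.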
STEPS FOR TWO-SAMPLE T-INTERVAL
1. Check Conditions and show that you have checked these!
• Randomization Condition: Were the data collected with suitable
randomization (representative random samples or a randomized
experiment)?
• 10% Condition: Is each sample size less than 10% of the population
size?
• Nearly Normal Condition: This must be checked for both groups.
A violation by either one violates the condition.
• Independent Groups Assumption: The two groups we are comparing
must be independent of each other.
STEPS FOR TWO-SAMPLE T-INTERVAL (CONT.)
2. State the test you are about to conduct
Ex) Two-Sample t-Interval for Means
4. Calculate your t-interval
$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
5. State your conclusion IN CONTEXT.
Ex) I am 95% confident that the average pulse rate for smokers is between 3.1
and 8.9 bpm higher than the average pulse rate for nonsmokers.
TWO-SAMPLE T-INTERVAL
EXAMPLE
Does increasing the amount of calcium in our diet reduce blood pressure? Examination of a
large sample of people revealed a relationship between calcium intake and blood pressure. The
relationship was strongest for black men. Such observational studies do not establish
causation. Researchers therefore designed a randomized comparative experiment. The
subjects in part of the experiment were 21 healthy black men. A randomly chosen group of 10
of the men received a calcium supplement for 12 weeks. The control group of 11 men
received a placebo pill that looked identical. The experiment was double-blind. The response
variable is the decrease in systolic (heart contracted) blood pressure for a subject after 12
weeks, in mm of mercury. An increase appears as a negative response. Take Group 1 to be
the calcium group and Group 2 the placebo group. Here are the data for…
The 10 men in Group 1 (calcium):
7, -4, 18, 17, -3, -5, 1, 10, 11, -2
The 11 men in Group 2 (placebo):
-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3
TWO-SAMPLE T-INTERVAL
EXAMPLE (CONT.)
• Set up your hypotheses in words and symbols as if we were to conduct a test.
• Check conditions
• Create a 90% confidence interval. Can we reject H0?
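One way to check the interval by hand is sketched below in Python, using only the standard library. It rests on two assumptions: the data lists are transcribed from the example above by taking the first 10 listed values as Group 1 and the remaining 11 as Group 2, and the helpers `t_pdf`, `t_cdf`, and `t_critical` are hypothetical stand-ins for a calculator's t-table (numerical integration plus bisection), not a library API.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(conf_level, df):
    """Critical value t* with conf_level of the area between -t* and t*."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Decrease in systolic blood pressure (mm Hg), as assumed from the example.
calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

n1, n2 = len(calcium), len(placebo)
v1, v2 = variance(calcium) / n1, variance(placebo) / n2  # s^2 / n for each group
diff = mean(calcium) - mean(placebo)
se = math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
t_star = t_critical(0.90, df)
lower, upper = diff - t_star * se, diff + t_star * se

print(f"diff = {diff:.3f}, SE = {se:.3f}, df = {df:.2f}, t* = {t_star:.3f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval runs from roughly -0.5 to 11.0 mm Hg. Because it contains 0, it does not by itself rule out a difference of zero in a two-sided comparison.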
TWO-SAMPLE T-INTERVAL
CALCULATOR TIPS
Check your last confidence interval with the calculator:
• STAT TESTS
• 0: 2-SampTInt
• Specify if you are using data (if you have raw data to enter into L1 and
L2) or stats (if you have summarized data)
• Specify 1 for both frequencies
• Pooled? We will discuss this later (for now say NO).
• Calculate
TWO-SAMPLE T-TESTS
• The hypothesis test we use is the two-sample t-test for means.
• The conditions for the two-sample t-test for the difference between
the means of two independent groups are the same as for the two-sample
t-interval.
TWO-SAMPLE T-TESTS (CONT.)
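• We test the hypothesis H0: $\mu_1 - \mu_2 = \Delta_0$, where the hypothesized
difference, $\Delta_0$, is almost always 0, using the statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{SE(\bar{x}_1 - \bar{x}_2)}$$
Here $(\mu_1 - \mu_2)$ stands for the hypothesized difference $\Delta_0$.
• The standard error is
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• When the conditions are met and the null hypothesis is
true, this statistic can be closely modeled by a Student’s t-model
with a number of degrees of freedom given by a special formula.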
WRITING HYPOTHESES
There are different ways to write the null hypothesis. Most often we
assume the two means are equal:
• H0: $\mu_1 - \mu_2 = 0$ OR H0: $\mu_1 = \mu_2$
Thus, the corresponding alternative hypothesis may be one of the
following:
• HA: $\mu_1 - \mu_2 \neq 0$ OR HA: $\mu_1 \neq \mu_2$
• HA: $\mu_1 - \mu_2 > 0$ OR HA: $\mu_1 > \mu_2$
• HA: $\mu_1 - \mu_2 < 0$ OR HA: $\mu_1 < \mu_2$
STEPS FOR TWO-SAMPLE MEAN
HYPOTHESIS TESTING
1. Check Conditions and show that you have checked these!
• Randomization Condition: Were the data collected with suitable
randomization (representative random samples or a randomized
experiment)?
• 10% Condition: Is each sample size less than 10% of the population
size?
• Nearly Normal Condition: This must be checked for both groups.
A violation by either one violates the condition.
• Independent Groups Assumption: The two groups we are comparing
must be independent of each other.
STEPS FOR HYPOTHESIS TESTING
(CONT.)
2. State the test you are about to conduct
Ex) Two-Sample t-Test for Means
3. Set up your hypotheses
H0:
HA:
4. Calculate your test statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
5. Draw a picture of your desired area under the t-model, and
calculate your P-value.
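Step 4 can be sketched from summary statistics as follows. The function name `two_sample_t`, its parameter `delta0` (the hypothesized difference, 0 by default), and the numbers passed in are all hypothetical, chosen only for illustration.

```python
import math

def two_sample_t(x1bar, s1, n1, x2bar, s2, n2, delta0=0.0):
    """Two-sample t statistic for H0: mu1 - mu2 = delta0."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    return ((x1bar - x2bar) - delta0) / se

# Hypothetical summary statistics (illustration only, not from the slides).
t = two_sample_t(x1bar=75.0, s1=8.0, n1=10, x2bar=70.0, s2=6.0, n2=12)
print(round(t, 2))  # 1.63
```

The statistic measures how many standard errors the observed difference lies from the hypothesized difference. The P-value in step 5 is then the matching tail area under the t-model.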
STEPS FOR HYPOTHESIS TESTING
(CONT.)
6. Make your conclusion.
P-Value | Action            | Conclusion
Low     | Reject H0         | The sample means provide sufficient evidence to conclude HA in context.
High    | Fail to reject H0 | The sample means do not provide sufficient evidence to conclude HA in context.
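The decision rule can be written as a tiny Python function. The slides only say "low" and "high", so the cutoff `alpha = 0.05` used here is an assumption (a common convention), and the name `decide` is hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Return the action for a given P-value (alpha = 0.05 is an assumed cutoff)."""
    if p_value < alpha:
        return "Reject H0"
    return "Fail to reject H0"

print(decide(0.012))  # Reject H0
print(decide(0.31))   # Fail to reject H0
```

Either way, the conclusion should still be stated in the context of the problem, not just as "reject" or "fail to reject".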
TWO-SAMPLE MEAN HYPOTHESIS
TEST EXAMPLE
Using the previous data to compare the calcium supplement and the placebo,
calculate the test statistic and P-value to determine if there is enough evidence
that calcium reduces blood pressure.
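A possible way to check this example without a calculator is sketched below in standard-library Python. The sketch makes three assumptions: the data lists are transcribed from the earlier example (first 10 listed values as Group 1, remaining 11 as Group 2); the alternative is one-sided, HA: $\mu_1 - \mu_2 > 0$, since the response is a decrease in blood pressure; and `t_pdf` and `upper_tail` are hypothetical helpers that approximate the tail area numerically.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

# Decrease in systolic blood pressure (mm Hg), as assumed from the example.
calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

n1, n2 = len(calcium), len(placebo)
v1, v2 = variance(calcium) / n1, variance(placebo) / n2  # s^2 / n for each group
se = math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# H0: mu1 - mu2 = 0, so the hypothesized difference is 0.
t_stat = (mean(calcium) - mean(placebo) - 0) / se
p_value = upper_tail(t_stat, df)  # one-sided: HA is mu1 - mu2 > 0

print(f"t = {t_stat:.3f}, df = {df:.2f}, P-value = {p_value:.4f}")
```

Under these assumptions the test statistic is about 1.6 and the one-sided P-value is about 0.06. Whether that counts as "low" depends on the significance level chosen, which the example does not state.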
CALCULATOR TIPS
Given two sets of data:
• Enter the data into L1 and L2
• Set up STATPLOT to create a histogram of each list to check the nearly Normal condition
• STAT TESTS 4:2-SampTTest
• Choose Data, then specify your data lists (usually L1 and L2) and 1 for both frequencies
• Indicate the alternative hypothesis (≠, <, or >)
• Pooled? No
• Calculate
Given sample means and standard deviations:
• STAT TESTS 4:2-SampTTest
• Choose Stats
• Enter the mean, standard deviation, and sample size for each group
• Specify the tail (≠, <, or >)
• Pooled? No
• Calculate
ASSIGNMENTS
• Day 1: pp. 579 – 586 # 2 – 6 EVEN, 7, 9
• Day 2: pp. 579 – 586 # 10, 11, 19, 25, 36