2-sample tests

The Two Sample t
Review significance testing
Review t distribution
Introduce 2 Sample t test / SPSS
Significance Testing
• State a Null Hypothesis
• Calculate the odds of obtaining your sample
finding if the null hypothesis is correct
– Compare this to the odds that you set ahead of
time (e.g., alpha)
– If odds are less than alpha, reject the null in favor
of the research hypothesis
• The sample finding would be so rare if the null is true
that it makes more sense to reject the null hypothesis
Significance the old fashioned way
• Find the “critical value” of the test statistic for
your sample outcome
– Z tests always have the same critical values for given
alpha values (e.g., alpha = .05 → +/- 1.96)
• Use if N > 100
– t values change with sample size
• Use if N < 100
• As N reaches 100, t and z values become almost identical
• Compare the critical value with the obtained
value → Are the odds of this sample outcome
less than 5% (or 1% if alpha = .01)?
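The fixed z critical values mentioned above can be reproduced with Python's standard library. This is a minimal sketch; the helper name `z_critical` is an illustrative assumption and not part of the course materials. It does not cover t critical values, which also depend on degrees of freedom.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Positive z critical value for a given alpha (hypothetical helper)."""
    # A two-tailed test splits alpha between the two tails of the curve.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    two = round(z_critical(alpha), 3)
    one = round(z_critical(alpha, two_tailed=False), 3)
    print(f"alpha={alpha}: two-tailed +/-{two}, one-tailed {one}")
```

For alpha = .05 the two-tailed value rounds to the familiar 1.96.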
Critical Values/Region for the z test
(alpha = .05)
Directionality
• Research hypothesis must be directional
– Predict how the IV will relate to the DV
• Males are more likely than females to…
• Southern states should have lower scores…
“2-Sample” t test
– Apply when…
• You have a hypothesis that the means (or proportions) of a
variable differ between 2 populations
– Components
– 2 representative samples – Don’t get confused here (usually both
come from same “sample”)
– One interval/ratio dependent variable
– Examples
» Do males and females differ in their aggression (# aggressive acts
in past week)?
» Is there a difference between MN & WI in the proportion who
eat cheese every day?
– Null Hypothesis (Ho)
• The 2 pops. are not different in terms of the dependent variable
2-SAMPLE HYPOTHESIS TESTING
• Assumptions:
– Random (probability) sampling
– Groups are independent
– Homogeneity of variance
» the amount of variability in the D.V. is about equal in each
of the 2 groups
– The sampling distribution of the difference between means is
normal in shape
2-SAMPLE HYPOTHESIS TESTING
• We rarely know population S.D.s
– Therefore, for 2-sample t-testing, we must use 2 sample S.D.s,
corrected for bias:
» “Pooled Estimate”
• Focus on the t statistic:
$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
• We're finding the difference between the two means and standardizing
this difference with the pooled estimate of the standard error
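For reference, one common textbook form of the pooled estimate of the standard error is written out below. Which variant the course uses is an assumption: some introductory texts weight each sample variance by N rather than N - 1, which gives nearly the same value when the samples are large. Here s_1 and s_2 are the sample standard deviations, and N_1 and N_2 are the group sizes.

```latex
s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2},
\qquad
\sigma_{\bar{X}_1 - \bar{X}_2} = s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}
```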
2-SAMPLE HYPOTHESIS TESTING
• t-test for the difference
between 2 sample
means:
• Does our observed
difference between the
sample means reflect a
real difference in the
population means or is it
due to sampling error?
2-Sample Sampling Distribution
– difference between sample means (closer sample means will have
differences closer to 0)
– [Figure: sampling distribution centered on 0, with cutoffs marked at
-t critical and +t critical]
ASSUMING THE NULL IS TRUE!
Applying the 2-Sample t Formula
– Example:
• Research Hypothesis (H1):
– Soc. majors at UMD drink more beers per month than nonsoc. majors
– Random sample of 205 students:
» Soc majors: N = 100, mean=16, s=2.0
» Non soc. majors: N = 105, mean=15, s=2.5
» Alpha = .01
• Degrees of Freedom = N - 2 (N is the total across both groups: N1 + N2 - 2 = 203)
» What is the null? Can it be rejected?
» FORMULA:
$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\text{pooled estimate of standard error}}$$
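The example can be worked through numerically as a rough check. This is a sketch under stated assumptions: the helper `pooled_t` is hypothetical, it uses the (N - 1)-weighted pooled variance (the course formula may differ slightly), and because df is well above 100 it uses the one-tailed z critical value as a stand-in for t critical.

```python
import math
from statistics import NormalDist

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled-variance t statistic and df from summary statistics (illustrative)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Soc. majors vs. non-soc. majors, values taken from the example
t_obtained, df = pooled_t(16, 2.0, 100, 15, 2.5, 105)

# H1 is directional (soc. majors drink more), so all of alpha = .01 sits in one tail.
critical = NormalDist().inv_cdf(1 - 0.01)

print(round(t_obtained, 2), df, round(critical, 2))
print("reject H0" if t_obtained > critical else "do not reject H0")
```

Under these assumptions t(obtained) is about 3.15 with df = 203, which is beyond the critical value of roughly 2.33. The null (no difference in beers per month between the two groups) would therefore be rejected.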
Example 2
• Dr. Phil believes that inmates with tattoos will get
in more fights than inmates without tattoos.
• Tattooed inmates → N = 25, s = 1.06, mean = 1.00
• Non-Tattooed inmates → N = 37, s = .5599, mean = 0.5278
– Null hypothesis?
– Directional or non?
– t critical?
– Difference between means?
– Significant at the .01 level?
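A hedged sketch of the arithmetic follows. It assumes the (N - 1)-weighted pooled variance and a one-tailed test, since the hypothesis predicts more fights for tattooed inmates. The critical value 2.390 is the standard t-table entry for df = 60 at one-tailed alpha = .01.

```latex
\begin{aligned}
df &= 25 + 37 - 2 = 60 \\
s_p^2 &= \frac{24\,(1.06)^2 + 36\,(0.5599)^2}{60} \approx 0.638 \\
\sigma_{\bar{X}_1 - \bar{X}_2} &= \sqrt{0.638\left(\frac{1}{25} + \frac{1}{37}\right)} \approx 0.207 \\
t(\text{obtained}) &= \frac{1.00 - 0.5278}{0.207} \approx 2.28 < 2.390
\end{aligned}
```

Under these assumptions the difference between means (0.4722) would not be significant at the .01 level. Weighting the variances by N instead of N - 1 gives a t of about 2.24, so the conclusion is the same.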
2-Sample Hypothesis Testing in SPSS
• Independent Samples t Test Output:
– Testing the Ho that there is no difference in number of
adult arrests between a sample of individuals who were
abused/neglected as children and a matched control
group.
Group Statistics: ADULT_S NUMBER OF ADULT OFFENSES, by SUB_CNLX SUBJECT / CONTROL

Group         N     Mean   Std. Deviation   Std. Error Mean
1 Subjects    397   9.24   13.821           .694
2 Controls    192   4.43   7.002            .505
Interpreting SPSS Output
• Difference in mean # of
adult arrests between those
who were abused as
children & control group
Independent Samples Test: ADULT_S NUMBER OF ADULT OFFENSES

Levene's Test for Equality of Variances: F = 36.864, Sig. = .000

t-test for Equality of Means (95% CI = 95% Confidence Interval of the Difference)

                              t       df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       4.547   587       .000              4.81              1.058                   2.732          6.887
Equal variances not assumed   5.604   585.783   .000              4.81              .858                    3.124          6.495
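Both rows of the SPSS output can be approximately reproduced from the Group Statistics alone. The sketch below is illustrative and is not SPSS's own code. It assumes the usual pooled-variance formula for the "equal variances assumed" row and the Welch standard error with Welch-Satterthwaite degrees of freedom for the "not assumed" row. Small discrepancies are expected because the inputs are rounded.

```python
import math

# Summary statistics from the Group Statistics table
n1, mean1, sd1 = 397, 9.24, 13.821   # 1 Subjects
n2, mean2, sd2 = 192, 4.43, 7.002    # 2 Controls
diff = mean1 - mean2

# "Equal variances assumed": pooled variance, df = N1 + N2 - 2
df_pooled = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df_pooled
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

# "Equal variances not assumed": each group keeps its own variance
v1, v2 = sd1**2 / n1, sd2**2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print("assumed:    ", round(se_pooled, 3), round(t_pooled, 2), df_pooled)
print("not assumed:", round(se_welch, 3), round(t_welch, 2), round(df_welch, 1))
```

The mean difference is the same in both rows. Only the standard error, and therefore t and df, changes with the variance assumption.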
Interpreting SPSS Output
• t statistic, with degrees of freedom
Interpreting SPSS Output
• “Sig. (2 tailed)”
– gives the actual probability of obtaining a finding at least this
extreme if the null is correct
• a.k.a. the “p value” – p = probability
• The odds are NOT ZERO (if you get .000, interpret as <.001)
“Sig.” & Probability
• Number under “Sig.” column is the exact probability
of obtaining a t-value at least that extreme (finding a mean
difference at least that large) if the null is true
– When probability > alpha, we do NOT reject H0
– When probability < alpha, we DO reject H0
• As the test statistics (here, “t”) increase in absolute value, they
indicate larger differences between our obtained
finding and what is expected under the null
– Therefore, as the test statistic increases, the probability
associated with it decreases
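The inverse relationship between the size of the test statistic and its probability, together with the reject / do-not-reject rule, can be sketched as follows. The function names are hypothetical. The p-value uses the normal curve as an approximation to the t distribution, which is reasonable only when df is large.

```python
from statistics import NormalDist

def two_tailed_p(t_value):
    """Approximate two-tailed p-value via the normal curve (large df only)."""
    return 2 * (1 - NormalDist().cdf(abs(t_value)))

def decision(p_value, alpha):
    """Apply the rule: reject H0 only when the probability is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

# Larger test statistics correspond to smaller probabilities.
for t in (1.0, 1.96, 2.58, 4.547):
    p = two_tailed_p(t)
    print(t, round(p, 4), "|", decision(p, 0.05), "|", decision(p, 0.01))
```

A p-value that prints as 0.0 after rounding is still not zero. It is just smaller than the displayed precision, which is why SPSS's ".000" is read as "< .001".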
Example 2: Education & Age
at which First Child is Born
H0: There is no relationship between whether an individual has a
college degree and his or her age when their first child is born.
Group Statistics: AGEKDBRN R'S AGE WHEN 1ST CHILD BORN, by COLDGREE R has 4-year college degree

Group                                          N     Mean    Std. Deviation   Std. Error Mean
1.00 No -- less than a Bachelor's degree       812   22.74   4.826            .169
2.00 Yes -- a Bachelor's or Graduate degree    222   26.82   5.343            .359
Education & Age at which First Child is Born
1. What is the mean difference in age?
2. What is the probability that this t statistic is due to
sampling error?
3. Do we reject H0 at the alpha = .05 level?
4. Do we reject H0 at the alpha = .01 level?
Independent Samples Test: AGEKDBRN R'S AGE WHEN 1ST CHILD BORN

Levene's Test for Equality of Variances: F = 4.547, Sig. = .033

t-test for Equality of Means (95% CI = 95% Confidence Interval of the Difference)

                              t         df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       -10.926   1032      .000              -4.09             .374                    -4.824         -3.355
Equal variances not assumed   -10.310   326.163   .000              -4.09             .397                    -4.869         -3.309
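The confidence interval columns follow from the mean difference and its standard error. The sketch below is a rough check, not SPSS's exact method. It assumes the z value for 95% confidence is an adequate stand-in for t critical at df = 1032, and it starts from the rounded values in the "equal variances assumed" row.

```python
from statistics import NormalDist

# Rounded values from the "Equal variances assumed" row
mean_diff, std_error = -4.09, 0.374

# With df = 1032, the z value (about 1.96) approximates t critical.
z = NormalDist().inv_cdf(0.975)
lower = mean_diff - z * std_error
upper = mean_diff + z * std_error
t_obtained = mean_diff / std_error

print(round(lower, 2), round(upper, 2), round(t_obtained, 2))
```

The interval does not contain 0, which is consistent with rejecting H0 at the alpha = .05 level.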
SPSS In-Class
• Conduct an independent sample t-test
– Need one I/R variable
• This is the variable used to calculate means
– Need one nominal, 2-category (dummy) variable
• This dictates the “groups” used to create the two
different means