Significance Tests for Small Samples

Download Report

Transcript Significance Tests for Small Samples

Sociology 601 Class 7: September 22, 2009
• 6.4: Type I and type II errors
• 6.5: Small-sample inference for a mean
• 6.6: Small-sample inference for a proportion
• 6.7: Evaluating p of a type II error.
1
6.5: Why the problem with small samples?
– Within a distribution of samples, the estimated variance and
standard deviation will vary, even for samples with the same
sample mean.
– s2 will sometimes be larger than 2 and sometimes smaller.
– when s is smaller than , a moderate difference between
Ybar and μ0 might be statistically significant.
– when s is larger than , a large difference between Ybar and
μ0 might not be statistically significant.
3
What causes this problem?
• The problem is that an imprecise estimator of
sigma can distort p-values.
• This problem arises even though the population
has a normal distribution, and even though the
(imprecise) estimator is unbiased.
4
Correcting the problem: the t-test.
• SOLUTION: calculate test statistics as before, but
recalculate the table we use to find p-values.
• the t-score for small samples is calculated in the same way
as the z-score for large samples.
Y  0
Y  0
t

s.e.Y
s
n
• look up the test statistic in Table B, page 669
• degrees of freedom = n-1
• conduct hypothesis tests or estimate confidence intervals as
with a larger sample.
5
Properties of the t-distribution:
• the t-distribution is bell-shaped and symmetric about 0.
• Compared to a z-distribution, the t-distribution has extra
area in the extreme tails.
• as n-1 increases, the t-distribution becomes
indistinguishable from the normal distribution.
6
Student’s t-distribution
t-distribution (df=1) and normal distribution:
7
Student’s t-distribution
8
Using table B on page 669:
• You have a t-score: what is the p-value?
t
n
Lower t in
Table B
Lower p in
Table B
Higher t in
Table B
Higher p in
Table B
P (1-sided)
P (2sided)
2.130 5
2.130 16
2.130 601
9
10
Using table B on page 669:
• You have a t-score: what is the p-value?
Lower t in
Table B
Lower p in
Table B
Higher t in
Table B
Higher p in
Table B
P (1-sided)
P (2sided)
2.130 5
1.533
.100
2.132
.050
p<.10
n.s.
2.130 16
1.753
.050
2.131
.025
p<.05
p<.10
2.130 601 1.960
.025
2.326
.010
p<.025
p<.05
t
N
11
Using STATA to find t-scores and p-values
• t-statistics and p-values using DISPLAY INVTTAIL and
DISPLAY TPROB:
– You provide the df and either the 1-tailed p or the 2-tailed
t
– compare to table B, page 669
– examples given for sample sizes 10000 and 5 (df = n – 1)
– Compare also to invnorm and normprob
. display invttail(9999,.025)
1.9602012
. display invttail(4,.025)
2.7764451
. display tprob(9999,1.96)
.05002352
. display tprob(4,1.96)
.12155464
12
STATA commands for section 6.5 or 6.2
• immediate test for sample mean using TTESTI:
• (note use of t-score, not z-score)
. * for example, in A&F problem 6.8, n=100 Ybar=508 sd=100 and mu0=500
. ttesti 100 508 100 500, level(95)
One-sample t test
-----------------------------------------------------------------------------|
Obs
Mean
Std. Err.
Std. Dev.
[95% Conf. Interval]
---------+-------------------------------------------------------------------x |
100
508
10
100
488.1578
527.8422
-----------------------------------------------------------------------------Degrees of freedom: 99
Ho: mean(x) = 500
Ha: mean < 500
t =
0.8000
P < t =
0.7872
Ha: mean != 500
t =
0.8000
P > |t| =
0.4256
Ha: mean > 500
t =
0.8000
P > t =
0.2128
13
T-test example: small-sample study of Anorexia
• A study compared various treatments for young girls
suffering from anorexia. The variable of interest was the
change in weight from the beginning to the end of the study.
• For a sample of 29 girls receiving a cognitive behavioral
treatment, the changes in weight are summarized by
Ybar = 3.01 and s = 7.31 pounds
• “Does the cognitive behavioral treatment work?”
14
T-test example: small-sample study of Anorexia
• Assumptions:
– We are working with a random sample of some sort.
– Observations are independent of each other.
– Change in weight is an interval scale variable.
– Change in weight is distributed normally in the
population.
• Hypothesis:
– H0: µ = 0. The mean change in weight is zero for the
conceptual population of young girls undergoing the
anorexia treatment.
15
T-test example: small-sample study of Anorexia
• Test statistic: if Ybar =3.01, s = 7.31, and n=29, then
Standard error = 7.31/sqrt(29) = 1.357
t = 3.01 / 1.357 = 2.217
• P-value:
df = 29 – 1 = 28
T(.025, 28df) = 2.048, T(.010, 28df) = 2.467
2.467 > 2.217 > 2.048
.01 < p < .025
P < .025 (one-sided), so
P < .05 (two-sided)
16
T-test example: small-sample study of Anorexia
• conclusion: reject H0: girls who undergo the cognitive
behavioral treatment do not stay the same weight.
• By this analysis, the results of the study are statistically
significant. To conclude that the results are substantively
significant, we need to address more questions.
• Q: Is 3.1 pounds a meaningful increase in weight?
– Note: s = 7.31. This number has substantive as well as
statistical importance.
• Q: Would we really expect girls to have no change in weight
if there was no effect of the program?
17
confidence interval using a t-test
 s 
c.i.  Y  t.025 * s.e.Y  Y  t.025 

 n
• This is a formula for a 95% confidence interval for a twosided t-test.
• Anorexia example again:
– Ybar = 3.01, s=7.31, n=29, df=29-1=28, t(.025,28) = 2.048
• c.i. = 3.01 ± 2.048(7.31/SQRT(29)) = 3.01 ± 2.780
• c.i. = (0.23, 5.79)
18
6.6: Small-sample inference for a population proportion:
the Binomial Distribution
• With large samples, we have been treating population
proportions as a special case of a population mean, but with
slightly different equations.
– z = ( 
ˆ - o ) /s.e.
–
= (
ˆ - o ) / (σ0 / SQRT(N) )
–
=(
ˆ - o ) / ( [ SQRT(o(1- o)) ]

/ SQRT(N) )
• With small samples, however, tests for population means


require the specific assumption that the variable has a
normal distribution within the population.
• We need a statistic from which we can draw inferences
when np < 10 or n(1-p) < 10.
19
Definitions for the Binomial Distribution
• Often, a single ‘random trial’ will have two possible
outcomes, “yes” (=1) and “no (=0).
• Let B be a random variable generated by a yes/no process.
Then B has a probability distribution:
– P(B=1) = p ; P(B=0) = 1-p.
– a heads on a coin flip: p =.5;
– a 6 on a die role p: = .167;
– for left-handed p: = ~.10;
• For a fixed number of observations N, each observation falls
into one of the two categories.
• A key assumption is that the outcomes of successive
observations are independent.
– coin flips? left-handedness?
20

Probabilities for the Binomial Distribution
• If we know the population proportion  and the sample size
N, we can calculate the probability of exactly X outcomes for
any value of X from 0 to N:
N!
X
N X
P(X) 
 (1  )
X!(N  X)!
• where N! = 1*2*…*N
• example: What is the probability of getting 3 heads (and 1
tail) when flipping a coin four times?
• example: What is the probability of rolling a die 6 times and
getting exactly 1 six? Exactly 2 sixes?
21
Small sample example for population proportion.
• Gender and selection of manager trainees:
• If there is no gender bias in trainee selection and the
pool of potential trainees is 50% male and 50%
female, what is the possibility of getting only two
women in a sample of 10 trainees?
• Alternately, is there evidence of gender bias in
trainee selection?
22
Hypothesis test for a population proportion.
1. Assumptions: we are estimating a population
proportion, and the observations are dichotomous,
identical, and independent.
2. Hypothesis: Ho:  = .5, where  is the population
proportion of trainees who are women.
3. Test statistics: none: we calculate p-values by
hand using an exact application of the binomial
distribution.
a. P(0 women) = (10!/0!*10!)*(.5)0*(1-.5)10 = .000977
b. P(1 woman) = (10!/1!*9!)*(.5)1*(1-.5)9 = .000977
Binomial distribution for n= 10,  =.5:
x
0
1
2
3
4
5
6
7
8
9 10
P(x) .001 .010 .044 .117 .205 .246 .205 .117 .044 .010 .001
23
Hypothesis test for a population proportion.
4. p-value: the p-value is the sum of p(x) for every X
at least as unlikely as the x we measure.
a. with 2 women and 8 men, we get …
b. p = .001+.010+.044+.044+.010+.001 = .110
5. Conclusion: Do not reject Ho: from this sample,
we cannot conclude with certainty that women
and men do not have an equal chance of being
selected into the training program.
24
STATA command for binomial distributions
• immediate test for small sample proportion using BITESTI:
• In a jury of 12 persons, only two are women, even though
women constitute 53% of the jury-age population. Is this
evidence for systematic selection of men in the jury?
• bitesti 12 2 .53
•
N
Observed k
Expected k
Assumed p
Observed p
• -----------------------------------------------------------•
12
2
6.36
0.53000
0.16667
•
•
•
Pr(k >= 2)
= 0.998312
Pr(k <= 2)
= 0.011440
Pr(k <= 2 or k >= 11) = 0.017159
(one-sided test)
(one-sided test)
(two-sided test)
25
Alternative STATA command for testing probabilities:
useful for large n
immediate test for sample proportion using PRTESTI:
. * for proportion: in A&F problem 6.12, n=832 p=.53 and p0=.5
. prtesti 832 .53 .50, level(95)
One-sample test of proportion
x: Number of obs =
832
-----------------------------------------------------------------------------Variable |
Mean
Std. Err.
[95% Conf. Interval]
-------------+---------------------------------------------------------------x |
.53
.0173032
.4960864
.5639136
-----------------------------------------------------------------------------Ho: proportion(x) = .5
Ha: x < .5
z = 1.731
P < z = 0.9582
Ha: x != .5
z = 1.731
P > |z| = 0.0835
Ha: x > .5
z = 1.731
P > z = 0.0418
26
Comparison of a binomial distribution and a normal distribution
• with a large enough N, a binomial distribution will look like a
normal distribution.
• With small samples, and with very low or high sample
proportions, the binomial distribution is not normal enough
to allow us to extrapolate from a t-score to a p-value.
• With the binomial, we do not calculate means and standard
deviations: we calculate p directly.
27