Transcript: Determining Sample Size

```
Determining Sample Size
1
Statistical Significance
What factors influence the probability of statistical significance?
◦ Alpha
◦ Sample size
◦ Amount of variability in the sample
◦ Magnitude of differences between groups/categories/intervals
3
Determining Sample Size
$n = \left(\frac{t \cdot s}{E}\right)^2$
Where
◦ n = sample size
◦ t = t score associated with the desired significance level
◦ s = the estimated standard deviation
◦ E = the amount of error that can be tolerated
4
Where to get s?
If we don't have population data, how do we know or even estimate s?
◦ One solution: take a small sample
◦ Not always practical
6
Sample Size for Proportion
$n = \left(\frac{t \cdot s}{E}\right)^2$
With a proportion, $s = \sqrt{p(1-p)}$, so the largest s is associated with a proportion of .5
Using .5 is thus a "prudent" assumption when choosing sample size
7
Example
Example: How big should the NYS Housing and Community Renewal survey be?
◦ Want to be at least 90% confident
◦ Can tolerate a margin of error of plus or minus three percentage points
◦ $n = \left(\frac{t \cdot s}{E}\right)^2$
8
POWER
9
IS MY COIN FAKE?
How many flips before you are confident the coin is fake?
Flips   Probability
1       0.5
2       0.25
3       0.125
4       0.0625
5       0.03125
6       0.015625
7       0.007813
8       0.003906
9       0.001953
10      0.000977
11
Relationship between Power and hypothesis testing
                           Accept Null Hypothesis    Reject Null Hypothesis
Null Hypothesis is true    Correct decision          Type I error (alpha typically set to 5%)
Null Hypothesis is false   Type II error             Correct decision (the probability of making this correct inference is defined as Power)
12
Requirements to estimate Power
Type of test (e.g. two-sample independent t-test, one tail)
Alpha
Effect size of interest
How much accuracy is desirable
Sample size
Standard deviation of sample
13
Requirements to estimate Power
Type of test (e.g. two-sample independent t-test, one tail)
◦ Given
14
Requirements to estimate Power
Alpha
◦ Prefer to avoid Type I error (rejecting the null hypothesis although it is true): use a lower alpha (.01)
◦ Prefer to avoid Type II error (accepting the null hypothesis although it is false): use a higher alpha (.05)
15
Requirements to estimate Power
Effect size of interest
◦ Determined by theory or intuition
  • Are men heavier than women? What is an "important" difference?
  • Two kilograms?
  • Twenty kilograms?
16
Requirements to estimate Power
Effect size of interest
◦ Cohen's D
◦ $\text{Cohen's } D = \frac{M_t - M_c}{SD_{pooled}}$
  • $M_t$ = mean of treatment or group 1
  • $M_c$ = mean of control or group 2
◦ $SD_{pooled} = \sqrt{\frac{SD_t^2 (n_t - 1) + SD_c^2 (n_c - 1)}{n_c + n_t - 2}}$
17
Requirements to estimate Power
Cohen's D
◦ Tells us how big a difference is substantively important
◦ Expresses difference in standard deviation units
Rules of thumb
◦ .2 small effect
◦ .5 moderate effect
◦ .8 large effect
Consider using Cohen's D if you have no intuition about effect size or what is an important difference
18
Stata Examples
Class data
◦ Are men heavier than women?
10.2 in textbook
Captain Beaver is warned by Colonel Verleaf that if the mean efficiency rating for the 150 platoons under Verleaf's command falls below 80, Captain Beaver will be transferred to Minot Air Base (a base in the middle of nowhere). Beaver takes a sample of 20 platoons and finds the following: mean = 85; s = 13.5
◦ Null hypothesis µ = 80
◦ Alternative hypothesis µ = 85
◦ sd = 13.5
◦ n = 20
19
Problem 10.2
. *PROBLEM 10.2
. power onemean 80 85, sd(13.5) n(20)

Estimated power for a one-sample mean test
t test
Ho: m = m0 versus Ha: m != m0

Study parameters:
        alpha =    0.0500
            N =        20
        delta =    0.3704
           m0 =   80.0000
           ma =   85.0000
           sd =   13.5000

Estimated power:
        power =    0.3495
20
Stata Example
Power for Proportion
12.10 in Book
VISTA manager William suspects 50% of his volunteers are over 65 years old. A survey of 16 volunteers reveals that 44% are over age 65. How much power does he have?
21
Problem 12.10
. *PROBLEM 12.10
. power oneproportion .5 .44, test(wald) n(16)

Estimated power for a one-sample proportion test
Wald z test
Ho: p = p0 versus Ha: p != p0

Study parameters:
        alpha =    0.0500
            N =        16
        delta =   -0.0600
           p0 =    0.5000
           pa =    0.4400

Estimated power:
        power =    0.0772
22
Sample size and power
[Figure: Estimated power for a one-sample mean test (t test), H0: μ = μ0 versus Ha: μ ≠ μ0. Y axis: Power (1-β), 0 to 1. X axis: Sample size (N), 0 to 100. Parameters: α = .05, δ = .37, μ0 = 80, μa = 85, σ = 14]
23
Effect Size and Power
[Figure: Estimated power for a one-sample mean test (t test), H0: μ = μ0 versus Ha: μ ≠ μ0. Y axis: Power (1-β), 0 to 1. X axis: Alternative mean (μa), 60 to 100. Parameters: α = .05, N = 20, μ0 = 80, σ = 14]
24
Are incomes higher in Mixed Income Developments?
NYSHCR survey of tenants
0=Not mixed income, 1 = mixed income
25
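Are incomes higher in Mixed Income Developments?
. *ARE INCOMES HIGHER IN MIXED INCOME DEVELOPMENTS?
. ttest household_income, by(mixed_income)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |   2,000    22499.16    412.2586    18436.76    21690.66    23307.66
       1 |     395    26554.08    716.2176    14234.54    25145.99    27962.17
---------+--------------------------------------------------------------------
combined |   2,395    23167.92    365.2108    17872.95    22451.76    23884.08
---------+--------------------------------------------------------------------
    diff |           -4054.925    980.8007               -5978.231   -2131.618
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -4.1343
Ho: diff = 0                                     degrees of freedom =     2393

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000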
26
Are incomes higher in Mixed Income Developments?
. power twomeans 22499 26554, sd1(18436) sd2(14234) n1(2000) n2(395)

Estimated power for a two-sample means test
Satterthwaite's t test assuming unequal variances
Ho: m2 = m1 versus Ha: m2 != m1

Study parameters:
        alpha =    0.0500
            N =      2395
           N1 =      2000
           N2 =       395
        N2/N1 =    0.1975
        delta = 4055.0000
           m1 =  2.25e+04
           m2 =  2.66e+04
          sd1 =  1.84e+04
          sd2 =  1.42e+04

Estimated power:
        power =    0.9984
27
```
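
The sample-size formula from the "Determining Sample Size" and "Sample Size for Proportion" slides can be worked through for the NYS Housing and Community Renewal example. What follows is a minimal Python sketch, not part of the original slides. The function name `sample_size` is hypothetical, and the sketch assumes a two-sided normal z score in place of the slide's t score (reasonable only when the resulting sample is large) together with the prudent s = .5 for a proportion.

```python
import math
from statistics import NormalDist


def sample_size(confidence, s, margin_of_error):
    """Return n = (z * s / E)^2, rounded up to a whole respondent.

    Assumption: a two-sided normal z score stands in for the t score
    on the slide, which is reasonable only for large samples.
    """
    # Two-sided critical value, about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / margin_of_error) ** 2)


# 90% confidence, s = .5 (prudent choice for a proportion), E = .03
n = sample_size(confidence=0.90, s=0.5, margin_of_error=0.03)
print(n)  # 752
```

Under these assumptions the survey would need about 752 respondents. Tightening the margin of error or raising the confidence level increases n, since E sits in the denominator and z in the numerator of the squared term.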
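
The Cohen's D and pooled standard deviation formulas from the effect-size slides can be applied to the group statistics reported in the `ttest` output for the NYSHCR survey. This is a hedged sketch rather than anything shown in the slides: the function names `pooled_sd` and `cohens_d` are hypothetical, and treating the mixed-income group (group 1) as the "treatment" group is an assumption made only for illustration.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation as defined on the Cohen's D slide."""
    numerator = sd_t**2 * (n_t - 1) + sd_c**2 * (n_c - 1)
    return math.sqrt(numerator / (n_c + n_t - 2))


def cohens_d(mean_t, mean_c, sd_t, n_t, sd_c, n_c):
    """Difference in means expressed in pooled standard deviation units."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Values from the ttest output: group 1 = mixed income, group 0 = not mixed.
# Assumption: mixed-income developments play the role of the treatment group.
d = cohens_d(mean_t=26554.08, mean_c=22499.16,
             sd_t=14234.54, n_t=395,
             sd_c=18436.76, n_c=2000)
print(round(d, 2))  # 0.23
```

By the rules of thumb in the slides, a value near .2 is a small effect, even though the difference is highly statistically significant with 2,395 tenants in the sample.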
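
The three Stata `power` results in the slides can be roughly cross-checked by hand. The sketch below is an approximation under stated assumptions, not a reimplementation of Stata: the helper `approx_power` is hypothetical, and it treats the test statistic as normally distributed rather than t distributed, so it should only be expected to track Stata closely for z tests or large samples.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(difference, standard_error, alpha=0.05):
    """Approximate power of a two-sided test using the normal distribution.

    Assumption: the test statistic is treated as normal rather than t,
    so small-sample results will differ from a t-based calculation.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = difference / standard_error
    # Probability of landing in either rejection region under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


# Problem 10.2: one-sample mean, 80 versus 85, sd = 13.5, n = 20
power_mean = approx_power(85 - 80, 13.5 / sqrt(20))

# Problem 12.10: one-sample proportion (Wald), .5 versus .44, n = 16
power_prop = approx_power(0.44 - 0.5, sqrt(0.44 * 0.56 / 16))

# Mixed-income example: two means with unequal variances
power_two = approx_power(26554 - 22499,
                         sqrt(18436**2 / 2000 + 14234**2 / 395))

print(round(power_mean, 2), round(power_prop, 4), round(power_two, 4))
```

The proportion and two-sample approximations land close to the 0.0772 and 0.9984 that Stata reports. For Problem 10.2, with only 20 platoons, the normal approximation gives roughly 0.38, noticeably above Stata's t-based 0.3495, which illustrates why the t distribution matters for small samples.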