Power and Sample Size

Boulder 2004
Benjamin Neale
Shaun Purcell
To Be Accomplished
• Introduce Concept of Power via Correlation Coefficient (ρ) Example
• Identify Relevant Factors Contributing to Power
• Practical:
• Power Analysis for Univariate Twin Model
• How to use Mx for Power
Simple example
Investigate the linear relationship (r) between two random variables X and Y:
r=0 vs. r≠0 (correlation coefficient).
• draw a sample, measure X,Y
• calculate the measure of association r (Pearson product moment corr. coeff.)
• test whether r ≠ 0.
How to Test r ≠ 0
• assumed the data are normally distributed
• defined a null-hypothesis (r = 0)
• chosen an α level (usually .05)
• utilized the (null) distribution of the test statistic associated with r=0
• t = r √[(N-2)/(1-r²)]
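As a worked illustration of the test statistic, the following Python sketch evaluates t = r √[(N-2)/(1-r²)]. The function name `t_from_r` is an assumption made for this example, and the critical value 2.024 for t(38) at α = .05 (two-sided) is taken from the slides rather than computed.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Values from the worked example in the slides: N = 40, r = .303
t = t_from_r(0.303, 40)
CRITICAL_T38 = 2.024  # two-sided critical value for t(38), alpha = .05 (from the slides)
reject_null = abs(t) > CRITICAL_T38
print(f"t = {t:.3f}, reject r = 0: {reject_null}")
```

With these inputs the formula gives t of about 1.96, slightly above the 1.867 quoted in the worked example, so one of the slide's figures may have been rounded or mistyped. The conclusion (fail to reject r = 0) is the same either way.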
How to Test r ≠ 0
• Sample N=40
• r=.303, t=1.867, df=38, p=.06, α=.05
• As p > α, we fail to reject r = 0
Have we drawn the correct conclusion?
α = type I error rate:
probability of deciding r ≠ 0 (while in truth r=0)
α is often chosen to equal .05...why?
DOGMA
N=40, r=0, nrep=1000 – central t(38), α=0.05 (critical value 2.024)
[Figure: histogram of the 1000 simulated t statistics (NREP=1000, 4.6% significant), shown with the central t(38) density and its two 2.5% rejection regions; horizontal axis t from -5 to 5.]
Observed non-null distribution (r=.2) and null distribution
[Figure: histogram of simulated t statistics for rho=.20, N=40, Nrep=1000, with abs(t)>2.024 in 23% of replicates, shown with the null distribution t(38); horizontal axis t from -5 to 5.]
In 23% of tests of r=0, |t|>2.024 (α=0.05), and we thus draw the correct
conclusion of rejecting r = 0.
The probability of correctly rejecting the null hypothesis (r=0) when a true
effect exists is 1-β, or the power.
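The simulation behind these percentages can be sketched in Python as follows. This is a minimal illustration, not the code used for the slides: the function names, the seed, and the way correlated normal pairs are generated are assumptions, and the critical value 2.024 is taken from the slides.

```python
import math
import random

def simulate_t(rho: float, n: int, rng: random.Random) -> float:
    """Draw n (X, Y) pairs with population correlation rho; return t for r = 0."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)          # sample Pearson correlation
    return r * math.sqrt((n - 2) / (1 - r * r))

def rejection_rate(rho, n=40, nrep=1000, crit=2.024, seed=2004):
    """Proportion of nrep simulated samples in which |t| exceeds crit."""
    rng = random.Random(seed)
    hits = sum(abs(simulate_t(rho, n, rng)) > crit for _ in range(nrep))
    return hits / nrep

type1_rate = rejection_rate(0.0)   # true r = 0: should be near alpha = .05
power = rejection_rate(0.2)        # true r = .2: should be near the 23% quoted above
print(f"rho=0: {type1_rate:.3f}   rho=.2: {power:.3f}")
```

The exact proportions depend on the random seed, so they will only approximate the 4.6% and 23% reported for the original simulations.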
Hypothesis Testing
• Correlation Coefficient hypotheses:
– H0 (null hypothesis) is ρ=0
– HA (alternative hypothesis) is ρ ≠ 0
• Two-sided test, where ρ > 0 or ρ < 0 are one-sided
• Null hypothesis usually assumes no effect
• Alternative hypothesis is the idea being
tested
Summary of Possible Results
              accept H0    reject H0
H0 true       1-α          α
H0 false      β            1-β
α = type 1 error rate
β = type 2 error rate
1-β = statistical power

              Rejection of H0           Non-rejection of H0
H0 true       Type I error at rate α    Nonsignificant result (1-α)
HA true       Significant result (1-β)  Type II error at rate β
Power
• The probability of rejection of a false null-hypothesis depends on:
– the significance criterion (α)
– the sample size (N)
– the effect size (NCP)
“The probability of detecting a given effect size in a population from a sample of size N, using significance criterion α”
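To make the three factors concrete, here is a hedged Python sketch for the correlation example. It uses the Fisher z normal approximation, which is a stand-in chosen for this illustration and not the method shown on the slides; the function name `power_correlation` is likewise an assumption.

```python
import math
from statistics import NormalDist

def power_correlation(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for testing rho = 0 (Fisher z approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    ncp = math.atanh(rho) * math.sqrt(n - 3)   # effect size scaled by sample size
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

baseline = power_correlation(0.2, 40, 0.05)        # the example used in the slides
more_n = power_correlation(0.2, 160, 0.05)         # larger sample size
bigger_effect = power_correlation(0.4, 40, 0.05)   # larger effect size
stricter_alpha = power_correlation(0.2, 40, 0.01)  # more conservative alpha
print(baseline, more_n, bigger_effect, stricter_alpha)
```

Under this approximation the baseline comes out close to the 23% found by simulation, and power rises with N and with the effect size but falls when α is made stricter.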
Standard Case
[Figure: sampling distributions of the test statistic T, plotted as P(T), if H0 were true (alpha 0.05) and if HA were true, separated by the effect size (NCP); the areas α, β, and POWER = 1 - β are marked.]
Impact of Less Conservative alpha
[Figure: sampling distributions of T, plotted as P(T), if H0 were true (alpha 0.1) and if HA were true; the areas α, β, and POWER = 1 - β are marked.]
Impact of More Conservative alpha
[Figure: sampling distributions of T, plotted as P(T), if H0 were true (alpha 0.01) and if HA were true; the areas α, β, and POWER = 1 - β are marked.]
Increased Sample Size
[Figure: sampling distributions of T, plotted as P(T), if H0 were true (alpha 0.05) and if HA were true; the areas α, β, and POWER = 1 - β are marked.]
Increase in Effect Size
[Figure: sampling distributions of T, plotted as P(T), if H0 were true (alpha 0.05) and if HA were true, with Effect Size (NCP)↑; the areas α, β, and POWER = 1 - β are marked.]
Effects on Power Recap
• Larger Effect Size
• Larger Sample Size
• Alpha Level shifts <Beware the False
Positive!!!>
• Type of Data:
– Binary, Ordinal, Continuous
When To Do Power Calcs?
• Generally in the planning stages of a study
• Occasionally with a negative result
• No need if significance is achieved
• Computed to determine chances of success
Power Calculations Empirical
• Attempt to Grasp the NCP from Null
• Simulate Data under theorized model
• Calculate Statistics and Perform Test
• Given α, how many tests p < α
• Power = (#hits)/(#tests)
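The last two steps can be sketched in Python as below. The p-values are made up purely for illustration, and the binomial standard error is an addition for this example (it is not on the slide) that indicates how precise an estimate based on a given number of simulated tests is.

```python
import math

def empirical_power(p_values, alpha=0.05):
    """Return (power, standard error) from the p-values of simulated tests."""
    n_tests = len(p_values)
    hits = sum(p < alpha for p in p_values)         # tests with p < alpha
    power = hits / n_tests                          # Power = (#hits)/(#tests)
    se = math.sqrt(power * (1 - power) / n_tests)   # binomial standard error
    return power, se

# Hypothetical p-values from ten simulated data sets (illustrative only)
p_values = [0.003, 0.21, 0.04, 0.57, 0.012, 0.33, 0.049, 0.08, 0.001, 0.76]
power, se = empirical_power(p_values)
print(f"power = {power:.2f} (SE {se:.2f})")
```

With only ten tests the standard error is large, which is why empirical power is usually based on many more replicates (1000 in the correlation example).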
Practical: Empirical Power 1
• We will Simulate Data under a model
online
• We will run an ACE model, and test for C
• We will then submit our data and Shaun
will collate it for us
• While he’s collating, we’ll talk about
theoretical power calculations
Practical: Empirical Power 2
• First get F:\ben\2004\ace.mx and put it
into your directory
• We will paste our simulated data into this
script, so open it now in preparation, and
note both places where we must paste in
the data
• Note that you will have to fit the ACE
model and then fit the AE submodel
Practical: Empirical Power 3
• Simulation Conditions
– 30% A², 20% C², 50% E²
– Input: A of 0.5477, C of 0.4472, E of 0.7071
– 350 MZ, 350 DZ
– Simulate and Space Delimited at http://statgen.iop.kcl.ac.uk/workshop/unisim.html or click here in slide show mode
– Click submit after filling in the fields and you will get a page of data
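The input values are the square roots of the variance proportions, since each path coefficient is squared to give its variance component. A short Python check (the variable names are assumptions for this example):

```python
import math

# Variance proportions from the simulation conditions
variance_components = {"A": 0.30, "C": 0.20, "E": 0.50}

# Path coefficients entered on the simulation page: square roots of the proportions
path_coefficients = {name: round(math.sqrt(v), 4)
                     for name, v in variance_components.items()}
print(path_coefficients)
```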
Practical: Empirical Power 4
• With the data page, use control-a to select the
data, control-c to copy, and in Mx control-v to
paste in both the MZ and DZ groups.
• Run the ace.mx script with the data pasted in
and modify it to run the AE model.
• Report the A, C, and E estimates of the first
model, and the A and E estimates of the second
model as well as both the
-2log-likelihoods on the webpage
http://statgen.iop.kcl.ac.uk/workshop/ or click
here in slide show mode
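Comparing the two -2log-likelihoods amounts to a likelihood-ratio test of C. The Python sketch below assumes a 1 df chi-square test, matching the 1 df used for the power calculation later in these slides; the function name and the fit values are hypothetical and are not results from the practical.

```python
import math
from statistics import NormalDist

def lrt_1df(m2ll_full: float, m2ll_sub: float, alpha: float = 0.05):
    """Likelihood-ratio test with 1 df from two -2 log-likelihood values."""
    norm = NormalDist()
    chi2 = m2ll_sub - m2ll_full                      # difference in -2LL
    # A 1-df chi-square is a squared standard normal, so the normal CDF suffices
    p_value = 2 * (1 - norm.cdf(math.sqrt(max(chi2, 0.0))))
    critical = norm.inv_cdf(1 - alpha / 2) ** 2      # about 3.84 for alpha = .05
    return chi2, p_value, chi2 > critical

# Hypothetical -2LL values for the ACE and AE models (illustrative only)
chi2, p_value, reject_ae = lrt_1df(m2ll_full=4010.2, m2ll_sub=4015.9)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, C significant: {reject_ae}")
```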
Practical: Empirical Power 5
• Once all of you have submitted your
results we will take a look at the theoretical
power calculation, using Mx.
• Once we have finished with the theory
Shaun will show us the empirical
distribution that we generated today
Theoretical Power Calculations
• Based on Stats, rather than Simulations
• Can be calculated by hand sometimes, but
Mx does it for us
• Note that sample size and alpha-level are
the only things we can change, but we can
assume different effect sizes
• Mx gives us the relative power levels at
the alpha specified for different sample
sizes
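The idea behind such a calculation can be sketched in Python for a 1 df test. This is an illustration under stated assumptions, not Mx's internal routine: the function names are invented for the example, the NCP of 3.0 is a hypothetical value, and the sketch relies on the NCP growing in proportion to the sample size.

```python
from statistics import NormalDist

def power_chi2_1df(ncp: float, alpha: float = 0.05) -> float:
    """Power of a 1-df chi-square test with noncentrality parameter ncp."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    root = ncp ** 0.5
    return norm.cdf(root - z_crit) + norm.cdf(-root - z_crit)

def pairs_needed(ncp_observed, n_observed, target_power=0.80, alpha=0.05):
    """Find the NCP giving target_power by bisection, then rescale the sample size."""
    lo, hi = 0.0, 100.0                       # bracket for the required NCP
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_chi2_1df(mid, alpha) < target_power:
            lo = mid
        else:
            hi = mid
    return hi, n_observed * hi / ncp_observed  # NCP is proportional to N

# Hypothetical: an NCP of 3.0 obtained with 700 twin pairs (350 MZ + 350 DZ)
required_ncp, required_pairs = pairs_needed(ncp_observed=3.0, n_observed=700)
print(f"NCP needed: {required_ncp:.2f}, pairs needed: {required_pairs:.0f}")
```

For 80% power at α = .05 with 1 df the required NCP is about 7.85, so the hypothetical study above would need roughly 2.6 times as many pairs.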
Theoretical Power Calculations
• We will use the power.mx script to look at
the sample size necessary for different
power levels
• In Mx, power calculations can be
computed in 2 ways:
– Using Covariance Matrices (We Do This One)
– Requiring an initial dataset to generate a
likelihood so that we can use a chi-square test
Power.mx 1
! Simulate the data
!   30% additive genetic
!   20% common environment
!   50% nonshared environment
#NGroups 3
G1: model parameters
Calculation
Begin Matrices;
X lower 1 1 fixed
Y lower 1 1 fixed
Z lower 1 1 fixed
End Matrices;
Matrix X 0.5477
Matrix Y 0.4472
Matrix Z 0.7071
Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End
Power.mx 2
G2: MZ twin pairs
Calculation
Matrices = Group 1
Covariances A+C+E | A+C _
            A+C   | A+C+E /
Options MX%E=mzsim.cov
End

G3: DZ twin pairs
Calculation
Matrices = Group 1
H Full 1 1
Covariances A+C+E | H@A+C _
            H@A+C | A+C+E /
Matrix H 0.5
Options MX%E=dzsim.cov
End
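For reference, the expected covariance matrices that these two groups write to mzsim.cov and dzsim.cov can be worked out by hand. The Python sketch below does the arithmetic for the simulated values (30% A, 20% C, 50% E); the function name is an assumption for this example and it does not reproduce the Mx output format.

```python
def expected_covariances(a2=0.30, c2=0.20, e2=0.50):
    """Expected 2x2 twin covariance matrices under the ACE model."""
    variance = a2 + c2 + e2      # A+C+E on the diagonal
    mz_cov = a2 + c2             # MZ twins share all of A and C
    dz_cov = 0.5 * a2 + c2       # DZ twins share half of A (H = 0.5) and all of C
    mz = [[variance, mz_cov], [mz_cov, variance]]
    dz = [[variance, dz_cov], [dz_cov, variance]]
    return mz, dz

mz, dz = expected_covariances()
print("MZ:", mz)
print("DZ:", dz)
```

With unit total variance the MZ covariance is 0.50 and the DZ covariance is 0.35.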
Power.mx 3
! Second part of script
! Fit the wrong model to the simulated data
! to calculate power
#NGroups 3
G1 : model parameters
Calculation
Begin Matrices;
X lower 1 1 free
Y lower 1 1 fixed
Z lower 1 1 free
End Matrices;
Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End
Power.mx 4
G2 : MZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=mzsim.cov
Matrices= Group 1
Covariances A+C+E | A+C _
            A+C   | A+C+E /
Option RSiduals
End

G3 : DZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=dzsim.cov
Matrices= Group 1
H Full 1 1
Covariances A+C+E | H@A+C _
            H@A+C | A+C+E /
Matrix H 0.5
Option RSiduals
! Power for alpha = 0.05 and 1 df
Option Power= 0.05,1
End