Sample size for microarray experiments

R users: ssize package for sample size
Judith Boer
June 6, 2006
Introduction ssize package
• Simple R tool for sample size determination based on a pilot study
• Developed by Gregory Warnes (Pfizer)
• Available from BioC
• Input is a list of standard deviations from representative control arrays
ssize package (BioC, Gregory Warnes, Pfizer R&D)
• Estimate standard deviation for each gene based on
representative control arrays
• Set parameters for
• minimum effect size (fold change): 2
• maximum family-wise type I error rate: 0.05
• desired power: 0.80
• Calculate per-test error rate based on Bonferroni correction
• Compute sample size for each gene separately according to
standard formula for the two-sample t-test with pooled variance
• Summarize the necessary sample size across all genes using
a cumulative plot
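The steps on this slide can be sketched in a few lines. The sketch below is written in Python for illustration only and is not the ssize implementation: it uses the normal approximation to the two-sample pooled-variance t-test formula, so it gives slightly smaller numbers than an exact t-based calculation. The function names, the argument names and the list of standard deviations are hypothetical.

```python
from math import ceil, log2
from statistics import NormalDist


def per_gene_sample_size(sd, fold_change=2.0, fwer=0.05, power=0.80, n_genes=1):
    """Approximate number of arrays per group needed for one gene.

    Assumes log2-scale data and a normal approximation to the
    two-sample pooled-variance t-test (not the exact ssize calculation).
    """
    alpha = fwer / n_genes            # Bonferroni: per-test error rate
    delta = log2(fold_change)         # effect size on the log2 scale
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return max(2, ceil(n))            # at least 2 arrays per group


def cumulative_fraction(sizes, n):
    """Fraction of genes whose required sample size is at most n."""
    return sum(s <= n for s in sizes) / len(sizes)


# Hypothetical per-gene standard deviations from a pilot study.
sds = [0.2, 0.4, 0.8]
sizes = [per_gene_sample_size(sd, n_genes=len(sds)) for sd in sds]
print(sizes, cumulative_fraction(sizes, 8))
```

The cumulative fraction evaluated over a range of n is what the cumulative plot summarizes: the proportion of genes that reach the desired power with n arrays per group.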
Assumption and output of ssize
• Microarray data has been normalized and transformed so that
the data for each gene is sufficiently close to a normal
distribution that a standard 2-sample pooled-variance t-test will
reliably detect differentially expressed genes
• however, alternative test methods can be implemented
• Output of the ssize package is three plots:
• power plot
• sample size plot
• fold change plot
Performance of ssize in simulation study
• Manuscript submitted to Biometrics: Warnes & Liu
• Dependence of the genes: little or no effect
• Proportion of genes with true differential expression: no effect
• Unequal variance between control and test groups: has effect
• solution: use unequal variance t-test sample size formula
• Multiple comparison method: use of Bonferroni correction
considerably underestimates power, hence overestimates
sample size
• use of FDR planned by Warnes
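As an illustration of the unequal-variance case, a common normal-approximation sample size formula for two groups of equal size with different standard deviations is sketched below. This is an assumption about what such a formula looks like, not necessarily the one used by Warnes & Liu; the function and argument names are hypothetical.

```python
from math import ceil, log2
from statistics import NormalDist


def n_unequal_variance(sd_control, sd_test, fold_change=2.0, alpha=0.05, power=0.80):
    """Approximate arrays per group when the two groups differ in variance.

    Normal approximation with equal group sizes; alpha is the
    per-test (already corrected) two-sided error rate.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    delta = log2(fold_change)  # effect size on the log2 scale
    return ceil(z_sum ** 2 * (sd_control ** 2 + sd_test ** 2) / delta ** 2)


# With equal standard deviations this reduces to the pooled-variance result.
print(n_unequal_variance(1.0, 1.0))
# A noisier test group raises the required sample size.
print(n_unequal_variance(1.0, 2.0))
```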
Demonstration on our data
• Demonstrated on Agilent two-color data from a platform comparison
study by Peter-Bram ’t Hoen and the LGTC
• 10 arrays with direct comparisons of 5 WT and 5 transgenic mice,
using dye swap replicates
• Loess normalization in limma
• Exported the MA list object
• Extracted the normalized log ratios for the KO vs WT with same dye
orientation (first 5 arrays)
• Calculated the standard deviation of the log ratios (exp.sd)
• Used exp.sd for the ssize package
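The per-gene standard deviation step can be sketched as follows. The original analysis was done in R on the limma output; this Python version only illustrates the calculation, and the gene names and log ratios are made-up toy values, not data from the study.

```python
from statistics import stdev


def per_gene_sd(log_ratios):
    """Sample standard deviation of the log ratios of each gene.

    log_ratios maps a gene identifier to its normalized log ratios,
    one value per array (here: 5 arrays with the same dye orientation).
    """
    return {gene: stdev(values) for gene, values in log_ratios.items()}


# Hypothetical toy data: three genes measured on five arrays.
toy_log_ratios = {
    "gene_a": [0.10, -0.05, 0.20, 0.00, -0.15],
    "gene_b": [1.10, 0.70, 1.40, 0.90, 1.00],
    "gene_c": [-0.60, 0.50, 0.10, -0.30, 0.40],
}
exp_sd = per_gene_sd(toy_log_ratios)
print(exp_sd)
```

The resulting list of standard deviations plays the role of exp.sd, the input to the sample size calculation.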
Histograms of the standard deviations
Power plot: power to detect 2-fold change
Sample size plot: sample size to detect 2-fold change
Fold change plot: fold change to achieve 80% power
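The quantities behind the power plot and the fold change plot can be sketched in the same way as the sample size: fix the number of arrays per group and solve for power, or fix the power and solve for the detectable fold change. The sketch below again uses a normal approximation on log2-scale data and is not the ssize code; names are hypothetical, sd must be positive, and alpha is the per-test error rate (for Bonferroni, the family-wise rate divided by the number of genes).

```python
from math import log2, sqrt
from statistics import NormalDist


def approx_power(sd, n, fold_change=2.0, alpha=0.05):
    """Approximate power for one gene with n arrays per group."""
    z = NormalDist()
    se = sd * sqrt(2 / n)  # standard error of the group difference
    return z.cdf(log2(fold_change) / se - z.inv_cdf(1 - alpha / 2))


def detectable_fold_change(sd, n, power=0.80, alpha=0.05):
    """Smallest fold change detectable at the given power with n arrays per group."""
    z = NormalDist()
    se = sd * sqrt(2 / n)
    delta = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se
    return 2 ** delta      # back from the log2 scale to a fold change


# Hypothetical gene with sd = 0.5 on the log2 scale and 5 arrays per group.
print(approx_power(0.5, 5), detectable_fold_change(0.5, 5))
```

Computing these values for every gene and plotting the cumulative proportion of genes gives curves of the kind shown on the three slides above.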
Conclusions ssize package
• Not useful for absolute estimation of sample size, because the
Bonferroni correction overestimates the sample size needed
• Their own example comparing Bonferroni and FDR showed
reduction of sample size needed for 90% power from 8 to
below 3 arrays (curve very steep)
• Useful for relative sample size estimation
• compare different microarray platforms
• compare different biological sources (organism, tissue, treatment,
in vitro)
• Simple tool that anyone can use in R