Significance Tests
P-values and Q-values
Outline
Statistical significance in multiple testing
Empirical distribution of test statistics
Family-wide p-values
Correlation and p-values
False discovery rates
Tests and Test Statistics
The t-test is fairly robust to skew, but not to outliers
(“thick tails” of the distribution)
Non-parametric tests are robust, but lose too much
ability to detect differences (power)
Robust tests can be useful
Permutation tests are simple and easy to program
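Some authors use:
$$s_i = \frac{\bar{x}_{i,\mathrm{group1}} - \bar{x}_{i,\mathrm{group2}}}{SD_i + q_{SD}}$$
rather than
$$t_i = \frac{\bar{x}_{i,\mathrm{group1}} - \bar{x}_{i,\mathrm{group2}}}{SD_i}$$
where $q_{SD}$ is a constant taken from the distribution of the $SD_i$,
to reduce the number of low fold-changes among highly significant
scores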
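The two scores and a basic permutation test can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, SD_i is assumed to be the Welch standard error of the mean difference, and the offset is assumed to be a quantile (by default the median) of the SD_i across genes.

```python
import numpy as np

def two_group_scores(x, labels, q_level=0.5):
    """x: genes x samples array; labels: boolean array, True = group 1."""
    a, b = x[:, labels], x[:, ~labels]
    diff = a.mean(axis=1) - b.mean(axis=1)
    # Assumption: SD_i is the Welch standard error of the mean difference.
    sd = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    # Assumption: the offset is a quantile of the SD_i over all genes.
    q = np.quantile(sd, q_level)
    t = diff / sd          # ordinary t-like score
    s = diff / (sd + q)    # offset score: damps genes with very small SD_i
    return t, s

def permutation_pvalue(values, labels, n_perm=2000, seed=0):
    """Two-sided permutation p-value for one gene (difference of means)."""
    rng = np.random.default_rng(seed)
    observed = abs(values[labels].mean() - values[~labels].mean())
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(labels)   # shuffle group membership
        hits += abs(values[perm].mean() - values[~perm].mean()) >= observed
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the offset is positive, |s_i| is always smaller than |t_i|, and the reduction is largest for genes whose SD_i is small, which are the genes that tend to combine a low fold-change with a large t.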
Distribution of test statistics
Quantile plots of t-statistics. Left: random distribution; right: experiment
Distribution of Set of p-values
Multiple comparisons
Suppose 10,000 genes on a chip,
none actually differentially expressed
Each gene has a 5% chance of exceeding
the threshold score for a p-value of .05
(the definition of Type I error)
On average, 500 genes should exceed the .05
threshold ‘by chance’
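The expected count of 500 can be checked with a small simulation. This sketch assumes independent genes and uses the fact that p-values are uniform on [0, 1] under the null; the function name is hypothetical.

```python
import numpy as np

def count_false_positives(n_genes=10_000, alpha=0.05, seed=0):
    """Count null genes whose p-value falls below alpha by chance."""
    rng = np.random.default_rng(seed)
    # Under the null hypothesis, each p-value is uniform on [0, 1].
    pvals = rng.uniform(size=n_genes)
    return int((pvals < alpha).sum())

# Expected value is n_genes * alpha = 500; any single run scatters around it.
n_fp = count_false_positives()
```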
Family-Wide Error Rate
‘Corrected’ p-value:
Probability of finding at least one false positive among all
N tests
Normally all tests use the same threshold
Simplest correction (Bonferroni)
$p_i^* = \min(N p_i, 1)$
Fairly close to true false positive rate in simulations of
independent tests
Too conservative in practice!
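A minimal sketch of the Bonferroni correction. The Šidák form, which the slides mention later for independent genes, is included for comparison. Function names are hypothetical, and N defaults to the number of p-values supplied.

```python
import numpy as np

def bonferroni(pvals, n_tests=None):
    """Bonferroni-corrected p-values: min(N * p, 1)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size if n_tests is None else n_tests
    return np.minimum(n * p, 1.0)

def sidak(pvals, n_tests=None):
    """Sidak-corrected p-values: 1 - (1 - p)^N, exact for independent tests."""
    p = np.asarray(pvals, dtype=float)
    n = p.size if n_tests is None else n_tests
    return 1.0 - (1.0 - p) ** n
```

For small p the two corrections nearly coincide, and the Šidák value never exceeds the Bonferroni value.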
P-Values from Correlated Genes

Null distribution from independent genes:
.5    .3    .9
.7    .03   .1
.4    .9    .05
.6    .8    .4
.2    .2    .9

Null distribution from perfectly correlated genes:
.5    .3    .9
.5    .3    .9
.5    .3    .9
.5    .3    .9
.5    .3    .9

Null distribution from highly correlated genes:
.5    .3    .9
.45   .2    .95
.65   .25   .8
.4    .35   .75
.5    .4    .85

Rows: genes; columns: samples;
entries: p-values from randomized distribution
The Effect of Correlation
If all genes are uncorrelated, Sidak is
exact
If all genes were perfectly correlated
p-values for one are p-values for all
No multiple-comparisons correction needed
Typical gene data is highly correlated
The first component of the SVD may account for more than
half the variance
More sensitive tests possible if we can
generate joint null distribution of p-values
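The share of variance carried by the first SVD component can be measured directly. This is a sketch with a hypothetical function name; it assumes a genes x samples matrix and centers each gene across samples before decomposing.

```python
import numpy as np

def first_component_fraction(x):
    """Fraction of total variance explained by the first singular component.

    x: genes x samples array. Each gene is centered across samples first
    (an assumption about the preprocessing, not stated in the slides).
    """
    centered = x - x.mean(axis=1, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    energy = singular_values ** 2
    return energy[0] / energy.sum()
```

A value above 0.5 on real expression data would indicate the strong gene-to-gene correlation described above.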
Re-formulating the Question
Independent: ~5% of genes exceed .05
threshold, all the time
Perfectly Correlated: all genes exceed .05
threshold ~5% of the time
Realistically correlated: a fraction f1 of genes (.05 < f1 < 1)
exceeds the .05 threshold in a fraction f2 of the cases (.05 < f2 < 1)
New question: for a given f1 and threshold, how likely is
it that a fraction f1 of genes will exceed the
threshold?
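The three regimes can be illustrated by simulating null z-scores with a shared component. The one-factor correlation model, the parameter `rho`, and the function name are assumptions made for this sketch. The cut-off 1.96 corresponds to a two-sided p-value of .05.

```python
import numpy as np

def exceed_fractions(n_genes=1000, n_trials=2000, rho=0.0, z_cut=1.96, seed=0):
    """For each simulated null experiment, return the fraction of genes
    whose |z| exceeds z_cut.

    Assumed model: z_gene = sqrt(rho) * shared + sqrt(1 - rho) * own, so any
    two genes have correlation rho (0 = independent, 1 = perfectly correlated).
    """
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_trials, 1))      # one value per experiment
    own = rng.normal(size=(n_trials, n_genes))   # gene-specific noise
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own
    return (np.abs(z) > z_cut).mean(axis=1)
```

With `rho=0` the fractions cluster tightly around .05. With `rho=1` each fraction is either 0 or 1. Intermediate values of `rho` give a wide spread, and the empirical distribution of these fractions answers the question of how often a fraction f1 of genes exceeds the threshold.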
Step-Down p-Values
Calculate single-step p-values for genes: p1, …, pN
Order the smallest k p-values: p(1), …, p(k)
For each k, ask:
How likely are we to get k p-values less than p(k) if no
differences are real?
Generate null distribution by permutations
Yields more significant genes, at the same level of Type I error,
than single-step procedures
See Ge et al., Test, 2003
Bioconductor package multtest
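One widely used permutation-based step-down procedure is the Westfall-Young maxT adjustment. The sketch below is not the multtest implementation: it works on |t| statistics rather than on p-values (a simplification swapped in here), and the function names and the Welch t-statistic are assumptions.

```python
import numpy as np

def welch_t(x, labels):
    """Welch t-statistic per gene. x: genes x samples; labels: boolean array."""
    a, b = x[:, labels], x[:, ~labels]
    num = a.mean(axis=1) - b.mean(axis=1)
    den = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                  + b.var(axis=1, ddof=1) / b.shape[1])
    return num / den

def stepdown_maxt(x, labels, n_perm=1000, seed=0):
    """Step-down maxT adjusted p-values estimated from label permutations."""
    rng = np.random.default_rng(seed)
    t_obs = np.abs(welch_t(x, labels))
    order = np.argsort(-t_obs)            # most significant gene first
    counts = np.zeros(t_obs.size)
    for _ in range(n_perm):
        t_perm = np.abs(welch_t(x, rng.permutation(labels)))[order]
        # Successive maxima: entry j is the largest permuted statistic among
        # genes ranked j or lower (less significant) in the observed data.
        u = np.maximum.accumulate(t_perm[::-1])[::-1]
        counts += u >= t_obs[order]
    # Enforce monotonicity so adjusted p-values never decrease down the list.
    p_sorted = np.maximum.accumulate(counts / n_perm)
    p_adj = np.empty_like(p_sorted)
    p_adj[order] = p_sorted               # back to the original gene order
    return p_adj
```

Each gene is compared only against the maximum over genes that are no more significant than itself. This is why the step-down version rejects at least as many genes as the single-step version, which compares every gene against the overall maximum.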
False Discovery Rate
At threshold t*, what fraction of the genes called
significant are likely to be false positives?
Illustration: 10,000 independent genes
t       p       #sig    E(FP)   FDR*
1.96    .05     600     500     87%
2.57    .01     200     100     50%
3.29    .001    40      10      20%
In practice, use a permutation algorithm to compute the FDR
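A simple plug-in version of such a permutation estimate is sketched below. It takes the average number of genes passing the threshold under shuffled labels as E(FP) and divides by the observed count. The function names and the Welch t-statistic are assumptions; this is one common estimator, not necessarily the one the slides have in mind.

```python
import numpy as np

def abs_welch_t(x, labels):
    """|Welch t| per gene. x: genes x samples; labels: boolean array."""
    a, b = x[:, labels], x[:, ~labels]
    num = a.mean(axis=1) - b.mean(axis=1)
    den = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                  + b.var(axis=1, ddof=1) / b.shape[1])
    return np.abs(num / den)

def permutation_fdr(x, labels, t_cut, n_perm=200, seed=0):
    """Return (#significant, estimated E(FP), estimated FDR) at threshold t_cut."""
    rng = np.random.default_rng(seed)
    n_sig = int((abs_welch_t(x, labels) >= t_cut).sum())
    if n_sig == 0:
        return 0, 0.0, 0.0
    # Number of genes passing the threshold when group labels carry no signal.
    null_counts = [
        (abs_welch_t(x, rng.permutation(labels)) >= t_cut).sum()
        for _ in range(n_perm)
    ]
    e_fp = float(np.mean(null_counts))
    return n_sig, e_fp, min(e_fp / n_sig, 1.0)
```

For the independent-gene illustration, E(FP) is simply 10,000 × p. The plain ratio E(FP)/#sig gives 500/600 ≈ 83%, 100/200 = 50% and 10/40 = 25%, so the FDR* column may reflect a slightly different estimator.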
pFDR
How to estimate the FDR?
‘positive’ False Discovery Rate:
pFDR = E(#false positives/#positives | #positives > 0)
The ordinary FDR multiplies this by P(#positives > 0)
Simes’ inequality allows the FDR to be
computed from p-values
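One standard procedure built on Simes' inequality is the Benjamini-Hochberg step-up adjustment. The slides do not name it, so it is shown here as an assumption about what is meant. The adjusted values are often reported as q-values; the function name is hypothetical.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                         # ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)   # p_(k) * N / k
    # Step-up: take the running minimum from the largest p-value downward.
    adj_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adj = np.empty(n)
    adj[order] = np.minimum(adj_sorted, 1.0)
    return adj
```

Calling a gene significant when its adjusted value is at most 0.05 controls the FDR at 5% for independent tests.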