Measures of Significance
Jim Linnemann
Michigan State University
U. Toronto
Dec 4, 2003
A Prickly Problem
not to everyone’s taste…
• What is Significance?
• Li and Ma Equations
• Frequentist Methods
• Bayesian Methods
• Is Significance well-defined?
• To Do
• Help Wanted
• Summary
A simple problem with lots of answers…
• Two Poisson measurements
  – 1) “background region”
  – 2) “signal region”
  – Poisson assumptions usually well satisfied
• Is the 2nd measurement predicted by the first?
  1) Measure significance of failure of prediction:
     Did I see something potentially interesting? I.e., not background
  2) If “significant enough”, proceed to estimation of the size of the new effect
I’ll describe practice in High Energy Physics and Gamma Ray Astronomy (behavioral statistics)
Crossed Disciplines…
To physicists, this may sound like a statistics talk
To statisticians… a physics talk
Or both HEP and Astrophysics (or neither…)
Apologies, but I’m trying to cross boundaries
That’s what a sabbatical is for, no?
From HEP (D0 experiment)
Working on Gamma Ray Astronomy (Milagro experiment)
Help?
• To my statistical colleagues:
  – Better understand Fraser-Reid with nuisance parameters
    (prime motive for visit)
    • Geometrical intuition for nuisance parameters
  – Is the indeterminacy of Z well understood?
  – Is the Binomial–Bayes identity interesting?
  – Is the Binomial test really best?
My Original Goal
What Can HEP and Astrophysics Practice Teach Each Other?
Astrophysics (especially γ ray):
  aims at simple formulae (very fast)
  calculates σ’s directly (Asymptotic Normal)
  hope it’s a good formula
HEP (especially Fermilab practice):
  calculates probabilities by MC (general; slow)
  translates into σ’s for communication
  loses track of analytic structure
Statistics: what can I learn?
This is a progress report…
D0 (a tiny slice) and Milagro
• D0: searches for new particles (Top Quark; SUSY…)
  – Many input events
  – Many selection criteria
  – ~10 events in signal region, maybe all background
• Milagro: searches for new or weak sources
  – Many input events (mostly hadron showers)
  – Few photons among this background
  – Few selection criteria
  – Cases:
    • A) ~10^6 background events, possibly 10^-3 photon excess
    • B) many places to look in the sky
      – 10^4 or more trials (multiple comparisons)
    • C) many time scales for variable phenomena
      – 1 ms to 1 year, so up to 10^14 trials
  – ~10–100 counts in search region, maybe all background
DØ is a large scale experiment…
…with some complexities that come with scale
Many different kinds of analyses: I’m talking about searches only!
Experimenters have little choice but to love their apparatus—we spend years building it!
Milagro is a smaller experiment
• “Only” 30 people
• ~100 man-years, ~$5M construction cost
• ~5 year data-taking lifetime
Los Alamos, NM, 2600 m
Milagro Schematic
Use water instead of scintillators to detect EAS particles
100% of the area is sensitive (instead of 1% with scintillator)
[Schematic: e, μ, γ shower particles; pond 8 meters deep, 50 meters × 80 meters]
Low energy threshold (100 GeV); median energy ~4 TeV
High duty cycle (~95%) (vs. ~10% for air Cherenkov)
Large field of view (~2 sr) (vs. ~0.003 sr)
Good background rejection (~90%), but needs work!
Trigger rate 1.7 kHz
Milagro pointing: about 1 degree
Moon Shadow: proton energy scale and angular resolution
[Figure: significance maps of the moon shadow (Declination axis shown); E = 640 ± 70 GeV (MC 690 GeV); σθ = 0.9°]
The Northern Hemisphere in TeV Gamma Rays: 12/00–12/01
One year survey detected the Crab and the active galactic nucleus Mrk 421, with the rest of the sky consistent with no detections, giving upper limits for the predicted TeV sources indicated by circles.
[Sky map: σ from background vs. Right Ascension; labels: Mrk 421, Hot Spot, Crab]
The Crab Nebula
                 Raw Data      Cut Data
On:              16,987,703    1,952,917
Off:             16,981,520    1,945,109
Excess:                        7,808 (~10/day)
Significance:    1.4 σ         5.4 σ
Milagrito Results – MRK 501
Significance: Signal Search with Poisson Background
• Z value: ~ Normal(0,1) (Astro. usage; Milagro; Li Ma)
• The art is to pick a good variable for this
More generally:
• P(more extreme “signal” | background) (HEP usage)
  = P-value (similar to Chi-squared probability)
  – Assume Null Hypothesis = background only:
    P(data worse | bkg), not P(not bkg | data)
  – Translate probability p into a Z value by:

    p = ∫_Z^∞ (1/√(2π)) e^(−t²/2) dt,   i.e.  Z = Φ^(−1)(1 − p)

    Z ≈ √(u − ln u),   u = −2 ln(p √(2π))
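As a concrete illustration (not from the talk itself), a minimal Python sketch of this p ↔ Z translation, assuming SciPy; p_to_z_approx implements the asymptotic formula above:

```python
# Sketch: converting a one-sided p-value to a Z value and back.
from scipy.stats import norm
import numpy as np

def p_to_z(p):
    """p-value -> significance: Z = Phi^-1(1 - p)."""
    return norm.isf(p)              # inverse survival function; stable for small p

def z_to_p(z):
    """Significance -> p-value: p = 1 - Phi(Z)."""
    return norm.sf(z)

def p_to_z_approx(p):
    """Asymptotic form quoted above: Z ~ sqrt(u - ln u), u = -2 ln(p*sqrt(2 pi))."""
    u = -2.0 * np.log(p * np.sqrt(2.0 * np.pi))
    return np.sqrt(u - np.log(u))

print(p_to_z(2.87e-7))              # ~5.0: the conventional "5 sigma"
print(p_to_z_approx(2.87e-7))       # ~5.0 as well
```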
Observed vs. Prospective Significance
• This discussion: Observed Significance (of my data)
  – Post-hoc (after data)
  – Need definition of Z
  – Choice of Zmin for an observational claim
    = max P(observed | background)
• Prospective Observability (before data, to optimize the expt.)
  – Involves more: both signal and background
  – Naive calculation: Z = S/√B
    (ignores fluctuations: significance for expected conditions)
  – Optimistic: crudely, ½ the time less signal, or ½ the time more background!
  – Should consider Pr(Z > Zmin) (making an observational claim)
    Source strength for 50% probability of observation? 90%? (power)
Backgrounds in Astro and HEP
• Astrophysics: On-source vs. Off-source
  – side observation with α = Ton/Toff (sensitivity ratio)
  – b = α Noff; δb = α √Noff
  – α = (δb)² / b (deduced from above)
• HEP: estimate background in a defined signal region
  – Sometimes a sideband measurement, like Astrophysics
  – Often a MC estimate, rescaled to signal sensitivity
  – More often a sum of terms of both types
  – b ± δb
  – δb: uncertainties in quadrature
  α = (δb)² / b
  I’ll use this as a definition of effective α
  Apply astrophysics formulae (non-integer N)
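A minimal sketch of this effective-α bookkeeping, assuming only the definitions above (the function name is mine):

```python
# Sketch: map a HEP-style background estimate b +/- db onto the on/off
# (alpha, Noff) formalism; the non-integer "Noff" then feeds the
# astrophysics formulae.
def effective_alpha(b, db):
    """alpha = db**2 / b, so that b = alpha * Noff with Noff = b/alpha."""
    alpha = db**2 / b
    n_off = b / alpha              # = b**2 / db**2, generally non-integer
    return alpha, n_off

print(effective_alpha(25.0, 5.0))  # (1.0, 25.0)
```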
Li and Ma Equations
Z = S / σ(S)
S = Non − b;  b = α Noff
(N is observation; b is background estimate)
Eq 5: Var(S) = Var(Non) + Var(b) = Non + α² Noff
  Ignores key null hyp constraint: μon = α μoff (anti-signal bias!)
Eq 9: Var(S) = α (Non + Noff)
  Obeys constraint; uses Non and Noff to estimate μoff
(Eqs 5, 9 use Poisson → Normal)
Li and Ma don’t recommend Eq 5 or 9, but instead:
Eq 17: Log Likelihood Ratio λ = L(μ̂b) / L(μ̂s, μ̂b)
  Generic test for composite hypothesis + Wilks’ Theorem:

  Z = √2 · √( x ln[ (1+α)/α · x/(x+y) ] + y ln[ (1+α) · y/(x+y) ] ),   x = Non; y = Noff
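For concreteness, a sketch of Eqs 5, 9, and 17 as quoted above (Python with NumPy; the test counts are made up):

```python
# Sketch comparing the three Li & Ma significance formulas quoted above.
import numpy as np

def z_eq5(n_on, n_off, alpha):
    """Eq 5: Var(S) = Non + alpha**2 * Noff (ignores the null constraint)."""
    return (n_on - alpha * n_off) / np.sqrt(n_on + alpha**2 * n_off)

def z_eq9(n_on, n_off, alpha):
    """Eq 9: Var(S) = alpha*(Non + Noff) (respects mu_on = alpha*mu_off)."""
    return (n_on - alpha * n_off) / np.sqrt(alpha * (n_on + n_off))

def z_eq17(n_on, n_off, alpha):
    """Eq 17: Z = sqrt(-2 ln lambda), the likelihood-ratio form."""
    x, y = float(n_on), float(n_off)
    t1 = x * np.log((1.0 + alpha) / alpha * x / (x + y))
    t2 = y * np.log((1.0 + alpha) * y / (x + y))
    return np.sqrt(2.0 * (t1 + t2))

for f in (z_eq5, z_eq9, z_eq17):        # made-up counts: b = 100, excess = 20
    print(f.__name__, f(120, 1000, 0.1))
```

For these counts Eq 5 comes out lowest (~1.75 vs. ~1.84 for Eq 17), consistent with the anti-signal bias noted above.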
A Li and Ma Variant
Eq 5c: Var(S) = α (1+α) Noff
• Use only Noff to estimate Var(Non) and Var(b)
  – Respects the null hypothesis; not biased
  – Obviously, poor sensitivity if α > 1
    • But that’s always true—need a decent estimate of the background
• Get Eq 9 again if both Non and Noff are used in the M.L. estimate of μon = α μoff
Other Frequentist Methods
Ignoring uncertainty in b:
• S/√B (Li Ma 10a—not recommended)
• Poisson(≥ Non | b) (often much better)
  = Γ(Non, 0, b) / Γ(Non)  (incomplete gamma)
• Feldman & Cousins? (confidence limits!)
  – For significance, just Poisson(≥ Non | b), I believe
Other Frequentist Methods
Using uncertainty in b:
• b + δb instead of b in above (I’ve seen it!)
• Near-Constant Variance (Zhang and Ramsden)

  Z_VS = 2/√(1+α) · ( √(x + 3/8) − √(α (y + 3/8)) )

  (a code sketch follows this list)
• Fraser Reid Likelihood to Significance Asymptotics
• Binomial Test
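Here is a sketch of the variance-stabilized Z_VS written above, under the assumption that the 3/8 offsets are the usual Anscombe-type stabilizing constants (illustrative counts, function name mine):

```python
# Sketch of the Zhang & Ramsden near-constant-variance significance.
import numpy as np

def z_vs(n_on, n_off, alpha):
    """Z_VS = 2/sqrt(1+alpha) * (sqrt(Non + 3/8) - sqrt(alpha*(Noff + 3/8)))."""
    return 2.0 / np.sqrt(1.0 + alpha) * (
        np.sqrt(n_on + 0.375) - np.sqrt(alpha * (n_off + 0.375)))

print(z_vs(120, 1000, 0.1))   # ~1.85, close to the likelihood-ratio value
```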
Fraser and Reid Approximate Significance
• Interesting approximate method (last 15 yr)
• Significance from the likelihood curve
• Can view as an asymptotic expansion of the pdf for large n
  – Combine 2 first-order estimates of significance, each O(n^-1/2):
    • Zr from Likelihood Ratio + Wilks Theorem (usually the better estimate)
    • Zt = D/σ, from D = θ* − θ; σ from Fisher information ∂²L/∂θ²
  – Combine to give an O(n^-3/2) estimate: 1st order → 3rd order
    One version: Z3 = Zr + (1/Zr) ln(Zt/Zr)
• Fast & simple numerically to apply the final formula
• But must redo the algebra for each new kind of problem
  – I’m still working to apply it to Non, Noff fully
    • nuisance parameter; reduction to canonical exponential parameters
[Figure: two first-order estimates of significance, p = Φ(Z(x)) vs. x, one from the Likelihood Ratio; the corrected curve is closer to the Likelihood Ratio estimate]
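To make the recipe concrete, a sketch for the simplest case: a Poisson count n with known background b and no nuisance parameter (the full Non, Noff problem the talk is after needs more algebra). The canonical parameter is θ = ln μ, so Zt = (ln n − ln b)√n; as the comparison table later notes, this canonical version tracks the mid-p Poisson value. All names here are mine:

```python
# Sketch of the third-order Fraser-Reid combination for Poisson(n) vs.
# a known background mean b.
import numpy as np
from scipy.stats import norm, poisson

def z_fraser_reid(n, b):
    """Z3 = Zr + (1/Zr) * ln(Zt/Zr)."""
    zr = np.sign(n - b) * np.sqrt(2.0 * (n * np.log(n / b) - (n - b)))
    zt = (np.log(n) - np.log(b)) * np.sqrt(n)   # canonical theta = ln(mu); info = n
    return zr + np.log(zt / zr) / zr

n, b = 10, 4.0
print(z_fraser_reid(n, b))                                     # ~2.57
print(norm.isf(poisson.sf(n - 1, b)))                          # exact tail: ~2.40
print(norm.isf(poisson.sf(n, b) + 0.5 * poisson.pmf(n, b)))    # mid-p: ~2.54
```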
Binomial Proportion Test: Ratio of Poisson Means
P-value = Pr Binomial(≥ Non | ω, k),  where ω = α/(1+α),  k = x + y

  p-value = Σ_{j=x}^{k} C(k, j) ω^j (1−ω)^(k−j) = B(ω; x, 1+y) / B(x, 1+y)

Very stable!
Conditional: holds k = Non + Noff fixed
(2nd Poisson mean a nuisance parameter)
UMPU (Uniformly Most Powerful Unbiased)
for the composite hypothesis test μon / (α μoff) ≤ 1
If ~optimal, probably using the best variable!
Binomial Proportion Test: Ratio of Poisson Means
• Not in common use; probably should be
  – Known in HEP and Astrophysics, but not as optimal nor as a standard procedure
  – Zhang and Ramsden claim it is too conservative
    • for Z small? Even if true, we want Z > 4
  – Closed form in terms of special functions, or sums
    • Applying for large N requires some delicacy; slower than Eq 17
• Gaussian limit:
  Z = (Non/k − ω) / √[ω(1−ω)/k] = Eq 9
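A sketch of the binomial test in SciPy (illustrative counts):

```python
# Sketch: binomial proportion test for on/off Poisson counts,
# conditional on k = Non + Noff.
from scipy.stats import binom, norm

def z_binomial(n_on, n_off, alpha):
    """p = P(Binomial(k, omega) >= Non), omega = alpha/(1+alpha)."""
    k = n_on + n_off
    omega = alpha / (1.0 + alpha)
    return norm.isf(binom.sf(n_on - 1, k, omega))   # tail sum from j = Non up

print(z_binomial(120, 1000, 0.1))   # the "gold standard" Z for these counts
```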
Bayesian Methods
• In common use in HEP
• Cousins & Highland “smeared likelihood” efficiency
• Allow correlation among backgrounds (MC integration)
  – Natural extension to efficiency, upper limits
• Predictive Posterior (after background measurement)
  – P(x | y) (integrate posterior over the theoretical mean)
  – Natural avenue for connection with p-values for Non
  But: typical Bayes analysis isn’t significance, but an odds ratio
  – A flat prior for background gives a Gamma dist. for p(b|y)
  – P-value calc using Gamma (also Alexandreas—Astro):
    • numerically, identical to the Frequentist Binomial Test!
  – Truncated Gaussian often used in HEP to represent p(b|y)
    • Less tail (high b) than Gamma: higher reported significance
Predictive Posterior Bayes P-value (HEP)
In words: tail sum averaged over the Bayes posterior for the mean
(or: integrate before sum)

  p(j | y) = ∫ p(j | μ) p(μ | y) dμ

  p(j | μ) = μ^j e^(−μ) / j!

  p_Γ(μ | y) = λ^y e^(−λ) / y!,   λ = μ/α
  (posterior for μ with flat prior for y; b = α y)

  p_N(μ | y) = Normal[(μ − b) / δb]   (truncated Gaussian alternative)

  P-value(x, y) = Σ_{j ≥ x} p(j | y)
Two ways to write the Bayesian p-value:

  p(j | y) = ∫ p(j | μ) p(μ | y) dμ

  P-value(x, y) = Σ_{j ≥ x} p(j | y) = ∫ p(j ≥ x | μ) p(μ | y) dμ

  where p(j ≥ x | μ) = Poisson p-value(j ≥ x | μ)

The Bayesian p-value can be thought of as the Poisson p-value weighted by the posterior for μ.
Compute by sum, by numerical or MC integral; or, as it turns out, by an equivalent formula…
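For instance, a Monte Carlo sketch of the “integrate before sum” form, assuming the flat-prior Gamma posterior above (SciPy; sample size and counts are arbitrary):

```python
# Sketch: predictive-posterior Bayes p-value, by averaging the Poisson
# tail probability over the Gamma posterior for the background mean.
import numpy as np
from scipy.stats import gamma, poisson

def bayes_p_value(x, y, alpha, n_samples=1_000_000, seed=0):
    rng = np.random.default_rng(seed)
    lam = gamma.rvs(y + 1, size=n_samples, random_state=rng)  # flat prior -> Gamma(y+1)
    mu = alpha * lam                                          # on-source background mean
    return poisson.sf(x - 1, mu).mean()                       # mean of P(N >= x | mu)

print(bayes_p_value(120, 1000, 0.1))
```

Per the identity on the next slide, this should agree with the binomial-test p-value within MC error.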
Bayes Gamma Posterior p-value

  p(j | y) = ∫ p(j | μ) p_Γ(μ | y) dμ

  p(j | μ) = μ^j e^(−μ) / j!;   p_Γ(μ | y) = λ^y e^(−λ) / y!,   λ = μ/α

Perform the integral, then:

  P-value(x, y) = Σ_{j ≥ x} p(j | y) = Σ_{j ≥ x} [(y + j)! / (j! y!)] ω^j (1 − ω)^(1+y),   where ω = α/(1+α)

Large x, y: approximate sum by integral…

  = Σ_{j=x}^{k} C(k, j) ω^j (1 − ω)^(k−j),   k = x + y
  = Binomial p-value

Surprised? I was! Proof: H. Kim, MSU
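The identity is easy to check numerically: the Gamma-posterior predictive distribution is a negative binomial, so (a sketch, with arbitrary counts):

```python
# Sketch: verify Gamma-posterior Bayes p-value == frequentist binomial tail.
from scipy.stats import binom, nbinom

x, y, alpha = 120, 1000, 0.1
omega = alpha / (1.0 + alpha)
p_binom = binom.sf(x - 1, x + y, omega)         # frequentist binomial tail
p_gamma = nbinom.sf(x - 1, y + 1, 1.0 - omega)  # negative binomial (predictive) tail
print(p_binom, p_gamma)                         # identical up to rounding
```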
Comparing the Methods
• Some test cases from published literature
• And a few artificial cases
  – Range of Non, Noff values
  – Different α values (mostly < 1)
  – Interest: Z > 3, but sometimes >> 3 (many trials…)
• Color Code Accuracy
  – Assume Frequentist Binomial as the Gold Standard
    • Zhang and Ramsden found it best supported by MC
    • At least when calculating Z > 3 or so
    • At worst, Z_MC 3% higher
Z values for the test cases (N increasing →):

Method                    1     2     3     4     5     6     7     8     9    10    11
Poisson                  2.08  2.84  2.14  4.87  3.80  5.76  7.72  6.44  3.37  7.09  6.69
Poisson mid              2.28  3.00  2.29  4.96  3.86  5.82  8.78  6.46  3.37  7.09  6.68
Fraser-Reid 1            2.07  2.84  2.14  4.87  3.80  5.76  8.95  6.44  3.37  7.09  6.68
Fraser-Reid 1 Z          2.17  3.14  2.38  5.04  3.89  5.78  8.81  6.47  3.37  7.10  6.68
Fraser-Reid 1 canonical  2.35  3.07  2.32  4.99  3.87  5.82  8.80  6.46  3.37  7.09  6.68
FR 1 Z canonical         2.35  2.92  2.20  4.91  3.82  5.78  8.78  6.44  3.37  7.09  6.68

Poisson(≥ x | b) very well approximated by FR with the non-canonical parameter, p*
mid-p Poisson well approximated by FR canonical
Results Comments (use Z_binomial as standard):
• Bayes: Z_Γ = Z_binomial—when it converged
• Bayes: Z_Γ < Z_normal
• α < 1 and N > 500 easiest
• LR, √ (variance-stabilized) not bad
• Usually, bad formulae overestimate Z!!
  – S/√B, for example
  – But Z5 (Eq 5) is biased against signal
• Fraser-Reid vs Poisson:
  – Exact in all but one case (an overestimate)
  – Very slow calculation for very large N if integers
    • Faster if floating point… Mathematica in action
How to Test a Significance Z Variable?
Standard method of MC testing a variable:
• “self-test”: compare Z with the distribution of the statistic for MC assuming background only
  – i.e. convert back from Z to probability
• Good if Pr_MC(Z > Zo) = Pr_Gauss(Z > Zo)
  – Intuition: want fast convergence to Gaussian
Why not just compare with the “right answer”?
• Variables are all supposed to give the same Z, right?
Significance is ill-defined in 2-D (Non, Noff)!
What is a Bigger Deviation? Part of the Significance Definition!
• Measure Non, Noff = (x, y)
• Which values are worse?
  – Larger S = x − αy?
  – Or farther from the line x = αy?
    • In angle? Perpendicular to the line?
  – Trying to order a 2-dim set!
  – Points on the (x, y) plane
  – Nuisance parameter bites again
• Statistics give different metrics: contours of equal deviation
• Convergence (to Gaussian)? For large N: enough peaking so overlapping regions dominate integrals?
[Figure: contours of equal deviation in the (x = Non, y = Noff) plane, with the “more signal” direction marked. Which contour? Curves for S/√B, Eq 9, √(x + 3/8) (variance-stabilized), Binomial, and Likelihood Ratio. Compare line shape near the point, not spacing.]
To Do
• Monte Carlo Tests
• Fraser-Reid for the full problem
• Nuisance parameter treatment, geometry
• Canonical parameter?
  – Simpler numerics, if it works!
  – But: problems for large Z? And for large N?
Summary
• Probably should use Binomial Test for small N
– Optimal Frequentist Technique
– numerically, more work than Li Ma Eq 17 (Likelihood Ratio)
– Binomial Test and L. Ratio have roots in Hyp Testing
• For high and moderate N, Likelihood Ratio OK
– Anything works for Crab, but not for short GRB’s
• Most errors overestimate significance
– Variance Stabilization better than Li Ma Eq 9
– S/√B is way too optimistic—ignores uncertainty in B
• Interesting relations exist among methods
– Bayes with Gamma = Binomial
• Li Ma Eq 9 = Binomial for large N
– Bayes with Gaussian a bit more optimistic than Gamma
• Fraser-Reid Approximation Promising but not finished
References
Li & Ma, Astrophys. J. 272 (1983) 314–324
Zhang & Ramsden, Experimental Astronomy 1 (1990) 145–163
Fraser, J. Am. Stat. Assoc. 86 (1990) 258–265
Alexandreas et al., Nucl. Instrum. Meth. A328 (1993) 570–577
Gelman et al., Bayesian Data Analysis, Chapman & Hall (1998)
  (predictive p-value terminology)
Talks: http://www-conf.slac.stanford.edu/phystat2003/
Controlling the False Discovery Rate: From Astrophysics to Bioinformatics
The Problem:
• I: You have a really good idea. You find a positive effect at the .05 significance level.
• II: While you are in Stockholm, your students work very hard. They follow up on 100 of your ideas, but have somehow misinterpreted your brilliant notes. Still, they report 5 of them gave significant results. Do you book a return flight?
Significance
• Define “wrong” as reporting a false positive:
  – Apparent signal caused by background
• Set α, a level of potential wrongness
  – 2σ = .05; 3σ = .003, etc.
• Probability of going wrong on one test
• Or, error rate per test
  – “Per Comparison Error Rate” (PCER)
• Statisticians say “z value” instead of z σ’s
  – Or “t value”
What if you do m tests?
• Search m places
• Must be able to define “interesting”
  – e.g. “not background”
• Examples from HEP and Astrophysics:
  – Look at m histograms, or bins, for a bump
  – Look for events in m decay channels
  – Test the standard model with m measurements (not just Rb or g−2)
  – Look at m phase space regions for a Sleuth search (Knuteson)
  – Fit data to m models: What’s a bad fit?
  – Reject bad tracks from m candidates from a fitting routine
  – Look for sources among m image pixels
  – Look for “bursts” of signals during m time periods
  – Which of m fit coefficients are nonzero?
    • Which (variables, correlations) are worth including in the model?
    • Which of m systematic effect tests are significant?
      Rather than testing each independently
“Multiple Comparisons”
• Must control false positives
  – How to measure multiple false positives?
Default method:
• Chance of any false positives in the whole set
• Jargon: Familywise Error Rate (FWER)
  – Whole set of tests considered together
  – Control by Bonferroni, Bonferroni-Holm, or the Random Field Method
    (see backup slides for more)
Must do something about m!
• m is the “trials factor”
  (only the NE Jour. Med. demands it!)
  – Don’t want to just report m times as many signals
• P(at least one wrong) = 1 − (1−α)^m ≈ mα
  – Use α/m as the significance test: the “Bonferroni correction”
• This is the main method of control
  – Keeps to α the probability of reporting 1 or more wrong on the whole ensemble of m tests
  – Good: controls publishing rubbish
  – Bad: lower sensitivity (must have a more obvious signal)
• For some purposes, have we given up too much?
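As a quick illustration of the arithmetic (my numbers):

```python
# Sketch: Bonferroni correction for m tests at per-test level alpha.
from scipy.stats import norm

alpha, m = 0.003, 10_000       # ~3 sigma per test, 10^4 trials
print(m * alpha)               # ~30 false claims expected without a correction
print(alpha / m)               # Bonferroni per-test threshold
print(norm.isf(alpha / m))     # ~5 sigma now needed instead of ~2.75
```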
False Discovery Rate (FDR)
• Fraction of errors in signal candidates
  – Proportion of false positives among rejected tests
  – “False Discovery Fraction” might have been clearer?
    (Physics: rate = N/time; Statistics: rate = fraction?)
• Use the FDR method to set the threshold
Hypothesis Testing
Decision, based on test statistic:

                      | Null Retained     | Reject Null =           | Total
                      | (can’t reject)    | Accept Alternative      |
----------------------+-------------------+-------------------------+------
Null (Ho) True:       | U                 | V: false positive =     | m0
background (noise)    |                   | false discovery;        |
                      |                   | Type I Error, α = εb    |
Alternative True:     | T: inefficiency;  | S: true positive =      | m1
signal                | Type II Error,    | true detection          |
                      | β = 1 − εs        |                         |
Total                 | m − R             | R = S+B: reported       | m
                      |                   | signal (rejections)     |

FDR = V/R = B/(S+B) if R > 0;  = 0 if R = 0
Goals of FDR
• Tighter than α (single-test)
• Looser than α/m (Bonferroni trials factor)
• Improve sensitivity (“power”; signal efficiency)
• Still control something useful:
  – the fraction of false results that you report
  – b/(s+b) after your cut = 1 − purity
    • rather than 1−α = rejection(b); or efficiency(s)
    • for 1 cut, you only get to pick 1 variable, anyway
• Last, but not least, a catchy TLA
FDR in High Throughput Screening
An interpretation of FDR:

  E( expense wasted chasing “red herrings” / cost of all follow-up studies ) ≤ q

GRB alerts from Milagro?
Telescope time to search for an optical counterpart
FDR in a nutshell
– Search for non-background events
– Need only the background probability distribution
– Control the fraction of false positives reported
  • Automatically select how hard to cut, based on that
What is a p-value? (Needed for what’s next)
The observed significance of a measurement
Familiar example: P(χ² ≥ χ²observed | ndof)
• Here, the probability that the event was produced by background (“null hypothesis”)
• Measured in probability (should be flat)
• Same as “sigmas”—different units, that’s all
P-value properties:
If all events are background, the distribution of p-values dn/dp should be flat,
and have a linearly rising cumulative distribution:

  N(x) = ∫₀ˣ dp (dn/dp) = x,   N(p in [a, b]) = (b − a)

So expect N(p ≤ p(r))/m = r/m for the r-th smallest p-value.
Flat also means linear in log–log: if y = ln p,
ln[dn/dy] vs. y is a straight line, with a predicted slope.
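A two-line sketch of this flatness property on simulated background-only p-values:

```python
# Sketch: background-only p-values are uniform, so the r-th smallest ~ r/m.
import numpy as np

rng = np.random.default_rng(2)
p = np.sort(rng.uniform(size=10_000))   # background-only p-values
r = 500
print(p[r - 1], r / 10_000)             # nearly equal, as claimed above
```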
[Figure: from GRB paper, fig 1. Signal, statistics, or systematics? “Best” of 9 plots.
Note: a histogram is a binned sorting of the p-values.]
Benjamini & Hochberg
JRSS-B (1995) 57:289–300
• Select the desired limit q on Expectation(FDR)
  – α is not specified: the method selects it
• Sort the p-values: p(1) ≤ p(2) ≤ … ≤ p(m)
• Let r be the largest i such that
    p(i) ≤ q (i/m) / c(m)
  – i.e. accept as signal
• Reject all null hypotheses corresponding to p(1), …, p(r)
• Proof this works is not obvious!
For now, take c(m) = 1
[Figure: sorted p-values p(i) vs. i/m on (0, 1), with the line q(i/m)/c(m), q ~ .15; take all p(i) ≤ the last one below the line]
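A sketch of the step-up procedure as just described (the function and toy data are mine; c(m) = 1 by default):

```python
# Sketch: Benjamini-Hochberg step-up procedure.
import numpy as np

def benjamini_hochberg(p_values, q, cm=1.0):
    """Reject p(1)..p(r), r = largest i with p(i) <= q*(i/m)/c(m)."""
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)                          # indices of p(1) <= ... <= p(m)
    below = p[order] <= q * np.arange(1, m + 1) / (m * cm)
    reject = np.zeros(m, dtype=bool)
    idx = np.nonzero(below)[0]
    if idx.size:
        reject[order[: idx[-1] + 1]] = True        # all of p(1), ..., p(r)
    return reject

# Toy data: 95 uniform background p-values plus 5 small "signal" ones.
rng = np.random.default_rng(1)
p = np.concatenate([rng.uniform(size=95), 1e-5 * rng.uniform(size=5)])
print(benjamini_hochberg(p, q=0.15).sum(), "reported as signal")
```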
Plausibility argument (for an easily separable signal) of Miller et al.
• p(r) ≤ q r/m  (definition of cutoff)
• p(r) = q R/m  (r = R: def. of # rejects)
  (remember: rejected null = reported as signal)
• Now assume the background is uniform
  – AND all signal p-values ≈ 0, << p(background), i.e. easily separable
Then the expected probability of the last rejected background is:
• p(r) = R_background / m
• Solving: q = R_background / R
The full proof makes no assumptions on the signal p’s, other than that they are distinguishable (p’s nearer 0 than background).
Benjamini & Hochberg: Varying Signal Extent
[Figure series (MC): simulated maps with Signal Intensity 3.0 and Noise Smoothness 3.0, for increasing Signal Extent.
  Extent 1.0 and 3.0: none pass (p-value distribution flat)
  Extent 5.0: p = 0.000252, z = 3.48 (3.5 σ cut chosen by FDR)
  Extent 16.5: p = 0.007157, z = 2.45 (2.5 σ: stronger signal)
  Extent 25.0: p = 0.019274, z = 2.07 (2.1 σ: stronger signal)]
Benjamini & Hochberg: the c(m) factor
• c(m) = 1
  – Positive Regression Dependency on Subsets
    • Technical condition; special cases include:
      – Independent data
      – Multivariate Normal with all positive correlations
    • Result by Benjamini & Yekutieli, Annals of Statistics, in press
• c(m) = Σ_{i=1..m} 1/i ≈ ln(m) + 0.5772
  – Arbitrary covariance structure
  – But this is more conservative—tighter cuts
FDR as Hypothesis Test
Quasi distribution-free:
• Assumes a specific null (flat p-values)
  – in this, like most null hypothesis testing
  – but works for any specific null distribution, not just Gaussian or χ²
  – distribution-free for the alternative hypothesis
• Distribution-free estimate, control of s/b! A nice surprise
• Fundamentally Frequentist:
  – Goodness of Fit test to a well-specified null hypothesis
  – No crisp alternative to the null needed: anti-Bayesian in feeling
    • Strength: search for ill-specified “something new”, if different enough to give small p-values
• No one claims it’s optimal
  – With a specific alternative, could do a sharper test
    • Better s/b for the same α, or vice versa
Comments on FDR
• To use the method, you must (not so new!):
  – know the trials factor
  – be able to calculate small p-values correctly
  – have the p-values of all m tests in hand (retrospective)
    • Or, to use online, a good-enough sample of the same mix of s+b
• The lowest p-value p(1) always gets tested with q/m (i = 1)
  – If no signal, FDR with q ≈ Bonferroni with α = q: both use threshold q/m
  – FWER = q for FDR, = α for Bonferroni, when there is no real signal
• Uses the distribution of p’s
  – Even if p(1) fails, FDR sees the other p(i) distorting the pure-null shape
  – FDR raises the threshold and accepts p(1) … p(r)
FDR Talks on Web
Users:
– This talk: user.pa.msu.edu/linnemann/public/FDR_Bio.pdf
  • 3 more pages of references; and another 30 slides of details
– T. Nichols, U. Mich: www.sph.umich.edu/~nichols/FDR/ENAR2002.ppt
  Emphasis on Benjamini’s viewpoint; functional MRI
– S. Scheid, MPI: cmb.molgen.mpg.de/compdiag/docs/storeypp4.pdf
  Emphasis on Storey’s Bayesian viewpoint
Statisticians:
– C. Genovese, CMU: www.stat.ufl.edu/symposium/2002/icc/web_records/genovese_ufltalk.pdf
– Y. Benjamini, Tel Aviv: www.math.tau.ac.il/~ybenja/Temple.ppt
Random Field Theory (another approach to smoothed data):
– W. Penny, UC London: www.fil.ion.ucl.ac.uk/~wpenny/talks/infer-japan.ppt
– Matthew Brett, Cambridge: www.mrc-cbu.cam.ac.uk/Imaging/randomfields.html
Some other web pages
• http://medir.ohsu.edu/~geneview/education/Multiple test corrections.pdf
  Brief summary of the main methods
• www.unt.edu/benchmarks/archives/2002/april02/rss.htm
  Gentle introduction to FDR
• www.sph.umich.edu/~nichols/FDR/
  FDR resources and references—imaging
• http://www.math.tau.ac.il/~roee/index.htm
  FDR resource page by discoverer
Some FDR Papers on Web
Astrophysics:
arxiv.org/abs/astro-ph/0107034  Miller et al., ApJ 122: 3492–3505, Dec 2001
  FDR explained very clearly; heuristic proof for well-separated signal
arxiv.org/abs/astro-ph/0110570  Hopkins et al., ApJ 123: 1086–1094, Dec 2002
  2d pixel images; compare FDR to other methods
taos.asiaa.sinica.edu.tw/document/chyng_taos_paper.pdf
  FDR comet search (by occultations); will set a tiny FDR limit, 10^-12 ~ 1/year
Statistics:
http://www.math.tau.ac.il/~ybenja/depApr27.pdf  Benjamini et al. (invented FDR)
  clarifies c(m) for different dependences of data
Benjamini, Hochberg: JRSS-B (1995) 57:289–300 (paper not on the web)
  defined FDR, and the Bonferroni-Holm procedure
http://www-stat.stanford.edu/~donoho/Reports/2000/AUSCFDR.pdf  Benjamini et al.
  study small signal fraction (sparsity); relate to minimax loss
http://www.stat.cmu.edu/www/cmu-stats/tr/tr762/tr762.pdf  Genovese, Wasserman
  confidence limits for FDR; study for large m; another view of FDR as a data-estimated method on mixtures
http://stat-www.berkeley.edu/~storey/  Storey
  view in terms of mixtures, Bayes; sharpen with data; some intuition for the proof
http://www-stat.stanford.edu/~tibs/research.html  Efron, Storey, Tibshirani
  show Empirical Bayes equivalent to BH FDR