Statistics for HEP
Lecture 3: Further topics
http://indico.cern.ch/conferenceDisplay.py?confId=202569
69th SUSSP
LHC Physics
St. Andrews
20-23 August, 2012
Glen Cowan
Physics Department
Royal Holloway, University of London
[email protected]
www.pp.rhul.ac.uk/~cowan
Outline
Lecture 1: Introduction and basic formalism
Probability, statistical tests, parameter estimation.
Lecture 2: Discovery and Limits
Quantifying discovery significance and sensitivity
Frequentist and Bayesian intervals/limits
Lecture 3: Further topics
The Look-Elsewhere Effect
Unfolding (deconvolution)
Gross and Vitells, EPJC 70:525-530, 2010; arXiv:1005.1891
The Look-Elsewhere Effect
Suppose a model for a mass distribution allows for a peak at
a mass m with amplitude μ.
The data show a bump at a mass m0.
How consistent is this with the no-bump (μ = 0) hypothesis?
p-value for fixed mass
First, suppose the mass m0 of the peak was specified a priori.
Test the consistency of the bump with the no-signal (μ = 0) hypothesis
with e.g. the likelihood ratio
tfix = −2 ln [ L(0) / L(μ̂) ],
where “fix” indicates that the mass of the peak is fixed to m0.
The resulting p-value,
pfix = P(tfix ≥ tfix,obs | μ = 0),
gives the probability to find a value of tfix at least as great as
observed at the specific mass m0.
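As a rough numerical illustration (not from the original slides), the fixed-mass p-value can be computed from an observed tfix using its asymptotic chi-square (1 d.o.f.) distribution quoted two slides below; the value of tfix here is invented.

from scipy.stats import chi2
import numpy as np

t_fix_obs = 16.0                    # hypothetical observed value of t_fix
p_fix = chi2.sf(t_fix_obs, df=1)    # P(t_fix >= observed | mu = 0), asymptotic
Z_fix = np.sqrt(t_fix_obs)          # significance, using c = t_fix = Z_fix^2 (later slides)
print(p_fix, Z_fix)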
p-value for floating mass
But suppose we did not know where in the distribution to
expect a peak.
What we want is the probability to find a peak at least as
significant as the one observed anywhere in the distribution.
Include the mass as an adjustable parameter in the fit, and test the
significance of the peak using
tfloat = −2 ln [ L(0) / L(μ̂, m̂) ].
(Note m does not appear in the μ = 0 model.)
Gross and Vitells
Distributions of tfix, tfloat
For a sufficiently large data sample, tfix ~ chi-square for 1 degree
of freedom (Wilks’ theorem).
For tfloat there are two adjustable parameters, μ and m, and naively
Wilks’ theorem says tfloat ~ chi-square for 2 d.o.f.
In fact Wilks’ theorem does not hold in the floating-mass case,
because one of the parameters (m) is not defined in the μ = 0 model.
So getting the tfloat distribution is more difficult.
Gross and Vitells
Approximate correction for LEE
We would like to be able to relate the p-values for the fixed and
floating mass analyses (at least approximately).
Gross and Vitells show the p-values are approximately related by
pfloat ≈ pfix + 〈N(c)〉,
where 〈N(c)〉 is the mean number of “upcrossings” of −2 ln L in
the fit range based on a threshold
c = tfix = Zfix²,
and where Zfix is the significance for the fixed mass case.
So we can either carry out the full floating-mass analysis (e.g. use
MC to get p-value), or do fixed mass analysis and apply a
correction factor (much faster than MC).
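A minimal numerical sketch of this correction, with invented numbers; 〈N(c0)〉 would come from background-only toys as described on the next slide, and the exponential extrapolation is the 1-d.o.f. relation given there.

import numpy as np
from scipy.stats import chi2

t_fix_obs = 16.0                      # hypothetical observed t_fix, so c = 16
c0 = 0.5                              # low reference threshold
N_c0 = 4.2                            # hypothetical <N(c0)> from background-only toys

N_c = N_c0 * np.exp(-(t_fix_obs - c0) / 2.0)   # extrapolate to the high threshold
p_fix = chi2.sf(t_fix_obs, df=1)
p_float = p_fix + N_c                 # approximate global (LEE-corrected) p-value
print(p_fix, p_float, p_float / p_fix)         # last number is roughly the trials factor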
Upcrossings of −2 ln L
Gross and Vitells
The Gross-Vitells formula for the trials factor requires 〈N(c)〉,
the mean number of “upcrossings” of −2 ln L in the fit range based
on a threshold c = tfix = Zfix².
〈N(c)〉 can be estimated from MC (or the real data) using a much
lower threshold c0, and then extrapolated:
〈N(c)〉 ≈ 〈N(c0)〉 exp[−(c − c0)/2].
In this way 〈N(c)〉 can be estimated without need of large MC
samples, even if the threshold c is quite high.
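A sketch of how the upcrossing counting could be implemented, assuming one has scans of the test statistic q(m) = −2 ln L over the mass range (e.g. from background-only toys); the function names are not from the slides.

import numpy as np

def count_upcrossings(q_scan, c0):
    # number of times the scanned statistic crosses the level c0 from below
    above = np.asarray(q_scan) > c0
    return int(np.sum(~above[:-1] & above[1:]))

def mean_upcrossings(q_scans, c0):
    # q_scans: one scan of -2 ln L vs mass per background-only toy (or the data)
    return float(np.mean([count_upcrossings(q, c0) for q in q_scans]))

def N_at_threshold(q_scans, c0, c):
    # extrapolate from the low threshold c0 to the high threshold c (1 d.o.f. case)
    return mean_upcrossings(q_scans, c0) * np.exp(-(c - c0) / 2.0)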
Vitells and Gross, Astropart. Phys. 35 (2011) 230-234; arXiv:1105.4355
Multidimensional look-elsewhere effect
Generalization to multiple dimensions: the number of upcrossings is
replaced by the expectation of the Euler characteristic of the excursion
set above the threshold.
Applications: astrophysics (coordinates on the sky), search for a
resonance of unknown mass and width, ...
Summary on Look-Elsewhere Effect
Remember the Look-Elsewhere Effect arises when we test a single
model (e.g., the SM) with multiple observations, i.e., in multiple
places.
Note there is no look-elsewhere effect when considering
exclusion limits. There we test specific signal models (typically
once) and say whether each is excluded.
With exclusion there is, however, the analogous issue of testing
many signal models (or parameter values) and thus excluding
some even in the absence of signal (“spurious exclusion”).
An approximate correction for the LEE should be sufficient, and one
should also report the uncorrected significance.
“There's no sense in being precise when you don't even
know what you're talking about.” –– John von Neumann
Why 5 sigma?
Common practice in HEP has been to claim a discovery if the
p-value of the no-signal hypothesis is below 2.9 × 10⁻⁷,
corresponding to a significance Z = Φ⁻¹(1 − p) = 5 (a 5σ effect).
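A quick check of these numbers:

from scipy.stats import norm
print(norm.isf(2.9e-7))   # Z = Phi^-1(1 - p), approximately 5.0
print(norm.sf(5.0))       # p for Z = 5, approximately 2.9e-7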
There are a number of reasons why one may want to require such
a high threshold for discovery:
The “cost” of announcing a false discovery is high.
Unsure about systematics.
Unsure about look-elsewhere effect.
The implied signal may be a priori highly improbable
(e.g., violation of Lorentz invariance).
Why 5 sigma (cont.)?
But the primary role of the p-value is to quantify the probability
that the background-only model gives a statistical fluctuation
as big as the one seen or bigger.
It is not intended as a means to protect against hidden systematics,
nor to provide by itself the high standard required for a claim of an
important discovery.
In the process of establishing a discovery there comes a point
where it is clear that the observation is not simply a fluctuation,
but an “effect”, and the focus shifts to whether this is new physics
or a systematic.
Provided the LEE is dealt with, that threshold is probably closer to
3σ than 5σ.
Formulation of the unfolding problem
Consider a random variable y; the goal is to determine its pdf f(y).
If a parameterization f(y; θ) is known, find e.g. the ML estimators θ̂.
If no parameterization is available, construct a histogram: the
“true” histogram has M bins with expected contents
μj = μtot pj, where pj = ∫bin j f(y) dy is the probability for bin j.
New goal: construct estimators for the μj (or pj).
Migration
Effect of measurement errors: y = true value, x = observed value;
migration of entries between bins means
f(y) is ‘smeared out’ and peaks are broadened.
Discretize: the data are the observed histogram n = (n1, ..., nN),
with expectation values
νi = E[ni] = Σj Rij μj,
where Rij = P(observed in bin i | true value in bin j) is the
response matrix.
Note μ, ν are constants; n is subject to statistical fluctuations.
Efficiency, background
Sometimes an event goes undetected; the probability to be detected
defines the efficiency, εj = P(event detected | true value in bin j) = Σi Rij.
Sometimes an observed event is due to a background process:
βi = expected number of background events in the observed histogram.
For now, assume the βi are known.
The basic ingredients
(Figure: the “true” histogram μ and the “observed” histogram n.)
Summary of ingredients
‘true’ histogram: μ = (μ1, ..., μM)
probabilities: p = (p1, ..., pM), with pj = μj / μtot
expectation values for observed histogram: ν = (ν1, ..., νN)
observed histogram: n = (n1, ..., nN)
response matrix: Rij = P(observed in bin i | true value in bin j)
efficiencies: εj = Σi Rij
expected background: β = (β1, ..., βN)
These are related by: νi = Σj Rij μj + βi
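A toy numerical illustration of this relation (all numbers invented):

import numpy as np

rng = np.random.default_rng(1)
mu = np.array([100., 150., 200., 150., 100.])   # 'true' histogram
R = np.array([[0.7, 0.2, 0.0, 0.0, 0.0],        # response matrix; column j
              [0.2, 0.6, 0.2, 0.0, 0.0],        # sums to the efficiency eps_j
              [0.0, 0.2, 0.6, 0.2, 0.0],
              [0.0, 0.0, 0.2, 0.6, 0.2],
              [0.0, 0.0, 0.0, 0.2, 0.7]])
beta = np.full(5, 5.0)                          # expected background
nu = R @ mu + beta                              # expectation values for observed histogram
n = rng.poisson(nu)                             # observed histogram (fluctuates)
print(nu, n)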
Maximum likelihood (ML) estimator
from inverting the response matrix
Assume the response matrix R is square and can be inverted:
μ = R⁻¹(ν − β).
Suppose the data are independent Poisson:
P(n; ν) = Πi (νi^ni / ni!) e^(−νi),
so the log-likelihood is
ln L(μ) = Σi (ni ln νi − νi) + const., with νi = Σj Rij μj + βi.
Setting the derivatives to zero gives ν̂i = ni, so the ML estimator is
μ̂ = R⁻¹(n − β).
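Continuing the toy example from the previous slide (it reuses R, n and beta defined there), the matrix-inversion estimator and its covariance can be computed directly:

import numpy as np

Rinv = np.linalg.inv(R)                          # R assumed square and invertible
mu_hat = Rinv @ (n - beta)                       # ML (matrix-inversion) estimate
U = Rinv @ np.diag(n.astype(float)) @ Rinv.T     # covariance, using V[n_i] ~ n_i (Poisson)
print(mu_hat)
print(np.sqrt(np.diag(U)))                       # statistical errors (huge for fine bins)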
Example with ML solution
(Figure: the ML solution for an example spectrum.)
Catastrophic failure???
What went wrong?
Suppose μ really had a lot of fine structure.
Applying R washes this out, but leaves a residual structure.
But we don’t have ν, only n. R⁻¹ “thinks” fluctuations in n are
the residual of the original fine structure, and puts this back into μ̂.
ML solution revisited
For Poisson data the ML estimators are unbiased: E[μ̂j] = μj.
Their covariance is Uij = cov[μ̂i, μ̂j] = Σk (R⁻¹)ik (R⁻¹)jk νk.
(Recall these statistical errors were huge for the example shown.)
ML solution revisited (2)
The information inequality gives for unbiased estimators the
minimum (co)variance bound in terms of the Fisher information matrix;
inverting that matrix gives exactly the covariance found above.
I.e. the ML solution gives the smallest variance among all unbiased
estimators, even though this variance was huge.
In unfolding one must accept some bias in exchange for a
(hopefully large) reduction in variance.
Correction factor method
Estimate the true bin contents with bin-by-bin correction factors taken
from Monte Carlo: μ̂i = Ci ni, with Ci = μi(MC) / νi(MC).
Often Ci ~ O(1), so the statistical errors are far smaller than for ML.
Nonzero bias unless MC = Nature.
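A sketch with invented numbers:

import numpy as np

mu_MC = np.array([ 90., 160., 210., 140.,  95.])   # MC 'true' histogram
nu_MC = np.array([105., 155., 205., 155., 105.])   # MC expected observed histogram
C = mu_MC / nu_MC                                   # correction factors, typically O(1)

n = np.array([110., 145., 205., 160.,  95.])        # observed data (invented)
mu_hat = C * n                                      # corrected estimate
stat_err = C * np.sqrt(n)                           # statistical errors, scaled by C
print(mu_hat, stat_err)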
Example from Bob Cousins
Reality check on the statistical errors
Suppose for some bin i the estimate has a 10% statistical error,
based on ni = 100 events observed in that bin.
But according to the estimate, only 10 of the 100 events
found in the bin belong there; the rest spilled in from outside.
How can we have a 10% measurement if it is based on only 10
events that really carry information about the desired parameter?
Discussion of correction factor method
As with all unfolding methods, we get a reduction in statistical
error in exchange for a bias; here the bias is difficult to quantify
(difficult also for many other unfolding methods).
The bias should be small if the bin width is substantially larger
than the resolution, so that there is not much bin migration.
So if other uncertainties dominate in an analysis, correction factors
may provide a quick and simple solution (a “first-look”).
Still the method has important flaws and it would be best to
avoid it.
Regularized unfolding
General idea: instead of maximizing ln L(μ) alone, maximize a combination
of ln L(μ) and a regularization (smoothness) function S(μ); a regularization
parameter controls the trade-off between bias and variance.
Tikhonov regularization
Regularization based on the mean squared second derivative (curvature)
of the solution, Σi (μi+2 − 2μi+1 + μi)².
Solution using Singular Value Decomposition (SVD).
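A minimal sketch of the idea for Gaussian-distributed data (not the SVD implementation of the next slide): minimize a chi-square between the folded solution and the data plus τ times the squared curvature, which has a closed-form linear solution. All inputs (n, beta, R, data covariance V, τ) are assumed given.

import numpy as np

def tikhonov_unfold(n, beta, R, V, tau):
    M = R.shape[1]
    # discrete second-derivative (curvature) matrix
    C = np.zeros((M - 2, M))
    for i in range(M - 2):
        C[i, i:i + 3] = [1.0, -2.0, 1.0]
    Vinv = np.linalg.inv(V)
    A = R.T @ Vinv @ R + tau * C.T @ C
    mu_hat = np.linalg.solve(A, R.T @ Vinv @ (n - beta))
    D = np.linalg.solve(A, R.T @ Vinv)     # mu_hat = D (n - beta), linear in the data
    U_stat = D @ V @ D.T                   # statistical covariance by error propagation
    return mu_hat, U_stat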
SVD implementation of Tikhonov unfolding
A. Hoecker, V. Kartvelishvili, NIM A372 (1996) 469;
(TSVDUnfold in ROOT).
Minimizes a least-squares measure of agreement between Rμ and the data
plus τ times a Tikhonov (curvature) term.
Numerical implementation using Singular Value Decomposition.
Recommendations for setting regularization parameter τ:
Transform variables so errors ~ Gauss(0,1);
number of transformed values significantly different
from zero gives prescription for τ;
or base choice of τ on unfolding of test distributions.
SVD example
Edge effects
Regularized unfolding can lead to “edge effects”. E.g. in
Tikhonov regularization with Gaussian data, the solution can go
negative (figure: solution pushed negative near the edge of the
distribution).
Important e.g. if New Physics would appear as a longer tail of
a distribution.
Regularization function based on entropy
Can have a Bayesian motivation; the regularization function is the entropy,
S(μ) = −Σi pi ln pi, with pi = μi / Σj μj.
Example of entropy-based unfolding
G. Cowan, Statistical Data Analysis, OUP (1998) Ch. 11
Estimating bias and variance
G. Cowan, Statistical Data Analysis, OUP (1998) Ch. 11
Choosing the regularization parameter
G. Cowan, Statistical Data Analysis, OUP (1998) Ch. 11
Choosing the regularization parameter (2)
G. Cowan, Statistical Data Analysis, OUP (1998) Ch. 11
Some examples with Tikhonov regularization
G. Cowan, Statistical Data Analysis, OUP (1998) Ch. 11
Some examples with entropy regularization
Iterative unfolding (“Bayesian”)
G. D’Agostini, NIM A362 (1995) 487; see also arXiv:1010.0632.
Goal is to estimate the probabilities pj = μj / μtot.
For the initial guess take e.g. a uniform distribution or the MC prediction;
the initial estimators for μ follow from this guess.
Update according to the rule
μ̂j ← (1/εj) Σi ni P(true bin j | observed bin i),
where Bayes’ theorem is used here:
P(true bin j | observed bin i) = Rij pj / Σl Ril pl,
with pj taken from the previous iteration.
Continue until the solution is stable, using e.g. a χ² test with the previous
iteration.
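A sketch of one common variant of this iterative update (background treated by allowing each observed event to be attributed partly to background); the stopping rule is a simple chi-square-like comparison with the previous iteration.

import numpy as np

def iterative_unfold(n, R, beta, mu0, n_iter_max=100, tol=1e-3):
    n = np.asarray(n, dtype=float)
    beta = np.asarray(beta, dtype=float)
    eps = R.sum(axis=0)                        # efficiencies eps_j = sum_i R_ij
    mu = np.asarray(mu0, dtype=float).copy()   # initial estimate (e.g. from MC)
    for _ in range(n_iter_max):
        nu = R @ mu + beta                     # expected observed histogram
        # Bayes' theorem: fraction of observed bin i attributed to true bin j
        post = (R * mu) / nu[:, None]          # post[i, j] = R_ij mu_j / nu_i
        mu_new = (post * n[:, None]).sum(axis=0) / eps
        if np.sum((mu_new - mu) ** 2 / np.maximum(mu_new, 1e-9)) < tol:
            return mu_new                      # stable: stop iterating
        mu = mu_new
    return mu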
Estimating systematic uncertainty
We know that unfolding introduces a bias, but quantifying this
(including correlations) can be difficult.
Suppose a model predicts a spectrum f(y; θ) depending on a parameter θ.
A priori suppose e.g. θ ≈ 8 ± 2; more precisely, assign a prior π(θ).
Propagate this into a systematic covariance Usys for the unfolded
spectrum (typically with large positive correlations between bins).
Often in practice one doesn’t have π(θ) but rather a few models
that differ in spectrum. It is not obvious how to convert this into
a meaningful covariance for the unfolded distribution.
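One way such a propagation could be carried out numerically is by Monte Carlo: sample θ from π(θ), redo the unfolding for each value, and take the covariance of the results as Usys. The helper functions below (unfold, make_inputs, sample_theta) are hypothetical placeholders, not part of the original material.

import numpy as np

def systematic_covariance(unfold, make_inputs, sample_theta, n_toys=500):
    # unfold(n, R, beta) -> unfolded spectrum; make_inputs(theta) -> (n, R, beta);
    # sample_theta() -> one value of theta drawn from the prior pi(theta)
    results = []
    for _ in range(n_toys):
        n, R, beta = make_inputs(sample_theta())
        results.append(unfold(n, R, beta))
    return np.cov(np.asarray(results), rowvar=False)   # U_sys (often large positive correlations)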
Stat. and sys. errors of unfolded solution
In general the statistical covariance matrix of the unfolded
estimators is not diagonal; one needs to report the full matrix Ustat.
But unfolding necessarily introduces biases as well, corresponding
to a systematic uncertainty (also correlated between bins).
This is more difficult to estimate. Suppose, nevertheless,
we manage to report both Ustat and Usys.
To test a new theory depending on parameters θ, use e.g.
χ²(θ) = (μ̂ − μ(θ))ᵀ (Ustat + Usys)⁻¹ (μ̂ − μ(θ)).
This mixes frequentist and Bayesian elements; the interpretation of the
result can be problematic, especially if Usys itself has large uncertainty.
Folding
Suppose a theory predicts f(y) → μ (may depend on parameters θ).
Given the response matrix R and expected background β, this
predicts the expected numbers of observed events:
νi(θ) = Σj Rij μj(θ) + βi.
From this we can get the likelihood, e.g., for Poisson data,
L(θ) = Πi (νi(θ)^ni / ni!) e^(−νi(θ)),
and using this we can fit the parameters and/or test, e.g., using
the likelihood ratio statistic −2 ln [ L(θ) / L(θ̂) ].
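A sketch of such a fit for a single parameter θ; mu_model(theta) is a hypothetical function returning the predicted 'true' histogram, and the minimizer gives θ̂ and the likelihood ratio statistic for a tested value.

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

def nll(theta, n, R, beta, mu_model):
    nu = R @ mu_model(theta) + beta                # folded prediction (must be > 0)
    return float(np.sum(nu - n * np.log(nu) + gammaln(n + 1.0)))   # -ln L (Poisson)

def fit_and_test(theta_test, n, R, beta, mu_model, bounds=(0.0, 100.0)):
    res = minimize_scalar(nll, bounds=bounds, method="bounded",
                          args=(n, R, beta, mu_model))
    q = 2.0 * (nll(theta_test, n, R, beta, mu_model) - res.fun)    # -2 ln [L(theta)/L(theta_hat)]
    return res.x, q                                                # (theta_hat, test statistic)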
Versus unfolding
If we have an unfolded spectrum and the full statistical and
systematic covariance matrices, we can compare this to a model μ(θ)
by computing, e.g., the Gaussian likelihood
ln L(θ) = −½ (μ̂ − μ(θ))ᵀ U⁻¹ (μ̂ − μ(θ)) + const.,
where U = Ustat + Usys.
Complications arise because one needs an estimate of the systematic bias Usys.
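The corresponding chi-square (equivalently −2 ln L up to a constant) is straightforward to compute once Ustat and Usys are in hand; a minimal sketch:

import numpy as np

def chi2_unfolded(mu_hat, mu_theta, U_stat, U_sys):
    d = np.asarray(mu_hat) - np.asarray(mu_theta)
    U = U_stat + U_sys                        # full covariance
    return float(d @ np.linalg.solve(U, d))  # chi-square between model and unfolded spectrum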
If we find a gain in sensitivity from the test using the unfolded
distribution, e.g., through a decrease in statistical errors, then we
are exploiting information inserted via the regularization (e.g.,
imposed smoothness).
ML solution again
From the standpoint of testing a theory or estimating its parameters,
the ML solution, despite catastrophically large errors, is equivalent
to using the uncorrected data (same information content).
There is no bias (at least from unfolding), so one can use the ML solution
together with its full covariance matrix.
The estimators of θ should have close to optimal properties:
zero bias, minimum variance.
The corresponding estimators from any unfolded solution cannot
in general match this.
Crucial point is to use full covariance, not just diagonal errors.
Summary/discussion
Unfolding can be a minefield and is not necessary if goal is to
compare measured distribution with a model prediction.
Even comparison of uncorrected distribution with future theories
not a problem, as long as it is reported together with the expected
background and response matrix.
In practice there are complications because these ingredients have
uncertainties, which must be reported as well.
Unfolding useful for getting an actual estimate of the distribution
we think we’ve measured; can e.g. compare ATLAS/CMS.
Model test using unfolded distribution should take account of
the (correlated) bias introduced by the unfolding procedure.
Summary of Lecture 3
Bayesian treatment of limits is conceptually easy (integrate
posterior pdf); appropriate choice of prior not obvious.
Look-Elsewhere Effect
Need to give probability to see a signal as big as the one
you saw (or bigger) anywhere you looked. Hard to define
precisely; approximate correction should be adequate.
Why 5 sigma? If the LEE is taken into account, one is usually convinced
the effect is not a fluctuation much earlier (at 3 sigma?).
Unfolding: not needed if the goal is to compare the measured distribution
with a model prediction; if an unfolded distribution is used, the (correlated)
bias introduced by the regularization must be taken into account.
Extra slides