The small-n problem in High Energy Physics
Glen Cowan
Department of Physics
Royal Holloway, University of London
www.pp.rhul.ac.uk/~cowan
Statistical Challenges in Modern Astronomy IV
June 12 - 15, 2006
Glen Cowan, SCMA4, 12-15 June, 2006
Outline
I. High Energy Physics (HEP) overview
    Theory
    Experiments
    Data
II. The small-n problem, etc.
    Making a discovery
    Setting limits
    Systematic uncertainties
III. Conclusions
The current picture in particle physics
Matter...
+ force carriers...
photon (γ)
W±
Z
gluon (g)
+ relativity + quantum mechanics + symmetries...
= “The Standard Model”
• almost certainly incomplete
• 25 free parameters (masses, coupling strengths,...)
• should include Higgs boson (not yet seen)
• no gravity yet
• agrees with all experimental observations so far
Experiments in High Energy Physics
HEP mainly studies particle collisions in accelerators, e.g.,
Large Electron-Positron (LEP) Collider at CERN, 1989-2000
4 detectors, each collaboration ~400 physicists.
More HEP experiments
LEP tunnel now used for the Large Hadron Collider (LHC)
proton-proton collisions, E_cm = 14 TeV, very high luminosity
Two general purpose detectors: ATLAS and CMS
Each detector collaboration has ~2000 physicists
Data taking to start 2007
Figure: the ATLAS detector
HEP data
Basic unit of data: an ‘event’.
Ideally, an event is a list of momentum vectors & particle types.
In practice, particles ‘reconstructed’ as tracks, clusters of energy
deposited in calorimeters, etc.
Resolution, angular coverage, particle id, etc. imperfect.
Figure: an event from the ALEPH detector at LEP
Data samples
At LEP, event rates typically ~Hz or less
~10^6 Z boson events in 5 years for each of 4 experiments
At LHC, ~10^9 events/sec(!!!), mostly uninteresting;
do quick sifting, record ~200 events/sec
single event ~ 1 Mbyte
1 ‘year’ ≈ 10^7 s, 10^16 pp collisions per year,
2 billion / year recorded (~2 Pbyte / year)
For new/rare processes, rates at LHC can be vanishingly small
Higgs bosons detectable per year could be e.g. ~10^3
→ ‘needle in a haystack’
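The rates quoted on this slide can be cross-checked with a few multiplications. The sketch below does that in Python; the constant names are illustrative, and the inputs are just the round numbers given above.

```python
# Back-of-envelope check of the LHC data-volume figures quoted above.
# Inputs are the round numbers from the slide; the names are illustrative.
SECONDS_PER_YEAR = 1e7     # one accelerator 'year' of running time
COLLISION_RATE_HZ = 1e9    # pp collisions per second
RECORD_RATE_HZ = 200.0     # events written to storage per second
EVENT_SIZE_BYTES = 1e6     # ~1 Mbyte per recorded event

collisions_per_year = COLLISION_RATE_HZ * SECONDS_PER_YEAR
recorded_per_year = RECORD_RATE_HZ * SECONDS_PER_YEAR
bytes_per_year = recorded_per_year * EVENT_SIZE_BYTES
kept_fraction = RECORD_RATE_HZ / COLLISION_RATE_HZ

print(f"pp collisions per year : {collisions_per_year:.1e}")
print(f"events recorded per year: {recorded_per_year:.1e}")
print(f"data volume per year    : {bytes_per_year / 1e15:.1f} Pbyte")
print(f"fraction of events kept : {kept_fraction:.1e}")
```

Only about two events in every ten million survive the online sifting, which is why the selection has to be fast and well understood.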
HEP game plan
Goals include:
Fill in the gaps in the Standard Model (e.g. find the Higgs)
Find something beyond the Standard Model (New Physics)
Example of an extension to SM: Supersymmetry (SUSY)
For every SM particle → SUSY partner (none yet seen!)
Minimal SUSY has 105 free parameters, constrained
models ~5 parameters (plus the 25 from SM)
Provides dark matter candidate (neutralino), unification
of gauge couplings, solution to hierarchy problem,...
Lightest SUSY particle can be stable (effectively invisible)
Simulated HEP data
Monte Carlo event generators available for essentially all
Standard Model processes, also for many possible extensions
to the SM (supersymmetric models, extra dimensions, etc.)
SM predictions rely on a variety of approximations
(perturbation theory to limited order, phenomenological
modeling of non-perturbative effects, etc.)
Monte Carlo programs also used to simulate detector response.
Simulated event for ATLAS
A simulated event
PYTHIA Monte Carlo
pp → gluino-gluino
The data stream
Experiment records events of different types, with different
numbers of particles, kinematic properties, ...
Selecting events
To search for events of a given type
(H0: ‘signal’), need discriminating
variable(s) distributed as differently
as possible relative to unwanted
event types (H1: ‘background’)
Count number of events in acceptance region defined by ‘cuts’
Expected number of signal events:
s = σ_s ε_s L
Expected number of background events:
b = σ_b ε_b L
σ_s, σ_b = cross section for signal, background
‘Efficiencies’: ε_s = P( accept | s ), ε_b = P( accept | b )
L = integrated luminosity
(related to beam intensity, data taking time)
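The expected counts are simple products of cross section, efficiency and integrated luminosity. A minimal sketch follows; the function name and all numerical inputs are hypothetical and chosen only to illustrate the arithmetic, not taken from any analysis.

```python
def expected_events(cross_section_pb, efficiency, luminosity_inv_pb):
    """Expected count = cross section x selection efficiency x integrated luminosity."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency is a probability and must lie in [0, 1]")
    return cross_section_pb * efficiency * luminosity_inv_pb

# Hypothetical inputs, for illustration only.
LUMINOSITY = 1000.0                            # integrated luminosity in pb^-1
s = expected_events(0.05, 0.50, LUMINOSITY)    # rare signal, efficient selection
b = expected_events(50.0, 0.001, LUMINOSITY)   # large background, strongly suppressed

print(f"expected signal     s = {s:.1f}")
print(f"expected background b = {b:.1f}")
```

The cuts are useful when the background efficiency is much smaller than the signal efficiency, since the background cross section is usually far larger.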
Poisson data with background
Count n events, e.g., in fixed time or integrated luminosity.
s = expected number of signal events
b = expected number of background events
n ~ Poisson(s+b):
$P(n; s, b) = \frac{(s+b)^n}{n!} e^{-(s+b)}$
Sometimes b known, other times it is in some way uncertain.
Goals: (i) convince people that s ≠ 0 (discovery);
(ii) measure or place limits on s, taking into
consideration the uncertainty in b.
Widely discussed in HEP community, see e.g. proceedings of
PHYSTAT meetings, Durham, Fermilab, CERN workshops...
Making a discovery
Often compute p-value of the ‘background only’ hypothesis H0
using test variable related to a characteristic of the signal.
p-value = Probability to see data as incompatible with
H0, or more so, relative to the data observed.
Requires definition of ‘incompatible with H0’
HEP folklore: claim discovery if p-value equivalent to a 5σ
fluctuation of Gaussian variable (one-sided)
Actual p-value at which discovery becomes believable
will depend on signal in question (subjective)
Why not do Bayesian analysis?
Usually don’t know how to assign meaningful prior
probabilities
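The correspondence between a one-sided Gaussian significance Z and a p-value is p = 1 - Φ(Z). The sketch below evaluates it with the Python standard library; the function names are illustrative.

```python
import math
from statistics import NormalDist

def p_value_from_z(z):
    """One-sided upper-tail probability of a standard Gaussian beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_from_p_value(p):
    """Gaussian significance Z whose one-sided tail probability equals p."""
    return -NormalDist().inv_cdf(p)

p_five_sigma = p_value_from_z(5.0)
print(f"5 sigma  -> p = {p_five_sigma:.3g}")
print(f"p = 0.09 -> Z = {z_from_p_value(0.09):.2f}")
```

A 5σ threshold therefore corresponds to a p-value of a few times 10^-7.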
Computing p-values
For n ~ Poisson (s+b) we compute p-value of H0 : s = 0
$p = P(n \ge n_{obs}; s = 0, b) = \sum_{n=n_{obs}}^{\infty} \frac{b^n}{n!} e^{-b}$
Often we don’t simply count events but also measure for
each event one or more quantities:
number of events observed n replaced by numbers of events
(n_1, ..., n_N) in a histogram
Goodness-of-fit variable could be e.g. Pearson’s χ²
Example: search for the Higgs boson at LEP
Several usable signal modes:
Mass of jet pair = mass of Higgs boson;
b jets contain tracks not from interaction point
Important background from e+e- → ZZ
b-jet pair of virtual Z can mimic Higgs
A candidate Higgs event
17 ‘Higgs-like’ candidates seen, but no claim of discovery:
p-value of the s = 0 (background-only) hypothesis ≈ 0.09
Setting limits
Frequentist intervals (limits) for a parameter s can be found by
defining a test of the hypothesized value s (do this for all s):
Specify values of the data n that are ‘disfavoured’ by s
(critical region) such that P(n in critical region) ≤ γ
for a prespecified γ, e.g., 0.05 or 0.1.
(Because of discrete data, need inequality here.)
If n is observed in the critical region, reject the value s.
Now invert the test to define a confidence interval as:
set of s values that would not be rejected in a test of
size γ (confidence level is 1 - γ).
The interval will cover the true value of s with probability ≥ 1 - γ.
Setting limits: ‘classical method’
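E.g. for upper limit on s, take critical region to be low values of n,
limit s_up at confidence level 1 - β thus found from
$\beta = P(n \le n_{obs}; s_{up}, b) = \sum_{n=0}^{n_{obs}} \frac{(s_{up}+b)^n}{n!} e^{-(s_{up}+b)}$
Similarly for lower limit at confidence level 1 - α,
$\alpha = P(n \ge n_{obs}; s_{lo}, b) = \sum_{n=n_{obs}}^{\infty} \frac{(s_{lo}+b)^n}{n!} e^{-(s_{lo}+b)}$
Sometimes choose α = β = γ/2 → central confidence interval.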
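The defining equations for the classical limits have no closed-form solution in general, but they can be solved numerically. The following is a sketch using plain bisection on the Poisson mean s + b; the helper names are illustrative and the bracketing range is an assumption that is generous for small n.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def bisect(func, lo, hi, n_iter=200):
    """Root of a continuous func on [lo, hi]; func(lo), func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def _mean_bracket(n_obs):
    """Assumed upper end of the search range for the Poisson mean."""
    return n_obs + 10.0 * math.sqrt(n_obs + 1.0) + 20.0

def upper_limit(n_obs, b, beta=0.05):
    """s_up solving P(n <= n_obs; s_up + b) = beta (can come out negative)."""
    mean_up = bisect(lambda m: poisson_cdf(n_obs, m) - beta, 0.0, _mean_bracket(n_obs))
    return mean_up - b

def lower_limit(n_obs, b, alpha=0.05):
    """s_lo solving P(n >= n_obs; s_lo + b) = alpha."""
    if n_obs == 0:
        return -b  # P(n >= 0) = 1 for every mean, so only the boundary mean = 0 applies
    mean_lo = bisect(lambda m: 1.0 - poisson_cdf(n_obs - 1, m) - alpha,
                     0.0, _mean_bracket(n_obs))
    return mean_lo - b

for n in range(6):
    print(n, round(lower_limit(n, 0.0), 2), round(upper_limit(n, 0.0), 2))
```

With α = β = 0.05 the printed pairs are central intervals at 90% confidence level for b = 0.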
Calculating classical limits
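To solve for s_lo, s_up, can exploit relation to χ² distribution:
$s_{lo} = \frac{1}{2} F^{-1}_{\chi^2}(\alpha; 2n) - b$
$s_{up} = \frac{1}{2} F^{-1}_{\chi^2}(1-\beta; 2(n+1)) - b$
where $F^{-1}_{\chi^2}$ is the quantile of the χ² distribution.
For low fluctuation of n this
can give negative result for s_up;
i.e. confidence interval is empty.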
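The empty-interval problem is easiest to see for n = 0, where the χ² quantile has a closed form: for 2 degrees of freedom the quantile at probability 1 - β is -2 ln β, so s_up = -ln β - b. A short sketch of that special case (the function name is illustrative):

```python
import math

def upper_limit_n0(b, beta=0.05):
    """Classical upper limit on s when n = 0 events are observed.

    For n = 0 the chi-square distribution has 2(n + 1) = 2 degrees of freedom,
    whose quantile at probability 1 - beta is -2 ln(beta). Hence
    s_up = 0.5 * (-2 ln beta) - b = -ln(beta) - b.
    """
    return -math.log(beta) - b

for b in (0.0, 1.0, 3.0, 5.0):
    print(f"b = {b:.1f}  ->  s_up = {upper_limit_n0(b):+.2f}")
```

At 95% confidence level the limit turns negative as soon as b exceeds about 3.0.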
Likelihood ratio limits (Feldman-Cousins)
Define likelihood ratio for hypothesized parameter value s:
$l(s) = \frac{L(n | s, b)}{L(n | \hat{s}, b)}$
Here $\hat{s}$ is the ML estimator, note $0 \le l(s) \le 1$
Critical region defined by low values of likelihood ratio.
Resulting intervals can be one- or two-sided (depending on n).
(Re)discovered for HEP by Feldman and Cousins,
Phys. Rev. D 57 (1998) 3873.
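The likelihood-ratio construction can be sketched for a Poisson count with known background. The code below is an illustrative grid scan, not a reference implementation: the function names, the grid step and the cut-off n_max are assumptions, and the ML estimate is taken as max(0, n - b) to respect s ≥ 0.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n for the given mean (mean = 0 handled explicitly)."""
    if mean <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def acceptance_region(s, b, cl=0.90, n_max=60):
    """Values of n accepted for hypothesised s, added in order of likelihood ratio."""
    ranked = []
    for n in range(n_max + 1):
        s_hat = max(0.0, n - b)  # ML estimate constrained to s >= 0
        ratio = poisson_pmf(n, s + b) / poisson_pmf(n, s_hat + b)
        ranked.append((ratio, n))
    ranked.sort(reverse=True)
    accepted, prob = set(), 0.0
    for _, n in ranked:
        accepted.add(n)
        prob += poisson_pmf(n, s + b)
        if prob >= cl:
            break
    return accepted

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.01):
    """Smallest and largest grid value of s whose acceptance region contains n_obs."""
    n_points = int(round(s_max / step)) + 1
    kept = [i * step for i in range(n_points)
            if n_obs in acceptance_region(i * step, b, cl)]
    return min(kept), max(kept)

demo_lo, demo_hi = fc_interval(n_obs=0, b=0.0)
print(f"n = 0, b = 0, 90% CL: [{demo_lo:.2f}, {demo_hi:.2f}]")
```

Taking the minimum and maximum of the accepted s values is a conservative shortcut in case the accepted set is not connected. The scan is limited by the grid step and by s_max, which must be raised for larger n.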
Coverage probability of confidence intervals
Because of discreteness of Poisson data, probability for interval
to include true value in general > confidence level (‘over-coverage’)
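Over-coverage can be computed exactly for the classical upper limit, because the true s lies below s_up(n) exactly when P(n' ≤ n; s + b) ≥ β. The sketch below sums the Poisson probabilities of those n; the function names and the cut-off n_max are illustrative assumptions.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n for the given mean (mean = 0 handled explicitly)."""
    if mean <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def coverage_of_upper_limit(s_true, b, beta=0.05, n_max=200):
    """Probability that the classical upper limit at CL = 1 - beta exceeds s_true."""
    mean = s_true + b
    cdf = 0.0
    covered = 0.0
    for n in range(n_max + 1):
        p = poisson_pmf(n, mean)
        cdf += p
        if cdf >= beta:  # s_true is not rejected for this n, so s_true <= s_up(n)
            covered += p
    return covered

for s_true in (0.5, 2.0, 3.5, 5.0, 8.0):
    print(f"s = {s_true:4.1f}  coverage = {coverage_of_upper_limit(s_true, 0.0):.4f}")
```

The coverage jumps as s crosses values where one more n enters the critical region, and it never falls below 1 - β.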
More on intervals from LR test (Feldman-Cousins)
Caveat with coverage: suppose we find n >> b.
Usually one then quotes a measurement:
$\hat{s} = n - b$, $\hat{\sigma}_{\hat{s}} = \sqrt{n}$
If, however, n isn’t large enough to claim discovery, one
sets a limit on s.
FC pointed out that if this decision is made based on n, then
the actual coverage probability of the interval can be less than
the stated confidence level (‘flip-flopping’).
FC intervals remove this, providing a smooth transition from
1- to 2-sided intervals, depending on n.
But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL,
p-value of s=0 still substantial. Part of upper-limit ‘wasted’?
Properties of upper limits
Example: take b = 5.0, 1 - γ = 0.95
Figure panels: upper limit s_up vs. n; mean upper limit vs. s
Upper limit versus b
Feldman & Cousins, PRD 57 (1998) 3873
If n = 0 observed, should upper limit depend on b?
Classical: yes
Bayesian: no
FC: yes
Nuisance parameters and limits
In general we don’t know the background b perfectly.
Suppose we have a measurement
of b, e.g., b_meas ~ N(b, σ_b)
So the data are really: n events
and the value b_meas.
In principle the confidence interval
recipe can be generalized to two
measurements and two parameters.
Difficult and not usually attempted, but
see e.g. talks by K. Cranmer at
PHYSTAT03, G. Punzi at PHYSTAT05.
G. Punzi, PHYSTAT05
Bayesian limits with uncertainty on b
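Uncertainty on b goes into the prior, e.g.,
$\pi(s, b) = \pi_s(s)\,\pi_b(b)$, with $\pi_b(b)$ for instance a Gaussian centred on b_meas with width σ_b
Put this into Bayes’ theorem,
$p(s, b | n) \propto P(n | s, b)\,\pi(s, b)$
Marginalize over b, then use p(s|n) to find intervals for s
with any desired probability content.
For b = 0, σ_b = 0, π_s(s) = const. (s > 0), Bayesian upper limit
coincides with classical one.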
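The marginalisation can be done numerically on a grid. The sketch below assumes a flat prior in s for s > 0 and a Gaussian prior for b truncated at b ≥ 0; the function names, grid sizes and integration ranges are illustrative choices, and the returned limit is only accurate to one grid step.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n for the given mean (mean = 0 handled explicitly)."""
    if mean <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def marginal_likelihood(n, s, b_meas, sigma_b, n_steps=200):
    """Integral of P(n | s, b) times an unnormalised Gaussian prior in b (midpoint rule)."""
    if sigma_b == 0.0:
        return poisson_pmf(n, s + b_meas)  # b known exactly
    lo = max(0.0, b_meas - 5.0 * sigma_b)
    hi = b_meas + 5.0 * sigma_b
    h = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        b = lo + (i + 0.5) * h
        prior = math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2)
        total += poisson_pmf(n, s + b) * prior
    return total * h

def bayesian_upper_limit(n, b_meas, sigma_b, cl=0.95, s_max=50.0, n_steps=1000):
    """Upper edge of the first s bin at which the posterior CDF reaches cl."""
    h = s_max / n_steps
    weights = [marginal_likelihood(n, (i + 0.5) * h, b_meas, sigma_b)
               for i in range(n_steps)]  # flat prior in s, so posterior ~ likelihood
    norm = sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if running >= cl * norm:
            return (i + 1) * h
    return s_max

print(bayesian_upper_limit(0, 0.0, 0.0))   # close to the classical 3.0
print(bayesian_upper_limit(0, 5.0, 0.0))   # same: for n = 0 the limit ignores b
print(bayesian_upper_limit(3, 2.0, 0.5))
```

For n = 0 the posterior is proportional to exp(-s) whatever the value of b, so the flat-prior Bayesian limit does not depend on the background.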
Cousins-Highland method
Regard b as random, characterized by pdf π_b(b).
Makes sense in Bayesian approach, but in frequentist
model b is constant (although unknown).
A measurement b_meas is random but this is not the mean
number of background events, rather, b is.
Compute anyway
$P(n; s) = \int P(n; s, b)\,\pi_b(b)\,db$
This would be the probability for n if Nature were to generate
a new value of b upon repetition of the experiment with π_b(b).
Now e.g. use this P(n;s) in the classical recipe for upper limit
at CL = 1 - β:
$\beta = P(n \le n_{obs}; s_{up})$
Widely used method in HEP.
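A sketch of the recipe: average the Poisson probability over a prior for b, then solve the classical upper-limit equation with the smeared distribution. The Gaussian prior truncated at b ≥ 0, the integration grid and the function names are assumptions made for illustration.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n for the given mean (mean = 0 handled explicitly)."""
    if mean <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def smeared_cdf(n_obs, s, b_meas, sigma_b, n_steps=200):
    """P(n <= n_obs; s) after averaging the Poisson law over the prior for b."""
    if sigma_b == 0.0:
        return sum(poisson_pmf(n, s + b_meas) for n in range(n_obs + 1))
    lo = max(0.0, b_meas - 5.0 * sigma_b)
    hi = b_meas + 5.0 * sigma_b
    h = (hi - lo) / n_steps
    num = 0.0
    den = 0.0
    for i in range(n_steps):
        b = lo + (i + 0.5) * h
        w = math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2)  # unnormalised prior weight
        num += w * sum(poisson_pmf(n, s + b) for n in range(n_obs + 1))
        den += w
    return num / den  # dividing by den normalises the truncated prior

def ch_upper_limit(n_obs, b_meas, sigma_b, beta=0.05, s_max=100.0):
    """s_up with smeared P(n <= n_obs; s_up) = beta, or None if no s >= 0 qualifies."""
    if smeared_cdf(n_obs, 0.0, b_meas, sigma_b) <= beta:
        return None  # even s = 0 is rejected: the interval is empty
    lo, hi = 0.0, s_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if smeared_cdf(n_obs, mid, b_meas, sigma_b) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(ch_upper_limit(3, 2.0, 0.0))   # b known exactly: the classical limit
print(ch_upper_limit(3, 2.0, 0.5))   # uncertain b: a somewhat weaker limit
```

In this example the uncertainty on b loosens the limit only slightly, since the smearing enters at second order in σ_b.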
‘Integrated likelihoods’
Consider again signal s and background b, suppose we have
uncertainty in b characterized by a prior pdf π_b(b).
Define integrated likelihood as
$L'(s) = \int L(s, b)\,\pi_b(b)\,db$
also called modified profile likelihood, in any case not
a real likelihood.
Now use this to construct likelihood-ratio test and invert
to obtain confidence intervals.
Feldman-Cousins & Cousins-Highland (FHC2), see e.g.
J. Conrad et al., Phys. Rev. D67 (2003) 012002 and
Conrad/Tegenfeldt PHYSTAT05 talk.
Calculators available (Conrad, Tegenfeldt, Barlow).
Digression: tangent plane method
Consider least-squares fit with parameter of interest θ0 and
nuisance parameter θ1, i.e., minimize χ²(θ0, θ1).
Standard deviations from
tangent lines to contour χ² = χ²_min + 1
Correlation between θ̂0 and θ̂1 causes errors
to increase.
The profile likelihood
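The ‘tangent plane’ method is a special case of using the
profile likelihood:
$L_p(\theta_0) = L(\theta_0, \hat{\hat{\theta}}_1)$
$\hat{\hat{\theta}}_1$ is found by maximizing L(θ0, θ1) for each θ0.
Equivalently use
$\chi^2_p(\theta_0) = \chi^2(\theta_0, \hat{\hat{\theta}}_1)$
The interval obtained from $\chi^2_p(\theta_0) = \chi^2_{min} + 1$ is the same as
what is obtained from the tangents to $\chi^2(\theta_0, \theta_1) = \chi^2_{min} + 1$
Well known in HEP as the ‘MINOS’ method in MINUIT.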
See e.g. talks by Reid, Cranmer, Rolke at PHYSTAT05.
Interval from inverting profile LR test
Suppose we have a measurement bmeas of b.
Build the likelihood ratio test with profile likelihood:
$l(s) = \frac{L(s, \hat{\hat{b}}(s))}{L(\hat{s}, \hat{b})}$
and use this to construct confidence intervals.
Not widely used in HEP but recommended in e.g. Kendall &
Stuart; see also PHYSTAT05 talks by Cranmer, Feldman,
Cousins, Reid.
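For n ~ Poisson(s + b) with a Gaussian measurement b_meas ~ N(b, σ_b), the profiled background has a closed form, so the profile likelihood ratio is easy to scan. The sketch below is illustrative only: the function names are invented, it assumes n ≥ 1 and that the profiled b stays positive, and it uses the asymptotic threshold -2 ln l(s) ≤ 1 (roughly a 68% interval) rather than an exact construction.

```python
import math

def log_likelihood(s, b, n, b_meas, sigma_b):
    """ln L(s, b) up to a constant: Poisson term for n plus Gaussian term for b_meas."""
    mean = s + b
    return n * math.log(mean) - mean - 0.5 * ((b_meas - b) / sigma_b) ** 2

def profiled_b(s, n, b_meas, sigma_b):
    """Value of b maximising ln L at fixed s (positive root of d lnL / db = 0)."""
    var = sigma_b ** 2
    lin = b_meas - s - var
    const = n * var + s * (b_meas - var)
    return 0.5 * (lin + math.sqrt(lin * lin + 4.0 * const))

def q(s, n, b_meas, sigma_b):
    """-2 ln of the profile likelihood ratio l(s)."""
    s_hat = max(0.0, n - b_meas)                 # ML estimate with s >= 0
    b_hat = profiled_b(s_hat, n, b_meas, sigma_b)
    ln_max = log_likelihood(s_hat, b_hat, n, b_meas, sigma_b)
    ln_prof = log_likelihood(s, profiled_b(s, n, b_meas, sigma_b), n, b_meas, sigma_b)
    return 2.0 * (ln_max - ln_prof)

def profile_interval(n, b_meas, sigma_b, q_max=1.0, s_max=30.0, step=0.01):
    """Grid scan for the range of s with q(s) <= q_max."""
    n_points = int(round(s_max / step)) + 1
    inside = [i * step for i in range(n_points)
              if q(i * step, n, b_meas, sigma_b) <= q_max]
    return min(inside), max(inside)

s_lo, s_hi = profile_interval(n=10, b_meas=3.0, sigma_b=0.5)
print(f"s_hat = 7.0, interval = [{s_lo:.2f}, {s_hi:.2f}]")
```

The interval is asymmetric about the estimate, as expected for a small Poisson count. Its coverage is only approximate for finite samples.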
Wrapping up,
Frequentist methods have been most widely used but for many
questions (particularly related to systematics), Bayesian methods
are getting more notice.
Frequentist properties such as coverage probability of confidence
intervals seen as very important (overly so?)
Bayesian methods remain problematic in cases where it is
difficult to enumerate alternative hypotheses and assign
meaningful prior probabilities.
Tools widely applied at LEP; some work needed to extend
these to LHC analyses (ongoing).
Finally,
The LEP programme was dominated by limit setting:
Standard Model confirmed, No New Physics
The Tevatron discovered the top quark and Bs mixing (both parts
of the SM) and also set many limits (but NNP)
By ~2012 either we’ll have discovered something
new and interesting beyond the Standard Model,
or,
we’ll still be setting limits and HEP should think
seriously about a new approach!
Extra slides
A recent discovery: Bs oscillations
Recently the D0 experiment (Fermilab) announced the
discovery of Bs mixing:
Moriond talk by Brendan Casey, also hep-ex/0603029
Produce a Bq meson at time t=0; there is a time dependent
probability for it to decay as an anti-Bq (q = d or s):
|V_ts| ≫ |V_td| and so Bs oscillates quickly compared to decay rate
Sought but not seen at LEP;
early on predicted to be visible at Tevatron
Discovery quickly confirmed by the CDF experiment
Statistics in HEP, IoP Half Day Meeting, 16 November 2005, Manchester
Confidence interval from likelihood function
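In the large sample limit it can be shown for ML estimators:
$\hat{\theta} \sim N(\theta, V)$ (n-dimensional Gaussian, covariance V)
$Q(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^T V^{-1} (\hat{\theta} - \theta) \le Q_\gamma$
defines a hyper-ellipsoidal confidence region,
If $\hat{\theta} \sim N(\theta, V)$ then $Q(\hat{\theta}, \theta) \sim \chi^2_n$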
Approximate confidence regions from L(θ)
So the recipe to find the confidence region with CL = 1-γ is:
$\ln L(\theta) = \ln L_{max} - \frac{Q_\gamma}{2}$, where $Q_\gamma = F^{-1}_{\chi^2_n}(1 - \gamma)$
For finite samples, these are approximate confidence regions.
Coverage probability not guaranteed to be equal to 1-γ;
no simple theorem to say by how far off it will be (use MC).
Upper limit from test of hypothesized Δm_s
Base test on likelihood ratio l (here θ = Δm_s).
Observed value is l_obs, sampling distribution is g(l; θ) (from MC)
θ is excluded at CL = 1-γ if its p-value, computed from g(l; θ), is below γ
D0 shows the distribution of ln l for Δm_s = 25 ps^-1
Figure annotations: equivalent to 2.1σ effect; 95% CL upper limit
The significance of an observed signal
Suppose b = 0.5, and we observe n_obs = 5.
p-value for the s = 0 hypothesis: P(n ≥ 5; b = 0.5) = 1.7 × 10^-4
Often, however, b has some uncertainty;
this can have significant impact on p-value,
e.g. if b = 0.8, p-value = 1.4 × 10^-3
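The sensitivity of the p-value to b is easy to reproduce. A minimal sketch, with an illustrative function name, that sums the Poisson probabilities below n_obs and takes the complement:

```python
import math

def poisson_p_value(n_obs, b):
    """P(n >= n_obs) for n ~ Poisson(b), i.e. the p-value of the s = 0 hypothesis."""
    term = math.exp(-b)          # P(n = 0)
    cdf_below = 0.0
    for k in range(n_obs):
        cdf_below += term        # adds P(n = k)
        term *= b / (k + 1)      # turns P(n = k) into P(n = k + 1)
    return 1.0 - cdf_below

for b in (0.5, 0.8):
    print(f"b = {b}: p-value = {poisson_p_value(5, b):.2g}")
```

Raising b from 0.5 to 0.8 increases the p-value by almost an order of magnitude.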
The significance of a peak
Suppose we measure a value x for each event and find:
Figure: histogram of x with a peak in two adjacent bins
Each bin (observed) is a Poisson r.v., means are
given by dashed lines.
In the two bins with the peak, 11 entries found with b = 3.2.
We are tempted to compute the p-value for the s = 0 hypothesis as:
P(n ≥ 11; b = 3.2, s = 0) = 5.0 × 10^-4
The significance of a peak (2)
But... did we know where to look for the peak?
→ give P(n ≥ 11) in any 2 adjacent bins
Is the observed width consistent with the expected x resolution?
→ take x window several times the expected resolution
How many bins × distributions have we looked at?
→ look at a thousand of them, you’ll find a 10^-3 effect
Did we adjust the cuts to ‘enhance’ the peak?
→ freeze cuts, repeat analysis with new data
How about the bins to the sides of the peak... (too low!)
Should we publish????
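The ‘thousand distributions’ remark can be made quantitative under a simplifying assumption: if the places looked at are statistically independent, the chance that at least one of N shows a local p-value of p or smaller is 1 - (1 - p)^N. A short sketch with an illustrative function name:

```python
def global_p_value(p_local, n_trials):
    """Chance that at least one of n_trials independent looks gives p <= p_local."""
    return 1.0 - (1.0 - p_local) ** n_trials

for n_trials in (1, 10, 100, 1000):
    print(f"N = {n_trials:4d}: global p = {global_p_value(1e-3, n_trials):.3f}")
```

Real bins and distributions are usually correlated, so this is only a rough guide; a Monte Carlo study of the full search is the more careful route.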
Statistical vs. systematic errors
Statistical errors:
How much would the result fluctuate upon repetition
of the measurement?
Implies some set of assumptions to define
probability of outcome of the measurement.
Systematic errors:
What is the uncertainty in my result due to
uncertainty in my assumptions, e.g.,
model (theoretical) uncertainty;
modeling of measurement apparatus.
The sources of error do not vary upon repetition of the
measurement. Often result from uncertain
value of, e.g., calibration constants, efficiencies, etc.
Systematic errors and nuisance parameters
Response of measurement apparatus is never modeled perfectly:
Figure: y (measured value) vs. x (true value), with curves for the model and the truth
Model can be made to approximate better the truth by including
more free parameters.
systematic uncertainty ↔ nuisance parameters