Results from Small Numbers

Roger Barlow
QMUL
2nd October 2006
Particle Physics is about counting
Pretty much everything is Poisson Statistics (error on $N$ is $\sqrt{N}$)
Numbers of events give cross sections, branching ratios…
E.g. Have $T = 200$M B mesons, Efficiency $E = 0.02$
Observe $N = 100$ events. Error $\sqrt{100} = 10$
$BR = N/(TE)$
$BR = (25.0 \pm 2.5) \times 10^{-6}$
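The arithmetic on this slide can be checked in a few lines. This is a minimal sketch that assumes the only uncertainty is the Poisson error on the count; the function name `branching_ratio` is illustrative and does not come from the talk.

```python
import math

def branching_ratio(n_obs, n_produced, efficiency):
    """Return (BR, error), assuming a Poisson error sqrt(N) on the count only."""
    denominator = n_produced * efficiency
    br = n_obs / denominator
    err = math.sqrt(n_obs) / denominator
    return br, err

br, err = branching_ratio(n_obs=100, n_produced=200e6, efficiency=0.02)
print(f"BR = ({br * 1e6:.1f} +/- {err * 1e6:.1f}) x 10^-6")
```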
Summary: 3 problems
1. Number of events small (including 0)
2. Number of events ~ Background
3. Uncertainties in background and efficiency
1. What do you do with zero?
Observe 0 events. (Many searches do)
$BR = 0 \pm 0$ is obviously wrong
We know BR is small. But not that it’s exactly 0.
Combination of small value+bad luck can
give 0 events
Need to go back two steps to consider
- what we mean by measurement errors
- what we mean by probability
Probability (conventional definition)
[Figure: event A inside the Ensemble of Everything]
Limit of frequency
$P(A) = \lim_{N \to \infty} N(A)/N$
Standard (frequentist) definition.
2 tricky features
1. P(A) depends not just on A but on the ensemble. Several ensembles may be possible
2. If you cannot define an Ensemble there is no such thing as probability
Feature 1: There can be many Ensembles
Probabilities belong to the event and the ensemble
- Insurance company data shows P(death) for 40 year old male clients = 1.4% (Classic example due to von Mises)
- Does this mean a particular 40 year old German has a 98.6% chance of reaching his 41st Birthday?
- No. He belongs to many ensembles
  - German insured males
  - German males
  - Insured nonsmoking vegetarians
  - German insured male racing drivers
  - …
Each of these gives a different number. All equally valid.
Feature 2: Unique events have no ensemble
Some events are unique.
Consider
“It will probably rain tomorrow.”
There is only one tomorrow (Tuesday 3rd October). There is
NO ensemble. P(rain) is either 0/1 =0 or 1/1 = 1
Strict frequentists cannot say 'It will probably rain tomorrow'.
This presents severe social problems.
Circumventing the limitation
A frequentist can say:
“The statement ‘It will rain tomorrow’ has
a 70% probability of being true.”
by assembling an ensemble of
statements and ascertaining that 70%
(say) are true.
(E.g. Weather forecasts with a verified
track record)
Say “It will rain tomorrow” with 70%
confidence
What is a measurement?
$M_T = 174 \pm 5$ GeV
What does it mean?
For true value $\mu$ and standard deviation $\sigma$ the probability (density) for a result $x$ is (for the usual Gaussian measurement)
$P(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$
So is there a 68% probability that $M_T$ lies between 169 and 179 GeV?
No. $M_T$ is unique. It is either in that range or outside. (Soon we will know.)
For a given $\mu$, the probability that $x$ lies within $\mu \pm \sigma$ is 68%
This does not mean that for a given $x$, the 'inverse' probability that $\mu$ lies within $x \pm \sigma$ is 68%
$P(x; \mu, \sigma)$ cannot be used as a probability for $\mu$.
(It is called the likelihood function for $\mu$ given $x$.)
What a measurement error says
A Gaussian measurement gives a result within $1\sigma$ of the true value in 68% of all cases.
The statement "$\mu \in [x - \sigma, x + \sigma]$" has a 68% probability of being true, using the ensemble of all such statements.
We say "$\mu \in [x - \sigma, x + \sigma]$", or "$M_T$ lies between 169 and 179 GeV" with 68% confidence.
Can also say "$\mu \in [x - 2\sigma, x + 2\sigma]$" @ 95% or "$\mu \in [-\infty, x + \sigma]$" @ 84% or whatever
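The "ensemble of all such statements" can be illustrated with a toy simulation. The sketch below assumes a fixed true value (174 GeV, chosen only to echo the example above) and a Gaussian resolution of 5 GeV, and counts how often the statement "the true value lies in [x - sigma, x + sigma]" turns out to be true. The function name `coverage` and the seed are illustrative choices.

```python
import random

def coverage(mu_true, sigma, n_trials=100_000, seed=1):
    """Fraction of statements 'mu in [x - sigma, x + sigma]' that are true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        x = rng.gauss(mu_true, sigma)          # one simulated measurement
        if x - sigma <= mu_true <= x + sigma:  # is this statement true?
            hits += 1
    return hits / n_trials

frac = coverage(mu_true=174.0, sigma=5.0)
print(f"fraction of true statements: {frac:.3f}")
```

The fraction should come out close to 0.68 whatever true value is assumed. That is a property of the set of statements, not of any single one.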
Extension beyond simple Gaussian
Choose construction (functions $x_1(\mu)$, $x_2(\mu)$) for which
$P(x \in [x_1(\mu), x_2(\mu)]) \ge CL$ for all $\mu$
Given a measurement $X$, make statement
$\mu \in [\mu_{LO}, \mu_{HI}]$ @ CL
where $X = x_2(\mu_{LO})$, $X = x_1(\mu_{HI})$
Confidence Belt
Constructed Horizontally such that the probability of a result lying inside the belt is 68% (or whatever)
Read Vertically using the measurement
Example: proportional Gaussian
$\sigma = 0.1\mu$
Measures with 10% accuracy
Result (say) 100.0
$\mu_{LO} = 90.91$, $\mu_{HI} = 111.1$
[Figure: confidence belt in the ($x$, $\mu$) plane, read off at the measurement $X$]
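For the proportional Gaussian the belt can be inverted by hand. The sketch below assumes a 68% belt with edges $x_1(\mu) = \mu(1 - 0.1)$ and $x_2(\mu) = \mu(1 + 0.1)$, and solves $X = x_2(\mu_{LO})$ and $X = x_1(\mu_{HI})$. The function name is illustrative.

```python
def proportional_gaussian_interval(x_measured, rel_sigma=0.1):
    """68% interval for mu when sigma = rel_sigma * mu.

    Belt edges: x1(mu) = mu * (1 - rel_sigma), x2(mu) = mu * (1 + rel_sigma).
    Inversion:  X = x2(mu_lo) and X = x1(mu_hi).
    """
    mu_lo = x_measured / (1.0 + rel_sigma)
    mu_hi = x_measured / (1.0 - rel_sigma)
    return mu_lo, mu_hi

lo, hi = proportional_gaussian_interval(100.0)
print(f"mu_LO = {lo:.2f}, mu_HI = {hi:.1f}")
```

The interval is not symmetric about the measurement, because the width of the belt grows with the true value.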
Use for small numbers
Can choose CL
Just use one curve to give upper limit
Discrete observable makes smooth curves into ugly
staircases
Observe $n$. Quote upper limit as $\lambda_{HI}$ from solving
$\sum_{r=0}^{n} P(r; \lambda_{HI}) = \sum_{r=0}^{n} e^{-\lambda_{HI}} \lambda_{HI}^{r}/r! = 1 - CL$
English translation. $n$ is small. $\lambda$ can't be very large. If the true value is $\lambda_{HI}$ (or higher) then the chance of a result this small (or smaller) is only (1-CL) (or less)
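The defining equation for the upper limit has no closed form for n > 0, but it is easy to solve numerically. A minimal sketch using bisection follows; the helper names `poisson_cdf` and `poisson_upper_limit` are illustrative, not from the talk.

```python
import math

def poisson_cdf(n, lam):
    """P(r <= n) for a Poisson distribution with mean lam."""
    term = math.exp(-lam)          # r = 0 term
    total = term
    for r in range(1, n + 1):
        term *= lam / r            # build lam^r / r! iteratively
        total += term
    return total

def poisson_upper_limit(n, cl=0.90, tol=1e-9):
    """Solve sum_{r=0}^{n} P(r; lam_hi) = 1 - cl for lam_hi by bisection."""
    lo, hi = 0.0, 10.0
    while poisson_cdf(n, hi) > 1.0 - cl:   # widen until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n, mid) > 1.0 - cl:
            lo = mid                       # tail probability still too large
        else:
            hi = mid
    return 0.5 * (lo + hi)

for n in range(6):
    print(n, [round(poisson_upper_limit(n, cl), 2) for cl in (0.90, 0.95, 0.99)])
```

For n = 0 the equation reduces to $e^{-\lambda_{HI}} = 1 - CL$, which gives 2.30 at 90%.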
Poisson table
Upper limits

n      90%     95%     99%
0     2.30    3.00    4.61
1     3.89    4.74    6.64
2     5.32    6.30    8.41
3     6.68    7.75   10.05
4     7.99    9.15   11.60
5     9.27   10.51   13.11
.....
Bayesian (Subjective) Probability
P(A) is a number describing my degree of belief in A
1=certain belief. 0=total disbelief
Intermediate beliefs can be calibrated against simple
frequentist probabilities.
P(A)=0.5 means: I would be indifferent given the
choice of betting on A or betting on a coin toss.
A can be anything. Measurements, true values, rain,
MT, MH, horse races, existence of God, age of the
current king of France…
Very adaptable. But no guarantee my P(A) is the same
as your P(A). Subjective = unscientific?
Bayes’ Theorem
General (uncontroversial) form
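$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$
"Bayesian" form
$P(\mathrm{Theory}|\mathrm{Data}) = \frac{P(\mathrm{Data}|\mathrm{Theory})\,P(\mathrm{Theory})}{P(\mathrm{Data})}$
$P(\mathrm{Theory}|\mathrm{Data})$ is the Posterior, $P(\mathrm{Theory})$ is the Prior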
Bayes at work
Prior and Posterior can be numbers
Successful predictions boost belief in theory
Several experiments modify belief cumulatively
Prior and Posterior can be distributions
P(|x) P(x|) P()
=
X
Ignore normalisation problems
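When prior and posterior are plain numbers, the cumulative effect of several experiments is just Bayes' theorem applied repeatedly. The sketch below uses made-up numbers: a starting belief of 0.10, and a prediction that succeeds with probability 0.9 if the theory is true and 0.3 if it is false. All of these values and the function name are hypothetical.

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Posterior P(Theory|Data) from Bayes' theorem for a true/false theory."""
    p_data = p_data_if_true * prior + p_data_if_false * (1.0 - prior)
    return p_data_if_true * prior / p_data

belief = 0.10                      # hypothetical starting belief in the theory
for _ in range(3):                 # three successful predictions in a row
    belief = bayes_update(belief, p_data_if_true=0.9, p_data_if_false=0.3)
    print(f"{belief:.3f}")
```

Each posterior becomes the prior for the next experiment, so the belief climbs from 0.10 to 0.25, 0.50 and 0.75.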
Example: Poisson
P(r,)=exp(- )  r/r!
With uniform prior this
gives posterior for 
Shown for various
small r results
Read off intervals...
P()
r=0
r=1

r=2
r=6
18
Upper limits
Upper limit from n events
$\int_0^{\lambda_{HI}} e^{-\lambda} \lambda^{n}/n! \, d\lambda = CL$
Repeated integration by parts:
$\sum_{r=0}^{n} e^{-\lambda_{HI}} \lambda_{HI}^{r}/r! = 1 - CL$
Same as frequentist limit
This is a coincidence! Lower Limit formula is not the same
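The equality of the two expressions can be checked numerically without doing the integration by parts. The sketch below integrates the uniform-prior posterior with Simpson's rule and compares it with one minus the Poisson sum; the function names are illustrative.

```python
import math

def posterior(lam, n):
    """Posterior density for lam after n events, with a uniform prior."""
    return math.exp(-lam) * lam**n / math.factorial(n)

def posterior_integral(n, lam_hi, steps=20_000):
    """Integrate the posterior from 0 to lam_hi (Simpson's rule, even steps)."""
    h = lam_hi / steps
    total = posterior(0.0, n) + posterior(lam_hi, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * posterior(i * h, n)
    return total * h / 3.0

def poisson_sum(n, lam_hi):
    """sum_{r=0}^{n} exp(-lam_hi) * lam_hi^r / r!"""
    return sum(math.exp(-lam_hi) * lam_hi**r / math.factorial(r)
               for r in range(n + 1))

for n, lam_hi in [(0, 2.30), (2, 5.32), (5, 9.27)]:
    print(n, round(posterior_integral(n, lam_hi), 4),
          round(1.0 - poisson_sum(n, lam_hi), 4))
```

Both columns agree, and each is close to 0.90 because the chosen values of the limit are the rounded 90% upper limits for n = 0, 2 and 5.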
Result depends on Prior
Example: 90% CL Limit from 0 events
Prior flat in $\lambda$: 2.30
Prior flat in … : 1.65
[Figure: for each prior, prior × likelihood = posterior]
Health Warning




Results using Bayesian Statistics will depend
on the prior
Choice of prior is arbitrary (almost always).
‘Uniform’ is not the answer. Uniform in what?
Serious statistical analyses will try several
priors and check how much the result shifts
(robustness)
Many physicists don’t bother
2. Next problem: add a background
$\lambda = S + b$
Frequentist Approach:
1. Find range for $\lambda$
2. Subtract b to get range for S
Examples:
See 5 events, background 1.2
95% Upper limit: 10.5 → 9.3
See 5 events, background 5.1
95% Upper limit: 10.5 → 5.4 ?
See 5 events, background 10.6
95% Upper limit: 10.5 → -0.1
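The naive subtraction is a one-liner, which makes it easy to see where it goes wrong. A minimal sketch follows, taking the 95% upper limit on the total rate for 5 observed events as 10.5; the function name is illustrative.

```python
def subtract_background(lam_upper, background):
    """Naive frequentist limit on S: upper limit on lambda minus b."""
    return lam_upper - background

LAM_UPPER_95_N5 = 10.5   # 95% upper limit on lambda for 5 observed events

for b in (1.2, 5.1, 10.6):
    s_limit = subtract_background(LAM_UPPER_95_N5, b)
    flag = "" if s_limit >= 0 else "  <- unphysical (negative signal)"
    print(f"b = {b}: S < {s_limit:.1f}{flag}")
```

Nothing in the procedure knows that S cannot be negative, so a large enough background always produces an unphysical limit.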
S< -0.1? What’s going on?
If N<b we know that there is a downward fluctuation
in the background. (Which happens…)
But there is no way of incorporating this information
without messing up the ensemble
Really strict frequentist procedure is to go ahead and publish.
- We know that 5% of 95% CL statements are wrong: this is one of them
- Suppressing this publication will bias the global results
$\lambda = S + b$ for Bayesians
No problem!
Prior for $\lambda$ is uniform for $\lambda \ge b$ (i.e. $S \ge 0$)
Multiply and normalise as before
[Figure: posterior = likelihood × prior]
Read off Confidence Levels by integrating posterior
Incorporating Constraints: Poisson
Work with total source strength (s+b) you know is greater than the background b
Need to solve
$1 - CL = \frac{\sum_{r=0}^{n} e^{-(s+b)} (s+b)^{r}/r!}{\sum_{r=0}^{n} e^{-b} b^{r}/r!}$
Formula not as obvious as it looks.
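The constrained formula can be solved for s in the same way as the plain Poisson limit. The sketch below assumes the formula is the ratio of two Poisson sums, one at mean s+b and one at mean b, set equal to 1 - CL. The helper names are illustrative.

```python
import math

def poisson_cdf(n, mean):
    """P(r <= n) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for r in range(1, n + 1):
        term *= mean / r
        total += term
    return total

def constrained_upper_limit(n, b, cl=0.90, tol=1e-9):
    """Solve P(r<=n; s+b) / P(r<=n; b) = 1 - cl for s >= 0 by bisection."""
    def ratio(s):
        return poisson_cdf(n, s + b) / poisson_cdf(n, b)
    lo, hi = 0.0, 10.0
    while ratio(hi) > 1.0 - cl:    # widen until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio(mid) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for b in (1.2, 5.1, 10.6):
    print(f"n = 5, b = {b}: S < {constrained_upper_limit(5, b, cl=0.95):.2f}")
```

Because the ratio equals 1 at s = 0 and falls as s grows, the solution is always positive. For n = 0 the background cancels and the limit is 2.30 at 90% whatever b is.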
Feldman Cousins Method
Works by attacking what looks like a different problem...
Also called* ‘the Unified Approach’
Physicists are human
Ideal Physicist
1. Choose Strategy
2. Examine data
3. Quote result
Real Physicist
1. Examine data
2. Choose Strategy
3. Quote Result
Example:
You have a background of 3.2
Observe 5 events? Quote one-sided upper limit (9.27 - 3.2 = 6.07 @ 90%)
Observe 25 events? Quote two-sided limits
* by Feldman and Cousins, mostly
Feldman Cousins: $\lambda = s + b$
b is known. N is measured. s is what we're after
Choosing between a one-sided and a two-sided limit after looking at the data is called 'flip-flopping' and is BAD because it wrecks the whole design of the Confidence Belt
Suggested solution:
1) Construct belts at chosen CL as before
2) Find new ranking strategy to determine what's inside and what's outside
[Figure: confidence belts, 1-sided 90% and 2-sided 90%]
Feldman Cousins: Ranking
First idea (almost right)
Sum/integrate over outcomes with highest probabilities
(advantage that this is the shortest interval)
Glitch: Suppose N small. (low fluctuation)
P(N;s+b) will be small for any s and never get counted
Instead: compare to 'best' probability for this N, at s=N-b or
s=0 and rank on that number
Such a plot does an automatic ‘flip-flop’
N~b single sided limit (upper bound) for s
N>>b 2 sided limits for s
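The ranking described above can be sketched as a brute-force scan over s. This is a simplified illustration under stated assumptions (a fixed grid in s, a cut-off `n_max` on the number of events considered, no smoothing of the resulting limits), not the authors' published procedure or code. All function names are hypothetical.

```python
import math

def pois(n, mean):
    """Poisson probability of n events for the given mean."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def fc_acceptance(s, b, cl, n_max):
    """Values of n accepted at signal s, ranked by P(n; s+b) / P(n; s_best+b)."""
    ranked = []
    for n in range(n_max + 1):
        s_best = max(0.0, n - b)             # best physical signal for this n
        ranked.append((pois(n, s + b) / pois(n, s_best + b), n))
    ranked.sort(reverse=True)                # highest ratio first
    accepted, total = set(), 0.0
    for _, n in ranked:
        accepted.add(n)
        total += pois(n, s + b)
        if total >= cl:                      # stop once coverage reaches CL
            break
    return accepted

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.01, n_max=60):
    """Smallest and largest grid value of s whose acceptance set contains n_obs."""
    inside = [i * step for i in range(int(s_max / step) + 1)
              if n_obs in fc_acceptance(i * step, b, cl, n_max)]
    return min(inside), max(inside)

print(fc_interval(0, b=0.0))
print(fc_interval(5, b=3.2))
print(fc_interval(25, b=3.2, s_max=40.0, step=0.05, n_max=100))
```

With a background of 3.2, observing 5 events gives an interval starting at s = 0 (an upper limit), while 25 events gives a lower edge above zero. The switch happens without the physicist choosing.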
How it works
Has to be computed for the appropriate value of background b. (Sounds complicated, but there is lots of software around)
As n increases, flips from 1-sided to 2-sided limits, but in such a way that the probability of being in the belt is preserved
Means that sensible 1-sided limits are quoted instead of nonsensical 2-sided limits!
[Figure: Feldman-Cousins confidence belt, s against n]
Arguments against using Feldman Cousins
Argument 1
It takes control out of the hands of the physicist. You might want to quote a 2-sided limit for an expected process, an upper limit for something weird
Counter argument:
This is the virtue of the method. This control invalidates the conventional technique. The physicist can use their discretion over the CL. In rare cases it is permissible to say "We set a 2-sided limit, but we're not claiming a signal"
Feldman Cousins: Argument 2
Argument 2
If zero events are observed by two experiments, the one with the
higher background b will quote the lower limit. This is unfair to
hardworking physicists
 Counterargument
An experiment with higher background has to be ‘lucky’ to get zero
events. Luckier experiments will always quote better limits.
Averaging over luck, lower values of b get lower limits to report.

Example: you reward a good student with a lottery
ticket which has a 10% chance of winning £10. A
moderate student gets a ticket with a 1% chance
of winning £ 20. They both win. Were you
unfair?
3. Including Systematic Errors
$\lambda = aS + b$
$\lambda$ is predicted number of events
S is (unknown) signal source strength. Probably a cross section or branching ratio or decay rate
a is an acceptance/luminosity factor known with some (systematic) error
b is the background rate, known with some (systematic) error
3.1 Full Bayesian
Assume priors
 for S (uniform?)
 For a (Gaussian?)
 For b (Poisson or Gaussian?)
Write down the posterior P(S,a,b).
Integrate over all a,b to get marginalised P(S)
Read off desired limits by integration
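The three steps above can be sketched numerically. The code below is one possible set of choices, all of them assumptions: a uniform prior on S for S >= 0, Gaussian priors for a and b truncated at ±4 standard deviations and at zero, and crude grid integration. The function names are hypothetical.

```python
import math

def gaussian_grid(mean, sigma, n_points=21, n_sigma=4.0):
    """Grid points with normalised Gaussian weights, truncated at zero."""
    if sigma == 0.0:
        return [(mean, 1.0)]                 # no uncertainty: single point
    pts = [mean + sigma * n_sigma * (2.0 * i / (n_points - 1) - 1.0)
           for i in range(n_points)]
    grid = [(x, math.exp(-0.5 * ((x - mean) / sigma) ** 2)) for x in pts if x > 0.0]
    norm = sum(w for _, w in grid)
    return [(x, w / norm) for x, w in grid]

def marginal_likelihood(n, s, a_grid, b_grid):
    """P(n | S) after integrating the Poisson probability over a and b."""
    total = 0.0
    for a, wa in a_grid:
        for b, wb in b_grid:
            lam = a * s + b
            total += wa * wb * math.exp(-lam) * lam**n / math.factorial(n)
    return total

def bayes_upper_limit(n, a, sig_a, b, sig_b, cl=0.90, s_max=30.0, steps=1500):
    """Upper limit on S from the marginalised posterior (uniform prior, S >= 0)."""
    a_grid, b_grid = gaussian_grid(a, sig_a), gaussian_grid(b, sig_b)
    h = s_max / steps
    dens = [marginal_likelihood(n, i * h, a_grid, b_grid) for i in range(steps + 1)]
    norm = sum(0.5 * (dens[i] + dens[i + 1]) * h for i in range(steps))  # trapezoid
    running = 0.0
    for i in range(steps):
        piece = 0.5 * (dens[i] + dens[i + 1]) * h / norm
        if running + piece >= cl:
            return (i + (cl - running) / piece) * h   # interpolate inside the step
        running += piece
    return s_max

print(round(bayes_upper_limit(0, 1.0, 0.0, 3.0, 0.0), 2))
print(round(bayes_upper_limit(0, 1.0, 0.1, 3.0, 0.0), 2))
```

With no systematic errors and zero observed events the limit reduces to 2.30. A 10% uncertainty on a raises it only slightly. A different prior for S or for a would give a different number.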
3.2 Hybrid Bayesian
Assume priors
- For a (Gaussian?)
- For b (Poisson or Gaussian?)
Integrate over all a,b to get marginalised P(r,S)
Read off desired limits by $\sum_{r=0}^{n} P(r,S) = 1 - CL$ etc
Done approximately for small errors (Cousins and Highland). Shows that limits are pretty insensitive to $\sigma_a$, $\sigma_b$
Numerically for general errors (RB: java applet on SLAC web page). Includes 3 priors for a that give slightly different results
3.3-3.9
Extend Feldman Cousins
- Profile Likelihood: Use $P(S) = P(n, S, a_{max}, b_{max})$ where $a_{max}$, $b_{max}$ give maximum for this S, n
- Empirical Bayes
- And more…
Results being compared as outcome from Banff workshop
Summary





Straight Frequentist approach is objective and
clean but sometimes gives ‘crazy’ results
Bayesian approach is valuable but has
problems. Check for robustness under choice
of prior
Feldman-Cousins deserves more widespread
adoption
Lots of work still going on
This will all be needed at the LHC