Transcript statpp2006

Statistics in Particle Physics
20-29 November 2006
Tatsuo Kawamoto
ICEPP, University of Tokyo
Outline
1. Introduction
2. Probability
3. Distributions
4. Fitting and extracting parameters
5. Combination of measurements
6. Errors, limits and confidence intervals
7. Likelihood, ANN, and that sort of thing
References
• Textbooks of statistics in HEP
• PDG review (Probability, Statistics)
• Relevant scientific papers
1. Introduction
Why bother with statistics ?
It’s not fundamental.
As soon as we come to the point of presenting the results of
an experiment, we face a few questions like:
• What is the size of the uncertainty?
• How do we combine results from different runs?
• Have we discovered something new?
• If not a discovery, what can we say from the
experiment?
Prescriptions for these problems often involve
considerations based on statistics.
Particle Physics
Study of elementary particles that have been discovered
- Quarks
- leptons
- Gauge bosons
- Hadrons
And anything that has not been discovered
- Higgs
- Supersymmetry
- Extradimensions
Goals of experiments
For each particle we want to know, e.g.:
- What are its properties ?
  mass, lifetime, spin, ….
- What are its decay modes ?
- How does it interact with other particles ?
- Does it exist at all ?
Observation is a result of the fundamental rules of nature;
these are random, quantum mechanical processes.
Also, the detector effects (resolution, efficiency, …) are
often of a random nature.
Systematic uncertainty is a subtle subject, but we have to do
our best to say something about it, and treat it reasonably.
Template for an experiment
To study X
• Arrange for X to occur
e.g. colliding beams
• Record events that might be X
trigger, data acquisition, …
• Reconstruct momentum, energy, … of visible particles
• Select events that could be X by applying CUTS
Efficiency < 100%, Background > 0
• Study distributions of interesting variables
• Compare with / fit to theoretical distributions
• Infer the value of a parameter θ and its uncertainty
Implications
• Essentially counting numbers
• Uncertainties of measurements are understood
• Distributions are reproduced to reasonable accuracy
We don’t use:
•Student’s t
•F test
•Markov chains
•…
Tools
•Monte Carlo simulation
Know in principle → Know in practice
Simple beautiful underlying physics
Unbeautiful effects (higher order, fragmentation,..)
Ugly detector imperfections (resolution, efficiency)
•Likelihood
Fundamental tool to handle probability
•Fitting
χ², Likelihood,
Goodness of fit
•Toy Monte Carlo
Handle complicated likelihood
Extracting parameters
Example:
mZ = 91.1853 ± 0.0029 GeV
ΓZ = 2.4947 ± 0.0041 GeV
σhad = 41.82 ± 0.044 nb
[Figure: E. Hubble's velocity–distance diagram]
Combining results
Discovery or placing limits
Likelihood, Artificial Neural Net
Use as much information as possible
Example:
W+W- → qqqq
There are other important things
which we don’t cover
•Blind analysis
•Unfolding
•….
2. Probability
What is it?
Mathematical
P(A) is a number obeying the rules:
Kolmogorov axioms
Ai are disjoint events
Mathematical
Lemma
And, that’s almost it.
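In standard form, for a sample space Ω, the axioms and a few immediate lemmas read:

```latex
% Kolmogorov axioms (the A_i are disjoint events)
P(A) \ge 0 \quad \text{for every event } A, \qquad
P(\Omega) = 1, \qquad
P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i)

% Immediate lemmas
P(\bar{A}) = 1 - P(A), \qquad
P(\emptyset) = 0, \qquad
A \subseteq B \;\Rightarrow\; P(A) \le P(B)
```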
Classical
Laplace, …
From considerations of games of chances
Given by symmetry for equally likely outcomes, among which
we are equally undecided.
Classify things into a certain number of equally likely cases,
and count the number of favorable cases.
P(A) = (number of equally likely favorable cases) / (total number of cases)
Tossing a coin: P(H) = 1/2. Throwing a die: P(1) = 1/6.
How to handle continuous variables ?
Frequentist
Probability is the limit of relative frequency (taken over some ensemble).
The event A either occurs or it does not; P(A) is the limiting relative frequency of its occurrence.
Law of large numbers
An example: throwing a die
The frequency definition is associated with some ensemble of ‘events’.
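The die example can be made concrete with a small simulation (a sketch; the function name is illustrative): the relative frequency of one face settles towards 1/6 as the number of throws grows, as the law of large numbers says.

```python
import random

def face_frequency(n_throws, face=1, seed=42):
    """Relative frequency of one face in n_throws of a fair die."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws

# the frequency wanders for small samples and approaches P(1) = 1/6
for n in (100, 10_000, 1_000_000):
    print(n, face_frequency(n))
```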
Can’t say things like:
• It will probably rain tomorrow
• Probability of LHC collision in November 2007
• Probability of existence of SUSY
•…
But one can say:
• The statement ‘It will rain tomorrow’ is probably true
•…
Come back to this later in the discussion of confidence levels
Bayesian or Subjective probability
P(A) is the degree of belief in A
A can be anything:
Rain, LHC completion, SUSY, ….
You bet depending on the odds: P vs 1−P
Bayes theorem
Often used in subjective probability discussions
Conditional probability P(A|B)
Thomas Bayes (1702–1761)
Bayes theorem
How does it work?
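In standard form, with the notation of this slide:

```latex
P(\mathrm{Theory} \mid \mathrm{Result})
  = \frac{P(\mathrm{Result} \mid \mathrm{Theory})\, P(\mathrm{Theory})}
         {P(\mathrm{Result})}
```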
Initial belief P(Theory) is modified by experimental results
If the Result is negative, P(Result|Theory) = 0, and the Theory is killed:
P(Theory|Result) = 0
This is an extreme case. We will come back to this later in the discussion of
confidence levels.
Fun with Bayes theorem - 1
Monty Hall problem
• There are 3 doors
• Behind one of these, there is a prize (a car, etc)
• Behind each of the other two, there is a goat (you lose)
• You choose one door, whichever you like (your bet), say Nr 1.
• A door will be opened to reveal a goat, either Nr 2 or Nr 3,
chosen randomly if there are goats behind both.
• Then you are asked whether you stay with Nr 1 or switch to the
remaining door (Nr 2 if Nr 3 was opened).
Should you stay or switch?
One would say:
you don’t know anyway whether the prize is
behind Nr 1 or Nr 2; they are equally probable,
so staying and switching give equal chances.
But the correct strategy is to switch
A ‘classical’ reasoning (count the number of cases)
Before the door is opened
Odds to win: stay 1/3, switch 2/3
After the door is opened
Using Bayes theorem
P(Ci) : the prize is behind door i; P(Ci) = 1/3 for each i
P(Ok) : door k is opened
We want to compare P(C1| O3) with P(C2| O3)
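With P(O3|C1) = 1/2, P(O3|C2) = 1 and P(O3|C3) = 0, Bayes' theorem gives P(C1|O3) = 1/3 and P(C2|O3) = 2/3, so switching doubles the odds. A toy Monte Carlo (a sketch; the function name is illustrative) cross-checks this:

```python
import random

def monty_hall(switch, trials=100_000, seed=2006):
    """Empirical win fraction for the stay or switch strategy."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)    # door hiding the prize
        choice = 0                  # we always pick door Nr 1 first
        # the host opens a goat door, neither our choice nor the prize,
        # chosen at random when two such doors are available
        opened = rng.choice([d for d in range(3) if d != choice and d != prize])
        if switch:
            choice = next(d for d in range(3) if d not in (choice, opened))
        wins += (choice == prize)
    return wins / trials

print("stay  :", monty_hall(switch=False))   # close to 1/3
print("switch:", monty_hall(switch=True))    # close to 2/3
```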
Exercise
A disease X (maybe AIDS, SARS, ….)
P(X) = 0.001
P(no X) = 0.999
(prior probability)
Consider a test for X:
P(+ | X) = 0.998
P(+ | no X) = 0.03
If the test result were +, how worried should you be?
i.e. what is P(X | +) ?
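A numerical check of the exercise (a sketch; the function name is illustrative):

```python
def posterior(prior, p_pos_given_x, p_pos_given_notx):
    """P(X | +) from Bayes' theorem, with the total-probability denominator."""
    evidence = p_pos_given_x * prior + p_pos_given_notx * (1 - prior)
    return p_pos_given_x * prior / evidence

p = posterior(0.001, 0.998, 0.03)
print(f"P(X | +) = {p:.4f}")  # about 0.03: a + result still leaves X unlikely
```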
http://home.cern.ch/kawamoto/lecture06.html