
METHODS FOR DUMMIES
BAYES FOR BEGINNERS
Any given Monday at 12.31 pm
“I’m sure this makes sense, but you lost me about here…”
Bayes for Beginners
• What can Bayes do for you?
• Conditional Probability
• What is Bayes’ Theorem?
• Bayes in SPM2
What can Bayes do for you?
• Problems with the classical statistical approach
– All inferences relate to disproving the null hypothesis
– We never fully reject H0; we can only say that the effect we see is unlikely to occur by chance
– Corrections for multiple comparisons are needed
– Very small effects can be declared significant with enough data
• Bayesian inference offers a solution
What can Bayes do for you?
• Classical
– ‘What is the likelihood of getting these data given that no activation occurred (b = 0)?’
– p(y|b)
• Bayesian
– ‘What is the chance of getting these parameters, given these data?’
– p(b|y)
• p(b|y) ≠ p(y|b)
Conditional Probability
• Last year you were at a conference in Japan
• You happened to notice that rather a lot of the professors smoke
• At one of the socials you met someone at the bar & had a few drinks
• The next morning you wake up & it dawns on you that you told the person you were talking to in the bar something rather indiscreet about your supervisor
• You remember that the person you were talking to kept stealing your cigarettes, and you start to worry that they might have been a professor (and therefore a friend of your supervisor)
• You decide to do a calculation to work out what the chances are that the person you were talking to is a professor, given that you know they are a smoker
• You phone the hotel reception & they give you the following information:
– 100 delegates
– 40 requested non-smoking rooms
– 10 professors
– 4 of the professors requested non-smoking rooms
Given that the person you were talking to last night was a smoker, what is the probability of them being a professor?
‘AND’ = multiply; ‘OR’ = add

Priors: p(P) = 0.1, p(P’) = 0.9
Likelihoods: p(S|P) = 0.6, p(S’|P) = 0.4, p(S|P’) = 0.6, p(S’|P’) = 0.4

Joint probabilities (‘AND’ = multiply):
p(S and P) = p(S|P)*p(P) = 0.6*0.1 = 0.06
p(S and P’) = p(S|P’)*p(P’) = 0.6*0.9 = 0.54

Evidence (‘OR’ = add):
p(S) = p(S and P) + p(S and P’) = 0.06 + 0.54 = 0.6

Posterior:
p(P|S) = p(P and S) / p(S) = p(S|P)*p(P) / p(S) = 0.06 / 0.6 = 0.1
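A minimal Python sketch of this calculation (variable names are illustrative, not from the slides):

```python
# Conference example: P = professor, S = smoker.
p_P = 0.1             # prior: 10 of the 100 delegates are professors
p_S_given_P = 0.6     # 6 of the 10 professors smoke
p_S_given_notP = 0.6  # 54 of the 90 non-professors smoke

# Evidence: total probability of meeting a smoker ('AND' = multiply, 'OR' = add).
p_S = p_S_given_P * p_P + p_S_given_notP * (1 - p_P)

# Bayes' theorem: posterior probability that the smoker is a professor.
p_P_given_S = p_S_given_P * p_P / p_S
print(p_P_given_S)  # ≈ 0.1
```

Here the posterior equals the prior: because professors and non-professors smoke at the same rate, knowing the person is a smoker tells you nothing extra.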
The following night you are introduced to a professor who you would very much like to work for after you have finished your PhD. You want to make a good impression. Given that this person is a professor, what are the chances that they are also a smoker, in which case offering them a cigarette won’t harm your career prospects?
Known quantities: p(P) = 0.1, p(P’) = 0.9, p(S|P) = 0.6, p(S’|P) = 0.4, p(S|P’) = 0.6, p(S’|P’) = 0.4

p(S|P) = p(S and P) / p(P) = 0.6*0.1 / 0.1 = 0.6
p(P|S) = p(S|P)*p(P) / p(S) = 0.06 / 0.6 = 0.1

p(P|S) ≠ p(S|P)
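The same joint probability gives both conditionals; dividing it by different marginals is what makes them differ. A quick check in Python (illustrative names):

```python
p_S_and_P = 0.06  # smoker AND professor
p_P = 0.1         # marginal: professor
p_S = 0.6         # marginal: smoker

p_S_given_P = p_S_and_P / p_P  # 0.6: most professors smoke
p_P_given_S = p_S_and_P / p_S  # 0.1: few smokers are professors
assert p_S_given_P != p_P_given_S
```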
What is Bayes’ Theorem?
• p(P|S) = p(S|P)*p(P) / p(S)
• Posterior = Likelihood * Prior / Evidence
• p(P|S)
– Your degree of belief in ‘P’, given the data ‘S’, depends on what the data tell you, p(S|P), and any prior information, p(P)
This year, the conference is held in New York, where smokers have to pay extra. This doesn’t deter the professors, but lots of the other participants decide to give up for the week! If you repeat your indiscretion this year, what are the chances of the smoker at the bar being a professor?
This year’s numbers: 100 participants; 80 non-smokers; 10 professors, 4 of them non-smokers.

‘AND’ = multiply; ‘OR’ = add

Priors: p(P) = 0.1, p(P’) = 0.9
Likelihoods: p(S|P) = 0.6, p(S’|P) = 0.4, p(S|P’) = 0.2, p(S’|P’) = 0.8

p(S and P) = p(S|P)*p(P) = 0.06
p(S and P’) = p(S|P’)*p(P’) = 0.18
p(S) = p(S and P) + p(S and P’) = 0.6*0.1 + 0.2*0.9 = 0.24
p(P|S) = p(S|P)*p(P) / p(S) = 0.06 / 0.24 = 0.25
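The shift from 0.1 to 0.25 comes entirely from the changed likelihood for non-professors. A small sketch wrapping the whole update in one function (hypothetical helper name):

```python
def posterior(prior_P, p_S_given_P, p_S_given_notP):
    """p(P|S) via Bayes' theorem, with the evidence expanded over P and not-P."""
    evidence = p_S_given_P * prior_P + p_S_given_notP * (1 - prior_P)
    return p_S_given_P * prior_P / evidence

print(posterior(0.1, 0.6, 0.6))  # Japan: 0.10 -- smoking is uninformative
print(posterior(0.1, 0.6, 0.2))  # New York: 0.25 -- smoking now points to professors
```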
Bayes in SPM2
• p(b|y) = p(y|b)*p(b) / p(y)
• p(b|y) ∝ p(y|b)*p(b)
• Posterior Probability Map (PPM)
– Posterior Distribution
– Likelihood Function (equivalent to normal SPM)
– Prior Probabilities of parameters
• PPM = SPM * priors
Bayes in SPM2
• Deciding on the priors
– Fully specified priors in DCM
– Estimating priors in PEB
• Computing the Posterior Distribution
• Making inferences
– Shrinkage priors
– Thresholds
Priors
• Everyone has prior beliefs about their data
• In the Bayesian framework, priors are formally specified and tested
Bayesian Inference
• Full Bayes
– Priors fully specified from previous empirical data, e.g. the biophysics of the haemodynamic response
• Empirical Bayes
– Mean & variance of the priors estimated from the data
– Hierarchical model: parameters from one level become the priors at the next level
– Between-voxel variance over all voxels used as the prior on the variance at each voxel
– PPMs: 1st level = within voxel of interest; 2nd level = between all brain voxels
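A minimal sketch of the empirical-Bayes idea above, with toy data and illustrative names: the spread of effect estimates across all voxels sets the prior, which then shrinks each voxel’s noisy estimate towards the global mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy effect estimates at 1000 "voxels" (illustrative only).
true_effects = rng.normal(0.0, 1.0, size=1000)
noise_sd = 2.0
estimates = true_effects + rng.normal(0.0, noise_sd, size=1000)

# Empirical Bayes: estimate the prior from the data themselves.
prior_mean = estimates.mean()
# Between-voxel variance, minus the known noise variance, approximates
# the variance of the true effects (floored to stay positive).
prior_var = max(estimates.var() - noise_sd**2, 1e-6)

# Precision-weighted shrinkage of each voxel towards the prior mean.
lam_d = 1.0 / noise_sd**2   # data precision
lam_p = 1.0 / prior_var     # prior precision
posterior_mean = (lam_d * estimates + lam_p * prior_mean) / (lam_d + lam_p)
```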
Computing the Posterior Probability Distribution

Model: y = w + e, w = μ + z

Likelihood: p(y|w) = N(M_d, λ_d⁻¹)
Prior: p(w) = N(M_p, λ_p⁻¹)
Posterior: p(w|y) ∝ p(y|w) * p(w) = N(M_post, λ_post⁻¹)

λ_post = λ_d + λ_p
M_post = (λ_d*M_d + λ_p*M_p) / λ_post

[Figure: prior N(M_p, λ_p⁻¹), likelihood N(M_d, λ_d⁻¹) and posterior N(M_post, λ_post⁻¹) plotted together; M_post lies between M_p and M_d]
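These two update rules are easy to check numerically; a sketch with illustrative numbers:

```python
M_d, lam_d = 2.0, 4.0  # data: mean and precision (1/variance)
M_p, lam_p = 0.0, 1.0  # prior: mean and precision

lam_post = lam_d + lam_p                         # precisions add
M_post = (lam_d * M_d + lam_p * M_p) / lam_post  # precision-weighted mean
print(M_post, 1 / lam_post)  # 1.6 0.2 -- posterior sits between prior and data
```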
The effects of different precisions
[Figure: four cases: λ_p = λ_d; λ_p < λ_d; λ_p > λ_d; λ_p ≈ 0 (flat prior)]
Multivariate Distributions
Shrinkage Priors
[Figure: posteriors for four voxels: small variable effect; large variable effect; small consistent effect; large consistent effect]
Reporting PPMs
• The posterior distribution describes the probability of getting an effect, given the data
• The posterior distribution is different for every voxel
• It carries both the size of the effect (mean) and its variability (precision)
• 2 steps
– Decide what size of effect is physiologically relevant
– Report each voxel where we are 95% certain that the effect size is greater than that threshold
• Special extension of Bayesian inference for PPMs
Thresholding
[Figure: the four posteriors (small/large, variable/consistent effects) with the threshold γ marked on each]

p(b > γ | y) = 0.95
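Because each voxel’s posterior is Gaussian, p(b > γ | y) is just a tail area; a minimal sketch with illustrative numbers:

```python
import math

def prob_exceeds(m_post, lam_post, gamma):
    """P(b > gamma) under the Gaussian posterior N(m_post, 1/lam_post)."""
    sd = math.sqrt(1.0 / lam_post)
    return 0.5 * math.erfc((gamma - m_post) / (sd * math.sqrt(2.0)))

# Report the voxel if we are at least 95% certain the effect exceeds gamma.
significant = prob_exceeds(1.6, 5.0, 0.5) >= 0.95
```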
Bayesian Inference in SPM2
• Bayesian Inference offers a solution
– ‘All inferences relate to disproving the null hypothesis’
• There is no null hypothesis
• Different hypotheses can be tested formally
– ‘Multiple comparisons & false positives’
• Voxel-wise inferences are independent
• P-values don’t change with search volume
• Use of shrinkage priors
– ‘Very small effects can be declared significant with enough data’
• Thresholding of effect size
With thanks to Klaas & Will