A Bayesian Approach


Jan Štochl, Ph.D.
Department of Psychiatry
University of Cambridge
Comparison of maximum likelihood and Bayesian
estimation of the Rasch model: What do we gain by using
the Bayesian approach?
Comparison of results from the General Health
Questionnaire
Content of the presentation
• Brief introduction to the concept of Bayesian statistics
• Using R and WinBUGS for estimation of the Bayesian Rasch model
• Analysis and comparison of both methodologies in the General Health Questionnaire
General ideas and introduction to
Bayesian statistics
A bit of theory...
What is Bayesian statistics?
• It is an alternative to classical statistical inference (classical
statisticians are called "frequentists")
• Bayesians view probability as a statement of uncertainty. In other
words, probability can be defined as the degree to which a person (or
community) believes that a proposition is true.
• This uncertainty is subjective (it differs across researchers)
Bayesians versus frequentists
• A frequentist is a person whose long-run ambition is to be
wrong 5% of the time
• A Bayesian is one who, vaguely expecting a horse, and catching
a glimpse of a donkey, strongly believes he has seen a mule
Bayes' theorem and modeling
• Our situation: fit the model to the observed data
• Models give the probability of obtaining the data, given some
parameters: P(X|θ)
• This is called the likelihood
• We want to use this to learn about the parameters
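As a concrete instance of a likelihood, the sketch below evaluates P(X|θ) for a single respondent under the Rasch model. It assumes the standard logistic form P(X_i = 1) = exp(θ − δ_i) / (1 + exp(θ − δ_i)), which the slides do not spell out; the function names, item difficulties and response pattern are illustrative and are not taken from the GHQ data.

```python
import math

def rasch_prob(theta, delta):
    """Probability of a positive (1) response under the Rasch model (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def likelihood(responses, theta, deltas):
    """P(X | theta): product over items, assuming local independence."""
    result = 1.0
    for x, delta in zip(responses, deltas):
        p = rasch_prob(theta, delta)
        result *= p if x == 1 else (1.0 - p)
    return result

# Illustrative values only (hypothetical, not the GHQ data)
deltas = [-1.0, 0.0, 1.0]   # item difficulties
responses = [1, 1, 0]       # one respondent's answers

for theta in (-2.0, 0.0, 0.5, 2.0):
    value = likelihood(responses, theta, deltas)
    print(f"theta = {theta:+.1f}  likelihood = {value:.4f}")
```

The same data are more or less probable depending on θ, and this is the information used to learn about the parameters.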
Inference
• We observe some data, X, and want to make inferences
about the parameters from the data
– i.e. find out about P(θ|X)
• We have a model, which gives us the likelihood P(X|θ)
• We need to use P(X|θ) to find P(θ|X)
– i.e. to invert the probability
Bayes' theorem
• Published in 1763
• Allows us to go from P(X|θ) to P(θ|X):

$$P(\theta \mid X) = \frac{P(\theta)\, P(X \mid \theta)}{P(X)}$$

• P(θ|X): posterior distribution
• P(θ): prior distribution of the parameters
• P(X): it's a constant!

$$P(X) = \int P(X \mid \theta)\, P(\theta)\, d\theta$$
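Bayes' theorem can be checked numerically on a grid of θ values. The sketch below is a minimal illustration under stated assumptions: θ is one person's ability, the likelihood is a Rasch likelihood with known (made-up) item difficulties, the prior is standard normal, and the integral for P(X) is approximated by a Riemann sum. None of these choices come from the slides.

```python
import math

def rasch_prob(theta, delta):
    """Assumed Rasch form: logistic function of (theta - delta)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Grid over the parameter theta
step = 0.01
grid = [-6.0 + step * i for i in range(1201)]

deltas = [-1.0, 0.0, 1.0]   # hypothetical item difficulties, treated as known
responses = [1, 1, 0]       # hypothetical response pattern X

prior = [normal_pdf(t, 0.0, 1.0) for t in grid]           # P(theta)

lik = []                                                  # P(X | theta)
for t in grid:
    value = 1.0
    for x, d in zip(responses, deltas):
        p = rasch_prob(t, d)
        value *= p if x == 1 else 1.0 - p
    lik.append(value)

unnormalised = [pr * li for pr, li in zip(prior, lik)]
evidence = sum(unnormalised) * step                       # P(X), the constant
posterior = [u / evidence for u in unnormalised]          # P(theta | X)

post_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(f"P(X) = {evidence:.4f}, posterior mean of theta = {post_mean:.3f}")
```

Dividing by P(X) only rescales the curve so that it integrates to 1; the shape of the posterior is fully determined by prior times likelihood.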
Bayes' theorem and adding more data
• Suppose we observe some data, X1, and get a posterior
distribution:

$$P(\theta \mid X_1) \propto P(\theta)\, P(X_1 \mid \theta)$$

• What if we later observe more data, X2? If this is independent of
X1 given θ, then

$$P(X_1, X_2 \mid \theta) = P(X_1 \mid \theta)\, P(X_2 \mid \theta)$$

so that

$$P(\theta \mid X_1, X_2) \propto P(\theta)\, P(X_1 \mid \theta)\, P(X_2 \mid \theta)$$

i.e. the first posterior is used as the prior to get the second posterior
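The equivalence of sequential and one-step updating can be verified numerically. The sketch below assumes a deliberately simple setting that is not from the slides: θ is the success probability of independent 0/1 observations, evaluated on a discrete grid with a flat prior. The data and helper names are illustrative.

```python
def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def bernoulli_likelihood(data, theta):
    """P(data | theta) for independent 0/1 observations."""
    out = 1.0
    for x in data:
        out *= theta if x == 1 else 1.0 - theta
    return out

def update(prior, data, grid):
    """One application of Bayes' theorem on a discrete grid."""
    return normalise([p * bernoulli_likelihood(data, t) for p, t in zip(prior, grid)])

grid = [(i + 0.5) / 200 for i in range(200)]   # candidate values of theta in (0, 1)
flat_prior = normalise([1.0] * len(grid))

x1 = [1, 0, 0, 1, 0]   # first batch (hypothetical)
x2 = [0, 0, 1, 0]      # second batch (hypothetical)

posterior_1 = update(flat_prior, x1, grid)     # P(theta | X1)
sequential = update(posterior_1, x2, grid)     # posterior_1 acts as the prior
batch = update(flat_prior, x1 + x2, grid)      # P(theta | X1, X2) in one step

max_diff = max(abs(a - b) for a, b in zip(sequential, batch))
print(f"largest difference between the two posteriors: {max_diff:.2e}")
```

The two posteriors agree up to floating-point rounding, which is what the factorisation of the likelihood implies.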
Features of the Bayesian approach
• Flexibility to incorporate your expert opinion on the parameters
• Although the concept is easy to understand, the posterior is not easy to
compute. Fortunately, MCMC (Markov chain Monte Carlo) methods have been developed
• Finding a prior distribution can be difficult
• Misspecification of priors can be dangerous
• The less data you have, the greater the influence of the priors
• The more informative the priors, the more they influence the final
estimates
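The last two points can be made concrete with the simplest conjugate case: a normal prior on a mean, with normally distributed data of known SD. This is an illustrative stand-in, not the Rasch model of the talk, and all numbers are made up. In this case the posterior mean is a precision-weighted average of the prior mean and the sample mean.

```python
def posterior_normal(prior_mean, prior_sd, sample_mean, sigma, n):
    """Posterior mean and SD of a normal mean when the data SD `sigma` is known."""
    prior_prec = 1.0 / prior_sd ** 2      # weight of the prior
    data_prec = n / sigma ** 2            # weight of the data grows with n
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, post_prec ** -0.5

sample_mean, sigma = 2.0, 1.0             # hypothetical data summary
for prior_sd in (10.0, 1.0, 0.3):         # vague -> informative prior centred on 0
    for n in (5, 50, 500):
        mean, sd = posterior_normal(0.0, prior_sd, sample_mean, sigma, n)
        print(f"prior SD = {prior_sd:4.1f}  n = {n:3d}  "
              f"posterior mean = {mean:.3f}  SD = {sd:.3f}")
```

With a vague prior the posterior mean stays close to the sample mean at every sample size, whereas a tight prior pulls it toward the prior mean, most strongly when n is small.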
When to use the Bayesian approach?
• When the sample size is small
• When the researcher has knowledge about the parameter values (e.g.
from previous research)
• When there are lots of missing data
• When some respondents have too few responses to estimate their ability
• Can be useful for test equating
• Item banking
OpenBUGS
• Can handle many types of data (including polytomous)
• Can handle many types of models (SEM, IRT, multilevel, ...)
• The model can be specified either in a syntax language or through a
special graphical interface (doodles)
• Provides standard errors of the estimates
• Provides fit statistics (Bayesian ones)
• Can be used remotely from R (packages "R2WinBUGS", "R2OpenBUGS",
"BRugs", "rbugs", ...)
• Results from OpenBUGS can be exported to R and analyzed further
(packages "coda", "boa")
Practical comparison of maximum likelihood
and Bayesian estimation of the Rasch model
General Health Questionnaire, items 1-7
General Health Questionnaire (GHQ)
• 28 items, scored dichotomously (0 and 1), 4 unidimensional
subscales (7 items each)
• Only one subscale is analyzed (items 1-7)
• The Rasch model is used; maximum likelihood estimates are
obtained in R (package "ltm"), Bayesian estimates in OpenBUGS
(and analyzed in R)
• 2 runs in OpenBUGS:
– the first with vague (uninformative) priors for the difficulty
parameters (normal distribution with mean = 0 and sd = 10)
– the second with a mix of informative and uninformative priors
for the difficulty parameters (to demonstrate the influence of priors)
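One practical detail when writing such priors: WinBUGS and OpenBUGS parameterise the normal distribution as dnorm(mean, precision), where precision = 1 / sd². The small sketch below converts the prior SDs that appear in this talk into precisions; the helper name is illustrative, and the BUGS model code actually used for the analysis is not shown in the slides.

```python
def sd_to_precision(sd):
    """Convert a standard deviation to a precision (1 / variance)."""
    return 1.0 / sd ** 2

# Prior SDs that appear in the talk (vague run: 10; informative run: the others)
for sd in (10.0, 3.16, 1.00, 0.32, 31.62):
    print(f"sd = {sd:6.2f}  ->  precision = {sd_to_precision(sd):.4f}")
```

Under this convention the vague prior with sd = 10 would be written with precision 0.01, and the less round SDs (3.16, 0.32, 31.62) are close to the square roots of 10, 0.1 and 1000.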
Item fit of the Rasch (1PLM) model and the Mokken model

Rasch model:
Item    Difficulty  Discrimination  Chi-square  p-value
GHQ15   1.72        3.57            30.02       <0.0001
GHQ16   1.23        3.57            47.35       <0.0001
GHQ17   1.26        3.57            104.02      <0.0001
GHQ18   1.37        3.57            50.11       <0.0001
GHQ19   1.82        3.57            13.50       0.02
GHQ20   1.51        3.57            18.47       0.00
GHQ21   1.95        3.57            6.00        0.31

Mokken model:
Item    ItemH  # vi (monotonicity)  # z  # vi (intersection)  # t
GHQ15   0.57   0                    0    0                    0
GHQ16   0.63   0                    0    0                    0
GHQ17   0.55   0                    0    0                    0
GHQ18   0.55   0                    0    0                    0
GHQ19   0.59   0                    0    0                    0
GHQ20   0.66   0                    0    0                    0
GHQ21   0.67   0                    0    0                    0
Snapshots
[Figure: trace plots (x-axis: iterations) and posterior density plots for delta[1] to delta[4]; each density is based on N = 7000 draws]
Recovery of difficulty parameters

        Maximum likelihood      Bayesian - vague priors                          Bayesian - informative priors
Item    difficulty  std. error  prior mean  prior SD  difficulty  std. error    prior mean  prior SD  difficulty  std. error
GHQ01   2.367       0.1097      0           10        2.369       0.1102        2.0         3.16      2.325       0.1088
GHQ02   2.293       0.1076      0           10        2.295       0.1071        1.2         1.00      2.239       0.1070
GHQ03   2.324       0.1085      0           10        2.327       0.1087        1.8         10.00     2.283       0.1077
GHQ04   2.958       0.1304      0           10        2.962       0.1307        0.0         10.00     2.914       0.1300
GHQ05   4.108       0.1970      0           10        4.120       0.1976        0.0         0.32      3.220       0.1306
GHQ06   4.108       0.1970      0           10        4.122       0.1988        3.0         1.00      4.027       0.1889
GHQ07   3.813       0.1757      0           10        3.820       0.1766        0.0         31.62     3.770       0.1746
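The pull of a tight prior, as seen for GHQ05 (prior mean 0, prior SD 0.32), can be reproduced qualitatively with a toy simulation. The sketch below is not the OpenBUGS analysis from the talk. It assumes simulated data, a single item, person abilities treated as known, and a hand-written random-walk Metropolis sampler; all names and settings are illustrative.

```python
import math
import random

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))

def log_posterior(delta, thetas, responses, prior_mean, prior_sd):
    """Log normal prior + log Rasch likelihood for one difficulty (up to a constant)."""
    lp = -0.5 * ((delta - prior_mean) / prior_sd) ** 2
    for theta, x in zip(thetas, responses):
        z = theta - delta
        lp += log_sigmoid(z) if x == 1 else log_sigmoid(-z)
    return lp

def metropolis(thetas, responses, prior_mean, prior_sd,
               n_iter=3000, burn_in=1000, step=0.3, seed=1):
    """Random-walk Metropolis sampler; returns the draws kept after burn-in."""
    rng = random.Random(seed)
    delta = 0.0
    current = log_posterior(delta, thetas, responses, prior_mean, prior_sd)
    draws = []
    for i in range(n_iter):
        proposal = delta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, thetas, responses, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            delta, current = proposal, candidate
        if i >= burn_in:
            draws.append(delta)
    return draws

def summarise(draws):
    mean = sum(draws) / len(draws)
    sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / (len(draws) - 1))
    return mean, sd

# Simulated data: 200 respondents with known abilities, true item difficulty 2.0
data_rng = random.Random(0)
thetas = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
responses = [1 if data_rng.random() < math.exp(log_sigmoid(t - 2.0)) else 0
             for t in thetas]

vague = summarise(metropolis(thetas, responses, prior_mean=0.0, prior_sd=10.0))
tight = summarise(metropolis(thetas, responses, prior_mean=0.0, prior_sd=0.32))
print(f"vague prior N(0, 10^2):   mean = {vague[0]:.3f}, SD = {vague[1]:.3f}")
print(f"tight prior N(0, 0.32^2): mean = {tight[0]:.3f}, SD = {tight[1]:.3f}")
```

Under the vague prior the posterior mean should sit near the value favoured by the data, while the tight prior centred on 0 drags it downward, in the same direction as the GHQ05 row. The magnitudes are not comparable with the table, because the real analysis estimates abilities and all seven difficulties jointly.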
Further reading and software
General literature on Bayesian IRT analysis
• Congdon, P. (2006). Bayesian Statistical Modelling, 2nd edition. Wiley.
• Congdon, P. (2005). Bayesian Models for Categorical Data. Wiley.
• Congdon, P. (2003). Applied Bayesian Modelling. Wiley.
• WinBUGS User Manual (available online): http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/manual14.pdf
• WinBUGS discussion archive: http://www.jiscmail.ac.uk/lists/bugs.html
• Lee, S.Y. (2007). Structural Equation Modelling: A Bayesian Approach. Wiley.
• Iversen, G. R. (1984). Bayesian Statistical Inference. Sage.
Available software
• WinBUGS, OpenBUGS, JAGS (freely available)
• R (freely available), package "mokken"
• Mplus (commercial)