Intro to Graphical Models

Download Report

Transcript Intro to Graphical Models

Intro to Graphical Models
Dr. Marco Mesa-Frias
[email protected]
Aims of the course
› To familiarize you with graphical models and concepts
› To give you an overview of uses of graphical models
› To give you an idea of the techniques used in graphical
modeling
What Are Graphical Models?
Graph
Stat Model
M
Data
D ´ fX (1i ) ; X (2i ) ; :::; X m(i)gNi=1
Useful
Variables
Stat Model
Graph
Relationships
Formalize
What Graphical Models are useful for
› Allow structural inferences
› Simplify complex relationships
› Formalize relationships between variables
› Allow parametric inference
Communication
› Graphical models make communication easier
› Imagine you have a complex problem with many variables
› You want to explain to another person how they are
related
› You can write a list, or a paragraph describing this OR
› You can draw a graph
› It also help to understand statisticians
Reasoning under uncertainty
› Graphical models have become an extremely popular tool
for modeling uncertainty.
› They provide a principled approach to dealing with
uncertainty through the use of probability theory, and an
effective approach to coping with complexity through the
use of graph theory.
The Fundamental Questions to deal with
Uncertainty
Representation

How to capture/model uncertainties in possible worlds?

How to encode our domain knowledge/assumptions/constraints?
Inference

How do I answers questions/queries
X?9
according to my model and/or based
given data?
e.g.: P( X i | D )
X8
Learning

?
?
?
X
6
What model is "right"
X7
for my data?
X1
e.g.: M  arg max F (D ; M)
MM
X2
X3
X4
X5
Recap of Basic Prob. Concepts

Representation: what is the joint probability dist. on multiple
variables?
P( X 1 , X 2 , X 3, X 4 , X 5, X 6 , X 7 , X 8 )
A



How many state configurations in total? --- 28

Are they all needed to be represented?

Do we get any scientific/medical insight?
B
C
D
E
F
G
H
Learning: where do we get all this probabilities?

Maximal-likelihood estimation? but how many data do we need?

Are there other est. principles?

Where do we put domain knowledge in terms of plausible relationships between variables, and
plausible values of the probabilities?
Inference: If not all variables are observable, how to compute the
conditional distribution of latent variables given evidence?

Computing p(H|A) would require summing over all 26 configurations of the
unobserved variables
9
So What is a Graphical Model?
In a nutshell:
GM = Multivariate Statistics + Structure
Graphical modelling – Modern Statistical
Framework
Mathematics
Algorithms
Inference
11
1. Mathematics
Mathematics
Algorithms
Inference
12
Conditional independence
X and Z are conditionally independent given Y if, knowing
Y, discovering Z tells you nothing more about X: p(X|Y,Z)
= p(X|Y)
X
Y
Z
13
Conditional independence
as seen in data on perinatal mortality vs. antenatal care….
Clinic
Ante
A
less
more
B
Ante Survived
Survived
less
176 Died
373 293 20
more
316 197 6
less
more 23
Died % died
3% died
1.7
45.1 1.3
1.9 7.9
17
2
8.0
Does survival depend on antenatal care?
.... what if you know the clinic?
14
Conditional independence
survival
ante
clinic
survival and clinic are dependent
and ante and clinic are dependent
but survival and ante are CI given clinic
15
Graphical models
Use ideas from graph theory to
› represent structure of a joint probability distribution
› by encoding conditional independencies
C
D
F
B
E
A
16
C
D
F
B
A
E
Conditional independence provides a
mathematical basis for splitting up a large
system into smaller components
17
C
D
D
F
B
B
E
E
A
18
Two types of Graphical Models

Directed edges give causality relationships (Bayesian
Network or Directed Graphical Model):
ReceptorA
X1
Kinase C
X3
Receptor B
Kinase D
TF F
Gene G

X7
X4
X2
Kinase E
X5
X6
X8
Gene H
Undirected edges simply give correlations between variables
(Markov Random Field or Undirected Graphical model):
Receptor A
X1
Kinase C
X3
ReceptorB
Kinase D
TF F
Gene G
X7
X4
Kinase E
X6
Gene H
X2
X8
X5
2. Algorithms
Mathematics
Algorithms
Inference
20
Algorithms for probability and likelihood
calculations
Exploiting graphical structure:
› Markov chain Monte Carlo
› Bayesian Networks (probability propagation)
Graph representation used in user interface,
data structures and in controlling
computation
21
Markov chain Monte Carlo
?
Updating
?
- need only look at neighbours
22
Probability propagation (Bayes
Nets)
A
P(A)
A
B
P(B|A)
false
0.6
false
false
0.01
true
0.4
false
true
0.99
true
false
0.7
true
true
0.3
B
C
P(C|B)
false
false
0.4
false
true
0.6
true
false
0.9
true
true
0.1
Each node Xi has a conditional
probability distribution P(Xi |
Parents(Xi)) that quantifies the
effect of the parents on the node
The parameters are the
probabilities in these conditional
probability tables (CPTs)
A
B
C
D
B
D
P(D|B)
false
false
0.02
false
true
0.98
true
false
0.05
true
true
0.95
3. Inference
Mathematics
Algorithms
Inference
24
Bayesian
25
or nonBayesian
26
Bayesian paradigm in structured modelling
› automatically integrates out all sources of uncertainty
› properly accounting for variability at all levels
› including, in principle, uncertainty in model itself
› avoids over-optimistic claims of certainty
27
Rational Statistical Inference
The Bayes Theorem:
Likelihood
Posterior
probability
p(h | d ) 
Prior
probability
p(d | h) p(h)
 p(d | h)p(h)
hH
Sum over space
of hypotheses

This allows us to capture uncertainty about the model in a principled way
Probabilistic Graphical Models

If Xi's are conditionally independent (as described by a PGM), the
joint can be factored to a product of simpler terms, e.g.,
A
B
X1
X3
C
D
F
X7
G

X2
E
X4
X6
H
P(X1, X2, X3, X4, X5, X6, X7, X8)
X5
= P(X1) P(X2) P(X3| X1) P(X4| X2) P(X5| X2)
P(X6| X3, X4) P(X7| X6) P(X8| X5, X6)
X8
Why we may favor a PGM?

Incorporation of domain knowledge and causal (logical) structures
an 8-fold reduction from 28 in representation cost !

Bayesian Philosophy

Knowledge meets data




Application of Graphical Models

Machine Learning

Computational statistics

Epidemiology

Computer vision and graphics

Natural language processing

Informational retrieval

Robotic control

Decision making under uncertainty

Error-control codes

Computational biology

Genetics and medical diagnosis/prognosis

Finance and economics

Etc.
THANK YOU…
THANK YOU…..
32
Title and Content Layout with Chart
Category 4
Category 3
Category 2
Category 1
0
2
4
6
Series 1
Series 2
8
Series 3
10
12
14
Two Content Layout with Table
› First bullet point here
Group A
Group B
Class 1
82
95
› Second bullet point here
Class 2
76
88
› Third bullet point here
Class 3
84
90