Introduction to Probabilistic Reasoning and Bayesian Networks
Hongtao Du
Group Presentation
Outline
Uncertain Reasoning
Probabilistic Reasoning
Bayesian Network (BN)
Dynamic Bayesian Network (DBN)
Reasoning
The activity of guessing the state of the
domain from prior knowledge and
observations.
Causal reasoning
Diagnostic reasoning
Combinations of these two
Uncertain Reasoning (Guessing)
Some aspects of the domain are often
unobservable and must be estimated
indirectly through other observations.
The relationships among domain events
are often uncertain, particularly the
relationship between the observables
and non-observables.
The observations themselves may be
unreliable.
Even when events are observable, we often
lack the resources to observe all of the
relevant ones.
Even when the relations among events are
certain, it is often impractical to analyze
all of them.
Probabilistic Reasoning
A methodology founded on Bayesian
probability theory.
Events and objects in the real world are
represented by random variables.
Probabilistic models:
Bayesian reasoning
Evidence theory
Robust statistics
Recursive operators
Graphical Model
A tool that visually illustrates conditional
independence among variables in a
given problem.
Consists of nodes (random variables
or states) and edges (connecting two
nodes, directed or undirected).
The lack of an edge represents conditional
independence between variables.
Chain, Path, Cycle, Directed Acyclic
Graph (DAG), Parents and Children
Bayesian Network (BN)
Also known as a probabilistic network,
belief network, or causal network.
A specific type of graphical model that is
represented as a Directed Acyclic Graph.
A BN consists of:
A set of variables (nodes) V = {1, 2, …, k}
A set of dependencies (edges) D
A set of probability distribution functions
(pdfs) P, one for each variable
Assumptions
$0 \le P(X) \le 1$
$P(X) = 1$ if and only if X is certain
If X and Y are mutually exclusive, then
$P(X \vee Y) = P(X) + P(Y)$
Joint probability: $P(X, Y) = P(X \mid Y) \, P(Y)$
Bayes' rule:
$P(X \mid Y) = \frac{P(Y \mid X) \, P(X)}{P(Y)}$
X represents the hypothesis
Y represents the evidence
$P(Y \mid X)$ is the likelihood
$P(X \mid Y)$ is the posterior probability
If X and Y are conditionally independent given Z, then
$P(X \mid Z, Y) = P(X \mid Z)$
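As a worked numeric illustration of Bayes' rule (the prior and likelihood values below are assumptions chosen for the example, not taken from the slides):

```python
# Minimal sketch of Bayes' rule with illustrative (assumed) numbers:
# X = hypothesis ("disease present"), Y = evidence ("test positive").
p_x = 0.01            # prior P(X)
p_y_given_x = 0.95    # likelihood P(Y|X)
p_y_given_not_x = 0.05

# Total probability: P(Y) = P(Y|X) P(X) + P(Y|~X) P(~X)
p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)

# Posterior: P(X|Y) = P(Y|X) P(X) / P(Y)
p_x_given_y = p_y_given_x * p_x / p_y
print(f"P(X|Y) = {p_x_given_y:.3f}")  # ~0.161: the positive test alone is weak evidence
```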
Given some evidence, a BN operates by
propagating beliefs throughout the network.
For example, for a chain Z → Y → U → V:
$P(Z, Y, U, V) = P(Z) \, P(Y \mid Z) \, P(U \mid Y) \, P(V \mid U)$
In general,
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
where $Pa(X_i)$ denotes the parents of node $X_i$.
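A minimal sketch of evaluating this factorization for the chain above, with illustrative (assumed) CPT values:

```python
# Chain Z -> Y -> U -> V with binary variables; all CPT entries are
# illustrative assumptions, keyed by the value of the conditioning parent.
p_z = {True: 0.3, False: 0.7}                   # P(Z)
p_y_given_z = {True: {True: 0.8, False: 0.2},   # P(Y|Z), outer key = Z
               False: {True: 0.1, False: 0.9}}
p_u_given_y = {True: {True: 0.6, False: 0.4},   # P(U|Y)
               False: {True: 0.3, False: 0.7}}
p_v_given_u = {True: {True: 0.9, False: 0.1},   # P(V|U)
               False: {True: 0.2, False: 0.8}}

def joint(z, y, u, v):
    # P(Z, Y, U, V) = P(Z) P(Y|Z) P(U|Y) P(V|U)
    return p_z[z] * p_y_given_z[z][y] * p_u_given_y[y][u] * p_v_given_u[u][v]

print(joint(True, True, False, True))  # P(Z=1, Y=1, U=0, V=1) = 0.0192
```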
Explaining away
If a node is observed, its parents become
dependent.
Two causes (parents) compete to explain the
observed data (child).
Tasks in Bayesian Network
Inference
Learning
Inference
Inference is the task of computing the
probability of each state of a node in a
BN when other variables are known.
Method: dividing the set of BN nodes into
non-overlapping subsets of conditionally
independent nodes.
Example
$Z_L = X_N \cup Y_M$, where $L = N + M$
$X_N = \{x_0, \ldots, x_{N-1}\}$, $Y_M = \{y_0, \ldots, y_{M-1}\}$
$U_K \subseteq Z_L$: the set of query nodes
Given that $Y_M$ is observed.
Goal: find the conditional pdf $P(U_K \mid Y)$ over $U_K$.
Case 1: $U_K \subseteq Y$ (every query node is observed)
$P(U_K \mid Y) = \prod_{i=1}^{K} \delta(u_i - y_i)$
where
$\delta(x) = \begin{cases} 1 & x = 0 \\ 0 & \text{otherwise} \end{cases}$
Case 2: $U_K \subseteq X$ (the query nodes are hidden)
$P(U_K \mid Y) = \frac{P(U_K, Y)}{P(Y)} = \frac{P(U_K, Y)}{\sum_{U_K} P(U_K, Y)}$
where the joint is obtained by marginalizing out the remaining hidden nodes:
$P(U_K, Y) = \sum_{x \in X \setminus U_K} P(x, U_K, Y)$
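A minimal sketch of Case 2 on the assumed chain Z → Y → U → V from earlier: the query $P(U \mid V = \text{true})$ is answered by summing the joint over the nuisance variables Z and Y, then normalizing:

```python
from itertools import product

# Same illustrative chain CPTs as before (assumed values).
p_z = {True: 0.3, False: 0.7}
p_y_given_z = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}
p_u_given_y = {True: {True: 0.6, False: 0.4}, False: {True: 0.3, False: 0.7}}
p_v_given_u = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}

def joint(z, y, u, v):
    return p_z[z] * p_y_given_z[z][y] * p_u_given_y[y][u] * p_v_given_u[u][v]

v_obs = True
# P(U, V=v_obs): marginalize the joint over the nuisance variables Z, Y.
p_u_and_v = {u: sum(joint(z, y, u, v_obs)
                    for z, y in product([True, False], repeat=2))
             for u in [True, False]}
p_v = sum(p_u_and_v.values())                           # P(V = v_obs)
posterior = {u: p / p_v for u, p in p_u_and_v.items()}  # P(U | V = v_obs)
print(posterior)
```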
Learning
Goal: filling in the missing beliefs in
the network.
Adjusting the parameters of the
Bayesian network so that the pdfs
defined by the network sufficiently
describe the statistical behavior of the
observed data.
M: a BN model
$\Theta$: the parameters of the probability distributions
$Z_L$: observed data
Goal: estimate $\Theta$ so as to maximize the posterior
probability $P(M \mid Z_L)$:
$P(M \mid Z_L) = \frac{P(M)}{P(Z_L)} \int P(Z_L \mid \Theta, M) \, P(\Theta \mid M) \, d\Theta$
Assume $P(\Theta \mid M, Z_L)$ is highly peaked
around the maximum-likelihood estimate $\Theta_{ML}$; then
$P(M \mid Z_L) \approx \frac{P(M)}{P(Z_L)} \, P(Z_L \mid \Theta_{ML}, M) \, P(\Theta_{ML} \mid M)$
$\Theta_{ML} = \arg\max_{\Theta} \log P(Z_L \mid \Theta)$
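A minimal sketch of the maximum-likelihood step for a single CPT: with fully observed data, $\Theta_{ML}$ reduces to relative frequencies (the samples below are assumptions for illustration):

```python
from collections import Counter

# Fully observed (z, y) samples for an edge Z -> Y; illustrative data.
data = [(True, True), (True, True), (True, False),
        (False, False), (False, False), (False, True)]

counts = Counter(data)                   # N(Z=z, Y=y)
z_counts = Counter(z for z, _ in data)   # N(Z=z)

# ML estimate of the CPT entry: P(Y=y | Z=z) = N(z, y) / N(z)
p_y_given_z = {(z, y): n / z_counts[z] for (z, y), n in counts.items()}
print(p_y_given_z)  # e.g. P(Y=True | Z=True) = 2/3
```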
Dynamic Bayesian Network (DBN)
A Bayesian network extended over time to represent
temporal dependencies.
Dynamically changing or evolving over time.
A directed graphical model of stochastic
processes.
Aimed especially at time-series modeling.
Satisfies the Markovian condition:
The state of the system at time t depends only on its immediate
past state at time t-1.
Representation
[Figure: the network unrolled over time slices t_1, t_2, …, t_k]
The transition matrix that represents these time
dependencies is called the Conditional Probability
Table (CPT).
Description
T: time boundary we are investigating
$Y = \{y_0, \ldots, y_{T-1}\}$: observable variables
$X = \{x_0, \ldots, x_{T-1}\}$: hidden-state variables
$P(X, Y) = \prod_{t=1}^{T-1} P(x_t \mid x_{t-1}) \, \prod_{t=1}^{T-1} P(y_t \mid x_t) \, P(x_0)$
$P(x_t \mid x_{t-1})$: state transition pdfs, specifying the time
dependencies between states.
$P(y_t \mid x_t)$: observation pdfs, specifying the dependence of the
observation nodes on the other nodes within time slice t.
$P(x_0)$: the initial state distribution.
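A minimal sketch of evaluating this joint factorization for a two-state hidden chain with binary observations (an HMM-style DBN; all parameter values are illustrative assumptions):

```python
import numpy as np

A  = np.array([[0.7, 0.3],   # A[i, j] = P(x_t = j | x_{t-1} = i)
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],   # B[i, k] = P(y_t = k | x_t = i)
               [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution P(x_0)

def joint_prob(x, y):
    """P(X, Y) = P(x_0) * prod_t P(x_t | x_{t-1}) * prod_t P(y_t | x_t),
    including the emission at t = 0 (the usual HMM convention)."""
    p = pi[x[0]] * B[x[0], y[0]]
    for t in range(1, len(x)):
        p *= A[x[t - 1], x[t]] * B[x[t], y[t]]
    return p

print(joint_prob([0, 0, 1], [0, 0, 1]))  # one particular (X, Y) assignment
```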
Tasks in DBN
Inference
Decoding
Learning
Pruning
Inference
Estimating the pdf of the unknown states
from the given observations and initial
probability distributions.
Goal: finding
$P(X_0^{T-1} \mid Y_0^{T-1})$
$Y_0^{T-1} = \{y_0, \ldots, y_{T-1}\}$: a finite set of T consecutive observations
$X_0^{T-1} = \{x_0, \ldots, x_{T-1}\}$: the set of corresponding hidden variables
[Figure: unrolled DBN, hidden states $x_t$ with an observation $y_t$ at each time slice]
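One standard recursive way to carry out this inference is the forward algorithm; a minimal filtering sketch on the assumed two-state model from above:

```python
import numpy as np

# Same illustrative HMM-style parameters assumed earlier.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])   # P(x_t | x_{t-1})
B  = np.array([[0.9, 0.1], [0.2, 0.8]])   # P(y_t | x_t)
pi = np.array([0.5, 0.5])                 # P(x_0)

def forward(y):
    """Return P(x_{T-1} | y_0..y_{T-1}) via the normalized forward recursion."""
    alpha = pi * B[:, y[0]]               # incorporate the first observation
    alpha /= alpha.sum()
    for t in range(1, len(y)):
        alpha = (alpha @ A) * B[:, y[t]]  # predict one step, weight by evidence
        alpha /= alpha.sum()              # normalize back to a posterior
    return alpha

print(forward([0, 0, 1]))  # posterior over the final hidden state
```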
Decoding
Finding the best-fitting probability
values for the hidden states that have
generated the known observations.
Goal: determine the sequence of hidden
states with the highest probability.
$\hat{X}_0^{T-1} = \arg\max_{X_0^{T-1}} P(X_0^{T-1} \mid Y_0^{T-1})$
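The standard dynamic-programming solution to this argmax is the Viterbi recursion; a minimal sketch, reusing the assumed two-state parameters:

```python
import numpy as np

A  = np.array([[0.7, 0.3], [0.4, 0.6]])   # P(x_t | x_{t-1})
B  = np.array([[0.9, 0.1], [0.2, 0.8]])   # P(y_t | x_t)
pi = np.array([0.5, 0.5])                 # P(x_0)

def viterbi(y):
    T, S = len(y), len(pi)
    delta = np.log(pi) + np.log(B[:, y[0]])  # log-prob of the best path so far
    back = np.zeros((T, S), dtype=int)       # backpointers
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)  # scores[i, j]: best path ending i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, y[t]])
    path = [int(delta.argmax())]             # best final state
    for t in range(T - 1, 0, -1):            # follow backpointers to t = 0
        path.append(int(back[t, path[-1]]))
    return path[::-1]

print(viterbi([0, 0, 1]))  # most probable hidden-state sequence
```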
Learning
Given a number of observations,
estimating the parameters of the DBN
that best fit the observed data.
Goal: Maximizing the joint probability
distribution.
$\log P(X_0^{T-1}, Y_0^{T-1} \mid \Theta) = \sum_{t=1}^{T-1} \log P(x_t \mid x_{t-1}) + \sum_{t=1}^{T-1} \log P(y_t \mid x_t) + \log P(x_0)$
$\Theta$: the model parameter vector
The estimate is obtained by setting the derivative with respect to $\Theta$ to zero:
$\frac{\partial}{\partial \Theta} \left[ \log P(X_0^{T-1}, Y_0^{T-1} \mid \Theta) \right] = 0$
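When the hidden states happen to be observed in the training data, the log-likelihood above decomposes term by term and the maximization reduces to counting; a minimal sketch under that assumption (the training sequences are illustrative):

```python
import numpy as np

# Illustrative fully observed training sequences: 2 states, 2 symbols.
state_seqs = [[0, 0, 1, 1], [0, 1, 1, 0]]
obs_seqs   = [[0, 0, 1, 1], [0, 1, 0, 0]]

S, K = 2, 2
A = np.zeros((S, S))   # transition counts N(x_{t-1} = i, x_t = j)
B = np.zeros((S, K))   # emission counts  N(x_t = i, y_t = k)
for xs, ys in zip(state_seqs, obs_seqs):
    for t in range(1, len(xs)):
        A[xs[t - 1], xs[t]] += 1
    for x, y in zip(xs, ys):
        B[x, y] += 1

A /= A.sum(axis=1, keepdims=True)  # ML estimate of P(x_t | x_{t-1})
B /= B.sum(axis=1, keepdims=True)  # ML estimate of P(y_t | x_t)
print(A, B, sep="\n")
```

When the states are hidden, the same objective is typically maximized iteratively, e.g. with expectation-maximization.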
Pruning
An important but difficult task in DBNs.
Identifying which nodes are important for
inference, and removing the unimportant
ones.
Actions:
Deleting states from a particular node
Removing the connection between nodes
Removing a node from the network
At time slice t:
$W_1(t), \ldots, W_q(t)$: designated world nodes, a subset
of the nodes, representing the part of the network we want to
inspect.
$V(t_k)$, $t_k < t$: nodes in earlier time slices.
If the state of $W_i(t)$ is known, i.e. $P(W_i(t) = s_i) = 1$,
then the nodes $V(t_k)$ are no longer relevant to the overall
goal of the inference.
Thus: (1) delete all nodes $V(t_k)$,
and (2) incorporate the knowledge that $W_i(t) = s_i$.
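A minimal sketch of this pruning rule on a toy graph (the adjacency representation and node naming are assumptions for illustration):

```python
# Nodes are (name, time_slice) pairs; edges go from a node to its children.
network = {
    ("V", 0): [("W", 1)],
    ("V", 1): [("W", 1)],
    ("W", 1): [("V", 2)],
    ("V", 2): [],
}
evidence = {("W", 1): "s1"}   # the world node W(1) is known: P(W(1) = s1) = 1

def prune(network, evidence):
    pruned = dict(network)
    for (_, t_obs) in evidence:
        # (1) delete all nodes V(t_k) from earlier slices (t_k < t_obs) ...
        for v in [v for v in pruned if v[1] < t_obs]:
            del pruned[v]
        # ... and drop any edges that pointed at the deleted nodes.
        pruned = {v: [c for c in cs if c[1] >= t_obs] for v, cs in pruned.items()}
    return pruned

# (2) the knowledge W(1) = s1 stays in `evidence` for subsequent queries.
print(prune(network, evidence))  # only nodes at slices >= 1 remain
```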
Future work
Probabilistic reasoning in multiagent
systems.
Different DBNs and applications.
Discussion of DBN problems.