Transcript Document

Markov Models and Applications
Henrik Schiøler, Hans-Peter Schwefel
• Mm1: Discrete time Markov processes
• Mm2: Continuous time Markov processes
• Mm3: M/M/1 type models
• Mm4: Advanced queueing models
• Mm5: Hidden Markov Models and their application (hps)
[email protected]
http://www.kom.auc.dk/~hps
Markov Models Lecture 5, Spring 08
Note: slide-set will be complemented by formulas, mathematical
derivations, and examples on the black-board!
Motivation: Stochastic models
• Goals:
– Quantitative analysis of (communication) systems
• E.g., Quality of Service
– Enhanced Algorithms for Information Processing
• ’Extrapolation’, Error Concealment, Localisation, fault detection, etc.
• Stochastic Impact
– Error Models
– Randomization in Transmission Protocols
– Complex systems → abstraction using statistics
– Human Impact (e.g. Traffic, Mobility Models)
⇒ Frequent use of stochastic models
– Simulation Models → Stochastic Simulation
– Analytic Models, e.g. Markovian Type, stochastic Petri Nets
Content
1. Intro
2. Revision: Discrete Time Markov Processes
• Definition, basic properties
• State-probabilities, steady-state analysis
• Parameter Estimation, Example: Mobility Model
3. Hidden Markov Models
• Definition & Example
• Efficient computation of Pr(observation)
• Most likely state sequence
• Parameter Estimation
4. Application Examples of HMMs
• Link error models
• Mobility models, positioning
• Fault-detection
• Error concealment
5. Summary & Exercises
Discrete Time Markov Processes
• Definition
– State-Space: finite or countable infinite,
w/o.l.g. E={1,2,...,N} (N=∞ also allowed)
– Transition probabilities: pjk=Pr(transition from state j to state k)
– Xi = RV indicating the state of the Markov process in step i
– ’Markov Property’: State in step i only depends on state in step i-1
• Pr(Xi=s | Xi-1=si-1,Xi-2=si-2 ,..., X0=s0 ) = Pr(Xi=s | Xi-1=si-1)
• Computation of state probabilities (a small Matlab sketch follows below)
– Initial state probabilities (Step 0): π0
– Probability of state-sequence s0,s1,...,si: Pr(X0=s0, X1=s1,..., Xi=si) = ...
– Pr(Xi=k) = Σj [Pr(Xi-1=j) · pjk]
– πi = πi-1 P
• State-holding time: geometric with parameter pii
• Parameter Estimation for 'observable' discrete time Markov Chains
• Example: 2-state Markov chain (state = link behavior at packet transmission ∈ {erroneous, ideal})
– Parameter estimation, Markov property validation, limitations
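As a complement to the blackboard derivations, a minimal Matlab sketch of the iteration πi = πi-1·P; the 2-state link model below and its transition probabilities are made up purely for illustration:

% Hypothetical 2-state link model (states: 1=ideal, 2=erroneous);
% the transition probabilities are illustrative only.
P   = [0.95 0.05;   % Pr(ideal -> ideal),     Pr(ideal -> erroneous)
       0.40 0.60];  % Pr(erroneous -> ideal), Pr(erroneous -> erroneous)
pi0 = [1 0];        % start in the ideal state (step 0)

% Iterate pi_i = pi_{i-1} * P for a number of steps
pi_i = pi0;
for i = 1:20
    pi_i = pi_i * P;
end
disp(pi_i)          % state probabilities after 20 steps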
Discrete Time Markov Processes (cntd.)
• Properties
– Homogeneity: P independent of step i
– Irreducibility: each state is reachable from any other state (in potentially multiple
steps)
– Transient states, positive recurrent states
– Periodicity
• Steady-state probabilities (a Matlab sketch follows below)
– π = limi→∞ πi
– Limit exists and is independent of π0 if Markov chain irreducible and aperiodic
– Aperiodic & positive recurrent = ergodic ⇒ π is a probability distribution
• Examples (periodicity, ergodicity, steady-state probabilities, absorbing states)
• Application example: mobility model – set-up, benefits, problems
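A possible Matlab sketch for computing the steady-state distribution π of an irreducible, aperiodic chain by solving π P = π together with the normalization Σ π(i) = 1; the matrix P is the hypothetical 2-state link model from the sketch above:

% Steady-state distribution: solve pi*P = pi together with sum(pi) = 1.
P = [0.95 0.05; 0.40 0.60];     % made-up 2-state example from above
N = size(P, 1);
A = [P' - eye(N); ones(1, N)];  % stack balance equations and normalization
b = [zeros(N, 1); 1];
pi_ss = (A \ b)';               % solve the (consistent) overdetermined system
disp(pi_ss)                     % here approximately [0.889 0.111]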
Content
1. Intro
2. Revision: Discrete Time Markov Processes
• Definition, basic properties
• State-probabilities, steady-state analysis
• Parameter Estimation, Example: Mobility Model
3. Hidden Markov Models
• Definition & Example
• Efficient computation of Pr(observation)
• Most likely state sequence
• Parameter Estimation
4. Application Examples of HMMs
• Link error models
• Mobility models, positioning
• Fault-detection
• Error concealment
5. Summary & Exercises
Hidden Markov Models (HMMs): Definition
• Main property
– In each state s ∈ E, an 'observation symbol' from some alphabet V is generated probabilistically
– The underlying state cannot be observed, only the sequence O=[O1,O2,...,OT] of
generated symbols
• HMM = <E, V, π1, P, B>
– E: state-space (discrete, finite/infinite), w/o.l.g. E={1,2,...,N}
– V: set of possible observation symbols (discrete for now), w/o.l.g. V={1,2,...,M}
– π1: initial state probabilities at step 1
– P: NxN matrix of state transition probabilities pij = Pr(Xk+1=j | Xk=i)
– B: NxM matrix of symbol generation probabilities: bij = Pr(Ok=j | Xk=i)
• Example: 2-state HMM, observations = result from biased coin-toss
• Note: Discrete time Markov model is special case of HMM, namely each column of B contains at most one non-zero element
• Exercise: Write a (Matlab) program with input (π1, P, B, T) that generates a sequence of observations of length T (a possible sketch follows below)
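One possible sketch of such a generator, illustrating the generative mechanism of the HMM (not a reference solution; the helper sample_from is a name introduced here for illustration):

function o = hmm_generate(pi1, P, B, T)
% Generate T observation symbols from an HMM <E, V, pi1, P, B>.
% pi1: 1xN initial state probabilities, P: NxN transitions, B: NxM emissions.
o = zeros(1, T);
x = sample_from(pi1);             % initial state X1 ~ pi1
for t = 1:T
    o(t) = sample_from(B(x, :));  % emit symbol O_t from row x of B
    x    = sample_from(P(x, :));  % move to the next state X_{t+1}
end
end

function k = sample_from(p)
% Draw one index k with probability p(k); p must be a probability row vector.
k = find(cumsum(p) >= rand(), 1, 'first');
end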
Hidden Markov Models (HMMs): Computations
• Problem 1: Compute probability of observing a certain sequence o=[o1,...,oT] in
a given HMM.
– First (inefficient) approach ('brute-force'):
• Generate all possible state-sequences of length T: q=[q1,...,qT]
• Sum up all Pr(o|q) weighted by Pr(q) (total probability)
• Problem: Number of paths grows exponentially as N^T
– More efficient (quadratic in N) approach: forward procedure (Matlab sketch below)
• Iterative method computing probabilities for prefixes of the observation sequence:
αt := [Pr(O1=o1,...,Ot=ot, Xt=1), ..., Pr(O1=o1,...,Ot=ot, Xt=N)]
• At step t=1: α1(i) = Pr(O1=o1, X1=i) = π1(i) bi,o1 [Matlab notation: α1 = π1 .* B(:, o1)']
• αt → αt+1 (t=1,2,...,T-1):
αt+1(i) = (Σj∈E αt(j) pji) · Pr(Ot+1=ot+1 | Xt+1=i)
αt+1 = (αt P) .* B(:, ot+1)'
• Finally: Pr(O=o) = Σj∈E αT(j)
• Computation can be illustrated in Trellis structure
– Similarly (and identifiers needed later): Backward procedure
• βt := [Pr(Ot+1=ot+1,...,OT=oT | Xt=1), ..., Pr(Ot+1=ot+1,...,OT=oT | Xt=N)]
• βT = 1 (vector with all elements = 1); βt(i) = Σj∈E pij bj,ot+1 βt+1(j), i.e. βt = (P * (B(:, ot+1) .* βt+1'))'
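A minimal Matlab sketch of the forward procedure in the notation above; the function name hmm_forward is chosen here for illustration, and the backward procedure follows analogously:

function [prob, alpha] = hmm_forward(pi1, P, B, o)
% Forward procedure: alpha(t,i) = Pr(O1=o1,...,Ot=ot, Xt=i).
% pi1: 1xN, P: NxN, B: NxM, o: 1xT observation sequence (symbol indices).
T = length(o);
N = length(pi1);
alpha = zeros(T, N);
alpha(1, :) = pi1 .* B(:, o(1))';                         % initialization, t = 1
for t = 1:T-1
    alpha(t+1, :) = (alpha(t, :) * P) .* B(:, o(t+1))';   % induction step
end
prob = sum(alpha(T, :));                                  % Pr(O = o)
end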
HMMs: Computations (cntd.)
Problem 2: Find 'most likely' state sequence for an observation o=[o1,...,oT] in a given HMM.
• I.e. find the sequence q*=[q1*,...,qT*] that maximizes Pr(X1=q1,...,XT=qT | O=o) (or, equivalently, the joint probability)
– Optimization via prefix of length t (Viterbi Algorithm; Matlab sketch below):
δt := [maxq1,...,qt-1{Pr(X1=q1,...,Xt-1=qt-1, Xt=1, O1=o1,...,Ot=ot)},
..., maxq1,...,qt-1{Pr(X1=q1,...,Xt-1=qt-1, Xt=N, O1=o1,...,Ot=ot)}]
– Algorithm
• δ1 = π1 .* B(:, o1)'
• δt+1(j) = [maxi=1,...,N δt(i) pij] · bj,ot+1, ψt+1(j) = argmaxi=1,...,N δt(i) pij, t=1,2,...,T-1
• Maximum of probability: p* = maxi=1,...,N δT(i), qT* = argmaxi=1,...,N δT(i)
• State sequence (backtracking): qt* = ψt+1(qt+1*), t=T-1,...,1
– Efficient implementations: use logarithms, so that products become sums and numerical underflow is avoided
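A possible Matlab sketch of the Viterbi recursion, written in the log domain as suggested in the last bullet; the function name hmm_viterbi is assumed here for illustration:

function [q, logp] = hmm_viterbi(pi1, P, B, o)
% Most likely state sequence q for observation o, with log-probability logp.
T = length(o); N = length(pi1);
delta = zeros(T, N);   % delta(t,i): best log-probability of a prefix ending in state i
psi   = zeros(T, N);   % back-pointers
delta(1, :) = log(pi1) + log(B(:, o(1))');
for t = 1:T-1
    for j = 1:N
        [delta(t+1, j), psi(t+1, j)] = max(delta(t, :) + log(P(:, j)'));
        delta(t+1, j) = delta(t+1, j) + log(B(j, o(t+1)));
    end
end
q = zeros(1, T);
[logp, q(T)] = max(delta(T, :));
for t = T-1:-1:1
    q(t) = psi(t+1, q(t+1));       % backtrack the most likely state sequence
end
end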
HMMs: Computations (cntd.)
Problem 3: Find 'most likely' HMM model for an observation o=[o1,...,oT].
• Assumption: State-space E and symbol alphabet V are given
• Hence, desired is <π1*, P*, B*> such that Pr<π1,P,B>(O=o) is maximized
• Iterative procedure for maximization (a Matlab sketch of one update step follows below): <π1(m), P(m), B(m)> → <π1(m+1), P(m+1), B(m+1)>
– Compute using model <π1(m), P(m), B(m)>:
• γt(i) := Pr(Xt=i | O=o) = αt(i) βt(i) / Σi [αt(i) βt(i)]
• ξt(i,j) := Pr(Xt=i, Xt+1=j | O=o) = αt(i) pij bj,ot+1 βt+1(j) / Σj Σi [αt(i) pij bj,ot+1 βt+1(j)]
– 'Expectations':
• T(i) := Σt=1,...,T-1 γt(i) = expected number of transitions from state i in o
• T(i,j) := Σt=1,...,T-1 ξt(i,j) = expected number of transitions from state i to state j in o
• S(i,k) := Σt=1,...,T: ot=k γt(i) = expected number of times in state i in o and observing symbol k
• S(i) := Σt=1,...,T γt(i) = expected number of times in state i in o
– Updated HMM:
• π1(m+1) = [γ1(1),..., γ1(N)], pij(m+1) = T(i,j)/T(i)
• bik(m+1) = S(i,k)/S(i)
– Update-step increases Pr<π1,P,B>(O=o), but may converge only to a local maximum
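A possible Matlab sketch of one re-estimation step (commonly known as the Baum-Welch update); the function name is introduced here for illustration, and the sketch assumes every state is visited with positive probability so that the normalizations are well defined. Repeat the step until the parameters (or Pr(O=o)) stop changing.

function [pi1, P, B] = hmm_baumwelch_step(pi1, P, B, o)
% One re-estimation step for the model <pi1, P, B> given observation o.
T = length(o); N = length(pi1); M = size(B, 2);
% forward and backward variables (as defined on the previous slides)
alpha = zeros(T, N); beta = zeros(T, N);
alpha(1, :) = pi1 .* B(:, o(1))';
for t = 1:T-1
    alpha(t+1, :) = (alpha(t, :) * P) .* B(:, o(t+1))';
end
beta(T, :) = ones(1, N);
for t = T-1:-1:1
    beta(t, :) = (P * (B(:, o(t+1)) .* beta(t+1, :)'))';
end
% gamma(t,i) = Pr(Xt=i | O=o); xi_sum(i,j) = sum_t Pr(Xt=i, Xt+1=j | O=o)
gamma = alpha .* beta;
gamma = gamma ./ repmat(sum(gamma, 2), 1, N);
xi_sum = zeros(N, N);
for t = 1:T-1
    x = (alpha(t, :)' * (B(:, o(t+1)) .* beta(t+1, :)')') .* P;
    xi_sum = xi_sum + x / sum(x(:));
end
% re-estimated parameters
pi1 = gamma(1, :);
P   = xi_sum ./ repmat(sum(gamma(1:T-1, :), 1)', 1, N);   % T(i,j)/T(i)
for k = 1:M
    B(:, k) = sum(gamma(o == k, :), 1)' ./ sum(gamma, 1)'; % S(i,k)/S(i)
end
end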
Content
1. Intro
2. Revision: Discrete Time Markov Processes
• Definition, basic properties
• State-probabilities, steady-state analysis
• Parameter Estimation, Example: Mobility Model
3. Hidden Markov Models
• Definition & Example
• Efficient computation of Pr(observation)
• Most likely state sequence
• Parameter Estimation
4. Application Examples of HMMs
• Link error models
• Mobility models, positioning
• Fault-detection
• Error concealment
5. Summary & Exercises
HMMs: Application Examples
• Link error models
– State-space = different levels of link quality, observation V = {error, correct}
– Equivalent to 'biased' coin toss example
– Extensions to multiple link-states
– Advantage: more general types of burst errors
• Mobility models
– State-space = product space (classification of user behavior, current coordinates)
– Observation = set of discrete positions of user/device
• Positioning
– State-space same as mobility model
– Observations now e.g. RSSI distributions
HMMs: Application Examples II
• Fault-detection (Example from last semester student project)
– State-space = {congested, lightly utilized} x {good wireless link, bad link}
– Observations: discrete levels of RTT measurements (per packet) and packet loss events
(binary)
– Discussion of advantages/disadvantages, comparison to Bayesian Networks
• Error concealment
– E.g. Transmission of speech over noisy/lossy channel
– State-space=speaker model
– observation = received symbols, subject to loss/noise
Summary
1. Intro
2. Revision: Discrete Time Markov Processes
• Definition, basic properties
• State-probabilities, steady-state analysis
• Parameter Estimation, Example: Mobility Model
3. Hidden Markov Models
• Definition & Example
• Efficient computation of Pr(observation)
• Most likely state sequence
• Parameter Estimation
4. Application Examples of HMMs
• Link error models
• Mobility models, positioning
• Fault-detection
• Error concealment
5. Summary & Exercises
References
• L. Rabiner, B.-H. Juang: 'Fundamentals of Speech Recognition', Prentice Hall, 1993.
– Sections 6.1-6.4
Exercises 1
Hidden Markov Models: Given is the following 3-state hidden Markov model with parameters
pi1=[0.2,0.3,0.5], P=[0.2,0.4,0.4; 0.5,0.1,0.4; 0.2,0.2,0.6]. The observations are coin-toss results
(Heads=1, Tails=2) with B=[0.8,0.2;0.5,0.5;0.1,0.9].
a) write a (Matlab) program that generates observation sequences of length T from the given
HMM.
b) Write a program that efficiently computes the probability of a given observation sequence. Run
the program for S='HHTHTTTHT'. Compare with a probability estimate via simulation using
the program from Task a.
c) Write a program to determine the most-likely state sequence and run the program for the
sequence in (b).
Exercises 2
Localisation with HMMs: Consider a 5m x 5m square room in which 3 access points are placed in
the corners (0,5), (5,5), (5,0). Use a grid with 1m x 1m elements to discretize this geographic
space. A mobile device is moving through the room, and the access points measure received signal
strength, which follows a path-loss model RSSI[dB] = Round(-6 log10(d/d0) + 13 + N), with d0 = 0.1m.
The noise N is assumed to be normally distributed with standard deviation sigma = 2.
Write Matlab functions to
a) Compute for each grid position (i,j), the probabilities of observing an RSSI triplet (R1,R2,R3),
Ri=0,...,9.
b) Determine the MLE of the trajectory of the mobile device for observation sequence
[1,2,1],[2,0,4],[4,2,1],[7,3,4].
c) Assume that the mobile device moves equally likely in any of the possible (2-4)
vertical/horizontal directions, with velocity 1m/timeunit. Set up the matrices P and B that
describe the resulting HMM. (Use lexicographic order for the 2-dimensional coordinates and for
the RSSI triplets.)
d) Determine the most likely trajectory for the above observation sequence resulting from the
HMM.