Hidden Markov Models
HIDDEN MARKOV MODELS
OVERVIEW
Markov models
Hidden Markov models (HMM)
Issues Regarding HMM
Algorithmic approach to Issues of HMM
MARKOV MODELS
A Markov model is a finite state machine with
N distinct states. It begins (at time t = 1) in an
initial state.
It moves from the current state to the next state
according to the transition probabilities
associated with the current state.
This kind of system is called a finite or discrete
Markov model.
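To make this concrete, here is a small Python sketch (not from the slides) that simulates such a model; the two-state example and its transition probabilities are invented purely for illustration.

import random

# A toy discrete Markov model: two states and, for each current state,
# the transition probabilities to the next state (each row sums to 1).
# The "Sunny"/"Rainy" chain is a made-up example.
transitions = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}

def simulate(start, steps):
    # At every step, pick the next state according to the transition
    # probabilities associated with the current state.
    state, path = start, [start]
    for _ in range(steps):
        state = random.choices(list(transitions[state]),
                               weights=list(transitions[state].values()))[0]
        path.append(state)
    return path

print(simulate("Sunny", 10))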
HIDDEN MARKOV MODELS
A Hidden Markov model is a statistical model in
which the system being modelled is assumed to
be a Markov process with unobserved (hidden)
states.
In a regular Markov model the state is directly
visible to the observer, so the state transition
probabilities are the only parameters, whereas in
an HMM the state is not visible but the output is.
HIDDEN MARKOV MODELS - HMM
[Figure: an HMM drawn as a chain of hidden variables H1, H2, ..., Hi, ..., HL-1, HL, each emitting one item of the observed data X1, X2, ..., Xi, ..., XL-1, XL]
DESCRIPTION
Formally, an HMM is defined as a model
M = ( Σ, Q, A, E ).
Σ is an alphabet of symbols.
Q is a set of states; each state emits symbols
from the alphabet Σ.
A = ( ak,l ) is a |Q| × |Q| matrix describing the
probability of changing to state l after the HMM
is in state k.
E = ( ek(b) ) is a |Q| × |Σ| matrix describing the
probability of emitting the symbol b during a step
in which the HMM is in state k.
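One possible way to hold such a model M = ( Σ, Q, A, E ) in Python, shown here with the fair/biased coin model of the next slides (the container types are an implementation choice, and the coin-switching probabilities are the ones that appear later in the Fair Bet Casino computation):

# The alphabet, the states, and the A and E matrices as plain dictionaries.
Sigma = ["H", "T"]                  # emitted symbols: heads, tails
Q = ["F", "B"]                      # states: Fair coin, Biased coin

# A[k][l]: probability of moving to state l when the HMM is in state k
A = {
    "F": {"F": 0.9, "B": 0.1},
    "B": {"F": 0.1, "B": 0.9},
}

# E[k][b]: probability of emitting symbol b while in state k
E = {
    "F": {"H": 0.5,  "T": 0.5},
    "B": {"H": 0.75, "T": 0.25},
}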
FAIR BET CASINO PROBLEM
We are given as input a sequence x1, x2, x3, ..., xn
of coin tosses, each of which is either heads or
tails, made with one of two possible coins ( F or B ).
We need to find a sequence of states π = π1, π2,
π3, ..., πn, with each πi being either F or B,
indicating whether xi is the result of tossing the
fair or the biased coin.
EXPLANATION
The problem as stated is ambiguous, because the
sequence of coins used may be anything, for
example FFFFF.. or BBBB..BB.
We need a way to grade different coin sequences.
This ill-defined problem is converted into the
decoding problem based on the HMM paradigm.
FAIR BET CASINO PROBLEM
This is a sample HMM designed for the Fair Bet
Casino problem. There are two states, F ( Fair )
and B ( Biased ).
Each state can emit either heads ( H ) or tails ( T )
with the following probabilities:
with the fair coin, the probability of getting heads
or tails is 0.5 each;
with the biased coin, the probability of getting
heads is 0.75 and of getting tails is 0.25.
If the resulting sequence of tosses is X = x1, x2,
x3, ..., xn, then the probability that X was
generated by the fair coin is
P( X | fair coin ) = ∏_{i=1..n} p( xi ) = 1 / 2^n
and the probability that it was generated by the
biased coin is
P( X | biased coin ) = 3^k / 4^n
where k is the number of heads in X.
If P( X|fair coin) > P(X|biased coin), then the
dealer most likely used a fair coin.
If P( X|fair coin) < P(X|biased coin), then the
dealer most likely used a biased coin.
P( X | fair coin ) and P( X | biased coin ) are equal
when k = n / log2(3).
If k < n / log2(3) the dealer most likely used the
fair coin, otherwise the biased coin, because
log2( P( X | fair coin ) / P( X | biased coin ) ) = n - k · log2(3).
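A short sketch of this decision rule in Python (assuming heads are written as 'H' in the toss string):

import math

def more_likely_coin(x):
    # Decide which single coin better explains the toss sequence x,
    # using log2( P(x|fair) / P(x|biased) ) = n - k * log2(3).
    n = len(x)
    k = x.count("H")                  # number of heads
    return "fair" if n - k * math.log2(3) > 0 else "biased"

print(more_likely_coin("HTHHHTHTHH"))   # k = 7, n = 10, so "biased"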
A path π = π1, π2, π3, ..., πn is the sequence of
states. If the dealer uses the fair coin for the first
three and the last three tosses and the biased coin
in between, the corresponding path is
π = FFFBBBBBFFF, and the resulting sequence of
tosses is, say, 01011101001.
We denote by P( xi | πi ) the probability that
symbol xi was emitted from state πi, and we write
the transition probability from state πi to state
πi+1 as P( πi → πi+1 ).
For the Fair Bet Casino these transition
probabilities are: the dealer keeps the same coin
with probability 9/10 and switches coins with
probability 1/10.
The probability of generating X = 01011101001
through the path π = FFFBBBBBFFF can be
calculated as the product
( 1/2 · 1/2 )( 1/2 · 9/10 ) ........ ( 1/2 · 9/10 )( 1/2 · 9/10 ) ≈ 2.66 × 10^-6.
We want the path for which this probability is
maximal; if some other path gives a larger value,
the current path is not the most probable one,
and we need to select another sequence π with a
higher probability.
For example, selecting π = FFFBBBFFFFF gives a
probability of about 3.54 × 10^-6.
In general, the probability that the sequence X
was generated by the path π, given the model M, is
P( X | π ) = ∏_{i=1..n} P( πi-1 → πi ) · P( xi | πi ),
where P( π0 → π1 ) = 1/2 denotes the probability
of the initial state.
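As a sanity check, here is a small Python sketch that evaluates this product for the two example paths and reproduces the values above (assumptions: '1' means heads, the initial state probability is 1/2, and the transition and emission parameters are those given earlier):

# Evaluate P(X | path) for the Fair Bet Casino example.
emission = {
    "F": {"1": 0.5, "0": 0.5},     # fair coin
    "B": {"1": 0.75, "0": 0.25},   # biased coin: heads ('1') with prob. 3/4
}

def transition(prev, cur):
    # dealer keeps the same coin with prob. 0.9, switches with prob. 0.1
    return 0.9 if prev == cur else 0.1

def path_probability(x, path):
    prob = 0.5 * emission[path[0]][x[0]]          # initial state and first emission
    for i in range(1, len(x)):
        prob *= transition(path[i - 1], path[i]) * emission[path[i]][x[i]]
    return prob

x = "01011101001"
print(path_probability(x, "FFFBBBBBFFF"))   # about 2.66e-06
print(path_probability(x, "FFFBBBFFFFF"))   # about 3.54e-06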
This is still not a full solution, because only the
dealer knows the real sequence of states π that
emitted X. We therefore say that π is hidden and
attempt to solve the following decoding problem.
MAIN ISSUES ?
Evaluation problem: given the HMM
M = ( Σ, Q, A, E ) and an observation sequence
X = x1, x2, ..., xk, calculate the probability
that the model M has generated the sequence X.
Decoding problem: given the HMM
M = ( Σ, Q, A, E ) and an observation sequence
X = x1, x2, ..., xk, calculate the most likely
sequence of hidden states π that generated the
sequence X.
SOLUTION TO DECODING PROBLEM ?
Decoding problem: Viterbi algorithm
In this algorithm we go through the observations
from start to end, assigning a state of the hidden
machine to each observation.
Along the way we record the overall probability,
the Viterbi path ( the sequence of states ) and the
Viterbi probability ( the probability of the observed
sequence along the Viterbi path ).
The probability of a possible step, given its
corresponding observation, is the transition
probability times the emission probability.
VITERBI ALGORITHM FOR DYNAMIC
PROGRAMMING
Overall probability: multiply each new step
probability by the old one, and add the
alternatives together.
Viterbi probability: multiply the previous Viterbi
probability by each possible next-step probability
and keep only the highest value.
Viterbi path: append the state chosen for the new
step to the Viterbi path.
VITERBI ALGORITHM
Here we use an HMM-inspired analogue of the
Manhattan grid for the decoding problem.
To calculate P( X | π ) we set the edge weights in
this graph so that the product of the edge weights
along a path equals the probability of generating
the sequence along that path.
There are |Q|^2 · ( n - 1 ) edges in the graph, where
the weight of the edge from (k, i) to (l, i+1) is
given by e_l( xi+1 ) · ak,l.
The probability of the most likely path ending at
any particular vertex (l, i+1) is calculated as
s_l,i+1 = e_l( xi+1 ) · max over k in Q of { s_k,i · ak,l }.
The decoding problem is now reduced to finding
the longest path in a directed acyclic graph ( DAG ).
Here the length of a path is defined as the product
of its edge weights, instead of the sum of weights
used in the usual dynamic programming
algorithms; applying logarithms to the edge
weights turns the product into a sum and makes
the problem the same as the previous case.
To find this path we calculate, for each state k,
the probability of the most likely path ending at
state k.
The computations in the Viterbi algorithm are
usually done using logarithmic scores
S_k,i = log s_k,i to avoid numerical underflow.
The Viterbi algorithm is essentially a search
through the space of all possible paths in that
graph for the one that maximizes the value of
P( X | π ).
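A minimal Viterbi sketch in Python for the casino HMM, working with logarithmic scores as described above (the parameter values are those from the earlier slides; the uniform 1/2 initial probability is carried over from the path-probability example):

import math

states = ["F", "B"]
log = math.log

# log-transition and log-emission scores for the Fair Bet Casino
logA = {("F", "F"): log(0.9), ("F", "B"): log(0.1),
        ("B", "B"): log(0.9), ("B", "F"): log(0.1)}
logE = {("F", "H"): log(0.5),  ("F", "T"): log(0.5),
        ("B", "H"): log(0.75), ("B", "T"): log(0.25)}

def viterbi(x):
    # s[k] = best log score of any path ending in state k at the current position
    s = {k: log(0.5) + logE[(k, x[0])] for k in states}
    back = []                                  # best predecessor per position
    for symbol in x[1:]:
        prev, s, choice = s, {}, {}
        for l in states:
            best_k = max(states, key=lambda k: prev[k] + logA[(k, l)])
            choice[l] = best_k
            s[l] = prev[best_k] + logA[(best_k, l)] + logE[(l, symbol)]
        back.append(choice)
    # trace back the Viterbi path from the best final state
    last = max(states, key=lambda k: s[k])
    path = [last]
    for choice in reversed(back):
        path.append(choice[path[-1]])
    return "".join(reversed(path))

# the toss sequence 01011101001 from the earlier slides, written with H/T (1 = H)
print(viterbi("THTHHHTHTTH"))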
We can also calculate the probability that the
HMM was in state k at time i:
P( X, πi = k ) = ∑ over all paths π with πi = k of P( X, π ).
The probability that the dealer had a biased coin
at moment i is then given by
P( πi = B | X ) = P( X, πi = B ) / P( X ).
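These quantities are usually obtained with the forward-backward procedure, sketched below for the same casino parameters (the procedure itself is standard HMM machinery and is not spelled out on these slides, so treat the sketch as an illustrative assumption):

states = ["F", "B"]
init = {"F": 0.5, "B": 0.5}
A = {("F", "F"): 0.9, ("F", "B"): 0.1, ("B", "B"): 0.9, ("B", "F"): 0.1}
E = {("F", "H"): 0.5, ("F", "T"): 0.5, ("B", "H"): 0.75, ("B", "T"): 0.25}

def posterior_biased(x):
    # P(pi_i = B | x) for every position i, via forward-backward.
    n = len(x)
    # fwd[i][k] = P(x_1..x_i, pi_i = k)
    fwd = [{k: init[k] * E[(k, x[0])] for k in states}]
    for i in range(1, n):
        fwd.append({l: E[(l, x[i])] * sum(fwd[-1][k] * A[(k, l)] for k in states)
                    for l in states})
    # bwd[i][k] = P(x_{i+1}..x_n | pi_i = k)
    bwd = [dict.fromkeys(states, 1.0) for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = {k: sum(A[(k, l)] * E[(l, x[i + 1])] * bwd[i + 1][l] for l in states)
                  for k in states}
    px = sum(fwd[-1][k] for k in states)          # P(x), summed over final states
    return [fwd[i]["B"] * bwd[i]["B"] / px for i in range(n)]

for i, p in enumerate(posterior_biased("THTHHHTHTTH"), start=1):
    print(f"position {i}: P(biased) = {p:.3f}")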
HMM PARAMETER ESTIMATION
Up to now the transition and emission
probabilities of the HMM were assumed to be
known, which makes it easy to calculate the
hidden states from an observed sequence and
those probabilities.
It is much more difficult to find the most probable
path when both sets of probabilities are unknown.
Let Θ be a vector combining the unknown
transition and emission probabilities of the HMM.
We define P( X | Θ ) as the probability of X given
the assignment of parameters Θ.
Our goal is to find
max over Θ of P( X | Θ ).
Instead of getting single string x, we can obtain a
sample of training sequencies x1,x2…..xm
max Θ ∏ i=1mP( Xi| Θ)
The common algorithms used for the above
approach is heuristics for parameter
optimization.
If Ak,l is the number of transitions from state k to
state l and Ek(b) is the number of times b is
emitted from state k, then reasonable estimators
are
ak,l = Ak,l / ∑_{q∈Q} Ak,q
ek(b) = Ek(b) / ∑_{σ∈Σ} Ek( σ )
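When a training sequence is available together with its state path (for example, a stretch of play where the coins used were later revealed), these estimators amount to simple counting, as in the hypothetical sketch below:

from collections import Counter

def estimate_parameters(x, path):
    # Count transitions A_{k,l} and emissions E_k(b), then normalise per state k.
    A_counts = Counter(zip(path, path[1:]))
    E_counts = Counter(zip(path, x))
    a = {(k, l): c / sum(v for (k2, _), v in A_counts.items() if k2 == k)
         for (k, l), c in A_counts.items()}
    e = {(k, b): c / sum(v for (k2, _), v in E_counts.items() if k2 == k)
         for (k, b), c in E_counts.items()}
    return a, e

# Hypothetical labelled data: tosses and the coins that produced them.
x    = "THTHHHTHTTH"
path = "FFFBBBBBFFF"
a, e = estimate_parameters(x, path)
print(a)   # estimated transition probabilities a_{k,l}
print(e)   # estimated emission probabilities e_k(b)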
PROFILE HMM ALIGNMENT
Given a set of functionally related biological
sequences, one way to search a database for new
family members is to use pairwise alignments.
This approach may fail because distantly related
sequences may share only weak similarities that
do not pass the statistical significance test.
A family of related proteins can instead be
represented by their multiple alignment and the
corresponding profile.
A profile is represented in terms of the
frequencies of the nucleotides ( or amino acids )
in each column of the alignment.
An HMM can also be used for sequence
comparison, in particular for aligning a sequence
against a profile.
Such a profile HMM contains n sequentially linked
match states M1, M2, ..., Mn.
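As a tiny illustration of the profile idea (the short alignment below is invented for demonstration), the column-wise frequencies can be computed like this:

from collections import Counter

# A made-up multiple alignment of short DNA sequences; '-' marks a gap.
alignment = [
    "ACGT-",
    "ACGTA",
    "A-GTA",
    "TCGTA",
]

def profile(alignment):
    # Frequency of each nucleotide in every column, ignoring gaps.
    result = []
    for col in zip(*alignment):
        counts = Counter(c for c in col if c != "-")
        total = sum(counts.values())
        result.append({b: counts[b] / total for b in "ACGT"})
    return result

for i, col in enumerate(profile(alignment), start=1):
    print(i, col)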
PROFILE HMM
Thank you