Hidden Markov Models
Wassnaa AL-mawee
Western Michigan University
Department of Computer Science
CS6800 Adv. Theory of Computation
Prof. Elise De Doncker
02/17/2016
Outline
• Introduction
  • Motivation
  • Markov Models
• Hidden Markov Models (HMMs)
  • HMMs Definition and Components
  • HMMs Problems
  • HMMs Basic Algorithms
• Applications
  • HMMs in Speech Recognition
  • HMMs for Gene Finding in DNA
• Summary
• References
Introduction
• Motivation:
  • Predictions are based on models of observed data, often assumed to be independent and identically distributed (i.i.d.).
  • This assumption is not always the best, for example in:
    • measurements of weather patterns,
    • daily values of stocks,
    • the composition of DNA,
    • the composition of texts,
    • time frames used for speech recognition.
Introduction (continued…)
• Markov Models:
  • If the n-th observation in a chain of observations is influenced only by the (n-1)-th observation, the chain is called a 1st-order Markov chain:
    P(Xn | X1, …, Xn-1) = P(Xn | Xn-1)
  • Example: weather prediction, where what the weather will be tomorrow depends on observations of the past.
Introduction (continued…)
• The weather on day n is Xn ∈ {sunny, rainy, cloudy}.
• If the weather yesterday was cloudy and today is rainy, what is the probability that tomorrow will be sunny? Under the 1st-order assumption, only today matters:
  P(X3 = sunny | X2 = rainy) = 0.6
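A minimal sketch of how such a 1st-order weather chain could be written down in Python. Only P(sunny | rainy) = 0.6 comes from the slide; every other transition probability below is an invented placeholder.

```python
# 1st-order Markov chain for the weather example (illustrative numbers).
STATES = ["sunny", "rainy", "cloudy"]

# TRANSITION[i][j] = P(X_{n+1} = j | X_n = i); each row sums to 1.
TRANSITION = {
    "sunny":  {"sunny": 0.7, "rainy": 0.1, "cloudy": 0.2},
    "rainy":  {"sunny": 0.6, "rainy": 0.2, "cloudy": 0.2},  # P(sunny | rainy) = 0.6 (from the slide)
    "cloudy": {"sunny": 0.3, "rainy": 0.4, "cloudy": 0.3},
}

def next_day_distribution(today: str) -> dict:
    """Under a 1st-order chain, tomorrow's distribution depends only on today."""
    return TRANSITION[today]

print(next_day_distribution("rainy")["sunny"])  # 0.6
```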
Introduction (continued…)
• What if the n-th observation in a chain of observations is influenced by a corresponding HIDDEN variable?
Source: http://www.bbc.com/news/health-29708755
Outline
• Introduction
• Hidden Markov Models (HMMs)
  • HMMs Definition and Components
  • HMMs Problems
  • HMMs Basic Algorithms
• Applications
• Summary
• References
HMMs Definition and Components
• HMMs are powerful statistical models for sequential or time-series data, and they have been used successfully in many tasks such as [1]:
  • robot control,
  • speech recognition,
  • protein/DNA sequence analysis,
  • information extraction from text data.
HMMs Definition and Components
• An HMM is a 5-tuple (S, O, Π, A, B), where:
  • S = {s1, ..., sN} is a finite set of N states,
  • O = {o1, ..., oM} is a set of M possible outputs (observations),
  • Π = {πi} are the initial state probabilities,
  • A = {aij} are the state transition probabilities,
  • B = {bi(ok)} are the output (emission) probabilities.
• We use λ = (Π, A, B) to denote all the parameters of the HMM.
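To make the 5-tuple concrete, here is one way it could be held in code. This is only an illustrative sketch; the class and field names (HMM, pi, trans, emit) are my own, not part of the slides.

```python
from dataclasses import dataclass
from typing import Dict, List

State = str
Symbol = str

@dataclass
class HMM:
    """λ = (Π, A, B) over a finite state set S and output alphabet O."""
    states: List[State]                      # S = {s1, ..., sN}
    outputs: List[Symbol]                    # O = {o1, ..., oM}
    pi: Dict[State, float]                   # Π: initial state probabilities π_i
    trans: Dict[State, Dict[State, float]]   # A: a_ij = P(next = s_j | current = s_i)
    emit: Dict[State, Dict[Symbol, float]]   # B: b_i(o_k) = P(output = o_k | state = s_i)
```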
HMMs Definition and Components
Source: http://7-themes.com/6912015little-robot.html
• HMMs Example: Assistive Technology
  • Assume you have a little robot that tries to estimate the probability that you are happy (h) or sad (s), given that the robot has observed whether you are watching your favorite TV show (w), sleeping (s), crying (c), or using Facebook (f) [4].
  • Let the hidden state be X, which can be h or s.
  • Let the observation be Y, which can be w, s, c, or f.
  • We want to find probabilities such as:
    • P(X=h | Y=w)?
    • P(X=s | Y=c)?
HMMs Definition and Components
• Use Bayes' rule, which describes the probability of an event based on conditions that might be related to the event [5]:
  • For a state i: P(X=i | Y) = P(Y | X=i) P(X=i) / P(Y)
  • where the evidence sums over all states n: P(Y) = Σn P(Y | X=n) P(X=n)
HMMs Definition and Components (continued…)
• Solve them with a prior and a likelihood model [4]: we want P(X | Y).
• Prior P(X):
    P(X=h) = 0.8, P(X=s) = 0.2
• Likelihood P(Y | X):
         w     s     c     f
    h    0.4   0.4   0.2   0
    s    0.1   0.3   0.5   0.1
• Then P(X=h | Y=w) = (0.4 × 0.8) / (0.4 × 0.8 + 0.1 × 0.2) ≈ 0.94
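A short sketch that checks the 0.94 figure with Bayes' rule, using the prior and likelihood tables above. The dictionary layout and the function name posterior are just one possible encoding, not code from the slides.

```python
# Prior P(X) and likelihood P(Y|X) for the happy/sad robot example.
prior = {"h": 0.8, "s": 0.2}
likelihood = {
    "h": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
    "s": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

def posterior(x: str, y: str) -> float:
    """P(X=x | Y=y) via Bayes' rule with the absolute (non-transition) prior."""
    evidence = sum(likelihood[xn][y] * prior[xn] for xn in prior)  # P(Y=y)
    return likelihood[x][y] * prior[x] / evidence

print(round(posterior("h", "w"), 2))  # 0.94, matching the slide
# posterior("s", "c") is the quantity asked for in Q4 at the end of the deck.
```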
HMMs Definition and Components (continued…)
• What if we have a transition prior rather than an absolute prior? Assume we observe sequences such as ccc and wcw. Then we have (written out as code in the sketch below):
  • S = {H, S},
  • O = {w, s, c, f},
  • Π = {H: 0.8, S: 0.2},
  • A = {H→H: 0.9, H→S: 0.1, S→H: 0.1, S→S: 0.9},
  • B = {H: {w: 0.4, s: 0.4, c: 0.2, f: 0}, S: {w: 0.1, s: 0.3, c: 0.5, f: 0.1}}
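The same parameters spelled out as plain Python dictionaries; the encoding is illustrative, and only the probability values come from the slide. Later sketches in this deck reuse the names pi, A and B.

```python
# λ = (Π, A, B) for the happy/sad robot, now with a transition prior.
states  = ["H", "S"]                    # hidden states: Happy, Sad
outputs = ["w", "s", "c", "f"]          # watch TV, sleep, cry, Facebook

pi = {"H": 0.8, "S": 0.2}               # Π: initial state probabilities

A = {                                   # A: state transition probabilities
    "H": {"H": 0.9, "S": 0.1},
    "S": {"H": 0.1, "S": 0.9},
}

B = {                                   # B: emission probabilities
    "H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
    "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}
```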
HMMs Problems
• Three basic problems are associated with an HMM:
  1) Evaluation: evaluating the probability of an observed sequence of symbols O, summed over all possible state sequences, given a particular HMM λ, i.e., p(O|λ).
  2) Decoding: finding the best state sequence q, given an observation sequence O and an HMM λ, i.e., q* = argmaxq p(q|O).
  3) Training: finding the parameters of the HMM λ that maximize the probability of generating an observed sequence, i.e., finding λ* = argmaxλ p(O|λ).
HMMs Basic Algorithms
Problem      Formulation                    Algorithm
Evaluation   p(O|λ)                         Forward-Backward
Decoding     q* = argmaxq p(q|O)            Viterbi decoding
Training     λ* = argmaxλ p(O|λ)            Baum-Welch (EM)
HMMs Basic Algorithms (continued…)
• The Forward Algorithm, which solves the Evaluation problem:
  • Sum over all possible state-sequence paths that generate the given observation sequence, using the following recursive procedure:
    α1(i) = πi bi(o1), 1 ≤ i ≤ N
    αt+1(i) = bi(ot+1) Σj=1..N αt(j) aji, 1 ≤ t < T, 1 ≤ i ≤ N
    p(O|λ) = Σi=1..N αT(i), since we may end at any of the N states.
• The Backward Algorithm, used together with the Forward Algorithm to solve the third (Training) problem:
    βT(i) = 1, 1 ≤ i ≤ N
    βt(i) = Σj=1..N aij bj(ot+1) βt+1(j), 1 ≤ t < T, 1 ≤ i ≤ N
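A compact sketch of both recursions in the notation above, using dictionary-based parameters like those of the robot example (single observation sequence, no scaling). It is an illustration of the standard recursions, not code from the slides.

```python
# Forward/backward recursions for a discrete HMM λ = (pi, A, B),
# where A[i][j] = a_ij and B[i][o] = b_i(o), as in the earlier sketches.

def forward(pi, A, B, obs):
    """Returns (alpha, p(O|λ)); alpha[t][i] is the probability of the
    first t+1 observations ending in state i."""
    states = list(pi)
    alpha = [{i: pi[i] * B[i][obs[0]] for i in states}]
    for t in range(1, len(obs)):
        alpha.append({
            i: B[i][obs[t]] * sum(alpha[t - 1][j] * A[j][i] for j in states)
            for i in states
        })
    return alpha, sum(alpha[-1][i] for i in states)   # we may end in any of the N states

def backward(pi, A, B, obs):
    """Returns beta; beta[t][i] is the probability of the observations
    after time t, given state i at time t."""
    states = list(pi)
    T = len(obs)
    beta = [{i: 1.0 for i in states} for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = {
            i: sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in states)
            for i in states
        }
    return beta
```

With the robot parameters sketched earlier, for example, forward(pi, A, B, ["c", "c", "c"])[1] evaluates p(ccc | λ).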
HMMs Basic Algorithms (continued…)
• The Viterbi Algorithm, which solves the Decoding problem:
  • The Viterbi algorithm is a dynamic programming algorithm.
  • It computes the most likely state-transition path given an observed sequence of symbols.
  • It is similar to the forward algorithm, except that it takes a maximum rather than a summation.
• Viterbi Algorithm:
  • VP1(i) = πi bi(o1) and q1*(i) = (i)
  • For 1 ≤ t < T: VPt+1(i) = max1≤j≤N VPt(j) aji bi(ot+1) and qt+1*(i) = qt*(k)·(i), where k = argmax1≤j≤N VPt(j) aji bi(ot+1)
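The same recursion as a runnable sketch, in the dictionary notation used above (an illustration, not the slides' own code); backpointers stand in for the explicit path concatenation qt*(k)·(i).

```python
# Viterbi decoding sketch for a discrete HMM λ = (pi, A, B).

def viterbi(pi, A, B, obs):
    """Returns (best_path, best_prob) over all state sequences for obs."""
    states = list(pi)
    # VP[t][i]: probability of the best path that ends in state i at time t;
    # back[t][i]: the state at time t-1 on that best path.
    VP = [{i: pi[i] * B[i][obs[0]] for i in states}]
    back = [{}]
    for t in range(1, len(obs)):
        VP.append({})
        back.append({})
        for i in states:
            k = max(states, key=lambda j: VP[t - 1][j] * A[j][i])  # argmax instead of sum
            VP[t][i] = VP[t - 1][k] * A[k][i] * B[i][obs[t]]
            back[t][i] = k
    # Trace back from the best final state.
    last = max(states, key=lambda i: VP[-1][i])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path)), VP[-1][last]
```

For example, viterbi(pi, A, B, ["c", "c", "c"]) with the robot parameters returns the most likely hidden mood sequence for three days of crying.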
HMMs Basic Algorithms (continued…)
• The Baum-Welch Algorithm (also called the Forward-Backward Algorithm), which solves the Training problem:
  • So far we have assumed that the HMM parameters are known, but in practice these parameters must be re-estimated, or obtained from annotated training data, which has two drawbacks:
    1) Annotation is difficult/expensive.
    2) The training data may differ from the current data.
  • The goal of the Baum-Welch algorithm is to tune the parameters of the HMM using EM (Expectation-Maximization) so that they maximize the probability of generating the current data.
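To make one EM iteration concrete, here is an unscaled re-estimation step over a single observation sequence, written with NumPy arrays (states and symbols as integer indices). It is a sketch of the standard update formulas under those simplifying assumptions, not an implementation from the slides.

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch (EM) re-estimation step for a discrete HMM.
    pi: (N,) initial probs, A: (N,N) transitions, B: (N,M) emissions,
    obs: sequence of observation indices. Returns updated (pi, A, B)."""
    N, T = len(pi), len(obs)
    obs = np.asarray(obs)
    # E-step: forward and backward passes (unscaled, for clarity only).
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    likelihood = alpha[T - 1].sum()                       # p(O|λ)
    gamma = alpha * beta / likelihood                     # P(q_t = i | O, λ)
    xi = np.zeros((T - 1, N, N))                          # P(q_t = i, q_{t+1} = j | O, λ)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :] / likelihood
    # M-step: re-estimate the parameters from the expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B
```

In practice the recursions are scaled (or done in log space) to avoid underflow, and the expected counts are accumulated over many training sequences before the update.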
Outline
• Introduction
• Hidden Markov Models (HMMs)
• Applications
  • HMMs in Speech Recognition
  • HMMs for Gene Finding in DNA
• Summary
• References
HMMs In Speech Recognition
The word "ball" spoken by two different speakers: (a) female and (b) male [3].
The phoneme /e/ in three different contexts: (a) let's, (b) phonetic and (c) sentence [3].
HMMs In Speech Recognition (continued…)
• An HMM models some unit of speech; word- and sentence-level models are built from phoneme-level units. For example, the word "one", with pronunciation "W AX N" [2]:
  • Speech is represented as a sequence of symbols.
  • Output probabilities: probabilities of observing a symbol in a state.
  • Transition probabilities: probabilities of staying in or skipping a state.
HMMs In Speech Recognition (continued…)
• Training HMMs for continuous speech:
  • Concatenate phone models to give a word model.
  • Concatenate word models to give a sentence model.
  • Train the entire sentence model on the entire spoken sentence.
HMMs In Speech Recognition (continued…)
Recognition search
Source: http://www.cs.cmu.edu/~roni/10601-slides/hmm-for-asr-whw.pdf
HMMs In Speech Recognition (continued…)
Forward-Backward training for continuous speech
Source: http://www.cs.cmu.edu/~roni/10601-slides/hmm-for-asr-whw.pdf
HMMs for Gene Finding in DNA
Source: http://www.wignallandwales.co.nz/Revision-samples/7F1_Chromosomes.htm
HMMs for Gene Finding in DNA (continued…)
Basic structure of a gene [6]
HMMs for Gene Finding in DNA (continued…)
• Input: a DNA sequence X = (x1, …, xn) ∈ Σ*, where Σ = {A, C, G, T}.
• Output: a correct labeling of each element of X as belonging to a coding, non-coding, or intergenic region.
• The goal of gene finding is then to annotate genomic data with the locations of genes and of the regions within these genes [6].
HMMs for Gene Finding in DNA (continued…)
• Enter: start codon or intron (3' splice site).
• Exit: 5' splice site or one of the three stop codons (taa, tag, tga) [7].
A toy labeling sketch follows below.
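To make the input/output picture concrete, here is a toy sketch that labels each base of a DNA string with a hidden region using Viterbi decoding over a 3-state HMM. Every probability in it is an invented placeholder, and a real gene finder would add many more states (start/stop codons, splice sites) as described above.

```python
import math

# Toy gene-labeling HMM: hidden regions over DNA bases (placeholder numbers only).
STATES = ["intergenic", "coding", "non-coding"]
pi = {"intergenic": 0.8, "coding": 0.1, "non-coding": 0.1}
A = {"intergenic": {"intergenic": 0.90, "coding": 0.10, "non-coding": 0.00},
     "coding":     {"intergenic": 0.05, "coding": 0.85, "non-coding": 0.10},
     "non-coding": {"intergenic": 0.05, "coding": 0.10, "non-coding": 0.85}}
B = {"intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
     "coding":     {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
     "non-coding": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}}

def label_sequence(x):
    """Viterbi in log space: returns one hidden region label per base of x."""
    lp = lambda p: math.log(p) if p > 0 else float("-inf")
    V = [{i: lp(pi[i]) + lp(B[i][x[0]]) for i in STATES}]
    back = [{}]
    for t in range(1, len(x)):
        V.append({})
        back.append({})
        for i in STATES:
            k = max(STATES, key=lambda j: V[t - 1][j] + lp(A[j][i]))
            V[t][i] = V[t - 1][k] + lp(A[k][i]) + lp(B[i][x[t]])
            back[t][i] = k
    labels = [max(STATES, key=lambda i: V[-1][i])]
    for t in range(len(x) - 1, 0, -1):
        labels.append(back[t][labels[-1]])
    return list(reversed(labels))

print(label_sequence("ATGCGCGCTAA"))  # one label per base of the input sequence
```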
Outline
• Introduction
• Hidden Markov Models (HMMs)
• Applications
• Summary
• References
Summary
• Introduced Hidden Markov Models (HMMs).
• Defined the basic HMM problems and the corresponding algorithms for solving them.
• Presented some of the best-known applications of HMMs.
References
[1] http://sifaka.cs.uiuc.edu/course/498cxz06s/hmm.pdf
[2] http://www.cs.tut.fi/~puhtunn/lecture_model_search.pdf
[3] http://www.kbs.twi.tudelft.nl/docs/syllabi/speech.pdf
[4] https://www.youtube.com/watch?v=jY2E6ExLxaw
[5] https://en.wikipedia.org/wiki/Bayes%27_theorem
[6] K. Smith, "Hidden Markov Models in Bioinformatics with Application to Gene Finding in Human DNA," available at: https://www.rose-hulman.edu/~shibberu/MA490/Smith.pdf
[7] Nagiza F. Samatova, "Computational Gene Finding using HMMs," presentation slides.
Questions
Q1) Define an HMM and give its components.
Q2) What are the three problems associated with HMMs, and how are they solved?
Q3) What is the difference between the Forward and Viterbi algorithms?
Q4) Given the Assistive Technology example, find P(X=s | Y=c). Hint: use the absolute prior.
Q5) How can a gene be found in human DNA using HMMs? What are the inputs and outputs of the HMM?