Chapter 4: Stochastic Processes
Poisson Processes and Markov Chains
Presented by Vincent Buhr
Overview
The Homogeneous Poisson Process
The Poisson and Binomial Distributions
The Poisson and Gamma Distributions
The Pure-Birth Process
Finite Markov Chains
Modeling
The Poisson Process
A sequence of events occurring during a time interval
forms a homogeneous Poisson process if two conditions
are met:
1. The occurrence of any event in the time interval (a,b) is
independent of the occurrence of any event in the time interval
(c,d), provided the two intervals do not overlap
2. There is a constant λ such that for any sufficiently small time
interval (t,t+h), the probability that exactly one event occurs in
the interval is independent of t and is λh+o(h), and the
probability that more than one event occurs is o(h)
Condition 2 ensures that the probability of an event
occurring within an interval is proportional to the length
of the interval
With these two conditions, the number of events N that
occur up to any time t has a Poisson distribution with
parameter λt
The Poisson Process (cont)
Pj(t) is the probability that N=j at time t
At time 0 the value of N is necessarily 0, so P0(0)=1 and
Pi(0)=0 for all i>0
At time t+h, N=0 only if no events occur in the interval
(0,t) and no events occur in the interval (t,t+h), so:
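Written out under the two conditions above, this statement corresponds to the relation later cited as (4.2). The form below is a reconstruction from the surrounding text, not a copy of the slide:

```latex
P_0(t+h) = P_0(t)\,\bigl(1 - \lambda h + o(h)\bigr) \tag{4.2}
```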
At time t+h, N=1 can happen one of two ways, either
N=1 at time t and no events occur in the interval (t,t+h)
or N=0 at time t and one event occurs in the interval
(t,t+h), which gives:
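A reconstruction of the relation later cited as (4.3), assuming one term for each of the two ways just described:

```latex
P_1(t+h) = P_1(t)\,\bigl(1 - \lambda h + o(h)\bigr)
         + P_0(t)\,\bigl(\lambda h + o(h)\bigr) \tag{4.3}
```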
The Poisson Process (cont)
Finally, N=j with j>1 can occur at time t+h in one of three ways:
N=j at time t and no events occur in the interval (t,t+h)
N=j-1 at time t and one event occurs in the interval (t,t+h)
N=j-k at time t, where k > 1, and k events occur in the
interval (t,t+h)
These possibilities give us an equation that looks a lot
like (4.3):
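A plausible form of the relation cited as (4.4), assuming the third possibility (more than one event in a short interval) contributes only an o(h) term:

```latex
P_j(t+h) = P_j(t)\,\bigl(1 - \lambda h + o(h)\bigr)
         + P_{j-1}(t)\,\bigl(\lambda h + o(h)\bigr) + o(h),
\qquad j = 2, 3, \ldots \tag{4.4}
```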
The only difference between the equations is in the o(h)
terms, so we can use (4.4) for all j > 0
The Poisson Process (cont)
After subtracting P0(t) from both sides of (4.2) and Pj(t)
from both sides of (4.4), dividing through by h, and
letting h → 0 we get two equations:
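The limiting step can be sketched as follows, using o(h)/h → 0 as h → 0. The equation numbers follow the references in the text, and the exact layout is an assumption:

```latex
\frac{P_0(t+h) - P_0(t)}{h} = -\lambda P_0(t) + \frac{o(h)}{h}
\;\;\xrightarrow[h \to 0]{}\;\;
\frac{d}{dt} P_0(t) = -\lambda P_0(t) \tag{4.5}

\frac{d}{dt} P_j(t) = -\lambda P_j(t) + \lambda P_{j-1}(t),
\qquad j = 1, 2, \ldots \tag{4.6}
```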
(4.5) has the solution:
And since we know that P0(0)=1 we can infer that C=1,
so:
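The two steps just described, written out as a reconstruction (only the number (4.9) is confirmed by later references in the text):

```latex
P_0(t) = C e^{-\lambda t}, \qquad
P_0(0) = 1 \;\Rightarrow\; C = 1, \qquad
P_0(t) = e^{-\lambda t} \tag{4.9}
```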
The Poisson Process (cont)
Using (4.9) and mathematical induction we can
prove that the solution to (4.6) is:
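The solution referred to here is the standard Poisson form. The second line is a direct check that it satisfies (4.6), taking (4.6) to be dP_j/dt = -λP_j + λP_{j-1}:

```latex
P_j(t) = \frac{e^{-\lambda t} (\lambda t)^j}{j!}, \qquad j = 0, 1, 2, \ldots

\frac{d}{dt} P_j(t)
  = -\lambda \frac{e^{-\lambda t} (\lambda t)^j}{j!}
    + \lambda \frac{e^{-\lambda t} (\lambda t)^{j-1}}{(j-1)!}
  = -\lambda P_j(t) + \lambda P_{j-1}(t)
```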
Which is the Poisson distribution with
parameter λt as we set out to prove
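As a numerical sanity check of this result, the sketch below integrates the differential equations dP_0/dt = -λP_0 and dP_j/dt = -λP_j + λP_{j-1} with a simple forward-Euler scheme and compares the outcome with the Poisson probabilities. The function names, the step count and the values of λ and t are illustrative assumptions, not part of the slides.

```python
import math


def poisson_pmf(j, mean):
    """Poisson probability e^{-mean} * mean^j / j!."""
    return math.exp(-mean) * mean ** j / math.factorial(j)


def integrate_poisson_odes(lam, t_end, max_j, steps):
    """Forward-Euler integration of
        dP_0/dt = -lam * P_0
        dP_j/dt = -lam * P_j + lam * P_{j-1},  j = 1..max_j
    starting from P_0(0) = 1 and P_j(0) = 0 for j > 0."""
    h = t_end / steps
    p = [1.0] + [0.0] * max_j
    for _ in range(steps):
        new_p = [p[0] - lam * h * p[0]]
        for j in range(1, max_j + 1):
            new_p.append(p[j] + h * (-lam * p[j] + lam * p[j - 1]))
        p = new_p
    return p


lam, t_end = 1.5, 2.0  # illustrative values, not from the slides
numeric = integrate_poisson_odes(lam, t_end, max_j=8, steps=20000)
exact = [poisson_pmf(j, lam * t_end) for j in range(9)]
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
print(f"largest |Euler - Poisson| over j = 0..8: {max_error:.2e}")
```

Forward Euler is used only because it mirrors the small-interval argument of the derivation; a smaller step brings the two sets of numbers closer together.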
The Poisson and Binomial
Distributions
Under special circumstances the binomial
distribution can be made to approximate the
Poisson distribution
If n → ∞ and p → 0 with np = λ held fixed, then for any fixed y
the binomial probability approaches the
Poisson probability
To prove this, first we write the binomial
probability equation as:
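One way to write the binomial probability so that each factor has a limit is shown below. The grouping of factors and the number (4.12) are assumptions based on the later remark about terms of the form (n-i)λ/n:

```latex
\Pr(Y = y) = \binom{n}{y} p^y (1-p)^{n-y}
           = \frac{1}{y!} \Bigl[ \prod_{i=0}^{y-1} (n-i)\,p \Bigr]
             (1-p)^{n} (1-p)^{-y} \tag{4.12}
```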
The Poisson and Binomial
Distributions (cont)
If we then fix y and λ, and write p as λ / n then
as n approaches infinity each term in (4.12)
has a finite limit
Terms of the form (n-i) λ / n approach λ, (1 - λ/n)^n approaches
e^(-λ), and (1 - λ/n)^(-y) approaches 1
With these limits (4.12) approaches λ^y e^(-λ) / y!
Which is the Poisson probability
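The convergence can also be seen numerically. The sketch below evaluates the binomial probability with p = λ/n for increasing n and compares it with the Poisson probability e^(-λ) λ^y / y!. The values λ = 2 and y = 3 are arbitrary illustrative choices.

```python
import math


def binomial_pmf(y, n, p):
    """Binomial probability C(n, y) * p^y * (1 - p)^(n - y)."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def poisson_pmf(y, lam):
    """Poisson probability e^{-lam} * lam^y / y!."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


lam, y = 2.0, 3  # illustrative values, not from the slides
target = poisson_pmf(y, lam)
errors = {}
for n in (10, 100, 1000, 100000):
    value = binomial_pmf(y, n, lam / n)
    errors[n] = abs(value - target)
    print(f"n = {n:6d}  binomial = {value:.6f}  |difference| = {errors[n]:.2e}")
print(f"Poisson probability = {target:.6f}")
```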
The Poisson and Gamma
Distributions
Equation (4.9) hints at a connection between the Poisson
distribution and the exponential distribution
The random time until the first event occurs in a Poisson
process with parameter λ is given by the exponential
distribution with parameter λ
To prove this we can let F(t) be the probability that the first
event occurs before time t, which means that the density
function for the time until the first occurrence is the
derivative of F(t)
From (4.9)
So:
Which is the exponential distribution
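Written out, with F(t) as defined above and P_0(t) = e^{-λt} from (4.9), the two steps are as follows (f is a symbol assumed here for the density):

```latex
F(t) = 1 - P_0(t) = 1 - e^{-\lambda t}, \qquad
f(t) = \frac{d}{dt} F(t) = \lambda e^{-\lambda t}, \qquad t \ge 0
```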
The Poisson and Gamma
Distributions (cont)
Additionally, the distribution of the time between successive events is
also given by the exponential distribution
This means that the random time until the kth event occurs is the
sum of k independent exponentially distributed times, which has the
gamma distribution
To prove this, let t0 be some fixed value of t
Then, if the time until the kth event occurs exceeds t0, the number of
events occurring before time t0 is less than k
So the probability that k-1 or fewer events occur before time t0 is equal
to the probability that the time until the kth event occurs exceeds t0
This leads us to the following equality:
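A reconstruction of the equality, with the Poisson probability of at most k-1 events by time t0 on the left and the tail probability of the waiting time on the right:

```latex
\sum_{j=0}^{k-1} \frac{e^{-\lambda t_0} (\lambda t_0)^j}{j!}
  = \int_{t_0}^{\infty} \frac{\lambda^k t^{k-1} e^{-\lambda t}}{\Gamma(k)} \, dt
```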
The RHS of this equality is essentially the gamma distribution
The Pure-Birth Process
When deriving the Poisson distribution we assumed that the
probability of an event in a time interval is independent of the
number of events that have occurred up to time t
This assumption does not always hold in biological applications
In the pure-birth process it is assumed that given the value of a
random variable at time t is j, the probability that it increases to j+1
in a given small time interval (t,t+h) is λ_j h
The Poisson case arises when λ_j is independent of j and is just
written as λ
As with the Poisson process we can arrive at a set of differential
equations for the probability that the random variable takes the
value j at time t
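Repeating the Poisson argument with a rate that depends on the current state suggests the following equations. This is a sketch of the natural generalization rather than a copy of the slide; the P_{j-1} term is absent for the smallest possible state:

```latex
\frac{d}{dt} P_j(t) = -\lambda_j P_j(t) + \lambda_{j-1} P_{j-1}(t)
```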
The Pure-Birth Process (cont)
One example of an application of the Pure-Birth
process is the Yule process, where it is
assumed that λ_j = jλ

The motivation for this process arises from
populations where if the size of the population is j the
probability that it increases to size j+1 is proportional
to j
For this case, the solution to the differential
equations given before is:
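One standard closed form, under the assumption (not stated in the transcript) that the population consists of a single individual at time 0, is the geometric distribution P_j(t) = e^(-λt) (1 - e^(-λt))^(j-1) for j = 1, 2, ... The sketch below integrates the pure-birth equations dP_j/dt = -λ_j P_j + λ_{j-1} P_{j-1} with λ_j = jλ by forward Euler and compares the result with that formula. The function name and parameter values are hypothetical.

```python
import math


def integrate_pure_birth(rate, start, max_j, t_end, steps):
    """Forward-Euler integration of
        dP_j/dt = -rate(j) * P_j + rate(j-1) * P_{j-1}
    for the states start..max_j, with P_start(0) = 1.
    Returns a dict mapping j to the approximate P_j(t_end)."""
    h = t_end / steps
    states = range(start, max_j + 1)
    p = {j: 0.0 for j in states}
    p[start] = 1.0
    for _ in range(steps):
        new_p = {}
        for j in states:
            # The lowest state has no inflow from below.
            inflow = rate(j - 1) * p[j - 1] if j > start else 0.0
            new_p[j] = p[j] + h * (inflow - rate(j) * p[j])
        p = new_p
    return p


lam, t_end = 0.8, 1.5  # illustrative values, not from the slides
numeric = integrate_pure_birth(lambda j: j * lam, start=1, max_j=12,
                               t_end=t_end, steps=20000)
q = math.exp(-lam * t_end)
exact = {j: q * (1.0 - q) ** (j - 1) for j in numeric}
max_error = max(abs(numeric[j] - exact[j]) for j in numeric)
print(f"largest |Euler - geometric| over j = 1..12: {max_error:.2e}")
```

Truncating at max_j does not affect the lower states, because each P_j depends only on states below it.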
The Pure-Birth Process (cont)
Another example of the application of the Pure-Birth
process comes from polymerase chain reaction (PCR)
In PCR, sequential additions of base pairs to a primer
occur to create the product
For this process, λ_j = m-j, which implies that once the
length of the product reaches m no further increase in
length is possible
With this condition, the solution is:
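Under the assumption (not stated in the transcript) that the product has length 0 at time 0, the pure-birth equations with λ_j = m-j are satisfied by a binomial distribution with success probability 1 - e^{-t}:

```latex
P_j(t) = \binom{m}{j} e^{-(m-j)t} \bigl(1 - e^{-t}\bigr)^{j},
\qquad j = 0, 1, \ldots, m
```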
Neither this example nor the last follows the Poisson
distribution, which shows the importance of verifying the
event independence assumption
Introduction to Finite Markov
Chains
A Markov chain process occupies one of a finite number
of discrete states at a given time unit
In a time step from t to t+1 the process either stays in the
same state or moves to another state in a probabilistic
way (as opposed to deterministic)
A simple Markov chain has two basic properties:
The Markov property: the probability that the process changes
from Ej to Ek in one time step depends only on the current state
Ej and not on any past states
Temporally homogeneous transition probabilities: if
at time t the process is in state Ej, the probability that it changes
to Ek in one time step is independent of t
Some Markov processes ignore one or both of these
properties, but we will assume both hold
Transition Probabilities and the
Transition Probability Matrix
If at time t a Markovian random variable is in state Ej the
probability that at time t+1 it is in state Ek is denoted by
pjk, which is the transition probability from Ej to Ek

This notation implicitly contains both the properties mentioned
before
A transition probability matrix P of a Markov chain
contains all of the transition probabilities of that chain
Transition Probabilities and the
Transition Probability Matrix (cont)
It is also assumed that there is an initial probability
distribution for the states in the process

This means that there is a probability πi that at the initial time
point the Markovian random variable is in state Ei
To find the probability that the Markov chain process is in
state Ej two time steps after being in state Ei you must
consider all the possible intermediate states that the
process could be in after one time step
This can also be done for the whole process at once by
matrix multiplication; the notation P^n is used to denote the
n-step transition probability matrix
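In symbols, the two-step probability from Ei to Ej is the sum over intermediate states Ek of p_ik p_kj, which is the (i,j) entry of P multiplied by itself. A minimal sketch with a hypothetical 3-state matrix (not taken from the slides):

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def n_step_matrix(p, n):
    """n-step transition matrix P^n by repeated multiplication (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result


# Hypothetical transition matrix; each row sums to 1.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.8, 0.1],
     [0.4, 0.4, 0.2]]

P2 = n_step_matrix(P, 2)

# Two-step probability from E1 to E3, summed over the intermediate state.
by_hand = sum(P[0][k] * P[k][2] for k in range(3))
print(f"entry of P^2: {P2[0][2]:.4f}, sum over intermediate states: {by_hand:.4f}")
```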
Markov Chains with Absorbing
States
A Markov chain with an absorbing state can be
recognized by the appearance of a 1 along the
main diagonal of its transition probability matrix
A Markov chain with an absorbing state that can be reached
from every other state will eventually enter that state and
never leave it
Markov chains with absorbing states bring up
new questions, which will be addressed later,
but for now we will only consider Markov chains
without absorbing states
Markov Chains with No Absorbing
States
In addition to having no absorbing states, the Markov
models that we will consider are also finite, aperiodic, and
irreducible



Finite means that there are a finite number of possible states
Aperiodic means that there is no state such that a return to that
state is possible only t0, 2t0, 3t0, … transitions later, where t0 > 1
Irreducible means that any state can eventually be reached from
any other state, but not necessarily in one step
Stationary Distributions
Let the probability that at time t a Markov chain process
is in state Ej be φj
This means that the probability that at time t+1 the
process is in state Ej is given by
If we assume that these two probabilities are equal then
we get:
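A reconstruction of the two expressions, summing over the state Ek occupied at time t. Only the number (4.25) is confirmed by later references in the text:

```latex
\Pr(\text{state } E_j \text{ at time } t+1) = \sum_{k} \varphi_k \, p_{kj},
\qquad
\varphi_j = \sum_{k} \varphi_k \, p_{kj} \tag{4.25}
```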
If this is the case, then the process is said to be
stationary, that is, from time t onwards, the probability of
the process being in state Ej does not change
Stationary Distributions (cont)
If the row vector φ’ is defined by:
Then we get the following from (4.25)
The row vector must also satisfy
With these equations we can find the stationary
distribution when it exists

Note that (4.27) generates one redundant equation
that can be omitted
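In matrix form these statements can be written as below, where s is the number of states and 1 is a column vector of ones (both symbols are assumed here). Summing the s equations in (4.27) gives an identity, because every row of P sums to 1, which is why one of them is redundant:

```latex
\varphi' = (\varphi_1, \varphi_2, \ldots, \varphi_s), \qquad
\varphi' = \varphi' P \tag{4.27}

\varphi' \mathbf{1} = \sum_{j=1}^{s} \varphi_j = 1 \tag{4.28}
```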
Stationary Distribution Example
We are given a Markov chain with the following transition probability matrix
Using (4.27) and (4.28) we can form a set of equations to solve
The solution to these equations is:
This means that over a long time period a random variable with the given
transition matrix should spend about 24.14% of the time in state E1, 38.51%
of the time in state E2, etc.
Stationary Distribution Example
(cont)
With matrix multiplication we can see how quickly the
Markov chain process would reach the stationary
distribution
From this it appears that the stationary distribution is
approximately reached after 16 time steps
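Both slides of this example can be illustrated with a short sketch. The matrix below is not taken from the transcript; it is an illustrative choice whose stationary distribution agrees with the quoted 24.14% and 38.51%. The stationary vector is found by iterating φ' ← φ'P, a simple alternative to solving (4.27) and (4.28) directly, and the 16-step distributions are computed from each starting state.

```python
def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[k] * m[k][j] for k in range(len(v))) for j in range(len(m[0]))]


def stationary_by_iteration(p, tol=1e-12, max_steps=10000):
    """Iterate phi <- phi P from the uniform distribution until it stops changing."""
    phi = [1.0 / len(p)] * len(p)
    for _ in range(max_steps):
        nxt = vec_mat(phi, p)
        if max(abs(a - b) for a, b in zip(nxt, phi)) < tol:
            return nxt
        phi = nxt
    return phi


# Illustrative 4-state matrix (an assumption, see the text above).
P = [[0.6, 0.1, 0.2, 0.1],
     [0.1, 0.7, 0.1, 0.1],
     [0.2, 0.2, 0.5, 0.1],
     [0.1, 0.3, 0.1, 0.5]]

phi = stationary_by_iteration(P)
print("stationary distribution:", [round(x, 4) for x in phi])

# Distribution after 16 steps when the chain starts in each state in turn;
# these are the rows of the 16-step transition matrix.
rows_after_16 = []
for start in range(4):
    dist = [1.0 if i == start else 0.0 for i in range(4)]
    for _ in range(16):
        dist = vec_mat(dist, P)
    rows_after_16.append(dist)
    print(f"start in E{start + 1}:", [round(x, 4) for x in dist])
```

Every row of the 16-step matrix is close to the stationary vector, so the starting state has almost no influence by then.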
The Graphical Representation of a
Markov Chain
It can be convenient to represent a Markov chain by a directed
graph, using the states as nodes and the transitions as edges
labeled with their probabilities
Additionally, start and end states can be added as needed
The graph structure without probabilities added is called the topology
These definitions are used later in the book to discuss hidden
Markov models
Modeling
While the homogeneous Poisson process has many
uses in bioinformatics, the two assumptions made at the
beginning of the chapter (homogeneity and
independence) do not always hold
Similarly, the assumptions made by Markov chain
processes do not always hold
For example, from analyzing DNA, it has become
apparent that the probability that the nucleotide a is
followed by g depends to some extent on the location of
a gene in a chromosome; the nucleotides preceding
a may also have an effect on the probability of g following a
However, the memoryless Markov chain property is often
assumed even when its applicability is uncertain
Modeling (cont)
In general, mathematical models often make simplifying
assumptions about properties of the events being
modeled
There are few cases where we could predict outcomes
of events with exact accuracy, but even if we could,
modeling may be more desirable due to the complexity
of the phenomena being analyzed
It is often necessary to find a middle ground between
easily solvable, simple models and cumbersome,
complex ones
The key to this is knowing that a model only needs to
capture enough of the true complexity of the situation to
serve our purposes
Modeling (cont)
Finding the middle ground may not always be easy,
however, since benchmarks of model
performance are often not readily available, and
subjectivity may come into play
An example of modeling simplification comes from the
early version of BLAST, which assumed that nucleotides
are identically and independently distributed along a
DNA sequence
We now know that this is not true, but the BLAST
procedure does work, in that it captures enough of
biological reality to be effective
Another aspect of modeling is that we often assume that
a model will become more refined as it is used and we
learn more about the reality of the phenomena; models
are rarely said to be in their final versions