Stochastic Processes & Markov Chain
From DeGroot & Schervish
Example: Occupied Telephone Lines
Suppose that a certain business office has five telephone lines
and that any number of these lines may be in use at any given
time.
The telephone lines are observed at regular intervals of 2
minutes and the number of lines that are being used at each time
is noted.
Let X1 denote the number of lines that are being used when the
lines are first observed at the beginning of the period; let X2
denote the number of lines that are being used when they are
observed the second time, 2 minutes later;
and in general, for n = 1, 2, . . . , let Xn denote the number of lines
that are being used when they are observed for the nth time.
Stochastic Process
A sequence of random variables X1, X2, . . . is called a stochastic process or random process with a discrete time parameter.
The first random variable X1 is called the initial state
of the process;
and for n = 2, 3, . . . , the random variable Xn is called
the state of the process at time n.
Stochastic Process (cont.)
In the example, the state of the process at any time is
the number of lines being used at that time.
Therefore, each state must be an integer between 0
and 5.
Stochastic Process (cont.)
In a stochastic process with a discrete time parameter,
the state of the process varies in a random manner
from time to time.
To describe a complete probability model for a
particular process, it is necessary to specify the
distribution for the initial state X1 and also to specify
for each n = 1, 2, . . . the conditional distribution of the
subsequent state Xn+1 given X1, . . . , Xn.
Pr(X_{n+1} = x_{n+1} | X_1 = x_1, X_2 = x_2, . . . , X_n = x_n).
Markov Chain
A stochastic process with a discrete time parameter is a Markov chain if,
for each time n, the probabilities of all Xj for j > n given X1, . . . , Xn depend only on Xn and not on the earlier states X1, . . . , Xn−1. That is,
Pr(X_{n+1} = x_{n+1} | X_1 = x_1, X_2 = x_2, . . . , X_n = x_n)
= Pr(X_{n+1} = x_{n+1} | X_n = x_n).
A Markov chain is called finite if there are only finitely
many possible states.
Example: Shopping for Toothpaste
Consider a shopper who chooses between two brands
of toothpaste on several occasions. Let Xi = 1 if the
shopper chooses brand A on the ith purchase, and let
Xi = 2 if the shopper chooses brand B on the ith
purchase.
Then the sequence of states X1, X2, . . . is a stochastic
process with two possible states at each time.
Suppose that, on each purchase after the first, the shopper will choose the same brand as on the previous purchase with probability 1/3 and will switch brands with probability 2/3.
Example: Shopping for Toothpaste (cont.)
Since this happens regardless of purchases that are
older than the previous one, we see that this stochastic
process is a Markov chain with:
Pr(X_{n+1} = 1 | X_n = 1) = 1/3
Pr(X_{n+1} = 2 | X_n = 1) = 2/3
Pr(X_{n+1} = 1 | X_n = 2) = 2/3
Pr(X_{n+1} = 2 | X_n = 2) = 1/3
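As a quick aside (not part of the text), the toothpaste chain is easy to simulate; the following minimal Python sketch, with the helper name `simulate` and the use of NumPy assumed for illustration, estimates Pr(X_{n+1} = 1 | X_n = 1) empirically and should give a value close to 1/3.

```python
import numpy as np

# Transition matrix for the toothpaste chain:
# row/column 0 = brand A (state 1), row/column 1 = brand B (state 2)
P = np.array([[1/3, 2/3],
              [2/3, 1/3]])

rng = np.random.default_rng(0)

def simulate(P, start, n_steps):
    """Simulate a path of the Markov chain starting from state `start`."""
    path = [start]
    for _ in range(n_steps):
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return path

path = simulate(P, start=0, n_steps=100_000)

# Empirical estimate of Pr(X_{n+1} = A | X_n = A); it should be close to 1/3.
stays = sum(1 for a, b in zip(path, path[1:]) if a == 0 and b == 0)
visits = sum(1 for a in path[:-1] if a == 0)
print(stays / visits)
```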
Transition Distributions / Stationary Transition Distributions
Consider a finite Markov chain with k possible states.
The conditional distributions of the state at time n + 1 given the state at time n, that is, Pr(X_{n+1} = j | X_n = i) for i, j = 1, . . . , k and n = 1, 2, . . ., are called the transition distributions of the Markov chain.
If the transition distributions are the same for every time n (n = 1, 2, . . .), then the Markov chain has stationary transition distributions.
Stationary Transition Distributions
The notation for stationary transition distributions, p_{ij} = Pr(X_{n+1} = j | X_n = i), suggests that they could be arranged in a matrix.
The transition probabilities for the Shopping for Toothpaste example can be arranged in the following matrix:
P =
[ 1/3  2/3 ]
[ 2/3  1/3 ]
Transition Matrix
Consider a finite Markov chain with stationary transition distributions given by
p_{ij} = Pr(X_{n+1} = j | X_n = i) for all n, i, j.
The transition matrix of the Markov chain is defined to be the k × k matrix P with elements p_{ij}. That is,
P =
[ p_{11}  p_{12}  . . .  p_{1k} ]
[ p_{21}  p_{22}  . . .  p_{2k} ]
[  . . .   . . .  . . .  . . .  ]
[ p_{k1}  p_{k2}  . . .  p_{kk} ]
Transition Matrix (cont.)
A transition matrix has several properties that are
apparent from its definition.
For example, each element is nonnegative because all
elements are probabilities.
Since each row of a transition matrix is a conditional p.f. for the next state given some value of the current state, we have
p_{i1} + p_{i2} + . . . + p_{ik} = 1, for i = 1, . . . , k.
Stochastic Matrix
A square matrix for which all elements are nonnegative and the sum of the elements in each row is 1 is called a stochastic matrix.
It is clear that the transition matrix P for every finite
Markov chain with stationary transition probabilities
must be a stochastic matrix.
Conversely, every k × k stochastic matrix can serve as
the transition matrix of a finite Markov chain with k
possible states and stationary transition distributions.
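As an aside, the two defining conditions of a stochastic matrix (nonnegative entries, rows summing to 1) are easy to verify numerically; here is a minimal sketch, with the helper name `is_stochastic` and the tolerance chosen purely for illustration.

```python
import numpy as np

def is_stochastic(P, tol=1e-9):
    """Return True if P is square, nonnegative, and every row sums to 1."""
    P = np.asarray(P, dtype=float)
    return (
        P.ndim == 2
        and P.shape[0] == P.shape[1]
        and bool(np.all(P >= 0))
        and bool(np.allclose(P.sum(axis=1), 1.0, atol=tol))
    )

# The toothpaste transition matrix is stochastic ...
print(is_stochastic([[1/3, 2/3], [2/3, 1/3]]))   # True
# ... but a matrix with a row summing to 1.1 is not.
print(is_stochastic([[0.5, 0.6], [0.2, 0.8]]))   # False
```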
Example
Suppose that in the example involving the office with
five telephone lines, the numbers of lines being
used at times 1, 2, . . . form a Markov chain with
stationary transition distributions.
This chain has six possible states 0, 1, . . . , 5, where i is
the state in which exactly i lines are being used at a
given time (i = 0, 1, . . . , 5).
Suppose that the transition matrix P is as follows:
Example (cont.)
(a) Assuming that all five lines are in use at a certain observation time,
we shall determine the probability that exactly four lines will be in use
at the next observation time.
(b) Assuming that no lines are in use at a certain time, we shall
determine the probability that at least one line will be in use at the next
observation time.
Example
A manager usually checks the server at her store every 5 minutes to see whether the server is busy or not. She models the state of the server (1 = busy, 2 = not busy) as a Markov chain with two possible states and stationary transition distributions given by the following matrix:
P =
[ 0.9  0.1 ]
[ 0.6  0.4 ]
Example (cont.)
Suppose we want the probability that the server will be in each state two checks later, given its current state.
Pr(X_{n+2} = 1 | X_n = 1) = Pr(X_{n+1} = 1, X_{n+2} = 1 | X_n = 1) + Pr(X_{n+1} = 2, X_{n+2} = 1 | X_n = 1).
Pr(X_{n+1} = 1, X_{n+2} = 1 | X_n = 1) = Pr(X_{n+1} = 1 | X_n = 1) Pr(X_{n+2} = 1 | X_{n+1} = 1) = 0.9 × 0.9 = 0.81.
Similarly,
Pr(X_{n+1} = 2, X_{n+2} = 1 | X_n = 1) = Pr(X_{n+1} = 2 | X_n = 1) Pr(X_{n+2} = 1 | X_{n+1} = 2) = 0.1 × 0.6 = 0.06.
It follows that Pr(X_{n+2} = 1 | X_n = 1) = 0.81 + 0.06 = 0.87, and hence Pr(X_{n+2} = 2 | X_n = 1) = 1 − 0.87 = 0.13.
By similar reasoning, if X_n = 2,
Pr(X_{n+2} = 1 | X_n = 2) = 0.6 × 0.9 + 0.4 × 0.6 = 0.78,
and Pr(X_{n+2} = 2 | X_n = 2) = 1 − 0.78 = 0.22.
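As the next slides explain, these two-step probabilities can also be read off the square of the transition matrix; a minimal NumPy sketch of that check, using the manager's matrix above, is:

```python
import numpy as np

# Transition matrix from the manager's example:
# state 1 = busy, state 2 = not busy
P = np.array([[0.9, 0.1],
              [0.6, 0.4]])

# Squaring P gives the two-step transition probabilities.
P2 = P @ P
print(P2)   # rows: [0.87, 0.13] and [0.78, 0.22], matching the hand computation
```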
The Transition Matrix for Several Steps
Consider a general Markov chain with k possible states 1, . . . , k and the transition matrix P.
Assuming that the chain is in state i at a given time n, we shall now determine the probability that the chain will be in state j at time n + 2.
In other words, we shall determine the conditional probability of X_{n+2} = j given X_n = i. The notation for this probability is p^{(2)}_{ij}.
The Transition Matrix for Several Steps (cont.)
Let r denote the value of X_{n+1}:
p^{(2)}_{ij} = Pr(X_{n+2} = j | X_n = i)
= ∑_{r=1}^{k} Pr(X_{n+1} = r | X_n = i) Pr(X_{n+2} = j | X_{n+1} = r)
= ∑_{r=1}^{k} p_{ir} p_{rj}.
The Transition Matrix for Several Steps (cont.)
The value of p^{(2)}_{ij} can be determined in the following manner: if the transition matrix P is squared, that is, if the matrix P^2 = PP is constructed, then the element in the ith row and the jth column of the matrix P^2 will be ∑_{r=1}^{k} p_{ir} p_{rj}.
Therefore, p^{(2)}_{ij} will be the element in the ith row and the jth column of P^2.
Multiple Step Transitions
Let P be the transition matrix of a finite Markov chain
with stationary transition distributions.
For each m = 2, 3, . . ., the mth power P^m of the matrix P has in row i and column j the probability p^{(m)}_{ij} that the chain will move from state i to state j in m steps.
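As an illustrative sketch (using the manager's two-state matrix from the earlier example and NumPy's matrix_power), the m-step probabilities p^{(m)}_{ij} are simply the entries of P^m:

```python
import numpy as np

# Manager's two-state chain: state 1 = busy, state 2 = not busy
P = np.array([[0.9, 0.1],
              [0.6, 0.4]])

for m in (2, 3, 10):
    Pm = np.linalg.matrix_power(P, m)
    # Pm[0, 0] is p^(m)_11: the probability that the server is busy
    # m checks from now, given that it is busy now.
    print(m, Pm[0, 0])
```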
Example
Consider again the transition matrix P given earlier for the Markov chain based on the five telephone lines.
We shall assume first that i lines are in use at a certain
time, and we shall determine the probability that
exactly j lines will be in use two time periods later.
If we multiply the matrix P by itself, we obtain the
following two-step transition matrix:
Example (cont.)
i. If two lines are in use at a certain time, then the
probability that four lines will be in use two time periods
later is ..
ii. If three lines are in use at a certain time, then the
probability that three lines will again be in use two time
periods later is ..
The Initial Distribution
The manager in the example above enters the store thinking that the probability is 0.3 that the server will be busy the first time that she checks.
Hence, the probability is 0.7 that the server will not be busy.
We can represent this distribution by the vector
v = (0.3, 0.7)
that gives the probabilities of the two states at time 1 in
the same order that they appear in the transition matrix.
Probability Vector / Initial Distribution
A vector consisting of nonnegative numbers that add
to 1 is called a probability vector.
A probability vector whose coordinates specify the probabilities that a Markov chain will be in each of its states at time 1 is called the initial distribution of the chain or the initial probability vector.
Example
Consider again the office with five telephone lines and the Markov chain whose transition matrix P was given earlier.
Suppose that at the beginning of the observation process at
time n = 1, the probability that no lines will be in use is 0.5,
the probability that one line will be in use is 0.3, and the
probability that two lines will be in use is 0.2.
The initial probability vector is v = (0.5, 0.3, 0.2, 0, 0, 0).
We shall determine the distribution of the number of lines in use at time 2, one period later.
Example (cont.)
By an elementary computation it will be found that
vP = (0.13, 0.33, 0.22, 0.12, 0.10, 0.10).
Since the first component of this probability vector is
0.13, the probability that no lines will be in use at time
2 is 0.13; since the second component is 0.33, the
probability that exactly one line will be in use at time 2
is 0.33; and so on.
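The same one-step update v ↦ vP can be checked numerically for the manager's example, whose transition matrix and initial distribution are fully specified above; a minimal NumPy sketch:

```python
import numpy as np

# Manager's example: state 1 = busy, state 2 = not busy
P = np.array([[0.9, 0.1],
              [0.6, 0.4]])
v = np.array([0.3, 0.7])   # initial distribution at time 1

# The distribution of the state at time 2 is the vector-matrix product v P.
print(v @ P)   # approximately [0.69, 0.31]
```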