Simon Fraser University
Transcript Simon Fraser University
CMPT820 Multimedia Systems
School of Computing Science
Simon Fraser University
Nov 2nd , 2010
Basic Probability Theory and Information Theory Concepts.
Discrete MemoryLess Channels (DMC)
Discreet Random variables
Binary Erasure Channel (BEC)
Packet Erasure Channel (PEC)
BEC with Feedback
Channels with memory
Evaluating a desired loss/recovery probability
Why to model channels ?
Play a crucial role in the design and development of multimedia
applications as well as resilience techniques to protect the content
over the Internet.
End-to-end multimedia applications parameters influence the
particulate techniques used for recovering lost packets.
the application designer may choose to adopt a strategy for recovering
lost packets that is based on retransmission, Forward Error Correction
(FEC), or both.
We will cover fundamental analysis tools that are used to
characterize the loss performance of channels and networks that
carry multimedia packets
Then (next talk) describe a comprehensive Internet video study
conducted for gaining insight into a variety end-end performance
parameters that are crucial for real-time multimedia applications.
Discrete Random Variable
Probability Mass Functions:
Joint Probability Mass Functions
Example of pmf for d.r.v X
Marginal Probability Mass Functions:
Conditional Probability Mass Functions:
Independent Random Variables:
Joint pmf for two r.v X and Y
Entropy (uncertainty ):
measure of the amount of uncertainty (randomness)
associated with the value of the discrete random
variable X (the expected value of the information
content of X )
0 <= H(X) <= Log L ;
where X have L possible values
Conditional entropy (equivocation)
Individual (H(X),H(Y)), joint (H(X,Y)), and conditional entropies for a pair
of correlated subsystems X,Y with mutual information I(X; Y).
mutual information measures the information that X and Y share: it
measures how much knowing one of these variables reduces our
uncertainty about the other. For example,
if X and Y are independent, then knowing X does not give any information
about Y and vice versa, so their mutual information is zero.
if X and Y are identical then all information conveyed by X is shared with Y:
knowing X determines the value of Y and vice versa. As a result, in the case
of identity the mutual information is the same as the uncertainty contained in
Y (or X) alone, namely the entropy of Y (or X: clearly if X and Y are identical
they have equal entropy).
The mutual information of X relative to Y is given by:
its the maximum amount of error-free digital
data (that is, information) that can be
transmitted over a communication link with a
specified bandwidth in the presence of noise
interference, ( Shannon limit or Shannon
capacity) or ( the theoretical maximum
information transfer rate of the channel ).
is given by the maximum of the mutual
information between the input and output of the
channel, where the maximization is with respect
to the input distribution
developed by Claude E.
Shannon during World
Bits or packets per channel use
Discrete Memoryless Channels DMC
p(y|x) that expresses the probability of observing the output symbol y
given that we send the symbol x. The channel is said to be
memoryless if the probability distribution of the output depends only on
the input at that time and is conditionally independent of previous
channel inputs or outputs.
Binary Erasure Channel (BEC) *
The input X: binary (Bernoulli) random variable (0 or 1)
: The loss (erased or deleted) probability
The output Y: ternary random variable (0,1 or erasure)
* introduced by Peter Elias of MIT in 1954 as a toy example.
No Error occur over a BEC as:
Pr[Y=0 | X=1] = Pr[Y=1 | X=0] = 0 .
This capacity, which is measured in “bits” per “channel use,” can be achieved when the
channel input X is a uniform random variable with Pr[X = 0] = Pr[X =1] = 1/2.
Packet Erasure Channel (PEC)
The multimedia content is usually packetized and transmitted over internet link as
“integrated vectors” of bits rather than individual bits (the bits on same packet are
100% dependent on each other; either all transmitted successfully or all bits are
The input is a vector of random variables :
is a binary random variable
The capacity in this case is measured in “packets” per “channel use.”
Cascaded BEC Channels
Lets assume that we have two BEC channels that are in cascade (two internet links
over which multimedia packets are routed):
Then the loss probability for the two channels could be different :
Assuming the two channels are independent
For L cascaded links of BEC-independent channels:
The Overall channel capacity:
The end-end path of L BEC channels is equivalent to a BEC channel
with an effective end-end loss probability:
the overall capacity C of the end-to-end route is bounded by the
capacity Cmin of the link with the smallest capacity among all
cascaded channels in the route. In other words, C ≤ Cmin = mini(Ci).
Cmin should not be confused by the “bottleneck” bandwidth Bmin.
The BEC Channel with Feedback
Discrete memoryless channel with feedback
It is quite common for many Internet applications, including multimedia ones, to
support some form of feedback from the receiver to the transmitter. This could include
feedback regarding requests for retransmissions of lost packets.
what happens to the overall performance bounds of such channels with feedback?
Information theory provides an interesting and arguably a surprising answer for any
(DMC) Channel, feedback does not improve (or worsen for that matter) :
* Proof: Elements of Information Theory, Second Edition, Thomas M. Cover, Joy A.
Cascaded BEC/PEC with feedback
For cascaded BEC/PEC with feedback on an end-end bases:
If the feedback is done on a link-by-link basis, then the overall channel capacity of a
cascaded set of links is bounded by the capacity of the “bottlneck” link:
The aforementioned results for channel capacity with feedback are applicable to
memoryless channels. Meanwhile, it has been well established that channels with
memory could increase their capacity by employing feedback. The performance
aspects of channels with memory are addressed next.
Packet Losses over channel with memory
Packet losses over Internet links and routes exhibit a high level of
correlation and tend to occur in bursts. This observation, which has been
well established by several Internet packet-loss studies, indicates that
Internet links and routes exhibit memory.
The most popular analysis and modeling tool used to capture memory is
based on Markov chains
A special case of Markov channels is the Gilbert–Elliott channel, which
consists of a two-state Markov chain
Good State (G) : the receiver receives all packets correctly
Bad State (B) : all packets are lost
G and B: represent the Good state and the Bad state, respectively
• pGG = 1 - pGB and pBB = 1 – pBG
• The steady state probabilities :
Evaluating a desired loss/recovery probability
We want to evaluate receiving i packets among n packets transmitted over
Gilbert erasure channels.
Construct another Markov process by extending the two-state Gilbert model
using the number of correctly received packets as the indexes for the sate of the
extended Markov process
The probability that the sender transmits n packets and the receiver correctly receives i
packets is the probability that the process starts at G0 or B0 and is in state Gi or Bi after n
The generating function  ФG0Gi (z) of φG0Gi (n)
taking the inverse Z transform, we have:
The detail of the derivation of this equation is outlined in Appendix A in .
Similarly, it can be shown that:
• If we use φ(n, i) to represent the probability that the sender
transmits n packets and the receiver correctly receives i packets,
• For FEC codes, we are often concerned about the probability that
a node can receive enough packets to decode an FEC block. For
a (n, k) FEC code, this probability (decodable probability) is:
Packet Correlation over Channels with
A more useful insight and analysis can be gained by considering other
The average loss p
the packet correlation ρ, where:
The Steady state probabilities are directly related to the average loss rate :
The packet erasure correlation ρ provides an average measure of how the
states of two consecutive packets are correlated to each other:
when ρ = 0, the loss process is memoryless (BEC)
as the value of ρ increases, then the states of two consecutive packets
become more and more correlated.
Probability of receiving i packets given that the transmitter sends n = 30
packets, for loss probability p = 0.1;
A detailed view of the probability of receiving i packets given that the
transmitter sends n packets;
as the correlation is strong, once the process initially starts in bad
or good state, it has the inertia to stay at that state
the average decodable probability of a receiver when the sender sends FEC blocks
through a Gilbert channel. The block size n is set 30, the average loss rate of the
channel is set to be 1%; k is changed from 20 to 30, and the packet correlation ρ is
changed from 0 to 0.9.
Packet Losses over cascaded channels with
Internet routes and paths consist of multiple links that can be modeled as
cascaded channels. Further, each of these links/channels usually exhibits
A representation of the reception of i packets when transmitting n packets
over two-cascaded channels with memory.
the desired probability φ(n, i) of receiving i packets at the output of these two
cascaded channels when the transmitter sends n packets into the input of the
in other words:
Hence if the cascaded channels with memory are Gilbert channels,
Now for N channels with memory:
A more Compact representation is :
 Schaar and Chou (editors), Multimedia over IP and
Wireless Networks: Compression, Networking, and
Systems, Elsevier, 2007.
 Thomas M. Cover, Joy A. Thomas, Elements of Information
Theory, Second Edition, Wiley, 1991.
 M. Wu and H. Radha. “Network-Embedded FEC (NEF)
Performance for Multi-Hop Wireless Channels with
Memory,” IEEE International Conference on Communications (ICC),
 M.Wu and H. Radha. “Network Embedded FEC for Overlay and
P2P Multicast over Channels with Memory,” Conference on
Information Sciences and Systems (CISS), Johns Hopkins
University, March 2005.
 R. A. Howard, Dynamic Probabilistic Systems. New York: