
UNIT I
Entropy and Uncertainty





Entropy is the irreducible complexity below which a signal
cannot be compressed.
Capacity is the ultimate transmission rate for reliable
communication over a noisy channel.
Entropy: The probabilistic behavior of a source of
information.
H(X) = ∑_{k=0}^{K-1} p_k log2(1/p_k)
Capacity: Intrinsic ability of a channel to convey information.
Information: Measure of uncertainty of an event.
I_k = log2(1/p_k)
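To make these definitions concrete, here is a minimal Python sketch that evaluates the self-information of each symbol and the entropy of a source; the four-symbol probabilities are assumed purely for illustration.

```python
# Sketch: self-information and entropy for an assumed symbol distribution.
import math

def self_information(p):
    """I_k = log2(1/p_k): information gained when a symbol of probability p occurs."""
    return math.log2(1.0 / p)

def entropy(probs):
    """H(X) = sum_k p_k * log2(1/p_k), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [0.5, 0.25, 0.125, 0.125]   # assumed source statistics (illustrative)
print([self_information(p) for p in probs])   # [1.0, 2.0, 3.0, 3.0]
print(entropy(probs))                         # 1.75 bits per symbol
```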
Source coding Theorem


Given a discrete memoryless source of entropy
H(X), the average codeword length L for
any distortionless source-encoding scheme is
bounded as
L >= H(X).
According to the theorem, the entropy H(X)
represents a fundamental limit on L. Therefore,
Lmin = H(X).
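For example, a source with symbol probabilities 1/2, 1/4, 1/8 and 1/8 has H(X) = 1.75 bits, so no distortionless code for this source can achieve an average codeword length below 1.75 bits per symbol.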
Huffman coding
Definition: A minimal variable-length character coding
based on the frequency of each character.
The Huffman coding algorithm proceeds as follows:

The source symbols are listed in order of decreasing
probability. The two source symbols of lowest probability
are assigned a 0 and a 1.
These two symbols are regarded as being combined into a
new source symbol with probability equal to the sum of
the two original probabilities.
The procedure is repeated until the final list of source
statistics contains only two symbols, to which a 0 and a 1
are assigned.
An example of the Huffman encoding algorithm is given below.
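This is a minimal Python sketch of the merge-the-two-least-probable procedure just described; the symbol probabilities are assumed purely for illustration and are not taken from the original figure.

```python
# Sketch: Huffman coding by repeatedly merging the two least-probable entries.
import heapq
from itertools import count

def huffman_code(probs):
    """Return a prefix code (symbol -> bit string) for a dict symbol -> probability."""
    tiebreak = count()   # unique counter so ties never compare the dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)   # lowest probability  -> prefix bit 0
        p1, _, code1 = heapq.heappop(heap)   # next lowest         -> prefix bit 1
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"A": 0.4, "B": 0.2, "C": 0.2, "D": 0.1, "E": 0.1}   # assumed source
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(avg_len)   # average length >= H(X), as the source coding theorem requires
```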
Shannon-Fano coding


Definition: A variable-length coding based on the
frequency of occurrence of each character. Divide the
characters into two sets with the frequency of each set
as close to half as possible, and assign the sets either 0
or 1 coding. Repeatedly divide the sets until each
character has a unique coding.
Note: Shannon-Fano is a minimal prefix code. Huffman coding is
optimal for character coding (one code word per character) and
simple to program. Arithmetic coding is better still, since it can
allocate fractional bits, but it is more complicated and covered by patents.
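A minimal Python sketch of this splitting procedure follows; the symbol set and probabilities are assumed for illustration, and the split point is chosen so the upper half's total probability is as close to one half as possible.

```python
# Sketch: Shannon-Fano coding by recursively splitting the symbol list
# into two sets of nearly equal total probability.
def shannon_fano(symbols, code=None):
    """symbols: list of (symbol, probability) sorted by decreasing probability.
    Returns a dict symbol -> bit string."""
    if code is None:
        code = {s: "" for s, _ in symbols}
    if len(symbols) <= 1:
        return code
    total = sum(p for _, p in symbols)
    split, best_diff = 1, float("inf")
    for i in range(1, len(symbols)):
        diff = abs(total / 2 - sum(p for _, p in symbols[:i]))
        if diff < best_diff:
            best_diff, split = diff, i
    for s, _ in symbols[:split]:     # first set gets a 0
        code[s] += "0"
    for s, _ in symbols[split:]:     # second set gets a 1
        code[s] += "1"
    shannon_fano(symbols[:split], code)
    shannon_fano(symbols[split:], code)
    return code

probs = [("A", 0.4), ("B", 0.2), ("C", 0.2), ("D", 0.1), ("E", 0.1)]  # assumed
print(shannon_fano(probs))
```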
Discrete memoryless channels
When the probabilities of selection of
successive events are independent, the
source is said to be a discrete memoryless source,
or zero-memory source. Similarly, a channel is discrete
memoryless when its output at any instant depends only
on the input at that instant and not on earlier inputs.
Channel capacity
The channel capacity of a discrete
memoryless channel is the maximum mutual
information I(X,Y) in any single use of the channel,
where the maximization is over all possible input
probability distributions {p(x_j)} on X.
C = max I(X,Y)
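As an illustration (a binary symmetric channel with an assumed crossover probability, not an example from the text), the Python sketch below evaluates I(X,Y) for several input distributions and shows that the maximum, i.e. the capacity C = 1 - H(eps), occurs at the uniform input.

```python
# Sketch: mutual information of a binary symmetric channel (BSC) and its capacity.
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(p_x0, eps):
    """I(X,Y) = H(Y) - H(Y|X) for input P(X=0)=p_x0 and crossover probability eps."""
    p_y0 = p_x0 * (1 - eps) + (1 - p_x0) * eps
    return h2(p_y0) - h2(eps)

eps = 0.1   # assumed crossover probability
print([bsc_mutual_information(p, eps) for p in (0.3, 0.5, 0.7)])  # largest at 0.5
print(1 - h2(eps))   # capacity C = 1 - H(0.1), about 0.531 bits per channel use
```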
Channel coding Theorem
Let a discrete memoryless source with an
alphabet X have entropy H(X) and produce symbols
every Ts seconds. Let a discrete memoryless channel
have capacity C and be used every Tc seconds. Then,
if
H(X)/Ts <= C/Tc,
there exists a coding scheme for which the source
information can be transmitted over the channel and
be reconstructed at the receiver with an arbitrarily small
probability of error.
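For example, with illustrative numbers: if H(X) = 2 bits and Ts = 1 ms, the source produces information at H(X)/Ts = 2000 bits/s; if C = 1 bit per channel use and Tc = 0.4 ms, the channel supports C/Tc = 2500 bits/s. Since 2000 <= 2500, the condition holds and reliable transmission is possible in principle.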
Channel capacity Theorem
Shannon's information capacity theorem states that
the channel capacity of a continuous channel of
bandwidth W Hz, perturbed by band-limited Gaussian
noise of power spectral density n0/2, is given by
Cc = W log2(1 + S/N) bits/s
where S is the average transmitted signal power and the
average noise power is
N = ∫_{-W}^{W} (n0/2) df = n0 W
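The following Python sketch evaluates this formula for assumed values of W, S and n0; the numbers are illustrative, not taken from the text.

```python
# Sketch: information capacity Cc = W * log2(1 + S/N), with N = n0 * W.
import math

def channel_capacity(W, S, n0):
    """Capacity in bits/s of a band-limited Gaussian channel."""
    N = n0 * W                       # average noise power over the band
    return W * math.log2(1 + S / N)

W = 3000.0    # bandwidth in Hz (assumed)
S = 1e-3      # average signal power in watts (assumed)
n0 = 1e-8     # noise power spectral density parameter (assumed)
print(channel_capacity(W, S, n0))   # about 1.5e4 bits/s for these values
```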