18 Information Theory
Information Theory
Eighteenth Meeting
A Communication Model
Messages are
produced by a source and
transmitted over a channel to the destination.
They are encoded for transmission via the communication channel, then
decoded at the destination.
Codes
Source symbols
A fixed set of source symbols forms the source alphabet.
Code words
Code words are the representation of the source symbols used for transmission, for example a sequence of binary digits.
The set of symbols from which code words are built is sometimes described as the code alphabet.
For a binary code, the code alphabet consists of only the symbols 0 and 1.
Radix
The radix is the number of symbols in the code alphabet:
a binary code has radix 2;
Morse code, which uses dot, dash and space, has radix 3.
Efficient code
An efficient code minimizes the number of binary digits needed.
Information and Probability
To quantify the efficiency of a code, the information in a message must be quantified.
The information conveyed by a message depends on the
probability of receiving that particular message.
The information, I, gained by receiving a message of probability P
is given by:
I = −log2(P)
(with logarithms to base 2, I is measured in bits)
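As a quick sketch of this formula (assuming base-2 logarithms, so that information is measured in bits), the following Python snippet computes the information gained from messages of various probabilities; the probability values are illustrative.

    import math

    def information_bits(p):
        """Information I = -log2(P) gained by receiving a message of probability P."""
        if not 0 < p <= 1:
            raise ValueError("probability must lie in (0, 1]")
        return -math.log2(p)

    # A certain message (P = 1) carries no information; unlikely messages carry more.
    for p in (1.0, 0.5, 0.25, 0.01):
        print(f"P = {p:<4}  ->  I = {information_bits(p):.2f} bits")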
Example
Consecutive video frames typically
have very similar content.
The information in two consecutive frames is considerably less than the sum of the information in each individual frame.
Frame 1: 000101010101010101010
Frame 2: 000101010101010101010
Information Entropy
Memoryless source
For two independent messages produced by the source, with probabilities P1 and P2, the information adds:
I = −log (P1) + (−log (P2)) = −log (P1P2)
Entropy of the source (H)
If the probabilities of all the possible source symbols are known,
the average information for the source can be found.
If the source can generate n different symbols and the ith symbol has probability Pi, then the entropy
H is given by:
H = −Σ Pi log2(Pi)   (summed over i = 1 to n)
H represents the average amount of information the source provides.
The source entropy H is the minimum achievable value for the average length L of the code words:
L ≥ H
The average code word length L is given by:
L = Σ Pi li, where li is the length of the code word for the ith symbol.
The efficiency E is given by:
E = H / L
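These definitions can be sketched in a few lines of Python; the symbol probabilities and code word lengths below are illustrative values, not taken from the lecture.

    import math

    def entropy(probs):
        """Source entropy H = -sum(Pi * log2(Pi)), in bits per symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def average_length(probs, lengths):
        """Average code word length L = sum(Pi * li), in binary digits per symbol."""
        return sum(p * l for p, l in zip(probs, lengths))

    # Illustrative source: four symbols coded with 1, 2, 3 and 3 binary digits.
    probs = [0.5, 0.25, 0.125, 0.125]
    lengths = [1, 2, 3, 3]

    H = entropy(probs)
    L = average_length(probs, lengths)
    print(f"H = {H:.3f} bits  L = {L:.3f} digits  E = H/L = {H/L:.3f}")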
Coding Tree
Variable-length codes must be carefully designed to be
uniquely and instantaneously decodable.
Example:
Four source symbols are represented by the binary sequences 0, 01, 011 and 111.
The message is: 00101101110
It can be decoded by working from the right to the left, giving:
0, 01, 011, 0, 111, 0
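A minimal sketch of this right-to-left decoding in Python is shown below, using the four code words from the example (the symbol names A to D are assumed for illustration). It works because the reversed code words 0, 10, 110 and 111 form a prefix code.

    # Code words from the example; the symbol names are illustrative.
    codewords = {"A": "0", "B": "01", "C": "011", "D": "111"}

    def decode_right_to_left(message, codewords):
        """Decode by matching reversed code words against the reversed message."""
        reversed_codes = {code[::-1]: code for code in codewords.values()}
        remaining = message[::-1]
        decoded = []
        while remaining:
            for rev, code in reversed_codes.items():
                if remaining.startswith(rev):
                    decoded.append(code)
                    remaining = remaining[len(rev):]
                    break
            else:
                raise ValueError("message cannot be decoded with this code")
        return list(reversed(decoded))

    print(decode_right_to_left("00101101110", codewords))
    # ['0', '01', '011', '0', '111', '0']  -- the sequence given above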
Huffman Code
Source symbols with the largest probabilities are systematically allocated the
shortest code words.
Original source: the symbols are ordered according to their probabilities.
First reduction: the two lowest probabilities (0.1 and 0.02) are combined to give 0.12.
Second reduction: probabilities 0.12 and 0.13 are combined to give 0.25.
If necessary, the resulting values are re-ordered to ensure that the values in a column
are in descending order.
This process is repeated until a probability of 1.0 is reached.
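A minimal sketch of this procedure in Python is given below, using heapq to repeatedly combine the two lowest probabilities. The probabilities 0.1, 0.02 and 0.13 come from the example above; the remaining values (0.4 and 0.35) are assumed so that the probabilities sum to 1.

    import heapq
    from itertools import count

    def huffman_code(probabilities):
        """Build a binary Huffman code by repeatedly combining the two lowest probabilities."""
        order = count()  # tie-breaker so equal probabilities never compare the dicts
        heap = [(p, next(order), {symbol: ""}) for symbol, p in probabilities.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, codes1 = heapq.heappop(heap)   # lowest probability
            p2, _, codes2 = heapq.heappop(heap)   # second lowest
            merged = {s: "0" + c for s, c in codes1.items()}        # one branch gets prefix 0
            merged.update({s: "1" + c for s, c in codes2.items()})  # the other gets prefix 1
            heapq.heappush(heap, (p1 + p2, next(order), merged))
        return heap[0][2]

    source = {"A": 0.4, "B": 0.35, "C": 0.13, "D": 0.1, "E": 0.02}
    for symbol, code in sorted(huffman_code(source).items()):
        print(symbol, code)

As expected, the most probable symbol receives the shortest code word and the least probable symbols receive the longest.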
Channel Coding
Assumptions
that the error rate is low,
that errors occur independently of each other, and
that there is a negligible probability of several errors
occurring together.
Rectangular Code
Hamming Code
Bit 1 = even parity for bits 3, 5 and 7
Bit 2 = even parity for bits 3, 6 and 7
Bit 4 = even parity for bits 5, 6 and 7
Example: the data word 1011 is placed in bit positions 3, 5, 6 and 7, and the parity bits in positions 1, 2 and 4 are computed from these rules to give a 7-bit code word.
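A small sketch of this encoding in Python is shown below, following the parity rules above. The placement of the data bits 1011 into positions 3, 5, 6 and 7 in that order is an assumption, since the original table did not survive extraction cleanly.

    def hamming74_encode(data):
        """Encode 4 data bits into a 7-bit Hamming code word (bit positions 1 to 7)."""
        assert len(data) == 4 and set(data) <= {0, 1}
        word = [0] * 8                               # index 0 unused; positions 1..7
        word[3], word[5], word[6], word[7] = data    # assumed data-bit placement
        word[1] = word[3] ^ word[5] ^ word[7]        # even parity over positions 3, 5, 7
        word[2] = word[3] ^ word[6] ^ word[7]        # even parity over positions 3, 6, 7
        word[4] = word[5] ^ word[6] ^ word[7]        # even parity over positions 5, 6, 7
        return word[1:]

    def hamming74_syndrome(word):
        """Return the position of a single-bit error (0 means no error detected)."""
        w = [0] + list(word)
        s1 = w[1] ^ w[3] ^ w[5] ^ w[7]
        s2 = w[2] ^ w[3] ^ w[6] ^ w[7]
        s4 = w[4] ^ w[5] ^ w[6] ^ w[7]
        return s1 + 2 * s2 + 4 * s4

    code = hamming74_encode([1, 0, 1, 1])    # the data word 1011 from the example
    print("code word:", "".join(map(str, code)))
    corrupted = code.copy()
    corrupted[4] ^= 1                         # flip bit position 5 (list index 4)
    print("error detected at position:", hamming74_syndrome(corrupted))

Re-computing the three parity checks on the received word gives a syndrome that points directly at the corrupted bit position, which is why a single error can be corrected.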