18 Information Theory


Information Theory
Eighteenth Meeting
A Communication Model

Messages are:
- produced by a source,
- transmitted over a channel to the destination,
- encoded for transmission via the communication channel, then
- decoded at the destination.
Codes


Source symbols
- A fixed set of source symbols (the source alphabet).
- Code words are the representation of the source symbols for transmission.
- Each code word is a sequence of binary digits, drawn from what is sometimes described as the code alphabet.
- For a binary code, the code alphabet consists of only the symbols 0 and 1.

Radix
- The number of symbols in the code alphabet.
- A binary code has radix 2.
- Morse code, which uses dot, dash and space, has radix 3.

Efficient code
- Minimizes the number of binary digits needed.
Information and Probability



- The efficiency of a code can be quantified.
- The information conveyed by a message depends on the probability of receiving that particular message.
- The information, I, gained by receiving a message of probability P is given by:

  I = -log2(P)   (base-2 logarithm, so I is measured in bits)
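
A quick numerical check (a minimal sketch added here, not part of the original slides):

import math

def information(p):
    """Information, in bits, gained by receiving a message of probability p."""
    return -math.log2(p)

print(information(0.5))   # 1.0 bit    -- a fair coin flip
print(information(0.25))  # 2.0 bits   -- one of four equally likely messages
print(information(0.9))   # ~0.15 bits -- an expected message carries little information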
Example


- Consecutive video frames typically have very similar content.
- The information in two consecutive frames is considerably less than the sum of the information in each individual frame.
000101010101010101010
000101010101010101010
Information Entropy

Memoryless source
- For two independent messages produced by the source, the information adds:

  I = -log(P1) - log(P2) = -log(P1 P2)

Entropy of the source (H)
- If the probabilities of all the possible source symbols are known, the average information for the source can be found.
- If the source can generate n different symbols and the ith symbol has probability Pi, then the entropy H is given by:

  H = -Σ Pi log2(Pi)   (summed over i = 1 to n)

- H represents the average amount of information the source provides per symbol.
- The source entropy H is the minimum achievable value for the average length L of the code words:

  L ≥ H

- The average code word length L is given by:

  L = Σ Pi li   (where li is the length of the code word for the ith symbol)

- The efficiency E is:

  E = H / L
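
A small worked sketch (my own example; the probabilities and code word lengths below are assumed, not taken from the slides):

import math

probs   = [0.5, 0.25, 0.125, 0.125]   # assumed symbol probabilities
lengths = [1, 2, 3, 3]                # assumed code word lengths

H = -sum(p * math.log2(p) for p in probs)       # source entropy, bits per symbol
L = sum(p * l for p, l in zip(probs, lengths))  # average code word length
E = H / L                                       # coding efficiency

print(H, L, E)   # 1.75 1.75 1.0 -- this code meets the L >= H bound with equality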
Coding Tree

Variable-length codes
- Must be carefully designed to be uniquely and instantaneously decodable.

Example:
- Four source symbols are represented by the binary sequences 0, 01, 011 and 111.
- The message is: 00101101110
- It can be decoded by working from the right to the left:

  0, 01, 011, 0, 111, 0
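
A minimal decoding sketch (my own illustration of the right-to-left procedure, not from the slides):

CODE_WORDS = ["0", "01", "011", "111"]   # the four code words of the example

def decode_right_to_left(message):
    """Peel code words off the right-hand end of the message."""
    decoded = []
    while message:
        for word in CODE_WORDS:
            if message.endswith(word):
                decoded.append(word)
                message = message[:-len(word)]
                break
        else:
            raise ValueError("message cannot be decoded with this code")
    return decoded[::-1]   # reverse so the code words appear in message order

print(decode_right_to_left("00101101110"))
# ['0', '01', '011', '0', '111', '0']

Because none of the four code words is a suffix of another, exactly one code word can match at each step, which is why the right-to-left pass is unambiguous.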
Huffman Code




- Source symbols with the largest probabilities are allocated systematically to the shortest code words.
- Original source: the symbols are ordered according to their probabilities.
- First reduction: the two lowest probabilities (0.1 and 0.02) are combined to give 0.12.
- Second reduction: probabilities 0.12 and 0.13 are combined to give 0.25.
- If necessary, the resulting values are re-ordered to ensure that the values in a column are in descending order.
- This process is repeated until a probability of 1.0 is reached.
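
A compact construction sketch (my own illustration; the five-symbol distribution below is assumed, chosen only so that the first two reductions give 0.12 and 0.25 as above):

import heapq
from itertools import count

def huffman(probabilities):
    """Build a Huffman code by repeatedly merging the two least probable entries."""
    tiebreak = count()   # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # lowest probability
        p2, _, group2 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

source = {"A": 0.5, "B": 0.25, "C": 0.13, "D": 0.1, "E": 0.02}   # assumed example
print(huffman(source))
# {'A': '0', 'B': '10', 'C': '111', 'D': '1101', 'E': '1100'}

The merging in the loop mirrors the column-by-column reductions described above, and the higher-probability symbols end up with the shorter code words.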
Channel Coding

Assumptions
- The error rate is low,
- errors occur independently of each other, and
- there is a negligible probability of several errors occurring together.
Rectangular Code
Hamming Code



Parity rules (even parity):
- Bit 1 = even parity over bits 3, 5, 7
- Bit 2 = even parity over bits 3, 6, 7
- Bit 4 = even parity over bits 5, 6, 7

Example: the data word 1011 occupies positions 3, 5, 6, 7, and the parity bits occupy positions 1, 2, 4:

Position: 1 2 3 4 5 6 7
Bit:      0 1 1 0 0 1 1
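
A short encoding sketch (my own illustration, not from the slides) that computes the parity bits for the data word 1011 using the rules above:

def hamming74_encode(data):
    """Place 4 data bits at positions 3, 5, 6, 7 and fill even-parity bits 1, 2, 4."""
    bits = [0] * 8                            # index 0 unused so indices match positions
    bits[3], bits[5], bits[6], bits[7] = data
    bits[1] = bits[3] ^ bits[5] ^ bits[7]     # even parity over 3, 5, 7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]     # even parity over 3, 6, 7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]     # even parity over 5, 6, 7
    return bits[1:]

print(hamming74_encode([1, 0, 1, 1]))   # [0, 1, 1, 0, 0, 1, 1]

On reception, recomputing the three checks gives a syndrome which, read as a binary number, is the position of a single-bit error (or zero if no error is detected).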