CSE 326 Huffman coding

Download Report

Transcript CSE 326 Huffman coding

RAIK 283
Data Structures & Algorithms
Huffman Coding
Dr. Ying Lu
[email protected]
RAIK 283
Data Structures & Algorithms

Giving credit where credit is due:


Most of slides for this lecture are based
on slides created by Dr. Richard
Anderson, University of Washington.
I have modified them and added new
slides
Coding theory



ASCII coding
Conversion,
Encryption,
Compression
Binary coding
A
B
C
D
E
F
For fixed-length binary coding of a 6-character
alphabet, how many bits are needed?
Coding theory (cont.)



ASCII coding
Conversion,
Encryption,
Compression
Binary coding
A
B
C
D
E
F
000
001
010
011
100
101
Coding theory (cont.)




ASCII coding
Conversion,
Encryption,
Compression
Binary coding
Variable length
coding
Probability
A
B
C
D
E
F
000
001
010
011
100
101
00
010
011
100
11
101
0.3
0.1
0.1
0.1
0.3
0.1
Average bits/character = ?
Compression Ratio = ?
Decode the following
E
0
E 0
T
11
T 10
N
100
N 100
I
1010
I
S
1011
S 1010
11010010010101011
0111
100100101010
Prefix(-free) codes


No prefix of a
A
codeword is a
B
codeword
C
Uniquely decodable
D
E
F
00
010
011
100
11
101
1
01
001
0001
00001
000001
00
10
11
0001
11000
101
Prefix codes and binary trees

Tree representation of
prefix codes
A
B
C
D
E
F
00
010
0110
0111
10
11
0
0
1
1
A
0
B
1
0
C
1
0
F
E
1
D
Minimum length code
How to code so that average bits/character is minimized?
Probability
A
B
C
D
E
1/4
1/8
1/16
1/16
1/2
Minimum length code (cont.)


Huffman tree – prefix codes tree with
minimum weighted path length
C(T) – weighted path length
Huffman code algorithm

Derivation



Two rarest items will have the longest codewords
Codewords for rarest items differ only in the last
bit
Idea: suppose the weights are
with
and
the smallest weights


Start with an optimal code for
and
Extend the codeword for
codewords for
and
to get
Huffman code
H = new minHeap()
for each wi
T = new Tree(wi)
H.Insert(T)
while H.Size() > 1
T1 = H.DeleteMin()
T2 = H.DeleteMin()
T3 = Merge(T1, T2)
H.Insert(T3)
Example
character
probability
A
0.35
B
0.1
C
0.2
D
0.2
_
0.15
In-class exercises

P332 Exercises 9.4.1
In-class exercises


9.4.3 What is the maximal length of a
codeword possible in a Huffman encoding of
an alphabet of n characters?
9.4.5 Show that a Huffman tree can be
constructed in linear time if the alphabet’s
characters are given in a sorted order of
their frequencies.