Binary Image Compression
The art of modeling image source
– Image pixels are NOT independent events
Run length coding of binary and graphic
images
– Applications: BMP, TIF/TIFF
Lempel-Ziv coding*
– How can the idea of building a dictionary
achieve data compression?
From Theory to Practice
So far, we have discussed the problem of data
compression under the assumption that the source
distribution is known (i.e., the probabilities of all
possible events are given).
In practice, the source distribution (probability
model) is unknown and, due to the inherent
dependency among source data, can only be
approximated by the relative-frequency counting
method.
An Image Example
Binary image of size 100×100
(approximately 1000 dark pixels)
A Bad Model
Each pixel in the image is observed as an
independent event
All pixels can be characterized by a single
discrete random variable – binary Bernoulli
source
We can estimate the probabilities by relative
frequency counting
Synthesized Image by the Bad Model
Binary Bernoulli distribution
(P(X=black)=0.1)
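A minimal sketch of this bad model in Python (function names are illustrative): estimate P(X=black) by relative-frequency counting, then synthesize an image by drawing every pixel i.i.d. from the fitted Bernoulli distribution.

import random

def estimate_p_black(pixels):
    # Relative-frequency estimate of P(X = black);
    # pixels is a flat list of 0 (white) / 1 (black) values.
    return sum(pixels) / len(pixels)

def synthesize_bernoulli(width, height, p_black):
    # Draw every pixel i.i.d. from the fitted Bernoulli model.
    return [[1 if random.random() < p_black else 0
             for _ in range(width)]
            for _ in range(height)]

# A 100x100 image with roughly 1000 dark pixels gives p_black of about 0.1,
# but the synthesized image is salt-and-pepper noise, not the original shapes.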
Why Does It Fail?
Roughly speaking
– Pixels in an image are not independent events (the source
is not memoryless but spatially correlated)
– Pixels in an image do not obey the same probability
model (the source is not stationary but spatially varying)
Fundamentally speaking
– Pixels are the projection of meaningful objects in the
real world (e.g., characters, lady, flowers, cameraman,
etc.)
– Our sensory processing system has learned to
understand meaningful patterns in sensory input
through evolution
Similar Examples in 1D
Scenario I: think of a paragraph of English text.
The letters are NOT independent, due to
semantic structure. For example, the probability
of seeing a “u” is typically small; however, once a
“q” appears, the probability that the next letter
is a “u” is large.
Scenario II: think of a piece of human speech. It
consists of silent and voiced segments. The silent
segments can be modeled by white Gaussian
noise, while the voiced segments (e.g., pitches)
cannot.
Data Compression in Practice
discrete source X → source modeling → Y → entropy coding → binary bit stream
(P(Y) is supplied to the entropy coder by probability estimation)
Probabilities can be estimated by counting relative
frequencies either online or offline
The art of data compression is the art of source modeling
Source Modeling Techniques
Transformation
– transform the source into an equivalent yet
more convenient representation
Prediction
– Predict the future based on the causal past
Pattern matching
– Identify and represent repeated patterns
Non-image Examples
Transformation in audio coding (MP3)
– Audio samples are transformed into frequency
domain
Prediction in speech coding (CELP)
– Human vocal tract can be approximated by an
auto-regressive model
Pattern matching in text compression
– Search repeated patterns in texts
How to Build a Good Model
Study the source characteristics
– The origin of data: computer-generated,
recorded, scanned …
– It is about linguistics, physics, material science
and so on …
Choose or invent the appropriate tool
(modeling technique)
Q: why is pattern matching suitable for text, but
not for speech or audio?
Image Modeling
Binary images
– Scanned documents (e.g., FAX) or computer-generated
images (e.g., line drawings in Word)
Graphic images
– Windows icons, web banners, cartoons
Photographic images
– Acquired by digital cameras
Others: fingerprints, CT images,
astronomical images …
Lossless Image Compression
No information loss – i.e., the decoded
image is mathematically identical to the
original image
– For some sensitive data such as document or
medical images, information loss is simply
unacceptable
– For others such as photographic images, we
only care about the subjective quality of
decoded images (not the fidelity to the original)
Binary Image Compression
The art of modeling image source
– Image pixels are NOT independent events
Run length coding of binary and graphic
images
– Applications: BMP, TIF/TIFF
Lempel-Ziv coding*
– How can the idea of building a dictionary
achieve data compression?
Run Length Coding (RLC)
What is run length?
A run length is defined as the number of
consecutive identical symbols
Examples
coin flip:    HHHHH T HHHHHHH      → run-lengths 5, 1, 7
random walk:  SSSS EEEE NNNN WWWW  → run-lengths 4, 4, 4, 4
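Run-length counting is a one-liner with itertools.groupby; a small sketch (the helper name is illustrative):

from itertools import groupby

def run_lengths(seq):
    # Lengths of the maximal runs of consecutive identical symbols.
    return [len(list(g)) for _, g in groupby(seq)]

print(run_lengths("HHHHHTHHHHHHH"))     # [5, 1, 7]
print(run_lengths("SSSSEEEENNNNWWWW"))  # [4, 4, 4, 4]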
Run Length Coding (Cont’d)
discrete source X → transformation by run-length counting → Y → entropy coding → binary bit stream
(P(Y) is supplied to the entropy coder by probability estimation)
Y is the sequence of run-lengths from which X
can be recovered losslessly
RLC of 1D Binary Source
X: 0000 111 000000 11111 00
Y: 4    3   6      5     2

Y → Huffman coding → compressed data
(one extra bit is needed to denote the starting symbol)
Properties
- “0” run-lengths and “1” run-lengths alternate
- run-lengths are positive integers
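A runnable sketch of this scheme (names are illustrative; in a real codec the run-lengths would then be Huffman-coded):

from itertools import groupby

def rlc_encode(bits):
    # Returns (starting symbol, alternating run-lengths).
    runs = [len(list(g)) for _, g in groupby(bits)]
    return bits[0], runs

def rlc_decode(start, runs):
    # Rebuild the bit string losslessly from the starting symbol and the runs.
    out, sym = [], start
    for r in runs:
        out.append(sym * r)
        sym = "1" if sym == "0" else "0"
    return "".join(out)

x = "00001110000001111100"
print(rlc_encode(x))                    # ('0', [4, 3, 6, 5, 2])
print(rlc_decode(*rlc_encode(x)) == x)  # True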
Variation of 1D Binary RLC
When P(x=0) is close to 1, we can record
run-length of dominant symbol (“0”) only
Example
00000100000001000011000000001…
“0” run-lengths: 5, 7, 4, 0, 8, …
Properties
- all coded symbols are “0” run-lengths
- run-length is a nonnegative integer
Modeling Run-length
Geometric source: P(X=k) = (1/2)^k, k = 1, 2, …

Run-length k:  1    2    3    4     5     …
Probability:   1/2  1/4  1/8  1/16  1/32  …
Golomb Codes
Optimal VLC for the geometric source: P(X=k) = (1/2)^k, k = 1, 2, …

k:         1   2    3     4      5       6        7         8          …
codeword:  0   10   110   1110   11110   111110   1111110   11111110   …
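This table is the unary member of the Golomb family: k is written as k-1 ones followed by a terminating zero. A small sketch (function names are illustrative):

def unary_encode(k):
    # Codeword for k >= 1: (k-1) ones followed by a zero.
    return "1" * (k - 1) + "0"

def unary_decode(bitstream):
    # Split a concatenation of codewords back into integers.
    values, count = [], 0
    for b in bitstream:
        if b == "1":
            count += 1
        else:                     # the terminating zero ends one codeword
            values.append(count + 1)
            count = 0
    return values

code = "".join(unary_encode(k) for k in [4, 3, 6, 5, 2])
print(code)                # 11101101111101111010
print(unary_decode(code))  # [4, 3, 6, 5, 2]

For this geometric source the expected codeword length, the sum over k of (1/2)^k * k = 2 bits, equals the source entropy, which is why the code is optimal.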
From 1D to 2D
(figure: alternating white and black run-lengths along successive scan lines)
Question: what distinguishes 2D from 1D coding?
Answer: inter-line dependency of run-lengths
Relative Address Coding (RAC)*
previous line: 00000111111100000000000   (“1” run of length 7)
current line:  00000011111111000001100
d1 = 1, d2 = -2, NS, run = 2   (NS – New Start)
A variation of RAC was adopted by CCITT for fax transmission
Image Example
CCITT test image No. 1
Size: 1728×2376
Raw data (1 bit per pixel)
filesize of ccitt1.pbm: 513216 bytes
filesize of ccitt1.tif: 37588 bytes
Compression Ratio=13.65
Graphic Images (Cartoons)
A total of 12 different colors (black, white, red, green, blue, yellow, …)
Observations:
- dominant background color (e.g., white)
- objects contain only a few other colors
Palette-based Image Representation
index:  0      1      2    3      4     5       …
color:  white  black  red  green  blue  yellow  …

Any 24-bit (R,G,B) color can be represented by its index in the palette.
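A minimal sketch of the idea (palette contents are illustrative): with 12 colors, each pixel needs only ceil(log2 12) = 4 bits of palette index instead of 24 bits of RGB.

import math

PALETTE = ["white", "black", "red", "green", "blue", "yellow"]  # index -> color

def to_indexes(colors):
    # Replace each color by its palette index.
    return [PALETTE.index(c) for c in colors]

bits_per_pixel = math.ceil(math.log2(12))                     # 4 bits instead of 24
print(to_indexes(["white", "green", "red"]), bits_per_pixel)  # [0, 3, 2] 4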
RLC of 1D M-ary Source
Basic idea
Record not only run-lengths but also color indexes
Example
color sequence: WWWWW GGG WWWWWWW RRRR B WWWWWWWW …
index sequence: 00000 333 0000000 2222 1 00000000 …
run-lengths:    5     3   7       4    1 8
color indices:  0     3   0       2    1 0

(color, run) representation: (0,5) (3,3) (0,7) (2,4) (1,1) (0,8) …
Note: run-length is a positive integer
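A sketch of (color, run) coding over palette indexes (names are illustrative):

from itertools import groupby

def color_run_encode(indexes):
    # One (color, run) pair per maximal run of identical palette indexes.
    return [(c, len(list(g))) for c, g in groupby(indexes)]

def color_run_decode(pairs):
    return [c for c, run in pairs for _ in range(run)]

seq = [0]*5 + [3]*3 + [0]*7 + [2]*4 + [1] + [0]*8
print(color_run_encode(seq))  # [(0, 5), (3, 3), (0, 7), (2, 4), (1, 1), (0, 8)]
print(color_run_decode(color_run_encode(seq)) == seq)  # True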
Variation of 1D M-ary RLC
When P(x=0) is close to 1, we can record
run-length of dominant symbol (“0”) only
Example
00000100000004000032000000001…
runs:   5 7 4 0 8
levels: 1 4 3 2 1

(run, level) representation: (5,1) (7,4) (4,3) (0,2) (8,1) …
Properties
- “0” run-lengths only
- run-length is a nonnegative integer
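A sketch of the (run, level) variation (names are illustrative): each pair records how many dominant symbols precede the next non-dominant symbol, and what that symbol is.

def run_level_encode(symbols, dominant=0):
    # (run, level): count of dominant symbols before each non-dominant one.
    pairs, run = [], 0
    for s in symbols:
        if s == dominant:
            run += 1
        else:
            pairs.append((run, s))
            run = 0
    return pairs, run    # also return any trailing run of the dominant symbol

seq = [int(c) for c in "00000100000004000032000000001"]
print(run_level_encode(seq)[0])  # [(5, 1), (7, 4), (4, 3), (0, 2), (8, 1)]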
Image Example
Raw data: party8.ppm, size 526×286, 451308 bytes
Compressed: party8.bmp, 17094 bytes
Compression ratio=26.4
Binary Image Compression
The art of modeling image source
– Image pixels are NOT independent events
Run length coding of binary and graphic
images
– Applications: BMP, TIF/TIFF
Lempel-Ziv coding*
– How can the idea of building a dictionary
achieve data compression?
History of Lempel-Ziv Coding
Invented by Lempel and Ziv in 1977
Numerous variations and improvements
since then
Widely used in different applications
– Unix system: compress command
– WinZip software (LZW algorithm)
– TIF/TIFF image format
– Dial-up modems (to speed up transmission)
Dictionary-based Coding
Use a dictionary
– Think about the evolution of an English
dictionary
• It is structured - if any random combination of letters
formed a word, the dictionary would not exist
• It is dynamic - more and more words are put into the dictionary
as time moves on
– Data compression is similar in the sense that
redundancy reveals itself as repeated patterns, just like
English words in a dictionary
Toy Example
I took a walk in town one day
And met a cat along the way.
What do you think that cat did say?
Meow, Meow, Meow
I took a walk in town one day
And met a pig along the way.
What do you think that pig did say?
Oink, Oink, Oink
I took a walk in town one day
And met a cow along the way.
What do you think that cow did say?
Moo, Moo, Moo
entry  pattern
1      I took a walk in town one day
2      And met a
3      along the way
4      What do you think that
5      did say?
6      cat
7      meow
…      ……
- from “Wee Sing for Baby”
Basic Ideas
Build up the dictionary on-the-fly (based on
causal past such that decoder can duplicate
the process)
Achieve the goal of compression by
replacing a repeated string by a reference to
an earlier occurrence
Unlike VLC (fixed-to-variable), LZ parsing
goes the other way (variable-to-fixed)
Lempel-Ziv Parsing
Initialization:
– D = {all single-symbol strings}
– L is the smallest integer such that all strings whose
length is smaller than L have appeared in D (L=1 at the
beginning)
Iterations: w ← next parsed block of L symbols
– Rule A: If w ∈ D, then represent w by its entry in D and
update D by adding a new entry: w concatenated with its
next input symbol
– Rule B: If w ∉ D, then represent the first L-1 symbols of
w by their entry in D and update D by adding w as a new
entry
Example of Parsing a Binary Stream
input: 0 1 1 0 0 1 1 1 0 1 1 …
w:     0, 11, 10, 00, 01, 110, 011, …
rule:  A   B   B   B   A   B    A
output: 1, 2, 2, 1, 3, 4, 7, … (entries in D; w is variable-length, the output is fixed-length)

Dictionary
entry  pattern
1      0      (initial, L=1)
2      1      (initial)
3      01     (added in step 1; L→2)
4      11     (step 2)
5      10     (step 3)
6      00     (step 4)
7      011    (step 5; L→3)
8      110    (step 6)

Illustration:
Step 1: w=0,   Rule A, output 1, add 01 to D, L=2
Step 2: w=11,  Rule B, output 2, add 11 to D
Step 3: w=10,  Rule B, output 2, add 10 to D
Step 4: w=00,  Rule B, output 1, add 00 to D
Step 5: w=01,  Rule A, output 3, add 011 to D, L=3
Step 6: w=110, Rule B, output 4, add 110 to D
Step 7: w=011, Rule A, output 7
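A runnable sketch of this variable-to-fixed parser, assuming a binary alphabet and, as in the trace above, growing L only after a Rule A step (names are illustrative):

from itertools import product

def lz_parse(s):
    # Returns the 1-based dictionary entries emitted while parsing s.
    # (A sketch: end-of-stream blocks shorter than L are handled only
    # if they already appear in D, as in the example above.)
    D = {"0": 1, "1": 2}    # initial dictionary: all single symbols
    L = 1                   # current parsed-block length
    out, i = [], 0
    while i < len(s):
        w = s[i:i + L]
        if w in D:                        # Rule A
            out.append(D[w])
            if i + L < len(s):            # new entry: w + next input symbol
                D[w + s[i + L]] = len(D) + 1
            i += L
            # grow L while every string shorter than L+1 appears in D
            while all("".join(p) in D for p in product("01", repeat=L)):
                L += 1
        else:                             # Rule B
            out.append(D[w[:L - 1]])      # entry of the first L-1 symbols
            D[w] = len(D) + 1             # add all L symbols of w as a new entry
            i += L - 1
    return out

print(lz_parse("01100111011"))  # [1, 2, 2, 1, 3, 4, 7]

Each output index would be written with a fixed number of bits (enough to address the current dictionary), the variable-to-fixed counterpart of Huffman's fixed-to-variable coding.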
Binary Image Compression Summary
Theoretical aspect
– Shannon’s source entropy formula tells us the lower bound for
coding a memoryless discrete source
– To achieve the source entropy, we need to use variable-length
codes (i.e., long codewords assigned to small-probability events
and vice versa)
– Huffman’s algorithm (generating the optimal prefix codes) offers a
systematic solution
Practical aspect
– Data in the real world contains memory (dependency, correlation
…)
– We don’t know the probability distribution function of the source
– Tricky point: the goal of image compression (processing) is not to
make your algorithm work for one image, but for a class of images
The Art of Source Modeling
How do I know whether a model matches the data or not?
(figure: observation data compared against the hypothesis model,
a Bernoulli distribution with P(X=black)=0.1)
A Matched Example
(figure: a matched model and the image it synthesizes)
Good Models for Binary Images
discrete source X → transformation (run-length counting or Lempel-Ziv parsing) → Y → Huffman coding → binary bit stream
(P(Y) is supplied to the Huffman coder by probability estimation)

Y is the sequence of run-lengths (or dictionary entries) from which X
can be recovered losslessly
Remaining Questions
How to choose different compression techniques, say RLC
vs. LZC?
– Look at the source statistics
– For instance, compare binary images (good for RLC) vs. English
text (good for LZC)
What is the best binary image compression technique?
– It is called JBIG2 – an enhanced version of JBIG
What is the entropy of binary image source?
– We don’t know, and the question itself is ill-posed
– A more relevant question would be: what is the entropy for a
probabilistic model which we believe generates the source?