8) Image Compression


Based on ENEE631 Spring’04
Section 8
Basics on Image Compression
Spring ’04 Instructor: Min Wu
ECE Department, Univ. of Maryland, College Park
 www.ajconline.umd.edu (select ENEE631 S’04)
 [email protected]
Image Compression
Part-1. Basics
Why Need Compression?

• Savings in storage and transmission
  – multimedia data (esp. image and video) have large data volume
  – difficult to send real-time uncompressed video over current networks

• Accommodate relatively slow storage devices
  – they do not allow playing back uncompressed multimedia data in real time
  – e.g., 1x CD-ROM transfer rate ~ 150 kB/s, while 320 x 240 color video at 24 fps has a bit rate of ~ 5.5 MB/s
    => 36 seconds needed to transfer 1 second of uncompressed video from CD
From Ken Lam’s DCT talk 2001 (HK Polytech)
Example: Storing An Encyclopedia
– 500,000 pages of text (2 kB/page) ~ 1 GB => 2:1 compression
– 3,000 color pictures (640×480×24 bits each) ~ 3 GB => 15:1
– 500 maps (640×480×16 bits = 0.6 MB/map) ~ 0.3 GB => 10:1
– 60 minutes of stereo sound (176 kB/s) ~ 0.6 GB => 6:1
– 30 animations, on average 2 minutes long (640×320×16 bits × 16 frames/s = 6.5 MB/s) ~ 23.4 GB => 50:1
– 50 digitized movies, on average 1 minute long (640×480×24 bits × 30 frames/s = 27.6 MB/s) ~ 82.8 GB => 50:1
• Requires a total of 111.1 GB of storage without compression
• Reduces to 2.96 GB with compression
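The raw-storage totals above follow directly from the listed media parameters; a quick arithmetic check (an illustrative script, not part of the original slides):

```python
# Quick check of the raw (uncompressed) storage figures listed above.
GB = 1000**3  # decimal GB, matching the slide's round numbers

text   = 500_000 * 2_000                           # 2 kB per page
photos = 3_000 * (640 * 480 * 24 // 8)             # 24-bit color pictures
maps   = 500 * (640 * 480 * 16 // 8)               # 16-bit maps
audio  = 60 * 60 * 176_000                         # 176 kB/s stereo sound
anim   = 30 * 2 * 60 * (640 * 320 * 16 // 8) * 16  # 16 fps animations, 2 min each
movies = 50 * 60 * (640 * 480 * 24 // 8) * 30      # 30 fps movies, 1 min each

total = text + photos + maps + audio + anim + movies
print(f"total ~ {total / GB:.1f} GB")              # ~ 111 GB, consistent with the slide's 111.1 GB
```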
PCM coding
• How to encode a digital image into bits?
  – Sample and perform uniform quantization
  – "Pulse Coded Modulation" (PCM)
    8 bits per pixel ~ good for grayscale image/video
    10-12 bpp ~ needed for medical images
• Reduce # of bpp for reasonable quality via quantization
  – Quantization reduces the # of possible levels to encode
  – Visual quantization: dithering, companding, etc.
    Halftoning uses 1 bpp but usually requires upsampling ~ saves less than 2:1
• Encoder-Decoder pair => "codec"

[Block diagram: input image I(x,y) from an image capturing device → Sampler → Quantizer → Encoder → transmit]
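A minimal sketch of PCM with B-bit uniform quantization (function and variable names are mine; the 8-bit input range is an assumption):

```python
import numpy as np

def pcm_quantize(image, bits=8, vmax=255.0):
    """Uniformly quantize pixel values in [0, vmax] to 2**bits levels.
    Returns the level indices (what would be encoded) and the
    mid-point reconstruction values."""
    levels = 2 ** bits
    step = vmax / levels                                  # width of each quantization cell
    idx = np.clip(np.floor(image / step), 0, levels - 1).astype(int)
    recon = (idx + 0.5) * step
    return idx, recon

# Example: re-quantize an 8-bit grayscale image to 4 bpp (16 levels)
img = np.random.randint(0, 256, size=(4, 4)).astype(float)
indices, recon = pcm_quantize(img, bits=4)
```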
Discussions on Improving PCM
• Quantized PCM values may not be equally likely
  – Can we do better than encoding each value with the same # of bits?
• Example
  – P("0") = 0.5, P("1") = 0.25, P("2") = 0.125, P("3") = 0.125
  – If using the same # of bits for all values:
    need 2 bits to represent the four possibilities
  – If using fewer bits for the likely value "0" ~ Variable Length Codes (VLC):
    "0" => [0], "1" => [10], "2" => [110], "3" => [111]
    average length Σ_i p_i l_i = 1.75 bits ~ saves 0.25 bpp!
• Bring probability into the picture
  – Use the probability distribution to reduce the average # of bits per quantized sample
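A one-line check of the average codeword length in this example:

```python
# Average codeword length of the VLC example above: sum over i of p_i * l_i
probs   = [0.5, 0.25, 0.125, 0.125]    # P("0"), P("1"), P("2"), P("3")
lengths = [1, 2, 3, 3]                 # |0|, |10|, |110|, |111|
avg_len = sum(p * l for p, l in zip(probs, lengths))
print(avg_len)                         # 1.75 bits/sample, vs. 2 bits with fixed-length coding
```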
Entropy Coding

• Idea: use fewer bits for commonly seen values
• At least how many bits are needed?
  – Limit of compression => "Entropy"
    Measures the uncertainty or avg. amount of information of a source
    Definition: H = Σ_i p_i log2(1/p_i) bits
    e.g., the entropy of the previous example is 1.75 bits
    Can't represent a source perfectly with fewer than H bits per sample on average
    Can represent a source perfectly with H+ε bits per sample on average
    (Shannon Lossless Coding Theorem)
  – "Compressibility" depends on the source
• Important to design a codebook so that the coded stream can be decoded efficiently and without ambiguity
  – See info. theory course (EE721) for more theoretical details
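A small sketch of the entropy formula applied to the earlier four-symbol source (the helper name is illustrative):

```python
import math

def entropy(probs):
    """H = sum_i p_i * log2(1 / p_i), in bits per sample."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75, so the earlier VLC already reaches the entropy limit
```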
E.g. of Entropy Coding: Huffman Coding
• Variable length code
  – assigns about log2(1/pi) bits to the i-th value
  – has to use an integer # of bits per symbol
• Step-1
  – Arrange the pi in decreasing order and consider them as tree leaves
• Step-2
  – Merge the two nodes with the smallest probabilities into a new node and sum up their probabilities
  – Arbitrarily assign 1 and 0 to each pair of merging branches
• Step-3
  – Repeat until no more than one node is left
  – Read out each codeword sequentially from root to leaf
Huffman Coding (cont’d)
[Huffman tree example: the two least-probable nodes are merged repeatedly (intermediate node probabilities 0.125, 0.125, 0.25, 0.29, 0.46, 0.54, 1.0); each codeword is traced from root to leaf]

Symbol   Prob.    PCM   Huffman
S0       0.25     000   00
S1       0.21     001   10
S2       0.15     010   010
S3       0.14     011   011
S4       0.0625   100   1100
S5       0.0625   101   1101
S6       0.0625   110   1110
S7       0.0625   111   1111
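A compact sketch of the three steps using a priority queue, a common way to implement the merging (names and tie-breaking are illustrative, so the exact bit patterns may differ from the table above while the codeword lengths agree):

```python
import heapq

def huffman_code(probs):
    """Build a Huffman codebook {symbol: bitstring} from {symbol: probability}."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)      # the two nodes with smallest probabilities
        p1, _, c1 = heapq.heappop(heap)
        for s in c0: c0[s] = "0" + c0[s]     # prepend the branch bits
        for s in c1: c1[s] = "1" + c1[s]
        heapq.heappush(heap, (p0 + p1, count, {**c0, **c1}))
        count += 1
    return heap[0][2]

code = huffman_code({"S0": 0.25, "S1": 0.21, "S2": 0.15, "S3": 0.14,
                     "S4": 0.0625, "S5": 0.0625, "S6": 0.0625, "S7": 0.0625})
# Codeword lengths match the table above (2,2,3,3,4,4,4,4); avg = 2.79 bits/symbol vs. 3 bits for PCM
```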
Huffman Coding: Pros & Cons
• Pro
  – Simplicity in implementation (table lookup)
  – For a given symbol size, Huffman coding gives the best coding efficiency
• Con
  – Need to obtain source statistics
  – Codelengths have to be integers => leads to a gap between the average codelength and the entropy
• Improvement
  – Code a group of symbols as a whole: allows a fractional # of bits/symbol
  – Arithmetic coding: fractional # of bits/symbol
  – Lempel-Ziv coding or LZW algorithm
    "universal": no need to estimate source statistics
    fixed-length codewords for variable-length source symbols
Run-Length Coding
• How to encode efficiently? E.g., a row in a binary document image:
  "0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 1 …"
• Run-length coding (RLC)
  – Code the length of each run of "0"s between successive "1"s
    run-length of "0" ~ # of "0"s between "1"s
    good when we often get long runs of "0" and sparse "1"s
  – E.g., => (7) (0) (3) (1) (6) (0) (0) … …
  – Assign a fixed-length codeword to run-lengths in a range (e.g., 0~7)
  – Or use a variable-length code like Huffman to improve further
• RLC is also applicable to a general data sequence with many consecutive "0"s (or long runs of other values)
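A minimal encoder for this scheme, assuming runs of "0" terminated by a "1" and truncated at M (names and the value of M are illustrative):

```python
def rle_encode(bits, M=15):
    """Run-length code a binary sequence dominated by 0s.

    A symbol l in 0..M-1 means "l zeros followed by a 1";
    the symbol M means "M zeros and the run continues".
    Each symbol can then be sent with ceil(log2(M+1)) bits.
    """
    symbols, run = [], 0
    for b in bits:
        if b == 0:
            run += 1
            if run == M:            # truncated run: emit M and keep counting
                symbols.append(M)
                run = 0
        else:                       # a "1" terminates the current run of zeros
            symbols.append(run)
            run = 0
    return symbols

row = [0]*7 + [1, 1] + [0]*3 + [1, 0, 1] + [0]*6 + [1, 1, 1]
print(rle_encode(row))              # [7, 0, 3, 1, 6, 0, 0] -- matches the example above
```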
Probability Model of Run-Length Coding

• Assume "0" occurs independently with probability p (p close to 1)
• Prob. of getting a run of L "0"s, with possible run lengths L = 0, 1, …, M
  – P(L = l) = p^l (1-p) for 0 ≤ l ≤ M-1 (geometric distribution)
  – P(L = M) = p^M (when having M or more "0"s)
• Avg. # of binary symbols per zero-run
  – S_avg = Σ_{l=0}^{M-1} (l+1) p^l (1-p) + M p^M = (1 – p^M) / (1 – p)
• Compression ratio
  – C = S_avg / log2(M+1) = (1 – p^M) / [(1–p) log2(M+1)]
• Example
  – p = 0.9, M = 15  =>  S_avg = 7.94, C = 1.985, H = 0.469 bpp
  – Avg. run-length coding rate B_avg = 4 bits / 7.94 ≈ 0.504 bpp
  – Coding efficiency = H / B_avg ≈ 93%
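The example numbers can be reproduced directly from the formulas above (a small check script; the function name is mine):

```python
import math

def rlc_stats(p, M):
    """Average run length, compression ratio, and coding efficiency of the
    truncated run-length code described above (0 occurs i.i.d. w.p. p)."""
    S_avg = (1 - p**M) / (1 - p)                           # avg. source symbols per run
    C = S_avg / math.log2(M + 1)                           # compression ratio
    H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))   # source entropy in bpp
    B_avg = math.log2(M + 1) / S_avg                       # coded bits per source symbol
    return S_avg, C, H, B_avg, H / B_avg

print(rlc_stats(0.9, 15))   # S_avg~7.94, C~1.99, H~0.469, B_avg~0.504, efficiency~0.93
```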
Quantization
Review of Quantization

• L-level quantization over the range [tmin, tmax]
  – Minimize errors for this lossy process
  – What L values to use?
  – Map what range of continuous values to each of the L values?
• Uniform partition
  – Maximum error = (tmax - tmin) / 2L = A / 2L, where A = tmax - tmin is the dynamic range
  – Best solution?
    Consider minimizing the maximum absolute error (min-max) vs. the MSE
    What if values in [a, b] are more likely than in other intervals?

[Figures: L-level quantization of [tmin, tmax]; uniform partition into cells [tk, tk+1) and the resulting quantization error]
Bring in Probability Distribution
[Figure: p.d.f. p_u(x) over decision levels t1 … tL+1 with reconstruction levels r1 … rL]

• Allocate more reconstruction values in more probable ranges
• Minimize error in a probability sense
  – MMSE (minimum mean square error)
    assigns a high penalty to large errors and to likely occurring values
    squared error gives convenience in the math: differentiation, etc.
• An optimization problem
  – What {tk} and {rk} to use?
  – Necessary conditions: obtained by setting partial derivatives to zero
MMSE Quantizer (Lloyd-Max)

• Reconstruction and decision levels need to satisfy
  – each decision level is the midpoint of its neighboring reconstruction levels: tk = (r_{k-1} + r_k) / 2
  – each reconstruction level is the centroid of the p.d.f. over its cell: rk = E[ u | u in [tk, tk+1) ]
  – A nonlinear problem to solve
• Solve iteratively
  – Choose initial values {tk}(0), compute {rk}(0)
  – Compute new values {tk}(1), then {rk}(1), ……
• For a large number of quantization levels
  – Approx. constant p.d.f. within [tk, tk+1), i.e., p(t) ≈ p(tk') for tk' = (tk + tk+1)/2

Reference: S. P. Lloyd, "Least Squares Quantization in PCM," IEEE Trans. Info. Theory, vol. IT-28, March 1982, pp. 129-137.
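A sketch of the iterative solution on empirical samples rather than a closed-form p.d.f. (essentially 1-D k-means; the names and the Gaussian test source are assumptions):

```python
import numpy as np

def lloyd_max(samples, L=8, iters=50):
    """Iterate the two necessary conditions on empirical data:
    decision levels = midpoints of reconstruction levels,
    reconstruction levels = centroid (mean) of the samples in each cell."""
    r = np.linspace(samples.min(), samples.max(), L)      # initial reconstruction levels
    for _ in range(iters):
        t = (r[:-1] + r[1:]) / 2                          # decision levels (midpoints)
        cells = np.digitize(samples, t)                   # assign each sample to a cell
        for k in range(L):                                # centroid rule per cell
            if np.any(cells == k):
                r[k] = samples[cells == k].mean()
    t = (r[:-1] + r[1:]) / 2                              # decision levels for the final r
    return t, r

# Example: design an 8-level MMSE quantizer for Gaussian-distributed samples
x = np.random.randn(100_000)
t, r = lloyd_max(x, L=8)
```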
MMSE Quantizer for Uniform Distribution

[Figure: uniform p.d.f. of height 1/A over [t1, tL+1] and the corresponding uniform quantizer]

• Uniform quantizer
  – Optimal for a uniformly distributed r.v. in the MMSE sense
  – MSE = q^2 / 12 with step size q = A / L
• SNR of the uniform quantizer
  – Variance of a uniformly distributed r.v. = A^2 / 12
  – SNR = 10 log10(A^2 / q^2) = 20 log10 L (dB)
  – If L = 2^B, SNR = (20 log10 2) × B ≈ 6B (dB)
    "1 bit is worth 6 dB."
  – Rate-Distortion tradeoff
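A quick empirical check of the "6 dB per bit" rule for a uniformly distributed source (the simulation parameters are arbitrary):

```python
import numpy as np

A, B = 1.0, 6                                   # dynamic range and bits per sample
L = 2 ** B
q = A / L
x = np.random.uniform(0, A, size=1_000_000)     # uniformly distributed source
xq = (np.floor(x / q) + 0.5) * q                # uniform mid-point quantizer
snr = 10 * np.log10(np.var(x) / np.mean((x - xq) ** 2))
print(snr)                                      # ~ 36 dB, i.e., about 6 dB per bit
```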
Think More About Uniform Quantizer

• Uniformly quantizing [0, 255] to 16 levels
• Compare relative changes
  – x1 = 100 ∈ [96, 112)  =>  quantized to "104" ~ 4% change
  – x2 = 12 ∈ [0, 16)  =>  quantized to "8" ~ 33% change
  – Large relative changes can be easily noticeable to the eye!
Compandor

[Block diagram: u → f(·) compressor → w → uniform quantizer → y → g(·) expander → u']

• Compressor-Expander
  – A uniform quantizer preceded and succeeded by nonlinear transformations
    the nonlinear function amplifies or suppresses particular ranges
• Why use a compandor?
  – Inputs of smaller values suffer a higher percentage of distortion under a uniform quantizer
  – Nonlinearity in perceived luminance
    a small difference in low luminance is more visible
  – The overall companding system may approximate the Lloyd-Max quantizer
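The slides do not specify a particular f(·); as one common illustrative choice, a μ-law compressor/expander pair wrapped around a uniform quantizer might look like this sketch:

```python
import numpy as np

MU = 255.0   # mu-law parameter; one common compressor choice, used here only for illustration

def compress(u):                     # f(.): expands small amplitudes before quantization
    return np.sign(u) * np.log1p(MU * np.abs(u)) / np.log1p(MU)

def expand(y):                       # g(.): inverse nonlinearity after reconstruction
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

def quantize(w, L=16):               # uniform mid-point quantizer on [-1, 1]
    q = 2.0 / L
    idx = np.clip(np.floor((w + 1) / q), 0, L - 1)
    return (idx + 0.5) * q - 1

u = np.linspace(-1, 1, 11)
u_hat = expand(quantize(compress(u)))   # companded quantization of u
```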
Quantization in Coding Correlated Sequence

• Consider: high correlation between successive samples
• Predictive coding
  – Basic principle: remove the redundancy between successive pixels and only encode the residual between the actual and predicted values
  – The residual usually has a much smaller dynamic range
    allows fewer quantization levels for the same MSE => compression
  – Compression efficiency depends on inter-sample redundancy
• First try: any problem with this codec?

[Block diagram of the first try. Encoder: predict u'P(n) = f[u(n-1)] from the previous input sample and quantize the residual e(n) = u(n) - u'P(n) into eQ(n). Decoder: predict uP(n) = f[uQ(n-1)] from the previous reconstruction and output uQ(n) = eQ(n) + uP(n).]
Predictive Coding (cont’d)
• Problem with the 1st try
  – The inputs to the predictor differ at the encoder and the decoder
    the decoder doesn't know u(n)!
  – The mismatch error could propagate to future reconstructed samples
• Solution: Differential PCM (DPCM)
  – Use the quantized sequence uQ(n) for prediction at both the encoder and the decoder
  – Simple predictor: f[x] = x
  – Prediction error e(n)
  – Quantized prediction error eQ(n)
  – Distortion d(n) = e(n) - eQ(n)

[Block diagram of DPCM. Encoder: uP(n) = f[uQ(n-1)], e(n) = u(n) - uP(n), eQ(n) = Q[e(n)], and the encoder also forms uQ(n) = uP(n) + eQ(n) for the next prediction. Decoder: uP(n) = f[uQ(n-1)], uQ(n) = eQ(n) + uP(n). Note: the "Predictor" contains a one-step buffer as input to the prediction.]
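A minimal DPCM loop with the simple predictor f[x] = x, showing that encoder and decoder stay in sync because both predict from the quantized samples (the step size and test signal are illustrative):

```python
import numpy as np

def dpcm(u, step=4.0):
    """DPCM with predictor f[x] = x and a uniform quantizer on the residual.
    Returns the transmitted quantized residuals and the decoder's reconstruction."""
    eq = np.empty_like(u, dtype=float)
    uq = np.empty_like(u, dtype=float)
    prev = 0.0                                   # uQ(n-1); same initial state at both ends
    for n, x in enumerate(u):
        e = x - prev                             # e(n) = u(n) - uP(n), with uP(n) = uQ(n-1)
        eq[n] = step * np.round(e / step)        # eQ(n): quantized residual (what is sent)
        uq[n] = prev + eq[n]                     # uQ(n): encoder's copy of the reconstruction
        prev = uq[n]
    # The decoder repeats the same recursion from eq alone:
    recon, prev = np.empty_like(eq), 0.0
    for n in range(len(eq)):
        recon[n] = prev + eq[n]
        prev = recon[n]
    return eq, recon                             # recon equals uq: no mismatch drift

u = np.cumsum(np.random.randn(20)) * 10          # a correlated test signal
eq, recon = dpcm(u)
```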
More on Predictor
• Causality required for coding purposes
  – Can't use samples that the decoder hasn't received yet as reference
• Use the last sample uQ(n-1)
  – Equivalent to coding the difference
• p-th order auto-regressive (AR) model
  – Use a linear predictor from past samples: u(n) = Σ_{i=1}^{p} a_i u(n-i) + e(n)
  – Determining the predictor coefficients (discussed more later)
• Line-by-line DPCM
  – predict from past samples in the same line
• 2-D DPCM
  – predict from past samples in the same line and from previous lines
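The slides leave the 2-D predictor coefficients for later; as a hypothetical illustration, an open-loop causal predictor from the left, top, and top-left neighbors could look like this (a real 2-D DPCM coder would predict from reconstructed, not original, neighbors, as in the loop above):

```python
import numpy as np

def predict_2d(img, a_left=0.5, a_top=0.5, a_topleft=-0.25):
    """Causal 2-D linear prediction: each pixel is predicted from already-seen
    neighbors (left, top, top-left). Coefficients here are illustrative only."""
    pred = np.zeros_like(img, dtype=float)
    for i in range(1, img.shape[0]):             # first row/column are left unpredicted in this sketch
        for j in range(1, img.shape[1]):
            pred[i, j] = (a_left * img[i, j-1] + a_top * img[i-1, j]
                          + a_topleft * img[i-1, j-1])
    return img - pred    # prediction residual: much smaller dynamic range for smooth images
```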
Vector Quantization
• Encode a set of values together
  – Find the representative combinations
  – Encode the indices of those combinations
• Stages
  – Codebook design
  – Encoder
  – Decoder
• Scalar vs. vector quantization
  – VQ allows a flexible partition of the coding cells
  – VQ can naturally exploit the correlation between elements
  – SQ is simpler to implement

[Figures: vector quantization of 2 elements vs. scalar quantization of 2 elements; from Bovik's Handbook Sec. 5.3]
Outline of Core Parts in VQ

• Design the codebook
  – The optimization formulation is similar to the MMSE scalar quantizer
  – Given a set of representative points:
    "nearest neighbor" rule to determine the partition boundaries
  – Given a set of partition boundaries:
    "probability centroid" rule to determine the representative point that minimizes the mean distortion in each cell
• Search for the codeword at the encoder
  – Exhaustive search is tedious
  – Design codebooks with special structures to speed up encoding
    e.g., tree-structured VQ

References: Wang's book, Section 8.6;
A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer;
R. M. Gray, "Vector Quantization," IEEE ASSP Magazine, pp. 4-29, April 1984.
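A sketch of codebook design that alternates the two rules above on training vectors (a generalized-Lloyd/LBG-style iteration; K, the training data, and the names are illustrative):

```python
import numpy as np

def design_codebook(train, K=16, iters=20):
    """Alternate the nearest-neighbor and centroid rules on training vectors.
    train: (N, d) array of training vectors; returns a (K, d) codebook."""
    rng = np.random.default_rng(0)
    codebook = train[rng.choice(len(train), K, replace=False)]   # initial codewords
    for _ in range(iters):
        # Nearest-neighbor rule: assign each vector to its closest codeword
        d2 = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        cell = d2.argmin(axis=1)
        # Centroid rule: each codeword becomes the mean of its cell
        for k in range(K):
            if np.any(cell == k):
                codebook[k] = train[cell == k].mean(axis=0)
    return codebook

def vq_encode(x, codebook):
    """Encoder: exhaustive nearest-neighbor search, returns codeword indices."""
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

train = np.random.randn(5000, 2)          # pairs of samples treated as 2-D vectors
cb = design_codebook(train, K=16)
indices = vq_encode(train[:10], cb)       # the decoder's output would be cb[indices]
```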
Summary: List of Compression Tools
• Lossless encoding tools
  – Entropy coding: Huffman, Lempel-Ziv, and others (arithmetic coding)
  – Run-length coding
• Lossy tools for reducing bit rate
  – Quantization: scalar quantizer vs. vector quantizer
  – Truncation: discard unimportant parts of the data
• Facilitating compression via prediction
  – Convert the full signal to a prediction residue with a smaller dynamic range
  – Encode the prediction parameters and residues with fewer bits
• Facilitating compression via transforms
  – Transform into a domain with improved energy compaction