Dalsa talk - McMaster University
Context-based Data Compression
Xiaolin Wu
Polytechnic University
Brooklyn, NY
Part 3. Context modeling
Context model – estimated symbol probability
Variable-length coding schemes need estimates of the probability of each symbol, i.e., a model.
The model can be:
Static - a fixed global model for all inputs (e.g., English text)
Semi-adaptive - computed for the specific data being coded and transmitted as side information (e.g., C programs)
Adaptive - constructed on the fly (any source!); see the sketch after this list
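As a concrete illustration of the adaptive case, here is a minimal sketch (not from the talk) of an order-0 adaptive model that estimates symbol probabilities from counts updated on the fly; the class name and the Laplace-style initialization are choices of this sketch, not part of the original material.

```python
from collections import Counter

class AdaptiveModel:
    """Order-0 adaptive model: probabilities come from running counts."""
    def __init__(self, alphabet):
        # Start every symbol at count 1 (Laplace smoothing) so that no
        # symbol has zero probability before it is first seen.
        self.counts = Counter({s: 1 for s in alphabet})
        self.total = len(alphabet)

    def prob(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # Encoder and decoder both call this after each symbol, so the
        # two sides track identical statistics with no side information.
        self.counts[symbol] += 1
        self.total += 1

model = AdaptiveModel("01")
for bit in "0010011":
    p = model.prob(bit)   # probability used to code this bit
    model.update(bit)
```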
Adaptive vs. Semi-adaptive
Advantages of semi-adaptive:
Simple decoder
Disadvantages of semi-adaptive:
Overhead of specifying the model can be high
Two passes over the data are required
Advantages of adaptive:
One pass; universal; compression as good as semi-adaptive, if not better
Disadvantages of adaptive:
Decoder is as complex as the encoder
Errors propagate
Adaptation with Arithmetic and Huffman Coding
Huffman coding - manipulate the Huffman tree on the fly. Efficient algorithms are known, but they remain complex.
Arithmetic coding - update the cumulative probability distribution table. Efficient data structures and algorithms are known; the rest of the coder stays essentially the same. (A sketch of such a table follows below.)
The main advantage of arithmetic coding over Huffman coding is the ease with which the former can be used in conjunction with adaptive modeling techniques.
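The following is a minimal sketch, assuming a small symbol alphabet, of the cumulative-count table an adaptive arithmetic coder consults; production coders use an efficient structure such as a Fenwick tree, but a linear scan shows the idea. All names here are illustrative.

```python
class CumulativeCounts:
    """Cumulative frequency table for adaptive arithmetic coding."""
    def __init__(self, num_symbols):
        self.freq = [1] * num_symbols   # never let a count reach zero

    def interval(self, s):
        # The sub-interval assigned to symbol s, expressed as
        # cumulative counts (low, high) out of the running total.
        low = sum(self.freq[:s])
        return low, low + self.freq[s], sum(self.freq)

    def update(self, s):
        self.freq[s] += 1   # the rest of the coder stays the same

table = CumulativeCounts(2)
low, high, total = table.interval(1)   # interval for symbol 1
table.update(1)
```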
Context models
If the source is not i.i.d., there are complex dependences between symbols in the sequence.
In most practical situations, the pdf of a symbol depends on neighboring symbol values, i.e., its context.
Hence we condition the encoding of the current symbol on its context.
How to select contexts? A rigorous answer is beyond our scope.
Practical schemes use a fixed neighborhood, as in the sketch below.
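A minimal sketch of a fixed-neighborhood context model: symbol counts are kept separately per context, the context here being the previous `order` symbols of a string. The function and parameter names are illustrative.

```python
from collections import defaultdict, Counter

def context_counts(seq, order=2):
    """Count symbol occurrences separately within each context,
    the context being the previous `order` symbols."""
    stats = defaultdict(Counter)
    for i in range(order, len(seq)):
        stats[seq[i - order:i]][seq[i]] += 1
    return stats

stats = context_counts("00100110111010010000", order=2)
# Estimate P(1 | context "01") from counts within that context:
c = stats["01"]
p1 = c["1"] / (c["0"] + c["1"])
```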
Context dilution problem
The minimum code length of a sequence $x_1 x_2 \cdots x_n$ achievable by arithmetic coding is $-\sum_{i=1}^{n} \log P(x_i \mid x_{i-1} x_{i-2} \cdots x_1)$, if $P(x_i \mid x_{i-1} x_{i-2} \cdots x_1)$ is known.
The difficulty of estimating $P(x_i \mid x_{i-1} x_{i-2} \cdots x_1)$ due to insufficient sample statistics prevents the use of high-order Markov models.
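A small numeric illustration of why high orders fail (the sample size is made up): the number of contexts grows exponentially with the order, so the samples available to each context collapse.

```python
# For a binary source, an order-k Markov model has 2**k contexts.
# With n samples, each context sees on average n / 2**k of them.
n = 100_000
for k in (2, 8, 16, 20):
    print(k, 2 ** k, n / 2 ** k)   # samples per context shrink fast
# k = 20 leaves about 0.1 samples per context: far too few to
# estimate P(x_i | context) reliably -- context dilution.
```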
Estimating probabilities in different contexts
Two approaches:
Maintain symbol occurrence counts within each context; the number of contexts needs to be modest to avoid context dilution.
Assume the pdf shape within each context is the same (e.g., Laplacian) and only the parameters (e.g., mean and variance) differ; estimation may not be as accurate, but a much larger number of contexts can be used (see the sketch below).
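A minimal sketch of the second approach, assuming a Laplacian shape: only a mean and a scale are estimated from the samples seen in one context, so even sparse contexts yield usable estimates. Function names are illustrative.

```python
import math

def laplacian_params(samples):
    """Fit the assumed Laplacian shape within one context by
    estimating only its mean and scale (two parameters)."""
    mu = sum(samples) / len(samples)
    b = sum(abs(x - mu) for x in samples) / len(samples)  # ML scale
    return mu, b

def laplacian_pdf(x, mu, b):
    return math.exp(-abs(x - mu) / b) / (2 * b)

# A handful of samples per context suffices for two parameters,
# so far more contexts can be used than with full histograms.
mu, b = laplacian_params([-2, 1, 0, 3, -1])
p = laplacian_pdf(2, mu, b)
```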
Entropy (Shannon 1948)
Consider a random variable $X$ with source alphabet $\{x_0, x_1, \ldots, x_{N-1}\}$ and probability mass function $P_X(x_i) = \mathrm{prob}(X = x_i)$.
Self-information of $x_i$: $i(x_i) = -\log P(x_i)$, measured in bits if the log is base 2. An event of lower probability carries more information.
Self-entropy is the weighted average of self-information:
$H(X) = -\sum_i P(x_i) \log P(x_i)$
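The definition translates directly into code; a short sketch:

```python
import math

def entropy(pmf):
    """H(X) = -sum_i P(x_i) * log2 P(x_i), in bits per symbol."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin
print(entropy([0.9, 0.1]))   # ~0.469 bits: skew lowers entropy
```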
Conditional Entropy
Consider two random variables $X$ and $C$.
Alphabet of $X$: $\{x_0, x_1, \ldots, x_{N-1}\}$
Alphabet of $C$: $\{c_0, c_1, \ldots, c_{M-1}\}$
Conditional self-information of $x_i$ is $i(x_i \mid c_m) = -\log P(x_i \mid c_m)$.
Conditional entropy is the average value of conditional self-information:
$H(X \mid C) = -\sum_{i=0}^{N-1} \sum_{m=0}^{M-1} P(x_i \mid c_m) P(c_m) \log_2 P(x_i \mid c_m)$
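A short sketch of the formula, using the identity $P(x_i \mid c_m) P(c_m) = P(x_i, c_m)$ to work from a joint pmf; the example numbers are illustrative.

```python
import math

def conditional_entropy(p_joint):
    """H(X|C) = -sum_{i,m} P(x_i, c_m) * log2 P(x_i | c_m),
    where p_joint[m][i] holds P(x_i, c_m)."""
    h = 0.0
    for row in p_joint:          # one row per context c_m
        p_c = sum(row)           # P(c_m)
        for p_xc in row:
            if p_xc > 0:
                h -= p_xc * math.log2(p_xc / p_c)
    return h

# Binary X and C; knowing C makes X highly skewed:
print(conditional_entropy([[0.45, 0.05],
                           [0.05, 0.45]]))   # ~0.469 < H(X) = 1.0
```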
Entropy and Conditional Entropy
The conditional entropy $H(X \mid C)$ can be interpreted as the amount of uncertainty remaining about $X$, given that we know the random variable $C$.
The additional knowledge of $C$ should reduce the uncertainty about $X$:
$H(X \mid C) \le H(X)$
Context Based Entropy Coders
Consider a sequence of symbols; the symbols preceding the symbol to code form the context space $C$:
0 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 0 1 1 1 0 1 0 1 1 0 1 1 0 1 1 0 ...
With the three previous bits as context there are eight contexts, $C_0 = 000$, $C_1 = 001$, ..., $C_7 = 111$. Each context $C_i$ feeds its own entropy coder (EC) operating at rate $H(C_i)$, so the overall rate is
$R = \sum_i P(C_i) H(C_i)$
(A sketch computing this rate follows below.)
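A minimal sketch of this scheme's rate computation: gather counts per 3-bit context and weight each context's entropy by its frequency. The bit string below is illustrative, not the exact sequence from the slide.

```python
import math
from collections import defaultdict, Counter

def context_rate(bits, order=3):
    """R = sum_i P(C_i) * H(C_i): per-context entropy weighted by
    how often each context occurs."""
    stats = defaultdict(Counter)
    for i in range(order, len(bits)):
        stats[bits[i - order:i]][bits[i]] += 1
    n = len(bits) - order
    rate = 0.0
    for cnt in stats.values():
        total = sum(cnt.values())
        h = -sum(c / total * math.log2(c / total) for c in cnt.values())
        rate += (total / n) * h      # P(C_i) * H(C_i)
    return rate

bits = "00100110111010010010000100111010110110"
print(context_rate(bits))   # conditional rate, at most the order-0 entropy
```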
Decorrelation techniques to exploit sample smoothness
Transforms: DCT, FFT, wavelets
Differential Pulse Code Modulation (DPCM): predict the current symbol from past observations,
$\hat{x}_i = \sum_{t=1}^{d} a_t x_{i-t}$
and code the prediction residual rather than the symbol itself:
$e_i = x_i - \hat{x}_i, \qquad x_i = \hat{x}_i + e_i$
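A minimal DPCM sketch with fixed predictor coefficients $a_t$ (the coefficients and sample values are illustrative):

```python
def dpcm_encode(samples, coeffs):
    """Predict each sample from the previous d samples and emit the
    residual e_i = x_i - xhat_i."""
    d = len(coeffs)
    residuals = []
    for i, x in enumerate(samples):
        past = samples[max(0, i - d):i][::-1]   # x_{i-1}, x_{i-2}, ...
        pred = sum(a * p for a, p in zip(coeffs, past))
        residuals.append(x - pred)
    return residuals

def dpcm_decode(residuals, coeffs):
    """Invert the encoder: x_i = xhat_i + e_i, predicting from the
    samples decoded so far."""
    d = len(coeffs)
    samples = []
    for e in residuals:
        past = samples[-d:][::-1]
        pred = sum(a * p for a, p in zip(coeffs, past))
        samples.append(pred + e)
    return samples

xs = [10, 12, 13, 13, 14, 16, 15]
es = dpcm_encode(xs, coeffs=[1.0])   # a_1 = 1: previous-sample predictor
assert dpcm_decode(es, coeffs=[1.0]) == xs
```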
Benefits of prediction and transform
A priori knowledge exploited to reduce the self-entropy
of the source symbols
Higher coding efficiency due to
Fewer parameters to be estimated adaptively
Faster convergence of adaptation
Further Reading
Text Compression, T. Bell, J. Cleary and I. Witten, Prentice Hall. Good coverage of statistical context modeling, though the focus is on text.
Articles in IEEE Transactions on Information Theory by Rissanen and Langdon.
Digital Coding of Waveforms: Principles and Applications to Speech and Video, Jayant and Noll. Good coverage of predictive coding.