Watermarking with Side Information


Information Technologies for IPR Protection
Outline
• Information-theoretic basis
• Channels with side information
• Writing on dirty paper
• Dirty paper codes
Basic Definitions
Mutual information:

I(X;Y) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}

Entropy and joint entropy:

H(X) = -\sum_x p(x) \log p(x)

H(X,Y) = -\sum_x \sum_y p(x,y) \log p(x,y)

Conditional entropy:

H(Y|X) = \sum_x p(x) H(Y|X=x)
       = -\sum_x p(x) \sum_y p(y|x) \log p(y|x)
       = -\sum_x \sum_y p(x,y) \log p(y|x)

Relations among these quantities:

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y)

H(X,Y) = H(X) + H(Y|X)

[Venn diagram: H(X,Y) is the union of the overlapping regions H(X) and H(Y); it decomposes into H(X|Y), I(X;Y), and H(Y|X).]
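These identities are easy to check numerically. The sketch below uses an arbitrary 2×2 joint distribution (the numbers are illustrative, not from the slides):

```python
import numpy as np

# Arbitrary joint distribution p(x, y) over a 2x2 alphabet (illustrative).
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_x = p_xy.sum(axis=1)   # marginal p(x)
p_y = p_xy.sum(axis=0)   # marginal p(y)

def H(p):
    """Entropy in bits, treating 0*log(0) as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# I(X;Y) via the entropy identity and via the direct definition.
I_identity = H(p_x) + H(p_y) - H(p_xy.ravel())
I_direct = sum(p_xy[i, j] * np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
               for i in range(2) for j in range(2))
assert np.isclose(I_identity, I_direct)
print(I_identity)   # ~0.278 bits
```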
Communication Rates
• Code rate: a property of a code

  R = \frac{\log_2 |M|}{L}

  – |M|: the number of distinct messages that can be encoded with sequences of length L
  – R: the number of bits of information encoded in each symbol
• Achievable rate: a property of a communication channel
  – Due to the noise in the channel, even the best code will have some probability of error.
  – A rate is achievable if we can obtain an arbitrarily low probability of error by using codes with arbitrarily long symbol sequences and correspondingly large sets of messages. (A worked example follows.)
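For example (values chosen purely for illustration): a code that represents each of |M| = 256 messages by a distinct sequence of length L = 64 has rate R = log2(256)/64 = 8/64 = 0.125 bits per symbol.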
Channel Capacity
• The capacity of a channel is defined as the maximum achievable rate for that channel, or more precisely, the supremum of all achievable rates.
• Capacity of arbitrary channels
  – Any channel can be characterized by a conditional distribution P_{y|x}(y), which gives the probability of receiving y when x is transmitted.
  – The capacity of the channel is (a numeric sketch follows)

    C = \max_{P(x)} I(x;y)
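As a concrete instance of this maximization, consider a binary symmetric channel with crossover probability 0.1 (a standard textbook channel used here purely for illustration; it does not appear on the slides). A grid search over input distributions P(x) recovers its capacity:

```python
import numpy as np

eps = 0.1  # crossover probability of an illustrative binary symmetric channel
# P(y|x): rows indexed by x, columns by y.
P_y_given_x = np.array([[1 - eps, eps],
                        [eps, 1 - eps]])

def mutual_information(p_x):
    p_xy = p_x[:, None] * P_y_given_x     # joint p(x, y)
    p_y = p_xy.sum(axis=0)                # marginal p(y)
    mask = p_xy > 0
    return np.sum(p_xy[mask] *
                  np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]))

# Grid search over input distributions P(x) = (q, 1 - q).
qs = np.linspace(0.001, 0.999, 999)
C = max(mutual_information(np.array([q, 1 - q])) for q in qs)
print(C)   # ~0.531 bits/use, attained at q = 0.5 (closed form: 1 - H(eps))
```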
Capacity of AWGN Channels
[Channel diagram: the noise n is added to the transmitted symbol x, producing the received symbol y = x + n.]
• x: a transmitted symbol, limited by a power constraint

  \frac{1}{N} \sum_{i=1}^{N} x[i]^2 \le p

• n: the additive noise, drawn from a Gaussian distribution with variance \sigma_n^2.
• The capacity of such a channel is (checked numerically below)

  C = \frac{1}{2} \log_2\!\left( 1 + \frac{p}{\sigma_n^2} \right)
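A one-line numeric check of this formula (the power and noise values are arbitrary):

```python
import math

p, sigma_n2 = 1.0, 0.25   # signal power and noise variance (illustrative)
C = 0.5 * math.log2(1 + p / sigma_n2)
print(C)                  # ~1.161 bits per channel use
```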
Viewing the Capacity of AWGN
Channels Geometrically
[Figure A.1, p. 365]
• Each code word of length L can be viewed as a point in an L-dimensional space.
• The power constraint restricts all code words to lie within a sphere of radius (Lp)^{1/2}, centered at zero.
• y lies within a sphere around x, whose radius is set by the noise variance. (A sphere-packing count follows.)
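From this picture, the capacity formula follows by the standard sphere-packing count (sketched here; the argument is not spelled out on the slide): received vectors fall within a ball of radius \sqrt{L(p+\sigma_n^2)}, each code word claims a noise ball of radius \sqrt{L\sigma_n^2}, and the number of disjoint noise balls that fit bounds the number of distinguishable messages. Since the volume of an L-ball scales as r^L,

|M| \le \left( \frac{\sqrt{L(p+\sigma_n^2)}}{\sqrt{L\sigma_n^2}} \right)^{\!L} = \left( 1 + \frac{p}{\sigma_n^2} \right)^{\!L/2}
\quad\Rightarrow\quad
R = \frac{\log_2 |M|}{L} \le \frac{1}{2} \log_2\!\left( 1 + \frac{p}{\sigma_n^2} \right)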
Writing on Dirty Paper
• Imagine that we have a piece of paper covered with independent dirt spots of normally distributed intensity, and we write a message on it with a limited amount of ink.
• The dirty paper with the message on it is then sent to someone else, acquiring more normally distributed dirt along the way.
• If the recipient cannot distinguish between the ink and the dirt, how much information can we reliably send?
Dirty Paper Channels
[Channel diagram: the message m enters the Transmitter, which outputs x; the first noise source adds s (known to the Transmitter in advance) and the second noise source adds n; the Receiver sees y and outputs the decoded message.]
• The channel has two independent white Gaussian noise sources, generating s and n.
• Before the transmitter chooses a signal x to send, it is told the exact noise s that will be added to the signal by the first source.
• The signal x is limited by a power constraint:

  \frac{1}{N} \sum_{i=1}^{N} x[i]^2 \le p

• The first noise source has no effect on the channel capacity!
A Dirty-Paper Code for a Simple
Channel
[Figure 5.14, p. 137]
• The first noise source is restricted to produce only one of two specific noise vectors: s = s1 or s = s2.
• Assume that s = s1. The transmitter must choose a vector x, limited by the power constraint p, to encode a message.
• Since s1 is known, the transmitter can instead choose a desired value of the sum u = x + s1, and then set x = u - s1. u must lie within a ball of radius (Lp)^{1/2} centered at s1.
• The problem of choosing u is essentially the same as that of choosing a signal to transmit over an AWGN channel.
A Dirty-Paper Code for a Simple
Channel (cont.)
• For the case where the first noise source generates s2, the transmitter chooses u from within a ball of radius (Lp)^{1/2} centered at s2.
• If s1 and s2 are sufficiently different from each other, the transmitter can simply use two different sets of code vectors.
• The transmitter decides on a value of u based on the message m, knowing that s will be added to whatever it transmits. It then transmits x = u - s.
A Dirty-Paper Code for a Simple
Channel (cont.)
• The receiver receives y = x + s + n = u + n. It searches for the code vector closest to y and outputs the associated message.
• The resulting transmission is reliable, and the capacity is unaffected by the first noise. (A small simulation follows.)
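The two-vector scheme can be simulated directly. In the sketch below, the block length, codebook construction, and noise levels are arbitrary illustrative choices (a real code would place codewords under an explicit power constraint):

```python
import numpy as np

rng = np.random.default_rng(0)
L, sigma_n = 64, 0.5       # block length and second-noise std (illustrative)

# Two known first-noise vectors; for each, one codeword u per message,
# placed near that s so that x = u - s has modest power.
s_vectors = [rng.normal(0, 5, L), rng.normal(0, 5, L)]
codebook = {(k, m): s + rng.normal(0, 1, L)
            for k, s in enumerate(s_vectors) for m in range(2)}

def transmit(m, k):
    """Encode message m, knowing the first noise will be s_vectors[k]."""
    s = s_vectors[k]
    u = codebook[(k, m)]
    x = u - s                                  # cancels the known first noise
    return x + s + rng.normal(0, sigma_n, L)   # channel output y = u + n

def receive(y):
    """Nearest-codeword decoding over the whole codebook."""
    k, m = min(codebook, key=lambda key: np.linalg.norm(y - codebook[key]))
    return m

assert receive(transmit(1, 0)) == 1
assert receive(transmit(0, 1)) == 0
```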
Dirty-Paper Codes for More
Complex Channels (I)
• We would like to generate a set of code vectors such that, whatever the value of s, we will always find at least one code vector for each message within a distance of (Lp)^{1/2} of it.
• Find a set U of code vectors and divide it into subsets (cosets) U_m corresponding to the possible messages m.
• All code vectors in any given coset represent the same message, and they are widely distributed so that one of them will probably be close to s, whatever value s may take.
Dirty-Paper Codes for More
Complex Channels (I)
• Given a message m and a known noise vector s, the transmitter finds the member u of U_m that is closest to s. It then transmits x = u - s.
• The receiver finds the member of U that is closest to the received vector y, and then identifies the message by determining which coset contains that code vector. (A scalar-lattice sketch follows.)
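A classic practical realization of this coset structure is scalar lattice coding (quantization-index style): each message owns a shifted uniform quantizer, so the members of U_m are the points of a shifted lattice. This is an illustration of the coset idea, not a construction given on the slides:

```python
import numpy as np

DELTA = 1.0   # quantization step (illustrative)
N_MSG = 2     # number of distinct messages

def nearest_in_coset(v, m):
    """Per-sample nearest member of coset U_m = DELTA*Z + m*DELTA/N_MSG."""
    shift = m * DELTA / N_MSG
    return np.round((v - shift) / DELTA) * DELTA + shift

def embed(m, s):
    """Transmit x = u - s, where u is the member of U_m closest to s."""
    return nearest_in_coset(s, m) - s          # each |x[i]| <= DELTA/2

def decode(y):
    """Identify the coset whose nearest point is closest to y."""
    d = [np.linalg.norm(y - nearest_in_coset(y, m)) for m in range(N_MSG)]
    return int(np.argmin(d))

rng = np.random.default_rng(1)
s = rng.normal(0, 10, 32)                      # the known first noise
y = embed(1, s) + s + rng.normal(0, 0.05, 32)  # channel adds s, then small n
assert decode(y) == 1
```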
Dirty-Paper Codes for More
Complex Channels (II)
• To achieve full capacity, we want exactly one code word per message within the ball around every given value of s.
• We must ensure that a code word is always available for each message: when one of a message's code words lies on the edge of the ball, another must already lie inside it.
• Now we transmit x = a(u - s), where 0 < a < 1.
• After the channel adds s, the result is u' = au + (1-a)s, which is a weighted average of u and s.
• The power constraint on the search for u is now relaxed to a ball of radius (Lp)^{1/2}/a.
• Trade-off: the closer a is to 1, the less noise is added to u, but the stricter the power constraint during the search; the closer a is to 0, the greater the noise, but the more relaxed the power constraint. (A distortion-compensated sketch follows.)
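In Costa's construction, the optimal trade-off for the Gaussian channel is a = p/(p + σ_n²), a standard result that the slide does not state. The scalar-lattice sketch above extends directly to this distortion-compensated form (again an illustration, not the slides' construction):

```python
import numpy as np

DELTA, N_MSG = 1.0, 2

def nearest_in_coset(v, m):
    shift = m * DELTA / N_MSG
    return np.round((v - shift) / DELTA) * DELTA + shift

def embed_dc(m, s, a):
    """Distortion-compensated embedding: transmit x = a*(u - s)."""
    return a * (nearest_in_coset(s, m) - s)

def decode(y):
    d = [np.linalg.norm(y - nearest_in_coset(y, m)) for m in range(N_MSG)]
    return int(np.argmin(d))

rng = np.random.default_rng(2)
p, sigma_n2 = 0.05, 0.01        # power budget and noise variance (illustrative)
a = p / (p + sigma_n2)          # Costa's optimal scaling factor
s = rng.normal(0, 10, 32)
y = embed_dc(1, s, a) + s + rng.normal(0, np.sqrt(sigma_n2), 32)
assert decode(y) == 1           # decoding survives the (1-a) self-noise plus n
```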
Similarity between the Dirty-Paper
Channel and a Watermarking System
[Diagram: the message m and the original signal c_o enter the Encoder, which produces the pattern w and outputs the watermarked work c_w; the noise source adds n, and the Decoder receives c'_w and outputs m'.]
• The first noise (the dirty paper) s plays the role of the original signal.
• The transmitted signal x is the added pattern.
• The power constraint p expresses the fidelity constraint.
• The second noise n is the distortion that occurs due to normal processing or malicious tampering.
• If a watermarking system behaves enough like the dirty-paper channel, its maximum data payload should not depend on the cover media.
Differences between the Dirty-Paper
Channel and a Watermarking System
• The original signal and subsequent noise are rarely Gaussian.
• Representing the fidelity constraint as a power constraint implies that distortion is measured by MSE, which is a poor predictor of perceptual distance.
General Form of Channels with
Side Information
[Diagram: the message m and the side information s enter the Transmitter, which outputs x; the channel P_{y|x,s}(y) produces y, which the Receiver decodes into the output message.]
• For each use of the channel, t, a random value s[t] is drawn from a set of possible values, S.
• Before the transmitter chooses a codeword to transmit, it is told what all the values of s[t] will be during the entire transmission.
• The transmitter then chooses a sequence of L values, x, and transmits it.
• The sequence received by the receiver, y[t], is a random value dependent on both x[t] and s[t]. The dependency is described by a conditional probability distribution, P_{y|x=x[t], s=s[t]}(y).
Capacity of Channels with Side
Information

C = \max_{P(u,x|s)} \left( I(u;y) - I(u;s) \right)

• s: the value of the side information known to the transmitter.
• u: an auxiliary variable. We can regard it as an element of a code word, u, chosen from a set U; it depends on both the message m and the side information s. (The Gaussian case is worked out below.)
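For the dirty-paper channel this maximization can be evaluated in closed form (Costa's result, consistent with the earlier claim that the first noise source costs nothing; the derivation is not on these slides). Choosing u = x + as, with x drawn independently of s from a Gaussian of power p and a = p/(p + \sigma_n^2), gives

I(u;y) - I(u;s) = \frac{1}{2} \log_2\!\left( 1 + \frac{p}{\sigma_n^2} \right)

which is exactly the capacity of the AWGN channel with the first noise source removed.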