PhD Defense
Perceptually Based Methods for
Robust Image Hashing
Vishal Monga
Committee Members:
Prof. Ross Baldick
Prof. Brian L. Evans (Advisor)
Prof. Wilson S. Geisler
Prof. Joydeep Ghosh
Prof. John E. Gilbert
Prof. Sriram Vishwanath
Ph.D. Defense
Communications, Networks, and Systems Area
Dept. of Electrical and Computer Engineering
The University of Texas at Austin
April 13th, 2005
1
Introduction
The Dichotomy of Image Hashing
[Figure: an image is mapped by the hash function f to its image hash]
• Signal processing methods
– Capture perceptual attributes
well, yield robust and visually
meaningful representations
– Little can be said about how
secure these representations are
• Cryptographic methods
– Provably secure
– However, do not respect
underlying structure on
signals/images
Towards a joint signal processing-cryptographic approach…
2
Introduction
Hash Example
• Hash function: Projects value from set with large
(possibly infinite) number of members to set with
fixed number of (fewer) members
– Irreversible
– Provides short, simple representation
of large digital message
– Example: sum of ASCII codes for
characters in name modulo N (= 7),
a prime number
Database name search example:

Name          Hash Value
Ghosh         1
Monga         2
Baldick       3
Vishwanath    3
Evans         5
Geisler       5
Gilbert       6
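As a toy illustration of this example, the sketch below hashes a name by summing the ASCII codes of its characters modulo the prime N = 7. Exact values depend on details such as capitalization, so they may differ slightly from the table above.

```python
def name_hash(name, modulus=7):
    """Toy hash: sum of ASCII codes of the characters, modulo a prime."""
    return sum(ord(ch) for ch in name) % modulus

for name in ["Ghosh", "Monga", "Baldick", "Vishwanath",
             "Evans", "Geisler", "Gilbert"]:
    print(f"{name:<12} {name_hash(name)}")
```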
3
Introduction
Image Hashing: Motivation
• Hash functions
– Fixed length binary string from a large digital message
– Used in compilers, database searching, cryptography
– Cryptographic hash: security applications e.g. message
authentication, ensuring data integrity
• Traditional cryptographic hash
– Not suited for multimedia: very sensitive to the input, i.e. a change in one input bit changes the output dramatically
• Need for robust perceptual image hashing
– Perceptual: based on human visual system response
– Robust: hash values for “perceptually identical” images
must be the same (with high probability)
4
Introduction
Image Hashing: Applications
• Applications
– Image database search and indexing
– Content dependent key generation for watermarking
– Robust image authentication: hash must tolerate incidental
modifications yet be sensitive to content changes
[Figure: an original image and its JPEG-compressed version map to the same hash value h1; a tampered version maps to a different hash value h2]
5
Outline
• Perceptual image hashing
  – Motivation & applications
• Contribution # 1: A unified framework
  – Formal definition of desired hash properties/goals
  – Novel two-stage hashing algorithm
• Review of existing feature extraction techniques
• Contribution # 2: Robust feature extraction
• Contribution # 3: Clustering algorithms for feature vector compression
  – Randomized clustering for secure hashing
• Summary
6
Contribution # 1: A Unified Framework
Perceptual Hash: Desirable Properties
• Hash function takes two inputs
  – Image I ∈ I (class of images, e.g. natural images)
  – Secret key K ∈ K (key space)
• Perceptual robustness
  Pr( H(I, K) = H(I_ident, K) ) ≥ 1 − ε₁
• Fragility to visually distinct inputs
  Pr( H(I, K) ≠ H(I_diff, K) ) ≥ 1 − ε₂
• Unpredictability
  Pr( H(I, K) = v ) ≈ 1/2^q,  ∀ v ∈ {0, 1}^q

Symbol         Meaning
H(I, K)        Hash value extracted from image I
I_ident ∈ I    Image identical in appearance to I
I_diff ∈ I     Image clearly distinct in appearance vs. I
q              Length of hash (in bits)
7
Contribution # 1: A Unified Framework
Hashing Framework
• Two-stage hash algorithm [Monga & Evans, 2004]
Input Image I (e.g. 1 MB)
  → Stage 1: Extract visually robust feature vector
  → Intermediate hash
  → Stage 2: Compression (clustering of similar feature vectors)
  → Final Hash (e.g. 128 bits)
• Feature vectors extracted from “perceptually
identical” images should be close in some distance
metric
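A minimal sketch of this two-stage structure is given below; extract_features and compress_features are placeholders for the feature extraction stage (Contribution #2) and the clustering-based compression stage (Contribution #3) described later.

```python
def perceptual_hash(image, key, extract_features, compress_features):
    intermediate_hash = extract_features(image, key)   # visually robust feature vector
    return compress_features(intermediate_hash, key)   # short final hash
```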
8
Outline
• Perceptual image hashing
  – Motivation & applications
• Contribution # 1: A unified framework
  – Formal definition of desired hash properties/goals
  – Novel two-stage hashing algorithm
• Review of existing feature extraction techniques
• Contribution # 2: Robust feature extraction
• Contribution # 3: Clustering algorithms for feature vector compression
  – Randomized clustering for secure hashing
• Summary
9
Existing techniques
Invariant Feature Extraction
• Image statistics based approaches
– Intensity statistics: intensity histograms of image blocks [Schneider et al., 1996]; mean, variance and kurtosis of intensity values extracted from image blocks [Kailasanathan et al., 2001]
– Statistics of wavelet coefficients [Venkatesan et al., 2000]
• Relation based approaches [Lin & Chang, 2001]
– Invariant relationship between corresponding discrete cosine
transform (DCT) coefficients in two 8 × 8 blocks
• Preserve coarse representations
– Threshold low frequency DCT coefficients [Fridrich et al., 2001]
– Low-res wavelet sub-bands [Mihcak & Venkatesan, 2000, 2001]
– Singular values and vectors of sub-images [Kozat et al., 2004]
10
Related Work
Open Issues
• A robust feature point scheme for hashing
  – Inherent sensitivity to content-changing manipulations (useful in authentication)
  – Representation of image content robust to global and local geometric distortions
  – Exploit properties of the human visual system
• Randomized algorithms for secure image hashing
  – Quantifying the impact of randomization in enhancing hash security
  – Trade-offs with robustness/perceptual significance of the hash

Necessitates a joint signal processing-cryptographic approach
11
Outline
• Perceptual image hashing
  – Motivation & applications
  – Review of existing techniques
• Contribution # 1: A unified framework
  – Formal definition of desired hash properties/goals
  – Novel two-stage hashing algorithm
• Contribution # 2: Robust feature extraction
• Contribution # 3: Clustering algorithms for feature vector compression
  – Randomized clustering for secure hashing
• Summary
12
Hypercomplex or End-Stopped Cells
• Cells in visual cortex that help in object recognition
– Respond strongly to line end-points, corners and points of
high curvature [Hubel et al.,1965; Dobbins, 1989]
• End-stopped wavelet basis [Vandergheynst et al., 2000]
– Apply First Derivative of Gaussian (FDoG) operator to
detect end-points of structures identified by Morlet wavelet
[Figure: synthetic L-shaped image, its Morlet wavelet response, and its end-stopped wavelet response]
13
Contribution # 2: Robust Feature Extraction
Computing Wavelet Transform
• Generalize the end-stopped wavelet
  Ψ_E(x, y) = (FDoG) ∘ (Ψ_M(x, y; σ))
• Employ a wavelet family
  { Ψ_E(α^i (x, y), θ_k) },  α ∈ R,  i ∈ Z
  – Scale parameter α = 2; i – scale of the wavelet
  – Discretize the orientation range [0, π] into M intervals, i.e.
    θ_k = kπ/M,  k = 0, 1, …, M − 1
• End-stopped wavelet transform
  W_i(x, y, θ) = ∫∫ I(x₁, y₁) Ψ*_E(α^i (x − x₁, y − y₁), θ) dx₁ dy₁
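An illustrative sketch of this idea (not the exact basis of Vandergheynst et al.) is given below: filter the image with an oriented 2-D Morlet wavelet, then apply a first-derivative-of-Gaussian (FDoG) to emphasize end-points. The kernel size, σ, and wave number k0 are assumed, placeholder values.

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter1d

def morlet_kernel(size=21, sigma=3.0, theta=0.0, k0=2.5):
    """Real part of a 2-D Morlet wavelet oriented at angle theta."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(k0 * x_rot / sigma)

def end_stopped_response(image, sigma=3.0, theta=0.0):
    """Morlet filtering followed by a first-derivative-of-Gaussian (FDoG)."""
    morlet = convolve(image.astype(float), morlet_kernel(sigma=sigma, theta=theta))
    # FDoG applied along the rows here for simplicity; order=1 gives the first derivative
    return gaussian_filter1d(np.abs(morlet), sigma=sigma, axis=1, order=1)
```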
14
Contribution # 2: Robust Feature Extraction
Proposed Feature Detection Method
[Monga & Evans, 2004]
1. Compute the wavelet transform of image I at a suitably chosen scale i for several different orientations
2. Significant feature selection: locations (x, y) in the image that are identified as candidate feature points satisfy
   W_i(x, y, θ) = max_{(x', y') ∈ N(x, y)} |W_i(x', y', θ)|
3. Avoid trivial (and fragile) features: qualify a location as a final feature point if
   max_θ |W_i(x, y, θ)| > T
• Randomization: Partition image into N (overlapping) random regions using a
secret key K, extract features from each random region
• Perceptual Quantization: Quantize feature vector based on distribution
(histogram) of image feature points to enhance robustness
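A sketch of the selection rule in steps 2 and 3 above follows, assuming the oriented wavelet responses have already been computed and stacked into one array; the neighborhood size and threshold T are placeholder values.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def select_feature_points(responses, neighborhood=5, T=0.5):
    """responses: array of shape (num_orientations, H, W) holding W_i(x, y, theta)."""
    # Maximum response magnitude over orientations at each pixel (used in step 3)
    max_over_theta = np.max(np.abs(responses), axis=0)
    # Step 2: keep locations that are local maxima over a neighborhood window
    local_max = maximum_filter(max_over_theta, size=neighborhood)
    is_peak = max_over_theta == local_max
    # Step 3: discard trivial/fragile features below the threshold T
    return np.argwhere(is_peak & (max_over_theta > T))
```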
15
Contribution # 2: Robust Feature Extraction
Iterative Feature Extraction Algorithm
[Monga & Evans, 2004]
1. Extract feature vector f of length P from image I, quantize f perceptually to obtain a binary string b_f1 (increase count*)
2. Remove "weak" image geometry: compute 2-D order statistics (OS) filtering of I to produce I_os = OS(I; p, q, r)
3. Preserve "strong" image geometry: perform low-pass linear shift invariant (LSI) filtering on I_os to obtain I_lp
4. Repeat step 1 with I_lp to obtain b_f2
5. IF (count = MaxIter) go to step 6
   ELSE IF D(b_f1, b_f2) < ρ go to step 6
   ELSE set I = I_lp and go to step 1
6. Set fv(I) = b_f2

Notes: MaxIter, ρ, P, and count are algorithm parameters; *count = 0 to begin with; fv(I) denotes the quantized feature vector; D(·,·) is the normalized Hamming distance between its arguments.
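A skeleton of this loop is sketched below; extract_quantized_features, os_filter, and lowpass_filter are placeholders for the wavelet-based extractor/quantizer, the order-statistics filter, and the LSI low-pass filter, and the parameter values are illustrative.

```python
import numpy as np

def normalized_hamming(a, b):
    """D(.,.): normalized Hamming distance between two equal-length binary strings."""
    return float(np.mean(np.asarray(a) != np.asarray(b)))

def iterative_feature_vector(image, extract_quantized_features,
                             os_filter, lowpass_filter,
                             rho=0.1, max_iter=10):
    count = 0
    while True:
        bf1 = extract_quantized_features(image)        # step 1
        count += 1
        smoothed = lowpass_filter(os_filter(image))    # steps 2-3
        bf2 = extract_quantized_features(smoothed)     # step 4
        if count >= max_iter or normalized_hamming(bf1, bf2) < rho:
            return bf2                                 # step 6
        image = smoothed                               # step 5: iterate
```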
16
Contribution # 2: Robust Feature Extraction
Image Features at Algorithm Convergence
[Figure: feature points at algorithm convergence for the original image, the image with additive white Gaussian noise (zero mean, σ = 10), JPEG compression with quality factor 10, and a Stirmark local geometric attack]
17
Contribution # 2: Robust Feature Extraction
Quantitative Results: Feature Extraction
• Quantized feature vector comparison
D(fv(I), fv(Iident)) < 0.2
D(fv(I), fv(Idiff)) > 0.3
Table 1. Comparison of quantized feature vectors: normalized Hamming distance between quantized feature vectors of original and attacked* images
(*Attacked images generated by the Stirmark benchmark software)

Attack                  Lena   Bridge   Peppers
JPEG, QF = 10           0.04   0.04     0.06
AWGN, σ = 20            0.04   0.03     0.02
Contrast Enhancement    0.00   0.06     0.04
Gaussian Smoothing      0.01   0.03     0.05
Median Filtering        0.02   0.08     0.12
Scaling by 50%          0.18   0.12     0.21
Rotation by 2°          0.03   0.14     0.15
Rotation by 5°          0.20   0.13     0.22
Cropping by 10%         0.07   0.11     0.14
Cropping by 20%         0.19   0.15     0.24
18
Contribution # 2: Robust Feature Extraction
Comparison with other approaches
Attack                      Threshold coarse wavelet   Preserve low-freq DCT     Proposed feature
                            coefficients               coefficients              point detector
                            (Mihcak et al., 2001)      (Fridrich et al., 2001)
JPEG, QF = 10               YES                        YES                       YES
AWGN, σ = 20                YES                        NO                        YES
Gaussian Smoothing          YES                        YES                       YES
Median Filtering            YES                        YES                       YES
Scaling 50%                 YES                        YES                       NO
Rotation 2 degrees          NO                         YES                       NO
Cropping 10%                NO                         NO                        YES
Cropping 20%                YES                        YES                       YES
* Small object addition     YES                        NO                        NO
* Tamper with facial
  features                  YES                        YES                       NO
YES = survives attack, i.e. features were close
* = content-changing manipulations, should be detected
19
Outline
• Perceptual image hashing
  – Motivation & applications
• Contribution # 1: A unified framework
  – Formal definition of desired hash properties/goals
  – Novel two-stage hashing algorithm
• Review of existing feature extraction techniques
• Contribution # 2: Robust feature extraction
• Contribution # 3: Clustering algorithms for feature vector compression
  – Randomized clustering for secure hashing
• Summary
20
Clustering: Problem Statement
Feature Vector Compression
• Goals in compressing to a final hash value
  – Significant dimensionality reduction while retaining robustness, fragility to distinct inputs, and randomization
  – Question: minimum length of the final hash value (binary string) needed to meet the above goals?
• Problem statement
  – Let (l_i, l_j) denote vectors in the metric space of feature vectors V; then it is desired, with high probability, that
      if D(l_i, l_j) < ε then C(l_i) = C(l_j)
      if D(l_i, l_j) > δ then C(l_i) ≠ C(l_j)
    where 0 < ε < δ and C(l_i), C(l_j) denote the clusters to which these vectors are mapped
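As a concrete reading of these two conditions, the sketch below counts how often a given clustering violates them, assuming a precomputed pairwise distance matrix D and per-vector cluster labels.

```python
import itertools

def clustering_violations(D, labels, eps, delta):
    """Count pairs violating the two conditions above."""
    type1 = type2 = 0
    for i, j in itertools.combinations(range(len(labels)), 2):
        if D[i][j] < eps and labels[i] != labels[j]:
            type1 += 1   # close vectors placed in different clusters
        elif D[i][j] > delta and labels[i] == labels[j]:
            type2 += 1   # distant vectors mapped to the same cluster
    return type1, type2
```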
21
Clustering: Possible compression methods
Possible Solutions
• Error correction decoding [Venkatesan et al., 2000]
  – Applicable to binary feature vectors
  – Break the vector down into segments close to the length of codewords in a suitably chosen error correcting code
• More generally, vector quantization/clustering
  – Minimize an "average distance" to achieve compression close to the rate-distortion limit
      min Σ_{k=0}^{K−1} Σ_{l ∈ S_k} P(l) D(l, c_k)
  – l ∈ L ⊆ V (metric space of feature vectors)
  – P(l) – probability of occurrence of vector l
  – D(·,·) – distance metric defined on feature vectors
  – c_k – codewords/cluster centers; S_k – kth cluster
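For contrast with the hashing-specific cost introduced next, this is a sketch of the average-distance objective above, assuming D is a distance function, probs holds P(l), and assignment maps each vector to its cluster/codeword index.

```python
def average_distance_cost(vectors, probs, centers, assignment, D):
    """Sum over clusters S_k and members l of P(l) * D(l, c_k)."""
    return sum(probs[i] * D(vectors[i], centers[assignment[i]])
               for i in range(len(vectors)))
```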
22
Contribution # 3: Clustering Algorithms
Is Average Distance the Appropriate Cost for the
Hashing Application?
• Problems with average-distance VQ
  – No guarantee that "perceptually distinct" feature vectors indeed map to different clusters – no straightforward way to trade off between the two goals
  – Must decide the number of codebook vectors in advance
  – Must penalize some errors harshly, e.g. if vectors really close are not clustered together, or vectors very far apart are compressed to the same final hash value
• Define an alternate cost function for hashing
  – Develop a clustering algorithm that tries to minimize that cost
23
Contribution # 3: Clustering Algorithms
Cost Function for Feature Vector Compression
• Define joint cost matrices C1 and C2 (n × n)

  c1(i, j) = Γ^(α D(l_i, l_j))   if D(l_i, l_j) < ε and C(l_i) ≠ C(l_j)
           = 0                    otherwise

  c2(i, j) = Γ^(α D(l_i, l_j))   if D(l_i, l_j) > δ and C(l_i) = C(l_j)
           = 0                    otherwise

  – α > 0, Γ > 1 are algorithm parameters
  – n – total number of vectors to be clustered; C(l_i), C(l_j) denote the clusters that these vectors are mapped to
• Exponential cost
  – Ensures that a severe penalty is associated if feature vectors far apart, and hence "perceptually distinct", are clustered together
24
Contribution # 3: Clustering Algorithms
Cost Function for Feature Vector Compression
• Further define S1 as
  s1(i, j) = Γ^(α D(l_i, l_j))   if D(l_i, l_j) < ε
           = 0                    otherwise
  (S2 is defined similarly)
• Normalize to get C̃1, C̃2
  c̃1(i, j) = c1(i, j) / Σ_i Σ_j s1(i, j)
  c̃2(i, j) = c2(i, j) / Σ_i Σ_j s2(i, j)
• Then, minimize the "expected" cost
  E[C̃1 + C̃2] = Σ_i Σ_j p(i) p(j) ( c̃1(i, j) + c̃2(i, j) )
  – p(i) = p(l_i), p(j) = p(l_j)
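A sketch of computing these normalized expected costs is given below, assuming D is a precomputed numpy distance matrix, labels gives each vector's cluster, and p gives each vector's probability of occurrence.

```python
import numpy as np

def expected_hashing_costs(D, labels, p, eps, delta, alpha=1.0, gamma=2.0):
    """Return (E[C1~], E[C2~]) for distance matrix D, cluster labels, probabilities p."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    c1 = np.where((D < eps) & ~same, gamma ** (alpha * D), 0.0)
    s1 = np.where(D < eps, gamma ** (alpha * D), 0.0)
    c2 = np.where((D > delta) & same, gamma ** (alpha * D), 0.0)
    s2 = np.where(D > delta, gamma ** (alpha * D), 0.0)
    pp = np.outer(p, p)
    e1 = np.sum(pp * c1) / np.sum(s1) if np.sum(s1) > 0 else 0.0
    e2 = np.sum(pp * c2) / np.sum(s2) if np.sum(s2) > 0 else 0.0
    return e1, e2
```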
25
Contribution # 3: Clustering Algorithms
Clustering: Hardness Claims & a Good Heuristic
• Decision version of the clustering problem
  – For a fixed number of clusters k, is there a clustering with cost less than a constant?
  – The k-way weighted graph cut problem, known to be NP-complete, reduces to our clustering problem in log-space [Monga et al., 2004]
• A good heuristic?
  – Motivated by the stable roommate/spouse problem
  – Give preference to the "bully", i.e. the strongest candidates, in an ordered fashion: intuitively this minimizes the grief
• Our clustering problem
  – The notion of strength is captured by the probability mass of the data point/feature vector
26
Contribution # 3: Clustering Algorithms
Basic Clustering Algorithm
[Monga et al. 2004]
Type I error:  D(l_i, l_j) < ε, C(l_i) ≠ C(l_j)
Type II error: D(l_i, l_j) > δ, C(l_i) = C(l_j)

• Heuristic: select the data point associated with the highest probability mass as the cluster center
• For any (l_i, l_j) in cluster S_k
  D(l_i, l_j) ≤ D(l_i, l_k) + D(l_k, l_j) ≤ ε
• No errors until this stage of the algorithm
[Figure: circles of radius ε/2 and 3ε/2 around the selected cluster center]
27
Contribution # 3: Clustering Algorithms
Handling the unclustered data points
• Approach 1: assign the point to the cluster that incurs the minimum E[C̃1] cost
• Approach 2: all clusters are candidates; assign the point to the one that minimizes a joint cost
[Figure: candidate cluster assignments for an unclustered data point under each approach]
28
Contribution # 3: Clustering Algorithms
Clustering Algorithms: Revisited
• Approach 1
  – Tries to minimize E[C̃1] conditioned on E[C̃2] = 0
• Approach 2
  – Smoothly trades off the minimization of E[C̃1] vs. E[C̃2] via the parameter β:
      β E[C̃1] + (1 − β) E[C̃2]
  – β = ½ → joint minimization
• Final hash length determined automatically!
  – Given by ⌈log₂ k⌉ bits, where k is the total number of clusters
  – The proposed clustering can be used to compress feature vectors in any metric space, i.e. no assumptions on the topology
29
Contribution # 3: Clustering Algorithms
Clustering: Results
• Compress a binary feature vector of L = 240 bits
  – ε = 0.2, δ = 0.3 (normalized Hamming distance)

Clustering Algorithm                               E[C̃1]        E[C̃2]         Final Hash Length
Approach 1                                         7.64 × 10⁻⁸   0             54 bits
Approach 2, β = ½                                  7.43 × 10⁻⁹   7.46 × 10⁻¹⁰  54 bits
Approach 2, β = 1                                  7.17 × 10⁻⁹   4.87 × 10⁻⁹   54 bits
Decoding via Reed-Muller error correction codes    5.96 × 10⁻⁴   3.65 × 10⁻⁵   75 bits
Average distance VQ                                3.25 × 10⁻⁴   7.77 × 10⁻⁵   60 bits

At approximately the same rate, the cost is orders of magnitude lower for the proposed clustering.
30
Contribution # 3: Clustering Algorithms
Validating the Perceptual Significance
• Applied the two-stage hash algorithm to a natural image database of 100 images
  – For each image, 20 perceptually identical images were generated using the Stirmark benchmark software
  – Attacks included JPEG compression with varying quality factors, AWGN addition, geometric attacks viz. small rotation and cropping, linear/non-linear filtering, etc.
• Results
  – Robustness: final hash values for the original and "distorted" images were the same in over 95% of cases
  – Fragility: 1 collision in all pairings (4950) of 100 images; in comparison, 40 collisions for traditional VQ and 25 for error correction decoding
31
Outline
• Perceptual image hashing
  – Motivation & applications
• Contribution # 1: A unified framework
  – Formal definition of desired hash properties/goals
  – Novel two-stage hashing algorithm
• Review of existing feature extraction techniques
• Contribution # 2: Robust feature extraction
• Contribution # 3: Clustering algorithms for feature vector compression
  – Randomized clustering for secure hashing
• Summary
32
Contribution # 3: Clustering Algorithms
Randomized Clustering
• Heuristic for the deterministic map
  – Select the highest-probability data point as the cluster center
• Randomization scheme
  – Select cluster centers probabilistically via a randomization parameter s ∈ R⁺: point i is chosen as a cluster center with probability
      (p_i)^s / Σ_i (p_i)^s,   where i runs over the unclustered data points
• On the role of s…
  – s = 0: the choice is uniform, i.e. any point is selected as the cluster center with the same probability
  – s → ∞: deterministic clustering, i.e. probability 1 for the highest-probability point and 0 otherwise
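A sketch of this randomized selection rule follows: the next cluster center is drawn from the unclustered points with probability proportional to p_i^s.

```python
import numpy as np

def pick_center(p, unclustered, s, rng=None):
    """p: probability mass per point; unclustered: iterable of candidate indices."""
    rng = rng or np.random.default_rng()
    idx = np.asarray(list(unclustered))
    probs = np.asarray(p, dtype=float)[idx]
    if np.isinf(s):                                   # s -> infinity: deterministic choice
        return int(idx[np.argmax(probs)])
    weights = probs ** s                              # s = 0 gives a uniform choice
    return int(rng.choice(idx, p=weights / weights.sum()))
```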
33
Contribution # 3: Clustering Algorithms
Security Via Randomization
• Conjecture
  – Randomization makes the generation of malicious inputs harder
• Adversary model
  – U: set of all possible feature vector pairs in L
  – E ⊂ U: the error set for deterministic clustering
  – Adversary has complete knowledge of the feature extraction and deterministic clustering → will contrive to generate input pairs over E
• Clustering cost E[C̃1] + E[C̃2] is computed over the error set E
• As randomization increases, the adversary achieves little success
34
Contribution # 3: Clustering Algorithms
The rest of the story…
[Plots: clustering cost over the set Ē (complement of the error set) and over the set U = E ∪ Ē]
An appropriate choice of s preserves perceptual robustness while significantly enhancing security: the result of a joint crypto-signal processing approach
35
Contribution # 3: Clustering Algorithms
Uniformity of the hash distribution
Kullback-Leibler (KL) distance of the hash distribution measured
against the uniform distribution
Hash distribution is close to uniform for s < 1000
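The measurement described here can be sketched as follows, assuming the observed final hash values are available as integers and q is the hash length in bits.

```python
import numpy as np
from collections import Counter

def kl_from_uniform(hash_values, q):
    """KL distance between the empirical hash distribution and the uniform distribution over 2^q values."""
    counts = Counter(hash_values)
    p = np.array([c / len(hash_values) for c in counts.values()])
    # Hash values never observed have p = 0 and contribute nothing to the sum
    return float(np.sum(p * np.log2(p * (2 ** q))))
```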
36
Summary of contributions
• Two-stage hashing framework
  – Media-dependent feature extraction followed by (almost) media-independent clustering
• Robust feature extraction from natural images
  – Iterative feature extractor that preserves significant image geometry; features invariant under several attacks
• Algorithms for feature vector compression
  – Novel cost function for the hashing application
  – Greedy heuristic based clustering algorithms
  – Randomized clustering for secure hashing
• Image authentication under geometric attacks (not presented)
37
Questions and Comments!
38