A Markov Process Based Approach
to Effective Attacking JPEG
Steganography
By
Y. Q. Shi, Chunhua Chen, Wen Chen
NJIT
Presented by
Ashish Ratnakar and Seshu Kishore Dola
1
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion and Conclusion
2
What is Steganography?
Art and science of invisible communication that conceals
the very existence of hidden messages
Images can carry large messages
Because images are non-stationary, image steganography is
hard to attack
JPEG is a popular format for steganography because JPEG
images can be compressed at ratios of up to 10:1
without significant loss
3
Steganography Approaches
Some Steganography approaches are:
Least significant bit insertion
Masking and filtering
Algorithms and transformations
4
Modern Stego Techniques
OutGuess
Creates the stego image by embedding hidden data
using the redundancy of the cover image.
OutGuess preserves the statistics of the BDCT coefficient
histogram
OutGuess takes two measures:
- Before embedding, identifies redundant BDCT coefficients,
which have the least effect on the cover image.
- After embedding, adjusts the untouched coefficients to
preserve the histogram.
5
Modern Stego Techniques (cont’d.)
F5
Works on JPEG format only.
Two main security actions against steganalysis attacks:
- Straddling: scatters message uniformly over the cover
image
- Matrix Embedding: improves embedding efficiency (number
of bits embedded per change of a BDCT coefficient)
6
Modern Stego Techniques (cont’d.)
MB (Model-based Steganography)
Correlates embedded data with the cover image
Splits the cover image into two parts
- Models the distribution of the second part given the
first part
- Encodes the second part using the model and the hidden message
- Combines the two parts to form the stego image
MB1 operates on JPEG images and uses a Cauchy
distribution to model the JPEG histogram
7
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion and Conclusion
8
Steganalysis
Art of detecting hidden messages in stego images
Two categories
Specific Steganalysis: Targets a particular
steganography technique.
Universal Steganalysis: Aims to detect any steganographic
technique.
9
Previous Work on Steganalysis
Universal Steganalyzer - proposed by Farid
Based on the image's higher-order statistics
As a universal method, achieves a better detection rate
than random guessing
Universal Steganalysis - proposed by Shi et al.
Based on statistical moments of the characteristic functions
of the image, its prediction-error image, and their DWT
subbands
Performs better than the universal steganalyzer proposed
by Farid
10
Previous Work on Steganalysis
Fridrich proposed a set of distinguishing features from the
BDCT and spatial domains for detecting messages
embedded in JPEG images.
- Performs better than the previous two steganalysis
techniques in attacking JPEG steganography
Specific steganalysis of spread spectrum data hiding - by Sullivan
et al.
- Inter-pixel dependencies are used and a Markov chain model
is adopted.
- Some information loss is inevitable because features are selected randomly
- Markov chain is used only in the horizontal direction
11
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion and Conclusion
12
Markov Processes – Wikipedia
Named after the mathematician Markov; describes the random evolution
of a memoryless system
Definition: A stochastic process whose state at time t is
X(t), for t>0, and whose history of states is given by x(s)
for times s<t, is a Markov process if
the probability of its having state y at time t+h, conditioned on
its having the particular state x(t) at time t, is equal to the
conditional probability of its having that same state y
conditioned on its values at all times up to and including t.
That is, given the present state, the future state is independent of the past states.
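In symbols, the definition above can be written as follows. This is the standard formulation of the Markov property; the way the conditioning set is written is an assumption chosen to match the slide's X(t), x(s), y and h.

```latex
\Pr\bigl[X(t+h) = y \mid X(s) = x(s),\ \forall s \le t\bigr]
  = \Pr\bigl[X(t+h) = y \mid X(t) = x(t)\bigr], \qquad \forall h > 0 .
```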
13
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion
Conclusion
14
Feature Construction for Steganalysis
Goal: classify an image as stego or non-stego
In this steganalysis scheme, second-order statistics are
used to detect JPEG steganographic methods.
Steps:
- Define the JPEG 2-D array
- Introduce difference JPEG 2-D arrays in different
directions
- Model the difference arrays using a Markov random
process (transition probability matrix)
- Apply a thresholding technique to reduce computational cost.
15
Defining JPEG 2-D array
Features are generated
from the 8 x 8 BDCT domain
to attack steganography
The JPEG 2-D array is a 2-D array of the same size as
the given image, with each 8 x
8 block filled with the
corresponding JPEG
quantized 8 x 8 BDCT
coefficients.
The absolute value of each coefficient is taken,
giving the resulting array as shown
16
Defining JPEG 2-D array (Cont’d.)
Reasons for choosing the absolute value of coefficients
- BDCT coefficients do not obey a Gaussian distribution
- Power of an 8 x 8 block of DCT coefficients is highly concentrated in the
DC and low-frequency AC components, and JPEG quantization enhances the
disparity among quantized BDCT coefficients.
- Coefficient magnitudes tend to be non-increasing along the zig-zag order, i.e.
neighboring coefficients are correlated.
- Hence the difference of the absolute values of two immediate neighbors is
highly concentrated around 0, with a Laplacian-like distribution.
- Taking the absolute value also results in higher detection rates and lower
computational complexity
17
Difference JPEG 2-D array
Disturbance caused by
Steganographic methods in
JPEG images can be enlarged
by observing difference
between an element and one
of its neighbors.
4 difference JPEG 2-D arrays
are generated (horizontal, vertical, main diagonal, minor diagonal):
Fh(u, v) = F(u, v) - F(u+1, v)
Fv(u, v) = F(u, v) - F(u, v+1)
Fd(u, v) = F(u, v) - F(u+1, v+1)
Fmd(u, v) = F(u+1, v) - F(u, v+1)
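The four formulas above can be sketched in a few lines of plain Python. This is only an illustration: the function name `difference_arrays`, the indexing convention F[u][v] (first index u, second index v, taken literally from the formulas) and the toy values are assumptions, not part of the original slides.

```python
def difference_arrays(F):
    """Return (Fh, Fv, Fd, Fmd) for a 2-D array F indexed as F[u][v]."""
    Su, Sv = len(F), len(F[0])
    # Fh(u, v) = F(u, v) - F(u+1, v)
    Fh = [[F[u][v] - F[u + 1][v] for v in range(Sv)] for u in range(Su - 1)]
    # Fv(u, v) = F(u, v) - F(u, v+1)
    Fv = [[F[u][v] - F[u][v + 1] for v in range(Sv - 1)] for u in range(Su)]
    # Fd(u, v) = F(u, v) - F(u+1, v+1)
    Fd = [[F[u][v] - F[u + 1][v + 1] for v in range(Sv - 1)]
          for u in range(Su - 1)]
    # Fmd(u, v) = F(u+1, v) - F(u, v+1)
    Fmd = [[F[u + 1][v] - F[u][v + 1] for v in range(Sv - 1)]
           for u in range(Su - 1)]
    return Fh, Fv, Fd, Fmd


# Toy 3 x 3 array of absolute coefficient values (made-up numbers).
F = [[5, 2, 0],
     [3, 1, 0],
     [1, 0, 0]]
Fh, Fv, Fd, Fmd = difference_arrays(F)
```

Each difference array is one element shorter than F along the direction in which the neighbor is taken, so Fh here is 2 x 3 while Fd and Fmd are 2 x 2.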
18
Difference JPEG 2-D array
Distribution of difference
array elements is
Laplacian-like, with most
values close to 0
Most of the elements in the
difference arrays fall in [-T,
T] as long as T is large
enough.
19
Transition Probability Matrix
Second-order statistics are used so that computational
complexity does not increase dramatically
Each difference array is modeled as a Markov random process,
characterized by a one-step transition probability matrix.
To reduce complexity further, a thresholding
technique is used, which reduces the dimensionality of each
matrix to (2T+1) x (2T+1)
With a properly chosen T, good steganalysis
capability is achieved with manageable computational
complexity.
20
Transition Probability Matrix (Cont’d.)
From the equations on this slide, we
have 4 x (2T+1) x (2T+1)
elements (one matrix per direction)
Choosing a proper value of T
gives steganalysis capability
with manageable
computational complexity
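A minimal sketch of estimating one thresholded transition probability matrix from a difference array is shown here. Several details are assumptions made for illustration: the function name `transition_matrix`, the rule that values outside [-T, T] are clamped to -T or T, and the D[u][v] indexing. The sketch only handles neighbor steps with du, dv >= 0, so the minor diagonal direction would need a different pairing of elements.

```python
def transition_matrix(D, T, du=1, dv=0):
    """Estimate P(D[u+du][v+dv] = n | D[u][v] = m) for m, n in [-T, T].

    D is a difference array indexed as D[u][v]; du and dv (both >= 0)
    select the neighbor that follows each element.
    """
    def clip(x):
        # Assumed thresholding rule: clamp values outside [-T, T].
        return max(-T, min(T, x))

    size = 2 * T + 1
    counts = [[0] * size for _ in range(size)]
    for u in range(len(D) - du):
        for v in range(len(D[0]) - dv):
            m = clip(D[u][v]) + T            # row index: current value
            n = clip(D[u + du][v + dv]) + T  # column index: next value
            counts[m][n] += 1

    # Normalize each row into conditional probabilities; empty rows stay 0.
    P = []
    for row in counts:
        total = sum(row)
        P.append([c / total if total else 0.0 for c in row])
    return P


# Toy single-column difference array (made-up values), T = 1.
D = [[0], [1], [0], [0]]
P = transition_matrix(D, T=1)
features = [p for row in P for p in row]  # (2T+1)^2 = 9 values
```

Flattening one such matrix per direction and concatenating the results gives the 4 x (2T+1) x (2T+1) feature elements mentioned above.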
21
Feature Formation Procedure
22
Support Vector Machine
Classifier for pattern recognition.
Easier to use than neural networks for image analysis, with
comparable performance.
SVM is based on the idea of a hyperplane classifier.
The optimal separating hyperplane is calculated using
Lagrange multipliers.
SVM can be used for both linearly and nonlinearly
separable cases.
In the linear case, SVM looks for a hyperplane (H) and two
planes (H1 & H2) parallel to H. It maximizes the distance
between these two planes, with no data points in between.
In the nonlinear case, SVM uses kernel functions (e.g. a polynomial kernel)
to map the data into a space where a linear hyperplane can be located.
http://svm.dcs.rhbnc.ac.uk/pagesnew/GPat.shtml
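To illustrate how a polynomial kernel lets a linear hyperplane be located in a richer space, the sketch below checks that a degree-2 kernel equals an ordinary inner product after an explicit feature map. The names `poly_kernel` and `phi`, the choice of degree 2 with zero offset, and the sample points are assumptions for illustration; the slides do not state which kernel parameters were used.

```python
import math


def poly_kernel(x, y, degree=2, c=0.0):
    """Polynomial kernel K(x, y) = (x . y + c) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree


def phi(x):
    """Explicit feature map matching poly_kernel(degree=2, c=0) in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
k = poly_kernel(x, y)  # (1*3 + 2*4) ** 2
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))
```

Both `k` and `explicit` evaluate the same quantity, so a classifier that is linear in phi(x) can be trained using only kernel evaluations on the original inputs.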
23
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion and Conclusion
24
Experiments and Results
7560 JPEG images were used,
with QF ranging
from 70 to 90
Each one is cropped to
768 x 512 or 512 x 768
pixels
Chrominance is set to zero
and luminance is left
untouched before
embedding.
25
Experiments and Results (Cont’d.)
Stego Images Generation
Embedding rate is the ratio of message length (in bits) to the number of
non-zero elements in the JPEG 2-D array, measured in bpc
(bits per non-zero coefficient)
Considered embedding rates are
- For OutGuess: 0.05, 0.1, 0.2 bpc; 7498, 7452 and 7215 stego images
are generated, respectively.
- For F5 and MB1: 0.05, 0.1, 0.2, 0.4 bpc; 7560 stego
images are generated. Step size is equal to two for MB1
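The embedding rate defined above is simple arithmetic, sketched here in plain Python. The function name `embedding_rate_bpc` and the toy array are hypothetical, and the message length is assumed to be given in bits.

```python
def embedding_rate_bpc(message_bits, F):
    """Bits embedded per non-zero element of the JPEG 2-D array F."""
    nonzero = sum(1 for row in F for x in row if x != 0)
    if nonzero == 0:
        raise ValueError("array has no non-zero elements")
    return message_bits / nonzero


# Toy array with 5 non-zero elements (made-up values), 1-bit message.
F = [[5, 2, 0],
     [3, 1, 0],
     [1, 0, 0]]
rate = embedding_rate_bpc(1, F)
```

Read the other way round, a target rate fixes the message length: at 0.2 bpc an array with 5 non-zero elements carries a 1-bit message.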
26
Results obtained using SVM with
polynomial kernel
Half of the non-stego and stego
image pairs are selected to train
the SVM classifier, and the others are
used to test the trained classifier
4 steganalysis schemes are
compared, as shown, in detecting
OutGuess, F5 and MB1
Result: The proposed
steganalyzer outperforms the
prior art by a significant margin
F5 has a lower detection rate
than MB1 at the same
embedding rate
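The half-and-half split of image pairs can be sketched as follows. This assumes the selection is random and uses a fixed seed so the split is repeatable; the slides only say that half of the pairs are used for training. The name `split_half` and the use of integer indices to stand in for image pairs are hypothetical.

```python
import random


def split_half(pairs, seed=0):
    """Shuffle the pairs and return (training half, testing half)."""
    rng = random.Random(seed)  # fixed seed: an assumption, for repeatability
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Indices standing in for the 7560 (non-stego, stego) image pairs.
train_ids, test_ids = split_half(range(7560))
```

Splitting whole pairs rather than single images keeps a cover image and its stego version on the same side of the split.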
27
Results with features from one direction at
a time
Contributions from the
horizontal and vertical directions
are larger than those from the main
and minor diagonal directions.
The contribution from the main
diagonal is larger than that from
the minor diagonal direction.
28
Steganalysis: A Markov Process Based
Approach
Steganography, Different Approaches & Techniques
Steganalysis & Previous Work on Steganalysis
Markov Process
Feature Construction
Experiments and Results
Discussion and Conclusion
29
Discussion
Taking absolute values in the JPEG 2-D array is an
advantage
- Not taking absolute values degrades performance
- Without absolute values, the dynamic range of the JPEG 2-D array increases
- The following table shows a performance comparison with
and without absolute values for MB1
30
Discussion (Cont’d.)
Detection Rates of F5
Detection rates for MB1 are higher than for F5 at the same embedding
rates
Reasons:
- F5 reduces the magnitude of non-zero DCT AC coefficients by 1 in
order to embed a bit, and so has a larger probability of keeping difference
JPEG 2-D array elements unchanged after data embedding
- The following statistics show that at low rates F5 changes fewer DCT
coefficients than MB1, while the reverse holds at higher rates.
31
Conclusion
Taking absolute values in the JPEG 2-D array reduces
computational complexity and raises steganalysis capability
Difference JPEG 2-D arrays along the horizontal, vertical,
main diagonal and minor diagonal directions enlarge the
changes caused by steganographic methods
Thresholding greatly reduces the dimensionality of
feature vectors to a manageable extent
By modeling the difference JPEG 2-D arrays as a Markov process
and using all elements of the transition probability matrices
as features, second-order statistics are exploited
32
References
C. J. C. Burges, “A tutorial on support vector machines for pattern
recognition,” Data Mining and Knowledge Discovery, 2(2):121-167, 1998
H. Farid, “Detecting hidden messages using higher-order statistical models,”
International Conference on Image Processing, Rochester, NY, USA, 2002
Y. Q. Shi, G. Xuan, D. Zou, J. Gao, C. Yang, Z. Zhang, P. Chai, W. Chen,
and C. Chen, “Steganalysis based on moments of characteristic functions
using wavelet decomposition, prediction-error image, and neural network,”
International Conference on Multimedia and Expo, Amsterdam,
Netherlands, 2005
www.wikipedia.org
33
Thank You!!!
Q&A
34