
Automatic detection of object-based Region-of-Interest for image compression
Sunhyoung Han
SVCL
Basic Motivation
[Figure: constraints (limited resources & channel errors); example scenarios: transmission in an erroneous channel, spatially different treatment, super-resolution]
SVCL
Basic motivation
By having information about the importance of image regions,
one can use the limited resources wisely
SVCL
User-adaptive Coder
• visual concepts of interest can be anything
• main idea:
  • let users define a universe of objects of interest
  • train a saliency detector for each object
  • e.g. regions of “people”, “the Capitol”, “trees”, etc.
SVCL
User Adaptive Coder
[Diagram: current training sets; query provided by user; train detector]
SVCL
User-adaptive coder
• user-adaptive coder:
  – detector should be generic enough to handle large numbers of object categories (e.g. “face”, “lamp”, “car”)
  – training needs to be reasonably fast (including example preparation time)
SVCL
User-adaptive coder
• proposed detector
  – top-down object detector (object category specified by user)
  – focus on weak supervision instead of highly accurate localization
  – composed of saliency detection and saliency validation
  – discriminant saliency: training finds the best features (the saliency filters)
SVCL
Discriminant Saliency
• start from a universe of classes (e.g. “faces”, “trees”, “cars”, etc.)
• design a dictionary of features: e.g. linear combinations of DCT coefficients at multiple scales
• salient features: those that best distinguish the object class of interest from random background scenes
• salient regions are the regions of the image where these detectors have strong response
• see [Gao & Vasconcelos, NIPS, 2004]
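To make the feature-selection step concrete, here is a minimal sketch of picking discriminant features by mutual information with the class label, i.e. k* = argmax_k I(X_k; Z). It assumes the multiscale DCT-based feature responses are already computed; the function name and the use of scikit-learn's mutual_info_classif are illustrative assumptions, not the original implementation.

```python
# Hedged sketch of discriminant feature selection: keep the features whose responses
# carry the most mutual information about the class label Z (object vs. background).
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def select_salient_features(responses, labels, n_keep=10):
    """responses: (n_patches, n_features) precomputed feature responses (e.g. DCT-based).
    labels: (n_patches,) with 1 for object-class patches, 0 for background patches.
    Returns the indices of the n_keep most discriminant (salient) features."""
    mi = mutual_info_classif(responses, labels)   # estimate of I(X_k; Z) per feature k
    return np.argsort(mi)[::-1][:n_keep]
```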
SVCL
Top-down Discriminant Saliency Model
[Block diagram: original feature set → discriminant feature selection, k* = argmax_k I(X_k; Z), using “Faces” vs. “Background” examples → salient features → scale selection (W_j) → saliency map via WTA (Malik-Perona pre-attentive perception model)]
SVCL
Saliency representation
• saliency detector: image → saliency map → salient points
• salient point sal_i:
  – magnitude p_i
  – location l_i
  – scale s_i
• probability map: saliency map approximated by a Gaussian mixture
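As a rough illustration of the probability-map idea, the sketch below builds a Gaussian-mixture approximation of the saliency map directly from the salient points (p_i, l_i, s_i). The function name and the isotropic-Gaussian choice are assumptions made for illustration, not the original code.

```python
# Sketch: approximate the saliency map by a Gaussian mixture with one component
# per salient point, using weight p_i, center l_i, and isotropic scale s_i.
import numpy as np

def saliency_mixture(points, height, width):
    """points: iterable of (p, (row, col), s) salient points. Returns a normalized map."""
    rows, cols = np.mgrid[0:height, 0:width]
    prob = np.zeros((height, width))
    for p, (cy, cx), s in points:
        prob += p * np.exp(-((rows - cy) ** 2 + (cols - cx) ** 2) / (2.0 * s ** 2))
    return prob / prob.sum()
```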
SVCL
Saliency validation
• saliency detection:
  – due to limited feature dictionary and/or limited training set
  – coarse detection of the object class of interest
• need to eliminate false positives
[Figure: original image; example saliency map for “street sign”]
• saliency validation:
  – geometric consistency
  – reject salient points whose spatial configuration is inconsistent with training examples
SVCL
Saliency validation
• learning a geometric model of salient point configuration
• two components:
  – image alignment
  – configuration model
• model:
  – classify points into true positives and false positives
  – model each as a Gaussian
SVCL
Saliency validation
• model: two classes of points, Y ∈ {0,1}
  – Y=1: true positive
  – Y=0: false positive
• saliency map: mixture of true- and false-positive saliency distributions
• each distribution approximated by a Gaussian (see below)
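In symbols (a hedged reading of the two-class model just described, with a shared aligned center μ and class covariances Σ₁, Σ₀):

$$ p(x) \;=\; \alpha_1\,\mathcal{G}(x;\,\mu,\,\Sigma_1) \;+\; (1-\alpha_1)\,\mathcal{G}(x;\,\mu,\,\Sigma_0) $$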
SVCL
Saliency validation
• this is a two-class clustering problem
  – can be solved by expectation-maximization (EM)
• graphical model (variables Y, C, L, S, D, X):
  – L ~ uniform
  – Y ~ Bernoulli(α₁)
  – C | Y=i ~ multinomial(π_i)
  – X | Y=i, L=l, S=s, D=μ ~ G(x; l − μ, sΣ_i)
• non-standard issues
  – we start from distributions, not points
  – alignment does not depend on false positives
SVCL
Saliency Validation
DERIVATION DETAILS
• for K training examples (the kth example has N_k salient points)
• missing data: Y = j, j ∈ {0,1}
• parameters:
  – α_j (probability of class j)
  – Σ_j (covariance for class j)
  – μ_k (displacement for the kth example)
• E-step and M-step updates; a robust update is used
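The sketch below is a simplified, illustrative EM for this two-class validation step: it treats the per-example alignment (the μ_k displacements) as already done, uses the saliency magnitudes as soft sample weights, and keeps a shared center with per-class covariances Σ₀, Σ₁. It is not the paper's exact derivation.

```python
# Simplified two-class EM over aligned salient point locations:
# classes Y=1 ("object") and Y=0 ("noise") share a center but differ in covariance.
import numpy as np

def em_validate(points, weights, n_iter=50):
    """points: (N, 2) aligned salient point locations; weights: (N,) saliency magnitudes.
    Returns responsibilities h[:, j] = P(Y=j | x_i), j in {0 (noise), 1 (object)}."""
    N = len(points)
    alpha = np.array([0.5, 0.5])                       # class priors a_j
    center = np.average(points, axis=0, weights=weights)
    cov = [np.cov(points.T) * 4.0, np.cov(points.T)]   # broad "noise", tighter "object"
    h = np.full((N, 2), 0.5)
    diff = points - center
    for _ in range(n_iter):
        # E-step: responsibilities under the two Gaussians
        for j in range(2):
            inv = np.linalg.inv(cov[j])
            quad = np.einsum('ni,ij,nj->n', diff, inv, diff)
            h[:, j] = alpha[j] * np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(cov[j]))
        h /= h.sum(axis=1, keepdims=True)
        # M-step: weighted updates of the priors and the class covariances
        w = h * weights[:, None]
        alpha = w.sum(axis=0) / w.sum()
        for j in range(2):
            cov[j] = np.einsum('n,ni,nj->ij', w[:, j], diff, diff) / w[:, j].sum()
    return h
```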
SVCL
Saliency Validation
• visualization of EM algorithm
[Figure: saliency detection result; initial salient points overlapped over 40 samples; overlapped points classified as “object”; overlapped points classified as “noise”; visualized variances Σ₁ and Σ₀]
SVCL
Saliency Validation
• examples of classified points
[Figure: classified salient points, shown white if h_ij1 > h_ij0 and black otherwise]
• in summary, during training we learn
  – discriminant features
  – the “right” configuration of salient points
SVCL
Region of interest detection
• find the image window that best matches the learned configuration
• mathematically:
  – find the location p where the posterior probability of the object class is largest
SVCL
Region of interest detection
• by Bayes’ rule
  – posterior ∝ likelihood × prior
  – the likelihood is given by matching the saliencies within the window against the model
  – the prior measures the saliency mass inside the window
[Figure: likelihood and prior terms illustrated on the image]
SVCL
Region of Interest Detection
• given the model
  – the likelihood, under it, of a set of points drawn from the observed saliency distribution measures how well the configuration matches the model
  – the optimal location is given by maximizing this configuration-matching likelihood times a prior for location p, computed with the saliency detector
DERIVATION DETAILS
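A minimal sketch of this search, assuming the configuration-matching likelihood is available as a callable (log_config_likelihood) and the prior is the saliency mass inside the candidate window; the exhaustive scan and window parameterization are illustrative choices, not the paper's implementation.

```python
# Illustrative sketch: score each candidate ROI center by
# log likelihood (configuration match) + log prior (saliency mass in the window).
import numpy as np

def find_roi_center(saliency_map, log_config_likelihood, window):
    """saliency_map: (H, W) map from the saliency detector; window: (h, w) ROI size.
    log_config_likelihood(p): log-likelihood of the salient points near p under the model."""
    H, W = saliency_map.shape
    h, w = window
    best, best_score = None, -np.inf
    for cy in range(h // 2, H - h // 2):
        for cx in range(w // 2, W - w // 2):
            mass = saliency_map[cy - h // 2:cy + h // 2, cx - w // 2:cx + w // 2].sum()
            score = log_config_likelihood((cy, cx)) + np.log(mass + 1e-12)
            if score > best_score:
                best, best_score = (cy, cx), score
    return best
```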
SVCL
Region of Interest Detection
1. Find the ROI center p* (as above)
2. Determine the scale (shape) of the ROI mask: the observation (Σ*) from the data and the prior (Σ₁) from the training data are used
   ** once the center point is known, the assignment of each point is given by its class posterior, and the observed configuration for Y=1 gives Σ*
3. Threshold P_{Y|X,P}(1 | x, p*) to get the binary ROI mask
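To illustrate step 3, a sketch of thresholding the per-pixel posterior P(Y=1 | x, p*) computed from the object and noise Gaussians centered at the detected ROI center p*; the 0.5 threshold and the function signature are assumptions for illustration.

```python
# Sketch: binary ROI mask from the posterior of the "object" class at each pixel,
# using the two learned covariances (object Sigma_1, noise Sigma_0) placed at p*.
import numpy as np

def roi_mask(shape, p_star, cov_obj, cov_noise, alpha1=0.5, thresh=0.5):
    H, W = shape
    rows, cols = np.mgrid[0:H, 0:W]
    diff = np.stack([rows - p_star[0], cols - p_star[1]], axis=-1).reshape(-1, 2)

    def gauss(cov):
        inv = np.linalg.inv(cov)
        quad = np.einsum('ni,ij,nj->n', diff, inv, diff)
        return np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))

    p1 = alpha1 * gauss(cov_obj)          # class Y=1 (object)
    p0 = (1 - alpha1) * gauss(cov_noise)  # class Y=0 (noise)
    posterior = (p1 / (p1 + p0)).reshape(H, W)
    return posterior > thresh
```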
SVCL
Region of Interest Detection
• example of ROI detection
[Figure panels: saliency detection (for the Statue of Liberty); probability map (saliency only); probability map (with configuration information); ROI mask]
SVCL
Evaluation
• Using the Caltech “Face” database & the UIUC “Car side” database
• Evaluate robustness of learning
  – dedicated training set vs. web training set (550 and 100 positive examples, respectively)
• Evaluation metrics
  – area under the ROC curve
  – PSNR gain for ROI coding vs. normal coding (see the sketch below)
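A small sketch of the PSNR metric as it is presumably measured here, inside the ROI only (an assumption based on the later “same PSNR (30 dB) for ROI” slide); the function name is illustrative.

```python
# Sketch: PSNR restricted to the ROI, so ROI coding and normal coding can be
# compared at equal region-of-interest quality.
import numpy as np

def roi_psnr(original, coded, mask, peak=255.0):
    """original, coded: images of the same shape; mask: boolean ROI mask."""
    err = original.astype(float)[mask] - coded.astype(float)[mask]
    mse = np.mean(err ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```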
SVCL
Evaluation
• area under the ROC curve
[Figure: ROC curves (true positive vs. false positive rate) for “Face” and “Car”]
SVCL
Evaluation
• PSNR performance comparison
[Figure: PSNR vs. bits per pixel for “Car” and “Face”]
14.3% of the bits can be saved, even in the web-trained uniform case, at the same image quality
SVCL
Result Examples
SVCL
Result
Comparison of the bits needed to reach the same PSNR (30 dB) in the ROI
 At best, ¼ of the bits are enough to obtain the same quality in the ROI area
SVCL
Result Examples
[Figure: normal coding vs. ROI coding examples]
SVCL
EM derivation
• Want to fit the lower-level observation
• For a virtual sample X = {X_ik | i = 1, …, N_k and k = 1, …, K} with sample sizes M_ik = p_ik · N, the likelihood becomes: [equation]
• For the complete data set, the log-likelihood becomes: [equation]
SVCL
EM derivation
• Maximization in the M-step is carried out by maximizing the Lagrangian: [equation]
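As a generic reminder of the standard construction (not necessarily the paper's exact expression), the Lagrangian combines the expected complete-data log-likelihood, with virtual sample sizes M_ik and responsibilities h_ikj, and the constraint that the class priors sum to one:

$$ \mathcal{L} \;=\; \sum_{k=1}^{K}\sum_{i=1}^{N_k}\sum_{j\in\{0,1\}} M_{ik}\, h_{ikj}\left[\log\alpha_j + \log\mathcal{G}\!\left(x_{ik}-\mu_k;\,0,\,\Sigma_j\right)\right] \;+\; \lambda\Big(\sum_j \alpha_j - 1\Big) $$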
SVCL
ROI Detection
For one sample point x₁: [equation]
For samples drawn from a distribution: [equation]
SVCL
ROI Detection
Therefore: [equation]
SVCL