Hierarchical Semantic Indexing for Large Scale Image Retrieval

HIERARCHICAL SEMANTIC INDEXING
FOR LARGE SCALE IMAGE RETRIEVAL
Jia Deng (Princeton University)
Alexander C. Berg (Stony Brook University)
Li Fei-Fei (Stanford University)
Outline

Introduction
Exploiting Hierarchy for Retrieval
Efficient Indexing
Experiments
Conclusion
Introduction

Prior knowledge:
 Known semantic labels and a hierarchy relating them

Contributions:
 Exploiting the semantic hierarchy for retrieval
 A novel hashing scheme
Workflow
Low-level Features to Semantic Vector

Using the method of "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories" [24] (SPM)

[Figure: result from SPM]
Low-level Features to Semantic Vector

Classification is done with SVMs trained using the one-versus-all rule: a classifier is learned to separate each class from the rest, and the test image is assigned the label of the classifier with the highest response.

Train linear SVMs on the semantic attributes with LIBLINEAR
2-fold cross-validation to determine the C parameter
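
A minimal sketch of this setup, assuming scikit-learn's LinearSVC (a wrapper around LIBLINEAR) as a stand-in, with random placeholder data for the features and labels:

```python
# One-versus-all linear SVM with 2-fold cross-validation over C.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

X_train = np.random.rand(100, 1000)     # stand-in for the SPM feature vectors
y_train = np.random.randint(0, 5, 100)  # stand-in class labels

# Grid of C values is ours; the slide only says 2-fold CV determines C.
search = GridSearchCV(LinearSVC(), {"C": [0.01, 0.1, 1.0, 10.0]}, cv=2)
search.fit(X_train, y_train)            # LinearSVC trains one-vs-rest by default
label = search.predict(X_train[:1])     # label of the classifier with highest response
```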
Hard Assign / Probability Calibration

Hard assign:
 δi(a) ∈ {0,1}: image a has attribute i or not
 Example: if image a is a person, then δperson(a) = 1
 xi = δi(a); this binary entry is called the "hard assign"

Probability calibration:
 Fit a sigmoid to the SVM classifier outputs to get probabilities
 xi = Pr(δi(a) = 1 | a), the probability that image a has attribute i
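
A sketch of the soft assignment, assuming scikit-learn's CalibratedClassifierCV for the Platt-style sigmoid fit; the data is a random placeholder:

```python
# Fit a sigmoid on top of the one-vs-all SVMs so x_i = Pr(delta_i(a) = 1 | a).
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

X = np.random.rand(200, 50)             # stand-in features
y = np.random.randint(0, 3, 200)        # stand-in attribute labels

clf = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=2)
clf.fit(X, y)
x_vec = clf.predict_proba(X[:1])[0]     # semantic vector: one probability per attribute
```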
Hierarchy

Let Sij = ξ(π(i,j)), the similarity between attributes i and j
 π(i,j) is the lowest common ancestor of attributes i and j
 ξ maps a node of the hierarchy to a real-valued similarity score
 (we can assign scores based on the height of the node)

Example: Equine is the parent of Horse and Donkey
In the experiments, define ξ = 1 − h(π(i,j))/h*,
where h* is the total height of the hierarchy
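
A minimal sketch of Sij = ξ(π(i,j)) with ξ = 1 − h(π(i,j))/h* on the slide's toy Equine/Horse/Donkey tree (the dictionary encoding is ours):

```python
# Toy two-level hierarchy: Equine above Horse and Donkey; leaves at height 0.
parent = {"Horse": "Equine", "Donkey": "Equine", "Equine": None}
height = {"Horse": 0, "Donkey": 0, "Equine": 1}
h_star = 1                                # total height of the tree

def ancestors(node):
    chain = []
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain

def lca(i, j):
    """Lowest common ancestor pi(i, j)."""
    anc_i = set(ancestors(i))
    for node in ancestors(j):
        if node in anc_i:
            return node

def similarity(i, j):
    return 1.0 - height[lca(i, j)] / h_star

print(similarity("Horse", "Horse"))   # LCA is Horse itself  -> 1.0
print(similarity("Horse", "Donkey"))  # LCA is Equine (root) -> 0.0
```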
Flat

 When attributes are mutually exclusive
 No hierarchical relationship between attributes
 This is what most existing systems develop and evaluate
 Equivalent to setting the S matrix to the identity
Similarity

The similarity between images a and b is ∑ij δi(a) Sij δj(b), where S ∈ R^(K×K), Sij is the matching score between attributes i and j, and K is the number of attributes
(like the vector product xᵀSy; the authors refer to this as bilinear similarity)
(Cf. cosine similarity: xᵀy / (‖x‖‖y‖))
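
The bilinear similarity in numpy, with toy numbers for S, x, and y:

```python
import numpy as np

S = np.array([[1.0, 0.5],
              [0.5, 1.0]])   # attribute-to-attribute matching scores
x = np.array([0.8, 0.2])     # semantic vector of image a
y = np.array([0.1, 0.9])     # semantic vector of image b

print(x @ S @ y)             # sum_ij x_i S_ij y_j = 0.63
# With S equal to the identity (the "flat" case) this reduces to a plain inner product.
```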
Multinomial Distribution

Suppose
 P(shrimp) = 0.2
 P(crab) = 0.35
 P(fish) = 0.45

P(N_shrimp = 8, N_crab = 10, N_fish = 12)
= (30! / (8! · 10! · 12!)) · (0.2)^8 · (0.35)^10 · (0.45)^12
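
The same probability computed with scipy's multinomial distribution:

```python
from scipy.stats import multinomial

p = [0.2, 0.35, 0.45]                       # P(shrimp), P(crab), P(fish)
print(multinomial.pmf([8, 10, 12], n=30, p=p))
```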
Sampling

For each trial, draw an auxiliary variable X from a uniform (0,1) distribution. The resulting outcome is the component whose cumulative probability interval contains X.
This is a sample from the multinomial distribution with n = 1
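
A sketch of this inverse-CDF draw in numpy:

```python
import numpy as np

def sample_categorical(p, rng):
    X = rng.random()                              # one uniform (0,1) trial
    return int(np.searchsorted(np.cumsum(p), X))  # index of the covering interval

rng = np.random.default_rng(0)
print(sample_categorical([0.2, 0.35, 0.45], rng))  # 0, 1, or 2
```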
Hashing

Preconditions:
 S is symmetric
 S is element-wise non-negative
 S is diagonally dominant, i.e. Sii ≥ ∑j≠i Sij for all i
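
A quick numpy check of the three preconditions (the helper name is ours):

```python
import numpy as np

def check_similarity_matrix(S):
    S = np.asarray(S, dtype=float)
    symmetric = np.allclose(S, S.T)
    nonnegative = (S >= 0).all()
    off_diag = S.sum(axis=1) - np.diag(S)
    dominant = (np.diag(S) >= off_diag).all()  # S_ii >= sum_{j != i} S_ij
    return symmetric and nonnegative and dominant
```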
Hashing

Define a K × (K+1) matrix Θ with Θij = Sij for i, j ≤ K (after scaling S so that each row sum is at most one)
Each row of Θ sums to one (the last column absorbs the remainder), and Θ without its last column is symmetric
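
One way to realize this construction in numpy; scaling S uniformly by its maximum row sum is our assumption (it preserves symmetry and keeps row sums at most one):

```python
import numpy as np

def build_theta(S):
    S = np.asarray(S, dtype=float)
    K = S.shape[0]
    theta = np.zeros((K, K + 1))
    theta[:, :K] = S / S.sum(axis=1).max()        # Theta_ij = scaled S_ij, i, j <= K
    theta[:, K] = 1.0 - theta[:, :K].sum(axis=1)  # each row now sums to one
    return theta
```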
Hashing

Consider hash functions of the form h(x; S, ε)
h outputs a subset in 2^N, where 2^N is the set of all subsets of the natural numbers
h(x) = h(y) is defined as set equality, that is, {a,b,c} = {b,a,c}
Hashing

To construct H(S, ε), let N ≥ 1/ε. Then h(x; S, ε) is computed as follows:
(1) Sample α ∈ {1,...,K} ∼ multi(x)
(2) Sample β ∈ {1,...,K+1} ∼ multi(θα), where θα is the α-th row of Θ
(3) If β ≤ K, return {α, β}
(4) Otherwise, pick γ uniformly at random from {K+1,...,K+N} and return {γ}
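
A sketch of one draw of h(x; S, ε) in numpy, using 0-based indices; `theta` is assumed to come from the build_theta sketch above, and `x` to be a probability distribution over the K attributes:

```python
import numpy as np

def hash_once(x, theta, eps, rng):
    K = theta.shape[0]
    N = int(np.ceil(1.0 / eps))               # N >= 1/eps
    alpha = rng.choice(K, p=x)                # (1) alpha ~ multi(x)
    beta = rng.choice(K + 1, p=theta[alpha])  # (2) beta ~ multi(theta_alpha)
    if beta < K:                              # (3) slide's "beta <= K", 0-based here
        return frozenset({int(alpha), int(beta)})  # unordered pair: set equality
    gamma = rng.integers(K, K + N)            # (4) gamma from {K+1,...,K+N}, 0-based
    return frozenset({int(gamma)})            # singleton; matches with prob. <= 1/N

# Counting h(x) == h(y) over many independent draws estimates the bilinear
# similarity x^T S y (up to the eps distortion controlled by N).
```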
Experiments

Datasets used:
 Caltech 256
 ILSVRC (for precision evaluation)
 (half for training, half for testing)

21k-dimensional vectors formed by three-level SPM on published SIFT visual words from a 1000-word codebook
Train linear SVMs on the semantic attributes with LIBLINEAR; 2-fold cross-validation for C
Precision

Suppose we have retrieved 2 images with similarities 0.8 and 0.7, but the ground truth is 5 images with similarities {0.8, 0.7, 0.6, 0.5, 0.4}
Ordinary precision: 2/5 = 0.4
Proposed precision: (0.8 + 0.7) / (0.8 + 0.7 + 0.6 + 0.5 + 0.4) = 0.5
Precision Definition

Precision measure for a query q:
 Numerator: sum of ground-truth similarities for the k most similar retrieved database items
 Denominator: sum of ground-truth similarities for the best possible k database items
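
A sketch of this measure, reproducing the slide's worked example (the function name is ours; similarities are ground-truth values):

```python
import numpy as np

def hierarchical_precision_at_k(retrieved_sims, ground_truth_sims, k):
    num = np.sum(sorted(retrieved_sims, reverse=True)[:k])     # retrieved items
    den = np.sum(sorted(ground_truth_sims, reverse=True)[:k])  # best possible k items
    return num / den

print(hierarchical_precision_at_k([0.8, 0.7], [0.8, 0.7, 0.6, 0.5, 0.4], 5))  # 0.5
```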
Experiments

[Figures: experimental results]
Conclusion

Presented an approach that can exploit prior knowledge of a semantic hierarchy for similar-image retrieval
Outperforms state-of-the-art similarity learning (OASIS) with higher accuracy and less computation
A hashing scheme for bilinear similarity on probability distributions
End
Thank You