SURF: Speeded Up Robust Features, ECCV 2006. Herbert Bay


An Empirical Study on Large-Scale
Content-Based Image Retrieval
Group Meeting
Presented by Wyman
23-1-2007
Introduction
• Content-based Image Retrieval (CBIR)
– The process of searching for digital images in large databases based on image contents
– It generally consists of four modules:
• data acquisition and processing
• feature representation
• data indexing
• query and feedback processing
Introduction
• Content-based Image Retrieval (CBIR)
– Extensively studied in both academia and industry
– Yet, traditional data indexing techniques do not scale to high-dimensional data
– Image contents live in a high-dimensional feature space
– Thus, traditional CBIR systems are not efficient
Contribution
• Propose a scalable CBIR scheme by applying locality-sensitive hashing (LSH)
– LSH is well suited to indexing high-dimensional data
• Conduct a comprehensive empirical evaluation of CBIR over half a million images
– Empirical studies of CBIR systems at this scale are very limited
• Address some challenges in building scalable CBIR systems on large-scale data
– Some innovative ideas are suggested for tackling these issues
Hashing Algorithm
• Hashing algorithms are commonly employed for indexing large-scale databases
– Due to their fast database lookup capability: O(1) on average
• However, plain hashing is not used for similarity indexing
– It builds an index for finding an identical copy of the query key, not for finding near neighbors
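The limitation can be seen with an ordinary hash map (an illustrative sketch; the keys and labels below are made up):

```python
# A conventional hash table answers exact-match lookups in O(1) on average,
# but a query that is merely *near* a stored key finds nothing.
index = {}
index[(1.00, 2.00)] = "image_A"   # store an image under its feature key

exact_hit = index.get((1.00, 2.00))   # identical key: found
near_miss = index.get((1.01, 2.00))   # near neighbor: not found,
                                      # despite the tiny distance
print(exact_hit, near_miss)
```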
Locality-Sensitive Hashing
• An emerging indexing algorithm
– Proposed to solve the high-dimensional near-neighbor search problem in Euclidean space (l2)
– It can answer queries in sublinear time
– Each near neighbor is reported with a fixed probability
• Principles
– Two close points most likely share the same hash value
– By looking into the hash bucket of the query point, we obtain many near neighbors of the query point
– A large fraction of data points are never processed
Locality-Sensitive Hashing
• We employ the E2LSH (Exact Euclidean LSH) package
• The probability of finding near neighbors is controlled by two parameters, L and k
– L: number of hash functions g(v) = (h1(v), …, hk(v))
• Larger L increases the probability of finding all R-near neighbors
• Larger k reduces the chance of hitting data points that are not R-near neighbors
• Each h(v) = ⌊(a·v + b)/w⌋ is controlled by two parameters, a and b
– a: a d-dimensional vector with entries chosen independently from a Gaussian distribution
– b: a real number chosen uniformly from the range [0, w]
– Two points receive the same h when a·v + b falls within the same interval [(n−1)w, nw)
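A minimal sketch of the E2LSH-style hash functions described above (the sizes below are illustrative toy values, not the paper's settings; the real package tunes L, k, and w itself):

```python
import math
import random

random.seed(0)
d, k, L, w = 4, 3, 2, 4.0   # toy sizes for illustration only

def make_h(d, w):
    # a: d-dimensional vector of independent Gaussian entries
    # b: real number chosen uniformly from [0, w]
    a = [random.gauss(0.0, 1.0) for _ in range(d)]
    b = random.uniform(0.0, w)
    # Two points share the same h when a.v + b falls in the
    # same interval [(n-1)w, nw), i.e. the same floor bucket.
    return lambda v: math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

# g(v) = (h1(v), ..., hk(v)); L such functions, one per hash table
gs = [[make_h(d, w) for _ in range(k)] for _ in range(L)]

def g(v, table):
    return tuple(h(v) for h in gs[table])

p, q = [1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 3.0, 4.1]
# Close points usually collide in at least one of the L tables
print(any(g(p, t) == g(q, t) for t in range(L)))
```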
Main problem of E2LSH
• One main problem is that E2LSH is a memory-based implementation
– All the data points and data structures are stored in main memory
– The maximum database size is limited by the amount of free main memory available
Our Scalable Implementation
• We propose a multi-partition indexing approach
– We divide the whole database into multiple partitions
– Each partition is small enough to be loaded into main memory
– Then we process queries on each partition in turn
The Detailed Procedure
1. Divide the database into n partitions, where n = ⌈database size / maximum partition size⌉.
2. Run E2LSH indexing on each partition to create its hash tables.
3. Given a query q, load the pre-built hash tables T of one partition into main memory.
4. Run an E2LSH query on T to retrieve the top k ranking images with respect to q.
5. Repeat steps 3 and 4 until all partitions are processed.
6. Collect the results from all partitions and return the top k ranking results with respect to q.
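The procedure can be sketched as follows (the per-partition search below is a stand-in for the real E2LSH calls, and ranking by Euclidean distance is an assumption for illustration):

```python
import heapq
import math

def partition(database, max_partition_size):
    # Step 1: n = ceil(database size / maximum partition size) partitions
    n = math.ceil(len(database) / max_partition_size)
    return [database[i::n] for i in range(n)]

def query_partition(part, q, k):
    # Steps 3-4 stand-in: instead of querying E2LSH hash tables,
    # rank this partition's items by Euclidean distance to q.
    return heapq.nsmallest(k, part, key=lambda item: math.dist(q, item[1]))

def multi_partition_query(database, q, k, max_partition_size):
    results = []
    # Step 5: process every partition in turn
    for part in partition(database, max_partition_size):
        results.extend(query_partition(part, q, k))
    # Step 6: merge per-partition results and return the global top k
    return heapq.nsmallest(k, results, key=lambda item: math.dist(q, item[1]))

db = [("img%d" % i, [float(i), float(i)]) for i in range(10)]
top = multi_partition_query(db, [0.0, 0.0], k=3, max_partition_size=4)
print([name for name, _ in top])
```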
Some Critical Issues
• Disk-access overhead for loading the hash tables into main memory
• We can consider parallel solutions to overcome this overhead and speed up the overall system in future work
Feature Representation
• We represent an image with three types of visual features: color, shape, and texture
• For color, we use grid color moments
• For shape, we employ an edge direction histogram
• For texture, we adopt Gabor features
• In total, a 238-dimensional feature vector represents each image in our database
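A hedged sketch of how the three feature types could be concatenated into the 238-dimensional vector. The per-feature dimensionalities below (81 + 37 + 120) are an assumption for illustration; the slides only state the 238 total, and the extractors are stubs:

```python
# Assumed split of the 238 dimensions (81 + 37 + 120 is one plausible
# breakdown, not stated in the slides):
def grid_color_moments(image):
    return [0.0] * 81      # stub: per-grid-cell color moments

def edge_direction_histogram(image):
    return [0.0] * 37      # stub: binned edge orientations

def gabor_features(image):
    return [0.0] * 120     # stub: Gabor filter-bank responses

def extract_features(image):
    # Concatenate color, shape, and texture into one vector
    vec = (grid_color_moments(image)
           + edge_direction_histogram(image)
           + gabor_features(image))
    assert len(vec) == 238
    return vec

print(len(extract_features(None)))
```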
Empirical Evaluation
• Experimental Testbed
– A testbed containing 500,000 images crawled from the Web
– 5,000 images from the COREL image data set serve as the query set
• Contains 50 categories
• Each category consists of exactly 100 randomly selected images
• Every category represents a different semantic topic, such as butterfly, car, cat, dog, etc.
Performance Metrics
• Two standard performance metrics: precision and recall
– An image is relevant if it is one of the 100 COREL images in the database that share the query image's category
• Efficiency is evaluated by the average CPU time elapsed per query
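Under this relevance definition, the two metrics can be computed as follows (an illustrative sketch; the image names are made up):

```python
def precision_recall(retrieved, relevant):
    # relevant: the 100 COREL images sharing the query's category
    hits = sum(1 for img in retrieved if img in relevant)
    precision = hits / len(retrieved)   # fraction of retrieved that is relevant
    recall = hits / len(relevant)       # fraction of relevant that is retrieved
    return precision, recall

relevant = {"cat%03d" % i for i in range(100)}        # the query's category
retrieved = ["cat000", "web123", "cat007", "web456"]  # a top-4 result list
print(precision_recall(retrieved, relevant))
```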
Experimental Setup
• Form a query set of 1,000 images by randomly selecting 20 images from each of the 50 COREL categories
• Prepare 10 image databases of sizes ranging from 50,000 to 500,000 images
• A database of size N contains the following:
– 5,000 COREL images
– N − 5,000 other images selected from our testbed
• Extract the image features using the techniques discussed
Experimental Setup
• Perform a series of experiments using our CBIR system with LSH indexing on all 10 databases
– LSH parameters: L = 550 and k = 34
– Retrieve the top 20 and top 50 ranking images for each query image
– Calculate recall and precision
• Simulate two rounds of relevance feedback
– Re-query the database with relevant examples with respect to the given query
• Environment
– 3.2 GHz Intel Pentium 4 PC with 2 GB memory running Linux kernel 2.6
– All implementations are programmed in C++
• The same experiments are repeated with exhaustive search (ES) indexing
Average Precision of TOP20
• The LSH results are very close to the ES results
• Their maximal difference is no more than 5% at any database size
Average Recall of TOP50
• The average recall and average precision of our CBIR system decrease with the database size
• Yet the rate of decrease diminishes as the database size grows
Average Query Time
• The query time for ES is linear in the database size, while that for LSH is sublinear
Time Performance of LSH over ES on
different databases
• The LSH approach is much faster than the ES solution, with an average speedup greater than 4×
• The time-performance gap between them widens as the database size increases
Conclusion and Future Work
• Proposed a scalable CBIR scheme based on a fast high-dimensional indexing technique, LSH
• Conducted extensive empirical evaluations on a large testbed of half a million images
– Our scalable CBIR system is more efficient than traditional exhaustive linear search methods
– Our system scales to large-scale CBIR applications
• Addressed some limitations and challenges in our current solution
• Will consider parallel solutions in future work
Q&A