Content-based Retrieval Using Local Descriptors

download report

Transcript Content-based Retrieval Using Local Descriptors

CS848 Similarity Search in Multimedia Databases
Dr. Gisli Hjaltason
Content-based Retrieval Using Local
Descriptors: Problems and Issues from
Databases Perspective
Laurent Amsaleg & Patrick Gros
IRISA-CNRS, Campus de Beaulieu, Rennes, France
Presented by: Wei Jiang
February 19, 2003
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Introduction

Descriptor
–
–
–
–

Root of image content-based retrieval system
Represented by multi-dimentional vectors of real numbers
Encodes specific information extracted from an image
Invariant to some types of variations
Content-based retrieval
– Image processing techniques to extract descriptors from
images
• Large-grain recognition: color histogram, grey-level histogram
• Fine-grain recognition: local descriptors
– Database techniques to store descriptors and accelerate
searches
• Dimensional curse
• Pyramid-Tree and VA-File
Introduction (con’d)

Motivation:
To explore the consequences of using local
descriptors together with up-to-date database multidimensional indexing strategies.
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Global vs. Local Descriptors
Global Descriptor






Computes one descriptor per
image
Cannot identify elements
within images
Lower computation cost
Smaller database size
Lower searching cost
Globally robust
Local Descriptor






Computes several
descriptors per image
Can identify elements within
images
Higher computation cost
Bigger database size
Higher searching cost
Increase the recognition
power
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Extracting Interest Points

Interest Points
– Points in one image that will be also found in similar images
– Where the signal changes 2-dimensionally
Harris Corner Detector

Gaussian filter to avoid image noise
– Gaussian function:
– Gaussian distribution and discrete approximation (with mean
(0,0) and =1)
Harris Corner Detector

Gaussian filter to avoid image noise
– Computing the convolution of the original signal with
Gaussian function to get the smoothed signal I

Computing the eigenvalues of the matrix

Significant values of the eigenvalues indicate an
interest point
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Computing Local Descriptor

Goal:
– To make local descriptors invariant w. r. t. the type of
variations

Computing in two steps:
– Computing the derivatives of the smoothed signal I (up to the
third order), to provide a basic description
– Mixing the derivatives to enforce invariance properties and to
make descriptors robust to changes
Local Descriptors for Grey-level Images

Gaining translational and rotational invariance
– Nine invariant quantities to eliminate the angle of rotation

Gaining photometric invariance
– Illumination is modelled by I --> aI + b

Gaining scale invariance
– Multi-scale approach
F is a function, a is a scale. F(x)= G(ax).
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Extension to Colour Images




RGB system
Extracting interest points by Harris detector
Computing the derivatives separately for every
channel
Mixing the derivatives
– Gaining rotational invariance
– Gaining scale invariance
– Gaining photometric invariance
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Database Indexing Techniques

Problem:
– The space for descriptor gets too big to fit in main memory

Solution:
– To store descriptors on disks
– Multi-dimensional index structures to accelerate searches

Goal:
– To minimize the resulting number of I/Os
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Traditional Approaches

Data-partitioning index methods
– Dividing the data space according to the distribution of data
• R-Tree: Minimum bounding rectangles and overlaps
• SS-Tree: bounding spheres instead of rectangles
• SR-Tree: intersection of a bounding sphere and a bounding
rectangle
• TV-Tree: divide the dimensions into three classes
– Drawback: in high-dimensional space, the probability of
accessing every index page gets close to 1
Traditional Approaches

Space-partitioning indexing
– Dividing the data space along predefined lines, regardless of
the actual data clustering
Grid-file, KDB-Tree, etc.
– Drawbacks
• Inefficient in high-dimensional space
• Indexing large volumes of empty space
• When the query point is near a cell boundary, the search cost is
increased.
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
VA-File





To improve the sequential search
Splitting each dimension, and encoding the grid cells
A file storing all the descriptors
Another file storing the geometrical approximations of
these descriptors associates a descriptor id to a cell #
Searching algorithm computes geometrical
approximation, determines close cells, and scans
them in an increasing order of distance
Pyramid-Tree





Dividing a space into 2xd pyramids, the top of each
pyramid is the center of the data space
Each pyramid is cut into slices parallel to its base
Any point of the multi-dimensional space is mapped
into a pair(pyramid number, height in the pyramid)
A given slice of a specific pyramid is stored as a page
of B+ tree
The number of slices increases linearly (and not
exponentially) with the number of dimensions
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Performance Evaluations

Experimental environment
–
–
–
–

SUN Ultra 5 workstation running SunOS 5.7
CPU 333MHz UltraSPARC-Iii
Main memory 384 Mb
Local secondary storage 8 Gb
Databases
– One color image database
– One grey-level image database
– Third one for recognition evaluation test only
Experiments

Comparing the recognition power of color and grey
descriptors

Influence of the Dimensionality of Data

Influence of the Database Size

Impact of the Number of Descriptors in a Request
Experiment 2: Influence of the Dimensionality of Data
Experiment 3: Influence of the Database Size
Experiment 4: Impact of the Number of Descriptors in a
Request
Outline


Introduction
Image Processing Techniques
–
–
–
–

Global vs Local Descriptors
Extracting Interest Points
Computing Local Descriptors
Extension to Colour Images
Database Indexing Techniques
– Traditional Approaches
– VA-File and Pyramid-Tree


Performance Evaluation
Conclusion & Perspectives
Conclusion and Perspectives

Slowdown is caused by the dimensionality of data,
the size of database, and the number of descriptors.

The three efficient multi-dimensional indexing
techniques do NOT efficiently cope with the fine-grain
recognition with local descriptors.

It’s crucial to come up with new indexing techniques
specially designed to efficiently support the use of
local descriptors.
Research Direction




Numerous local descriptors for a single query creates
redundancy
Exploit the distribution of data to accelerate the
queries
Change the management of memory to benefit from
consecutive queries
Using several low-dimension indexes instead of a
unique high-dimension index