Presented by Dipti Vaidya
• Image Retrieval
• Content Based Image Retrieval
• Various visual features and their
corresponding techniques
• Indexing
• Examples of VIR systems
• Research issues
• Digital images and videos are becoming an
integral part of human communication
• Gigabytes of images are generated every day
• Preservation
• Organize the information to allow
efficient browsing, searching and retrieval
Text-based Image Retrieval
• Annotate images with text and then use a text-based
DBMS to perform the image retrieval
• Difficulties:
• Large amount of labor in manual annotation
• Digital Imagery is a subjective source of
• Certain visual properties (patterns, colors, shapes,
textures) are difficult or nearly impossible to
describe with text.
Content-Based Information
• Instead of being manually annotated with
text-based keywords, images would be
indexed by their own “visual content”
• Computer vision and pattern recognition
Image query visual content
Spatial Relationship
• Distribution of color is a useful feature for image
• Color distribution is represented as a histogram of
intensity values.
• The color of any pixel may be represented in terms of
its components, such as RGB.
• A histogram is defined, each of whose bins
corresponds to a range of these values for each
• Let Q and I be two histograms (the query and an
image in the database), each containing N bins.
Their intersection is defined as follows:
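A minimal sketch of histogram intersection, assuming the commonly used normalized form: the sum over all N bins of min(I_j, Q_j), divided by the sum of Q_j. The function names and the toy histograms are illustrative, not taken from the slides.

```python
def histogram_intersection(query, image):
    """Normalized intersection of two histograms with the same number of bins."""
    if len(query) != len(image):
        raise ValueError("histograms must have the same number of bins")
    overlap = sum(min(q, i) for q, i in zip(query, image))
    total = sum(query)
    return overlap / total if total else 0.0


def rank_database(query, database):
    """Score every stored histogram against the query, best match first."""
    scores = [(name, histogram_intersection(query, hist))
              for name, hist in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: 4-bin histograms holding raw pixel counts (hypothetical values).
Q = [4, 2, 0, 2]
db = {"a": [4, 2, 0, 2], "b": [0, 0, 8, 0], "c": [2, 2, 2, 2]}
ranking = rank_database(Q, db)
```

A score of 1.0 means the image contains at least as many pixels as the query in every bin; a score of 0.0 means the two histograms share no occupied bin.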
Problem: It is computationally expensive. For N
histogram bins and M images in the database,
the computation cost is O(NM). Because the
database is searched exhaustively, the only way
to reduce search time is to reduce N.
K best colors in a given color space:
• The color space is partitioned into K “super-cells”, each of which
corresponds to a histogram bin. The color histogram of an image or object
can then be calculated as the normalized count of the pixels that fall in
each of these super-cells.
• The advantage of this approach is that the clustering process will take
into account the color distribution of images over the entire database
and this will minimize the likelihood of histogram bins =>
only a small number of histogram bins tend to capture the
majority of pixels of an image => only the largest bins (in terms of pixel
counts) need be selected as the representation of any histogram, and as
the bins of the query and image histograms are apparently matched,
the intersection may be computed. This does not degrade the performance of
histogram matching.
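The super-cell idea can be sketched as follows. The sketch assumes the K cell centers have already been obtained by clustering the colors of the whole database (the clustering step itself is not shown), and all names and values are hypothetical.

```python
def nearest_cell(pixel, centers):
    """Index of the super-cell whose center is closest to an (r, g, b) pixel."""
    def sq_dist(center):
        return sum((p - c) ** 2 for p, c in zip(pixel, center))
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k]))


def supercell_histogram(pixels, centers):
    """Normalized count of the pixels that fall in each super-cell."""
    counts = [0] * len(centers)
    for pixel in pixels:
        counts[nearest_cell(pixel, centers)] += 1
    return [c / len(pixels) for c in counts]


def largest_bins(hist, keep):
    """Sparse representation: only the `keep` largest bins, as {bin: value}."""
    top = sorted(range(len(hist)), key=lambda k: hist[k], reverse=True)[:keep]
    return {k: hist[k] for k in top}


def sparse_intersection(query, image):
    """Intersection over the bins that both sparse histograms retained."""
    return sum(min(v, image[k]) for k, v in query.items() if k in image)


# Assumed cluster centers (K = 4) and a tiny 10-pixel "image".
centers = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
pixels = [(250, 10, 5)] * 6 + [(10, 240, 20)] * 3 + [(120, 130, 125)]
hist = supercell_histogram(pixels, centers)
compact = largest_bins(hist, keep=2)
```

Because every image uses the same K centers, bin k means the same color in the query and in the database image, so the sparse intersection only has to look at the few retained bins.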
• A disadvantage of a histogram is that it lacks
any information about location. One remedy is
to divide an image into sub-areas and calculate a
histogram for each of these sub-areas.
Increasing the number of sub-areas increases
the information about location, but it also
increases the memory required.
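A small sketch of per-sub-area histograms, assuming a gray-level image stored as a 2D list and a regular grid of sub-areas; the function name and parameters are illustrative, and the image is assumed to be at least as large as the grid.

```python
def subarea_histograms(image, rows, cols, bins, max_value=256):
    """One normalized histogram per cell of a rows x cols grid."""
    height, width = len(image), len(image[0])
    hists = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            counts = [0] * bins
            for y in range(y0, y1):
                for x in range(x0, x1):
                    counts[image[y][x] * bins // max_value] += 1
            area = (y1 - y0) * (x1 - x0)
            hists.append([n / area for n in counts])
    return hists


# Two 4x4 images with identical global histograms but mirrored layouts.
dark_left = [[0, 0, 255, 255] for _ in range(4)]
dark_right = [[255, 255, 0, 0] for _ in range(4)]

global_a = subarea_histograms(dark_left, 1, 1, bins=2)
global_b = subarea_histograms(dark_right, 1, 1, bins=2)
split_a = subarea_histograms(dark_left, 1, 2, bins=2)
split_b = subarea_histograms(dark_right, 1, 2, bins=2)
```

The single global histogram cannot tell the two images apart, while the two-cell version can. The cost is storage: rows x cols x bins values per image instead of bins.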
Object-detection by color-based
• Segmentation begins by dividing an image into
achromatic and chromatic regions, based strictly on
the chroma component of each pixel. The hue
component is then used to further segment the
image into a set of uniform regions based on
a histogram difference metric. Finally, post
processing is carried out to recover from over
• Performance may be degraded in the presence of
strong highlights or shading.
Shape Retrieval
• Shape Representation
– Invariant to translation, rotation and scaling
• Boundary Based
– Outer Boundary
– Fourier Descriptor
• Region Based
– Entire shape region
– Moment invariant
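As an illustration of a region-based descriptor, the sketch below computes the first of Hu's moment invariants (the sum of the normalized second-order central moments) for a binary shape given as a list of pixel coordinates. The representation and the function name are assumptions made for this example.

```python
def first_moment_invariant(points):
    """Hu's first invariant, eta20 + eta02, for a binary shape.

    `points` lists the (x, y) coordinates of the pixels inside the shape.
    For a binary shape the zeroth moment mu00 is simply the pixel count.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    # Second-order central moments are normalized by mu00 squared.
    return (mu20 + mu02) / n ** 2


# An L-shaped region, then the same region shifted and rotated by 90 degrees.
shape = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
shifted = [(x + 10, y + 5) for x, y in shape]
rotated = [(-y, x) for x, y in shape]

phi = first_moment_invariant(shape)
phi_shifted = first_moment_invariant(shifted)
phi_rotated = first_moment_invariant(rotated)
```

Using central moments removes the dependence on position, and the sum mu20 + mu02 does not change under rotation; the division by mu00 squared is what compensates for scale, although on a pixel grid that compensation is only approximate.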
• Homogeneity of visual patterns
Granularity, directionality and repetitiveness
• Co-occurrence matrix
Grey-level spatial dependence of texture
Orientation and Distance
• Texture representation
Human visual perception of texture
6 visual texture properties: coarseness, contrast,
directionality, line-likeness, regularity and roughness
• Wavelets
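The co-occurrence matrix mentioned above can be sketched as follows: for a chosen offset (orientation and distance), it counts how often gray level i occurs next to gray level j. The contrast statistic shown is one common feature derived from it; the function names and the striped test image are illustrative.

```python
def cooccurrence_matrix(image, dx, dy, levels):
    """Count pairs (image[y][x], image[y + dy][x + dx]) over the whole image."""
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Mean squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Vertical stripes with two gray levels (0 and 1).
stripes = [[0, 1, 0, 1] for _ in range(4)]
horizontal = cooccurrence_matrix(stripes, dx=1, dy=0, levels=2)
vertical = cooccurrence_matrix(stripes, dx=0, dy=1, levels=2)
```

For this image the horizontal offset always crosses a stripe boundary, while the vertical offset never does, so the two matrices capture the directionality of the texture.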
• To make CBIR truly scalable to large
image collections, efficient
multidimensional indexing techniques need
to be explored
• Challenges:
- High dimensionality
- Non-Euclidean similarity measure
• Towards solving these problems, one
promising approach is to first perform
dimension reduction and then use
appropriate multi-dimensional indexing
Dimension Reduction
At least 2 approaches appeared in the
• Karhunen-Loeve Transform (KLT)
• Column-wise clustering
• Because an image retrieval system is a
dynamic system, with new images
continuously added to the image
collection, a dynamic update of the indexing
structure is indispensable. This
algorithm provides such a tool.
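To make the Karhunen-Loeve Transform listed above concrete, here is a minimal sketch that finds the direction of largest variance of a set of feature vectors and projects each vector onto it. It uses plain power iteration and keeps only one component; both choices, and all names, are simplifications made for this example.

```python
import math


def mean_and_covariance(vectors):
    """Mean vector and covariance matrix of a list of feature vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    return mean, cov


def principal_axis(cov, iterations=200):
    """Dominant eigenvector of the covariance matrix via power iteration.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    axis = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        axis = [x / norm for x in nxt]
    return axis


def project(vector, mean, axis):
    """Coordinate of a feature vector along the principal axis."""
    return sum((v - m) * a for v, m, a in zip(vector, mean, axis))


# Toy 2-D features that really vary along a single direction.
features = [(1, 2), (2, 4), (3, 6), (4, 8)]
mean, cov = mean_and_covariance(features)
axis = principal_axis(cov)
reduced = [project(f, mean, axis) for f in features]
```

Here the two-dimensional features collapse to one coordinate without losing information. Keeping several components would require repeating the procedure on the remaining variance or using a full eigendecomposition.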
Column-wise clustering
Clustering is normally used to group similar objects
together to perform recognition or grouping.
Clustering can also be used column-wise to
reduce the dimensionality of the feature
Multi-dimensional Indexing
Bucketing Algorithm
K-d tree
K-D-B tree
R-tree and its variants, R+-tree and R*-tree
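Of the structures listed, the k-d tree is the simplest to sketch. The example below builds a tree by splitting on alternating coordinates and answers a nearest-neighbor query. It assumes a Euclidean distance, which, as noted among the challenges, is not always the similarity measure a CBIR system needs; the dictionary-based node layout is an illustrative choice.

```python
def build_kdtree(points, depth=0):
    """Recursively split the points at the median of one coordinate per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Nearest stored point to `target`, pruning subtrees that cannot win."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
match = nearest(tree, (9, 2))
```

The pruning test is what saves work compared with an exhaustive scan, but it becomes less effective as the number of dimensions grows, which is why dimension reduction is applied first.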
CBIR Systems
• More than 80 systems have been identified
Most image retrieval systems support one or more of
the following options:
• Random browsing
• Search by example
• Search by sketch
• Search by text
• Navigation with customized image categories
Other Systems
• Two key properties of QBIC are
(1) its use of image and video content
(computable properties of color, texture,
shape, and motion of images, videos, and
their objects) in the queries, and
(2) its graphical query language in which
queries are posed by drawing, selecting, and
other graphical means.
• QBIC has two main components: database
population (the process of creating an image
database) and database query.
• During population, images and videos are processed to extract
features describing their content (colors, textures, shapes, and camera
and object motion), and the features are stored in a database. During
query, the user composes a query graphically. Features are generated
from the graphical query and then input to a matching engine that finds
images or videos from the database with similar features.
QBIC – Data Model
For both population and query, the QBIC data model has
• still images or scenes (full images) that contain objects
(subsets of an image), and
• video shots that consist of sets of contiguous frames and
contain motion objects.
Videos are broken into clips called shots. Representative frames, or r-frames,
are generated for each extracted shot. R-frames are treated as still images, and
features are extracted and stored in the database. Further processing of shots
generates motion objects, for example, a car moving across the screen.
QBIC – Sample Queries
• For each full-scene image, identified image object, r-frame, and identified video object resulting from the above
processing, a set of features is computed to allow content-based queries. The features are computed and stored during
database population.
• A multi-object query asks for images that contain both
a red round object and a green textured object. The features
are standard color and texture. The matching is done by
combining the color and texture distances. Combining
distances is applied to arbitrary sets of objects and features
to implement logical AND semantics.
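One way to picture how combining distances yields AND semantics is sketched below. This is not QBIC's actual matching engine: the scalar "hue" and "texture" features, the weights, and the rule of summing the best per-object distances are all assumptions made for illustration.

```python
def feature_distance(query_obj, image_obj, weights):
    """Weighted sum of per-feature differences (hypothetical scalar features)."""
    return sum(w * abs(query_obj[name] - image_obj[name])
               for name, w in weights.items())


def best_match(query_obj, image_objs, weights):
    """Distance from one query object to its closest object in the image."""
    return min(feature_distance(query_obj, obj, weights) for obj in image_objs)


def and_query_distance(query_objs, image_objs, weights):
    """AND semantics: every query object needs a good match, so distances add."""
    return sum(best_match(q, image_objs, weights) for q in query_objs)


weights = {"hue": 0.5, "texture": 0.5}
query = [
    {"hue": 0.0, "texture": 0.1},   # a red, smooth object
    {"hue": 0.33, "texture": 0.9},  # a green, strongly textured object
]
images = {
    "both": [{"hue": 0.0, "texture": 0.1}, {"hue": 0.33, "texture": 0.9}],
    "red_only": [{"hue": 0.0, "texture": 0.1}],
}
scores = {name: and_query_distance(query, objs, weights)
          for name, objs in images.items()}
ranked = sorted(scores, key=scores.get)
```

An image that contains only one of the two requested objects is penalized by the unmatched object's distance, so it ranks below an image that contains both.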
QBIC – Database Population
In still image database population, the images are reduced to a
standard-sized icon called a thumbnail and annotated with any
available text information. Object identification is an optional but key
part of this step. It lets users manually, semi-automatically, or fully
automatically identify interesting regions, which we call objects, in the
images. Internally, each object is represented as a binary mask. There
may be an arbitrary number of objects per image. Objects can overlap
and can consist of multiple disconnected components, like the set of
dots on a polka-dot dress. Text, like "baby on beach," can be associated
with an outlined object or with the scene as a whole.
Open Research Issues
• Feature Extraction
Robust retrieval algorithms
• Multi-dimensional Indexing
Retrieval speed
• Human-computer interaction
Query Construction
• Human Visual Perception
• Evaluation Criterion
Performance – effective measures; precision/recall
Large-scale image test bed.
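For reference, the precision and recall measures named above can be computed as in this short sketch; the image identifiers are made up.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved images that are relevant.
    Recall: share of relevant images that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four images returned for a query, three images actually relevant.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Here two of the four retrieved images are relevant (precision 0.5) and two of the three relevant images were found (recall of about 0.67).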