Content Based Image Retrieval
Problems, issues, future directions
Problem ???
MEDIA FLOODING
EXAMPLE: GENERAL PHOTOGRAPHY
SNAPSHOT
PREVIEWS
MEMORY
REUSABLE
PRINTER
TECHNOLOGY
EASY SHARING
VIA INTERNET
CHEAP &
DENSE
STORAGE
EFFECTS &
PROCESSING
RESULT: DIGITAL MEDIA FLOOD
HOW DO WE COPE WITH, TRACK, AND
ORGANIZE IT ALL?
MOTIVATION
DEVICE FUNCTION CONVERGENCE
DATA RAPIDLY GENERATED BY MANY
DEVICES
INTERNET ACTS AS GLOBAL TRANSPORT
DATA CONSUMED BY DEVICES ON DEMAND
MULTIMEDIA DATA NEEDS TO BE
EFFICIENTLY STORED
INDEXED ACCURATELY
EASILY RETRIEVED
History of Image Retrieval
Traditional text-based image search engines
Manual annotation of images
Use text-based retrieval methods
E.g. an image of water lilies might be annotated:
“Water lilies”
“Flowers in a pond”
<Its biological name>
Text-based image retrieval
by Google
by Yahoo
etc.
IMPORTANT QUESTION ARISES:
“WHY NOT SIMPLY INDEX USING TEXT?”
(YAHOO! HAS HAD SOME SUCCESS WITH THIS)
INTUITIVE, YET USING TEXT IS
SIMPLE BUT SIMPLISTIC
TIME CONSUMING – CAN’T AUTOMATE
HIGHLY SUBJECTIVE & USER-DEPENDENT
SUSCEPTIBLE TO TRANSLATION PROBLEMS
Content based image retrieval
What is content in images ??
Colour, texture, shape, etc.
CBIR
Definition:
“The process of retrieving images from a
collection on the basis of features (such
as colour, texture and shape)
automatically extracted from the images
themselves”
CBIR
SIMPLE EXAMPLE
FOR A GIVEN QUERY…
EXAMPLE IMAGE
ROUGH SKETCH
EXPLICIT DESCRIPTION
CRITERIA
RETRIEVAL
SYSTEM
…RETURN
ALL ‘SIMILAR’
IMAGES
RETRIEVAL RESULTS
QUERY
IMAGE
BASED ON COLOR CONTENT
CBIR
QUERY TYPES
COLOR
SKETCH
SHAPE
EXAMPLE
TEXTURE
MORE COMPLEX TYPES
EXIST, YET THE ABOVE
ARE THE MOST
FUNDAMENTAL
CBIR
(DIS)SIMILARITY?
SIMILARITY IS NOT SO SIMPLE
CONSIDER THREE IMAGES
ON WHAT BASIS ARE THEY
SIMILAR?
COLOR CONTENT?
However …
Finding the right image is not always easy.
There are many millions of digital images
available on the Web.
Many more images waiting to be digitised.
Variable levels of metadata and associated
content – if any exist at all …
Diverse groups of users.
Users who don’t always know either what they
want or how to express it in words.
Why Research Image Data?
Increasing use of digital images, particularly
via the Internet, by professionals of all kinds.
Inadequacy of current technology in
handling images.
Increasing interest in how people perceive,
search for and use images.
Exciting new applications opening up (e-commerce?).
Why Is Image Retrieval Difficult?
Important to distinguish between the physical properties
of an image and how it is perceived.
Image retrieval should be based on the latter, not the
former!
Two Classes of CBIR
Narrow vs. Broad Domain
Narrow
Medical Imagery Retrieval
Finger Print Retrieval
Satellite Imagery Retrieval
Broad
Photo Collections
Internet
Block diagram of CBIR
Client side:
Query interface: user drawing; query by
color, color sensation, images, shape, or
spatial relation
Server side (reached via Internet, intranet,
or extranet):
Feature extraction: color, shape, color
sensation, spatial relation
Indexing & filtering over the image database
Similarity measure: color, shape, color
sensation, spatial relation
Learning mechanism: weights of features
Query specification
Interfaces
Browsing and navigation
Specifying the conditions the objects of interest
must satisfy, by means of queries
Queries can be specified in two different
ways
Using a specific query language
Query by example
Using actual data (object example)
Conditions on multimedia data
Query predicates
Attribute predicates
Concern the attributes for which an exact value is supplied
for each object
Exact-match retrieval
Structural predicates
Concern the structure of multimedia objects
Can be answered by metadata and information
about the database schema
“Find all multimedia objects containing at least one
image and a video clip”
Conditions on multimedia data
Semantic predicates
Concern the semantic content of the
required data, depending on the features
that have been extracted and stored for
each multimedia object
“Find all the red houses”
Exact match cannot be applied
Uncertainty, proximity, and
weights in query expressions
Specify the degree of relevance of the retrieved
objects
Using some imprecise terms and predicates
Represent a set of possible acceptable values with
respect to which the attribute or the features have to
be matched
Normal, unacceptable, typical
Particular proximity predicates
The relationship represented is based on the
computation of a semantic distance between the
query object and stored ones
Nearest object search
Uncertainty, proximity, and
weights in query expressions
Assign each condition or term a given weight
Specify the degree of precision by which a
condition must be verified by an object
“Find all the objects containing an image
representing a screen (HIGH) and a keyboard
(LOW)”
The corresponding query is executed by assigning some
importance and preference values to each predicate and
term
Issues related to feature extraction
What are features of images ??
Colour
Chromatic histograms, dominant colours, moments, etc.
Shape
Texture
For the purpose of Colour Based
Image Retrieval
Is the colour system device independent
Is the colour system perceptual uniform
Is the colour system linear
Is the colour system intuitive
Is the colour system robust against varying
imaging conditions
Invariant to change in viewing direction
Invariant to change in object geometry
Invariant to change in direction of illumination
Invariant to change in intensity of illumination
Invariant to change in SPD of illumination
Color Histogram
The histogram of image I is defined as:
HCi(I) = the number of pixels of color Ci in
image I
OR, normalized:
for any pixel in image I, HCi(I) is the
probability that the pixel has color Ci.
Most commercial CBIR systems include color
histogram as one of the features (e.g., QBIC
of IBM).
No space information.
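As an illustration (not from the slides), a normalized histogram over pre-quantized color indices takes only a few lines; the 3-color toy image below is hypothetical:

```python
import numpy as np

def color_histogram(image, n_colors):
    """H_Ci(I): fraction of pixels of each (quantized) color Ci in image I."""
    counts = np.bincount(image.ravel(), minlength=n_colors)
    return counts / counts.sum()

# hypothetical 4x4 image whose pixels are already quantized to 3 color indices
img = np.array([[0, 0, 1, 1],
                [0, 2, 1, 1],
                [2, 2, 1, 0],
                [0, 0, 2, 1]])
hist = color_histogram(img, 3)   # one bin per color, summing to 1
```

Real systems first quantize the color space to a fixed palette; the histogram then serves as the feature vector, with no spatial information, as noted above.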
Improvement of color
histogram
There are several techniques proposed to
integrate spatial information with color
histograms:
W.Hsu, et al., An integrated color-spatial approach to
content-based image retrieval. 3rd ACM Multimedia
Conf. Nov 1995.
Smith and Chang, Tools and techniques for color image
retrieval, SPIE Proc. 2670, 1996.
Stricker and Dimai, Color indexing with weak spatial
constraints, SPIE Proc. 2670, 1996.
Gong, et al., Image indexing and retrieval based on
human perceptual color clustering, Proc. 17th IEEE
Conf. On Computer Vision and Pattern Recognition,
1998.
Color auto-correlogram
Pick any pixel p1 of color Ci in the image
I; at distance k away from p1 pick
another pixel p2. What is the probability
that p2 is also of color Ci?
Color auto-correlogram
The auto-correlogram of image I for
color Ci at distance k:
γ(k)Ci(I) = Pr[ |p1 − p2| = k, p2 ∈ ICi | p1 ∈ ICi ]
Integrate both color information and
space information.
Color auto-correlogram
Implementations
Pixel Distance Measures
Use the D8 distance (also called chessboard
distance):
D8(p, q) = max(|px − qx|, |py − qy|)
Choose distances k = 1, 3, 5, 7
Computation complexity: O(n²) for the
histogram, O(134 · n²) for the correlogram
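A brute-force sketch of the D8 distance and the auto-correlogram defined above; it tests all pixel pairs for clarity, whereas a practical implementation would enumerate only the ring of pixels at distance k:

```python
import numpy as np

def d8(p, q):
    """Chessboard (D8) distance between two pixel coordinates."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def auto_correlogram(image, n_colors, distances=(1, 3, 5, 7)):
    """gamma_Ci^(k)(I): probability that a pixel at D8 distance k
    from a pixel of color Ci also has color Ci."""
    h, w = image.shape
    pixels = [(y, x) for y in range(h) for x in range(w)]
    gamma = {}
    for k in distances:
        for c in range(n_colors):
            same = total = 0
            for p1 in pixels:
                if image[p1] != c:
                    continue
                for p2 in pixels:
                    if d8(p1, p2) == k:
                        total += 1
                        same += int(image[p2] == c)
            gamma[(c, k)] = same / total if total else 0.0
    return gamma

uniform = np.zeros((4, 4), dtype=int)            # a single-color image
g = auto_correlogram(uniform, 1, distances=(1,))  # probability is trivially 1
```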
Implementations
Features Distance Measures:
D( f(I1), f(I2) ) small ⇒ I1 and I2 are
similar.
Example: f(a) = 1000, f(a’) = 1050;
f(b) = 100, f(b’) = 150
(the relative, not the absolute, difference
should count, hence the normalizing
denominators below)
For histograms:
|I − I’|_h = Σ_{i∈[m]} |h_Ci(I) − h_Ci(I’)| / (1 + h_Ci(I) + h_Ci(I’))
For correlograms:
|I − I’|_γ = Σ_{i∈[m], k∈[d]} |γ(k)_Ci(I) − γ(k)_Ci(I’)| / (1 + γ(k)_Ci(I) + γ(k)_Ci(I’))
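The relative-L1 distance above can be sketched as follows; the same function serves both histogram vectors and flattened correlogram vectors:

```python
def d1_distance(f1, f2):
    """Sum of |f1[i] - f2[i]| / (1 + f1[i] + f2[i]) over all bins,
    so each bin contributes its relative, not absolute, difference."""
    return sum(abs(a - b) / (1 + a + b) for a, b in zip(f1, f2))

# identical features have distance 0; disjoint unit bins have distance 1
zero = d1_distance([0.3, 0.7], [0.3, 0.7])
one = d1_distance([1, 0], [0, 1])
```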
Human beings can perceive specific wavelengths as colors.
CBIR is about: “blue”, “seashell”
but is not about: “Botticelli”, “Venus”
MPEG-7
Motivation
To efficiently search/retrieve relevant
information that people want to use
Goal
To make it easy to
search/retrieve/filter/exchange content to
maintain archive, and to edit multimedia
content etc.
MPEG-1, 2, 4 : Representation of contents itself
MPEG-7 : Representation of information about
the content
Types of multimedia data
audio, speech, video, still pictures, graphics
Components of MPEG-7
1) MPEG-7 Systems
2) MPEG-7 Description Definition Language
3) MPEG-7 Visual
4) MPEG-7 Audio
5) MPEG-7 Multimedia DSs
6) MPEG-7 Reference Software
7) MPEG-7 Conformance
Color Descriptors
Color Spaces
Constrained color spaces
Scalable Color Descriptor uses HSV
Color Structure Descriptor uses HMMD
MPEG-7 color spaces:
Monochrome
RGB
HSV
YCrCb
HMMD
Scalable Color Descriptor
A color histogram in HSV color space
Encoded by Haar Transform
Dominant Color Descriptor
Clustering colors into a small number of
representative colors
It can be defined for each object, regions, or
the whole image
F = { {ci, pi, vi}, s}
ci : Representative colors
pi : Their percentages in the region
vi : Color variances
s : Spatial coherency
Color Layout Descriptor
Clustering the image into 64 (8x8)
blocks
Deriving the average color of each
block (or using DCD)
Applying DCT and encoding
Efficient for
Sketch-based image retrieval
Content Filtering using image
indexing
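A minimal single-channel sketch of the steps above: average over an 8x8 grid of blocks, then apply a 2D DCT. The hand-built orthonormal DCT matrix keeps the example self-contained; the zig-zag scan and quantization of coefficients are omitted here:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix."""
    m = np.array([[np.cos(np.pi * (2 * x + 1) * u / (2 * n))
                   for x in range(n)] for u in range(n)])
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2 / n)

def color_layout(channel):
    """Average a single image channel over an 8x8 grid of blocks,
    then return the 2D DCT of the 8x8 grid of block averages."""
    h, w = channel.shape
    grid = np.array([[channel[i*h//8:(i+1)*h//8, j*w//8:(j+1)*w//8].mean()
                      for j in range(8)] for i in range(8)])
    d = dct_matrix()
    return d @ grid @ d.T   # low-frequency coefficients are kept

coef = color_layout(np.ones((16, 16)))   # a constant image: only the DC term survives
```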
Color Structure Descriptor
Scanning the image by an 8x8 pixel block
Counting the number of blocks containing
each color
Generating a color histogram (HMMD)
Main usages:
Still image retrieval
Natural images retrieval
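A brute-force sketch of the scanning step above, on a pre-quantized single-channel image: each color's bin counts the window positions in which that color appears at least once (the subsequent re-quantization of the counts is omitted):

```python
import numpy as np

def color_structure(image, n_colors, win=8):
    """For each color, count the win x win window positions that
    contain at least one pixel of that color."""
    h, w = image.shape
    hist = np.zeros(n_colors, dtype=int)
    for y in range(h - win + 1):
        for x in range(w - win + 1):
            for c in np.unique(image[y:y+win, x:x+win]):
                hist[c] += 1
    return hist

img = np.zeros((8, 8), dtype=int)   # exactly one window position, all color 0
hist = color_structure(img, 2)
```

Unlike a plain histogram, a color occupying one compact blob and the same color scattered across the image produce different counts, which is the point of the descriptor.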
GoF/GoP Color Descriptor
Extends Scalable Color Descriptor
Generates the color histogram for a
video segment or a group of pictures
Calculation methods:
Average
Median
Intersection
Color Opponency
Color
Afterimages
Color Opponency
Color Blindness
Color Constancy
Discounting the illuminant - adaptation
The color of the ambient lighting quickly
fatigues photoreceptors to that color -- There
is no eye position that allows the
photoreceptors to recover
Once fatigued to the ambient color, that color
is subtracted, or discounted from the visual
scene and colors appear close to the way they
would in white light
Color Constancy
There is a cognitive
component as well.
Color Vision
Interactions of the color system with other
visual components:
1. Memory & Imagery: Achromatopsia
2. Form & Motion
Case of Achromatopsia
Damage to V4 can cause the complete
loss of color vision (as opposed to red-green
color blindness): V4 is more
sensitive to oxygen deprivation
In addition, color imagery and color
memory are also lost
What are the implications for
perception, imagery and memory?
Color, Form & Motion
Although V4 interacts with other areas (V3
& V5 are monochromatic), its interactions
are limited
Equiluminant color conditions makes form
and motion perception difficult -- but not
impossible
Equiluminant Colors
Equiluminant Colors
Equiluminant Colors
Some Applications Areas
Planning and government:
there is a lot of satellite
imagery of the earth, which can be used to inform important
political debates. For example, how far does urban sprawl
extend? what acreage is under crops? how large will the maize crop
be? how much rainforest is left?, etc.
Military intelligence:
satellite imagery can contain
important military information. Typical queries involve finding
militarily interesting changes — for example, is there a
concentration of force? how much damage was caused by the
last bombing raid? what happened today? etc. — occurring at
particular places on the earth
Stock photo and stock footage: commercial libraries —
which often have extremely large and very diverse collections —
survive by selling the rights to use particular images. Effective tools
may unlock value in these collections by making it possible for
relatively unsophisticated users to obtain images that are useful to
them at acceptable expense in time and money.
Access to museums: museums are increasingly creating
web views of their collections, typically at restricted resolutions, to
entice viewers into visiting the museum. Ideally, one would want
viewers to get a sense of what is at the museum, why it is worth
visiting and the particular virtues of the museum’s gift store.
Trademark and copyright enforcement: as electronic
commerce grows, so does the opportunity for automatic searches to
find violations of trademark or of copyright. For example, at time of
writing, the owner of rights to a picture could register it with an
organisation called BayTSP, who would then search for stolen
copies of the picture on the web; recent changes in copyright law
make it relatively easy to recover fines from violators (see
http://www.baytsp.com/index.asp).
Managing the web: indexing web pages appears to be a
profitable activity; the images present on a web page
should give cues to the content of the page. Users may
also wish to have tools that allow them to avoid offensive
images or advertising. A number of tools have been built
to support searches for images on the web using CBIR
techniques. There are tools that check images for
potentially offensive content, both in the academic and
commercial domains.
Medical information systems: recovering medical images
“similar” to a given query example might give more
information on which to base a diagnosis or to conduct
epidemiological studies. Furthermore, one might be able
to cluster medical images in ways that suggest
interesting and novel hypotheses to experts.
5-min Recap
Why is Image IR important?
“a picture is worth a 1000 words”
Alternative form of communication
Not everything can be described in
text; Not everything can be described
in images
Popular medium of information on the
Internet
Search and Retrieval Process
It’s all over
Earlier Works of CBIR
The main features of earlier works of CBIR:
Focused on effective FEATURE
representation
such as color, texture, shape.
Indexing image contents based on
Features.
Disadvantages of the previous work
Semantic gap between high level concepts and low
level image feature representation. Hence hard to
select appropriate features.
1st iteration: Display → User feedback →
Feedback to system → Estimation & display
selection
2nd iteration: Display → User feedback → …
Problem Statement
Assumption: images of the same semantic
meaning/category form a cluster in feature
vector space
Given a set of positive examples, learn user’s
preference and find better result in the next
iteration
Former Approaches
Multimedia Analysis and Retrieval
System (MARS)
IEEE Trans CSVT 1998
Weight updating, modification of
distance function
Pic-Hunter
IEEE Trans IP 2000
Probability based, updated by Bayes’
rule
Comparisons
Modeling of user’s target:
MARS – weighted Euclidean distance
Pic-Hunter – probability associated with each image
Our approach – user’s target data points follow a Gaussian distribution
Learning method:
MARS – weight updating, modification of distance function
Pic-Hunter – Bayes’ rule
Our approach – parameter estimation
Display selection:
MARS – K-NN neighborhood search
Pic-Hunter – maximum entropy principle
Our approach – simulated maximum entropy principle
Estimation of Target
Distribution
Assume the user’s target follows a Gaussian
distribution
Construct a distribution that best fits the
relevant data points into some “specific”
region
Data points selected as relevant
Expectation Function
Best fit the relevant data points to medium
likelihood region
The estimated distribution represents user’s
target
Updating Parameters
After each feedback loop, parameters are
updated
New estimated mean = mean of relevant
data points
New estimated variance found by
differentiation
Iterative approach
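A sketch of the estimation and ranking steps just described, assuming an axis-aligned (per-dimension) Gaussian; the small variance floor is an implementation assumption, not from the slides:

```python
import numpy as np

def estimate_target(relevant_points):
    """New mean = mean of relevant points; per-dimension variance
    from their spread (floored to avoid a degenerate Gaussian)."""
    x = np.asarray(relevant_points, dtype=float)
    return x.mean(axis=0), x.var(axis=0) + 1e-9

def rank_by_likelihood(database, mean, var):
    """Order database items by log-likelihood under the estimate."""
    x = np.asarray(database, dtype=float)
    log_lik = -0.5 * (((x - mean) ** 2) / var
                      + np.log(2 * np.pi * var)).sum(axis=1)
    return np.argsort(-log_lik)        # most likely first

relevant = [[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]]
mean, var = estimate_target(relevant)
order = rank_by_likelihood([[5.0, 5.0], [0.0, 0.0]], mean, var)
```

After each feedback loop the relevant set changes, so estimate_target is simply re-run, matching the iterative approach above.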
Indexing and searching
Searching similar patterns
Distance function
Given two objects, O1 and O2, the
distance (=dissimilarity) of the two
objects is denoted by D(O1,O2)
Similarity queries
Whole match
Sub-pattern match
Nearest neighbors
All pairs
Spatial access methods
Map objects into points in f-D space, and to
use multiattribute access methods (also
referred to as spatial access methods or
SAMs) to cluster them and to search for them
Methods
R*-trees and the rest of the R-tree family
Linear quadtrees
Grid-files
Linear quadtrees and grid files explode
exponentially with the dimensionality
R-tree
R-tree
Represent a spatial object by its
minimum bounding rectangle (MBR)
Data rectangles are grouped to form
parent nodes (recursively grouped)
The MBR of a parent node completely
contains the MBRs of its children
MBRs are allowed to overlap
Nodes of the tree correspond to disk pages
R-tree
Range query
Specify a region of interest, requiring
all the data regions that intersect it
To retrieve:
Compute the MBR of the query
region
Recursively descend the R-tree,
excluding the branches whose
MBRs do not intersect the query
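The descent above can be sketched as follows; the Node class and the (xmin, ymin, xmax, ymax) MBR layout are assumptions for illustration:

```python
def intersects(a, b):
    """True if two MBRs (xmin, ymin, xmax, ymax) overlap."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

class Node:
    def __init__(self, mbr, children=None, obj=None):
        self.mbr, self.children, self.obj = mbr, children or [], obj

def range_query(node, query_mbr, out):
    """Recursively descend, pruning branches whose MBR misses the query."""
    if not intersects(node.mbr, query_mbr):
        return                         # whole subtree excluded
    if node.obj is not None:           # leaf entry: report the object
        out.append(node.obj)
    for child in node.children:
        range_query(child, query_mbr, out)

leaf_a = Node((0, 0, 1, 1), obj="A")
leaf_b = Node((5, 5, 6, 6), obj="B")
root = Node((0, 0, 6, 6), children=[leaf_a, leaf_b])
found = []
range_query(root, (0, 0, 2, 2), found)   # only A's MBR intersects the query
```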
Generic multimedia indexing
approach
“Whole match” problem
A collection of N objects: O1, O2, …, ON
The distance/dissimilarity between two
objects (Oi, Oj) is given by the function
D(Oi, Oj)
User specifies a query object Q, and a
tolerance ε
Goal: find all objects Oi with D(Q, Oi) ≤ ε
GEMINI
Generic Multimedia object INdexIng
Ideas
A ‘quick-and-dirty’ test, to discard
quickly the vast majority of non-qualifying objects (possibly allowing
some false alarms)
The use of spatial access methods, to
achieve faster-than-sequential
searching
GEMINI
Example
Database: yearly stock price movements, with
one price per day
Distance function: Euclidean distance
D(S, Q) = ( Σ_{i=1}^{365} (S[i] − Q[i])² )^{1/2}
The idea behind the quick-and-dirty test is to
characterize a sequence with a single number
(feature), which helps us discard many
non-qualifying sequences
E.g. the average stock price over the year, the
standard deviation, etc.
GEMINI
Mapping function
Let F() be the mapping of objects to
f-dimensional points; that is, F(O) will be the
f-D point that corresponds to object O
Organize f-D points into a spatial access
method, cluster them in a hierarchical
structure, like the R*-trees
Upon a query, we can exploit the R*-tree, to
prune out large portions of the database that
are not promising
GEMINI
Search algorithm (for whole match
query)
Map the query object Q into a point
F(Q) in feature space
Using a spatial access method,
retrieve all points within the desired
tolerance ε from F(Q)
Retrieve the corresponding objects,
compute their actual distance from Q
GEMINI
Lower Bounding lemma
To guarantee no false dismissals for whole-match
queries, the feature extraction function
F() should satisfy:
Dfeature( F(O1), F(O2) ) ≤ D(O1, O2)
Dfeature(): distance of two feature vectors
(mapping F() from objects to points should
make things look closer)
GEMINI
GEMINI algorithm
Determine the distance function D()
between two objects
Find one or more numerical feature-extraction functions, to provide a
‘quick-and-dirty’ test
Prove that the distance in feature
space lower-bounds the actual
distance D(), to guarantee
correctness
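A toy filter-and-refine sketch of the algorithm above, using the stock-sequence example: the single feature is the sequence mean, and sqrt(n) · |mean(a) − mean(b)| ≤ ||a − b|| by the Cauchy-Schwarz inequality, so the quick-and-dirty filter never causes a false dismissal:

```python
import math

def euclid(a, b):
    """Actual distance D() between equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_feature(s):
    """F(): a single numerical feature, the mean value."""
    return sum(s) / len(s)

def feature_dist(fa, fb, n):
    """Lower-bounds euclid() (Cauchy-Schwarz), so no false dismissals."""
    return math.sqrt(n) * abs(fa - fb)

def gemini_whole_match(sequences, query, eps):
    """Filter in feature space, then refine with the actual distance."""
    n, fq = len(query), mean_feature(query)
    candidates = [s for s in sequences
                  if feature_dist(mean_feature(s), fq, n) <= eps]
    return [s for s in candidates if euclid(s, query) <= eps]

db = [[1, 1, 1, 1], [10, 10, 10, 10], [1, 2, 1, 2]]
hits = gemini_whole_match(db, [1, 1, 1, 1], eps=2.0)   # prunes the [10, ...] sequence
```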
GEMINI
‘Feature-extracting’ question
If we are allowed to use only one
numerical feature to describe each
data object, what should this feature
be?
The successful answers to the question
should meet two goals
They should facilitate step 3 (the
distance lower-bounding)
They should capture most of the characteristics of the data objects
R-type databases, e.g. Access and OLE
Object Linking and Embedding was Microsoft’s first
architecture for integrating files of different types:
Each file type in Windows is associated with an
application. It is possible to place a file of one type
inside another:
either by wholly embedding the data in which
case it is rendered by a plug-in associated with the
program
or by placing a link to the data in which case it is
rendered by calling the original program
Access works with this system by providing a domain
type for OLE
There’s not much you can do with OLE fields since
the data is in a format that Access does not understand
R databases e.g. BFILEs in
Oracle
The BFILE datatype provides access to
BLOB files of up to 4 gigabytes that are
stored in file systems outside an Oracle
database.
The BFILE datatype allows read-only
support of large binary files; you
cannot modify a file through Oracle.
Oracle provides APIs to access file
data.
Large Object Types in Oracle
and SQL3
Oracle and SQL3 support three large object
types:
BLOB - The BLOB domain type stores
unstructured binary data in the database.
BLOBs can store up to four gigabytes of
binary data.
CLOB – The CLOB domain type stores up
to four gigabytes of single-byte character
set data
NCLOB - The NCLOB domain type stores
up to four gigabytes of fixed-width and
varying-width multi-byte national character set data
Cont …
These types support
Concatenation – making up one LOB by
putting two of them together
Substring – extract a section of a LOB
Overlay – replace a substring of one LOB
with another
Trim – removing particular characters (e.g.
whitespace) from the beginning or end
Length – returns the length of the LOB
Position – returns the position of a
substring in a LOB
Upper and Lower – turns a CLOB or NCLOB into upper or lower case
Large Object Types in
MySQL
MySQL has four BLOB and four CLOB (called
TEXT in MySQL) domain types:
TINYBLOB and TINYTEXT – store up to 255
bytes
BLOB and TEXT – store up to 64K bytes
MEDIUMBLOB and MEDIUMTEXT – store
up to 16M bytes
LONGBLOB and LONGTEXT – store up to
4G bytes
Oracle interMedia Audio,
Image, and Video
Oracle interMedia supports multimedia storage,
retrieval, and management of:
BLOBs stored locally in Oracle8i onwards and
containing audio, image, or video data
BFILEs, stored locally in operating system-specific
file systems and containing audio, image,
or video data
URLs containing audio, image, or video data
stored on any HTTP server such as Oracle
Application Server, Netscape Application Server,
Microsoft Internet Information Server, Apache
HTTPD server, and Spyglass servers
The Object Relational Multimedia
Domain Types in interMedia
interMedia provides the ORDAudio,
ORDImage, and ORDVideo object types and
methods for:
updateTime ORDSource attribute
manipulation
manipulating multimedia data source
attribute information
extracting attributes from multimedia data
getting and managing multimedia data
from Oracle interMedia, Web servers, and
other servers
Cont …
The properties available are:
ORDImage – the height, width, data size of
the on-disk image, file type, image
type, compression type, and MIME type
ORDAudio – the format, encoding, number of
channels, sampling rate, sample
size, compression type, and audio duration
ORDVideo – the format, frame size, frame
resolution, frame rate, video duration, number
of frames, compression type, number of
colours, and bit rate
Cont …
Oracle also stores metadata including:
source type, location, and source
name
MIME type and formatting information
characteristics such as height and
width of an image, number of audio
channels, video frame rate, play time,
etc.
Open issues
Gap between low level features and
high-level concepts
Human in the loop – interactive systems
Retrieval speed – most research
prototypes can handle only a few
thousand images.
A reliable test-bed and measurement
criterion, please!
Query Refinement in
Multimedia Similarity Retrieval
To refine the query to represent the
information that the user is looking for.
[Figure: initial vs. optimal query representation,
shown with similarity contours Sim = 0.7, 0.8, 0.9;
some points are relevant according to the current
query, others according to the optimal query, and
refinement moves the former toward the latter]
Query Refinement Models
User feedback drives refinement of a
multi-feature query at two levels:
Inter-feature refinement (feature
re-weighting)
Intra-feature refinement (query modification
& re-weighting) of the individual feature
queries, each served by its own index
Intra-feature Refinement
Query Point Movement: the new query
representation is a single point C*, the
weighted centroid of the old query points Q;
for a data point P:
Dist(Qnew, P) = Dist(C*, P) = ( Σ_{j=1}^{m} ((C*_j − P_j) / σ_j)² )^{1/2}
σ_j = standard deviation of Q in dimension j
Query Expansion: the new query representation
is a set of points Q1 … Q4;
Dist(Q, P) = Σ_i w_i · Dist(Q_i, P)
w_i = weight of Q_i (weights based on
relevance level)
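Query Point Movement can be sketched as below; the guard for zero standard deviation is an implementation assumption:

```python
import math

def weighted_centroid(points, weights):
    """C*: weighted centroid of the relevant query points."""
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[j] for p, w in zip(points, weights)) / total
            for j in range(dim)]

def qpm_distance(points, weights, x):
    """Dist(Q_new, P): distance from the moved query point C* to x,
    each dimension scaled by the std deviation of Q in that dimension."""
    c = weighted_centroid(points, weights)
    dist = 0.0
    for j in range(len(c)):
        mean_j = sum(p[j] for p in points) / len(points)
        var_j = sum((p[j] - mean_j) ** 2 for p in points) / len(points)
        sigma_j = math.sqrt(var_j) or 1.0   # guard: no spread in dimension j
        dist += ((c[j] - x[j]) / sigma_j) ** 2
    return math.sqrt(dist)

q_points = [[0.0, 0.0], [2.0, 0.0]]      # relevant points, equal weights
at_centroid = qpm_distance(q_points, [1.0, 1.0], [1.0, 0.0])
```

The per-dimension scaling means dimensions along which the relevant points agree tightly count more, which is the re-weighting effect described above.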
Selecting Relevant Points in
Query Expansion
Cluster centroids to be added
to query representation
• Clustering algorithm used to
cluster the relevant points
• Cluster centroids chosen as
a new query points
Feature
space
Query Expansion: multi-point approach
The distance of an R-tree node R from a
multi-point query Q = {Q1, Q2, Q3} with weights
w1, w2, w3 is defined as:
MinDist(Q, R) = w1 · MinDist(Q1, R) + w2 · MinDist(Q2, R) + w3 · MinDist(Q3, R)
so that ∀ P ∈ R : Dist(Q, P) ≥ MinDist(Q, R)
Query Processing for Refined Query
Naively,
execute the refined query just like
executing the initial query
Observation:
the query representation does not change
dramatically across feedback iterations.
exploit the work done in the previous
iteration by reusing the priority queue used
in the previous kNN search for the next
iteration.
Refined Query
Previous query P = P1, P2, P3, with aggregated
distance wp1·P1 + wp2·P2 + wp3·P3
New query Q = Q1, Q2, Q3, Q4, with aggregated
distance wq1·Q1 + wq2·Q2 + wq3·Q3 + wq4·Q4
The previous priority queue is reused as the
starting point for the new one.
CBIR
SUMMARY
BORN FROM MULTIMEDIA FLOOD
TEXT TOO SIMPLE AND
LABORIOUS
SYSTEMS WORK DECENTLY IN
VITRO
QUERY BY SHAPE, COLOR,
TEXTURE, EXAMPLE
SHORTCOMINGS
ONGOING
RESEARCH-2
RELEVANCE FEEDBACK
ITERATIVE QUERY REFINEMENT
PLACE USER IN LOOP TO
ITERATIVELY IMPROVE
RETRIEVAL RATES
HIGH-DIMENSIONAL SPACE
NEEDS PRUNING
EMPHASIZED FEATURE(S)
MUST BE FOUND
ONGOING
RESEARCH-2
RELEVANCE FEEDBACK
FEATURE SELECTIVE INTERFACE
WHY CHOOSE IMAGES AS A
WHOLE? THAT REQUIRES
PROCESSING/STATS TO FIND THE
GOOD, RELEVANT FEATURES
(SHAPE, COLOR, …)
EXPLICIT FEATURES: THE USER CAN
EXPLICITLY INDICATE THE ELEMENTS
OF AN IMAGE WHICH ARE RELEVANT
TO THE R.F. ENGINE
ONGOING
RESEARCH-3
SIMILARITY AGGREGATION/HYBRID
QUERIES
TYPICALLY USED APPROACHES
BOOLEAN (AND, OR & NOT
OPERATORS)
EUCLIDEAN (MINKOWSKI W/
r=1)
WEIGHTED AVERAGE (WA)
i.e. SUPERVECTORS
DISADVANTAGES
ONGOING
RESEARCH-4
SIMILARITY AGGREGATION/HYBRID
QUERIES
FUZZY AGGREGATION OF
DECISIONS
USE A FUZZY MEMBERSHIP
FUNCTION m(d) OVER THE
SIMILARITY DISTANCE d
TO ‘FUZZIFY’ DISTANCES &
GENERATE A ‘FUZZY DECISION’
ONGOING
RESEARCH-5
DISTRIBUTED MULTIMEDIA INDEXING
INDEXES USUALLY
CENTRALIZED
ENTIRE SYSTEM FAILS IF
COMPONENT FAILS
NO GRACEFUL PERFORMANCE
DEGRADATION
HIGH DATA VOLUME = HIGH
SYSTEM REQ’S
ONGOING
RESEARCH-6
DISTRIBUTED MULTIMEDIA INDEXING
SMALL WORLD INDEXING
MODEL [1]
SOCIOLOGICAL PEER
DESCRIPTIONS
WE ARE NOT BLIND TO WHO
OUR PEERS ARE
PEOPLE KEEP MEMORY OF
THEIR PEERS
[1] P. Androutsos, D. Androutsos and A. N. Venetsanopoulos, “A distributed fault-tolerant MPEG-7 retrieval scheme based on small world
theory”, Distributed Media Technologies and Applications Special Issue of IEEE Transactions on Multimedia, under review.
RESEARCH
AVENUES-1
HYBRID QUERIES &
AGGREGATION
WHAT DO WEIGHTS MEAN?
HOW TO CHOOSE?
ALTERNATIVE
AGGREGATIONS METHODS
ADAPTIVE SCHEMES USING
REL. FEEDBACK
RESEARCH
AVENUES-2
PERCEPTUAL ISSUES
EMPHASIS OF DOMINATING
FEATURES
FEATURE MASKING
EMOTIONAL INDEXING/
ALL USERS DIFFERENT–
CUSTOMIZED PROFILE
RESEARCH
AVENUES-3
DISTRIBUTED INDEXING
DISTRIBUTED INDEXES &
RETRIEVAL
INDEX SYNCHRONIZATION
RESULTS ORGANIZATION &
RANKING
SWIM OVERHEAD
ESTIMATION
SUMMARY-1
MULTIMEDIA PROCESSING
RESULTS FROM MULTIMEDIA
EXPLOSION
USERS DEMANDING MORE
FROM DEVICES
DEVICES ARE CONVERGING
CONTENT BASED IMAGE
RETRIEVAL
Introduction
As Imaging systems evolved in
complexity and openness, the need for
device-independent image measures
became clear.
It was quickly recognized that device-dependent color coordinates (such as
monitor RGB and printer CMYK) could
not be used to specify and reproduce
color images with accuracy and
precision.
Colorimetry
Device-independent color
measurements are based on the
internationally-standardized CIE system
of colorimetry first developed in 1931.
CIE colorimetry specifies a color
stimulus with numbers proportional to
the stimulation of the human visual
system independent of how the color
stimulus was produced.
Outline
Introduction :
Colorimetry
Color / Image Difference
Color / Image Appearance Model
The iCAM framework
Input Images
First Stage: Chromatic Adaptation (Color
Appearance)
Second Stage: Appearance Attributes
Third Stage: Spatial Filtering (Image
Difference)
Color Difference Eqs
A color difference equation allows for
the mapping of physically measured
stimuli into perceived differences.
CIELAB and CIELUV (ΔE*ab)
CIE94 and CIEDE2000
Image Difference
The CIE color difference formulas were
developed using simple color patches in
controlled viewing condition. There is no
reason to believe that they are
adequate for predicting color difference
for spatially complex image stimuli.
S-CIELAB applies contrast sensitivity
functions (CSFs).
Image Difference
The CSF serves to remove information
that is imperceptible to the visual
system. For instance, when viewing
dots at a certain distance the dots tend
to blur, and integrate into a single color.
Color Appearance Model
CIE colorimetry is only strictly applicable
to situations in which the original and
reproduction are viewed in identical
conditions.
Color appearance model developed to
predict color in different viewing
conditions.
Image Appearance Model
Color appearance models account for
many changes in viewing condition, but
they do not incorporate any of the
spatial or temporal properties of human
vision and the perception of images.
One model for still images, referred as
iCAM, has recently been published by
Fairchild and Johnson.
2. Color Appearance
Image courtesy of John McCann
Color Appearance
More than a single color:
Adjacent colors (background)
Viewing environment (surround)
Appearance effects:
Adaptation
Simultaneous contrast
Spatial effects
(Color Appearance Models, Mark Fairchild)
Light/Dark Adaptation
Adjust to overall brightness
7 decades of dynamic range
100:1 at any particular time
Absolute illumination effects
Hunt effect
Higher brightness increases
colorfulness
Stevens effect
Higher brightness increases contrast
Chromatic Adaptation
Change in illumination (e.g. daylight vs.
tungsten)
Cones “white balance”: scale cone
sensitivities (von Kries)
Also cognitive effects
Creates a unique white
From Color Appearance Models, fig 8-1
Simultaneous Contrast
“After image” of background
adds to the color
Reality is more complex
Affects Lightness Scale
Effect of Spatial Frequency
Smaller = less saturated
The paint chip problem
Color image perception
S-CIELAB
Redrawn from Foundations of Vision, fig 6
© Brian Wandell, Stanford University
Keep it as simple as possible but
not simpler.
Albert Einstein
R-trees
B-trees in multiple dimensions
Spatial object represented by its MBR
Minimum Bounding Rectangle
R-trees
Nonleaf
nodes
<ptr, R>
ptr – pointer to a child node
R – MBR covering all rectangles
in the child node
Leaf
nodes
<obj-id, R>
R-trees
Algorithms
Insert
Find the most suitable leaf node
Possibly, extend MBRs in parent
nodes to enclose the new object
Leaf node overflow → split
Split: heuristics-based
R-trees
Range queries
Traverse the tree
Compare query MBR with the current
node’s MBR
Nearest neighbor
Branch and bound:
Traverse the most promising sub-tree
find neighbors
R-trees
Spatial joins
”find intersecting objects”
Naïve method:
Build a list of pairs of intersecting
MBRs
Examine each pair, down to leaf
level
(Faster methods exist)
Variants
R+-tree
(Sellis et al 1987)
Avoids overlapping rectangles in
internal nodes
R*-tree
(Beckmann et al 1990)
Applications
Spatial databases
Text retrieval
Multimedia retrieval
Color correction
Does Memorization =
Learning?
Test #1: Thomas learns his mother’s
face
Memorizes:
But will he recognize:
Thus he can generalize beyond what he’s seen!
Does Memorization =
Learning? (cont’d)
Test #2: Nicholas learns about trucks & combines
Memorizes:
But will he recognize others?
So learning involves ability to generalize from labeled examples
(in contrast, memorization is trivial, especially for a computer)
Some examples
That is all, folks…
Thank you for your patience!
Questions
Good Luck!