MediaView: A Semantic View Mechanism for Multimedia Systems


Multimedia Databases
Prepared by Pradeep Konduri
Instructor: Dr. Yingshu Li
Georgia State University
Plan of Attack
Introduction
 Architecture
 Image Content Analysis
 Modeling Constructs
 Logical Implementation
 Real-World Applications
 Conclusion

Types of multimedia data
Text: marked up in a standard language (SGML, HTML)
Graphics: encoded in CGM, PostScript
Images: bitmap, JPEG
Video: sequenced image data displayed at specified rates (e.g., MPEG)
Audio: aural recordings stored as a string of bits in digitized form

Nature of Multimedia Applications
Repositories: a central location for data, maintained by a DBMS and organized in storage levels
Presentations: delivery of audio and video data, which is stored only temporarily
Collaborative: complex design tasks, analyzing data

Management Issues
Modeling: complex objects, wide range of types
Design: still an open research area
Storage: representation, compression, buffering during I/O, mapping
Queries: existing techniques need to be modified
Performance: physical limitations, parallel processing

Research Problems
Information Retrieval in Queries: modeling the content of documents
Multimedia/Hypermedia Data Modeling and Retrieval: hyperlinks, as used in the WWW
Text Retrieval: use of a thesaurus

Multimedia Database Applications
Documentation and record keeping
Knowledge distribution
Education and training
Marketing, advertisement, entertainment, travel
Real-time control and monitoring

A Generic Architecture of MMDBMS
[Figure: generic MMDBMS architecture. Labeled components: query feature construction, feature extraction, indexing, MM DBMS, media object compression, search engine, feedback query construction. Labeled flows: query, result, feedback.]
Media organization: organize the extracted features for retrieval (i.e., index the features with effective structures)

Media query processing: an efficient search algorithm with a similarity function should be designed to work with the indexing structure
Multimedia Database Architecture
[Figure: multimedia database architecture. Labeled components: users, query interface, Multimedia DBMS, Multimedia Data Preprocessing System (MM data preprocessor), Database Processing. Labeled data: MM data instance, additional information (DTD-style declarations), recognized components, meta-data.]

Document Database Architecture
[Figure: document database architecture. Labeled components: users, query interface, Multimedia DBMS, Document Processing System (DTD parser, DTD manager, type generator, SGML/XML parser, instance generator), Database Processing. Labeled data: DTD files, SGML/XML documents, document parse tree, document content, C++ types, C++ objects.]
Image Database Architecture
[Figure: image database architecture. Labeled components: users, query interface, Multimedia DBMS, Image Processing System (image analysis and pattern recognition, image annotation), Database Processing. Labeled data: image, image content description, syntactic objects, semantic objects, meta-data.]
Image Content Analysis

Image content analysis can be categorized in two groups:
◦ Low-level features: vectors in a multidimensional space
   Color
   Texture
   Shape
◦ Mid- to high-level features: try to infer semantics
◦ Semantic gap
Image Content Analysis: Color

Color space:
◦ Multidimensional space
◦ A dimension is a color component
◦ Examples of color space: RGB, HSV
◦ RGB space: a color is a linear combination of 3 primary colors (red, green and blue)

Color quantization:
◦ Used to reduce the color resolution of an image

Three widely used color features:
◦ Global color histogram
◦ Local color histogram
◦ Dominant color
Color Histograms

Color histograms indicate color distribution without spatial information
◦ Color histogram distance metrics

[Figure: example color histogram. Vertical axis from 0 to 50; horizontal axis with bins Red, Orange, Yellow, Green, Blue, Indigo, Violet.]
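
The color features above can be illustrated with a minimal sketch: a global color histogram over quantized RGB values, compared with two common histogram measures (L1 distance and histogram intersection). The tiny pixel lists, the choice of 2 levels per channel, and all function names are assumptions made for illustration; the slides do not specify which metric is used.

```python
from collections import Counter

def quantize(pixel, levels=2):
    """Map an 8-bit RGB pixel to a coarse bin index (color quantization)."""
    r, g, b = (c * levels // 256 for c in pixel)
    return (r * levels + g) * levels + b

def global_histogram(pixels, levels=2):
    """Normalized histogram over quantized colors; pixel positions are ignored."""
    counts = Counter(quantize(p, levels) for p in pixels)
    n = len(pixels)
    return [counts.get(i, 0) / n for i in range(levels ** 3)]

def l1_distance(h1, h2):
    """Sum of absolute bin differences: 0 means identical distributions."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def intersection_similarity(h1, h2):
    """Histogram intersection: 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical 4-pixel images (illustrative data only).
red_image = [(255, 0, 0)] * 4
mixed_image = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]

h_red = global_histogram(red_image)
h_mixed = global_histogram(mixed_image)
print(l1_distance(h_red, h_mixed), intersection_similarity(h_red, h_mixed))
```

Because the histogram discards spatial information, any rearrangement of the same pixels yields the same feature vector, which is why local color histograms are listed as a separate feature.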
Image Content Analysis: Texture

Texture refers to visual patterns with properties of homogeneity that do not result from the presence of only a single color
Examples of texture: tree bark, clouds, water, bricks and fabrics
Texture features: contrast, uniformity, coarseness, roughness, frequency, density and directionality
Two types of texture descriptors:
◦ Statistical model-based
   Explores the gray-level spatial dependence of texture and extracts meaningful statistics as the texture representation
◦ Transform-based
   DCT transform, Fourier-Mellin transform, polar Fourier transform, Gabor and wavelet transforms
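
As a rough illustration of a statistical texture descriptor, the sketch below builds a gray-level co-occurrence matrix (one common way to capture gray-level spatial dependence) and derives two of the listed features, contrast and uniformity. The slides do not name a specific statistical model, so the co-occurrence approach, the horizontal offset, and the toy images are assumptions.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    glcm = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                glcm[image[y][x]][image[ny][nx]] += 1
    return glcm

def contrast(glcm):
    """Average squared gray-level difference between neighboring pixels."""
    total = sum(sum(row) for row in glcm)
    n = len(glcm)
    return sum((i - j) ** 2 * glcm[i][j] for i in range(n) for j in range(n)) / total

def uniformity(glcm):
    """Sum of squared co-occurrence probabilities (also called energy)."""
    total = sum(sum(row) for row in glcm)
    return sum((count / total) ** 2 for row in glcm for count in row)

# Hypothetical 2-level images: a flat patch and vertical stripes.
flat = [[0, 0, 0], [0, 0, 0]]
stripes = [[0, 1, 0], [0, 1, 0]]

for name, img in (("flat", flat), ("stripes", stripes)):
    glcm = cooccurrence_matrix(img, levels=2)
    print(name, contrast(glcm), uniformity(glcm))
```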
Image Content Analysis: Shape

Object segmentation
◦ Approaches:
   Global threshold-based approach
   Region growing
   Split-and-merge approach
   Edge detection approach
◦ Still a difficult problem in computer vision; perfect segmentation is generally hard to achieve
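
The simplest of the approaches listed, global thresholding, can be sketched as follows. The sketch also counts 4-connected foreground regions with a flood fill so that the segmentation result is visible as objects. The threshold value, the toy image, and the function names are illustrative assumptions.

```python
def threshold_segment(image, threshold):
    """Return a binary mask: 1 where the pixel is brighter than the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in image]

def count_regions(mask):
    """Count 4-connected foreground regions using an iterative flood fill."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    regions = 0
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] == 1 and (y, x) not in seen:
                regions += 1
                stack = [(y, x)]
                seen.add((y, x))
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] == 1 and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return regions

# Hypothetical gray-level image with two bright objects on a dark background.
image = [
    [200, 210, 10, 10],
    [205, 10, 10, 220],
    [10, 10, 10, 230],
]
mask = threshold_segment(image, 128)
print(mask, count_regions(mask))
```

A single global threshold only works when objects and background are well separated in intensity, which is one reason segmentation remains difficult in general.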
Salient Objects vs. Salient Points
Generic low-level description of images in terms of salient objects and salient points
[Figure: original images, segmented images with region boundaries, and extracted salient points.]
Modeling Images – Principles

Support for multiple representations of an image
Support for user-defined categorization of images
Well-defined set of operations on images
An image can have (semantic, functional, spatial) relationships with other images (or documents), which should be represented in the DBMS
An image is composed of salient objects (meaningful image components)
Salient Object Modeling
Multiple representations of a salient object (grid, vector) are allowed
A salient object O is of a particular type, which belongs to a user-defined hierarchy of salient object types
An image component (salient object) may have (semantic, functional, spatial) relationships with other salient objects

“Semantic Gap”
[Figure: the semantic gap. Non-semantic multimedia data models model raw data and primitive properties (size, format, etc.), whereas semantics-intensive multimedia systems and applications require the semantic meaning of the data. The semantic gap lies between the two.]
Semantic modeling of multimedia -- Why hard?

Context-dependency
◦ Semantics is not a static, intrinsic property
◦ The semantics of an object often depends on:
   the application or user who manipulates the object
   the role that the object plays
   other objects in the same “context”
Example: [Figure: an image with two context-dependent interpretations, labeled “Van Gogh’s paintings” and “flower”.]
Why hard? (cont.)

Modality-independency
◦ Media objects of different modalities may suggest similar or related semantic meanings
◦ Example: [Figure: a query whose results span three modalities, labeled image, video, and text. The text result reads:]
“Harry Potter has never been the star of a Quidditch team, scoring points while riding a broom far above the ground. He knows no spells, has never helped to hatch a dragon, and has never worn a cloak of invisibility.”
MediaView – A “Semantic Bridge”

An object-oriented view mechanism that bridges the semantic gap between multimedia systems and databases

Core concept – media view (MV)
◦ A customized context for semantic interpretation of media objects (text documents, images, video, etc.)
◦ Media views collectively constitute the conceptual infrastructure of a multimedia system or application
Architecture
[Figure: layered architecture. Multimedia systems sit on top of the MediaView mechanism (media view 1, media view 2, ..., media view n), which sits on top of an object-oriented database with external, conceptual, and internal schemas.]
Basic Concepts
An example…
[Figure with four panels: (a) Base Class, (b) Media View, (c) Base Schema, (d) View Schema. Recoverable labels include: Multimedia Object, Bitmap Image, JPEG Image, Video Clip, Audio Clip, Speech, Song, Text Document, and Artworks, connected by subclass, keyframe, and audio track links; the media view Impressionistic Artworks with labels Name, Type, Style, Artist, Color-Histogram, Wavelet-Texture, and Dominant-Shape; and the views Impressionistic Artworks, Realistic Artworks, Post-modern Artworks, Impressionistic Paintings, and Impressionistic Sculptures, connected by subview links.]
Basic Concepts
Semantics-based data reorganization via media views
[Figure: text, image, video, and audio objects regrouped into a media view.]
View Operators
A set of operators that take media views and view instances as operands.
The focus is on the operators that are indispensable for supporting queries and navigation over multimedia objects.


View Operators
type-level
V-overlap
syntax: <boolean> := v-overlap(<media view1>, <media view2>)
semantics: true, if and only if (∃ o ∈ O)(o ∈ extent(<media view1>) and o ∈ extent(<media view2>))
Cross
syntax: {<object>} := cross(<media view1>, <media view2>)
semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) and o ∈ extent(<media view2>)}
Sum
syntax: {<object>} := sum(<media view1>, <media view2>)
semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) or o ∈ extent(<media view2>)}
Subtract
syntax: {<object>} := subtract(<media view1>, <media view2>)
semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) and o ∉ extent(<media view2>)}
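
Read as set operations on view extents, the type-level operators map directly onto Python sets. The sketch below is one possible reading, not the MediaView implementation; the `MediaView` class, the object identifiers, and the name `sum_views` (chosen to avoid shadowing Python's built-in `sum`) are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class MediaView:
    """Hypothetical stand-in for a media view: a name plus its extent."""
    name: str
    members: set = field(default_factory=set)  # object identifiers

def extent(mv):
    return mv.members

def v_overlap(mv1, mv2):
    """True iff some object belongs to both extents."""
    return bool(extent(mv1) & extent(mv2))

def cross(mv1, mv2):
    """Objects in both extents."""
    return extent(mv1) & extent(mv2)

def sum_views(mv1, mv2):
    """Objects in either extent."""
    return extent(mv1) | extent(mv2)

def subtract(mv1, mv2):
    """Objects in the first extent but not in the second."""
    return extent(mv1) - extent(mv2)

# Illustrative views mixing modalities (image, video, text identifiers).
paintings = MediaView("paintings", {"img1", "img2", "vid1"})
impressionism = MediaView("impressionism", {"img2", "txt1"})

print(v_overlap(paintings, impressionism))  # the two views share img2
print(cross(paintings, impressionism))
print(subtract(paintings, impressionism))
```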

View Operators
instance-level
Class
syntax: <base class> := class(<view instance>)
semantics: <view instance> is an instance of <base class>
Components
syntax: {<object>} := components(<view instance>)
semantics: {<object>} := {o ∈ O | o is a component (direct or indirect) of <view instance>}
I-overlap
syntax: <boolean> := i-overlap(<view instance1>, <view instance2>)
semantics: true, if and only if (∃ o ∈ O)(o ∈ components(<view instance1>) and o ∈ components(<view instance2>))
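
A minimal sketch of the instance-level operators, assuming each view instance records its base class and a list of direct components. The `MediaObject` class, the name `class_of` (since `class` is a Python keyword), and the sample objects are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    """Hypothetical view instance: identifier, base class, direct components."""
    oid: str
    base_class: str
    parts: list = field(default_factory=list)

def class_of(instance):
    """Base class that the view instance is an instance of."""
    return instance.base_class

def components(instance):
    """Identifiers of all direct and indirect components."""
    found = set()
    stack = list(instance.parts)
    while stack:
        obj = stack.pop()
        if obj.oid not in found:
            found.add(obj.oid)
            stack.extend(obj.parts)  # descend to indirect components
    return found

def i_overlap(a, b):
    """True iff the two instances share at least one component."""
    return bool(components(a) & components(b))

# Illustrative objects: a video clip and a document that reuse one keyframe.
keyframe = MediaObject("kf1", "Image")
track = MediaObject("au1", "AudioClip")
clip = MediaObject("vid1", "VideoClip", [keyframe, track])
report = MediaObject("doc1", "TextDocument", [keyframe])

print(class_of(clip), components(clip), i_overlap(clip, report))
```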
View Algebra

Functions
-- derivation of new MVs from existing MVs
Heuristic Enumeration
1. Blind enumeration
2. Content-based enumeration
3. Semantics-based enumeration

View Algebra

Algebra Operators
◦ select from src-MV where <predicate>
◦ project <property-list> from src-MV
◦ intersect (src-MV1, src-MV2)
◦ union (src-MV1, src-MV2)
◦ difference (src-MV1, src-MV2)
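
The five algebra operators can be sketched over a simple representation in which a media view is a dictionary from object identifier to a property dictionary. This representation, the conflict rule in `union`, and the sample data are assumptions for illustration only.

```python
def select(src, predicate):
    """select from src-MV where <predicate>: keep members whose properties match."""
    return {oid: props for oid, props in src.items() if predicate(props)}

def project(src, property_list):
    """project <property-list> from src-MV: keep only the listed properties."""
    return {oid: {k: v for k, v in props.items() if k in property_list}
            for oid, props in src.items()}

def intersect(a, b):
    return {oid: a[oid] for oid in a if oid in b}

def union(a, b):
    # Assumption: when both views hold the same object, a's properties win.
    return {**b, **a}

def difference(a, b):
    return {oid: a[oid] for oid in a if oid not in b}

# Hypothetical source view.
artworks = {
    "img1": {"type": "painting", "style": "impressionistic"},
    "img2": {"type": "painting", "style": "realistic"},
    "vid1": {"type": "sculpture", "style": "impressionistic"},
}

impressionistic = select(artworks, lambda p: p["style"] == "impressionistic")
paintings = select(artworks, lambda p: p["type"] == "painting")
impressionistic_paintings = intersect(impressionistic, paintings)
print(sorted(impressionistic_paintings))
```

Each operator returns a new view rather than modifying its source, which matches the idea of deriving new MVs from existing ones.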
Comparison (vs. class)

aspect | media view | object class
membership | heterogeneous objects | uniform objects
member acquisition | dynamic inclusion/exclusion of existing objects of other classes | creating new objects
mapping | one object can belong to multiple media views | one object has exactly one class
relationship | inter-member semantic relationship | N/A
Comparison (vs. traditional object view)

aspect | media view | object view
membership | heterogeneous objects | uniform objects
relationship | inter-member semantic relationship | N/A
member properties | instance-level properties (user-defined) | inherited or derived properties (for view instances)
global properties | MV-level properties (user-defined) | N/A
Logical Implementation
MediaView Construction
 MediaView Customization
 MediaView Evolution

MediaViews Construction

Work with content-based image retrieval (CBIR) systems to acquire knowledge from queries
◦ Learn from previously performed queries
◦ A multi-system approach to support the multimodality of media objects

Organize the semantics by following WordNet
Why WordNet?

Queries can vary greatly because users are free to choose their own query keywords

We need an approach to organize this knowledge into a logical structure
◦ A simple “context”: a concept in WordNet
◦ Common media views: correspond to simple contexts
◦ We provide all common media views, on which users can build complex ones
Navigating the Multimedia Database

Navigating via semantic relationships of WordNet

Semantic Relationship | Examples
Synonymy (similar) | pipe, tube
Antonymy (opposite) | fast, slow
Hyponymy (subordinate) | tree, plant
Meronymy (part) | chimney, house
Troponymy (manner) | march, walk
Entailment | drive, ride
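
Navigation along these relationships can be sketched with a small hand-built link table that stands in for WordNet. The sketch uses only the example pairs from the table above; it does not call any real WordNet API, and the direction of each link (first word to second word) and the function names are assumptions.

```python
from collections import defaultdict

# Example pairs copied from the table: (relationship, source concept, target concept).
EXAMPLE_PAIRS = [
    ("synonymy", "pipe", "tube"),
    ("antonymy", "fast", "slow"),
    ("hyponymy", "tree", "plant"),
    ("meronymy", "chimney", "house"),
    ("troponymy", "march", "walk"),
    ("entailment", "drive", "ride"),
]

def build_links(pairs):
    """Index the pairs by source concept for quick lookup."""
    links = defaultdict(list)
    for relation, source, target in pairs:
        links[source].append((relation, target))
    return links

def navigate(links, concept, relation=None):
    """Concepts (media views) reachable from `concept`, optionally by one relationship."""
    return [target for rel, target in links.get(concept, [])
            if relation is None or rel == relation]

links = build_links(EXAMPLE_PAIRS)
print(navigate(links, "tree"))                 # follows the hyponymy link
print(navigate(links, "chimney", "meronymy"))  # follows only part-of links
```

If each concept corresponds to a common media view, following a link is equivalent to moving the user from one media view to a semantically related one.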
Navigating the Multimedia Database
[Figure: a user browses the multimedia database through MediaView 1, MediaView 2, MediaView 3, and MediaView 4, which are linked by semantic relationships in WordNet.]
MediaViews Construction
[Figure: MediaView construction. Labeled components: users, CBIR systems for video, image, and text, the multimedia database, and the MediaView engine. Labeled flows: issue query, results, user feedback, system feedback.]
MediaView Customization

Two-level MediaView framework
◦ Basic MediaView: simple context
◦ Customized MediaView: advanced context
MediaView Customization

Dynamically construct complex-context-based media views from simple ones
◦ An example complex context: “the Grand Hall in City University”

Several user-level operators are devised to support more complex or advanced contexts, in addition to the basic operators
User-level Operators
INHERIT_MV(N: mv-name, NS: set-of-mv-refs, VP: set-of-property-ref, MP: set-of-property-ref): mv-ref
UNION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref
INTERSECTION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref
DIFFERENCE_MV(N1: mv-ref, N2: mv-ref): mv-ref

Build a MediaView in Run-time

[Figure: a multimedia document with Topic 1, Topic 2, and Topic 3 is built from Media View 1; the legend distinguishes text, image, sound, and video objects.]
Example: find out information about "Van Gogh"
◦ Who is "Van Gogh"?
◦ What is his work?
◦ Know more about his whole life.
◦ Know more about his country.
◦ See his famous painting "sunflower"
Build a MediaView in Run-time

Who is “Van Gogh”?
◦ Set vg = INHERIT_MV(“V. Gogh”, {<painter>}, name=“Van Gogh”, );

What is his work?
◦ Set vg_work = INTERSECTION_MV(“work”, {<painting>, vg});

Know more about his whole life.
◦ INTERSECTION_MV(“life”, {<biography>, vg});

Know more about his country.
◦ INTERSECTION_MV(“country”, {<country>, vg});

See his famous painting “sunflower”
◦ Set sunflower = INTERSECTION_MV(“sunflower”, {<sunflower>, <painting>});
◦ Set vg_sunflower = INTERSECTION_MV(“vg_sunflower”, {vg_work, sunflower});
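
The Van Gogh walk-through can be mimicked with a small Python sketch. The slides give only operator signatures, so the semantics assumed here are guesses made for illustration: `inherit_mv` collects the members of its source views and keeps those whose properties match the given values, and `intersection_mv` keeps members present in every source view. The view contents and object identifiers are invented sample data.

```python
def make_view(name, members):
    """Hypothetical media view: a name plus a dict of object id -> properties."""
    return {"name": name, "members": dict(members)}

def inherit_mv(name, sources, member_filter=None):
    """Assumed semantics: inherit members of the sources, optionally filtered by property values."""
    members = {}
    for src in sources:
        members.update(src["members"])
    if member_filter:
        members = {oid: props for oid, props in members.items()
                   if all(props.get(k) == v for k, v in member_filter.items())}
    return make_view(name, members)

def intersection_mv(name, sources):
    """Assumed semantics: keep only objects that appear in every source view."""
    first, *rest = sources
    common = set(first["members"])
    for src in rest:
        common &= set(src["members"])
    return make_view(name, {oid: first["members"][oid] for oid in common})

# Invented common media views (stand-ins for <painter>, <painting>, <sunflower>).
painter = make_view("painter", {
    "img1": {"name": "Van Gogh"},
    "txt1": {"name": "Van Gogh"},
    "img2": {"name": "Monet"},
})
painting = make_view("painting", {"img1": {}, "img2": {}})
sunflower_view = make_view("sunflower", {"img1": {}, "img3": {}})

vg = inherit_mv("V. Gogh", [painter], {"name": "Van Gogh"})
vg_work = intersection_mv("work", [painting, vg])
sunflower = intersection_mv("sunflower", [sunflower_view, painting])
vg_sunflower = intersection_mv("vg_sunflower", [vg_work, sunflower])
print(sorted(vg["members"]), sorted(vg_sunflower["members"]))
```

Each step narrows the context: from everything about the painter, to his paintings, to the one painting that is also in the sunflower view.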
Authoring Scenario

The user creates a new media view named after the subject
◦ All multimedia materials used in the document are put into this media view for further reference.

To collect the most relevant materials for authoring, the user performs the MediaView building process.
◦ Import suitable media objects by browsing media views

To find references for the manner and style of authoring, the user looks for other media views on similar topics.
◦ Drag & drop
◦ “Learning from references”
Summary

Types of multimedia data: text, audio, video, images
Management issues: design, storage, modeling, queries
Image content analysis: color, texture, shape
MediaView – a semantic multimedia database modeling mechanism
◦ Bridges the semantic gap between conventional databases and semantics-intensive multimedia applications
A set of user-level operators accommodates the specialization/generalization relationships among the media views
Summary (contd.)

MediaView promises more effective access to the content of media databases
◦ Users can get the right content and easily tailor it to the context of their application.
By providing the most relevant content from pre-learnt semantic links between media and context, high-performance database browsing and multimedia authoring tools can enable more comprehensive applications for the user.
Users can customize specific media views according to their tasks by using the user-level operators
Further Issues
The development and transition of MediaView into a fully fledged multimedia database system supporting “declarative” queries
Intensive and extensive performance studies
Advanced semantic relations (e.g., temporal and spatial ones) can also be incorporated when combining individual media views

Thank you!
Q & A
Email: [email protected]