Outline

Research Directions
 Web Databases (will discuss)
 Mobile MM systems
  Pull Access
  Push Access
 Watermarking and Steganography (Rani Hoitash)
1
Web Databases

WWW
 a collection of HTML documents
  text, images, forms, tables
 information is distributed
 There is no unifying schema
 Information duplication
 Semi-structured data
 Retrieval is costly and not guaranteed
  optimization is important
  • use of similarity values
  • use of feedback
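The two optimization ideas above can be illustrated with a minimal sketch. It assumes bag-of-words vectors, cosine similarity as the "similarity value", and a simple additive feedback step; the slides name none of these specifically, so the function names, the `weight` parameter, and the sample documents are all hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def apply_feedback(query, relevant_docs, weight=0.5):
    """Assumed feedback step: add weighted terms from documents marked relevant."""
    updated = Counter(query)
    for doc in relevant_docs:
        for term, freq in doc.items():
            updated[term] += weight * freq / len(relevant_docs)
    return updated

def rank(query, docs):
    """Order document names by decreasing similarity to the query."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)

# Hypothetical documents, for illustration only.
docs = {
    "a": vectorize("hypertext links on the web"),
    "b": vectorize("image retrieval by color histogram"),
    "c": vectorize("web documents and hypermedia links"),
}
query = vectorize("web links")
ranked = rank(query, docs)

# The user marks document "c" as relevant; the query drifts toward it.
updated = apply_feedback(query, [docs["c"]])
reranked = rank(updated, docs)
```

Before feedback, documents "a" and "c" score the same for the query; after the user marks "c" as relevant, "c" moves ahead of "a". Document "b" shares no terms with the query and stays last.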
2
Searching

Search engines
 Yahoo
 Infoseek
 Excite
Maintain indices to the HTML pages
 document retrieval
 use of confidence values & ranking is essential
Problem:
 based on keywords
 Information overload
 No media search capability
 No complex search capability
  single page at a time
  web structure is ignored
  page structure is ignored
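A minimal sketch of the keyword index such engines maintain, assuming a plain inverted index and a score equal to the number of matching query keywords. The page URLs and the scoring rule are hypothetical, not taken from any of the engines listed.

```python
from collections import defaultdict

def build_index(pages):
    """Map each keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Rank pages by the number of distinct query keywords they contain."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken by URL to keep the order stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical pages, for illustration only.
pages = {
    "http://example.org/a": "multimedia databases on the web",
    "http://example.org/b": "web search engines",
    "http://example.org/c": "image watermarking",
}
index = build_index(pages)
results = search(index, "web databases")
```

The sketch also shows the limitations listed above: each page is scored on its own, and neither the links between pages nor the structure inside a page plays any role.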
3
MAVIS (Microcosm)

Content-based retrieval for non-textual data
 media-dependent features
 links are enriched by media-dependent signatures
 feasible in a small-scale hypermedia environment
Harder to implement at a larger scale:
 it is not realistic to expect authors to specify signatures
 it is not realistic to expect indexing tools (search engines) to extract this information
 The servers can do this: we can augment servers to extract such information off-line and annotate pages and links with signatures.
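The server-side idea can be sketched as an off-line pass that computes a signature for each image link. The sketch assumes a quantized color histogram as the signature and histogram intersection as the similarity measure; MAVIS's actual signatures are not described in these slides, and the link records, file names, and `load_pixels` callback are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Normalized histogram of quantized RGB colors, used here as a signature."""
    counts = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        bucket = (r // step) * bins * bins + (g // step) * bins + (b // step)
        counts[bucket] += 1
    total = len(pixels)
    return [c / total for c in counts] if total else counts

def histogram_intersection(sig_a, sig_b):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))

def annotate_links(links, load_pixels):
    """Off-line pass: attach a signature to every link that points at an image."""
    return [
        {**link, "signature": color_histogram(load_pixels(link["href"]))}
        for link in links
    ]

# Hypothetical images given as lists of (R, G, B) pixels.
images = {
    "red.png": [(250, 10, 10)] * 4,
    "dark_red.png": [(200, 20, 20)] * 3 + [(10, 10, 10)],
    "blue.png": [(5, 5, 240)] * 4,
}
links = [{"href": name} for name in images]
annotated = annotate_links(links, images.__getitem__)
signatures = {link["href"]: link["signature"] for link in annotated}

red_vs_dark = histogram_intersection(signatures["red.png"], signatures["dark_red.png"])
red_vs_blue = histogram_intersection(signatures["red.png"], signatures["blue.png"])
```

Because the signatures are computed once on the server, neither authors nor search engines have to extract them; a query only compares precomputed vectors.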
4
Other Systems

Image surfer (Yahoo)
 Manual categorization of images
 Histogram-based image retrieval (keyword)
 only images!!
Webseek (University of Chicago)
 Image retrieval
  keyword retrieval
  face recognition
 Only images!!!
Infoseek (RPI)
 A combined search engine (integrated)
  parallel execution of queries using multiple indices
  query translation
  result merging
 Keyword based
  does not provide complex query functionality
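The three steps of a combined search engine (query translation, parallel execution, result merging) can be sketched as follows. This is not the system named above: the two engines are hypothetical in-memory stand-ins with made-up query syntaxes and scores, and merging by best score is only one possible rule.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote engines, each with its own syntax and index.
def engine_plus(query):
    """Expects queries of the form '+term +term'."""
    index = {"+web +databases": [("http://example.org/a", 0.9),
                                 ("http://example.org/b", 0.4)]}
    return index.get(query, [])

def engine_and(query):
    """Expects queries of the form 'term AND term'."""
    index = {"web AND databases": [("http://example.org/b", 0.8),
                                   ("http://example.org/c", 0.6)]}
    return index.get(query, [])

# Each engine is paired with a function that translates the term list for it.
ENGINES = [
    (engine_plus, lambda terms: " ".join("+" + t for t in terms)),
    (engine_and, lambda terms: " AND ".join(terms)),
]

def metasearch(terms):
    """Translate the query per engine, run the engines in parallel, merge results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(engine, translate(terms))
                   for engine, translate in ENGINES]
        result_lists = [f.result() for f in futures]
    merged = {}
    for results in result_lists:
        for url, score in results:
            # Keep the best score when several engines return the same URL.
            merged[url] = max(score, merged.get(url, 0.0))
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))

merged = metasearch(["web", "databases"])
```

In practice, scores from different engines are rarely comparable, so a real merger would need some normalization. The interface is still keyword based, which is the limitation noted above.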
5
WebSQL (University of Toronto)

WWW is a table of documents
URL, title, date, etc. are treated as attributes
SQL is augmented to query such information
E.g.
 SELECT d.url, d.title, d.length, d.modif
 FROM Document d SUCH THAT d MENTIONS "hypertext"
 WHERE d.type = "text/html"
Find all documents of type HTML that mention hypertext; return the URL, title, length, and modification date of each such document.
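To make the "table of documents" view concrete, the query can be emulated over a small in-memory list. This is only a sketch: the `Document` fields mirror the attributes named in the query and its description, the sample rows are invented, and `mentions` is a plain substring test standing in for whatever index a real WebSQL engine would consult.

```python
from dataclasses import dataclass

@dataclass
class Document:
    url: str
    title: str
    text: str
    type: str     # MIME type, e.g. "text/html"
    length: int   # size in bytes
    modif: str    # last-modification date

def mentions(doc, keyword):
    """Assumed stand-in for MENTIONS: case-insensitive substring match."""
    return keyword.lower() in doc.text.lower()

# Hypothetical rows of the virtual Document table.
documents = [
    Document("http://example.org/h.html", "Hypertext basics",
             "An introduction to hypertext systems.", "text/html", 1200, "1997-03-01"),
    Document("http://example.org/h.ps", "Hypertext paper",
             "Hypertext models and query languages.", "application/postscript",
             88000, "1996-11-15"),
    Document("http://example.org/i.html", "Image retrieval",
             "Content-based retrieval of images.", "text/html", 900, "1997-01-20"),
]

# SELECT url, title, length, modif ... MENTIONS "hypertext" WHERE type = "text/html"
rows = [
    (d.url, d.title, d.length, d.modif)
    for d in documents
    if mentions(d, "hypertext") and d.type == "text/html"
]
```

Only the first document satisfies both conditions: the second mentions hypertext but is not HTML, and the third is HTML but does not mention hypertext.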
6