The InfoQuilt Project
THE INFOQUILT VISION
• Semantic interoperability between systems, sharing knowledge using multiple ontologies
• Logical correlation of information
• Media independent information processing
REALIZATION OF THE VISION
• fully distributed, adaptable, agent-based system
• information/knowledge management supported by collaborative processes
http://lsdis.cs.uga.edu/proj/iq/iq.html
InfoQuilt Project: using the Metadata REFerence link (MREF)
• Complements HREF, creating a “logical web” through media independent ontology & metadata based correlation
• It is a description of the information asset we want to retrieve
Semantic Correlation using MREF
[Figure labels: MREF; constraints, relations, attributes; domain ontologies; IQ_Asset ontology + extension ontologies; keywords; content attributes (color, scene cuts, …)]
MREF Concept: model for logical correlation using ontological terms and metadata
Framework for representing MREFs: RDF
Serialization (one implementation choice): XML
Domain Specific Correlation – example
Potential locations for a future shopping mall identified by all regions having a population greater than 5000, and area greater than 50 sq. ft. having an urban land cover and moderate relief <A MREF ATTRIBUTES(population > 5000; area > 50; region-type = ‘block’; land-cover = ‘urban’; relief = ‘moderate’)>can be viewed here</A>
domain specific metadata: terms chosen from domain specific ontologies
=> media-independent relationships between domain specific metadata: population, area, land cover, relief
=> correlation between image and structured data at a higher domain specific level, as opposed to physical “link chasing” in the WWW
[Figure labels: Population, Area, Boundaries: Regions (SQL); Land cover, Relief: Image Features, Boundaries (image processing routines); sources: Census DB, TIGER/Line DB, US Geological Survey]
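The attribute constraints in the MREF above can be read as a filter over metadata that has already been extracted from each source. The following is only a minimal sketch of that idea in Python: the constraint table copies the attribute names and thresholds from the slide, but the functions `satisfies` and `correlate`, the region identifiers, and the sample records are hypothetical and are not part of InfoQuilt.

```python
import operator

# Constraint table mirroring the MREF ATTRIBUTES clause on the slide.
# The (name, comparison, value) layout is an assumption made for this sketch.
CONSTRAINTS = [
    ("population", operator.gt, 5000),
    ("area", operator.gt, 50),
    ("region-type", operator.eq, "block"),
    ("land-cover", operator.eq, "urban"),
    ("relief", operator.eq, "moderate"),
]


def satisfies(metadata, constraints=CONSTRAINTS):
    """Return True if every constraint holds; a missing attribute counts as a failure."""
    return all(
        name in metadata and compare(metadata[name], value)
        for name, compare, value in constraints
    )


def correlate(structured, image_derived):
    """Merge per-region metadata from two sources and keep the regions that match."""
    matches = []
    for region_id in sorted(structured.keys() & image_derived.keys()):
        merged = {**structured[region_id], **image_derived[region_id]}
        if satisfies(merged):
            matches.append(region_id)
    return matches


# Made-up sample records, for illustration only.
census = {
    "r1": {"population": 7200, "area": 64, "region-type": "block"},
    "r2": {"population": 3100, "area": 80, "region-type": "block"},
    "r3": {"population": 9000, "area": 55, "region-type": "block"},
}
imagery = {
    "r1": {"land-cover": "urban", "relief": "moderate"},
    "r2": {"land-cover": "urban", "relief": "moderate"},
    "r3": {"land-cover": "forest", "relief": "steep"},
}

print(correlate(census, imagery))
```

The point of the sketch is that the constraints never mention where an attribute came from: population could come from a database query and land cover from image processing, and the filter treats both alike.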
A DL II approach for Information Brokering
[Figure labels: CONSTRUCTING APPROPRIATE INFORMATION LANDSCAPES (Iscape 1 … Iscape N); CONSTRUCTING ADDITIONAL META-INFORMATION RESOURCES (Domain Specific Ontologies, Domain Independent Ontologies); DISCOVERING COLLECTIONS OF HETEROGENEOUS INFORMATION AND META-INFORMATION RESOURCES (Images, Data Stores, Documents, Digital Media); Physical/Simulation World]
ADEPT Information Landscape Concept Prototype
(a scenario for Digital Earth: learning in the context of the “El Niño” phenomenon)
Sample Iscapes Requests:
– How does El Niño affect sea animals? Look for broadcast videos of less than 2 minutes.
– How are some regions affected by El Niño? Look at East/West Pacific regions.
– What disasters have been related to El Niño?
– What storm occurrences are attributed to El Niño?
– Show reports related to El Niño that contain Clinton.
[Figure callouts: request information using keywords, domain-specific attributes, domain-independent attributes]
TRY ISCAPE CONCEPT DEMO
Putting MREFs to work
[Figure labels: User; User Agent; MREF Builder (construct new MREF); IQ_Asset ontology + extension ontologies; domain ontologies; User profiles; Profile Manager; Broker Agent; MREF repository (shown twice)]
Context: the lynchpin of semantics
Cricket
“For instance, if you were to use Yahoo! or Infoseek to search the web for pizza, your results would probably be hundreds of matches for the word pizza. Many of these could be pizza parlors around the world. Yet if you run the same search within NeighborNet, you will allows you to order pizza to be delivered instead of shipped.”
From a Press Release of FutureOne, Inc. March 24, 1999
http://home.futureone.com/about/pr/021699.asp
Constructing c-contexts from ontological terms
C-CONTEXT:
C-Context = <(C1, V1) (C2, V2) ... (Ck, Vk)>
a collection of contextual coordinates Ci (roles) and values Vi (concepts/concept descriptions)
DATABASE OBJECTS
AGENCY(RegNo, Name, Affiliation)
DOC(Id, Title, Agency)
ONTOLOGICAL TERMS
Agency Concept
Document Concept
“All documents stored in the database have been published by some agency”
=> Cdef(DOC) = <(hasOrganization, AgencyConcept)>
Advantages:
• Use of ontologies for an intensional domain specific description of data
• Representation of extra information
• Relationships between objects not represented in the database schema
• Using terminological relationships in the ontology
Using c-contexts to reason about information in database
EXAMPLE
Cdef(DOC) = <(hasOrganization, AgencyConcept)>
CQ = <(hasOrganization, {“USGS”})>
glb(Cdef(DOC), CQ) = <(self, DocumentConcept), (hasOrganization, {“USGS”})>
- Reasoning with c-contexts: glb(Cdef(DOC), CQ)
- Ontological Inferences:
  - DocumentConcept
  - (hasOrganization, {“USGS”})
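The glb step above can be sketched in Python by treating a c-context as a mapping from contextual coordinates to values and combining two contexts coordinate by coordinate. This is a simplified illustration, not the algorithm used in the system: the `INSTANCES` table, the extra agency names in it, and the choice to write the `self` coordinate explicitly into the definition context are all assumptions made so the example is self-contained.

```python
# Hypothetical membership table: which instances belong to which concept.
INSTANCES = {"AgencyConcept": {"USGS", "NASA", "NOAA"}}


def glb_value(v1, v2):
    """Most specific value implied by both v1 and v2.

    A value is either a concept name (str) or a set of instances.
    """
    if v1 == v2:
        return v1
    if isinstance(v1, str) and isinstance(v2, str):
        # Comparing two different concepts needs a subsumption hierarchy,
        # which this sketch deliberately leaves out.
        raise ValueError("concept subsumption is not modelled in this sketch")
    if isinstance(v1, str):
        return frozenset(v2) & INSTANCES[v1]
    if isinstance(v2, str):
        return frozenset(v1) & INSTANCES[v2]
    return frozenset(v1) & frozenset(v2)


def glb(c1, c2):
    """Combine two c-contexts coordinate by coordinate."""
    result = dict(c1)
    for coordinate, value in c2.items():
        if coordinate in result:
            result[coordinate] = glb_value(result[coordinate], value)
        else:
            result[coordinate] = value
    return result


# Definition context of DOC, with the `self` coordinate made explicit (assumption).
cdef_doc = {"self": "DocumentConcept", "hasOrganization": "AgencyConcept"}
# Query context: documents whose organization is USGS.
cq = {"hasOrganization": frozenset({"USGS"})}

combined = glb(cdef_doc, cq)
print(combined)
```

Here the shared coordinate `hasOrganization` is narrowed from the whole agency concept to the single instance named in the query, which mirrors the result shown on the slide.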
Challenge 1: use of multiple ontologies
Challenge 2: estimating the loss of information
Estimating information loss for multi-ontology based
query processing in the OBSERVER/InfoQuilt system
OBSERVER architecture
[Figure labels: USER NODE (User Query, Query Processor, Ontology Server, Mappings, Ontologies, Data Repositories); IRM NODE (IRM, Interontologies Terminological Relationships); two COMPONENT NODEs (each with Ontology Server, Query Processor, Mappings, Ontologies, Data Repositories)]
Eduardo Mena (III’98)
Query construction - Example
“Get title and number of pages of books written by Carl Sagan”
User ontology: WN
[name pages] for
(AND book (FILLS creator “Carl Sagan”))
Target ontology: Stanford-I
Integrated ontology WN-Stanford-I
[title number-of-pages] for
(AND book (FILLS doc-author-name “Carl Sagan”))
Ontologies sites: http://www.cogsci.princeton.edu/~wn/w3wn.html
http://www-ksl.stanford.edu/knowledge-sharing/ontologies/html/bibliographic-data/
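The rewriting from the user ontology to the target ontology can be sketched as a term-by-term substitution over the query expression. The Python below is a minimal illustration under assumptions: the term correspondences are read off the two queries on the slide, while the nested-tuple encoding and the helper names `translate` and `render` are hypothetical and not taken from OBSERVER.

```python
# Term correspondences read off the slide (WN term -> Stanford-I term).
WN_TO_STANFORD = {
    "book": "book",
    "creator": "doc-author-name",
    "name": "title",
    "pages": "number-of-pages",
}


def translate(expr, mapping):
    """Rewrite ontology terms in a nested-tuple expression.

    Operators (the first element of each tuple) are kept as they are.
    Anything without a mapping entry, such as a literal, passes through unchanged.
    """
    if isinstance(expr, tuple):
        head, *args = expr
        return (head, *(translate(arg, mapping) for arg in args))
    return mapping.get(expr, expr)


def render(expr):
    """Print a nested-tuple expression in the parenthesised form used on the slide."""
    if isinstance(expr, tuple):
        return "(" + " ".join(render(part) for part in expr) + ")"
    return expr


user_projection = ["name", "pages"]
user_query = ("AND", "book", ("FILLS", "creator", '"Carl Sagan"'))

target_projection = [WN_TO_STANFORD[term] for term in user_projection]
target_query = translate(user_query, WN_TO_STANFORD)

print(target_projection, "for", render(target_query))
```

A real translation also has to handle terms with no counterpart in the target ontology, which is where the plans and the loss estimates on the later slides come in.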
Re-use of Knowledge: Bibliography Data Ontology (Stanford-I)
[Figure: concept hierarchy; concepts shown: Biblio-Thing, Conference, Document, Agent, Person, Book, Author, Organization, Technical-Report, Publisher, Miscellaneous-Publication, University, Proceedings, Edited-Book, Thesis, Technical-Manual, Periodical-Publication, Cartographic-Map, Computer-Program, Doctoral-Thesis, Journal, Newspaper, Multimedia-Document, Artwork, Master-Thesis, Magazine]
Re-use of Knowledge: A subset of WordNet 1.5
[Figure: concept hierarchy; concepts shown: Print-Media, Journalism, Press, Publication, Newspaper, Magazine, Periodical, Book, Trade-Book, Brochure, Pictorial, TextBook, SongBook, Reference-Book, CookBook, Instruction-Book, HandBook, Series, PrayerBook, WordBook, Journals, Encyclopedia, Directory, Annual, GuideBook, Manual, Bible, Instructions, Reference-Manual]
Estimating the loss of information
• To choose the plan with the least loss
• To present a level of confidence in the answer
• Based on intensional information (terminological difference)
• Based on extensional information (precision and recall)
Plans in the example
User Query: (AND book (FILLS doc-author-name “Carl Sagan”))
Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”))
Plan 2: (AND periodical-publication (FILLS doc-author-name “Carl Sagan”))
Plan 3: (AND journal (FILLS doc-author-name “Carl Sagan”))
Plan 4: (AND UNION(book, proceedings, thesis, misc-publication, technical-report) (FILLS doc-author-name “Carl Sagan”))
Loss of information based on intensional information
User Query: (AND book (FILLS doc-author-name “Carl Sagan”))
Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”))
book := (AND publication (AT-LEAST 1 ISBN))
publication := (AND document (AT-LEAST 1 place-of-publication))
Loss: “Instead of books written by Carl Sagan, OBSERVER is providing all the documents written by Carl Sagan (even if they do not have an ISBN and place of publication)”
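The intensional loss statement above can be derived mechanically by walking the term definitions from the requested term up to the substituted one and collecting the constraints dropped along the way. The Python sketch below assumes each definition has the simple shape "parent plus one constraint" seen on the slide; the `DEFINITIONS` table layout and the function `dropped_constraints` are hypothetical names introduced for illustration.

```python
# Definitions copied from the slide: term -> (parent term, extra constraint).
DEFINITIONS = {
    "book": ("publication", "(AT-LEAST 1 ISBN)"),
    "publication": ("document", "(AT-LEAST 1 place-of-publication)"),
}


def dropped_constraints(requested, substitute, definitions=DEFINITIONS):
    """Constraints lost when `requested` is replaced by its ancestor `substitute`."""
    lost = []
    term = requested
    while term != substitute:
        if term not in definitions:
            raise ValueError(f"{substitute!r} is not an ancestor of {requested!r}")
        term, constraint = definitions[term]
        lost.append(constraint)
    return lost


lost = dropped_constraints("book", "document")
print("Plan 1 returns documents that may lack:", ", ".join(lost))
```

For Plan 1 this yields the two constraints named in the loss message: the ISBN requirement from the definition of book and the place-of-publication requirement from the definition of publication.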
Example: loss for the plans
Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”)) [case 2]
91.57% < (1-Loss) < 91.75%
Plan 2: (AND periodical-publication (FILLS doc-author-name “Carl Sagan”)) [case 3]
94.03% < (1-Loss) < 100%
Plan 3: (AND journal (FILLS doc-author-name “Carl Sagan”)) [case 3]
98.56% < (1-Loss) < 100%
Plan 4: (AND UNION(book, proceedings, thesis, misc-publication, technical-report) (FILLS doc-author-name “Carl Sagan”)) [case 1]
0% < (1-Loss) < 7.22%
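As a rough illustration of an extensional measure built from precision and recall, the Python sketch below scores two toy plans and picks the one with the smaller loss. Several things here are assumptions rather than facts from the slides: the loss is taken to be one minus a weighted harmonic mean of precision and recall, with a weight `gamma` chosen arbitrarily; the sets of relevant and retrieved objects are made up and fully known, whereas the slides report lower and upper bounds; and the plan names do not correspond to Plans 1 to 4.

```python
def precision_recall(relevant, retrieved):
    """Precision and recall of a retrieved set against the set that was asked for."""
    overlap = len(relevant & retrieved)
    precision = overlap / len(retrieved) if retrieved else 0.0
    recall = overlap / len(relevant) if relevant else 0.0
    return precision, recall


def loss(precision, recall, gamma=0.5):
    """One minus the weighted harmonic mean of precision and recall (assumed formula)."""
    if precision == 0.0 or recall == 0.0:
        return 1.0
    return 1.0 - 1.0 / (gamma / precision + (1.0 - gamma) / recall)


# Made-up extensions, for illustration only.
relevant = {"b1", "b2", "b3", "b4"}
plans = {
    # Broader term: everything relevant plus four unwanted objects.
    "plan_a": {"b1", "b2", "b3", "b4", "d1", "d2", "d3", "d4"},
    # Narrower term: nothing unwanted, but one relevant object is missed.
    "plan_b": {"b1", "b2", "b3"},
}

losses = {
    name: loss(*precision_recall(relevant, retrieved))
    for name, retrieved in plans.items()
}
best = min(losses, key=losses.get)

for name, value in sorted(losses.items()):
    print(f"{name}: loss = {value:.4f}")
print("least loss:", best)
```

With equal weighting, the broader plan is penalised for low precision and the narrower plan for lower recall; changing `gamma` shifts which kind of error matters more.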
Summary
[Figure labels, grouped by level: Data (Syntax, System; Structured Databases; Federated DB); Metadata (Structural, Schematic; Semi-structured, Text; Mediator, Federated IS); Knowledge (Semantic; Visual, Scientific/Eng.; Knowledge Mgmt., Information Brokering, Cooperative IS)]
Agenda for research
• Interoperation not at systems level, but at informational and possibly knowledge level
– traditional database and information retrieval solutions do not suffice
– need to understand context; measures of similarities
• Need to increase impetus on semantic level issues involving terminological and contextual differences, possibly perceptual or cognitive differences in future
– information systems and humans need to cooperate, possibly involving coordination and collaborative processes
Related Reading
• Books:
– Information Brokering for Digital Media, Kashyap and Sheth, Kluwer, 1999 (to appear)
– Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Sheth and Klas Eds., McGraw-Hill, 1998
– Cooperative Information Systems, Papazoglou and Schlageter Eds., Academic Press, 1998
– Management of Heterogeneous and Autonomous Database Systems, Elmagarmid, Rusinkiewicz, Sheth Eds., Morgan Kaufmann, 1998
• Special Issues and Proceedings:
– Formal Ontologies in Information Systems, Guarino Ed., IOS Press, 1998
– Semantic Interoperability in Global Information Systems, Ouksel and Sheth, SIGMOD Record, March 1999
http://lsdis.cs.uga.edu
[See publications on Metadata, Semantics, Context, InfoHarness/InfoQuilt]
[email protected]
Acknowledgements:
Tarcisio Lima
Vipul Kashyap