Intranet Mediator - University of Washington

Intranet Mediator
Clement Yu
Department of Computer Science
University of Illinois at Chicago
• A simple query in natural language
• Collaborative databases can be text or
relational databases, possibly with form
interfaces
• Answer the query using some of the
databases
[Figure: QUERY → Intranet Mediator → DBMS1, DBMS2, …, DBMSm]
Environment
An organization has numerous services.
Users want to use these services, but
are unaware that they exist or how to
access them.
Query: When did Peter Smith borrow a
textbook which costs more than $60?
Numerous databases:
Academic standing of students;
Student social activities;
Research activities of faculty members;
Health services;
Library service; …
Which databases can answer the query?
Distance (Database, Query)
• For each database, construct a schema graph
• For the given query, extract query terms
• Associate the query terms with the database
schema graphs
• Compute a distance between each database
graph and the query terms
• The database with the least distance from the
query is the most likely database to answer the
query
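The extraction and association steps can be sketched as follows. This is only an illustration, not the method from the talk: the stop-word list, the `LEXICON` table, and the rules that send capitalized names to `PersonName` and money amounts to `price` are all assumptions made for the running query.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"did", "a", "an", "the", "which", "more", "than", "of", "to"}

# Hypothetical lexicon linking query vocabulary to schema-graph nodes.
LEXICON = {
    "when": "Date",
    "borrow": "Borrow",
    "textbook": "Book",
    "costs": "price",
}


def extract_terms(query):
    """Return query terms: merged proper names, money amounts, content words."""
    tokens = re.findall(r"\$\d+(?:\.\d+)?|[A-Za-z]+", query)
    terms, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        # Merge a run of capitalized tokens (not sentence-initial) into one name.
        if i > 0 and tok[0].isupper():
            j = i
            while j < len(tokens) and tokens[j][0].isupper():
                j += 1
            terms.append(" ".join(tokens[i:j]))
            i = j
            continue
        if tok.lower() not in STOP_WORDS:
            terms.append(tok if tok.startswith("$") else tok.lower())
        i += 1
    return terms


def associate(terms, lexicon=LEXICON):
    """Map each term to a schema-graph node name, or None if nothing fits."""
    mapping = {}
    for term in terms:
        if term in lexicon:
            mapping[term] = lexicon[term]
        elif term.startswith("$"):
            mapping[term] = "price"        # assumed rule: amounts are prices
        elif term[0].isupper():
            mapping[term] = "PersonName"   # assumed rule: names are persons
        else:
            mapping[term] = None
    return mapping


query = "When did Peter Smith borrow a textbook which costs more than $60?"
terms = extract_terms(query)
term_to_node = associate(terms)
```

For the running query, `terms` contains the same six terms listed on the slide (in sentence order), and each of them is mapped to a node name.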
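Query terms: When, Peter Smith, borrow, textbook,
$60, costs
[Figure: database schema graph with nodes Person, PersonName, Person-ID, Borrow, Book-ID, Date, Book and price; edges are labeled with weights 0 or 1. The query terms Peter Smith, borrow, when, textbook, and $60 costs are attached to the graph.]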
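The slides do not define the distance itself, so the following is a minimal sketch under stated assumptions. Here the distance is taken to be a fixed penalty for every query term that has no node in the graph, plus the cost of linking the matched nodes, approximated by a minimum spanning tree over their shortest-path distances. The edge weights, the `miss_penalty` value, and the second (health) graph are hypothetical and are not taken from the figure.

```python
import heapq


def make_graph(edges):
    """Build an undirected weighted graph {node: {neighbor: weight}}."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def shortest_paths(graph, source):
    """Dijkstra's algorithm; returns {node: distance from source}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def distance(graph, term_to_node, miss_penalty=5):
    """Assumed distance: penalty per unmatched term + cost to link matched nodes."""
    targets = list(term_to_node.values())
    matched = sorted({n for n in targets if n in graph})
    misses = sum(1 for n in targets if n not in graph)
    if not matched:
        return misses * miss_penalty
    closure = {n: shortest_paths(graph, n) for n in matched}
    # Prim's algorithm on the shortest-path closure of the matched nodes.
    in_tree, cost = {matched[0]}, 0
    while len(in_tree) < len(matched):
        w, nxt = min(
            (closure[a].get(b, float("inf")), b)
            for a in in_tree for b in matched if b not in in_tree
        )
        cost += w
        in_tree.add(nxt)
    return misses * miss_penalty + cost


# Illustrative graphs; the weights are assumptions, not the slide's values.
library = make_graph([
    ("Person", "PersonName", 0), ("Person", "Person-ID", 0),
    ("Person-ID", "Borrow", 1), ("Borrow", "Book-ID", 0),
    ("Borrow", "Date", 1), ("Book-ID", "Book", 1), ("Book", "price", 1),
])
health = make_graph([
    ("Person", "PersonName", 0), ("Person", "Visit", 1), ("Visit", "Date", 1),
])
term_to_node = {
    "Peter Smith": "PersonName", "borrow": "Borrow", "when": "Date",
    "textbook": "Book", "costs": "price", "$60": "price",
}
databases = {"library": library, "health": health}
best = min(databases, key=lambda name: distance(databases[name], term_to_node))
```

With these assumed weights the library graph is at distance 4 and the health graph at distance 22, so `best` is `"library"`. A different choice of penalty or linking cost could change the ranking, which is why the conditions for correctness appear among the research issues.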
Research issues
• Necessary and sufficient conditions under which
the least-distance database is in fact a correct
database to answer the given query
• Mapping from a natural language query to a
relational query or a keyword-based query
• Automatic construction of ontology
• Queries that need multiple databases, including
text databases, to be answered
• Internet vs Intranet
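To make the mapping issue concrete, here is one possible relational target for the running query, written with Python's built-in `sqlite3` module. The table and column names are assumptions loosely based on the schema-graph node names, and the rows are invented sample data. "Textbook" is simply mapped to the Book table because the graph shows no book-type attribute.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical library schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Person (person_id INTEGER PRIMARY KEY, person_name TEXT);
        CREATE TABLE Book   (book_id INTEGER PRIMARY KEY, price REAL);
        CREATE TABLE Borrow (person_id INTEGER, book_id INTEGER, borrow_date TEXT);
        INSERT INTO Person VALUES (1, 'Peter Smith'), (2, 'Mary Jones');
        INSERT INTO Book   VALUES (10, 75.0), (11, 40.0);
        INSERT INTO Borrow VALUES (1, 10, '2003-02-10'), (1, 11, '2003-03-01'),
                                  (2, 10, '2003-04-15');
    """)
    return conn


def borrow_dates(conn, person_name, min_price):
    """'When did <person> borrow a book which costs more than <min_price>?'"""
    sql = """
        SELECT br.borrow_date
        FROM Borrow AS br
        JOIN Person AS p  ON p.person_id = br.person_id
        JOIN Book   AS bk ON bk.book_id  = br.book_id
        WHERE p.person_name = ? AND bk.price > ?
        ORDER BY br.borrow_date
    """
    return [row[0] for row in conn.execute(sql, (person_name, min_price))]


conn = build_demo_db()
dates = borrow_dates(conn, "Peter Smith", 60)
```

On the sample rows, `dates` holds the single loan of the book priced above 60. The hard part named in the research issue is producing the joins and the `price > 60` predicate automatically from the sentence, which this sketch does by hand.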
• Keyword-based Search to a Single Database
– Sanjay Agrawal, Surajit Chaudhuri, Gautam Das: DBXplorer:
A System for Keyword-Based Search over Relational
Databases. ICDE 2002
– Gaurav Bhalotia, Arvind Hulgeri, Charuta Nakhe, Soumen
Chakrabarti, S. Sudarshan: Keyword Searching and
Browsing in Databases using BANKS. ICDE 2002
– Vagelis Hristidis, Yannis Papakonstantinou: DISCOVER:
Keyword Search in Relational Databases. VLDB 2002
• Natural Language Interface (NLI) to a Single Database
– Ion Androutsopoulos, G. Ritchie, and P. Thanisch: Natural
Language Interfaces to Databases - An Introduction. Journal
of Natural Language Engineering 1995
– Ana-Maria Popescu, Oren Etzioni, Henry A. Kautz: Towards
a theory of natural language interfaces to databases.
Intelligent User Interfaces 2003
– Frank Meng and Wesley W. Chu: Database query formation from
natural language using semantic modeling and statistical keyword
meaning disambiguation. CSD-TR 990003, UCLA, 1999
– L. Tang and R. Mooney: Using multiple clause constructors in
inductive logic programming. European Conference on Machine
Learning 2001
• Mediator
– David A. Grossman, Steven M. Beitzel, Eric C. Jensen, Ophir
Frieder: The IIT Intranet Mediator: Bringing Data Together on a
Corporate Intranet. IEEE IT PRO, January/February 2002.
– Alon Y. Levy, Anand Rajaraman, Joann J. Ordille: Querying
Heterogeneous Information Sources Using Source Descriptions.
VLDB 1996
• Ontology
– Jiawei Han, Yongjian Fu: Dynamic Generation and Refinement of
Concept Hierarchies for Knowledge Discovery in Databases. KDD
Workshop 1994
– George A. Miller: WordNet: A lexical database for English.
Communications of the ACM, 38(11), 1995