Keyword Searching and Browsing in Databases using

Download Report

Transcript Keyword Searching and Browsing in Databases using

Keyword Searching and
Browsing in Databases using
BANKS
Charuta Nakhe, Arvind Hulgeri, Gaurav Bhalotia,
Soumen Chakrabarti, S. Sudarshan
Presented by
Sushanth Sivaram Vallath
Motivation
•Keyword search of documents on the web as been
enormously successful
•Simple and intuitive, no need to learn any query
language
•Database querying using keywords is desirable
•SQL is not appropriate for casual users
•Form interfaces cumbersome:
•Require separate form for each type of query –
confusing for casual users of Web information
systems
•Not suitable for ad hoc queries
Motivation
• Many Web documents are dynamically
generated from databases
– E.g. Catalog data
• Keyword querying of generated Web
documents
– May miss answers that need to combine
information on different pages
– Suffers from duplication overheads
Examples of Keyword Queries
• Airticket reservation database
– “DFW LAX”
• University database
– Info on courses
• Online shopping
– Canon Digital Rebel
Differences from IR/Web Search
• Related data split across multiple tuples due to
normalization
• Different keywords may match tuples from
different relations
Schema
Basic Model
• Database: modeled as a graph
– Nodes = tuples
– Edges = references between tuples
• foreign key
• Edges are directed.
The BANKS Answer Model
• Query: set of keywords {k1, k2, .., kn}
– Each keyword ki matches set of nodes Si
• Answer: rooted, directed tree connecting nodes,
with one node from each Si
– Root node has special significance, may be restricted to
some relations
– May include intermediate nodes not in any Si and hence a
steiner tree.
• Multiple answers
– Ranking based on proximity + prestige
Edge Directionality
• Some popular tuples are connected to many other tuples
– E.g. Students -> departments -> university
• Popular tuples would create misleading shortcuts from
every tuple to every other
– E.g. every student would be closely linked with every other
student via the department/university
• Solution: define different forward and backward edge
weights
– Forward edges: In the direction of the foreign key reference
Node Weight
• Nodes have prestige weights too
– nodes with greater prestige tend to have greater
indegree
Finding Answer Trees
• Backward Expanding Search Algorithm:
– Intuition: find vertices from which a forward path
exists to at least one node from each Si.
– Run concurrent single source shortest path
algorithm from each node matching a keyword
• Create an iterator for each node matching a keyword
– Traverse the graph edges in reverse direction
• Output a node whenever it is on the intersection of the
sets of nodes reached from each keyword
Finding Answer Trees
• Backward Expanding Search
• Intuition: travel backwards from keyword nodes
Query: sudarshan roy
till you hit a common node
paper
MultiQuery Optimization
writes
authors
Sudarshan
Prasan Roy
References
1. Keyword Searching and Browsing in
Databases using BANKS
2. Keyword Searching and Browsing in
Databases using BANKS (PPT)
Thank You