CMPUT 391 Lecture Notes
LogicSQL-based Enterprise
Archive and Search System
Li-Yan Yuan
How to organize the information and
make it accessible and useful?
Oct 30, 2006
Projects
How to develop an enterprise search engine based on
a database management system
challenges:
implementation of the inverted index
Projects
How to implement the TOP K query
Ranking formula
Inverted indexes are created with respect to term frequencies
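To make the idea of a frequency-based inverted index concrete, here is a minimal Python sketch. It is an illustration only, not the LogicSQL implementation: the function names, the whitespace tokenizer, and the sample documents are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each term to a dict {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Assumed tokenizer: lowercase and split on whitespace.
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return index

def postings_by_frequency(index, term):
    """Return (doc_id, frequency) pairs for a term, highest frequency first."""
    return sorted(index.get(term, {}).items(), key=lambda p: (-p[1], p[0]))

# Hypothetical sample documents.
docs = {
    1: "database search engine database",
    2: "enterprise search",
    3: "database database database index",
}
index = build_inverted_index(docs)
print(postings_by_frequency(index, "database"))  # [(3, 3), (1, 2)]
```

Keeping the postings ordered by frequency means the most promising documents for a term are read first, which is what a ranking formula based on frequencies needs.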
Internet search
Search for relevant web pages
Good answers:
Relevant
Popular
Search engines are critical to Internet use
Public domain knowledge, internal workings are secret
Tremendous political, economic, and cultural power
Enterprise search
Search the enterprise information systems for the right
information
Enterprise information
Internal web pages
Internal documentation systems
File systems
Databases
Email servers
The internet and enterprise domains differ fundamentally
Contents
User behavior
Economic motivations
Top-K Query
Objective
How to determine the top K objects that are most likely
(approximately) related to the given query
Applications
Information retrieval
Internet and enterprise searches
Multimedia similarity search
Scheduling large-scale on-demand data broadcast
……
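A top-K query can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the algorithm used in the course project: the score here is simply the sum of term frequencies, and the index contents and the name `top_k` are hypothetical.

```python
import heapq
from collections import defaultdict

def top_k(index, query_terms, k):
    """Return the k (doc_id, score) pairs with the highest summed term frequency."""
    scores = defaultdict(int)
    for term in query_terms:
        for doc_id, freq in index.get(term, {}).items():
            scores[doc_id] += freq
    # nlargest keeps at most k candidates in a heap instead of sorting every object.
    return heapq.nlargest(k, scores.items(), key=lambda p: (p[1], -p[0]))

# Hypothetical inverted index: term -> {doc_id: frequency}.
index = {
    "database": {1: 2, 3: 4},
    "search": {1: 1, 2: 1},
}
print(top_k(index, ["database", "search"], 2))  # [(3, 4), (1, 3)]
```

Only the K best objects are returned, so the cost of ordering the results grows with K rather than with the total number of matching objects.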
Development of Enterprise Search Systems
LogicSQL Enterprise information
Archive and Search system
LogicSQL An object-relational database
management system
New concurrency control algorithm
Staged database architecture
Developed at the University of Alberta
Commercialized by Shanghai Shifang Software Co.
Enterprise Archive and Search System
To archive all the enterprise information contents
To provide a web styled search engine
To support user-specified ranking algorithms
File systems
Web pages
Emails
Internal documents
Database records?
focus on the platform of archive and search
Easy implementation and test of various ranking algorithms
Enterprise Archive and Search System
Extend the database functionalities
Security model
Users, roles + security handle
Security primary key
New database objects
Inverted indexes
CREATE INVERTED INDEX
DROP INVERTED INDEX
Automatic population, similar to that of index
ORDER BY clause
User-specified aggregate functions
CREATE AGGREGATE FUNCTION
Top-K query evaluation
Specified crawlers
Enterprise Archive and Search System
User configuration
Extend the query languages
Set up crawlers
Create a list of inverted indexes
Create one aggregate function for object ranking
Implement the top K query algorithm
Web based query pages
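The configuration steps above (several inverted indexes, one aggregate function for ranking, then a top-K query) can be sketched in Python as follows. This is only a rough illustration: it does not use LogicSQL syntax, and the names `rank` and `title_weighted`, the two sample indexes, and the weight of 5 are all assumptions.

```python
import heapq

def rank(indexes, term, aggregate, k):
    """Return the top-k doc ids, scoring each by `aggregate` over its per-index frequencies."""
    doc_ids = set()
    for index in indexes:
        doc_ids.update(index.get(term, {}))
    scored = []
    for doc_id in doc_ids:
        # One frequency per index, 0 when the term is absent from that index.
        freqs = [index.get(term, {}).get(doc_id, 0) for index in indexes]
        scored.append((aggregate(freqs), doc_id))
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

def title_weighted(freqs):
    """Hypothetical user-specified aggregate: a title hit counts five times a body hit."""
    return 5 * freqs[0] + freqs[1]

# Hypothetical inverted indexes over two fields of the same objects.
title_index = {"search": {2: 1}}
body_index = {"search": {1: 3, 2: 1}}

print(rank([title_index, body_index], "search", sum, 2))             # [1, 2]
print(rank([title_index, body_index], "search", title_weighted, 2))  # [2, 1]
```

Swapping the aggregate function changes the ranking without touching the indexes or the top-K step, which is what makes it easy to implement and test different ranking algorithms on the same platform.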