Part 5 - Semantic Web Workshop 2002

Knowledge Compilation from the Web
Some Examples
• Finding relationships
• Discovering micro-communities
• Creating concept hierarchies
Finding Relationships Using Association Rules
Input: Crawl of about 1 million pages
Association Rules
• I = {i1, i2, ..., ik}: a set of literals, called items.
• Transaction T: a set of items such that T ⊆ I.
• Database D: a set of transactions.
• An association rule is an implication of the form X => Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅.
– The rule X => Y holds in the database D with confidence c if c% of transactions in D that contain X also contain Y.
– The rule X => Y has support s in the transaction set D if s% of transactions in D contain X ∪ Y.
• Find all rules that have support and confidence greater than the user-specified minimum support and minimum confidence.
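The definitions above can be made concrete with a small sketch. It assumes each crawled page is reduced to a set of terms that plays the role of a transaction; the slides do not say how transactions were built from the crawl, so that choice, the toy data, and the names `support`, `confidence`, and `pair_rules` are illustrative assumptions. Only single-item rules are enumerated, by brute force, and values are fractions rather than percentages.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    x, y = frozenset(x), frozenset(y)
    containing_x = [t for t in transactions if x <= t]
    if not containing_x:
        return 0.0
    return sum(1 for t in containing_x if y <= t) / len(containing_x)

def pair_rules(transactions, min_support, min_confidence):
    """Enumerate rules {a} => {b} that are at or above both minimums."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            s = support({x, y}, transactions)
            c = confidence({x}, {y}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((x, y, s, c))
    return rules

# Hypothetical toy data: one set of terms per page.
pages = [
    frozenset({"java", "sun", "jdk"}),
    frozenset({"java", "sun"}),
    frozenset({"java", "coffee"}),
    frozenset({"sun", "solar"}),
]

for x, y, s, c in pair_rules(pages, min_support=0.5, min_confidence=0.6):
    print(f"{x} => {y}  support={s:.2f}  confidence={c:.2f}")
```

Checking every pair this way would not scale to a crawl of a million pages; it is meant only to show what the two thresholds measure.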
Discovering Micro-communities
[Figure: complete 3-3 bipartite graph]
• Japanese elementary schools
• Turkish student associations
• Oil spills off the coast of Japan
• Australian fire brigades
• Aviation/aircraft vendors
• Guitar manufacturers
Frequently co-cited pages are related.
Pages with large bibliographic overlap are related.
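As a rough illustration of these ideas, the sketch below treats a complete 3-3 bipartite graph as three "fan" pages that all link to the same three "center" pages, and counts co-citation and bibliographic overlap directly from out-link sets. The link data, page names, and function names are hypothetical, and the exhaustive search over fan triples is an assumption made for readability, not the method used on the real crawl.

```python
from itertools import combinations

def bipartite_cores(out_links, i=3, j=3):
    """Find groups of `i` fan pages that all link to the same `j` or more
    center pages. Brute force; only suitable for tiny graphs."""
    cores = []
    for fans in combinations(sorted(out_links), i):
        shared = set.intersection(*(set(out_links[f]) for f in fans))
        shared -= set(fans)  # ignore links among the fans themselves
        if len(shared) >= j:
            cores.append((fans, sorted(shared)))
    return cores

def cocitation_count(out_links, a, b):
    """Number of pages that link to both `a` and `b`."""
    return sum(1 for targets in out_links.values()
               if a in targets and b in targets)

def bibliographic_overlap(out_links, p, q):
    """Number of link targets that pages `p` and `q` have in common."""
    return len(set(out_links[p]) & set(out_links[q]))

# Hypothetical link graph: page -> set of pages it links to.
links = {
    "fan1": {"c1", "c2", "c3"},
    "fan2": {"c1", "c2", "c3", "other"},
    "fan3": {"c1", "c2", "c3"},
    "fan4": {"c1", "other"},
}

for fans, centers in bipartite_cores(links):
    print("fans:", fans, "centers:", centers)
```

Here `fan1`, `fan2`, and `fan3` form a core around `c1`, `c2`, and `c3`; the centers are frequently co-cited, and the fans have a large bibliographic overlap.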
Creating Concept Hierarchies
• Nested list structures in the link pages (my links, cool links, etc.) are great sources for discovering concept hierarchies
• The current manual approaches will not scale
• Start with automated techniques and use mass collaboration to refine and correct
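One way to picture the automated first step is to read parent/child pairs straight out of nested list markup. The sketch below uses Python's standard `html.parser` and assumes well-formed `<ul>`/`<li>` nesting with explicit closing tags, which real link pages often lack; the class name, helper name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser

class NestedListParser(HTMLParser):
    """Collect (parent, child) label pairs from nested <ul>/<li> markup."""

    def __init__(self):
        super().__init__()
        self.stack = []        # labels of the <li> items currently open
        self.edges = []        # (parent_label, child_label) pairs
        self.in_item = False   # True while reading the label text of an <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.stack.append("")
            self.in_item = True
        elif tag == "ul":
            # A nested list ends the label of the enclosing item.
            self.in_item = False

    def handle_data(self, data):
        if self.in_item and self.stack:
            self.stack[-1] += data

    def handle_endtag(self, tag):
        if tag == "li" and self.stack:
            label = self.stack.pop().strip()
            parent = self.stack[-1].strip() if self.stack else None
            self.edges.append((parent, label))
            self.in_item = False

def list_hierarchy(html):
    """Return (parent, child) pairs; top-level items have parent None."""
    parser = NestedListParser()
    parser.feed(html)
    parser.close()
    return parser.edges

# Hypothetical "my links" page fragment.
sample = ("<ul><li>Music<ul><li>Guitars</li><li>Pianos</li></ul></li>"
          "<li>Travel</li></ul>")

for parent, child in list_hierarchy(sample):
    print(parent, "->", child)
```

Pairs extracted this way from many pages would be noisy and inconsistent, which is where the refinement and correction step comes in.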
Reassertion
• We must make the Semantic Web happen
• Don’t lose sight of performance and scaling
• Database and data mining literature may have much to offer
Making Semantic Web Real: Call for Action
• Define architecture with interfaces
• Let different communities contribute pieces
• Don’t overdesign; let it grow organically