Individualized Knowledge Access
David Karger
Lynn Andrea Stein
Mark Ackerman
Ralph Swick
Information Access
A key task in Oxygen: help people manage and retrieve information
Three overlapping projects:
Haystack:
information storage and retrieval
application clients
Semantic Web: next-generation metadata
Volt: collaborative access
Presentation Overview
Motivation
Information access behavior and goals
System Design & Architecture
Data Model
Interacting data and UI components
Working applications
Base Haystack
FrontPage
Volt
Motivation
Problem Scenario
I try solving problems using my data:
Information gathered personally
High quality, easy for me to understand
Not limited to publicly available content
My organization:
Personal annotations and meta-data
Choose own subject arrangement
Optimize for my kind of searching
Adapts to my needs
Then Turn to a Friend
Leverage
They organize information for their own use
Let them find things for me too
Shared vocabulary
They know me and what I want
Personal expertise
They know things not in any library
Trust
Their recommendations are good
Last, to the Library/Web
Answer usually there
But hard to find
Wish: rearrange to suit my needs
Wish: help from my friends in looking
Lessons
Individualized access
Best tools adapt to individual ways of organizing and seeking data
Individualized knowledge
People know more than they publish
That knowledge is useful to them and others
Collaborative use
Right incentives lead to sharing and joint use
Haystack
Individualized access
My data collection, organization
Search tools tuned for me
Collaborate to leverage individual knowledge
Access unpublished information in others’ haystacks
Self interest leads to public benefit
Lens to personalize access to the world library
Rearrange presentation to suit my personal needs
Example
Info on probabilistic models in data mining
My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
Tommi told his haystack that “Bayesian” refers to “probability models”
Tommi has read several papers on Bayesian methods in data mining
Some are by Daphne Koller
I read/liked other work by Koller
My Haystack queries “Daphne Koller Bayes” on Yahoo
Tommi’s haystack can rank the results for me…
System Design
Gathering Data
Haystack archives anything
Web pages browsed, email sent and received, address book, documents written
And any properties, relationships
Text of object (for text search)
Author, title, color, citations, quotations, annotations, quality, last usage
Users freely add types, relationships
Semantic Web
Arbitrary objects, connected by named links
No fixed schema
User extensible
Sharable by any application
A new “file system”?
[Figure: example graph with nodes labeled HTML, Doc, Haystack, D. Karger, and Outstanding]
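The data model above (arbitrary objects connected by named links, with no fixed schema) can be sketched as a small in-memory triple store. This is only an illustration, not Haystack's actual implementation: the `TripleStore` class, the `doc:1` identifier, and the link names (`type`, `author`, `title`, `quality`, `recommended-by`) are all assumptions made for the example.

```python
from collections import namedtuple

# One statement: a subject object, a named link, and a target object or value.
Triple = namedtuple("Triple", ["subject", "link", "obj"])


class TripleStore:
    """Hypothetical in-memory store of (subject, link, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, link, obj):
        self._triples.add(Triple(subject, link, obj))

    def match(self, subject=None, link=None, obj=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (link is None or t.link == link)
            and (obj is None or t.obj == obj)
        )


store = TripleStore()
store.add("doc:1", "type", "HTML")
store.add("doc:1", "author", "D. Karger")
store.add("doc:1", "title", "Haystack")
store.add("doc:1", "quality", "Outstanding")
# A user-defined link type needs no schema change.
store.add("doc:1", "recommended-by", "a friend")

about_doc = store.match(subject="doc:1")                 # everything known about doc:1
by_karger = store.match(link="author", obj="D. Karger")  # reverse lookup by author
```

Because every statement has the same three-part shape, any client that can read triples can use data written by any other client, which is the sense in which the store is sharable by any application.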
Gathering Data
Active user input
Interfaces let user add data, note relationships
Mining data from prior data
Plug-in services opportunistically extract data
Passive observation of user
Plug-ins to other interfaces record user actions
[Architecture diagram. Components: Other Users, Data Extraction Services, Machine Learning Services, Spider, Triple Store, Web Observer Proxy, Mail Observer Proxy, Volt Viewer/Editor, Web Viewer]
Sample Applications
Because everything uses the Semantic Web constructions, a variety of application clients can share information
Web Browser: data viewer
FrontPage: personalized information filter
Volt: collaboration tool
Haystack via Web
Web server
interface
Basic operations:
Insert objects
View objects
Queries
Haystack via Web
Viewer shows one node and associated arrows
Service notices we’ve archived a directory, so it archives the objects it contains (and so on…)
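The cascading behavior described above, where archiving a directory triggers archiving of its contents and so on, can be sketched with a work queue. This is a minimal illustration under stated assumptions: the nested-dict `tree` stands in for a real file system, and `archive_all` is a hypothetical name, not a Haystack service.

```python
from collections import deque

# Hypothetical directory tree: a dict is a directory, a string is a file body.
tree = {
    "papers": {
        "haystack.html": "<html>...</html>",
        "drafts": {"notes.txt": "todo"},
    },
    "readme.txt": "hello",
}


def archive_all(root_name, root):
    """Archive the root; whenever a directory is archived, queue its children."""
    archived = []
    pending = deque([(root_name, root)])
    while pending:
        path, node = pending.popleft()
        archived.append(path)  # "archive" the object itself
        if isinstance(node, dict):  # the service notices a directory...
            for name, child in sorted(node.items()):
                # ...and schedules each contained object for archiving too
                pending.append((path + "/" + name, child))
    return archived


archived = archive_all("home", tree)
```

A queue rather than direct recursion mirrors the opportunistic style of the slide: each archived object is simply a new event that a service may react to later.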
Haystack via Web
Services detect document type, extract relevant metadata
Output can specialize by type of object
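Type detection followed by type-specific extraction might look like the sketch below. It uses only the Python standard library; the suffix table, the extractor names, and the metadata keys are assumptions chosen for illustration rather than anything taken from Haystack.

```python
import os
from email import message_from_string
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_html(text):
    parser = TitleParser()
    parser.feed(text)
    return {"title": parser.title.strip()}


def extract_mail(text):
    msg = message_from_string(text)
    return {"author": msg["From"], "title": msg["Subject"]}


# Hypothetical tables: file suffix -> document type -> extraction service.
TYPE_BY_SUFFIX = {".html": "text/html", ".eml": "message/rfc822"}
EXTRACTORS = {"text/html": extract_html, "message/rfc822": extract_mail}


def extract_metadata(filename, text):
    """Detect the document type, then run the matching extractor if any."""
    suffix = os.path.splitext(filename)[1].lower()
    doc_type = TYPE_BY_SUFFIX.get(suffix, "unknown")
    metadata = {"type": doc_type}
    extractor = EXTRACTORS.get(doc_type)
    if extractor is not None:
        metadata.update(extractor(text))
    return metadata


page = extract_metadata("page.html", "<html><head><title> Haystack </title></head></html>")
mail = extract_metadata("note.eml", "From: Tommi\nSubject: Bayesian methods\n\nSee attached.")
```

The same dispatch-by-type idea applies on the output side: a viewer could look up a type-specific renderer in a similar table.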
Mediation
Haystack can be a lens for viewing data from the rest of the world
Stored content shows what user knows/likes
Selectively spider “good” sites
Filter results coming back
Compare to objects user has liked in the past
Can learn over time
Example - personalized news service
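One simple way to "compare to objects the user has liked in the past" is to score incoming results by word overlap with the liked items. The sketch below uses bag-of-words cosine similarity, a technique chosen here for illustration; the slides do not say which method Haystack uses, and the function names and sample texts are assumptions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_for_user(candidates, liked):
    """Order candidates by similarity to everything the user has liked."""
    profile = Counter()
    for text in liked:
        profile.update(bag_of_words(text))
    return sorted(
        candidates,
        key=lambda text: cosine(bag_of_words(text), profile),
        reverse=True,
    )


liked = [
    "bayesian models for data mining",
    "probability models in machine learning",
]
candidates = [
    "celebrity gossip roundup",
    "new bayesian data mining methods",
    "stock market report",
]
ranked = rank_for_user(candidates, liked)
```

Learning over time falls out naturally in this sketch: each newly liked item is added to `liked`, which shifts the profile used for the next ranking.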
News Service
Scavenges articles from your favorite news sources
HTML parsing/extracting services
Over time, learns types of articles that interest you
Prioritizes those for display
Content provider no longer controls viewing experience
No more ads
Personalized News Service
Collaborative Access
Want to leverage others’ work in organizing information
No need to “publish” expertise
Exposed automatically, without effort
Self interest helps others
Volt
Volt is about collaboration between people
The Haystack architecture allows easy collaboration among individuals
Semantic Web references to Haystack objects
Individuals share parts of their Haystack
Group spaces and shared notebooks
Volt
Collaborators
Those I interact with
Frequent mail contact
Frequent visits to their home page
Those with shared content
And who have the same opinions about content
Collaborative filtering techniques
Referrals
Expertise search engine
Expertise Beacon
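Finding people "with shared content and the same opinions about it" is the core step of collaborative filtering. A minimal sketch, assuming opinions are stored as +1 (liked) or -1 (disliked) per item; the names `alice`, `bob`, `carol`, the item identifiers, and the `min_shared` threshold are all hypothetical.

```python
def agreement(mine, theirs):
    """Fraction of shared items on which two users hold the same opinion."""
    shared = set(mine) & set(theirs)
    if not shared:
        return 0.0
    same = sum(1 for item in shared if mine[item] == theirs[item])
    return same / len(shared)


def likely_collaborators(me, others, min_shared=2):
    """Rank other users by agreement, ignoring those with too little overlap."""
    scores = {}
    for name, opinions in others.items():
        if len(set(me) & set(opinions)) >= min_shared:
            scores[name] = agreement(me, opinions)
    # Highest agreement first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))


me = {"paper-a": 1, "paper-b": -1, "paper-c": 1}
others = {
    "alice": {"paper-a": 1, "paper-b": -1, "paper-d": 1},
    "bob": {"paper-a": -1, "paper-b": -1, "paper-c": 1},
    "carol": {"paper-c": 1},
}
collaborators = likely_collaborators(me, others)
```

The `min_shared` cutoff is a design choice: agreement on a single item is weak evidence, so users with very little overlap are left out rather than ranked highly by chance.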
Volt Expertise Beacons
Group spaces and shared notebooks
Create individual and group profiles
Profiles can be used to find other people
Allows targeted search
“Who else is working on this project?”
User controls visibility/privacy
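A targeted search over profiles that respects user-controlled visibility could be sketched as follows. The `Profile` fields, the `who_works_on` function, and the sample people are assumptions for illustration; the slides do not describe Volt's actual profile format.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical expertise profile for one person or group."""
    name: str
    topics: set = field(default_factory=set)
    visible: bool = True  # set by the profile's owner


def who_works_on(topic, profiles):
    """Answer "who else is working on this?" using only visible profiles."""
    return sorted(p.name for p in profiles if p.visible and topic in p.topics)


profiles = [
    Profile("alice", {"haystack", "volt"}),
    Profile("bob", {"haystack"}, visible=False),  # bob has opted out
    Profile("carol", {"semantic web"}),
]
found = who_works_on("haystack", profiles)
```

Checking visibility inside the search function, rather than in each client, keeps the privacy decision in one place.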
Summary
Next generation information access
Semantic Web
provides a language and capabilities for meta-data
Haystack
teases out individual knowledge, stores it in a coherent fashion, and allows a variety of application clients to leverage individual meta-data
Volt
turns individual knowledge into a community resource
More Info
http://haystack.lcs.mit.edu/
http://www.w3c.org/2001/sw
[email protected]
[email protected]
[email protected]
[email protected]