Universal Memex


Universal Memex
(A Research Project for Discussion)
Vannevar Bush’s Memex
• Record everything one sees
• All human knowledge in Memex: “a billion books” hyper-linked together
• Camera glasses
• “A machine which types when talked to”
Storage Needed for A Personal Memex
• If the disk technology trend continues, a single disk will hold
– 10TB by 2010
– 1PB by 2020
– 10PB by 2030
Your husband died, but here is his black box.
Human input data          /hr       /lifetime
Read text                 100 KB    25 GB
Hear speech @ 10 KBps     40 MB     10 TB
See TV @ .5 MB/s          2 GB      8 PB
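The per-hour figures in the table above follow from the quoted data rates. A minimal sketch of that arithmetic, assuming decimal units (1 KB = 1,000 bytes) and 3,600 seconds per hour; the function and variable names are illustrative and not from the slides:

```python
# Back-of-the-envelope check of the /hr column from the quoted data rates.
# Assumption: decimal units (1 KB = 1e3 bytes, 1 MB = 1e6 bytes, and so on).
SECONDS_PER_HOUR = 3600


def bytes_per_hour(bytes_per_second: float) -> float:
    """Convert a sustained data rate into bytes accumulated in one hour."""
    return bytes_per_second * SECONDS_PER_HOUR


speech_per_hour = bytes_per_hour(10e3)   # speech at 10 KB/s
video_per_hour = bytes_per_hour(0.5e6)   # TV at 0.5 MB/s

print(f"speech: {speech_per_hour / 1e6:.0f} MB/hr")  # the slide rounds this to 40 MB
print(f"video:  {video_per_hour / 1e9:.1f} GB/hr")   # the slide rounds this to 2 GB

# Hours of input implied by dividing the /lifetime column by the /hr column.
text_hours = 25e9 / 100e3     # read text: 25 GB at 100 KB per hour
speech_hours = 10e12 / 40e6   # speech: 10 TB at 40 MB per hour
print(f"implied hours (text):   {text_hours:,.0f}")
print(f"implied hours (speech): {speech_hours:,.0f}")
```

Both the text and speech rows imply the same number of input hours per lifetime, so the two rows share one assumption about hours of input.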
Universal Memex
• Record everything a person reads, writes, hears, and sees anywhere
• Retrieve needed information effectively anywhere, anytime
[Figure labels: Capture, store and transfer; Distributed (Planet)]
Related Research Topics
• How to capture information
– Wearable devices, etc
• How to transfer information
– Mobile storage system, etc
• How to store information
– Distributed, scalable storage
• How to summarize
– Summarization, mining, etc
• How to retrieve information
– Search, visualization, display system, etc.
• How to protect information
– Security, disaster, etc
• How to be legal
– Privacy, copyright, etc
A Small Step: Capture Device
• Build audio capture and transfer mechanisms
• Infrastructure: an iPAQ with wireless networking and a storage server
• Research issues:
– How to capture and preprocess audio information for Memex retrieval? (speech to text?)
– How to obey the law? (machine voice?)
– How to protect data?
– How to transfer data? (leverage R. Wang’s group project?)
A Small Step: Memex Browser
• Build a software package to remember all web pages a person has visited (Due to Han Chen)
• Infrastructure: IE or Navigator
• Research Issues:
– How to store data for Memex retrieval?
– How to obey the copyright law? (link only?)
– How to protect privacy?
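One way to picture the "link only?" option is a store that keeps the URL, title, and visit time of each page but not its content. The sketch below is an illustration under that assumption, not the project's design; it uses Python's standard `sqlite3` module, and the table name `visits` and the helper functions are hypothetical.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a link-only visit log. The schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits ("
        " url TEXT NOT NULL,"
        " title TEXT,"
        " visited_at REAL NOT NULL)"
    )
    return conn


def record_visit(conn, url, title, visited_at=None):
    """Store only the link, its title, and a timestamp (no page content)."""
    if visited_at is None:
        visited_at = time.time()
    conn.execute(
        "INSERT INTO visits (url, title, visited_at) VALUES (?, ?, ?)",
        (url, title, visited_at),
    )
    conn.commit()


def search_visits(conn, term):
    """Return visits whose title or URL contains the term, newest first."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT url, title, visited_at FROM visits"
        " WHERE title LIKE ? OR url LIKE ?"
        " ORDER BY visited_at DESC",
        (pattern, pattern),
    ).fetchall()


if __name__ == "__main__":
    store = open_store()
    record_visit(store, "http://example.org/memex", "Memex overview", 1.0)
    record_visit(store, "http://example.org/news", "Daily news", 2.0)
    print(search_visits(store, "memex"))
```

Keeping links only sidesteps storing copyrighted page content, at the cost that a page which later changes or disappears cannot be recovered from the store.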
A Small Step: ICU Memex
• Build a medical data collection appliance to capture all data of a patient in an ICU and visualize desired data (Due to discussion with Dr. Bill Hanson at UPenn Medical School)
• Infrastructure: a PC, HP medical equipment and a storage server
• Research issues:
– How to capture and store medical-specific information for data visualization?
– How to allow doctors to look at data and protect patients’ rights?
– How to visualize the medical data?
A Small Step: Voice-Based Search
• Build a software package to search a Personal Memex based on voice input (e.g. Find all conversations I had with Dr. Bill Hanson on ICU)
• Infrastructure: a PDA, notebook or desktop, and a storage server
• Research issues:
– How to organize audio data for such search?
– How to output search results (completely sequential)?
– How to deal with privacy issues (who is supposed to see the results)?
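One possible way to organize audio data for search is to index transcribed segments by word, so a query returns the recordings and time offsets where all query words occur. The sketch below assumes a speech-to-text step has already produced text for each segment; the class `TranscriptIndex`, the recording IDs, and the sample sentences are hypothetical.

```python
import re
from collections import defaultdict


class TranscriptIndex:
    """Inverted index from words to (recording_id, start_seconds) segments."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _words(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_segment(self, recording_id, start_seconds, text):
        """Index one transcribed audio segment under every word it contains."""
        for word in self._words(text):
            self._postings[word].add((recording_id, start_seconds))

    def search(self, query):
        """Return segments containing every query word, in sorted order."""
        words = self._words(query)
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(hits)


if __name__ == "__main__":
    index = TranscriptIndex()
    index.add_segment("rec1", 0.0, "Dr. Hanson discussed the ICU monitors")
    index.add_segment("rec1", 30.0, "Lunch plans for Friday")
    index.add_segment("rec2", 5.0, "ICU data review with Hanson")
    print(index.search("Hanson ICU"))
```

Returning time offsets rather than whole recordings lets a playback interface jump straight to the matching segment, which matters when results must be presented sequentially.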
A Small Step: A Distributed Storage Server for Universal Memex
• Build a distributed storage system
• Infrastructure: A PC cluster (140 PCs in the lab) and later Planet-Lab
• Research issues:
– How to securely store and access data?
– How to find data (maintain directory information) in a distributed environment?
– How to build a scalable system?
– How to support search and summarization in various Memex applications?
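One well-known approach to finding data in a distributed environment without a central directory is consistent hashing, where each key maps to a node on a hash ring. The sketch below illustrates that technique only; the slides do not say which scheme the project would use, and the class `HashRing`, the node names, and the replica count are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring mapping keys to storage nodes."""

    def __init__(self, nodes, replicas=64):
        # Each node is placed at `replicas` points to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        # Use the first 8 bytes of SHA-256 as a 64-bit position on the ring.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        """Return the node owning the first ring point after the key's hash."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["pc001", "pc002", "pc003"])  # hypothetical node names
    for key in ["audio/2003-01-01.wav", "web/visit-42", "icu/patient-7"]:
        print(key, "->", ring.node_for(key))
```

When a node is added, only the keys that fall on the new node's ring points change owner, which is why the scheme is often considered for clusters that grow over time.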