Telegraph: A Universal System for Information

Telegraph History & Plans
• Initial Vision
– Carey, Hellerstein, Stonebraker
– “Regres”, “B-1”
• Sweat, ideas and further vision
– 4 of my grads committed
– Brewer + 2 grads committed
– Franklin will play
– obvious tie-ins with other projects
Telegraph Architecture & synergies!
[Figure: layered architecture, top to bottom]
• Query/Browse/Mine
• Global Agoric Federation
• Continuously Reoptimizing Query Processor
• Adaptive Data Placement
• Storage Manager (FS, DB, Web)
[Related projects shown alongside the layers: Control, DigLib; Mariposa, Millennium, Control; Ninja, GiST, IStore; River, Ninja, Aetherstore, Control, STIX]
Storage Manager
• Historic chance to start over!
– new hardware realities
• variable-length segments, not blocks
• big main memories
• extra CPUs at the devices (IStore)
– revisit and clean up infrastructure for transactions
• clean API supporting both log-based & version-based schemes; version-based runs today!
• big SW Eng. challenge
– unify DB/FS/Web server!
• Clients: Ninja’s persistent hash table, query processing, web server, Linux (NT?) filesystem.
– (Mohan Lakhamraju, Rob von Behren, Steve Gribble)
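To make the "clean API supporting both log-based & version-based schemes" bullet concrete, here is a minimal sketch of what one shared transaction interface with two interchangeable backends could look like. Every name in it (TxnStore, LogBasedStore, VersionBasedStore) is hypothetical and not taken from Telegraph; it is an in-memory toy with no concurrency control and no durability.

```python
from abc import ABC, abstractmethod

_MISSING = object()  # undo-record marker: the key did not exist before the write


class TxnStore(ABC):
    """Hypothetical common interface shared by both schemes."""

    @abstractmethod
    def begin(self) -> int: ...
    @abstractmethod
    def read(self, txn: int, key): ...
    @abstractmethod
    def write(self, txn: int, key, value) -> None: ...
    @abstractmethod
    def commit(self, txn: int) -> None: ...
    @abstractmethod
    def abort(self, txn: int) -> None: ...


class LogBasedStore(TxnStore):
    """Updates in place and keeps an undo log so abort can roll back."""

    def __init__(self):
        self.data = {}
        self.undo = {}  # txn id -> list of (key, previous value)
        self.next_txn = 0

    def begin(self):
        self.next_txn += 1
        self.undo[self.next_txn] = []
        return self.next_txn

    def read(self, txn, key):
        return self.data.get(key)

    def write(self, txn, key, value):
        self.undo[txn].append((key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def commit(self, txn):
        del self.undo[txn]  # nothing left to undo

    def abort(self, txn):
        # Replay undo records newest-first to restore the old state.
        for key, old in reversed(self.undo.pop(txn)):
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old


class VersionBasedStore(TxnStore):
    """Never overwrites: each commit appends a new version per key."""

    def __init__(self):
        self.versions = {}  # key -> committed values, oldest first
        self.pending = {}   # txn id -> private, uncommitted writes
        self.next_txn = 0

    def begin(self):
        self.next_txn += 1
        self.pending[self.next_txn] = {}
        return self.next_txn

    def read(self, txn, key):
        if key in self.pending[txn]:
            return self.pending[txn][key]
        history = self.versions.get(key)
        return history[-1] if history else None

    def write(self, txn, key, value):
        self.pending[txn][key] = value

    def commit(self, txn):
        for key, value in self.pending.pop(txn).items():
            self.versions.setdefault(key, []).append(value)

    def abort(self, txn):
        del self.pending[txn]  # private writes simply vanish
```

A client such as a hash table or a file system layer would code against TxnStore only, so the scheme underneath could be swapped without touching the client.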
Query Engine
• Shared-nothing (cluster)
– all data flow (no blocking ops)
• auto load-balance to micro/macro changes in environment
• adaptivity more important than raw performance!!
• CONTROL! || ripple join, online reordering
• (Shankar Raman)
– continuously reoptimizing query plans
• tie-ins with STIX (Christos/Sinclair/Russell/Hellerstein)
• (Ron Avnur)
– first steps in handling streaming sources
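As an illustration of the "all data flow (no blocking ops)" and "ripple join" bullets, the following is a minimal sketch of a square ripple join written as a generator. It alternates between the two inputs and joins each newly read tuple against everything already seen from the other side, so results stream out before either input is exhausted. The function name and signature are illustrative assumptions, not Telegraph code, and the sketch omits the running estimates that online aggregation layers on top of the join.

```python
from itertools import zip_longest

_END = object()  # sentinel for an exhausted input


def ripple_join(left, right, predicate):
    """Yield every (l, r) pair satisfying predicate exactly once, incrementally."""
    seen_left, seen_right = [], []
    for l, r in zip_longest(left, right, fillvalue=_END):
        if l is not _END:
            # New left tuple meets all right tuples read so far.
            for old_r in seen_right:
                if predicate(l, old_r):
                    yield (l, old_r)
            seen_left.append(l)
        if r is not _END:
            # New right tuple meets all left tuples read so far,
            # including the one just added above.
            for old_l in seen_left:
                if predicate(old_l, r):
                    yield (old_l, r)
            seen_right.append(r)
```

Because it is a generator, a consumer can take the first few matches from very large (or streaming) inputs and stop whenever it likes; the cost is that both inputs seen so far are held in memory.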
Cluster Data Layout
– issues: fragmentation, placement, replication on 10^6 disks. For DB/FS/Web.
– goals: availability, efficiency, consistency, manageability.
– Adaptivity: cooperative vs. competitive ($$) techniques?
– (Mehul Shah)
Global Federation
• Global distribution
– federated DBMS layer a la Mariposa/Cohera
• address all the hard stuff they dropped!
– Global data placement
• as in cluster, but must be competitive. (Mehul Shah)
– Global query processing (Amol Deshpande)
• Agoric query optimization
• distributed query processing
– Global metadata
• yellow pages both for services & datasets
• Millennium/Ninja tie-ins?
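The "agoric query optimization" bullet refers to economic (market-style) optimization in the spirit of Mariposa, where sites bid to run pieces of a query. A heavily simplified sketch of the selection step follows: for each subquery, accept the cheapest bid that meets a delay budget. The Bid fields, the choose_bids function, and the single delay cutoff are all assumptions made for illustration; they are not the Mariposa or Telegraph protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    site: str        # hypothetical site name
    subquery: str    # identifier of the plan fragment being bid on
    price: float     # what the site charges to run it
    delay: float     # how long the site promises to take


def choose_bids(bids, max_delay):
    """Return {subquery: cheapest Bid whose delay is within max_delay}."""
    best = {}
    for bid in bids:
        if bid.delay > max_delay:
            continue  # too slow, ignore regardless of price
        current = best.get(bid.subquery)
        if current is None or bid.price < current.price:
            best[bid.subquery] = bid
    return best
```

A subquery with no acceptable bid is simply absent from the result, which a caller would have to treat as "no plan at this budget".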
Applications
• Really finding stuff in all the world’s data?
– UI meets AI meets Logic (browse/mine/query)
• CONTROL is key: seamless, non-blocking interaction
• multi-res output and feedback during browse/query
• hints, wizards, training (AI mining, user in the loop)
• build on existing “scalable spreadsheet”/xform tools
(Shankar Raman)
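One way to picture "non-blocking interaction" with "feedback during browse/query" is an aggregate that reports a progressively refined estimate while the data is still being read. The sketch below keeps a running mean (Welford's update) and reports it with an approximate 95% half-width at regular intervals. It assumes the values arrive in random order and that a normal approximation is acceptable; the function name and the report_every parameter are illustrative, not part of CONTROL.

```python
import math


def online_mean(values, report_every=1000):
    """Yield (rows_seen, running_mean, approx_95pct_half_width) periodically."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # running sum of squared deviations
        if n > 1 and n % report_every == 0:
            std_err = math.sqrt(m2 / (n - 1) / n)
            yield n, mean, 1.96 * std_err
```

A user interface could redraw after each yielded triple and let the user stop the query once the interval is tight enough.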