Querying the Internet with PIER
(PIER = Peer-to-peer Information Exchange and Retrieval)
Ryan Huebsch†
Joe Hellerstein†, Nick Lanham†,
Boon Thau Loo†, Scott Shenker‡, Ion Stoica†
[email protected]
†UC Berkeley, CS Division
‡International Computer Science Institute, Berkeley CA
VLDB 9/12/03
What is Very Large?
Depends on Who You Are
Database Community: Single Site, Clusters, Distributed (10's – 100's of nodes)
Network Community: Internet Scale (1000's – Millions of nodes)
Challenge: How to run DB style queries at Internet Scale!
What are the Key Properties?
Lots of data that is:
1. Naturally distributed (where it's generated)
2. Centralized collection undesirable
3. Homogeneous in schema
4. Data is more useful when viewed as a whole
This is the design space we have chosen to
investigate.
Who Needs Internet Scale?
Example 1: Filenames
Simple ubiquitous schemas:
Filenames, Sizes, ID3 tags
Born from early P2P systems such as Napster,
Gnutella, AudioGalaxy, etc.
Content is shared by “normal” non-expert
users… home users
Systems were built by a few individuals ‘in their garages’: low barrier to entry
Example 2: Network Traces
Schemas are mostly standardized: IP, SMTP, HTTP, SNMP log formats
Network administrators are looking for patterns within their site AND with other sites:
DoS attacks cross administrative boundaries
Tracking virus/worm infections
Timeliness is very helpful
Might surprise you how useful it is: network bandwidth on PlanetLab (a world-wide distributed research test bed) is mostly consumed by people monitoring network status
Our Challenge
Our focus is on the challenge of scale:
Applications are homogeneous and distributed
Already have significant interest
Provide a flexible framework for a wide variety of
applications
Other interesting issues we will not discuss
today:
Heterogeneous schemas
Crawling/Warehousing/etc.
These issues are complementary to the issues we will cover in this talk
Four Design Principles (I)
Relaxed Consistency
ACID transactions severely limit the scalability and availability of distributed databases
We provide best-effort results
Organic Scaling
Applications may start small, without
a priori knowledge of size
Four Design Principles (II)
Natural habitat
No CREATE TABLE/INSERT
No “publish to web server”
Wrappers or gateways allow the information to be
accessed where it is created
Standard Schemas via Grassroots software
Data is produced by widespread software
providing a de-facto schema to utilize
[Architecture diagram: applications such as Network Monitoring and Other User Apps pose Declarative Queries to PIER; PIER's Query Optimizer, Catalog Manager, and Core Relational Execution Engine turn them into a Query Plan; beneath PIER, the DHT layer (DHT Wrapper, Storage Manager, Overlay Routing) forms the Overlay Network on top of the physical IP Network.]
Recent Overlay Networks:
Distributed Hash Tables (DHTs)
What is a DHT?
Take an abstract ID space, and partition it among a changing set of computers (nodes)
Given a message with an ID, route the message to
the computer currently responsible for that ID
Can store messages at the nodes
This is like a “distributed hash table”
Provides a put()/get() API
Cheap maintenance when nodes come and go
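
To make the abstraction concrete, here is a minimal sketch of a DHT-style put()/get() interface in Python. It is not PIER's code; the consistent-hashing ring and the ToyDHT class are illustrative assumptions, and node arrivals and departures are left out.

import hashlib
from bisect import bisect_right

class ToyDHT:
    """Sketch of a DHT: an abstract ID space partitioned among nodes,
    with put()/get() routed to whichever node currently owns the ID."""

    def __init__(self, node_names):
        # Each node owns the arc of the ID space ending at its hashed position.
        self.ring = sorted((self._hash(n), n) for n in node_names)
        self.store = {name: {} for name in node_names}  # per-node local storage

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def _owner(self, key):
        # Route the ID to the first node clockwise from its hash position (wrapping).
        positions = [h for h, _ in self.ring]
        i = bisect_right(positions, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key, value):
        self.store[self._owner(key)][key] = value  # stored at the responsible node

    def get(self, key):
        return self.store[self._owner(key)].get(key)

dht = ToyDHT(["node%d" % i for i in range(8)])
dht.put("filename:song.mp3", {"size": 4200000})
print(dht.get("filename:song.mp3"))  # -> {'size': 4200000}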
Recent Overlay Networks:
Distributed Hash Tables (DHTs)
[Diagram: a two-dimensional ID space spanning (0,0) to (16,16) is partitioned among the nodes; the key (15,14) is routed to the node responsible for that region of the space, and its data is stored there.]
Recent Overlay Networks:
Distributed Hash Tables (DHTs)
Lots of effort is put into making DHTs better:
Scalable (thousands to millions of nodes)
Resilient to failure
Secure (anonymity, encryption, etc.)
Efficient (fast access with minimal state)
Load balanced
etc.
PIER’s Three Uses for DHTs
Single elegant mechanism with many uses:
Search: Index (like a hash index)
Partitioning: Value (key)-based routing (like Gamma/Volcano)
Routing: Network routing for QP messages (query dissemination, Bloom filters, hierarchical QP operators such as aggregation and join)
Not clear there's another substrate that supports all these uses
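
A rough sketch of how one query processor could lean on a single DHT interface for all three uses; the dht object, its send() primitive, and the namespace-prefixed key scheme are assumptions for illustration, not PIER's actual API.

# Assumed interface: dht.put(key, value), dht.get(key), dht.send(key, message),
# where put() may store multiple values under one key.
# Keys are namespace-prefixed strings -- a made-up convention for this sketch.

# 1. Search: the DHT as a distributed index over an attribute.
def index_tuple(dht, table, tup, attr):
    dht.put("index:%s:%s" % (table, tup[attr]), tup)

# 2. Partitioning: rehash tuples on the join key into a temporary namespace,
#    so matching R and S tuples land on the same node (the same idea as
#    hash partitioning in Gamma/Volcano).
def rehash_for_join(dht, table, tuples, join_attr):
    for tup in tuples:
        dht.put("temp:%s:%s" % (tup[join_attr], table), tup)

# 3. Routing: use DHT routing to carry query-processing messages, e.g.
#    handing a plan fragment to the node that owns a given partition.
def disseminate(dht, partition_id, plan_fragment):
    dht.send("query:%s" % partition_id, plan_fragment)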
Metrics
We are primarily interested in 3 metrics:
Answer quality (recall and precision)
Bandwidth utilization
Latency
Different DHTs provide different properties:
Resilience to failures (recovery time) → answer quality
Path length → bandwidth & latency
Path convergence → bandwidth & latency
Different QP Join Strategies:
Symmetric Hash Join, Fetch Matches, Symmetric Semi-Join, Bloom Filters, etc.
Big Picture: Tradeoff bandwidth (extra rehashing) and latency
Experimental Setup
A simple join query:

SELECT R.pkey, S.pkey, R.pad
FROM   R, S
WHERE  R.num1 = S.pkey AND
       R.num2 > constant1 AND
       S.num2 > constant2 AND
       f(R.num3, S.num3) > constant3
Table R has 10 times more tuples than S
num1, num2, num3 are uniformly distributed
R.num1 is chosen such that 90% of R tuples have one
matching tuple in S, while 10% have no matching tuples.
R.pad is used to ensure all result tuples are 1KB
Join strategy: Symmetric Hash Join
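
For concreteness, a workload along these lines could be generated as sketched below; the table sizes, constants, and pad length are placeholders, and only the ratios and distributions come from the setup above.

import random

def make_workload(num_s=10000, ratio=10, match_frac=0.9, pad_len=900):
    """Synthetic R and S per the setup above: |R| = 10 * |S|, numeric
    attributes uniformly distributed, and R.num1 chosen so ~90% of R
    tuples match exactly one S.pkey while ~10% match none."""
    S = [{"pkey": i, "num2": random.random(), "num3": random.random()}
         for i in range(num_s)]
    R = []
    for i in range(ratio * num_s):
        if random.random() < match_frac:
            num1 = random.randrange(num_s)          # matches exactly one S.pkey
        else:
            num1 = num_s + random.randrange(num_s)  # no matching S.pkey
        R.append({"pkey": i, "num1": num1,
                  "num2": random.random(), "num3": random.random(),
                  "pad": "x" * pad_len})            # filler so result tuples come out near 1KB
    return R, S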
Does This Work for Real?
[Graphs: scale-up performance with 1MB of source data per node, measured both on a real network deployment and in simulation.]
Current Status of PIER
PIER currently runs box-and-arrow dataflow graphs
Designed primarily for application developers: the user inputs an optimized physical query plan
We are working on:
query optimization
a GUI (called Lighthouse) for query entry and result display
Future Research
Routing, Storage and Layering
Catalogs and Query Optimization
Hierarchical Aggregations
Range Predicates
Continuous Queries over Streams
Sharing between Queries
Semi-structured Data
Bonus Material
Symmetric Hash Join (SHJ)
[Query plan: R is scanned from NS=r with the selection r.b = constant, and S from NS=s with s.b = constant; each surviving tuple is PUT into NS=temp hashed on its join attribute (r.a or s.a); the join r.a = s.a runs there, followed by the filter r.c > s.c.]
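
Reading the plan bottom-up, both inputs are filtered locally and rehashed on the join key into the temp namespace, where the join and the final filter run. A single-process sketch of that logic follows; the tuple layout and constants are placeholders, and the DHT PUTs are modeled as inserts into local hash tables.

from collections import defaultdict

def shj(r_tuples, s_tuples, const_r, const_s):
    """Sketch of the SHJ plan: select on each input, 'PUT' both sides into
    a temp table hashed on the join key, probe symmetrically as tuples
    arrive, then filter on r.c > s.c."""
    r_hash, s_hash = defaultdict(list), defaultdict(list)  # stand-in for NS=temp
    results = []

    def arrive(tup, source):
        # Build into this side's hash table and probe the other side.
        if source == "R":
            r_hash[tup["a"]].append(tup)
            pairs = ((tup, s) for s in s_hash[tup["a"]])
        else:
            s_hash[tup["a"]].append(tup)
            pairs = ((r, tup) for r in r_hash[tup["a"]])
        results.extend((r, s) for r, s in pairs if r["c"] > s["c"])

    # In PIER the scans of NS=r and NS=s run in parallel and PUT into NS=temp;
    # here the arrivals simply model tuples reaching the temp-namespace node.
    for r in r_tuples:
        if r["b"] == const_r:
            arrive(r, "R")
    for s in s_tuples:
        if s["b"] == const_s:
            arrive(s, "S")
    return results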
Fetch Matches (FM)
[Query plan: R is scanned from NS=r with the selection r.b = constant; for each surviving tuple a GET on s.a fetches candidate S tuples from NS=s; the join r.a = s.a and the remaining predicates (s.b = constant AND r.c > s.c) are applied after the fetch.]
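
A sketch of the same strategy, assuming S is already stored in the DHT keyed on the join attribute so that a lookup on r.a returns the candidate S tuples; get_s_by_a is a stand-in for that DHT GET.

def fetch_matches(r_tuples, get_s_by_a, const_r, const_s):
    """Sketch of the FM plan: scan and filter R locally, then GET the S
    tuples with s.a = r.a from S's namespace; the predicate on S and the
    filter r.c > s.c can only be applied after the fetch."""
    results = []
    for r in r_tuples:
        if r["b"] != const_r:              # local selection on R only
            continue
        for s in get_s_by_a(r["a"]):       # one DHT GET per surviving R tuple
            if s["b"] == const_s and r["c"] > s["c"]:
                results.append((r, s))
    return results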
Symmetric Semi Join (SSJ)
[Query plan: after the local selections r.b = constant and s.b = constant, R and S are projected down to (r.key, r.a) and (s.key, s.a) and PUT into NS=temp; the projections are joined on r.a = s.a, then the complete R and S tuples are fetched with GETs on r.key and s.key, rejoined with their projections, and the filter r.c > s.c is applied.]
Both R and S are projected to save bandwidth
The complete R and S tuples are fetched in parallel to improve latency
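
A sketch of that strategy, with get_r_by_key and get_s_by_key standing in for DHT GETs of the full tuples (issued sequentially here, though the plan above issues them in parallel); the tuple layout and constants are again placeholders.

def ssj(r_tuples, s_tuples, get_r_by_key, get_s_by_key, const_r, const_s):
    """Sketch of the SSJ plan: rehash only the (key, a) projections of the
    filtered inputs into the temp namespace, join the small projections
    there, then fetch the full tuples by key and apply r.c > s.c."""
    # Projections that would be PUT into NS=temp, hashed on the join attribute.
    r_proj = [(r["key"], r["a"]) for r in r_tuples if r["b"] == const_r]
    s_proj = [(s["key"], s["a"]) for s in s_tuples if s["b"] == const_s]

    # Join the projections on a (a plain hash join stands in for the
    # symmetric hash join running at the temp-namespace nodes).
    s_keys_by_a = {}
    for skey, a in s_proj:
        s_keys_by_a.setdefault(a, []).append(skey)

    results = []
    for rkey, a in r_proj:
        for skey in s_keys_by_a.get(a, []):
            r, s = get_r_by_key(rkey), get_s_by_key(skey)  # full tuples fetched by key
            if r["c"] > s["c"]:
                results.append((r, s))
    return results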