
Querying the Internet with PIER
(PIER = Peer-to-peer Information Exchange and Retrieval)
Ryan Huebsch†
Joe Hellerstein†, Nick Lanham†, Boon Thau Loo†, Scott Shenker‡, Ion Stoica†
[email protected]
†UC Berkeley, CS Division
‡International Computer Science Institute, Berkeley CA
VLDB 9/12/03
What is Very Large?
Depends on Who You Are

- Database Community: Single Site, Clusters, Distributed (10's – 100's)
- Network Community: Internet Scale (1000's – Millions)

Challenge: How to run DB-style queries at Internet Scale!
What are the Key Properties?

Lots of data that is:
1. Naturally distributed (where it's generated)
2. Centralized collection undesirable
3. Homogeneous in schema
4. Data is more useful when viewed as a whole

This is the design space we have chosen to investigate.
Who Needs Internet Scale?
Example 1: Filenames

- Simple ubiquitous schemas: Filenames, Sizes, ID3 tags
- Born from early P2P systems such as Napster, Gnutella, AudioGalaxy, etc.
- Content is shared by "normal" non-expert users… home users
- Systems were built by a few individuals 'in their garages' → Low barrier to entry
Example 2: Network Traces

- Schemas are mostly standardized: IP, SMTP, HTTP, SNMP log formats
- Network administrators are looking for patterns within their site AND with other sites:
  - DoS attacks cross administrative boundaries
  - Tracking virus/worm infections
  - Timeliness is very helpful
- Might surprise you how useful it is:
  - Network bandwidth on PlanetLab (world-wide distributed research test bed) is mostly filled with people monitoring the network status
Our Challenge

- Our focus is on the challenge of scale:
  - Applications are homogeneous and distributed
  - Already have significant interest
  - Provide a flexible framework for a wide variety of applications
- Other interesting issues we will not discuss today:
  - Heterogeneous schemas
  - Crawling/Warehousing/etc.
  - These issues are complementary to the issues we will cover in this talk
Four Design Principles (I)

- Relaxed Consistency
  - ACID transactions severely limit the scalability and availability of distributed databases
  - We provide best-effort results
- Organic Scaling
  - Applications may start small, without a priori knowledge of size
Four Design Principles (II)

Natural habitat




No CREATE TABLE/INSERT
No “publish to web server”
Wrappers or gateways allow the information to be
accessed where it is created
Standard Schemas via Grassroots software

Data is produced by widespread software
providing a de-facto schema to utilize
8
[Architecture diagram: Applications (e.g. Network Monitoring and other user apps) pose Declarative Queries, which become a Query Plan; PIER consists of a Query Optimizer, Catalog Manager, Core Relational Execution Engine, DHT Wrapper and Storage Manager; below PIER, the Network layer provides the DHT and Overlay Routing over the Overlay Network, which runs on top of the physical IP Network.]
Recent Overlay Networks:
Distributed Hash Tables (DHTs)

- What is a DHT?
  - Take an abstract ID space, and partition among a changing set of computers (nodes)
  - Given a message with an ID, route the message to the computer currently responsible for that ID
  - Can store messages at the nodes
    - This is like a "distributed hash table"
    - Provides a put()/get() API
  - Cheap maintenance when nodes come and go
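As a rough illustration of the put()/get() idea above, here is a minimal sketch in Python. It assumes a fixed set of nodes and a single-hop lookup; real DHTs (CAN, Chord, Pastry, etc.) also handle multi-hop routing and nodes joining or leaving. ToyDHT and its node names are hypothetical, not PIER's actual API.

import hashlib

class ToyDHT:
    # Toy in-memory DHT: partitions an abstract ID space among a fixed node set.
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}      # each node's local store
        self.ring = sorted((self._id(n), n) for n in node_names)

    def _id(self, key):
        # Map an arbitrary string into the abstract ID space.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def _owner(self, key):
        # "Route": first node whose ID is at or after the key's ID, wrapping around.
        kid = self._id(key)
        for nid, name in self.ring:
            if nid >= kid:
                return name
        return self.ring[0][1]

    def put(self, key, value):
        # Several items may be stored under one key (e.g. many tuples per join value).
        self.nodes[self._owner(key)].setdefault(key, []).append(value)

    def get(self, key):
        # Return everything stored under the key (possibly an empty list).
        return self.nodes[self._owner(key)].get(key, [])

dht = ToyDHT(["nodeA", "nodeB", "nodeC"])
dht.put("filename:song.mp3", {"size": 4096})
print(dht.get("filename:song.mp3"))   # -> [{'size': 4096}]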
Recent Overlay Networks:
Distributed Hash Tables (DHTs)

[Diagram: a two-dimensional ID space spanning (0,0) to (16,16) is partitioned among the nodes; a data item with Key = (15,14) is routed to and stored at the node responsible for the region containing that ID.]
Recent Overlay Networks:
Distributed Hash Tables (DHTs)

- Lots of effort is put into making DHTs better:
  - Scalable (thousands → millions of nodes)
  - Resilient to failure
  - Secure (anonymity, encryption, etc.)
  - Efficient (fast access with minimal state)
  - Load balanced
  - etc.
PIER's Three Uses for DHTs

- Single elegant mechanism with many uses:
  - Search: Index
    - Like a hash index
  - Partitioning: Value (key)-based routing
    - Like Gamma/Volcano
  - Routing: Network routing for QP messages
    - Query dissemination
    - Bloom filters
    - Hierarchical QP operators (aggregation, join, etc.)
- Not clear there's another substrate that supports all these uses
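To make the partitioning use concrete, a hedged sketch reusing the hypothetical ToyDHT above: each site rehashes its local tuples into a shared namespace keyed on the join attribute, so tuples with equal join values land on the same node, Gamma/Volcano-style. The table and attribute names are illustrative only.

# Sketch: value-based partitioning via DHT put(), keyed on the join attribute.
def rehash(dht, namespace, table, key_attr):
    for tup in table:
        # namespace + join value decides which node the tuple lands on
        dht.put(f"{namespace}:{tup[key_attr]}", tup)

local_R = [{"pkey": 1, "num1": 7}, {"pkey": 2, "num1": 3}]
local_S = [{"pkey": 7, "num2": 0.4}]
rehash(dht, "part:r", local_R, "num1")   # R tuples keyed on R.num1
rehash(dht, "part:s", local_S, "pkey")   # S tuples keyed on S.pkey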
Metrics

- We are primarily interested in 3 metrics:
  - Answer quality (recall and precision)
  - Bandwidth utilization
  - Latency
- Different DHTs provide different properties:
  - Resilience to failures (recovery time) → answer quality
  - Path length → bandwidth & latency
  - Path convergence → bandwidth & latency
- Different QP Join Strategies:
  - Symmetric Hash Join, Fetch Matches, Symmetric Semi-Join, Bloom Filters, etc.
  - Big Picture: trade off bandwidth (extra rehashing) against latency
Experimental Setup

- A simple join query:

    SELECT R.pkey, S.pkey, R.pad
    FROM   R, S
    WHERE  R.num1 = S.pkey AND
           R.num2 > constant1 AND
           S.num2 > constant2 AND
           f(R.num3, S.num3) > constant3

- Table R has 10 times more tuples than S
- num1, num2, num3 are uniformly distributed
- R.num1 is chosen such that 90% of R tuples have one matching tuple in S, while 10% have no matching tuples.
- R.pad is used to ensure all result tuples are 1KB
- Symmetric Hash Join
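A hedged sketch of how tables matching this description could be generated; only the 10:1 ratio, the 90%/10% match split, and the uniform distributions follow the slide, while the absolute table size and pad length are assumptions.

import random

def make_tables(num_s=1000, match_fraction=0.9, pad_bytes=1024):
    # S tuples: pkey plus uniformly distributed attributes.
    S = [{"pkey": i, "num2": random.random(), "num3": random.random()}
         for i in range(num_s)]
    R = []
    for i in range(10 * num_s):                       # R has 10x the tuples of S
        if random.random() < match_fraction:          # 90%: exactly one match in S
            num1 = random.randrange(num_s)
        else:                                         # 10%: no matching S tuple
            num1 = num_s + random.randrange(num_s)
        R.append({"pkey": i, "num1": num1,
                  "num2": random.random(), "num3": random.random(),
                  "pad": "x" * pad_bytes})            # pad so result tuples are ~1KB
    return R, S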
Does This Work for Real?

[Graph: scale-up performance with 1MB of source data per node, comparing the real-network deployment against simulation.]
Current Status of PIER

- PIER currently runs box-and-arrow dataflow graphs
- We are working on query optimization
- GUI (called Lighthouse) for query entry and result display
  - Designed primarily for application developers
  - User inputs an optimized physical query plan
Future Research

- Routing, Storage and Layering
- Catalogs and Query Optimization
- Hierarchical Aggregations
- Range Predicates
- Continuous Queries over Streams
- Sharing between Queries
- Semi-structured Data
Bonus Material
Symmetric Hash Join (SHJ)

[Query plan diagram: R (NS=r) and S (NS=s) are scanned with local selections r.b = constant and s.b = constant; both sides are rehashed with PUT r.a and PUT s.a into a temporary namespace (NS=temp); the rehashed tuples are joined on r.a = s.a and the final selection r.c > s.c is applied.]
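A hedged sketch of the SHJ dataflow on the hypothetical ToyDHT from earlier: both inputs are filtered locally, rehashed into a temp namespace keyed on the join value, and each node joins whatever landed together. In PIER the probing happens as tuples arrive and results stream back to the query site; this single-process version only illustrates the data movement.

def shj(dht, R, S, pred_r, pred_s, pred_rs):
    # Rehash both filtered inputs into a shared temp namespace (PUT r.a / PUT s.a).
    for r in R:
        if pred_r(r):                                 # selection r.b = constant
            dht.put(f"temp:{r['num1']}", ("r", r))
    for s in S:
        if pred_s(s):                                 # selection s.b = constant
            dht.put(f"temp:{s['pkey']}", ("s", s))
    # Each node joins the r- and s-tagged tuples that hashed to the same key.
    results = []
    for store in dht.nodes.values():
        for key, items in store.items():
            if not key.startswith("temp:"):
                continue
            rs = [t for tag, t in items if tag == "r"]
            ss = [t for tag, t in items if tag == "s"]
            for r in rs:
                for s in ss:
                    if pred_rs(r, s):                 # final selection r.c > s.c
                        results.append((r["pkey"], s["pkey"], r.get("pad")))
    return results

# Usage sketch (fresh DHT instance, predicates and constants are placeholders):
# dht2 = ToyDHT(["nodeA", "nodeB", "nodeC"])
# out = shj(dht2, R, S,
#           pred_r=lambda r: r["num2"] > 0.5,
#           pred_s=lambda s: s["num2"] > 0.5,
#           pred_rs=lambda r, s: r["num3"] > s["num3"])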
Fetch Matches (FM)

[Query plan diagram: R (NS=r) is scanned with the local selection r.b = constant; for each surviving R tuple, a GET on s.a fetches the matching S tuples from NS=s; the join r.a = s.a is formed and the remaining selection s.b = constant AND r.c > s.c is applied after the fetch.]
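A hedged sketch of Fetch Matches on the same hypothetical ToyDHT: S is assumed to already sit in the DHT under its own namespace, keyed on the join attribute, so only R is scanned and filtered locally; each surviving R tuple issues a GET for its matching S tuples, and the predicates on S can only run after the fetch.

def fetch_matches(dht, R, pred_r, pred_s, pred_rs):
    results = []
    for r in R:
        if not pred_r(r):                       # selection r.b = constant (pushed down)
            continue
        for s in dht.get(f"s:{r['num1']}"):     # GET s.a: fetch matching S tuples
            # s.b = constant and r.c > s.c can only be applied after the fetch
            if pred_s(s) and pred_rs(r, s):
                results.append((r["pkey"], s["pkey"], r.get("pad")))
    return results

# S is assumed to have been published into the DHT keyed on its join attribute, e.g.:
# for s in S:
#     dht.put(f"s:{s['pkey']}", s)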
Symmetric Semi Join (SSJ)

[Query plan diagram: R (NS=r) and S (NS=s) are scanned with local selections r.b = constant and s.b = constant, then projected to (r.a, r.key) and (s.a, s.key); the projections are rehashed with PUT r.a and PUT s.a into NS=temp and joined on r.a = s.a; for each match, GET r.key and GET s.key fetch the complete tuples, which are re-joined (r.a = r.a, s.a = s.a) before the final selection r.c > s.c.]

- Both R and S are projected to save bandwidth
- The complete R and S tuples are fetched in parallel to improve latency
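Finally, a hedged sketch of SSJ with the same hypothetical ToyDHT: only the (join value, tuple key) projections are rehashed, matches are found on those small pairs, and the complete tuples are then fetched back by key (in PIER these GETs go out in parallel). The namespaces and helper names are assumptions; assume a fresh DHT instance.

def ssj(dht, R, S, pred_r, pred_s, pred_rs):
    # Publish full (filtered) tuples under their own keys so they can be fetched back,
    # and rehash only the (join value, key) projections into the temp namespace.
    for r in R:
        if pred_r(r):
            dht.put(f"r:{r['pkey']}", r)
            dht.put(f"ssj:{r['num1']}", ("r", r["pkey"]))    # PUT projection (r.a, r.key)
    for s in S:
        if pred_s(s):
            dht.put(f"s:{s['pkey']}", s)
            dht.put(f"ssj:{s['pkey']}", ("s", s["pkey"]))    # PUT projection (s.a, s.key)
    results = []
    for store in dht.nodes.values():
        for key, items in store.items():
            if not key.startswith("ssj:"):
                continue
            r_keys = [k for tag, k in items if tag == "r"]
            s_keys = [k for tag, k in items if tag == "s"]
            for rk in r_keys:
                for sk in s_keys:
                    # GET r.key / GET s.key fetch the complete tuples (parallel in PIER)
                    r = dht.get(f"r:{rk}")[0]
                    s = dht.get(f"s:{sk}")[0]
                    if pred_rs(r, s):                        # final selection r.c > s.c
                        results.append((r["pkey"], s["pkey"], r.get("pad")))
    return results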