- Ryan Huebsch

Transcript - Ryan Huebsch

Complex Queries in DHT-based
Peer-to-Peer Networks
Matthew Harren, Joe Hellerstein,
Ryan Huebsch, Boon Thau Loo,
Scott Shenker, Ion Stoica
[email protected]
UC Berkeley, CS Division
IPTPS 3/8/02
Outline



Contrast P2P & DB systems
Motivation
Architecture




DHT Requirements
Query Processor
Current Status
Future Research
2
Uniting DHTs and
Query Processing…
DHT
CAN
Chord
Tapestry
Pastry
Query
Processor
Predicates
SQL
Group By
Joins
Aggregation
Relational Data
3
P2P & DB Systems
Flexibility
 
Decentralized
 
Strong Semantics
 
Powerful query facilities
 
Fault Tolerance
 
Lightweight
 
Transactions & Concurrency Control
 
4
P2P + DB = ?

P2P Database? No!



ACID transactional guarantees do not scale, nor
does the everyday user want ACID semantics
Much too heavyweight of a solution for the
everyday user
Query Processing on P2P!



Both P2P and DBs do data location and movement
Can be naturally unified (lessons in both
directions)
P2P brings scalability & flexibility
DB brings relational model & query facilities
5
P2P Query Processing
(Simple) Example

Filesharing+
SELECT song, size, server…
FROM album, song
WHERE album.ID = song.albumID
AND album.name = “Rubber Soul”
Keyword searching is ONE canned SQL query
Imagine what else you could do!

6
P2P Query Processing
(Simple) Example

Filesharing+
SELECT song, size, server…
FROM album-ngrams AN, song
WHERE AN.ID = song.albumID AND
AN.ngram IN <list of search ngrams>
GROUP BY AN.ID
HAVING COUNT(AN.ngram) >=
<# of ngrams in search>
Keyword searching is ONE canned SQL query
Imagine what else you could do!

Fuzzy Searching, Resource Discovery, Enhanced DNS

7
What this project
IS and IS NOT about…

IS NOT ABOUT: Absolute Performance


IS ABOUT: Decentralized Features


No administrator, anonymity, shared resources,
tolerates failures, resistant to censorship…
IS NOT ABOUT: Replacing RDBMS


In most situations a centralized solution could be
faster…
Centralized solutions still have their place for many
applications (commercial records, etc.)
IS ABOUT: Research synergies

Unifying/morphing design principles and
techniques from DB and NW communities
8
General Architecture


Based on Distributed
Hash Tables (DHT) to
get many good
networking properties
A query processor is
built on top

Note: the data is stored
separately from the
query engine, not a
standard DB practice!
9
DHT – API

Basic API




publish(RID, object)
lookup(RID)
multicast(object)
NOTE: Applications can only fetch-byname… a very limited query language!
10
DHT – API Enhancements I

Basic API




publish(namespace, RID, object)
lookup(namespace, RID)
multicast(namespace, object)
Namespaces: subsets of the ID space
for logical and physical data partitioning
11
DHT – API Enhancements II

Additions





lscan(namespace) – retrieve the data stored
locally from a particular namespace
newData(namespace) – receive a callback when
new data is inserted into the local store for the
namespace
This violates the abstraction of location independence
Why necessary? Parallel scanning of base relation
Why acceptable? Access is limited to reading,
applications can not control the location of data
12
Query Processor
(QP) Architecture


QP is just another application as far as the
DHT is concerned… DHT objects = QP tuples
User applications can use QP to query data
using a subset of SQL





Select
Project
Joins
Group By / Aggregate
Data can be metadata (for a file sharing type
application) or entire records, mechanisms
are the same
13
Indexes. The lifeblood of a
database engine.
Primary Index
Secondary
Data
Primary
PKey
DHT
Index NS

DHT’s mapping of RID/Object is equivalent to an index
Additional indexes are created by adding another key/value
pair with the key being the value of the indexed field(s) and
value being a ‘pointer’ to the object (the RID or primary key)
DHT

Key
PKey
Ptr
Data
Secondary Index
14
Relational Algorithms


Selection/Projection
Join Algorithms

Symmetric Hash


Fetch Matches




Use lscan on tables R & S. Republish tuples in a temporary
namespace using the join attributes as the RID. Nodes in
the temporary namespace perform mini-joins locally as
tuples arrive and forwards results to requestor.
If there is an index on the join attribute(s) for one table
(say R), use lscan for other table (say S) and then issue a
lookup probing for matches in R.
Semi-Join like algorithms
Bloom-Join like algorithms
Group-By (Aggregation)
15
Interesting note…

The state of the join is stored in the DHT store



Rehashed data is automatically re-routed to the
proper node if the coordinate space adjusted
When a node splits (to accept a new node into the
network) the data is also split, this includes previously
delivered rehashed tuples
Allows for graceful re-organization of the
network not to interfere with ongoing operations
16
Where we are…



A working real implementation of our Query
Processing (currently named PIER) on top of
a CAN simulator
Initial work studying and analyzing
algorithms… nothing really ground-breaking…
YET!
Analyzing the design space and which
problems seem most interesting to pursue
17
Where to go from here?

Common Issues:




Database Issues:






Caching – Both at DHT and QP levels
Using Replication – for speed and fault tolerance
(both in data and computation)
Security
Pre-computation of (intermediate) results
Continuous queries/alerters
Query optimization (Is this like network routing?)
More algorithms, Dist-DBMS have more tricks
Performance Metrics for P2P QP Systems
What are the new apps the system enables? 18

- Ryan Huebsch

Transcript - Ryan Huebsch

Directory