sibling groups

Download Report

Transcript sibling groups

P2P Databases
P2P Today
edonkey
bittorrent
pastry
napster
aim
open cola
?
farsite
gnutella
morpheus
bearshare
grove
kazaa
fiorana
freenet
united devices
ocean store
uddi
jxta
can
netmeeting
icq
limewire
jabber
folding@home
process tree
ebay
chord
seti@home
popular power
tapestry
mojo nation
Object representation and storage
Objects
Attributes :
Name , Artist, Album ,
Genre
Pointer to object
P2P vs. Distributed DBMS
Traditional DDBMS Issues:
• Transactions
• Distributed Query Optimization
• Interoperation of heterogeneous data
sources
• Reliability/failure of nodes
Complex features do not scale
P2P vs. Distributed DBMS
Example application: file-sharing
• Simple data model and query language
– No complex query optimization
– Easy interoperation
• No guarantee on quality of results
– Individual site availability unimportant
• Local updates
– No transactions
– Network partitions OK
Simple
Amenable to large-scale network of PCs
Example: file sharing
• Challenge #1:
Performance
– Asking everyone
is expensive!
– If I am smart, I
only need to ask
one peer
– How can I be
smart?
File X?

?
?
?
?
Search in P2P
• System can control:
– Connections made by users/topology
– Data placement
– Query type
• Tight control: “Structured”
– Efficient, comprehensive
• Loose control: “Unstructured”
– Inefficient, not comprehensive, simple, expressive
– Used in real life
Both are useful to study
Centralized
• Napster model
• Benefits:
Bob
Alice
– Efficient search
– Limited bandwidth
usage
– No per-node state
• Drawbacks:
– Central point of failure
– Limited scale
Judy
Jane
http://www.snocap.com/
Unstructured – Query Flooding
= query source
= forward
query
= processed
query
= found
result
= forward
response
Problems with unstructured
• Inefficient
– Query messages are flooded
– Even if routing is intelligent, worst case load is still
O(n), where n is # nodes in system
• Not comprehensive
– If I do not get a result for my query, is it because
none exists?
• (Of course, many optimizations are
possible…)
Structured systems address these problems
Distributed Hash Table (DHTs)
• Model:
– Key/Object pair, the key is hashed to get an ID
– Example:
• Objects are files
• The key is the content of the file
• The ID is the hash of the file contents
• Single operation: Lookup(ID)
– Input: integer ID
– Output: the object with the corresponding ID
Identifiers
• IDs are m-bit integers
• Nodes are also assigned IDs
– Commonly assigned by hashing a node’s IP
address, although many problems with this
• An object is stored on the node with the
smallest ID greater than the object’s ID
– This node is called the successor of the object’s
ID
– IDs are arranged on a circle, so 0 > 2m-1
Data Placement
0
m=3
7
6
Nodes:
•0
•1
•3
1
6
1
6
2
2
2
3
5
4
Data:
•1
•2
•6
Connections
0
7
1
Distance
• 20
• 21
….
• 2m-1
“Finger pointers”
6
2
3
5
4
Query
• Lookup(objectID)
– objectID is typically the ID of the object you are looking
for, but not necessarily
• Approach:
– Find the predecessor of the object
• I.e. the node with the largest ID that is smaller than the object ID
– Return the successor of the predecessor
Query Example
• Say node 0 wants to find the object with ID =
7
• For simplicity, we will assume a node exists
at every ID in the space
Query Example
0
Node 0: Lookup(7)
7
1
6
Node 0: FindPred (7)
2
3
5
4
Query Example
0
Node 4: FindPred(7)
7
1
6
2
3
5
4
Query Example
0
Node 6: FindPred(7)
7
1
6
Node 6 is predecessor
Return successor node 7
2
3
5
4
Query characteristics
• With high probability, a query can be
answered by contacting O(log N) nodes
– N total nodes in the network
Efficient!
• Also notice: if an object with the ID exists in
the network, it will be found
Comprehensive!
• State is also O(log N) in size
Query characteristics
• Note that finger pointers are not required for
correct operation
– Only successor pointers are needed
– But then cost of query increases
• O(N) in worst case
Advantages of Structured?
• Scalability/Efficiency
– load grows with O(log N)
• Comprehensiveness
Disadvantages? (cont)
• Availability of Data
– If a node dies suddenly, what happens to
the data it was storing?
– MUST replicate data across multiple nodes
• Query Language
– How can we express keyword queries
efficiently?
– Many useful applications require different
languages
Magnolia
“Seven Innovation Myths”
h(title)
1100100101
“Innovation”
Current approach: Hash each keyword separately and store pointers at h(keyword)
Seven
h(some)
Innovation
h(innovation)
Myths
h(myths)
1100100101
Resulting Distribution
Prefix hashing
Prefix Hashing
Innovation
hP(innovation)
hP = m’ bit hash function
………….
100
m’
m bits
Partitions network into ~ n/2m’ separate sibling groups
n = nodes, m’ partitioning factor
For m’=12, n= 1 million, ~ 256 nodes will share same prefix
Assumption: h is uniformly distributed
Balancing
Innovation
Balanced over
the sibling group
100
Sibling group ID=100
All siblings in a group share the
same prefix
Insert
Keyword hP SiblingGroup ID
Locate a sibling node via SIFT
Lookup
Group Broadcast or Multicast
Replies
O(1)
Keyword
Random
Sibling
Advantages
• Good Balancing Properties
Advantages
• Low Traffic Load on nodes for popular
queries
• Quick Lookup
• Popularity Ranking of Objects
• Distributed Replication for resilience
Implementing Magnolia
• Developed on top of a chord clone written in
Python
– If you’re going to write a peer-to-peer app, why
not leverage existing modules and libraries?
• Challenge: How do we implement groupbased stores and queries without requiring
additional network maintenance?
Chord’s Finger Table
• A chord node maintains a finger table of M
IP’s pointing to nodes ahead of it in the ring.
– A pointer at index i is the successor of node id +
(2^i-1). This lets us reach any node in the
network in O(log M) hops
• We use the M’ most significant bits in a
node’s id to indicate it’s group. We want to
reach any group in O(log M’) hops.
– Do we need another table?
– Nope. The last M’ entries in our finger table
provide this.
Talking to Siblings
• How do we propagate queries through the
group?
• Naïve solution: send to our predecessor and
successor.
• A better solution: We can send a query
throughout the group by treating the sibling
group as a tree.
Sibling Tree
N/N’ = 16; M/M’ = 4
0
1
2
3
4
5
6
7
8
9
0
0+1
10
11
1+2^2
2
3
8+1
15
5+1
6
8+2^2
12
9
5
4
14
8
1+1
2+2^1
13
0+2^3
1
2+1
12
5+2^1 9+1
7
10
9+2^1
11
Every edge can be found in the finger table!
12+1
13
12+2^1
14
14+2^0
15
Sibling Tree Problems
• Problems:
– Not every possible node will exist
– Not every node will have results to report
– The query maker needs to know when the search is
done
• But we’re okay!
– Nodes can determine if a child sub-tree is dead
– Even if a child node in our sibling table is of a higher ID
than expected
• its sub-tree contains all existing descendents of the expected id
• we can predict when a child is in a sibling our ancestor’s tree
Bigger Problems
• What if a pointer in our finger table fails?
– We either have to find the successor to it’s id or
fail to query the sub-tree
• What if the lowest ID node isn’t the root of
our tree?
– Some of our edges won’t be in our finger table
Popularity queries
Yulania , Demo
BitTorrent
SplitStream