search algorithms in peer to peer networks

Download Report

Transcript search algorithms in peer to peer networks

INTRODUCTION TO
PEER TO PEER NETWORKS
Z.M. Joseph
CSE 6392 – DB Exploration
Spring 2006
CSE, UT Arlington
WHAT DOES KONICHIWA
MEAN?
Peer to Peer Networks
Node
Node
•
•
•
•
•
Node
Node
Internet
Decentralized and distributed system
Nodes are equivalent (Peers)
Data could be at ANY node on the network
Nodes leave and join the network
Network is resilient
• Avoid dependence on central resources
Node
Centralized Network
• Napster model
Client
Client
Server
Reply
Query
File Transfer
• Nodes register their
contents with server
• Centralized server for
searches
• File access done on a
peer to peer basis
– Poor scalability
– Single point of failure
Unstructured Blind - Gnutella
Breadth-First Search (BFS)
= source
= forward
query
= processe
query
= found
result
= forward
respons
Unstructured Blind - Gnutella
• A node/peer connects to a set of Gnutella
neighbors
•
Forward queries to neighbors
• Client which has the Information responds.
• Flood network with TTL for termination
+ Results are complete
– Bandwidth wastage
Random Walkers
• Improved Unstructured Blind
•Similar structure to
Gnutella
•Forward the query
(called walker) to
random subset of it
neighbors
+ Reduced bandwidth
requirements
– Incomplete results
Peer nodes
Unstructured Informed Networks
• Zero in on target based on information about the query
and the neighbors.
• Intelligent routing
+ Reduces number of messages
+ Not complete, but more accurate
– COST: Must thus flood in order to get initial information
Informed Searches: Local Indices
• Node keeps track of information available within
a radius of r hops around it.
• Queries are made to neighbors just beyond the r
radius.
+ Flooding limited to bounded part of network
Routing Indices
• For each query, calculate goodness of each
neighbor.
• Calculating goodness:
– Categorize or separate query into themes
– Rank best neighbors for a given theme based on
number of matching documents
• Follows chain of neighbors that are expected to
yield the best results
• Backtracking possible
Bloom Filters
• Bloom filter is a bit pattern (Hash, etc)
• Contains the likelihood of a match
• Can determine the degree of similarity
• Also known as Lossy Distributed Index
• Attenuated Bloom Filters
– Maintain downstream bloom filters for each neighbor
– Reduce weight of distant nodes when choosing neighbors
Bloom Filters CONTD.
N neighbors
1
0
0
0
0
0
0
0
0
1
0
0
0
1
0
1
0
0
0
1
0
1
0
0
0
1
0
0
Requesting
Node
Hash
value
0
0
0
0
1
1
1
0
0
0
0
0
Structured P2P Networks
• Self-organizing
• Load balanced and Resilient
• Fault-tolerant
• Guarantees on numbers of hops to answer a
query
• Based on a Distributed Hash Table (DHT)
Properties of DHT
• Keys mapped evenly to all nodes in the
network
• Each node maintains information about only
a few other nodes
• Efficient routing of messages to nodes
• Node insertion/deletion only affects a few
nodes
Chord
• Chord provides operations:
– P2P hash lookup must give:
K0
• Lookup(key)  IP address
K11
– Uses Hash function:
N10
N1
• Key identifier = SHA-1 (key)
Circular
• Node identifier = SHA-1 (IP add)
ID
– Both are uniformly distributed
space
K7
– Both exist in the same ID space
K4
• How to map key IDs to node IDs?
– A key is stored at its successor: node with next higher
ID (modulo N)
Chord continued…..
start
Interval
Succ
100
[100,101)
110
101
[101,103)
5
103
[103,107)
5
107
[107,115)
5
…
…
…
9
[9,13)
10
13
[13,21)
20
N5
N110
N10
K19
N20
N99
N32
115
[115,3)
5
3
[3,35)
5
35
[35,100)
60
N80
N60
Lookup
(K19)
Analysis of Chord
In a system with N nodes and K keys:
 Each node manages at most K/N keys
 Bound information stored in every node
 Lookups resolved with O(logN) hops
Benefits of P2P Networks
• Ideally:
– Allows peers anywhere to share information and/or
resources dynamically
– Decentralized
– Resilient to failures and network changes
– Utilizes resources located closer to requesting nodes
References
• www.ececs.uc.edu/~annexste/Papers/OL
N.ppt
• www.dcs.bbk.ac.uk/selene/reports/SeLeN
eP2P.ppt
• www.eecg.utoronto.ca/~vinod/mie456/p2pi
ntro.ppt
• L. Singh, Z. Joseph: Search Algorithms in
Peer to Peer Networks (CSE5311 Fall ‘05)