Concurrent reading and writing with mobile agents


Introduction to Peer-to-Peer Networks
What is a P2P network
• A P2P network is a large distributed system. It uses the
vast resources of PCs distributed at the edge of the Internet
to build a network that allows resource sharing without any
central authority.
• Client-Server vs. Peer-to-peer. A peer is both a client
and a server. Control is decentralized.
• Much more than a system for sharing pirated music.
Why does P2P need attention?
A P2P network is an overlay network
Network of peers. Each link between peers consists of one
or more IP links. The overlay network resides in the
application layer.
[Figure: overlay network of peers. Labels: Bob, Alice, Carol.]
Well-known P2P Systems
• Napster
• Gnutella
• KaZaA
• eDonkey
• Chord
• Tapestry
• CAN
• Pastry
• BitTorrent
Some important issues
Search
Storage
Security
Applications
A Distributed Storage Service
[Figure: distributed storage service. Labels: Bob, Alice, Carol, David.]
Promises
Consider File Sharing as an Example
– Available 24/7
– Durable despite machine failures
– Information is protected
– Resilient to Denial of Service
Additional Goals
• Massive scalability
• Anonymity
• Deniability
• Resistance to censorship
Challenges
• A P2P network must be self-organizing. Join
and leave operations must be self-managed.
• The infrastructure is untrusted and the
components are unreliable. The number of
faulty nodes grows linearly with system size.
Yet, the aggregate behavior has to be
trustworthy.
Challenges
• Tolerance to failures and churn
• Efficient routing even if the structure of the
network is unpredictable.
• Dealing with free riders
• Load balancing
• Security issues
Looking up data
• How do you locate data/files/objects in a large P2P
system built around a dynamic set of nodes in a
scalable manner without any centralized server or
hierarchy?
• Napster's index servers used a central database, which gave
questionable scalability and poor resilience.
• Check how names are looked up in the Internet's DNS.
Napster
[Figure: Napster architecture. Labels: users, Internet, root/redirector, three directory servers (store indices of songs only).]
Developed by Shawn Fanning in 1999 and shut down after 2 years for
copyright infringement. The centralized directory servers were a bottleneck.
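The central-index idea can be sketched as follows. This is only an illustration, not Napster's actual protocol: the class name DirectoryServer and its methods are hypothetical. The server stores indices only (which peer has which title); the files themselves stay on the peers.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central index: maps a song title to the peers that hold it."""

    def __init__(self):
        self.index = defaultdict(set)  # title -> set of peer names

    def register(self, peer, titles):
        # A peer announces the titles it is willing to share.
        for title in titles:
            self.index[title].add(peer)

    def unregister(self, peer):
        # Called when a peer leaves; its entries are dropped from the index.
        for peers in self.index.values():
            peers.discard(peer)

    def lookup(self, title):
        # Returns peer names only; the download then happens peer to peer.
        return sorted(self.index.get(title, ()))


server = DirectoryServer()
server.register("alice", ["double helix"])
server.register("bob", ["double helix"])
print(server.lookup("double helix"))  # ['alice', 'bob']
```

Every lookup goes through this one object, which is why such a server becomes both a bottleneck and a single point of failure.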
Gnutella
A truly decentralized system. A search such as
“Where is Double Helix?”
floods the query over a graph of
arbitrary topology. This has an obvious scalability problem, and
the wasted bandwidth caused serious
inefficiencies.
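Query flooding can be sketched as a breadth-first traversal of the peer graph. The sketch below is a simplification under stated assumptions: the graph layout (a dict with "links" and "files" per peer), the hop limit ttl, and the function name flood_search are all illustrative, and real message formats are ignored. It counts every forwarded copy of the query to make the bandwidth cost visible.

```python
from collections import deque


def flood_search(graph, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = []
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in graph[peer]["files"]:
            hits.append(peer)
        if hops_left == 0:
            continue  # hop limit reached: do not forward any further
        for neighbor in graph[peer]["links"]:
            messages += 1  # every forwarded copy costs bandwidth
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical four-peer overlay; only peer D holds the file.
graph = {
    "A": {"links": ["B", "C"], "files": set()},
    "B": {"links": ["A", "C", "D"], "files": set()},
    "C": {"links": ["A", "B", "D"], "files": set()},
    "D": {"links": ["B", "C"], "files": {"double helix"}},
}
print(flood_search(graph, "A", "double helix", ttl=2))  # (['D'], 8)
```

Even in this tiny graph, finding one file two hops away takes 8 messages, most of them duplicates that are discarded on arrival.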
[Figure: Gnutella graph, showing a client looking for “double helix” and the peer that holds it.]
Unstructured vs. Structured
• Unstructured P2P networks allow resources
to be placed at any node. The network
topology is arbitrary, and the growth is
spontaneous.
• Structured P2P networks simplify resource
location and load balancing by defining a
topology and defining rules for resource
placement.
Distributed Hash Table (DHT)
Object-to-machine mapping uses unique keys.
H(object name) = key, where H is a hash function
H(machine name) = key
An object whose name maps to key k is placed on
the machine whose name maps to key k.
This simplifies object location.
[Figure: basic idea of a DHT. A keyspace running from 0 to N-1 with keys a, b, c marked; a machine name and an object name are both hashed to b.]
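The key mapping can be sketched in a few lines of Python. This is a minimal illustration, not the scheme of any particular system: SHA-1 as H, the 160-bit keyspace, and the helper name responsible_machine are assumptions. Because an object key rarely equals a machine key exactly, the sketch also assumes a fallback rule: the object goes to the first machine key at or after the object key, wrapping around the keyspace.

```python
import hashlib
from bisect import bisect_left

N = 2 ** 160  # assumption: keyspace 0 .. N-1 sized to the SHA-1 output


def H(name):
    """Hash a name (object or machine) to a key in the keyspace."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def responsible_machine(object_name, machines):
    """Return the machine that stores the object under the assumed rule."""
    ring = sorted((H(m), m) for m in machines)  # machines ordered by key
    keys = [key for key, _ in ring]
    i = bisect_left(keys, H(object_name))  # first machine key >= object key
    return ring[i % len(ring)][1]  # wrap around past the largest key


machines = ["alice", "bob", "carol"]
# An object whose name hashes to the same key as a machine lands on that machine.
print(responsible_machine("bob", machines))  # bob
```

Any peer that knows the set of machine keys can compute the same answer locally, which is what makes object location simple compared with flooding.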