Concurrent reading and writing with mobile agents


Introduction to Peer-to-Peer Networks
What is a P2P network?
• Uses the vast resource of the machines at the edge
of the Internet to build a network that allows resource
sharing without any central authority.
• Client-Server vs. Peer-to-peer. A peer is both a
client and a server. Control is decentralized.
• Much more than a system for sharing pirated
music.
Historical Perspective
• The Internet originally emphasized working in the P2P
mode instead of the client-server mode.
• SRI, UCLA, UCSB and University of Utah had powerful
host machines forming a league of equals. ARPANET
arranged to integrate them in the late 1960s.
Historical Perspective
• USENET was originally based on UUCP (Unix-to-Unix
Copy Protocol). It allowed users on two different
Unix machines to exchange messages and files.
Why does P2P need attention?
Overlay network
A P2P network is an overlay network. Each link
between peers consists of one or more IP links.
[Figure: overlay network linking the peers Bob, Alice, and Carol]
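The overlay idea can be sketched in a few lines. This is a minimal illustration, not a description of any real system: the router names r1 to r3 and the topology are hypothetical, and only the peer names come from the slide. Each overlay link between two peers is resolved to a path of IP links in the underlying network.

```python
from collections import deque

# Hypothetical underlay: hosts and IP routers (r1..r3 are made-up names).
underlay = {
    "bob": ["r1"],
    "alice": ["r2"],
    "carol": ["r3"],
    "r1": ["bob", "r2"],
    "r2": ["r1", "alice", "r3"],
    "r3": ["r2", "carol"],
}

# Overlay links: peers that consider each other neighbors.
overlay_links = [("bob", "alice"), ("alice", "carol"), ("bob", "carol")]


def ip_path(src, dst):
    """Breadth-first search for a shortest underlay path from src to dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in underlay[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # dst is unreachable


def ip_hops(src, dst):
    """Number of IP links that make up one overlay link."""
    return len(ip_path(src, dst)) - 1


for a, b in overlay_links:
    print(f"overlay link {a}-{b}: {ip_hops(a, b)} IP links via {ip_path(a, b)}")
```

In this toy topology a single overlay link such as bob-carol spans four IP links, which is why overlay hop counts can understate the real network cost.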
Well-known P2P Systems
• Napster
• Gnutella
• KaZaA
• Limewire
• eDonkey
• Chord
• Tapestry
• CAN
• Pastry
• BitTorrent
• Kademlia
• Skype
• Various social networks
Some important issues
Search
Storage
Security
Applications
A Distributed Storage Service
[Figure: storage shared among the peers Bob, Alice, Carol, and David]
Promises
Consider File Sharing as an Example
– Available 24/7
– Durable despite machine failures
– Information is protected
– Resilient to Denial of Service
Additional Goals
• Massive scalability
• Anonymity
• Deniability
• Resistance to censorship
Challenges
• A P2P network must be self-organizing. Join
and leave operations must be self-managed.
• The infrastructure is untrusted and the
components are unreliable. The number of
faulty nodes grows linearly with system size.
Yet, the aggregate behavior has to be
trustworthy.
Challenges
• Tolerance to failures and churn
• Efficient routing even if the structure of the
network is unpredictable.
• Dealing with freeriders
• Load balancing
• Security issues
Looking up data
• How do you locate data/files/objects in a large P2P
system built around a dynamic set of nodes in a
scalable manner without any centralized server or
hierarchy?
• Napster index servers used a central database.
Questionable scalability and poor resilience.
• Check how names are looked up in the Internet’s DNS.
Napster
[Figure: users connect over the Internet to a root/redirector, which points to three directory servers. The directory servers store indices of songs only.]
Developed by Shawn Fanning in 1999. Shut down after 2 years for
copyright infringement. Centralized directory servers were a bottleneck.
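A rough sketch of the central-index idea follows. The class and method names are hypothetical and this is not Napster's actual protocol; it only illustrates that the directory stores an index of who has which song, while the songs themselves stay on the peers.

```python
from collections import defaultdict


class DirectoryServer:
    """Toy central index: maps song titles to the peers that hold them."""

    def __init__(self):
        self.index = defaultdict(set)  # title -> set of peer addresses

    def register(self, peer, titles):
        """A peer announces the titles it is willing to share."""
        for title in titles:
            self.index[title.lower()].add(peer)

    def unregister(self, peer):
        """Remove a departing peer from every index entry."""
        for peers in self.index.values():
            peers.discard(peer)

    def lookup(self, title):
        """Return the peers holding the title; the download is peer-to-peer."""
        return sorted(self.index.get(title.lower(), ()))


server = DirectoryServer()
server.register("alice:6699", ["Double Helix"])
server.register("bob:6699", ["Double Helix", "Another Song"])
print(server.lookup("double helix"))
```

Every lookup goes through this one object, which makes both the bottleneck and the single point of failure easy to see.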
Gnutella
Truly decentralized system. A search such as
“where is Double Helix?”
is carried out by flooding the query over a graph of
arbitrary topology. This has an obvious scalability
problem, and the wasted bandwidth caused serious
inefficiencies.
[Figure: Gnutella graph. A client looking for “double helix” floods its query through the graph until it reaches a node holding the file.]
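Query flooding can be sketched as a breadth-first traversal with a hop limit. This is a simplified illustration under stated assumptions: the five-node graph, the `ttl` parameter, and the function name are hypothetical, and real Gnutella messages carry more state (message IDs, reverse-path routing of hits). The message counter shows where the bandwidth goes.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = [start] if wanted in files.get(start, ()) else []
    messages = 0
    frontier = deque([(start, None, ttl)])  # (node, who sent it the query, hops left)
    while frontier:
        node, sender, remaining = frontier.popleft()
        if remaining == 0:
            continue  # hop limit reached, do not forward further
        for neighbor in graph[node]:
            if neighbor == sender:
                continue  # never send the query back where it came from
            messages += 1  # every forwarded copy costs bandwidth
            if neighbor in seen:
                continue  # duplicate copy, dropped by the receiver
            seen.add(neighbor)
            if wanted in files.get(neighbor, ()):
                hits.append(neighbor)
            frontier.append((neighbor, node, remaining - 1))
    return hits, messages


# Hypothetical topology; only node E holds the file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}
files = {"E": {"double helix"}}

print(flood_search(graph, files, "A", "double helix", ttl=2))  # (['E'], 6)
print(flood_search(graph, files, "A", "double helix", ttl=1))  # ([], 2)
```

Even in this five-node graph, finding one file costs six messages, two of which are duplicates that the receivers drop. With a hop limit that is too small, the search misses the file entirely.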
Unstructured vs. Structured
• Unstructured P2P networks allow resources
to be placed at any node. The network
topology is arbitrary, and the growth is
spontaneous.
• Structured P2P networks simplify resource
location and load balancing by defining a
topology and defining rules for resource
placement.
Distributed Hash Table (DHT)
Object-to-machine mapping uses unique keys.
H (object name) = key (H = hash function)
H (machine name) = key
An object whose name is mapped to key k is placed on
the machine whose name is mapped to key k.
Simplifies object location.
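The mapping can be sketched as follows. Several choices here are assumptions that the slide does not make: SHA-1 as the hash function H, a keyspace of size 2^16, and the machine and object names. Because an object key rarely coincides exactly with a machine key, the sketch also assumes the successor rule used by ring-based DHTs such as Chord: the object goes to the first machine whose key is at or after the object's key, wrapping around at N-1.

```python
import hashlib
from bisect import bisect_left

KEYSPACE = 2 ** 16  # N; keys range over 0 .. N-1 (illustrative size)


def H(name):
    """Hash a machine or object name to a key in the shared keyspace."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % KEYSPACE


def responsible_machine(object_name, machines):
    """Pick the machine whose key is the first one at or after the object's key."""
    ring = sorted((H(m), m) for m in machines)
    keys = [k for k, _ in ring]
    i = bisect_left(keys, H(object_name)) % len(ring)  # wrap past N-1 back to 0
    return ring[i][1]


machines = ["machine-a", "machine-b", "machine-c"]  # hypothetical names
for obj in ["double helix", "report.pdf"]:
    print(obj, "-> key", H(obj), "-> stored on", responsible_machine(obj, machines))
```

Any peer that knows the hash function can compute where an object lives without asking a central directory, which is what simplifies object location.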
Distributed Hash Table (DHT)
[Figure: basic idea. A circular keyspace running from 0 to N-1 with points a, b, and c marked on it. A machine name and an object name are both hashed to b.]