A P2P network is an overlay network
Peer-to-Peer Networks
What is a P2P network
A P2P network is a large distributed system. It uses the vast
resources of PCs owned by ordinary people to build a network
that allows resource sharing without any central authority.
Client-Server vs. Peer-to-peer. A peer is both a client and a
server. Control is decentralized.
Much more than a system for sharing pirated music.
Why does P2P need attention?
A P2P network is an overlay network
Network of peers. Each link between peers consists of one or more IP links.
The overlay network resides in the application layer.
[Figure: overlay network of peers Bob, Alice, and Carol]
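The idea that one overlay link spans one or more IP links can be illustrated with a small sketch. The router names (r1, r2, r3), the underlay topology, and the helper ip_path are hypothetical and chosen only for illustration; the code finds the IP-level path behind each peer-to-peer link with a breadth-first search.

```python
from collections import deque

def ip_path(underlay, src, dst):
    """Return a shortest hop path from src to dst in the underlay (BFS)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to src, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in underlay[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # dst is unreachable

# Hypothetical IP-level topology: three hosts attached to three routers.
underlay = {
    "alice": ["r1"],
    "bob": ["r2"],
    "carol": ["r3"],
    "r1": ["alice", "r2"],
    "r2": ["r1", "bob", "r3"],
    "r3": ["r2", "carol"],
}

# Overlay links exist only between peers; routers are invisible to the overlay.
overlay_links = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]

for a, b in overlay_links:
    path = ip_path(underlay, a, b)
    print(a, "-", b, ":", len(path) - 1, "IP links via", path)
```

In this toy topology a single overlay hop from alice to carol crosses four IP links, which is why overlay path length and actual network cost can differ.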
Some Well-known P2P Systems
Napster
Gnutella
KaZaA
Chord
Tapestry
CAN
Pastry
BitTorrent
Research trends
Search
Storage
Security
Applications
A Distributed File Service
[Figure: peers Bob, Alice, Carol, and David]
Promises
Consider File Sharing as an Example
Available 24/7
Durable despite machine failures
Information is protected
Resilient to Denial of Service
New Goals
Massive scalability
Anonymity
Deniability
Resistance to censorship
Challenges
A P2P network must be self-organizing
Untrusted infrastructure and unreliable
components. The number of faulty nodes grows
linearly with system size. Yet, the aggregate
behavior must be trustworthy.
A P2P network must be scalable.
Challenges
Tolerance to failures and churn
Efficient routing even if the structure of the
network is unpredictable.
Dealing with freeriders
Load balancing
Security issues
Looking up data
How do you locate data/files/objects in a large
P2P system built around a dynamic set of nodes in
a scalable manner without any centralized server
or hierarchy?
Napster's index servers used a central database, which gave
poor scalability and poor resilience.
Compare with how names are looked up in the Internet's DNS.
Unstructured vs. Structured
Unstructured P2P networks allow resources to be
placed at any node, and the network topology is
arbitrary.
Structured P2P networks simplify resource
location and load balancing by defining a topology
and defining rules for resource placement.
Napster
Used a centralized index server that became a
bottleneck. Easy to censor.
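A minimal sketch of the centralized-index idea, not Napster's actual protocol: the class CentralIndex and its methods are hypothetical names. Peers register their file names with one server, and every lookup goes through that server, which is why it is both a bottleneck and a single point to censor.

```python
from collections import defaultdict

class CentralIndex:
    """Hypothetical central index: maps each file name to the peers holding it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer, filenames):
        # A peer announces the files it is willing to share.
        for name in filenames:
            self._index[name].add(peer)

    def unregister(self, peer):
        # A peer leaves; remove it from every entry.
        for peers in self._index.values():
            peers.discard(peer)

    def lookup(self, filename):
        # Every query in the system hits this one method on one server.
        return sorted(self._index.get(filename, set()))

index = CentralIndex()
index.register("alice", ["song.mp3", "notes.txt"])
index.register("bob", ["song.mp3"])
print(index.lookup("song.mp3"))
```

The file transfer itself would still happen directly between peers; only the search is centralized in this sketch.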
Gnutella
Truly decentralized system, but search is based on
flooding the queries. This has an obvious scalability
problem, and the wasted bandwidth caused serious
inefficiencies.
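The cost of flooding can be seen in a simplified sketch. This is not the Gnutella wire protocol: the function flood_search, the TTL handling, and the four-peer overlay are assumptions made for illustration. Each peer forwards the query to all neighbors except the one it came from, until the time-to-live runs out, and the code counts every message sent.

```python
from collections import deque

def flood_search(graph, start, has_file, ttl):
    """Flood a query from start. Returns (peers holding the file, messages sent)."""
    seen = {start}
    hits = []
    messages = 0
    queue = deque([(start, None, ttl)])  # (peer, peer it heard the query from, TTL left)
    while queue:
        node, sender, remaining = queue.popleft()
        if has_file(node):
            hits.append(node)
        if remaining == 0:
            continue  # TTL expired: do not forward any further
        for neighbor in graph[node]:
            if neighbor == sender:
                continue  # never send the query straight back
            messages += 1  # duplicates still cost bandwidth
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, node, remaining - 1))
    return hits, messages

# Hypothetical overlay of four peers.
overlay = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "carol", "david"],
    "carol": ["alice", "bob", "david"],
    "david": ["bob", "carol"],
}

print(flood_search(overlay, "alice", lambda p: p == "david", ttl=2))
print(flood_search(overlay, "alice", lambda p: p == "david", ttl=1))
```

Even in this four-peer overlay, finding one file two hops away takes six messages, several of them duplicates, and a TTL that is too small misses the file entirely.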
Distributed Hash Table (DHT)
Object --> machine mapping uses unique keys.
H (object name) = key
(H = hash function)
H (machine name) = key
Object mapped to key k is placed in machine with
key k.
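The mapping above can be sketched as follows. Two assumptions go beyond the slide: keys are SHA-1 digests reduced to a small 16-bit key space, and, since an object key rarely equals a machine key exactly, the object is placed on the first machine whose key is greater than or equal to k, wrapping around (the successor rule used by Chord-style systems). The names H, ToyDHT, and machine_for are hypothetical.

```python
import hashlib
from bisect import bisect_left

KEY_BITS = 16  # assumption: a small key space keeps the example readable

def H(name):
    """Hash a name (object or machine) into the shared key space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** KEY_BITS)

class ToyDHT:
    def __init__(self, machines):
        # Sort machines by key so the key space can be treated as a ring.
        self.ring = sorted((H(m), m) for m in machines)
        self.keys = [k for k, _ in self.ring]

    def machine_for(self, object_name):
        k = H(object_name)
        # First machine key >= k; wrap to the start of the ring if none.
        i = bisect_left(self.keys, k) % len(self.ring)
        return self.ring[i][1]

machines = ["alice-pc", "bob-pc", "carol-pc", "david-pc"]
dht = ToyDHT(machines)
for obj in ["song.mp3", "notes.txt", "paper.pdf"]:
    print(obj, "-> key", H(obj), "->", dht.machine_for(obj))
```

With this placement rule, adding a machine only moves the objects that now fall to the new machine; all other objects stay where they were.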
Interesting issues
Search
Storage
Security
Applications