Peer-to-Peer Networks
What is a P2P network

A P2P network is a large distributed system. It uses the vast
resources of PCs owned by ordinary people to build a network
that allows resource sharing without any central authority.

Client-Server vs. Peer-to-peer. A peer is both a client and a
server. Control is decentralized.

Much more than a system for sharing pirated music.
Why does P2P need attention?
A P2P network is an overlay network
Network of peers. Each link between peers consists of one or more IP links.
The overlay network resides in the application layer.
[Figure: overlay network connecting the peers Bob, Alice, and Carol]
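
The point that one overlay link spans one or more IP links can be illustrated with a minimal sketch. The router names (R1, R2, R3) and both topologies below are hypothetical, chosen only for illustration; the peer names are taken from the slide.

```python
from collections import deque

# Hypothetical underlay: IP-level links between hosts and routers (assumed topology).
UNDERLAY = {
    "Bob": ["R1"],
    "R1": ["Bob", "R2", "R3"],
    "R2": ["R1", "Alice"],
    "R3": ["R1", "Carol"],
    "Alice": ["R2"],
    "Carol": ["R3"],
}

# Overlay: application-layer links that connect peers only (assumed as well).
OVERLAY = [("Bob", "Alice"), ("Alice", "Carol"), ("Bob", "Carol")]


def ip_path(src, dst):
    """Breadth-first search for a shortest IP-level path from src to dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in UNDERLAY[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def overlay_link_costs():
    """Number of IP links underneath each overlay link."""
    return {(a, b): len(ip_path(a, b)) - 1 for a, b in OVERLAY}


if __name__ == "__main__":
    for link, hops in overlay_link_costs().items():
        print(link, "spans", hops, "IP links")
```

The peers see only the three overlay links; the routers in between are invisible at the application layer.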
Some Well-known P2P Systems

Napster

Gnutella

KaZaA

Chord

Tapestry

CAN

Pastry

BitTorrent
Research trends
Search
Storage
Security
Applications
A Distributed File Service
[Figure: a distributed file service among the peers Bob, Alice, Carol, and David]
Promises
Consider File Sharing as an Example

Available 24/7

Durable despite machine failures

Information is protected

Resilient to Denial of Service
New Goals

Massive scalability

Anonymity

Deniability

Resistance to censorship
Challenges

A P2P network must be self-organizing

Untrusted infrastructure and unreliable
components. The number of faulty nodes grows
linearly with system size. Yet, the aggregate
behavior must be trustworthy.

A P2P network must be scalable.
Challenges

Tolerance to failures and churn

Efficient routing even if the structure of the
network is unpredictable.

Dealing with freeriders

Load balancing

Security issues
Looking up data



How do you locate data/files/objects in a large
P2P system built around a dynamic set of nodes in
a scalable manner without any centralized server
or hierarchy?
Napster index servers used a central database.
Poor scalability and poor resilience.
Check how names are looked up in the Internet's DNS.
Unstructured vs. Structured


Unstructured P2P networks allow resources to be
placed at any node, and the network topology is
arbitrary.
Structured P2P networks simplify resource
location and load balancing by defining a topology
and defining rules for resource placement.
Napster
Used a centralized index server that became a
bottleneck. Easy to censor.
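
A centralized index of this kind might look roughly like the sketch below. The class name `CentralIndex` and its methods are hypothetical and do not reproduce Napster's actual protocol. The sketch only shows why every lookup depends on a single server.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a central index server (not Napster's real API)."""

    def __init__(self):
        # file name -> set of peers that currently offer it
        self._holders = defaultdict(set)

    def register(self, peer, files):
        """A peer announces the files it shares when it joins."""
        for name in files:
            self._holders[name].add(peer)

    def unregister(self, peer):
        """Remove a departing peer from every entry."""
        for holders in self._holders.values():
            holders.discard(peer)

    def lookup(self, name):
        """Return the peers offering `name`; the transfer itself is peer to peer."""
        return sorted(self._holders.get(name, ()))


if __name__ == "__main__":
    index = CentralIndex()
    index.register("Alice", ["song.mp3"])
    index.register("Bob", ["song.mp3", "talk.pdf"])
    print(index.lookup("song.mp3"))
```

Every `register` and `lookup` call goes through the one `CentralIndex` object, so its capacity bounds the whole system, and shutting it down stops all searches.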
Gnutella
Truly decentralized system, but search is based on
flooding the queries. Obvious scalability problem,
and the wasted bandwidth caused serious
inefficiencies.
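
The bandwidth cost of flooding can be seen in a small simulation. This is a simplified sketch, not the Gnutella wire protocol: the function `flood_search`, the four-peer topology, and the rule that a peer forwards the query to all of its neighbors (including the one it heard it from) are assumptions made for brevity.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Flood a query from `start` for at most `ttl` hops.

    Returns (peers holding `wanted`, number of query messages sent).
    """
    seen = {start}
    hits = [start] if wanted in files.get(start, ()) else []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL expired: this peer does not forward the query
        for nxt in neighbors[peer]:
            messages += 1  # every forwarded copy costs bandwidth
            if nxt in seen:
                continue  # duplicate is dropped, but it was still sent
            seen.add(nxt)
            if wanted in files.get(nxt, ()):
                hits.append(nxt)
            frontier.append((nxt, hops_left - 1))
    return hits, messages


if __name__ == "__main__":
    # Hypothetical four-peer overlay; only D holds the file.
    neighbors = {
        "A": ["B", "C"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["B", "C"],
    }
    files = {"D": {"song.mp3"}}
    print(flood_search(neighbors, files, "A", "song.mp3", ttl=2))
```

In this toy overlay, eight messages are sent to find one copy among four peers, and most of them are duplicates that the receiver discards.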
Distributed Hash Table (DHT)
Object-to-machine mapping uses unique keys.
H (object name) = key
H (machine name) = key
(H = hash function)
The object mapped to key k is placed on the machine with
key k.
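
The mapping can be sketched as follows. Two assumptions go beyond the slide. First, H is taken to be SHA-1 truncated to a small 16-bit key space. Second, because an object key rarely equals a machine key exactly, the sketch borrows the successor rule used by Chord-style DHTs: the object goes to the first machine whose key is greater than or equal to k, wrapping around. The machine names are hypothetical.

```python
import hashlib
from bisect import bisect_left

KEY_BITS = 16  # small key space, assumed here only to keep keys readable


def H(name):
    """Hash a name to a key in [0, 2**KEY_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** KEY_BITS)


def build_ring(machines):
    """Sort machines by key so each key has a well-defined successor."""
    return sorted((H(m), m) for m in machines)


def locate(ring, object_name):
    """Return the machine responsible for the object (successor rule, assumed)."""
    k = H(object_name)
    keys = [key for key, _ in ring]
    i = bisect_left(keys, k) % len(ring)  # wrap past the largest key
    return ring[i][1]


if __name__ == "__main__":
    ring = build_ring(["bob-pc", "alice-pc", "carol-pc", "david-pc"])
    for name in ["song.mp3", "talk.pdf"]:
        print(name, "has key", H(name), "and is stored on", locate(ring, name))
```

Any peer that knows the ring can compute the location of an object from its name alone, with no index server and no flooding.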
Interesting issues
Search
Storage
Security
Applications