Peer-to-Peer Overlay Networks

P2P Overlay Networks
By Behzad Akbari
Spring 2011

These slides are in some parts based on the slides of J. Kurose (UMASS) and Sanjay Rao.
Outline

- Overview of P2P overlay networks
- Applications of overlay networks
- Unstructured overlay networks
- Structured overlay networks
- Overlay multicast networks
- P2P media streaming networks
Overview of P2P overlay networks

- What is a P2P system?
  - P2P refers to applications that take advantage of resources (storage, cycles, content, bandwidth, human presence) available at the end systems of the Internet.
- What is an overlay network?
  - Overlay networks are networks constructed on top of another network (e.g. IP).
- What is a P2P overlay network?
  - Any overlay network constructed by Internet peers in the application layer on top of the IP network.
Overview of P2P overlay networks

- P2P overlay network properties
  - Efficient use of resources
  - Self-organizing
    - All peers organize themselves into an application-layer network on top of IP.
  - Scalability
    - Consumers of resources also donate resources
    - Aggregate resources grow naturally with utilization
  - Reliability
    - No single point of failure
    - Redundant overlay links between the peers
    - Redundant data sources
  - Ease of deployment and administration
    - The nodes are self-organized
    - No need to deploy servers to satisfy demand
    - Built-in fault tolerance, replication, and load balancing
    - No change needed in the underlying IP network
Applications of P2P overlay networks

- P2P file sharing
  - Napster, Gnutella, KaZaA, eMule, eDonkey, BitTorrent, etc.
- Application-layer multicasting
- P2P media streaming
- Content distribution
- Distributed caching
- Distributed storage
- Distributed backup systems
- Grid computing
Classification of P2P overlay networks

- Unstructured overlay networks
  - The overlay network organizes peers in a random graph, in a flat or hierarchical manner.
- Structured overlay networks
  - Based on Distributed Hash Tables (DHTs)
  - The overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer.
- Overlay multicast networks
  - The peers organize themselves into an overlay tree for multicasting.
- P2P media streaming networks
  - Used for large-scale media streaming applications (e.g. IPTV) over the Internet.
Unstructured P2P File Sharing Networks

- Centralized directory based P2P systems
- Pure P2P systems
- Hybrid P2P systems
Unstructured P2P File Sharing Networks (…)

- Centralized directory based P2P systems
  - All peers are connected to a central entity
  - Peers establish connections between each other on demand to exchange user data (e.g. MP3 compressed data)
  - The central entity is necessary to provide the service
  - The central entity is some kind of index/group database
  - The central entity acts as a lookup/routing table
  - Examples: Napster, BitTorrent
Unstructured P2P File Sharing Networks (…)

- Pure P2P systems
  - Any terminal entity can be removed without loss of functionality
  - No central entities employed in the overlay
  - Peers establish connections between each other randomly
    - To route request and response messages
    - To insert request messages into the overlay
  - Examples: Gnutella, Freenet
Unstructured P2P File Sharing Networks (…)

- Hybrid P2P systems
  - Main characteristic, compared to pure P2P: introduction of another dynamic hierarchical layer
  - Election process to select and assign super peers
  - Super peers: high degree (degree >> 20, depending on network size)
  - Leaf nodes: connected to one or more super peers (degree < 7)
  - Example: KaZaA

[Figure: overlay with super peers and attached leaf nodes]
Centralized Directory based P2P systems: Napster

- Napster history: the rise
  - January 1999: Napster version 1.0
  - May 1999: company founded
  - December 1999: first lawsuits
  - 2000: 80 million users
- Napster history: the fall
  - Mid 2001: out of business due to lawsuits
  - Mid 2001: dozens of P2P alternatives that were harder to touch, though these have gradually been constrained
Napster Technology: Directory Service

- User installs the software
  - Downloads the client program
  - Registers name, password, local directory, etc.
- Client contacts Napster (via TCP)
  - Provides a list of music files it will share
  - … and Napster's central server updates the directory
- Client searches on a title or performer
  - Napster identifies online clients with the file
  - … and provides IP addresses
- Client requests the file from the chosen supplier
  - Supplier transmits the file to the client
  - Both client and supplier report status to Napster

[Figure: a centralized directory server mediating lookups between peers Alice and Bob, who transfer files directly]
Napster Technology: Properties

- Server's directory continually updated
  - Always know what music is currently available
  - Point of vulnerability for legal action
- Peer-to-peer file transfer
  - No load on the server
  - Plausible deniability for legal action (but not enough)
- Proprietary protocol
  - Login, search, upload, download, and status operations
  - No security: clear-text passwords and other vulnerabilities
- Bandwidth issues
  - Suppliers ranked by apparent bandwidth & response time
Napster: Limitations of Central Directory

- File transfer is decentralized, but locating content is highly centralized
  - Single point of failure
  - Performance bottleneck
  - Copyright infringement
- So, later P2P systems were more distributed
  - Gnutella went to the other extreme…
Pure P2P system: Gnutella

- Gnutella history
  - 2000: J. Frankel & T. Pepper released Gnutella
  - Soon after: many other clients (e.g., Morpheus, LimeWire, BearShare)
  - 2001: protocol enhancements, e.g., "ultrapeers"
- Query flooding
  - Join: contact a few nodes to become neighbors
  - Publish: no need!
  - Search: ask neighbors, who ask their neighbors
  - Fetch: get the file directly from another node
Gnutella: Query Flooding

- Fully distributed
  - No central server
  - Public-domain protocol
  - Many Gnutella clients implementing the protocol
- Overlay network: a graph
  - Edge between peers X and Y if there is a TCP connection between them
  - All active peers and edges form the overlay network
  - A given peer will typically be connected to fewer than 10 overlay neighbors
Gnutella: Protocol

- Query message sent over existing TCP connections
- Peers forward the Query message
- QueryHit sent over the reverse path
- File transfer: HTTP
- Scalability: limited-scope flooding (see the sketch below)

[Figure: a Query flooded across the overlay; QueryHits returned along the reverse path]
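A minimal sketch of limited-scope query flooding over a toy in-memory overlay, assuming simplified message fields and made-up node names (this is not the actual Gnutella wire format; the TTL value and file names are illustrative):

```python
class Node:
    """Toy Gnutella-style peer: floods Query messages to its neighbors with a TTL."""
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)
        self.neighbors = []    # overlay edges (TCP connections in real Gnutella)
        self.seen = set()      # message IDs already handled, to avoid forwarding loops

    def handle_query(self, msg_id, keyword, ttl, origin):
        if msg_id in self.seen:
            return
        self.seen.add(msg_id)
        if keyword in self.files:
            origin.query_hit(self.name, keyword)   # real Gnutella sends the hit back on the reverse path
        if ttl > 1:                                # limited scope: stop forwarding when the TTL runs out
            for n in self.neighbors:
                n.handle_query(msg_id, keyword, ttl - 1, origin)

    def search(self, msg_id, keyword, ttl=2):
        self.seen.add(msg_id)
        for n in self.neighbors:
            n.handle_query(msg_id, keyword, ttl, origin=self)

    def query_hit(self, holder, keyword):
        print(f"{self.name}: QueryHit, '{keyword}' available at {holder}")

# A - B - C - D chain; with TTL 2 the query reaches C (2 hops) but never D (3 hops).
a, b, c, d = Node("A", []), Node("B", []), Node("C", ["song.mp3"]), Node("D", ["song.mp3"])
a.neighbors, b.neighbors, c.neighbors, d.neighbors = [b], [a, c], [b, d], [c]
a.search(msg_id=1, keyword="song.mp3", ttl=2)
```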
Gnutella: Peer Joining

- Joining peer X must find some other peers
  - Start with a list of candidate peers
  - X sequentially attempts TCP connections with peers on the list until a connection is set up with some peer Y
- X sends a Ping message to Y
  - Y forwards the Ping message
  - All peers receiving the Ping message respond with a Pong message
- X receives many Pong messages
  - X can then set up additional TCP connections
Gnutella: Pros and Cons

- Advantages
  - Fully decentralized
  - Search cost distributed
  - Processing per node permits powerful search semantics
- Disadvantages
  - Search scope may be quite large
  - Search time may be quite long
  - High overhead, and nodes come and go often
Hybrid P2P system: KaZaA

- KaZaA history
  - 2001: created by a Dutch company (Kazaa BV)
- Smart query flooding (see the sketch below)
  - Join: on start, the client contacts a super-node (and may later become one)
  - Publish: the client sends a list of its files to its super-node
  - Search: send the query to the super-node, and the super-nodes flood queries among themselves
  - Fetch: get the file directly from peer(s); can fetch from multiple peers at once
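A minimal sketch of the two-tier lookup idea, assuming a toy in-memory index and illustrative names (`SuperNode`, `leafB`, and so on); the real KaZaA/FastTrack protocol is proprietary and is not modeled here:

```python
class SuperNode:
    """Toy super-node: indexes its leaves' files and floods queries to peer super-nodes."""
    def __init__(self, name):
        self.name = name
        self.index = {}     # filename -> set of leaf names (built from the leaves' publish lists)
        self.peers = []     # other super-nodes

    def publish(self, leaf_name, files):
        for f in files:
            self.index.setdefault(f, set()).add(leaf_name)

    def search(self, filename, seen=None):
        seen = seen or set()
        seen.add(self.name)
        holders = set(self.index.get(filename, set()))
        for sn in self.peers:                 # super-nodes flood the query among themselves
            if sn.name not in seen:
                holders |= sn.search(filename, seen)
        return holders

# A leaf publishes to its super-node; another leaf searches through its own super-node.
sn1, sn2 = SuperNode("SN1"), SuperNode("SN2")
sn1.peers, sn2.peers = [sn2], [sn1]
sn2.publish("leafB", ["video.avi"])
print(sn1.search("video.avi"))    # {'leafB'}: found via super-node flooding
```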
KaZaA: Exploiting Heterogeneity

- Each peer is either a group leader or assigned to a group leader
  - TCP connection between a peer and its group leader
  - TCP connections between some pairs of group leaders
- Group leader tracks the content of all its children

[Figure: ordinary peers, group-leader peers, and neighboring relationships in the overlay network]
KaZaA: Motivation for Super-Nodes

- Query consolidation
  - Many connected nodes may have only a few files
  - Propagating a query to a sub-node may take more time than for the super-node to answer itself
- Stability
  - Super-node selection favors nodes with high uptime
  - How long you've been on is a good predictor of how long you'll be around in the future
P2P Case study: Skype

- Inherently P2P: pairs of users communicate
- Proprietary application-layer protocol (inferred via reverse engineering)
- Hierarchical overlay with supernodes (SNs)
- Index maps usernames to IP addresses; distributed over SNs

[Figure: Skype clients (SC), supernodes (SN), and the Skype login server]
Peers as relays

- Problem when both Alice and Bob are behind "NATs"
  - A NAT prevents an outside peer from initiating a call to an inside peer
- Solution:
  - Using Alice's and Bob's SNs, a relay is chosen
  - Each peer initiates a session with the relay
  - The peers can now communicate through the NATs via the relay
BitTorrent

- BitTorrent history and motivation
  - 2002: B. Cohen debuted BitTorrent
  - Key motivation: popular content
    - Popularity exhibits temporal locality (flash crowds)
  - Focused on efficient fetching, not searching
    - Distribute the same file to many peers
    - Single publisher, many downloaders
  - Preventing free-loading
BitTorrent: Simultaneous Downloading

- Divide a large file into many pieces
  - Replicate different pieces on different peers
  - A peer with a complete piece can trade with other peers
  - A peer can (hopefully) assemble the entire file
- Allows simultaneous downloading
  - Retrieving different parts of the file from different peers at the same time
  - And uploading parts of the file to peers
  - Important for very large files
BitTorrent: Tracker

- Infrastructure node
  - Keeps track of the peers participating in the torrent
- Peers register with the tracker
  - A peer registers when it arrives
  - A peer periodically informs the tracker it is still there
- Tracker selects peers for downloading
  - Returns a random set of peers, including their IP addresses
  - So the new peer knows who to contact for data (see the sketch below)
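A minimal sketch of this tracker bookkeeping, assuming in-memory state, an illustrative 30-minute announce timeout, and made-up addresses; a real tracker additionally speaks the HTTP/UDP announce protocol and keeps per-torrent state:

```python
import random
import time

class Tracker:
    """Toy tracker: remembers which peers are in the torrent and hands out a random subset."""
    def __init__(self, peer_timeout=30 * 60):
        self.peers = {}                    # peer address -> last announce time
        self.peer_timeout = peer_timeout   # drop peers that stop announcing

    def announce(self, peer_addr, num_want=50):
        now = time.time()
        self.peers[peer_addr] = now        # register or refresh the announcing peer
        alive = [p for p, t in self.peers.items()
                 if now - t < self.peer_timeout and p != peer_addr]
        # return a random subset of the other live peers, as described on the slide
        return random.sample(alive, min(num_want, len(alive)))

tracker = Tracker()
for i in range(5):
    tracker.announce(("10.0.0.%d" % i, 6881))
print(tracker.announce(("10.0.0.99", 6881), num_want=3))   # three random peers to contact for data
```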
BitTorrent: Chunks

- Large file divided into smaller pieces
  - Fixed-sized chunks
  - Typical chunk size of 256 Kbytes
- Allows simultaneous transfers
  - Downloading chunks from different neighbors
  - Uploading chunks to other neighbors
- Learning what chunks your neighbors have
  - Periodically asking them for a list
- The file is done when all chunks are downloaded
BitTorrent: Overall Architecture

[Figure, repeated with animation over several slides: a web server hosting the torrent metadata, the tracker, the downloading peer "US" (a leech), two other leech peers A and B, and a seed peer C exchanging chunks]
BitTorrent: Chunk Request Order

- Which chunks to request?
  - Could download in order, like an HTTP client does
- Problem: many peers have the early chunks
  - Peers have little to share with each other
  - Limiting the scalability of the system
- Problem: eventually nobody has the rare chunks
  - E.g., the chunks near the end of the file
  - Limiting the ability to complete a download
- Solutions: random selection and rarest first
BitTorrent: Rarest Chunk First

- Which chunks to request first?
  - The chunk with the fewest available copies
  - I.e., the rarest chunk first (see the sketch below)
- Benefits to the peer
  - Avoid starvation when some peers depart
- Benefits to the system
  - Avoid starvation across all peers wanting a file
  - Balance load by equalizing the number of copies of chunks
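A minimal sketch of rarest-first selection, assuming each neighbor's advertised chunk set is already known (the chunk numbers and neighbor names are illustrative):

```python
import random
from collections import Counter

def pick_chunk(have, neighbor_bitfields):
    """Rarest-first: among chunks we still need and some neighbor has,
    choose one with the fewest copies among our neighbors (ties broken randomly)."""
    counts = Counter()
    for bits in neighbor_bitfields.values():
        for c in bits:
            if c not in have:
                counts[c] += 1
    if not counts:
        return None
    rarest = min(counts.values())
    return random.choice([c for c, n in counts.items() if n == rarest])

# We hold chunk 0; chunk 3 exists at only one neighbor, so it is requested first.
have = {0}
neighbors = {"A": {0, 1, 2}, "B": {1, 2}, "C": {2, 3}}
print(pick_chunk(have, neighbors))   # 3
```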
Free-Riding Problem in P2P Networks

- The vast majority of users are free-riders
  - Most share no files and answer no queries
  - Others limit the number of connections or the upload speed
- A few "peers" essentially act as servers
  - A few individuals contribute to the public good
  - Making them hubs that basically act as servers
- BitTorrent prevents free-riding
  - Allow the fastest peers to download from you
  - Occasionally let some free-loaders download
BitTorrent: Preventing Free-Riding

- A peer has limited upload bandwidth
  - And must share it among multiple peers
- Prioritizing the upload bandwidth: tit for tat (see the sketch below)
  - Favor neighbors that are uploading at the highest rate
- Rewarding the top four neighbors
  - Measure download bit rates from each neighbor
  - Reciprocate by sending to the top four peers
  - Recompute and reallocate every 10 seconds
- Optimistic unchoking
  - Randomly try a new neighbor every 30 seconds
  - So a new neighbor has a chance to become a better partner
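A minimal sketch of one choking round under the policy above (top-four reciprocation plus one optimistic unchoke); the rate values, peer names, and the way the optimistic pick is supplied are illustrative:

```python
def choose_unchoked(download_rates, interested, optimistic_pick, top_n=4):
    """Toy tit-for-tat round: unchoke the top_n interested neighbors by measured
    download rate, plus one randomly chosen neighbor (the optimistic unchoke)."""
    ranked = sorted(interested, key=lambda p: download_rates.get(p, 0.0), reverse=True)
    unchoked = set(ranked[:top_n])
    unchoked.add(optimistic_pick)   # recomputed every 10 s; optimistic pick rotated every 30 s
    return unchoked

rates = {"A": 120.0, "B": 80.0, "C": 300.0, "D": 10.0, "E": 0.0, "F": 50.0}
print(choose_unchoked(rates,
                      interested={"A", "B", "C", "D", "E", "F"},
                      optimistic_pick="E"))   # top four by rate, plus E as the optimistic unchoke
```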
BitTorrent Today

- Significant fraction of Internet traffic
  - Estimated at 30%
  - Though this is hard to measure
- Problem of incomplete downloads
  - Peers leave the system when done
  - Many file downloads never complete
  - Especially a problem for less popular content
- Still lots of legal questions remain
- Further need for incentives
Structured overlay networks

- Overlay topology construction is based on node IDs generated using Distributed Hash Tables (DHTs).
- In this category, the overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer.
- This structured graph enables efficient discovery of data items using the given keys.
- It guarantees locating an object in O(log n) hops.
- Examples: Content Addressable Network (CAN), Chord, Pastry. (A minimal key-lookup sketch follows.)
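As one concrete example of the key-to-peer mapping, here is a minimal Chord-style sketch: node and key names are hashed onto a small identifier ring, and a key is assigned to its successor on the ring. The identifier size and names are illustrative, and real DHTs add routing state so that the responsible node is reached in O(log n) hops rather than by scanning all nodes:

```python
import hashlib
from bisect import bisect_right

def node_id(name, bits=16):
    """Hash a name onto a small identifier ring of 2**bits points."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

def responsible_node(key, nodes, bits=16):
    """Chord-style rule: a key is stored at its successor,
    the first node clockwise from the key's position on the ring."""
    ring = sorted(node_id(n, bits) for n in nodes)
    by_id = {node_id(n, bits): n for n in nodes}
    k = node_id(key, bits)
    i = bisect_right(ring, k)
    return by_id[ring[i % len(ring)]]    # wrap around at the end of the ring

nodes = ["peer-%d" % i for i in range(8)]
print(responsible_node("song.mp3", nodes))   # the peer that would store/index this key
```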
Structured DHT-based P2P overlay networks

[Figure-only slide]
CAN: Content Addressable Network

- Divide the coordinate space into zones.
- Each key maps to one point in the d-dimensional space.
- Each node is responsible for all the keys in its zone. (See the sketch below.)

[Figure: a 2-dimensional space partitioned into zones owned by nodes A, B, C, D, and E]
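A minimal sketch of the CAN idea, assuming d = 2, a hash that maps keys to points in the unit square, and an illustrative fixed partition into four zones; real CAN splits zones dynamically as nodes join and routes greedily between neighboring zones:

```python
import hashlib

def key_to_point(key, d=2):
    """Hash a key to a point in the unit d-dimensional space (one coordinate per hash slice)."""
    h = hashlib.sha1(key.encode()).hexdigest()
    step = len(h) // d
    return tuple(int(h[i * step:(i + 1) * step], 16) / 16 ** step for i in range(d))

def owner(point, zones):
    """Return the node whose zone (an axis-aligned box, (lo, hi) per dimension) contains the point."""
    for node, box in zones.items():
        if all(lo <= x < hi for x, (lo, hi) in zip(point, box)):
            return node
    return None

# Four nodes splitting the 2-D space into equal quadrants (an illustrative partition).
zones = {"A": ((0.0, 0.5), (0.0, 0.5)), "B": ((0.5, 1.0), (0.0, 0.5)),
         "C": ((0.0, 0.5), (0.5, 1.0)), "D": ((0.5, 1.0), (0.5, 1.0))}
p = key_to_point("song.mp3")
print(p, "->", owner(p, zones))   # the zone owner is responsible for this key
```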
Pastry

- Prefix-based routing
- Route to a node whose ID shares a prefix with the key that is at least one digit longer than the current node's shared prefix. (See the sketch below.)
- Each node maintains a neighbor set, a leaf set, and a routing table.

[Figure: Route(d46a1c) from node 65a1fc via d13da3, d4213f, d462ba, and d467c4 toward d46a1c, with d471f1 nearby]
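A minimal sketch of one Pastry-style routing step using the node IDs from the figure. It only captures the prefix rule stated on the slide; the real routing table organization, leaf set, and numeric-closeness fallback are simplified away, so treat it as an assumption-laden illustration rather than the Pastry algorithm itself:

```python
def shared_prefix_len(a, b):
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current, key, known_nodes):
    """One routing step: prefer a known node whose ID shares a strictly longer
    prefix with the key than the current node does."""
    here = shared_prefix_len(current, key)
    better = [n for n in known_nodes if shared_prefix_len(n, key) > here]
    if better:
        # among candidates, take the longest shared prefix (a simplification of the routing table)
        return max(better, key=lambda n: shared_prefix_len(n, key))
    return None   # real Pastry then falls back to the leaf set (numerically closest node)

# Route toward key d46a1c from node 65a1fc, given the nodes shown in the figure.
nodes = ["d13da3", "d4213f", "d462ba", "d467c4", "d46a1c", "d471f1"]
print(next_hop("65a1fc", "d46a1c", nodes))
# With full knowledge the destination is found at once; with each node's partial
# view the route takes several hops, as in the figure.
```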
Pastry (…)

- Routing table, leaf set, and neighbor set example in a Pastry node
  - b = 2 and l = 8

[Figure: example node state]
Overlay Multicasting
The key decision behind IP Multicast

- Per-group router state
- A "smart network"

[Figure: routers holding per-group state to deliver a multicast stream to Gatech, Stanford, and Berkeley]
Internet Design Philosophy

- Dumb routers, smart end systems…
  - Minimal functionality in routers
  - Push functionality to end systems as much as possible
  - Key reason attributed to the success and scaling of the Internet

[Figure: the Internet architecture hourglass: application/transport over IP (the "thin waist") over the physical/link layer]
Significance of IP Multicast

- First major functionality added to routers since the original design
- Per-group state implies:
  - Routing tables with potentially millions of entries!
  - Today's routers have only tens of thousands of entries in their routing tables, though there are millions of end systems!
  - Potential scaling concerns
- Per-group (per-flow) based solutions typically find slow acceptance in the Internet
Other concerns with IP Multicast

- Slow deployment
  - Very difficult to change network infrastructure and routers
- Difficult to support higher-layer functionality
  - IP Multicast: best-effort delivery
  - The model complicates support for reliability and congestion control
Congestion control with IP Multicast

[Figure: a source using IP Multicast to reach receivers with very different access links: Stanford LAN, Stanford modem, Purdue, Berkeley1, China, London]

- What rate must the source send at?
Back to the drawing board…

- Which is the right layer to implement multicast functionality?
  - IP Multicast: IP layer, efficient
  - Naïve unicast: application layer, inefficient
- Can we do it efficiently at the application level?

[Figure: the Internet architecture with a question mark at the application layer]
Overlay Multicast

[Figure: left, a "dumb network" connecting hosts at Stanford, Berkeley, and Gatech; right, the corresponding overlay tree among Purdue, Gatech, Stan1, Stan2, Berk1, and Berk2]
Overlay Multicast: Benefits

- No per-group state in routers
- Easy to deploy: no change to network infrastructure
- Can simplify support for congestion control, etc.

[Figure: overlay hops run unicast congestion control, and end systems contribute computation and storage (e.g. transcoding for the Stan-Modem receiver)]
Overlay Performance

- Even a well-designed overlay cannot be as efficient as IP Multicast
- But the performance penalty can be kept low
- Trade off some performance for other benefits

[Figure: duplicate packets on the same physical link waste bandwidth, and extra overlay hops increase delay]
Architectural Spectrum

- Purely application end-point
  - + Instantaneous deployment
  - + Low setup overhead, low cost
  - + Uses bandwidth resources at end systems
  - + Can scale to support millions of groups
- Infrastructure-centric
  - + Security
  - + Robustness

The spectrum, from full router support to none:
- Router-based (IP Multicast)
- Infrastructure-centric (CDNs, e.g. Akamai)
- Application end-points or end-systems with infrastructure support
- Application end-points only / end-system only, i.e. end-system, application-level, overlay, or peer-to-peer multicast
Applications

- File download vs. video conferencing/broadcasting
  - Bandwidth demanding: hundreds or thousands of Kbps
  - Real-time constraints: continuous streaming delivery
- Broadcasting vs. conferencing
  - Conferencing: smaller scale, anyone can be a source
  - Broadcasting: single source, tens or hundreds of thousands of simultaneous participants
Design Issues

- Construct efficient overlays
  - High bandwidth, low latency to receivers
  - Low overheads
  - Fault tolerant
  - Self-organizing, self-improving
  - Honor per-node access bandwidth constraints
- System issues: NATs, etc.
What overlay to construct?

- No formal structure
  - Each data packet is sent using epidemic algorithms
- Overlay with a structure
  - Tree
  - Multi-tree
  - Mesh

[Figure: unstructured epidemic dissemination vs. structured overlays among Purdue, NJ, Stan-LAN, Stan-Modem, Berk1, and Berk2]
Inefficient Overlay Trees

[Figure: three badly built trees over Purdue, Stanford (LAN and modem), Berkeley, and Gatech hosts, illustrating:]
- High latency
- Poor network usage and potential congestion near Purdue
- Poor bandwidth to members
Efficient Overlay Trees

[Figure: two efficient trees over the same hosts (Purdue, Stan-LAN, Stan-Modem, Berk1, Berk2, Gatech)]
Self-Organizing Protocols

- Construct efficient overlays in a distributed fashion
  - Members may have limited knowledge of the Internet
  - Adapt to dynamic join/leave/death of members
  - Adapt to network dynamics and congestion
Example

[Figures, animated over several slides: an overlay among Purdue, Berk1, Berk2, Stan-Modem, and Stan-LAN adapting over time]
- Stan-LAN joins
- Bad bandwidth performance is detected, so Stan-LAN switches to a better parent
- More "clustering" if Stan-Modem moves under Stan-LAN
- Stan-Modem is disconnected: it falls back to Berk1
Key Components of Protocol

- Overlay management:
  - How many people does a member know?
  - How is this knowledge maintained?
- Overlay optimization:
  - Constructing an efficient overlay among members
  - Distributed heuristics
    - No entity with global knowledge of hosts and network
Group Management

- Build a separate control structure decoupled from the tree
  - Each member knows a small random subset of the group members
  - Knowledge maintained using a gossip-like algorithm (see the sketch below)
  - Members also maintain the path from the source
- Other design alternatives are possible:
  - Example: a more hierarchical structure
  - No clear winner between design alternatives

[Figure: source S and member A within the control structure]
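A minimal sketch of gossip-style membership maintenance, assuming each member keeps a bounded random view and pushes a few entries per round; the fanout, view size, and member names are illustrative, not the actual protocol:

```python
import random

def gossip_round(views, fanout=3, view_size=8):
    """One gossip round: every member sends part of its local view to a few random
    members, who merge the entries while keeping their view bounded."""
    members = list(views)
    for m in members:
        targets = random.sample([x for x in members if x != m], min(fanout, len(members) - 1))
        for t in targets:
            merged = (views[t] | views[m] | {m}) - {t}
            views[t] = set(random.sample(sorted(merged), min(view_size, len(merged))))
    return views

# Start with everyone knowing only the source; a few rounds spread membership knowledge.
members = ["src"] + ["peer%d" % i for i in range(9)]
views = {m: {"src"} for m in members}
for _ in range(3):
    views = gossip_round(views)
print({m: len(v) for m, v in views.items()})   # each member now knows a random subset of the group
```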
Bootstrap

[Figure: a joining node contacts the source (US) and learns about existing members such as US1, US2, US4, Asia, Euro1, Euro2, Euro3, and Modem1]

- A node that joins:
  - Gets a subset of the group membership from the source
  - Finds a parent using the parent selection algorithm
Other causes for parent switch

- Member leave/death
- Congestion / poor bandwidth
- Better clustering

[Figures: the overlay re-forming after a parent leaves, after congestion or poor bandwidth on a link, and to improve clustering]
Parent Selection

- X sends PROBE_REQ to a subset of the members it knows
- X waits 1 second for the responses
- X evaluates the remote nodes and chooses a candidate parent

[Figure: joining node X probing members such as US1, US2, Asia, Euro1, and Modem1 below the source (US)]
Factors in Parent Selection

- Filter out P if it is a descendant of X
- Performance of P
  - Application throughput P received over the last few seconds
  - Delay of the path from S to P
  - Saturation level of P
- Performance of the link P-X
  - Delay of the link P-X
  - TCP bandwidth of the link P-X

(A toy parent-choice sketch follows.)

[Figure: source S, candidate parent P, and joining node X]
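A minimal sketch of evaluating probe responses, assuming hypothetical fields (`throughput`, `delay_to_source`, `link_delay`, `spare_bw`) and a simple ranking; the actual protocol combines these factors with its own thresholds and weights:

```python
def choose_parent(candidates, descendants, bandwidth_needed):
    """Toy parent choice for node X: drop descendants of X and saturated nodes,
    then rank the rest by probed throughput and, on ties, by total delay."""
    eligible = [c for c in candidates
                if c["name"] not in descendants
                and c["spare_bw"] >= bandwidth_needed]   # saturation check
    if not eligible:
        return None
    return min(eligible, key=lambda c: (-c["throughput"],
                                        c["delay_to_source"] + c["link_delay"]))

candidates = [
    {"name": "US1",    "throughput": 900, "delay_to_source": 30, "link_delay": 20, "spare_bw": 400},
    {"name": "Euro1",  "throughput": 850, "delay_to_source": 90, "link_delay": 60, "spare_bw": 400},
    {"name": "Modem1", "throughput": 40,  "delay_to_source": 50, "link_delay": 25, "spare_bw": 0},
]
print(choose_parent(candidates, descendants=set(), bandwidth_needed=300)["name"])   # US1
```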
Multiple Overlay Trees

- With support from MDC, split the stream into T equally sized stripes
- T trees, each distributing a single stripe of size S/T
- Overall quality depends on the number of stripes received
- Number of trees node i is entitled to = ⌊ r_i / (S/T) ⌋ (see the worked sketch below)

[Figure: a source streaming S Kbps to peers A and C over three trees (Tree 1, Tree 2, Tree 3), each carrying an S/3 stripe]
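A small worked example of the entitlement rule above, under two assumptions flagged in the comments: r_i is read here as node i's available bandwidth, and the result is rounded down because a partial stripe is not useful:

```python
import math

def stripes_entitled(r_i, S, T):
    """Number of stripe trees node i may join, given its bandwidth r_i (assumption:
    interpreted as the node's available bandwidth), the full stream rate S, and
    T equal stripes of S/T each. Rounding down is also an assumption."""
    return min(T, math.floor(r_i / (S / T)))

S, T = 900, 3                      # illustrative: a 900 Kbps stream split into 3 stripes of 300 Kbps
for r in (250, 650, 1200):
    print(r, "Kbps ->", stripes_entitled(r, S, T), "stripe tree(s)")
```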
Mesh-Based Approach

- Tree-based:
  - Involves "push"
  - Well-known structure
- Mesh-based:
  - Each node maintains a set of partners
  - Exchanges data-availability info with partners
  - Pulls data from a partner if it does not already have it (see the sketch below)
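A minimal sketch of pull scheduling from the exchanged availability maps, assuming illustrative partner names and segment numbers; real schedulers also weigh segment rarity, partner bandwidth, and playback deadlines:

```python
def schedule_pulls(have, partner_bitmaps, window):
    """Toy pull scheduler for mesh-based streaming: for each missing segment in the
    playback window, request it from a partner that advertises it, spreading the
    requests across partners."""
    requests = {p: [] for p in partner_bitmaps}
    for seg in window:
        if seg in have:
            continue
        holders = [p for p, bits in partner_bitmaps.items() if seg in bits]
        if holders:
            target = min(holders, key=lambda p: len(requests[p]))   # simple load spreading
            requests[target].append(seg)
    return requests

have = {10, 11}
partners = {"P1": {10, 11, 12, 13}, "P2": {12, 14}, "P3": {13, 14, 15}}
print(schedule_pulls(have, partners, window=range(10, 16)))   # who to pull each missing segment from
```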
Mesh-Based Streaming vs. BitTorrent

- Requires a clever scheduling algorithm regarding which segments to fetch, since streamed segments must also arrive before their playback deadlines
Deployments

- Research
  - ESM broadcasting system (2002-)
  - CoolStreaming (2004-)
    - Reported peak sizes of more than 25,000 users
- Industrial (last 2-3 years)
  - PPLive
    - Reports even larger peak sizes
  - Zattoo
  - GridMedia
Research Challenges / Open Issues

- Tree vs. mesh approaches, and hybrids
- Access-bandwidth-scarce regimes
- Incentives and fairness
- Extreme peer dynamics, flash crowds
- Support for heterogeneous receivers
- System issues
  - NATs/firewalls
  - Transport protocol
  - Start-up delay / buffer interaction
- Security challenges
Snapshot from SIGCOMM Broadcast

[Figure: a snapshot of the broadcast overlay tree, with the source at the root and receivers on the U.S. East Coast, U.S. Central, U.S. West Coast, Europe, Asia, and at unknown locations]