CS 414 – Multimedia Systems Design
Lecture 36 – Introduction to P2P and P2P Streaming
Klara Nahrstedt
CS 414 - Spring 2011
Administrative

• MP3 preview is on April 27
• Sign-up sheet for the April 27 demonstration will be provided in class on April 27
• Preview demonstrations on April 27, 7-9pm in 216 SC
Administration

• Final MP3 delivery on April 29
• Demonstration time slots on April 29 will be allocated on Thursday, April 28 (TA will email each group)
• Two demonstration intervals:
  • Students not in the Google competition: demos 2-4pm in 216 SC
  • Students in the Google competition: demos 5-7pm in 216 SC
• Pizza and Google Prizes announcement at 7:15pm, April 29 (room 3403 SC)
• Homework 2 will be out on Wednesday, April 27
  • Deadline: May 4, 11:59pm; submission via Compass
Why P2P?

• People like to get together
  • Share things
  • Share information
• How can I find others to share with?
• How can I find what they are sharing?
Why P2P?

• Sharing on the WWW?
  • Not everyone can set up and maintain a website
  • Might not have a good enough search-engine rank
• Sharing on a social network?
  • Only possible if you created the content
  • People have to find you, instead of the content
Napster

• Created in 1999 to share music
• Napster's servers store the file directory (filename/location)
• Clients/peers store the files
Napster, operation
1. Query (keyword) sent to a server
2. Servers search their directory using a ternary tree
3. Response: a list of candidate peers
4. Ping the candidates
5. Download directly from the chosen peer
A sketch of such a central directory follows below.
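To make the centralized design concrete, here is a minimal Python sketch of a Napster-style directory service. The names (DirectoryServer, insert, query) are illustrative, and a plain dict stands in for the ternary-tree index the real servers used.

from collections import defaultdict

class DirectoryServer:
    """Hypothetical Napster-style server: it stores only the file
    directory, while the peers store the files themselves."""
    def __init__(self):
        self.index = defaultdict(list)   # keyword -> [(filename, peer address)]

    def insert(self, keyword, filename, peer):
        # A peer registers a file it is willing to share
        self.index[keyword].append((filename, peer))

    def query(self, keyword):
        # Steps 1-3: client sends a keyword, server responds with candidate
        # peers; the client then pings them and downloads directly from one
        return self.index.get(keyword, [])

server = DirectoryServer()
server.insert("jazz", "take_five.mp3", "peer_a:6699")
print(server.query("jazz"))   # [('take_five.mp3', 'peer_a:6699')]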
Napster’s drawbacks

• Asymmetric operation: servers vs. client peers
• Scalability and congestion issues
• Single point of failure
• Napster held responsible for abetting users' copyright violation
Contemporary P2P systems

• Symmetric operation: all nodes (peers) have the same functionality
  • The system naturally scales with new peers
• Basic operations (an interface sketch follows below):
  • Insert file
  • Search file
  • Delete file
  • Maintain an overlay network (on top of the physical network)
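As a rough illustration of these shared operations, a hedged interface sketch in Python; the class and method names are mine, not from any particular system.

from abc import ABC, abstractmethod

class Peer(ABC):
    """Abstract peer: every node offers the same operations (symmetric)."""

    @abstractmethod
    def insert(self, filename, data):
        """Make a file available to the network."""

    @abstractmethod
    def search(self, keywords):
        """Find peers holding files that match the keywords."""

    @abstractmethod
    def delete(self, filename):
        """Withdraw a file from the network."""

    @abstractmethod
    def maintain_overlay(self):
        """Refresh neighbor/routing state as peers join, leave, or fail.
        Structured and unstructured systems differ mainly in how this
        and search() are implemented."""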
Contemporary P2P Systems
(Classification)
Usually classified by how peers connect to each other, i.e., how the overlay network is created and maintained
• Unstructured: no particular connection pattern (e.g., randomly connected)
  • Gnutella
  • FastTrack
  • Skype
  • BitTorrent, etc.
Contemporary P2P Systems
(Classification)

• Structured: defines a distributed data structure (e.g., a distributed hash table)
  • Chord
  • Pastry
  • CAN
  • etc.
Gnutella
(Figure: servents connected by peer pointers)
• Servents (peers) store:
  • Their files
  • Peer pointers
Gnutella, searching
(Figure: a query for "jazz" floods the overlay; a hit travels back along the reverse path)
1. Flood the query (keyword) to neighbors
2. Ignore repeated messages
3. Answer if there is a local match
4. Send the query hit back along the reverse path
5. Establish a direct connection and fetch the file
A sketch of this flooding search follows below.

Query message: <id, QUERY, ttl, hops, payload length, min speed, keywords>
Query hit message: <id, QUERY HIT, ttl, hops, payload length, num hits, port, ip, speed, (fileindex, filename, filesize), servent id>
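A minimal, illustrative Python sketch of the flooding search; in-process method calls stand in for network messages, and the same flood/reverse-path mechanism also carries the PING/PONG traffic shown on the next slide.

class Servent:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)
        self.neighbors = []   # peer pointers
        self.prev_hop = {}    # msg_id -> neighbor the query came from (reverse path)

    def query(self, msg_id, keyword, ttl, from_peer=None):
        if msg_id in self.prev_hop or ttl == 0:
            return                           # step 2: ignore repeats / expired TTL
        self.prev_hop[msg_id] = from_peer
        hits = [f for f in self.files if keyword in f]
        if hits:
            self.query_hit(msg_id, self.name, hits)      # step 3: local match
        for n in self.neighbors:
            if n is not from_peer:
                n.query(msg_id, keyword, ttl - 1, self)  # step 1: flood onward

    def query_hit(self, msg_id, source, hits):
        prev = self.prev_hop.get(msg_id)
        if prev is None:                                 # we are the originator
            print(f"{self.name}: {source} has {hits}")   # step 5 would fetch directly
        else:
            prev.query_hit(msg_id, source, hits)         # step 4: reverse path

a, b, c = Servent("A", []), Servent("B", []), Servent("C", ["jazz_standards.mp3"])
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
a.query(msg_id=1, keyword="jazz", ttl=7)   # prints: A: C has ['jazz_standards.mp3']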
Gnutella, maintaining the
overlay
(Figure: node X floods pings; pongs return along the reverse path, and X adds "A" and "V" to its neighbor list)
1. Periodically flood a ping
2. Pongs are sent back along the reverse path
3. Update the neighbor list with received pongs
• Why periodically?

Ping: <id, PING, ttl, hops, payload length (zero)>
Pong: <id, PONG, ttl, hops, payload length, port, ip, num. files, num. KBs>
Gnutella, maintaining the overlay
(Figure: the neighbor list now reads "A", "V", "X")
Peers can leave or fail at any time.
Gnutella: some issues
• Ping/pong constitutes 50% of traffic
• Flooding causes excessive traffic
• Repeated searches with the same keywords
• Large number of freeloaders (70% of users in 2000)
DHTs (Distributed Hash Tables)

• A hash table allows these operations on an object identified by a key:
  • Insert
  • Lookup
  • Delete
• Distributed Hash Table: the same operations, but in a distributed setting (the objects could be files)
A consistent-hashing sketch, the idea behind DHTs like Chord, follows below.
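A hedged sketch of that idea: consistent hashing maps both node names and keys onto one identifier ring, and a key lives on its successor node. Real Chord adds finger tables to find the successor in O(log N) hops; the names and ring size below are illustrative.

import hashlib

RING = 2**16  # small identifier space for the example

def h(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

class DHT:
    def __init__(self, node_names):
        self.nodes = sorted(h(n) for n in node_names)   # node ids on the ring
        self.names = {h(n): n for n in node_names}

    def lookup(self, key: str) -> str:
        """A key is stored on its successor: the first node id
        clockwise from hash(key) on the ring."""
        k = h(key)
        for nid in self.nodes:
            if nid >= k:
                return self.names[nid]
        return self.names[self.nodes[0]]   # wrap around the ring

dht = DHT(["peer1", "peer2", "peer3", "peer4"])
print(dht.lookup("take_five.mp3"))  # the peer responsible for this file's key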
DHT performance comparison
              Memory                          Lookup latency   Lookup overhead
Napster       O(1) at client; O(N) at server  O(1)             O(1)
Gnutella      O(N)                            O(N)             O(N)
Chord (DHT)   O(log N)                        O(log N)         O(log N)
Streaming from servers
• Bandwidth at the video service and the number of servers have to grow with demand
• Flash crowds have to be taken into account
(Figure: many clients streaming from the video service's servers)
P2P Streaming

• Use the participating nodes' bandwidth
  • More nodes watching the stream = more shared bandwidth
• Application-level multicast
  • How?
(Figure: a live stream source, which could itself be a member of the P2P network, feeds the peers watching the stream through the P2P network)
P2P Streaming

• Common arrangements to multicast the stream:
  • Single tree
  • Multiple trees
  • Mesh-based
• All nodes are usually interested in the stream
• All of them have to deal with node dynamism (join/leave/fail/capacity changes)
Streaming in a single tree
(Figure: a coder turns frames into packets 1, 2, 3; the source streams them down the tree using RTP/UDP)
Single Tree
• Peers interested in the stream organize themselves into a tree rooted at the source
• Nodes send as many copies of a data packet as they have children
(Figure: the source at the root forwarding packet copies down the tree)
Joining the tree
• Find a node with spare capacity, then make it the parent ("Parent?")
• If the contacted node lacks capacity, it redirects to one of its children ("Try one of my children"), picked by policy:
  • Random child
  • Round robin
  • Child closest in the physical network to the joining node
A sketch of this join procedure follows below.
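A minimal sketch of the join procedure; class and method names are illustrative, and the random child-picking policy is shown (round robin or physically-closest are the slide's other options).

import random

class TreeNode:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity     # max number of children (out-degree)
        self.children = []

    def join(self, newcomer):
        """Ask this node to be the newcomer's parent; if saturated,
        redirect to a child ("try one of my children")."""
        if len(self.children) < self.capacity:
            self.children.append(newcomer)
            newcomer.parent = self
            return self
        return random.choice(self.children).join(newcomer)

source = TreeNode("source", capacity=2)
for i in range(5):
    parent = source.join(TreeNode(f"peer{i}", capacity=2))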
Leaving the tree or failing
• Orphan nodes need a new parent
• Policies for the new parent:
  • Children pick the source
  • Subtree nodes pick the source
  • Children pick the grandfather
  • Subtree nodes pick the grandfather
  • ...then repeat the join procedure
(Figure: children orphaned after their parent leaves reattach via the grandfather)
Single tree issues
• Leaves do not use their outgoing bandwidth
• Packets are lost while recovering after a parent leaves/fails
• Finding an unsaturated peer could take a while
• Tree connections could be rearranged for better transfer
Multiple Trees

• Challenge: a peer must be an internal node in only one tree, and a leaf in the rest
(Figure: the source feeds multiple trees over the same peers 1, 2, 3; each peer is internal in one tree and a leaf in the others)
• Are nodes 1, 2, 3 receiving the same data multiple times?
Multiple Description Coding
(MDC)
(Figure: a coder splits frames into packet streams for descriptions 0 through n)
• Each description can be independently decoded (only one is needed to reproduce the audio/video)
• More descriptions received result in higher quality
Streaming in multiple-trees
using MDC

• Assume odd-bit/even-bit encoding: description 0 is derived from each frame's odd bits, description 1 from its even bits (a sketch follows below)
(Figure: description 0 and description 1 packets travel down different trees to peers 1-4, using RTP/UDP)
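A hedged sketch of the odd-bit/even-bit split; the byte-oriented frame format and the way a decoder fills in a missing description are illustrative assumptions, not a real codec.

def split_mdc(frame: bytes):
    """Split each byte into two descriptions: odd bit positions
    (description 0) and even bit positions (description 1)."""
    d0, d1 = bytearray(), bytearray()
    for b in frame:
        odd = even = 0
        for i in range(4):
            odd |= ((b >> (2 * i + 1)) & 1) << i
            even |= ((b >> (2 * i)) & 1) << i
        d0.append(odd)
        d1.append(even)
    return bytes(d0), bytes(d1)

def merge_mdc(d0, d1):
    """Both descriptions reconstruct the frame exactly; with one missing
    (None), reuse the other's bits as a coarse approximation."""
    if d0 is None: d0 = d1
    if d1 is None: d1 = d0
    out = bytearray()
    for odd, even in zip(d0, d1):
        b = 0
        for i in range(4):
            b |= ((odd >> i) & 1) << (2 * i + 1)
            b |= ((even >> i) & 1) << (2 * i)
        out.append(b)
    return bytes(out)

d0, d1 = split_mdc(b"frame")
assert merge_mdc(d0, d1) == b"frame"   # full quality with both descriptions
print(merge_mdc(d0, None))             # degraded but decodable with one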
Multiple-Tree Issues

• Complex procedure to locate a potential-parent peer with spare out-degree
  • Degraded quality until a parent is found in every tree
• Static mapping in trees, instead of choosing parents based on their (and my) bandwidth
  • An internal node can be a bottleneck
Mesh-based streaming

• Basic idea (a sketch of the report/pull exchange follows below):
  • Report to peers the packets that you have
  • Ask peers for the packets that you are missing
  • Adjust connections depending on in/out bandwidth
• The mesh uses MDC; nodes are randomly connected to their peers, instead of statically
(Figure: peers exchanging different subsets of descriptions 0, 1, 2)
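A hedged Python sketch of the report/pull exchange; the data structures and names are illustrative, and a set update stands in for an actual packet transfer.

class MeshPeer:
    def __init__(self, name):
        self.name = name
        self.have = set()        # packet ids this peer holds
        self.neighbor_have = {}  # neighbor -> packet ids it reported

    def report(self, neighbor):
        """Periodically tell a neighbor which packets we have."""
        neighbor.neighbor_have[self] = set(self.have)

    def pull_missing(self):
        """Request packets we are missing from neighbors that reported them."""
        for neighbor, theirs in self.neighbor_have.items():
            for pkt in sorted(theirs - self.have):
                self.have.add(pkt)   # stands in for the actual transfer
                print(f"{self.name} pulls packet {pkt} from {neighbor.name}")

a, b = MeshPeer("A"), MeshPeer("B")
a.have = {1, 2, 3}
b.have = {3, 4}
a.report(b)
b.report(a)
a.pull_missing()   # A pulls packet 4 from B
b.pull_missing()   # B pulls packets 1 and 2 from A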
Content delivery
(Figure: nodes 1-17 arranged in levels below the source; levels are determined by hops to the source)
Content delivery proceeds in two phases: (1) a diffusion phase, then (2) a swarming phase.
Diffusion Phase
(Drawings follow the previous example: during period 0 a node reports "Have segment 0" and a child asks "Send me segment 0"; during period 1 the next level does the same)
• A new segment (a set of packets) of length L becomes available at the source every L seconds
• Level 1 nodes pull the new data units from the source, then level 2 pulls from level 1, etc.
• Recall that reporting and pulling are performed periodically
Swarming Phase
(Drawings follow the previous example)
• At the end of diffusion, all nodes have at least one data unit of the segment
• Pull the missing data units from (swarm-parent) peers located at the same or a lower level
• Can node 9 pull new data units from node 16?
  • No: node 9 cannot pull the data in a single swarm interval
A sketch of the two-phase schedule follows below.
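A hedged sketch of how the two phases interleave; the timing model, node placement, and function names are illustrative assumptions, not the actual PRIME scheduler.

L = 2.0  # a new segment of length L becomes available every L seconds

# level = hops from the source (after the slide's figure)
levels = {1: ["n1", "n2", "n3"], 2: ["n4", "n5"], 3: ["n10", "n11"]}

def diffusion(segment_id):
    """Level k pulls the fresh segment's data units during period k,
    so after depth periods every node has at least one data unit."""
    for k in sorted(levels):
        t = segment_id * L + k * L
        for node in levels[k]:
            print(f"t={t:.1f}s: {node} (level {k}) pulls new units of segment {segment_id}")

def swarm(have, neighbor_have):
    """Then pull the segment's remaining units from swarm-parents at the
    same or a lower level (the same report/pull loop sketched earlier)."""
    return have | (neighbor_have - have)

diffusion(0)
print(swarm({"unit0"}, {"unit0", "unit1"}))  # {'unit0', 'unit1'}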
Buffering

• Due to diffusion, nodes need a buffer
• Size: c*L, where (diffusion tree depth + minimum number of swarming intervals) <= c
• Example: with a diffusion tree of depth 4 and at least 2 swarming intervals, c >= 6, so the buffer must hold at least 6L seconds of the stream
Conclusion
• P2P technology is a very viable technology for multimedia distribution
• P2P streaming is a strong alternative to CDNs (Content Distribution Networks)
• Examples of P2P streaming technology:
  • PPLive
  • Skype
Many more details in references
and source code

• M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh, "SplitStream: High-bandwidth multicast in a cooperative environment," SOSP 2003.
• H. Deshpande, M. Bawa, and H. Garcia-Molina, "Streaming Live Media over Peers," Technical Report, Stanford InfoLab, 2002.
• N. Magharei and R. Rejaie, "PRIME: Peer-to-Peer Receiver-drIven MEsh-Based Streaming," INFOCOM 2007.
• N. Magharei, R. Rejaie, and Y. Guo, "Mesh or Multiple-Tree: A Comparative Study of Live P2P Streaming Approaches," INFOCOM 2007.
• http://freepastry.org