Application And Transport Layer Reliable Multicast
Download
Report
Transcript Application And Transport Layer Reliable Multicast
Application And
Transport Layer
Reliable Multicast
Leonard T. Armstrong
University Of Delaware
ELEG 667
Overview
What is multicast?
Application-layer reliable multicast
Overlay networks
Case study: Overcast
Transport-layer reliable multicast
Multicast reliability using ACKs versus NAKs
Case study: PGM
Some of the slides in this presentation are based on slides originally developed by (1) Ion
Stoica, University of California, Berkley and (2) Sachin Kamboj, University of Delaware
Multicast Definition
“…multicast: the sending of a packet from
one sender to multiple receivers with a single
send operation.”
-- James F. Kurose, Keith W. Ross
Computer Networking – A Top-Down Approach
Featuring the Internet
What Is Multicast?
Key:
Unicast transfer
Broadcast transfer
Multicast transfer
Unicast
Broadcast
One-to-one
Destination – unique receiver
host address
One-to-all
Destination – address of
network
Multicast
One-to-many
Multicast group must be
identified
Destination – address of group
Multicast Application
Examples
Financial services
Remote conferencing/e-learning
Streaming audio and video to many participants
(clients, students)
Interactive communication between participants
Interactive TV
Wireless entertainment
Delivery of news, stock quotes, financial indices, etc
Delivery of content to a high volume of wireless
devices
Bulk data transfer
Application-layer
Reliable Multicast
Overlay Networks And Overcast
Application-layer Multicast
With Overcast
Overlay network
Single-source “multicast”
Uses HTTP for data transfer*
Primary metric – optimization of bandwidth
Specific to a certain set of application requirements
Only one data source – Is this limiting?
Large latencies acceptable
Nodes have large amounts of permanent storage
Not intended for interactive applications
What Is An Overlay Network?
Virtual network configuration
Requires no infrastructure change
Set of nodes within an existing network fabric that
implement a network abstraction on top of the underlying
substrate network
Peers and their communication relationship form an
abstract logical network
Application-layer
Incrementally deployable
Do any popular applications use overlay networks?
KaZaA – File Sharing With An
Overlay Network
Key:
Content query
Query response
File request
Distributed content location
database
Some peers designated as
group leaders (GL)
New peers are assigned a GL
Peers share content database
with GL – GL only tracks
content of peers in its group
Peer queries its GL who may
forward request to other GLs
GLs send results of forwarded
queries directly to the
requestor
File transfer
GL
GL
GL
GL
Benefit/Burden Comparison Of
Overlay Networks
Benefits
Requires no change to
network infrastructure
Incrementally deployable
Network can be optimized
to application-specific
metrics
Customizable – nodes may
be multipurpose computers
Uses off-the-shelf network
technology for reliability, etc
Burdens
Management complexity
Management from afar
Management tools must
not scale with network
size
Must consider connectivity
limitations
Firewalls, proxies
Efficiency
Need to deduce network
topology
What Is Overcast?
Tree-structured overlay architecture
On-demand delivery of bandwidth-intensive content
Virtual links of overlay tree are the only paths for data
exchange
Internet delivery of high-resolution video
Allows data to be sent once to many destinations
Draws upon work in content distribution, caching,
and server replication
Comprised of
Root(s): central source (replicated)
Nodes: internal server nodes
Clients: end systems (HTTP clients)
R
Root
Node
Client
Key:
Motivation Of Overcast
Need to provide ondemand high-quality
video to customer base
Can’t count on
multicast support of
underlying network
Only application-layer
development is feasible
Bandwidth bottlenecks
develop as multiple
requests are made
Line thickness indicates
desired bandwidth usage
by sending entity
Key Points
How can the root and nodes of an Overcast network
be automatically configured?
Bandwidth-efficient trees
Dynamic topology
How can nodes maintain state (status) when
configuration is automatic?
Eliminating the root node as a single point of failure
How can a client obtain content from an Overcast
network?
Bandwidth Efficient Overlay
Trees
1
10 Mb/s
R
2
“…three ways of organizing the root and the… nodes into a distribution tree.”
1
R
R
2
1
2
R
2
1
How Does Overcast Build
Bandwidth Efficient Trees?
Goal – maximize bandwidth to root for all nodes
Places a new node as far away from root as
possible without sacrificing bandwidth
View algorithm
Additional details
Bandwidth: measured by timing a 10KB download
Improved techniques are planned
“”: Defined as “no more than 10% higher than”
SelectClosest: determined by traceroute
Won’t the results of this algorithm change over time?
Result Of The Node Adding
Algorithm
R
R
5
10
1
10
3
8
1
2
7
5
2
Physical network substrate
3
Overcast network tree
Result Of The Node Adding
Algorithm – A Clearer Animation
R
R
5
10
1
10
3
8
1
2
7
5
2
Physical network substrate
3
Overcast network tree
Sourabh’s Question
Will The Link Before A Leaf Always Have The Lowest Bandwidth?
R
R
5
10
1
10
3
20
1
2
7
5
2
Physical network substrate
3
Overcast network tree
Dynamic Topology
Overcast’s optimization metric will change over time
A node periodically reevaluates its position in the
tree by measuring the bandwidth been itself and
Node can relocate to become
Its parent (baseline)
Its grandparent
All its siblings
Child of a sibling
Sibling of a parent
Inherently tolerant of non-root failures
On dead parent a node can move up the ancestry tree
Interactions Between Node Adding
And Dynamic Topology
R
10
R
20
1
R
2
2
1
15
1
2
Physical network substrate
Overcast network tree
Round 1
Overcast network tree
Round 2
State Tracking – The Up/Down
Protocol
An efficient mechanism is needed to exchange
information between node
Assumes information either
Must scale sublinearly in terms of network usage
May scale linearly in terms of storage
Changes slowly (E.g., Health status of nodes)
Can be combined efficiently from multiple children into a
single description (E.g., Totals from subtrees)
Each node maintains state about all nodes in its
subtree
Management Of Information
With The Up/Down Protocol
Each node periodically contacts its parent
Parents assume a child (and all descendants) has
died if the child fails to contact within some interval
During contact, a node reports to its parent
Death certificates
Birth certificates
Extra information
Information propagated from children
Sequence numbers used to prevent race conditions
Scaling Sublinearly In Terms
Of Network Usage
1
A node (and descendants)
relocates under a sibling
The sibling must learn
about all the node’s
descendants
Birth certificates
The sibling passes this
information to the (original
node’s) parent
The parent recognizes no
changes and halts further
propagation
1.1
1.2.1
1.2
1.2.2
1.2.2.1
1.3
1.2.3
No change
observed.
Propagation
halted.
Birth
certificates
for 1.2.2,
1.2.2.1
Is The Root Node A Single
Point Of Failure?
Root is responsible for handling all join requests
from clients
Root’s Up/Down protocol functionality can not be
easily distributed
Note: root does not deliver content
Root maintains state for all Overcast nodes
Solution: configure a set of nodes linearly from root
before splitting into multiple branches
Each node in the linear chain has sufficient information to
assume root responsibilities
Natural side effect of Up/Down protocol
Fault-tolerant Root Node
Structuring
R1
Contains
state of
{R2, R3,
1, …, 6}
R2
Contains
state of
{R3,
1, …, 6}
1
3
2
4
R3
Contains
state of
{1, …, 6}
5
6
The Client Side – Joining A
Multicast Group
Clients join a multicast group through a
typical HTTP GET request
Root determines where to connect the client
to the multicast tree using
Pathname of URL (multicast group being joined)
Status of Overcast nodes
Location of client*
Root selects “best” server and redirects the
client to that server
How Should Client Join
Requests Work?
Key:
Content query
(multicast join)
Query redirect
Content delivery
R1
R2
1
3
2
4
R3
5
6
Redirecting To The “Best”
Server
“Seamlessly Selecting The Best Copy From Internet-Wide
Replicated Servers” – three methods
1. HTTP redirect
Central server accepts and redirects query to an alternate server
using “HTTP REDIRECT”
2. Domain Name Service (DNS) round trip times
Local name servers track RTT of requests to other servers
Different name servers can serve different IP addresses for a
common name
3. Shared IP Address
Illusion of a single logical machine connected in several places
Suspect Overcast uses a combination of HTTP Redirect, DNS
Favors servers that are faster (closer)
How Good Is Overcasting?
Is Overcasting Multicasting?
HTTP TCP unicast!
If a node has n children, n separate TCP
connections are used
Multiple unicast
Seems more like a contribution to smarter content
distribution systems than to multicasting
“Application-level multicast… uses unicast
transmission but involves the receivers in the
replication and forwarding of data.”
Computer Networking – A Top-Down Approach Featuring
the Internet
James F. Kurose, Keith W. Ross
Multicast Versus Multiple
Unicast
Multicast
Network traffic scales
well
Source transmits single
packets which are
duplicated by routers
Multicast-enabled
infrastructure
Common group
destination address
Multiple Unicast
Network traffic ~ group
size
Source transmits multiple
packets – no router
duplication
No infrastructure
requirement
Each packet intended
for a specific unicast
destination address
Transport-layer
Reliable Multicast
Pragmatic General Multicast
(PGM)
What is PGM?
Pragmatic General Multicast
Reliable transport protocol
Primarily ARQ-based
Was “Pretty Good Multicast”
Name changed c. 2000 because “Pretty Good” is
a registered trademark!?!
FEC support is optional
Defined in RFC 3208
Invented by TIBCO, bought by CISCO
Transport Layer Multicast With
PGM
Reliable transport protocol
Other reliable transport-layer multicast protocols include:
RMP – Reliable Multicast Protocol
Relies on lower-layer multicast protocols
RMTP – Reliable Multicast Transport Protocol
MFTP – Multicast File Transfer Protocol
IP multicast for best effort (unreliable) datagram delivery
Internet Group Membership Protocol (IGMP) for multicast
group membership
NAK-based protocol
PGM Services
Multicast data delivery from multiple sources
Multiple senders multiple receivers
Specification and presentation will focus on single-sender
Reliable
Data integrity (checksums)
Receivers receives all data packets or…
Receivers able to detect unrecoverable packet loss
No-duplicate packets
Ordered or unordered delivery
Sender has no knowledge of receiving group
membership
PGM Basic Building Blocks
Components
Sender
Receivers
Network elements (NE)
Routers
May or may not be
PGM-capable
Sender-to-receiver
packet types
ODATA – original data
RDATA – repair data
Source path message
(SPM)
NAK confirmation (NCF)
Receiver-to-sender
packet types
NAK
IP Multicast Primer
Groups defined using class D addresses
Distance Vector Multicast Routing Protocol
(DVMRP)
224.0.0.0/4 228 groups
Source-based trees: each source node is the root of its
own distribution tree
Reverse-path forwarding: received packets are forwarded
out all interfaces except the packet’s incoming interface
ONLY if the packet was received on the interface that is on
it’s own shortest path back to the sender
Packets from a given sender only travel down the sender’s
source path tree
Protocol Independent Multicast (PIM)
ACKs In Multicast – ACK
Implosion
Key:
Multicast send
ACK
Assume
ACK-based multicast
protocol
Multicast group of n
members
Each packet gets multicast to
all n members
All members acknowledge
receipt with a positive ACK
N ACKs are returned to
sender for each packet sent
Sender is overwhelmed by
ACKs
Result – ACK implosion
ACK-based protocols do
not scale well for multicast
ACK Versus NAK
Reliability is accomplished through
confirmation of data from receiver to source
Positive acknowledgements (ACK)…
Frequency is a function of the number of packets
transmitted
Used in TCP for transmission buffer management
(congestion control)
Negative acknowledgements (NAK)…
Frequency is a function of network reliability
NAKs To The Rescue
Key:
Multicast send
NAK
NAKs inform sender of lost
data on an as-needed basis
Scale with problems in
network, not with size of
group
Sequence numbers signal
lost packets to receivers
NAK-based protocols work
best when
Network reliability is high
Data is sent frequently
Drop
NAK Implosion
Key:
Multicast send
NAK
Assume
NAK-based multicast
protocol
Multicast group of n
members
Data packet can get lost
at any point including first
hop out of source
Result – ACK implosion
PGM takes precautions to
prevent NAK implosion
Drop
NAK And NCF Use
NAK – unicast UP the tree toward the sender
NCF – multicast DOWN the tree toward all
NE descendents
Sent from NE but uses source address of original
sender to insure correct path traversal
Strategies to lessen NAK burden
NAK suppression
NAK elimination
NAKety-NAK, Don’t NAK Back
NAK suppression
Receiver waits a random time before sending NAK – does
not send if corresponding NCF is heard
Random delays should be proportional to number of PGM
siblings
Increase probability of hearing NCF as a result of a NAK
from another node with common ancestor
Supports optional multicasting of NAK with TTL=1
Other nodes receiving NAK suppress their NAKs
NAK elimination
Parent NE aggregates or eliminates NAKs before
propagating up the tree
Parent NE forwards at most one NAK per packet
Source Path Messages
Key:
Multicast send
NAK
Determine next upstream
hop for reverse path
Continual update of next
upstream hop info when
path changes
Establish source path
state in PGM NE
Prompt detection of missing
packets in the absence of
data to transmit
Compensate for short
or slow data flows
Components Of A PGM
Network
Source – originator of data packets
Receiver – consumer data packets
Network elements
PGM-ready routers present in the intervening network
Unlike other transport layer protocols that only deal
with end-to-end communication, PGM requires PGM
compatible routers
Non-PGM routers may be part of the intervening
network
PGM Operation Under Loss
1.
2.
3.
4.
5.
6.
7.
8.
Receiver A detects lost packet,
NAK to 1st hop PGM NE
NCF from 1st hop PGM NE
Propagated NAK to upstream
PGM NE
NCF from upstream PGM NE
Propagated NAK to upstream
PGM sender
NCF from sender
Receiver B detects loss of
same packet, NAK to 1st hop
PGM NE
NCF from 1st hop PGM NE, no
further propagation of NAK
A
B
PGM Network Elements
Traditional transport-layer
protocols
End-to-end communication
Not present in intermediate
NEs
PGM
PGM NEs need transportlayer PGM code
Give special treatment to
SPM, NCF, RDATA
Transmitted with IP router
alert
Non-PGM NEs allowed –
operation of protocol is less
efficient
Source
Host
Application
Transport
Network
Link
Physical
Application
Transport
Network
Link
Physical
Destination
Host
Network Elements
Network
Link
Physical
Network
Link
Physical
PGM
Non-PGM
Transport
Network
Link
Physical
Network
Link
Physical
Application
Transport
Network
Link
Physical
Application
Transport
Network
Link
Physical
Undiscussed (Optional) PGM
Concepts
Designated local repairer (DLR)
Used to send RDATA “locally” rather than having the
RDATA come from the original source
DLR is one hop from NE for which it provides “local”
RDATA
Forward error correction (FEC)
Sends h additional parity packets for every k data packets
(“transmission group”) sent
Receiver is able to reconstruct the original k data packets
from receipt of any k of the h+k sent packets
Result: tolerance of up to h packet losses
PGM supports pro-active or on-demand FEC
How Well Does PGM Perform?
Data Rate
SPM/sec
Best Possible
Network
Utilization
97.1%
10 Mb/s
1
97.1%
10 Mb/s
5
97.0%
47 Kb/s
1
95.9%
47 Kb/s
5
91.7%
These figures do not account for losses and/or forward error correction (FEC).
Problems Inherent With NAKbased Approaches
TCP does not move it’s trailing cwnd window
edge until the data packet at the trailing edge
has been successfully ACKed
What problems could occur in a NAK-based
system, relative to trailing window edge and
why/how could they occur?
Leave With A Thought…
“Multicast has been the hot, up-and-coming
topic in computer networking… for the past
10 years.”
-- Anonymous Professor of Computer
Networking