Overlay Networks: Introduction and Multicast
Download
Report
Transcript Overlay Networks: Introduction and Multicast
Overlay Networks and
Overlay Multicast
May 2006
Definition
Network
- defines addressing, routing, and
service model for
communication between hosts
Overlay network
- A network built on top of one or
more existing networks
- adds an additional layer of
indirection/virtualization
- changes properties in one or
more areas of underlying
network
Alternative
- change an existing network
layer
2
A Historical Example
Internet is an overlay network
- goal: connect local area networks
- built on local area networks (e.g.,
Ethernet), phone lines
- add an Internet Protocol header to all
packets
3
Benefits
Do not have to deploy new equipment, or
modify existing software/protocols
- probably have to deploy new software on top
of existing software
- e.g., adding IP on top of Ethernet does not
require modifying Ethernet protocol or driver
- allows bootstrapping
• expensive to develop entirely new
networking hardware/software
• all networks after the telephone (Ethernet)
have begun as overlay networks
4
Benefits
Do not have to deploy at every node
- not every node needs/wants overlay network
service all the time
• e.g., QoS guarantees for best-effort traffic
- overlay network may be too heavyweight for
some nodes
• e.g., consumes too much memory, cycles, or
bandwidth
- overlay network may have unclear security
properties
• e.g., may be used for service denial attack
- overlay network may not scale (not exactly a
benefit)
5
• e.g. may require n2 state or communication
Costs
Adds overhead
- adds a layer in networking stack
• additional packet headers, processing
- sometimes, additional work is redundant
• e.g., an IP packet contains both Ethernet (48 + 48
bits) and IP addresses (32 + 32 bits)
• eliminate Ethernet addresses from Ethernet header
and assume IP header(?)
Adds complexity
- layering does not eliminate complexity, it only
manages it
- more layers of functionality more possible
unintended interaction between layers
- e.g., corruption drops on wireless interpreted as
congestion drops by TCP
6
Applications
Mobility
- MIPv4: pretends mobile host is in
home network
Routing – P2P, …
Addressing
Security
Multicast
7
Applications: Routing
Flat space
- every node has a route to every other node
- n2 state and communication, constant distance
Hierarchy
- every node routes through its parent
- constant state and communication, log(n) distance
- too much load on root
Mesh (e.g., CAN-Content Addressable Network)
- every node routes through 2d other nodes
- O(d) state and communication, n1/d distance
Chord
- every node routes through O(log n) other nodes
- O(log n) state and communication, O(log n) distance
8
Applications: Increasing Routing
Robustness
Resilient Overlay Networks
[Anderson et al 2001]
- overlay nodes form a
complete graph
- nodes probe other nodes
for lowest latency
- knowledge of complete
graph lower latency
routing than IP, faster
recovery from faults
9
Applications: Addressing
6
provide more address space
6/4
than underlying network
6bone
Internet v4
- IPv6 on IPv4
- requires NAT-like gateways
for IPv6-only hosts to communicate
6
with IPv4-only hosts
6/4
6/4
- main current deployment
NAT
of IPv6
6
TRIAD, IP-NL
- enhanced NAT
- separate Internet into realms, each with its
own IPv4 address space
10
- use overlay network for inter-realm routing
4
Applications: Security (VPN)
provide more security than underlying network
privacy (e.g., IPSEC)
- overlay encrypts traffic between nodes
- only useful when end hosts cannot be secure
anonymity (e.g., Zero Knowledge)
- overlay prevents receiver from knowing which host is
the sender, while still being able to reply
- receiver cannot determine sender exactly without
compromising every overlay node along path
service denial resistance (e.g., FreeNet)
- overlay replicates content so that loss of a single node
does not prevent content distribution
11
Multicast Outline
12
Problems with IP Multicast
Advantage: Highly efficient, Good delay
Scales poorly with number of groups
- A router must maintain state for every group
Supporting higher level functionality is difficult
- IP Multicast: best-effort multi-point delivery
service
- Reliability and congestion control for IP
Multicast complicated
• scalable, end-to-end approach for heterogeneous
receivers is very difficult
• hop-by-hop approach requires more state and
processing in routers
Deployment is difficult and slow
- ISP’s reluctant to turn on IP Multicast
13
Overlay Multicast
Provide multicast functionality above the IP
layer overlay or application level multicast
Potential Benefits over IP Multicast
- Quick deployment
- All multicast state in end systems
- Computation at forwarding points simplifies
support for higher level functionality
Challenge: do this efficiently
Narada [Yang-hua Chu et al, 2000 CMU]
-
Multi-source multicast
Involves only end hosts
Small group sizes <= hundreds of nodes
Typical application: chat
14
Narada: End System Multicast
Gatech
Stanford
Stan1
Stan2
CMU
Berk1
Berk2
Berkeley
Overlay tree
Stan1
Gatech
Stan2
CMU
Berk1
Berk2
15
End System Multicast: Narada
A distributed protocol for constructing
efficient overlay trees among end
systems
Caveat: assume applications with
small and sparse groups
- Around tens to hundreds of
members
16
Potential Benefits
Scalability
- routers do not maintain per-group state
- end systems do, but they participate in few groups
Easier to deploy
- only requires adding software to end hosts
Potentially simplifies support for higher level
functionality
- use hop-by-hop approach, but end hosts are routers
- leverage computation and storage of end systems
- e.g., packet buffering, transcoding of media streams,
ACK aggregation
- leverage solutions for unicast congestion control17 and
reliability
Performance Concerns
Gatech
Delay from CMU to
Berk1 increases
Stan1
Stan2
CMU
Berk1
Duplicate Packets:
Bandwidth Wastage
Gatech
Stanford
Berk2
Stan1
Stan2
CMU
Berk1
Berkeley
Berk2
18
Overlay Tree
Delay between the source and receivers is small
The number of redundant packets on any
physical link is low
Heuristic:
- Every member in the tree has a small degree
- Degree chosen to reflect bandwidth of connection to
Internet
CMU
Stan2
Stan1
Berk1
Berk2
High latency
Stan2
CMU
Stan2
Stan1
Gatech
Berk1
CMU
Stan1
Gatech
Berk2
High degree (unicast)
Berk1
Gatech
Berk2
“Efficient” overlay
19
Overlay Construction Problems
Dynamic changes in group membership
- Members may join and leave dynamically
- Members may die
Dynamic changes in network conditions and
topology
- Delay between members may vary over time
due to congestion, routing changes
Knowledge of network conditions is member
specific
- Each member must determine network
conditions for itself
20
Solution
Two step design
- Build a mesh that includes all participating
end-hosts
• what they call a mesh is just a graph
• members probe each other to learn
network related information
• overlay must self-improve as more
information available
- Build source routed distribution trees
21
Mesh
Advantages:
- Offers a richer topology robustness; don’t
need to worry too much about failures
- Don’t need to worry about cycles
Desired properties
- Members have low degrees
- Shortest path delay between any pair of
members along mesh is small
CMU
Stan2
Stan1
Berk2
Berk1
Gatech
22
Overlay Trees
Source routed minimum spanning tree on
mesh
Desired properties
- Members have low degree
- Small delays from source to receivers
CMU
CMU
Stan2
Stan2
Stan1
Stan1
Berk2
Berk1
Gatech
Berk2
Berk1
Gatech
23
Narada Components/Techniques
Mesh Management:
- Ensures mesh remains connected in face of
membership changes
Mesh Optimization:
- Distributed heuristics for ensuring shortest
path delay between members along the mesh
is small
Spanning tree construction:
- Routing algorithms for constructing datadelivery trees
- Distance vector routing, and reverse path
forwarding
24
Definitions
Utility gain of adding a link based on
- The number of members to which routing
delay improves
- How significant the improvement in delay to
each member is
Cost of dropping a link based on
- The number of members to which routing
delay increases, for either neighbor
Add/Drop Thresholds are functions of:
- Member’s estimation of group size
- Current and maximum degree of member in
the mesh
25
Optimizing Mesh Quality
CMU
Stan2
Stan1
A poor overlay topology:
Long path from Gatech2 to CMU
Berk1
Gatech1
Gatech2
Members periodically probe other members at
random
New link added if
Utility_Gain of adding link > Add_Threshold
Members periodically monitor existing links
Existing link dropped if
26
Cost of dropping link < Drop Threshold
Desirable properties of heuristics
Stability: A dropped link will not be immediately re-added
Partition avoidance: A partition of the mesh is unlikely to
be caused as a result of any single link being dropped
Stan2
CMU
Stan2
Stan1
CMU
Stan1
Probe
Berk1
Gatech1
Berk1
Gatech
2
Delay improves to Stan1, CMU
but marginally.
Do not add link!
Gatech1
Probe
Gatech
2
Delay improves to CMU, Gatech1
and significantly.
Add link!
27
Example
Stan2
CMU
Stan1
Berk1
Gatech1
Gatech2
Used by Berk1 to reach only Gatech2 and vice versa: Drop!!
Stan2
Berk1
CMU
Stan1
Gatech1
Gatech2
28
Simulation Results
Simulations
- Group of 128 members
- Delay between 90% pairs < 4 times
the unicast delay
- No link caries more than 9 copies
Experiments (in Internet)
- Group of 13 members
- Delay between 90% pairs < 1.5 times
the unicast delay
29
Summary
End-system multicast (NARADA) : aimed to
small-sized groups
- Application example: chat
Multi source multicast model
No need for infrastructure
Properties
- low performance penalty compared to IP
Multicast
- potential to simplify support for higher layer
functionality
- allows for application-specific
customizations
30
Summary
Narada demonstrates the flexibility of the
application level multicast
- I.e., the ability to optimize the multicast
distribution to the application needs
Issues
- 4x unicast delay could be a problem for
interactive applications
- reliability and congestion control for
heterogeneous receivers not demonstrated
- sender access control solution not
demonstrated
- overhead of probes is low for one group, what
about for n groups on same host?
31
- is stress really an important metric?
Other Projects
Overcast [Jannotti et al, Cisco, 2000]
- Single source tree
- Uses an infrastructure; end hosts are not part of
multicast tree
- Large groups ~ millions of nodes
- Typical application: content distribution
Scattercast (Chawathe et al, UC Berkeley)
- Emphasis on infrastructural support and proxybased multicast
- Uses a mesh like Narada, but differences in protocol
details
Yoid (Paul Francis, Cornell)
- Uses a shared tree among participating members
- Distributed heuristics for managing and optimizing
tree constructions
32
Overcast
Designed for throughput intensive content
delivery
- Streaming, file distribution
Single source multicast; like Express
Solution: build a server based infrastructure
Tree building objective: high throughput
33
Tree Building Protocol
Goal: Maximize bandwidth to root for all nodes
Idea: Add a new node as far away from the
route as possible without compromising the
Root
throughput
Join (new, root) {
current = root;
B = bandwidth(root, new);
do {
B1 = 0;
for all n in children(current) {
B1 = bandwidth(n, new);
if (B1 >= B) {
current = n;
break;
}
} while (B1 >= B);
new->parent = root; }
1
0.5
1
0.8
0.5
0.7
34
Details
A node periodically reevaluates its position by
measuring bandwidth to its
- Siblings
- Parent
- Grandparent
The Up/Down protocol: track membership
- Each node maintains info about all nodes in it subtree plus a log of changes
• Memory cheap
- Each node sends periodical alive messages to its
parent
- A node propagates info up-stream, when
• Hears first time from a children
• If it doesn’t hear from a children for a present
interval
35
• Receives updates from children
Details
Problem: root single point of failure
Solution: replicate root to have a backup source
Problem: only root maintain complete info about the
tree; need also protocol to replicate this info
Elegant solution: maintain a tree in which first levels
have degree one
- Advantage: all nodes at these levels maintain full info
about the tree
- Disadvantage: may increase delay, but this is not
important for application supported by Overcast
Nodes maintaining full Status info about tree
36
Some Results
Network load < twice the load of IP
multicast (600 node network)
Convergence: a 600 node network
converges in ~ 45 rounds
37
Summary
Overcast: aimed to large groups and high
throughput applications
- Examples: video streaming, software
download
Single source multicast model
Deployed as an infrastructure
Properties
- Low performance penalty compared to IP
multicast
- Robust & customizable (e.g., use local disks
for aggressive caching)
38
Enabling Conferencing Applications
on the Internet using an
Overlay Multicast Architecture
Yang-hua Chu, Sanjay Rao,
Srini Seshan and Hui Zhang
Carnegie Mellon University
Past Work
Self-organizing protocols
- Yoid (ACIRI), Narada (CMU), Scattercast
(Berkeley), Overcast (CISCO), Bayeux
(Berkeley), …
- Construct overlay trees in distributed fashion
- Self-improve with more network info
Performance results showed promise, but…
- Evaluation conducted in simulation
- Did not consider impact of network dynamics
40
on overlay performance
Focus of This Paper
Can End System Multicast support real-world
applications on the Internet?
- Study in context of conferencing
applications
- Show performance acceptable even in a
dynamic and heterogeneous Internet
environment
First detailed Internet evaluation to show the
feasibility of End System Multicast
41
Why Conferencing?
Important and well-studied
- Early goal and use of multicast (vic, vat)
Representative of interactive apps
- E.g., distance learning, on-line games
Stringent performance requirements
- High bandwidth, low latency
42
Roadmap
Enhancing self-organizing protocols for
conferencing applications
Evaluation methodology
Results from Internet experiments
43
Supporting Conferencing in ESM
(End System Multicast)
Source rate
2 Mbps
2 Mbps
C
A
D
2 Mbps
B
0.5 Mbps
Unicast congestion control
Transcoding
(DSL)
Framework
- Unicast congestion control on each overlay link
- Adapt to the data rate using transcoding
Objective
- High bandwidth and low latency to all receivers
along the overlay
44
Enhancements of Overlay Design
Two new issues addressed
- Dynamically adapt to changes in network
conditions
- Optimize overlays for multiple metrics
• Latency and bandwidth
Study in the context of the Narada protocol
(Sigmetrics 2000) - CRZ00
- Techniques presented apply to all selforganizing protocols
45
Adapt to Dynamic Metrics
Adapt overlay trees to changes in network condition
- Monitor avail bandwidth (eg, with Spruce) and
latency of overlay links
- Link measurements can be noisy
- Aggressive adaptation may cause overlay instability
bandwidth
transient:
do not react
persistent:
react
raw estimate
smoothed estimate
discretized estimate
time
• Capture the long term performance of a link
– Exponential smoothing, Metric discretization
46
Optimize Overlays for Dual Metrics
Source rate
2 Mbps
Source
60ms, 2Mbps
Receiver X
30ms, 1Mbps
Prioritize bandwidth over latency
Break tie with shorter latency
47
Example of Protocol Behavior
Mean Receiver Bandwidth
All members join at time 0
Single sender, CBR traffic
Reach a stable
overlay
• Acquire network
info
• Self-organization
Adapt to network
congestion
48
Evaluation Overview
Evaluation Goals
- Can ESM provide application level performance
comparable to IP Multicast?
- What network metrics must be considered while
constructing overlays?
- What is the network cost and overhead?
Compare performance of our scheme with
- Benchmark (IP Multicast)
- Other overlay schemes that consider fewer metrics
Evaluate schemes in different scenarios
- Vary host set, source rate
Performance metrics
- Application perspective: latency, bandwidth
49
- Network perspective: resource usage, overhead
Benchmark Scheme
IP Multicast not deployed (Mbone is an overlay)
Sequential Unicast: an approximation
- Bandwidth and latency of unicast path from
source to each receiver
- Performance similar to IP Multicast with
ubiquitous (well spread out) deployment
A
B
Source
C
50
Overlay Schemes
Overlay Scheme
Choice of Metrics
Bandwidth
Latency
Bandwidth-Latency
Bandwidth-Only
Latency-Only
Random
51
Experiment Methodology
Compare different schemes on the Internet
- Ideally: run different schemes concurrently
- Interleave experiments of different schemes
- Repeat same experiments at different time
of day
- Average results over 10 experiments
For each experiment
- All members join at the same time
- Single source CBR traffic with TFRC
adaptation
52
- Each experiment lasts for 20 minutes
Application Level Metrics
Bandwidth (throughput) observed by each
receiver
RTT between source and each receiver along
overlay
C
Source
A
Data path
D
RTT measurement
B
These measurements include queueing and
processing delays at end systems
53
Performance of Overlay Scheme
CMU
MIT
Exp1
Exp2
Harvard
30ms
Harvard
CMU
40ms
MIT
Exp1
Exp2
RTT
32ms
42ms
Rank
1
Different runs of the same scheme may
produce different but “similar quality” trees
2
Mean
Std. Dev.
“Quality” of overlay tree produced by a scheme
Sort (“rank”) receivers based on performance
Take mean and std. dev. on performance of same rank
across multiple experiments
54
Std. dev. shows variability of tree quality
Factors Affecting Performance
Heterogeneity of host set
- Primary Set: 13 university hosts in U.S.
and Canada
- Extended Set: 20 hosts, which includes
hosts in Europe, Asia, and behind ADSL
Source rate
- Fewer Internet paths can sustain higher
source rate
- More intelligence required in overlay
constructions
55
Three Scenarios Considered
Primary Set
1.2 Mbps
Primary Set
2.4 Mbps
Extended Set
2.4 Mbps
(lower) “stress” to overlay schemes (higher)
Does ESM work in different scenarios?
How do different schemes perform under
various scenarios?
Is it sufficient to consider just a single metric?
- Bandwidth-Only, Latency-Only
56
BW (Primary Set, 1.2 Mbps)
Internet pathology
Random scheme performs poorly even in a less “stressful” scenario
RTT results show similar trend
57
BW (Extended Set, 2.4 Mbps)
no strong correlation between
latency and bandwidth
58
Optimizing only for latency has poor bandwidth performance
RTT (Extended Set, 2.4Mbps)
Bandwidth-Only cannot avoid
poor latency links or long path length
59
Optimizing only for bandwidth has poor latency performance
Internet tools used in the design
From: “Measurement-Based Optimization Techniques
for Bandwidth-Demanding Peer-to-Peer Systems” T.
S. Eugene et al, Infocom 2003
•
•
•
RTT probing:
• a Ping pkt is sent to each candidate peer; select
lowest RTT peer
10KB TCP:
• TCP connection open, 10KB downloaded, fastest
download peer selected
BNBW (Bottleneck Bw):
• Nettimer (a packet pair scheme) used; peer with
60
largest BNBW is selected
Summary so far…
For best application performance: adapt
dynamically to both latency and bandwidth
metrics
Bandwidth-Latency performs comparably to IP
Multicast (Sequential-Unicast)
What is the network cost and overhead?
61
Resource Usage (RU)
Captures consumption of network resource of overlay tree
Overlay link RU = propagation delay
Tree RU = sum of link RU
40ms
UCSD
Scenario: Primary Set, 1.2 Mbps
(normalized to IP Multicast RU)
IP Multicast
Bandwidth-Latency
Random
Naïve Unicast
1.0
1.49
2.24
2.62
CMU
2ms
U.Pitt
Efficient (RU = 42ms)
40ms
UCSD
CMU
40ms
U. Pitt
Inefficient (RU = 80ms)
62
Protocol Overhead
Protocol overhead =
total non-data traffic (in bytes)
total data traffic (in bytes)
Results: Primary Set, 1.2 Mbps
- Average overhead = 10.8%
- 92.2% of overhead is due to bandwidth probe
Current scheme employs active probing for
available bandwidth
- Simple heuristics to eliminate unnecessary
probes
63
- Focus of our current research
Contribution
First detailed Internet evaluation to show the
feasibility of End System Multicast architecture
- Study in context of a/v conferencing
- Performance comparable to IP Multicast
Impact of metrics on overlay performance
- For best performance: use both latency and
bandwidth
More info: http://www.cs.cmu.edu/~narada
64