Transcript scan-pc02

SCAN: A Dynamic, Scalable, and
Efficient Content Distribution Network
Yan Chen, Randy H. Katz,
John D. Kubiatowicz
{yanchen, randy, kubitron}@CS.Berkeley.EDU
EECS Department
UC Berkeley
Outlines
•
•
•
•
•
•
•
Motivation
Goal and Challenges
Previous Work
SCAN Architecture and Components
Evaluation Methodology
Results
Conclusions
Motivation Scenario: World Cup 2002
Goal and Challenges
Provide content distribution to clients with good latency and
staleness, while retaining efficient and balanced resource
consumption of the underlying infrastructure
• Dynamic choice of number and location of replicas
– Clients’ QoS constraints: latency, staleness
– Servers’ capacity constraints
• Efficient resource consumption
– Small delay, bandwidth consumption for replica update
– Small replica management cost
• Scalability: millions of objects, clients and servers
• No global network topology knowledge
Previous Work
• Replica Placement
– Research communities: optimal static replica placement
• Assume clients’ distributions, access patterns & IP topology
• No consideration for clients’ QoS or servers’ capacity
constraints
– CDN operators: un-cooperative, ad hoc placement
• Centralized CDN name server cannot record replica locations
– place many more than necessary (ICNP ’02)
• Update Multicast
– No inter-domain IP multicast
– Most application-level multicast (ALM) unscalable
• Split root as common solution, suffers consistency overhead
SCAN: Scalable Content Access Network
data
source
replica
data plane
cache
always update
adaptive
coherence
Web content
server
CDN server
client
DOLR mesh
network plane
Components of SCAN
• Decentralized Object Location & Routing (DOLR)
– Properties needed
• Scalable location with guaranteed success
• Search with locality
– Improve the scalability of d-tree: each member only
maintains states for its parent and direct children
• Simultaneous Dynamic Replica Placement and dtree Construction
– Replica search: Singular, Localized or Exhaustive
– Replica placement on DOLR path: Lazy or Eager
Replica Search
• Singular Search
data plane
parent candidate
proxy
s
c
DOLR mesh
DOLR path
network plane
Replica Search
• Localized search
• Greedy load distribution
data plane
parent candidates
client child
proxy
c
DOLR mesh
DOLR path
s
parent
sibling
server child
network plane
Replica Placement: Eager
data plane
proxy
s
c
DOLR mesh
DOLR path
network plane
first placement choice
Replica Placement: Lazy
data plane
client child
proxy
s
c
DOLR mesh
DOLR path
network plane
first placement choice
Evaluation of Alternatives
• Two dynamic overlay approaches
– Overlay_naïve: Singular search + Eager placement
– Overlay_smart: Localized search + Lazy placement
• Compared with static placement + IP multicast
– Overlay_static: With global overlay topology
– IP_static: With global IP topology (ideal)
• Metrics
– Number of replicas deployed, load distribution
– Multicast performance: Relative Delay Penalty (RDP)
and bandwidth consumption
– Tree construction traffic (packets and bandwidth)
Methodology
• Network Topology
– 5000-node network with GT-ITM transit-stub model
– SCAN nodes placed randomly or on transit nodes
• NS-like Packet-level Network Simulations
• Workloads
– Synthetic flash crowd: all clients access a hot object in
random order
– Real Web server traces: NASA and MSNBC
Web Site Period
Duration
# Requests # Clients # objects
MSNBC
8/2/1999
10–11am
1.6M
140K
4186
NASA
7/1/1995
All day
64K
5177
3258
Methodology: Sensitivity Analysis
•
•
•
•
Various Client/Server Ratio
Various Server Density
Various Latency & Capacity Constraints
Various Network Topologies
– Average over 5 topologies with different setup
• All Have Similar Trend of Results
– Overlay_smart has close-to-optimal (IP_static)
number of replicas, load distribution, multicast
performance with reasonable amount of tree
construction traffic
Number of Replicas Deployed
and Load Distribution
• Overlay_smart uses only 30-60% of replicas than
overlay_naïve and very close to IP_static
• Overlay_smart has two times better load distribution
than od_naïve, overlay_static and very close to IP_static
Multicast
Performance
• 85% of overlay_smart Relative Delay Penalty (RDP)
less than 4
• Bandwidth consumed by overlay_smart is very close
to IP_static, and is only 1/3 of bandwidth by
overlay_naive
Tree Construction Traffic
Including “join” requests, “ping” messages, replica
placement and parent/child registration
• Overlay_smart consumes 3 - 4 times of traffic than
overlay_naïve, and the traffic of overlay_naïve is quite close
to IP_static
• Far less frequent event than access & update dissemination
Conclusions
• P2P networks can be used to construct CDNs
• SCAN: Scalable Content Access Network with good
QoS, efficiency and load balancing
– Simultaneous dynamic replica placement & d-tree
construction
– Leverage DOLR to improve scalability and locality
• In particular, overlay_smart recommended
– Localized search + Lazy placement
– Close to optimal number of replicas, good load
distribution, low multicast delay and bandwidth penalty
at the price of reasonable construction traffic
Results on Web Server Traces
• Limited simulations, most URLs have very few requests
• Overlay_smart uses only one third to half replicas than
overlay_naïve for hot objects
SCAN: Scalable Content Access Network
data
source
replica
data plane
cache
always update
adaptive
coherence
Web content
server
CDN server
client
DOLR mesh
network plane
Replica Search
• Singular Search
data plane
parent candidate
proxy
s
c
DOLR mesh
DOLR path
network plane
Replica Search
• Greedy load distribution
• Localized search
data plane
parent candidates
client child
proxy
c
s
parent
sibling
server child
network plane
DOLR path
Dynamic Replica Placement: naïve
• Singular Search
• Eager Placement
data plane
parent candidate
proxy
s
c
Tapestry mesh
Tapestry overlay path
network plane
first placement choice
Dynamic Replica Placement: smart
• Localized search
• Lazy placement
• Greedy load distribution
data plane
parent candidates
client child
proxy
c
s
parent
sibling
server child
network plane
Tapestry overlay path
first placement choice