INF5070 – Media Storage and Distribution Systems

Download Report

Transcript INF5070 – Media Storage and Distribution Systems

INF5070 – Media Storage and Distribution Systems:
Distribution – Part II
13/10 – 2003
Type IV – Distribution Systems

Combine



Server hierarchy




Types I, II or III
Network of servers
Autonomous servers
Cooperative servers
Coordinated servers
“Proxy caches”


Not accurate …
Cache servers


Keep copies on behalf of a
remote server
Proxy servers

Perform actions on behalf of
their clients
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Type IV – Distribution Systems

Combine



Server hierarchy




Types I, II or III
Hierarchically organized
servers
Autonomous servers
Cooperative servers
Coordinated servers
“Proxy caches”


Not accurate …
Cache servers


Keep copies on behalf of a
remote server
Proxy servers

Perform actions on behalf of
their clients
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Type IV – Distribution Systems

Combine



Server hierarchy




Types I, II or III
Hierarchically organized
servers
Autonomous servers
Cooperative servers
Coordinated servers
“Proxy caches”


Not accurate …
Cache servers


Keep copies on behalf of a
remote server
Proxy servers

Perform actions on behalf of
their clients
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Type IV – Distribution Systems
 Variations
 Gleaning
 Autonomous, coordinated possible
 In komssys
 Proxy prefix caching
 Coordinated, autonomous possible
 In Blue Coat (which was formerly Cacheflow, which was formerly Entera)
 Period multicasting with pre-storage
 Coordinated
 The theoretical optimum
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Gleaning

Webster’s Dictionary: from Late Latin glennare, of Celtic origin
1. to gather grain or other produce left by reapers
2. to gather information or material bit by bit

Combine patching with caching ideas


Caching





Non-conflicting benefits of caching and patching
reduce number of end-to-end transmissions
distribute service access points
no single point of failure
true on-demand capabilities
Patching


shorten average streaming time per client
true on-demand capabilities
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Gleaning
 Combines
Patching & Caching ideas




Wide-area scalable
Reduced server load
Reduced network load
Can support standard clients
Central server
Join !
Unicast patch stream
Proxy cache
multicast
cyclic
buffer
Unicast
1st client
INF5070 – media storage and distribution systems
Proxy cache
Unicast
2nd client
2003 Carsten Griwodz & Pål Halvorsen
Proxy prefix Caching
Central server
 Split movie
 Prefix
 Suffix
Unicast
 Operation
 Store prefix in prefix cache
 Coordination necessary!
 On demand
 Delivery prefix immediately
 Prefetch suffic from central
server
 Goal
 Reduce startup latency
 Hide bandwidth limitations,
delay and/or jitter in backbone
 Reduce load in backbone
INF5070 – media storage and distribution systems
Prefix cache
Unicast
Client
2003 Carsten Griwodz & Pål Halvorsen
MCache
 One of several Prefix Caching
variations
 Combines Batching and
Prefix Caching
Central server
Batch
(multicast)
Prefix cache
 Can be optimized per movie
Prefix cache
 server bandwidth
 network bandwidth
 cache space
 Uses multicast
 Needs non-standard clients
Unicast
1st client
INF5070 – media storage and distribution systems
Unicast
2nd client
2003 Carsten Griwodz & Pål Halvorsen
Proxy prefix Caching
 Basic version









Practical
No multicast
Not optimized
Aimed at large ISPs
Wide-area scalable
Reduced server load
Reduced network load
Can support standard clients
Can partially hide jitter
INF5070 – media storage and distribution systems
 Optimized versions




Theoretical
Multicast
Optimized
Optimum is constantly unstable
 jitter and loss is experienced
for each client !
2003 Carsten Griwodz & Pål Halvorsen
Periodic Multicasting with Pre-Storage
 Optimize storage and
network




Wide-area scalable
Minimal server load achievable
Reduced network load
Can support standard clients
Central server
Assumed start
of the show
 Specials
 Can optimize network load per
subtree
 Negative
 Bad error behaviour
INF5070 – media storage and distribution systems
1st client
2nd client
2003 Carsten Griwodz & Pål Halvorsen
Periodic Multicasting with Pre-Storage
 Optimize storage and
network




Central server
Wide-area scalable
Minimal server load achievable
Reduced network load
Can support standard clients
 Specials
 Can optimize network load per
subtree
 Negative
 Bad error behaviour
INF5070 – media storage and distribution systems
1st client
2nd client
2003 Carsten Griwodz & Pål Halvorsen
Type IV – Distribution Systems
 Autonomous servers
 Requires decision making on each proxy
 Some content must be discarded
 Caching strategies
 Coordinated servers
 Requires central decision making
 Global optimization of the system
 Cooperative servers
 No quantitative research yet
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Autonomous servers
Simulation
central server
 Binary tree model allows
optional
cache server
 Allows analytical comparison of
 Caching
 Patching
 Gleaning
network link
 Considering
INF5070 – media storage and distribution systems
0.16
relative probability/x/1
 optimal cache placement per
movie
 basic server cost
 per-stream costs of storage,
interface card, network link
 movie popularity according to
Zipf distribution
0.08
0
0
20
40
60
80
100
2003 Carsten Griwodz & Pål Halvorsen
Simulation
 Example






500 different movies
220 active users
basic server: $25000
interface cost: $100/stream
network link cost: $350/stream
storage cost: $1000/stream
 Analytical comparison
 demonstrates potential of the approach
 very simplified
Caching
Caching
Unicast transmission
Patching
No caching
Multicast
Client side buffer
375 Mio $
Gleaning
Caching
Multicast
276 Mio $
INF5070 – media storage and distribution systems
4664 Mio $
2003 Carsten Griwodz & Pål Halvorsen
Simulation
 Modeling




User behaviour
Movie popularity development
Limited resources
Hierarchical topology
 Individual user’s
 Intention
 depends on user’s time (model randomly)
 Selection
 depends on movies’ popularity
Movie title age
INF5070 – media storage and distribution systems
Observed Hits
Popularity
 Popularity development
Movie title age
2003 Carsten Griwodz & Pål Halvorsen
Caching Strategies
 Strategies
 FIFO
 First-in-first-out
 Remove the oldest object in the cache in favor of new objects
 LRU
 Least recently used strategy
 Maintain a list of objects
 Move to head of the list whenever accessed
 Remove the tail of the list in favor of new objects
 IRG-k
 Inter-reference gap




Log number of requests
Maintain a list of objects
Sort by number average of distance between k last requests
Remove object with largest number of intermediate requests in favor
of new objects
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Caching Strategies
 Considerations
 conditional overwrite strategies
 can be highly efficient
 limited uplink bandwidth
 quickly exhausted
 performance degrades immediately when working set is too large for
storage space
ECT
IRG
Forget object statistics when removed
Cache all requested objects
Log requests between hits
Remember object statistics forever
Compare requested object and
replacement candidate
Log times between hits
 ECT
 Eternal, Conditional, Temporal
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Effects of caching strategies on throughput
155 MBit/s u plink usage for sin gle server, 64 GB cach e
 

 
 
 



 
150











155 MBit/s uplink u sage fo r single server, 96 G B cache

150
  


100

ECT
FIFO
LRU



50

better




50
ECT
FIFO
LRU












0







0



100

















Throughput
Throughput




 

5000 10000
20000
30000
Users
40000
50000
0







0
5000 10000
20000
30000
Users
40000
50000
 Movies
 1.5 MBit/s, 5400 sec, size ~7.9 GB
 Uplink usage
 profits greatly from small cache increases ...
 ... if there is a strategy
 Conditional overwrite
 reduces uplink usage
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Effects of caching strategies on user hit rates
Cach e Hit Ratio fo r single server, 64 G B cache
Cache Hit Ratio for sin gle server, 96 GB cach e
1
1
 






     
 


 




 

     
   









 

ECT
F IFO
LRU
0.75
0.5
0
5000 10000
20000
30000



40000
Users
Hit Ratio
Hit Ratio

better
0.75
50000
ECT
FIFO
LRU

0.5
0
5000 10000
20000
30000



40000
50000
Users
 Hit ratio
 Dumb strategies do not profit from cache size increases
 Intelligent strategies profit hugely from cache size increases
 Conditional overwrite outperforms other strategies massively
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Effects of number of movies on uplink usage
ECT Cache up lin k usag e with 64 GB, 155 MBit/s link
100




















ECT Cache up lin k usag e with 64 GB, 622 MBit/s link

100





75



5000 users
10000 users

Throughput %
Throughput %


50
better
25

75
1000
2000
3000
4000
5000
Movies in system
6000
7000





50



25
0













0

5000 users
10000 users

1000
2000
3000
4000
5000


6000
7000
 In spite of 99% hit rates
 Increasing the number of user will congest the uplink
 Note
 scheduling techniques provide no savings on low-popularity movies
 identical to unicast scenario with minimally larger caches
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Effects of number of movies on hit ratio
ECT Cach e hit ratio with 64 GB, 155 MBit/s link
ECT Cach e hit ratio with 64 GB, 622 MBit/s link
1
1








0.9






0.9



















0.8




0.7


0.6
5000 users
10000 users







Hit Ratio
Hit Ratio




0.7
5000 users
10000 users


better



0.8
0.6


0.5
1000
2000
3000
4000
5000
6000
Movies in system
7000
0.5
1000
2000
3000
4000
5000
6000
7000
Movies in system
 Limited uplink bandwidth
 Prevents the exchange of titles with medium popularity
 Unproportional drop of efficiency for more users
 Strategy can not recognize medium popularity titles
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Effects of user numbers on refusal
probabilities
Refusal probability, 96 GB cache, 155 Mbit/s uplink
batching
gleaning
unicast
0.015
0.02
Refusal probability, 64 GB cache, 622 Mbit/s uplink
Caching strategy: ECT



Caching strategy: ECT
0.01









0.005




Refusal Probability
Refusal Probability
0.02
0.015
batching
gleaning
unicast





0.01

better



0.005




0
0






10000









20000
30000
Users


40000
50000
0
0






10000
 Uplink-bound scenario










20000
30000
Users


40000


50000
 Shows that low-popularity are accessed like unicast by all techniques
 Patching techniques with infinite window can exploit multicast
 Collecting requests does not work
 Cache size
 Is not very relevant for patching techniques
 Is very relevant for full-title techniques
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Bandwidth effect of daytime variations
 Change popularity according to time-of-day
 Two tests
 Popularity peaks and valleys uniformly distributed
 Complete exchange of all titles
 Spread over the whole day
 Popularity peaks and valleys either at 10:00 or at 20:00
 Complete exchange of all titles
 Within a short time-frame around peak-time
 Astonishing results
 For ECT with all mechanisms
 Hardly any influence on
 hit rate
 uplink congestion
 Traffic is hidden by delivery of low-popularity titles
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Hint-based Caching
Hit ratio development, increasing #hints, ECT history 8
0.9



















 

 
 











10 h in ts
1 00 h in ts
1 0 00 h in ts
1 0 0 00 h in ts






better
0.7
0.9








0.8 
Hit Ratio
Hit Ratio
0.8
1 0 h in ts
1 0 0 h in ts
1 0 0 0 h in ts
10 0 0 0 h in ts
Hit ratio development, increasing #hints, ECT history 64

















 
 



























0.7

0.6
10
100
Users
1000
0.6
10



100
Users
1000
 Idea
 Caches consider requests to neighbour caches in their removal decisions
 Conclusion
 Instability due to uplink congestion can not be prevented
 Advantage exists and is logarithmic as expected
 Larger hint numbers maintain the advantage to the point of instability
 Intensity of instability is due to ECT problem
 ECT inherits IRG drawback of fixed-size histograms
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Simulation
 High relevance of population sizes
 complex strategies require large customer bases
 Efficiency of small caches
 90:10 rule-of-thumb reasonable
 unlike web caching
 Efficiency of distribution mechanisms
 considerable bandwidth savings for uncached titles
 Effects of removal strategies
 relevance of conditional overwrite
 unlike web caching, paging, swapping, ...
 Irrelevance of popularity changes on short timescales
 few cache updates
compared to many direct deliveries
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Coordinated servers
Distribution Architectures
 Combined optimization
 Scheduling algorithm
 Proxy placement and dimensioning
origin server
d-1st level cache
d-2nd level cache
2nd level cache
1st level cache
client
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Combined optimization
 Scheduling algorithm
 Proxy placement and dimensioning
 No problems with simple scheduling mechanisms
 Examples
 Caching with unicast communication
 Caching with greedy patching
 Patching window in greedy patching is the movie length
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
Caching
Caching and Greedy Patching
Movies move
Away from clients
origin
origin
5
4
Cache Level
Cache Level
5
3
2
4
Network for free
3
2
1
1
0.5
client
200
100
0
1
Link
Cost
Increasing network costs
client
200
Movie (0-300 of 500)
100
1
0
0.5
Link
Cost
Movie (0-300 of 500)
Decreasing popularity
INF5070 – media storage and distribution systems
top movie
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Combined optimization
 Scheduling algorithm
 Proxy placement and dimensioning
 Problems with complex scheduling mechanisms
 Examples
 Caching with l-patching
 Patching window is optimized for minimal server load
 Caching with gleaning
 A 1st level proxy cache maintains the ”client buffer” for several clients
 Caching with MPatch
 The initial portion of the movie is cached in a 1st level proxy cache
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Unicast patch stream
Number of concurrent streams
Central server
position in movie (offset)
l-Patching
multicast
time
cyclic
buffer
1st client
2nd client
INF5070 – media storage and distribution systems
 M  2  F  U
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Placement for l-patching
Popular movies are further away
from the client
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Failure of the optimization
 Implicitly assumes perfect delivery
 Has no notion of quality
 User satisfaction is ignored
 Disadvantage
 Popular movies further away from clients




Longer distance
Higher startup latency
Higher loss rate
More jitter
 Popular movies are requested more frequently
 Average delivery quality is lower
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Placement for gleaning
 Combines
 Caching of the full movie
 Optimized patching
 Mandatory proxy cache
 2 degrees of freedom
 Caching level
 Patch length
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Placement for gleaning
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Placement for MPatch
 Combines




Caching of the full movie
Partial caching in proxy servers
Multicast in access networks
Patching from the full copy
 3 degrees of freedom
 Caching level
 Patch length
 Prefix length
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Placement for MPatch
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
Approaches
 Consider quality
 Penalize distance in optimality calculation
 Sort
 Penalty approach
 Low penalties
 Doesn’t achieve order because actual cost is higher

 High penalties
 Doesn’t achieve order because optimizer gets confused
 Sorting
 Trivial
 Very low resource waste
INF5070 – media storage and distribution systems

2003 Carsten Griwodz & Pål Halvorsen
Distribution Architectures
 Combined optimization
 Scheduling algorithm
 Proxy placement and dimensioning
 Impossible to achieve optimum with autonomous caching
 Solution for complex scheduling mechanisms
 A simple solution exists:
 Enforce order according to priorities
 (simple sorting)
 Increase in resource use is marginal
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen
References
1. S.-H. Gary Chan and Fourad A. Tobagi: "Distributed Server Architectures for Networked Video
Services", IEEE/ACM Transactions on Networking 9(2), Apr 2001, pp. 125-136
2. Subhabrata Sen and Jennifer Rexford and Don Towsley: "Proxy Prefix Caxching for Multimedia
Streams", Joint Conference of the IEEE Computer and Communications Societies (INFOCOM),
New York, NY, USA, Mar 1999, pp. 1310-1319
3. Sridhar Ramesh and Injong Rhee and Katherine Guo: "Multicast with cache (mcache): An
adaptive zero-delay video-on-demand service", Joint Conference of the IEEE Computer and
Communications Societies (INFOCOM), Anchorage, Alaska, USA, Apr 2001
4. Michael Bradshaw and Bing Wang and Subhabrata Sen and Lixin Gao and Jim Kurose and
Prashant J. Shenoy and Don Towsley: "Periodic Broadcast and Patching Services Implementation, Measurement, and Analysis in an Internet Streaming Video Testbed", ACM
Multimedia Conference (ACM MM), Ottawa, Canada, Sep 2001, pp. 280-290
5. Bing Wang and Subhabrata Sen and Micah Adler and Don Towsley: "Proxy-based Distribution of
Streaming Video over Unicast/Multicast Connections", Joint Conference of the IEEE Computer
and Communications Societies (INFOCOM), New York, NY, USA, Jun 2002
6. Carsten Griwodz and Michael Zink and Michael Liepert and Giwon On and Ralf Steinmetz,
"Multicast for Savings in Cache-based Video Distribution", Multimedia Computing and
Networking (MMCN), San Jose, CA, USA, Jan 2000
7. Carsten Griwodz and Michael Bär and Lars C. Wolf: "Long-term Movie Popularity in Video-onDemand Systems", ACM Multimedia Conference (ACM MM), Seattle, WA, USA, Nov 1997, pp.
340-357
8. Carsten Griwodz: "Wide-area True Video-on-Demand by a Decentralized Cache-based
Distribution Infrastructure", PhD thesis, Darmstadt University of Technology, Darmstadt,
Germany, Apr 2000
INF5070 – media storage and distribution systems
2003 Carsten Griwodz & Pål Halvorsen