Bandwidth estimation basics

Download Report

Transcript Bandwidth estimation basics

Bandwidth estimation
in computer networks:
measurement techniques &
applications
Constantine Dovrolis
College of Computing
Georgia Institute of Technology
The Internet as a “black box”
 Several network properties are important for applications and
transport protocols:
 Delay, loss rate, jitter, capacity, congestion, load, etc
 But, routers and switches do not provide direct feedback to
end-hosts (except ICMP, also of limited use)
 Mostly due to scalability, policy, and simplicity factors
Can we guess what is in the black box?
probe
packets
 End-systems can infer network state through end-to-end (e2e)
measurements
 Without any feedback from routers
 Objectives: accuracy, speed, non-intrusiveness
Bandwidth estimation in packet networks
 Bandwidth estimation (bwest):

Inference of various throughput-related metrics
with end-to-end network measurements
 Early works:






Keshav’s packet pair method for congestion control (‘89)
Bolot’s capacity measurements (’93)
Carter-Crovella’s bprobe and cprobe tools (’96)
Jacobson’s pathchar – per hop capacity estimation (‘97)
Melander’s TOPP avail-bw estimation (’00)
In last ± 5 years:





Several new estimation techniques
Many research papers at prestigious conferences
More than a dozen of new measurement tools
Several applications of bwest methods
Significant commercial interest in bwest technology
Overview
 This talk is an overview of the most important developments in
bwest area over the last 10 years

A personal bias could not be avoided..
 Overview


Bandwidth-related metrics
Capacity estimation


Packet pairs and CapProbe technique
Available bandwidth estimation



Iterative probing: Pathload and PathChirp
Direct probing: Spruce
Comparison of tools
Percentile estimation: Pathvar
Applications




Overlay-based Video Streaming
SOcket Buffer Auto-Sizing (SOBAS)
Network bandwidth
metrics
Ravi S.Prasad, Marg Murray, K.C. Claffy,
Constantine Dovrolis,
“Bandwidth Estimation: Metrics, Measurement
Techniques, and Tools,”
IEEE Network, November/December 2003.
Capacity definition
 Maximum possible end-to-end throughput at IP layer


In the absence of any cross traffic
Achievable with maximum-sized packets
 If Ci is capacity of link i, end-to-end capacity C defined as:
C  min Ci
i 1,..., H
 Capacity determined by narrow link
Available bandwidth definition
 Per-hop average avail-bw:
Ai = Ci (1-ui)
 ui: average utilization
 A.k.a. residual capacity
 End-to-end avg avail-bw A:

A  min Ai
i 1,..., H
 Determined by tight link
 ISPs measure per-hop
avail-bw passively (router
counters, MRTG graphs)
The avail-bw as a random process
 Instantaneous utilization ui(t): either 0 or 1
 Link utilization in (t, t+t)
ui (t , t  t ) 
1
t
t t
 u (t )dt
i
t
 Averaging timescale: t
 Available bandwidth in (t, t+t)
Ai (t , t  t )  Ci [1  ui (t , t  t )]
 End-to-end available bandwidth in (t, t+t)
A(t , t  t )  min Ai (t , t  t )
i 1,..., H
Available bandwidth distribution
 Avail-bw has significant variability

Need to estimate second-order moments, or even better,
the avail-bw distribution
 Variability depends on averaging timescale
 Larger timescale, lower variance
 Distribution is Gaussian-like, if
sufficient flow multiplexing
t
t >100-200 msec and with
E2E Capacity estimation
(a’):
Packet pair technique
Constantine Dovrolis, Parmesh Ramanathan, David
Moore,
“Packet Dispersion Techniques and Capacity
Estimation,”
In IEEE/ACM Transactions on Networking, Dec
2004.
Packet pair dispersion




Packet Pair (P-P) technique

Originally, due to
Jacobson & Keshav
Send two equal-sized packets
back-to-back

Packet size: L

Packet trx time at link i:
L/Ci
P-P dispersion: time interval
between last bit of two
packets
Without any cross traffic,
the dispersion at receiver is
determined by narrow link:
L L
 R  max   
i 1,..., H
 Ci  C
 out

L
 max  in , 
Ci 

Cross traffic interference



Cross traffic packets can affect P-P dispersion

P-P expansion: capacity underestimation

P-P compression: capacity overestimation
Noise in P-P distribution depends on cross traffic load
Example: Internet path with 1Mbps capacity
Multimodal packet pair distribution

Typically, P-P distribution includes several local modes


Sub-Capacity Dispersion Range (SCDR) modes:


One of these modes (not always the strongest) is located at L/C
P-P expansion due to common cross traffic packet sizes (e.g., 40B, 1500B)
Post-Narrow Capacity Modes (PNCMs):

P-P compression at links that follow narrow link
E2E Capacity estimation
(b’):
CapProbe
Rohit Kapoor, Ling-Jyh Chen, Li Lao, Mario
Gerla, M. Y. Sanadidi,
"CapProbe: A Simple and Accurate Capacity
Estimation Technique,"
ACM SIGCOMM 2004
Compression of p-p dispersion
 First packet queueing => compressed dispersion

Capacity overestimation
Expansion of p-p dispersion
 Second packet queueing => expansion of dispersion

Capacity underestimation
CapProbe


Both expansion and compression result from queuing delays
Key insight: packet pair with zero queueing delay would yield
correct estimate
 measure RTT for each probing packet of packet pair
 estimate capacity from pair with minimum RTT-sum
Capacity
Measurements - Internet, Internet2
• CapProbe implemented using PING packets, sent in pairs
UCLA-2
To
Cap
Probe
Pathrate
Pathchar
UCLA-3
UA
NTNU
Time
Capacity
Time
Capacity
Time
Capacity
Time
Capacity
0’03
5.5
0’01
96
0’02
98
0’07
97
0’03
5.6
0’01
97
0’04
79
0’07
97
0’03
5.5
0’02
97
0’17
83
0’22
97
6’10
5.6
0’16
98
5’19
86
0’29
97
6’14
5.4
0’16
98
5’20
88
0’25
97
6’5
5.7
0’16
98
5’18
133
0’25
97
21’12
4.0
22’49
18
3 hr
34
3 hr
34
20’43
3.9
27’41
18
3 hr
34
3 hr
35
21.18
4.0
29’47
18
3 hr
30
3 hr
35
Avail-bw estimation (a’):
Pathload
Manish Jain, Constantine Dovrolis,
“End-to-End Available Bandwidth:
Measurement Methodology, Dynamics, and
Relation with TCP Throughput,”
IEEE/ACM Transactions on Networking, Aug
2003.
Probing methodology
 Sender transmits periodic packet stream of rate R
K packets, packet size L, interarrival T = L/R
 Receiver measures One-Way Delay (OWD) for each packet
 D(k) = tarv(k) - tsnd(k)
 OWD variations: Δ(k) = D(k+1) – D(k)
 Independent of clock offset between sender/receiver

 With stationary & fluid-modeled cross traffic:


If R > A, then Δ(k) > 0 for all k
Else, Δ(k) = 0 for all k
Self-loading periodic streams
 Increasing OWDs means R>A
 Almost constant OWDs means R<A
Example of OWD variations
 12-hop path from U-Delaware to U-Oregon


K=100 packets, A=74Mbps, T=100μsec
Rleft = 97Mbps, Rright=34Mbps

Ri
Iterative probing and Pathload
Avail-bw time series
from NLANR trace
 Iterative probing in Pathload:




Send multiple probing streams (duration τ) at various rates
Each stream samples path at different time interval
Outcome of each stream is either Ri > A or Ri < A
Estimate upper and lower bound for avail-bw variation range
Avail-bw estimation
(b’):
PathChirp
Vinay Ribeiro, Rolf Riedi, Jiri Navratil, Rich
Baraniuk, Les Cottrell,
“PathChirp: Efficient Available Bandwidth
Estimation,”
PAM 2003
Chirp Packet Trains
 Exponentially decrease packet spacing within packet train
 Wide range of probing rates
 Efficient: few packets
  1.4  13 packets, 1 - 100Mbps
Chirps vs. CBR Trains
 Multiple rates in each chirping train


Allows one estimate per-chirp
Potentially more efficient estimation
Comparison with Pathload
• 100Mbps links
• pathChirp uses 10
times fewer bytes
for comparable
accuracy
Available
bandwidth
Efficiency
pathchirp
pathload
Accuracy
pathChirp
10-90%
pathload
Avg.min-max
30Mbps
0.35MB
3.9MB
19-29Mbps
16-31Mbps
50Mbps
0.75MB
5.6MB
39-48Mbps
39-52Mbps
70Mbps
0.6MB
8.6MB
54-63Mbps
63-74Mbps
Avail-bw estimation (c’):
Direct probing & Spruce
Jacob Strauss, Dina Katabi, and Frans
Kaashoek
“A Measurement Study of Available Bandwidth
Estimation Tools,”
IMC 2003
Direct probing
 Iterative methods are relatively slow and they
need to probe at multiple rates
 Direct probing methods
 Probe only once at a rate Ri and determine availbw from output rate Ro
 Spruce (IMC’03) followed this approach
 With a single-link fluid model (capacity Ct):
 Sender sends periodic stream with input rate Ri
 Receiver measures output rate Ro
Ct
A  Ct  Ri (  1)
Ro
What’s wrong with Direct Probing?

Ct
A  Ct  Ri (  1)
Ro
Two major issues:

Tight link capacity Ct is assumed to be known


What if tight link is different than narrow link?
Input rate Ri equal to source rate (set equal to Ct)




But what would happen in a multi-hop path?
Probing stream can arrive at tight link with lower rate than
source rate (due to a link with lower avail-bw than Ct before
the tight link)
Results in avail-bw underestimation
For details, see CCR paper, Oct 2006, by Lao, Dovrolis and
Sanadidi,
 “The Probe Gap Model can Underestimate the Available
Bandwidth of Multihop Paths”
Avail-bw estimation (d’):
percentile estimation
Manish Jain, Constantine Dovrolis,
“End-to-End Estimation of the Available
Bandwidth Variation Range,”
ACM SIGMETRICS 2005
Avail-bw distribution and percentiles
 Avail-bw random process, in timescale t: At(t)
Assume stationarity
 Marginal distribution of At:
 Ft(R) = Prob [At ≤ R]
 Ap :pth percentile of At, such that p = Ft(Ap)

Estimate variation range [AL, AH] for
given averaging timescale t
 Objective:


AL and AH are pL and pH percentiles of At
Typically, pL =0.10 and pH =0.90
Percentile sampling
 Question : Which percentile of At corresponds to rate R?
Given R and t, estimate Ft(R)
 Assume that Ft(R) is inversible
 Sender transmits periodic packet stream of rate R
 Length of stream = measurement timescale t
 Receiver classifies stream as
 Type-G if At ≤ R: I(R)= 1 with probability Ft(R)
 Type-L otherwise: I(R)= 0 with probability 1-Ft(R)
 Collect N samples of I(R) by sending N streams of rate R

 Number of type-G streams: I(R,N)=
Σi
Ii(R)
 E[ I(R,N) ] = Ft(R) * N
 Percentile rank of probing rate R: Ft(R) = E[I(R,N)] / N
 Use I(R,N)/N as estimator of Ft(R)
Non-parametric estimation
 Does not assume specific avail-bw distribution
 Iterative algorithm
Stationarity requirement across iterations
 N-th iteration: probing rate Rn
 Use percentile sampling to estimate percentile rank of Rn
 To estimate the upper percentile AH with pH = Ft(AH):
 fn = I(Rn,N)/N
 If fn is between pH±r, report AH = Rn
 Otherwise,
H
 If fn > p + r, set Rn+1 < Rn
H
 If fn < p - r, set Rn+1 > Rn
 Similarly, estimate the lower percentile AL

Validation example
 Verification using real Internet traffic traces
b=0.05
b=0.15
 Non-parametric estimator tracks variation range within 10-20%
 Selection of b depends on traffic
 Traffic spikes/dips may not be detected if b is too small
 But larger b causes larger MSRE
Parametric estimation
 Assume Gaussian avail-bw distribution
Justified assumption for large degree of traffic
multiplexing
 And/or for long averaging timescale (>200msec)
 Gaussian distribution completely specified by
 Mean m and standard deviation st
 pth percentile of Gaussian distribution
p
-1
 A = m + st f (p)

 Sender transmits N probing streams of rates R1 and R2
 Receiver determines percentiles ranks corresponding to R1
and R2
 m and st can be then estimated by solving
-1
 R1 = m + st f (p1)
-1
 R2 = m + st f (p2)
 Variation range is then calculated from:
H
-1 H
 A = m + st f (p )
L
-1 L
 A = m + st f (p )
Validation example
 Verification using real Internet traffic traces
 Evaluated with both Gaussian and non-Gaussian traffic
Gaussian traffic
non-Gaussian traffic
 Parametric algorithm is more accurate than non-parametric
algorithm, when
 traffic is good match to Gaussian model
 in non-stationary conditions
Experimental
comparison of avail-bw
estimation tools
Alok Shriram, Marg Murray, Young Hyun, Nevil
Brownlee, Andre Broido, k claffy,
”Comparison of Public End-to-End Bandwidth
Estimation Tools on High-Speed Links,”
PAM 2005
Testbed topology
Accuracy comparison - SmartBits
Direction 1, Measured AB
Direction 2, Measured AB
Actual AB
Accuracy comparison - TCPreplay
Actual Available Bandwidth
Measured Available Bandwidth
Measurement latency comparison
• Abing: 1.3 to 1.4 s
• Spruce: 10.9 to 11.2 s
• Pathload: 7.2 to 22.3 s
• Patchchirp: 5.4 s
• Iperf: 10.0 to 10.2 s
Probing traffic comparison
Applications of
bandwidth estimation
Applications of bandwidth estimation
 Large TCP transfers and congestion control


Bandwidth-delay product estimation
Socket buffer sizing
 Streaming multimedia
Adjust encoding rate based on avail-bw
Intelligent routing systems
 Overlay networks and multihoming
 Select best path based on capacity or avail-bw
Content Distribution Networks (CDNs)
 Choose server based on least-loaded path
SLA and QoS verification
 Monitor path load and allocated capacity
End-to-end admission control
Network “spectroscopy”
Several more..







Overlay-based Video
Streaming
M. Jain and C. Dovrolis,
“Path Selection using Available Bandwidth
Estimation in Overlay-based Video Streaming,”
published at IFIP Networking 2007.
Motivation
 Video/IPTV next “killer-application” in the
Internet
 IP networks present several challenges
 Network impairments such as losses and jitter
 Lack of QoS guarantees
 Previous approaches

Adapt encoding rate


Use proactive error correction techniques such as
forward error correction (FEC) or retransmission



Video quality not consistent
Bandwidth overhead for FEC
Potentially larger playback delay
Error concealment techniques

Limited applicability & effectiveness
Exploiting Path Diversity
 Multihoming and overlay networks make multiple paths
available between sender and receiver
 Applications can dynamically switch from one path to
another


Based on observed (or predicted) performance
Serves as an additional adaptation mechanism
 Schemes using path diversity based on network
measurement have been explored




Path selection techniques: Tao et al. [ACM Multimedia ‘04]
and Amir et al. [NOSSDAV ‘05]
Multi-description coding techniques: Begen et al. [Signal
Processing ‘05] and Apostolopoulos et al. [INFOCOM ‘02]
They use only loss and jitter measurements
Employ dummy packets for measurement
Our Approach
 We consider use of overlay network for video
streaming

To maximize perceived video quality
 Novel features in our approach:
 Use avail-bw to drive dynamic path selection
 Use VQM technique instead of PSNR to measure
video quality
 Described in ITU-T recommendation J.144
 Use data packets to substitute “dummy” packets
for network measurements
 Objective: Evaluate the performance of different
path selection schemes in terms of perceived
video quality
Overlay based Video Streaming
 Overlay nodes run measurement
module (m-module)
 To measure network
performance
 Video stream delivered via at most
two-hop path
 Based on results by Zhu,
Dovrolis and Ammar, 2006
 M-module uses active
measurements to estimate



Loss-rate
Jitter
Avail-bw: average and
variation range
 Uses video packets for in-band
measurements
 In paths with video streams
Self Loading Periodic Streams
(SLoPS)
 Objective: Determine whether avail-bw A is greater or smaller
than a given rate R
A > R or A <= R
 Sender transmits periodic packet stream of rate R
 K packets, packet size L, departure spacing T = L/R
 Receiver measures One-Way Delay (OWD) for each packet


D(k) = tarv(k) - tsnd(k)

OWD variations: Δ(k) = D(k+1) – D(k)

Independent of clock offset between sender/receiver
 With stationary & fluid-modeled cross traffic:
 If R > A, then Δ(k) > 0 for all k
 Else, Δ(k) = 0 for all k
In-band Avail-bw Estimation
 In-band avail-bw estimation uses video packets for probing
network paths
 Ingress M-module shapes the incoming probing packets to rate
Rp
 If incoming rate Rv < Rp  Add decreasing amount of delay di
 Else, add increasing amount of delay di
 Insert di in packet header
 Egress M-module re-shapes the video stream to rate Rv using di
Path Selection schemes
 We consider 4 path selection schemes
 Based on choice of measured network performance metric
 Loss based (LPS)
 Loss-rate monitored during 3 second period
 Path with lowest loss-rate selected
 Jitter based (JPS)
 Jitter of successive packet pairs measured in 3 second
period
th percentile of jitter measurements
 Path with minimum 90
selected
 If jitter is same, then path with lowest loss-rate is selected

Avail-bw based



Average avail (A-APS)
Lower bound of avail-bw variation range (L-APS)
In current path:
 If avail-bw < twice of video rate; then switch to path with
maximum avail-bw
Experimental Setup
 Testbed details:
 Video stream transmitted from node B to E
 A 2-minute MPEG-2 encoded clip from SMPTE test
video sequences
 M-module on nodes B, D, and E
 Cross traffic generated by replaying NLANR packet
traces
Video Quality Estimation
 Performance of path selection schemes evaluated
based on:

Video quality
 Measured using VQM tool
 http://www.its.bldrdoc.gov/n3/video/vqmsoftware.htm
VQM compares the original video stream with the
received video stream
 Lower VQM score implies higher quality
User-abort probability
 Based on VQM scores of 10-second video segments
Path switching frequency



Results: Experiment 1
 L-APS performs better than other schemes

A-APS performance similar to LPS

Average avail-bw does not capture the variability of avail-bw
distribution
 Ranking of 4 schemes in terms of abort fraction is similar to
long term VQM scores
Results: Experiment 1
 L-APS has lowest path switching frequency

JPS has higher path switching frequency than L-APS

Similar path switching frequency to A-APS
Results: Experiment 2
 Avail-bw based path selection results in better
perceived video quality

Compared to algorithms relying on jitter or loss-rate
Path Switching versus FEC
 We compared performance of simplest form of Reed-Solomon
FEC code (n, k)
 n-k out of n packets carry FEC packets
 In our experiments n = k+1
 L-APS performs consistently better than FEC
Summary
 Explored the use of measurement driven path
selection schemes for video streaming
 Avail-bw estimates based path selection shows
better performance

Compared to other network metrics
 An interesting next step would be to use real-time
video quality measurements to drive path selection
schemes
 Also, hybrid schemes involving use of both path
selection and FEC is unexplored
Improved TCP Throughput
with BwEst
Ravi S. Prasad, Manish Jain, Constantine Dovrolis,
“Socket Buffer Auto-Sizing for HighPerformance Data Transfers,”
Journal of Grid Computing, 2003
TCP performs poorly in fat-long paths

Basic idea in TCP congestion control:

Increase congestion window until packet loss occurs

Then reduce window by a factor of two (multiplicative decrease)

Consequences

Increased loss-rate

Avg. throughput less than
available bandwidth

Large delay-variation

Example: 1Gbps path, 100msec
RTT, 1500B pkts



Single loss ≈ 4000 pkts
window reduction
Recovery from single loss ≈
400 sec
Need very small loss
probability (ρ<2*10-8) to
saturate path
Can we avoid congestive losses w/o
changing TCP?
Ws = min {Wc, Wr}
But Wr<= S

S: rcv socket buffer size

We can limit S so that connection
is limited by receive window: Ws = S

Non-congested path:

Throughput R(S) = S/T(S)

R(S) increases with S
until it reaches MFT

MFT: Maximum Feasible Throughput (onset of congestion)

Objective: set S close to MFT, to avoid any losses and stabilize
TCP send window to a safe value

Congested path:

Path with losses before start of target transfer

Limiting S can only reduce throughput


Socket buffer auto-sizing (SOBAS)
 Apply SOBAS only in non-congested paths
 Receiving application measures rcv-throughput Rrcv
 When receive throughput becomes constant:
 Connection has reached its maximum throughput Rmax
 If the send window becomes any larger, losses will
occur
 Receiving application limits rcv socket buffer size S:
 S = RTT * Rmax
 With congestion unresponsive cross traffic: Rmax = A
 With congestion responsive cross traffic: Rmax = MFT
 SOBAS does not require any changes in TCP
 But estimation of RTT and congestion-status requires
“ping-like” probing
Measurements at WAN path
Non-SOBAS



SOBAS
Path: GaTech to LBL,CA
SOBAS flow get higher average throughput than non-SOBAS
flow because former avoids all congestion losses
SOBAS does not significantly increase path’s RTT
To conclude..
Things I have not talked about
 Other estimation tools & techniques
ABget, pipechar, STAB, pathneck, IGI/PTR, …
 Per-hop capacity estimation (pathchar, clink, pchar)
Other applications
 Bwest in new TCPs (e.g., TCP Westwood, TCP FAST, PCP)
 Bwest in wireless nets
 Bwest in multihoming
 Bwest in bottleneck detection
Theoretical analysis of packet pair/train dispersion
 Unavoidable estimation bias in avail-bw measurements due to
traffic burstiness
 Diminishes as packet train length increases
Scalable bandwidth measurements in networks with many monitoring
nodes
Practical issues
 Interrupt coalescence
 Traffic shapers
 Non-FIFO queues





Conclusions
 We know how to do the following:
 Estimate e2e capacity & avail-bw
 Apply bwest in several applications
 We know that we cannot do the following:
 Estimate per-hop capacity with VPS techniques
 Avoid avail-bw estimation bias when traffic is bursty
and/or path has multiple bottlenecks
 We do not yet know how to do the following:
 Scalable but accurate bandwidth measurements
 Integrate with more applications
 Many research questions are still open
 Help wanted!