Self-Similarity in Network Traffic
On the Self-Similar Nature of Ethernet Traffic
Will E. Leland, Walter Willinger, and Daniel V. Wilson (Bellcore)
Murad S. Taqqu (Boston University)
Analysis and Prediction of the Dynamic Behavior of Applications, Hosts, and Networks
Overview
What is Self-Similarity?
Ethernet Traffic is Self-Similar
Source of Self-Similarity
Implications of Self-Similarity
Background
Network traffic did not obey the Poisson assumptions used in queueing analysis.
This paper, for the first time, provided an explanation and a systematic approach to modeling realistic data traffic patterns.
It sparked off research around the globe: results show self-similarity in ATM traffic, compressed digital video streams, and Web traffic.
Why is Self-Similarity Important?
In this paper, Ethernet traffic is identified as being self-similar.
Models like Poisson are not able to capture the self-similarity property.
This leads to inaccurate performance evaluation.
Section 1: What is Self-Similarity?
Intuition of Self-Similarity
Something "feels the same" regardless of the scale at which it is viewed (objects with this property are called fractals).
Self-Similarity in Traffic Measurement (Ⅰ): Traffic Measurement
What is Self-Similarity?
For stochastic objects like time series, self-similarity is used in the distributional sense.
Self-similar traffic exhibits bursts at a wide range of timescales.
This has important performance implications.
Self-Similarity in Traffic Measurement (Ⅱ): Network Traffic
Consequences of Self-Similarity
Traditionally, networks are described by generalized Markovian processes, which:
have limited memory of the past
reflect short-range dependence
Under such models, smoothing of bursty data is possible: averaging bursty traffic over a long period of time gives rise to a smooth data stream.
Consequences of Self-Similarity
Self-similar traffic has similar statistical properties at a range of timescales: ms, secs, mins, hrs, days.
Merging of traffic (as in a statistical multiplexer) does not result in smoothing of traffic:
bursty data streams --(aggregation)--> bursty aggregate streams
Properties of Self-Similarity
X = (X_t : t = 0, 1, 2, …) is a covariance-stationary random process (i.e., Cov(X_t, X_{t+k}) does not depend on t, for all k), with mean μ and variance σ².
Let X^(m) = {X_k^(m)} denote the new process obtained by averaging the original series X in non-overlapping sub-blocks of size m.
E.g., if X^(1) = 4, 12, 34, 2, -6, 18, 21, 35,
then X^(2) = 8, 18, 6, 28 and X^(4) = 13, 17.
Suppose the autocorrelation function satisfies r(k) ~ k^(-β), where 0 < β < 1.
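The block-averaging step above can be sketched in a few lines of Python (a minimal illustration; the function name `aggregate` is ours, not from the paper):

```python
def aggregate(x, m):
    """Average the series x over non-overlapping blocks of size m,
    producing the aggregated process X^(m)."""
    n = len(x) // m  # drop any incomplete trailing block
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n)]

x1 = [4, 12, 34, 2, -6, 18, 21, 35]  # X^(1) from the slide's example
print(aggregate(x1, 2))  # [8.0, 18.0, 6.0, 28.0]
print(aggregate(x1, 4))  # [13.0, 17.0]
```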
Self-Similarity
X is exactly second-order self-similar if the aggregated processes have the same autocorrelation structure as X, i.e.,
r^(m)(k) = r(k), for all k ≥ 0 and all m = 1, 2, …
X is asymptotically second-order self-similar if the above holds asymptotically: r^(m)(k) → r(k) as m → ∞.
The most striking feature of self-similarity: the correlation structure of the aggregated process does not degenerate as m → ∞.
This is in contrast to traditional models, where the correlation structures of the aggregated processes degenerate as m → ∞, i.e., r^(m)(k) → 0 as m → ∞, for k = 1, 2, 3, …
Example: (plots comparing a Poisson process with a self-similar process, not reproduced)
Long-Range Dependence
Processes with long-range dependence are characterized by an autocorrelation function that decays hyperbolically as the lag k increases.
Important property: Σ_k r(k) = ∞, also called non-summability of the correlations.
The intuition behind long-range dependence: while high-lag correlations are all individually small, their cumulative effect is important.
This gives rise to features drastically different from conventional short-range-dependent processes.
Intuition
Short-range-dependent processes have exponentially decaying autocorrelations:
r(k) ~ p^k as k → ∞, 0 < p < 1,
so the sum Σ_k r(k) is finite.
Non-summability is an important property: it guarantees a non-degenerate correlation structure of the aggregated processes X^(m) as m → ∞.
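The contrast between the two decay regimes is easy to check numerically. A small sketch: hyperbolic correlations r(k) ~ k^(-β) are non-summable (the partial sums keep growing), while exponential correlations r(k) = p^k sum to the finite limit p / (1 - p). The values β = 0.6 and p = 0.5 are illustrative choices, not taken from the paper.

```python
def partial_sum(r, n):
    """Sum r(k) for k = 1 .. n."""
    return sum(r(k) for k in range(1, n + 1))

hyperbolic = lambda k: k ** -0.6   # long-range dependent decay
exponential = lambda k: 0.5 ** k   # short-range dependent decay

for n in (100, 10_000, 1_000_000):
    print(n, round(partial_sum(hyperbolic, n), 1),
          round(partial_sum(exponential, n), 6))
# The hyperbolic sums grow without bound as n increases;
# the exponential sums settle at 0.5 / (1 - 0.5) = 1.0.
```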
The Measure of Self-Similarity
The Hurst parameter H, 0.5 < H < 1.
Three approaches to estimating H (based on properties of self-similar processes):
variance analysis of the aggregated processes
analysis of the rescaled-range (R/S) statistic for different block sizes
the Whittle estimator
Variance Analysis
The variance of the aggregated processes decays as
Var(X^(m)) = a m^(-β) as m → ∞,
whereas for short-range-dependent processes (e.g., a Poisson process),
Var(X^(m)) = a m^(-1) as m → ∞.
Plot Var(X^(m)) against m on a log-log plot: the slope gives the decay parameter, and a slope > -1 is indicative of self-similarity.
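The variance-time method above can be sketched directly: regress log Var(X^(m)) on log m; the slope estimates -β, and H = 1 - β/2. This is run on i.i.d. Gaussian noise (a short-range process), so the slope should come out near -1 and H near 0.5; the sample sizes and block sizes are illustrative choices.

```python
import math
import random

random.seed(0)

def aggregate(x, m):
    """Average x over non-overlapping blocks of size m."""
    n = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n)]

def variance(x):
    mu = sum(x) / len(x)
    return sum((v - mu) ** 2 for v in x) / len(x)

x = [random.gauss(0, 1) for _ in range(100_000)]
pts = [(math.log(m), math.log(variance(aggregate(x, m))))
       for m in (1, 2, 4, 8, 16, 32, 64)]

# least-squares slope of log Var(X^(m)) versus log m
mx = sum(a for a, _ in pts) / len(pts)
my = sum(b for _, b in pts) / len(pts)
slope = (sum((a - mx) * (b - my) for a, b in pts)
         / sum((a - mx) ** 2 for a, _ in pts))
H = 1 + slope / 2  # slope = -beta, and H = 1 - beta/2
print(round(slope, 2), round(H, 2))  # slope near -1, H near 0.5
```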
The Hurst Effect
For self-similar data, the rescaled-range (R/S) statistic grows according to c·n^H, where H is the Hurst parameter, H > 0.5.
For short-range processes, the R/S statistic grows as d·n^0.5.
History: the Nile river. In the 1940s-50s, Harold Edwin Hurst studied an 800-year record of flooding along the Nile river (yearly minimum water level) and found long-range dependence.
The R/S statistic
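The formula on this slide did not survive transcription. As a sketch, the standard rescaled-range statistic for a series X_1, …, X_n is R/S = (max_k W_k - min_k W_k) / S, where W_k is the k-th partial sum of the mean-centered series and S is the sample standard deviation:

```python
import math

def rescaled_range(x):
    """Standard R/S statistic; for self-similar data E[R/S] ~ c * n^H."""
    n = len(x)
    mean = sum(x) / n
    w, partial = [0.0], 0.0
    for v in x:                    # partial sums of the centered series
        partial += v - mean
        w.append(partial)
    r = max(w) - min(w)                                 # the range R
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # the scale S
    return r / s

print(rescaled_range([1, -1, 1, -1, 1, -1]))  # 1.0 for this alternating series
```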
Self-Similarity
X is exactly second-order self-similar with Hurst parameter H (= 1 - β/2) if, for all m,
Var(X^(m)) = σ² m^(-β), and
r^(m)(k) = r(k), for all k ≥ 0.
X is asymptotically second-order self-similar if the above holds asymptotically: r^(m)(k) → r(k) as m → ∞.
Properties of Self-Similarity
Var(X^(m)) (= σ² m^(-β)) decreases more slowly than m^(-1).
r(k) decreases hyperbolically (not exponentially), so that Σ_k r(k) = ∞ (long-range dependence).
The spectral density [the discrete-time Fourier transform of r(k)] obeys f(λ) ~ c λ^(-(1-β)) as λ → 0 (not bounded at the origin).
Modeling Self-Similarity
Fractional Gaussian noise (FGN):
a Gaussian process with mean μ, variance σ², and autocorrelation function r(k) = ½(|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), k > 0;
exactly second-order self-similar with 0.5 < H < 1.
Fractional ARIMA(p, d, q):
asymptotically second-order self-similar with H = d + 0.5, where 0 < d < 0.5.
Discrete-time M/G/∞ input model:
service time X has a heavy-tailed distribution (i.e., E[X] < ∞, σ² = ∞), e.g. a Pareto distribution with P(X > k) ~ k^(-α), 1 < α < 2;
N = {N_t, t = 1, 2, …} is self-similar with H = (3-α)/2, where N_t denotes the number of customers in service at time t.
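The M/G/∞ model above is easy to simulate. A minimal sketch, using inverse-CDF Pareto sampling and Knuth's Poisson sampler; the arrival rate, horizon, and helper names are our illustrative choices, not from the paper:

```python
import math
import random

random.seed(2)

def poisson(lam):
    """Knuth's Poisson sampler; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

def pareto_service(alpha, xm=1.0):
    """Inverse-CDF sample with P(X > k) = (xm / k)^alpha for k >= xm."""
    return xm / random.random() ** (1.0 / alpha)

def mg_infinity_counts(steps, rate=5.0, alpha=1.5):
    """N_t = number of sessions in service at step t.  With
    1 < alpha < 2 the count process is self-similar, H = (3 - alpha)/2."""
    departures, counts = [], []
    for t in range(steps):
        for _ in range(poisson(rate)):                 # new sessions arrive
            departures.append(t + pareto_service(alpha))
        departures = [d for d in departures if d > t]  # still in service
        counts.append(len(departures))
    return counts

counts = mg_infinity_counts(2000)
print(min(counts), max(counts))  # heavy-tailed holding times make N_t bursty
```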
Inference for Self-Similar Processes
Time-domain analysis based on the R/S statistic.
Variance analysis based on the aggregated process X^(m):
recall Var(X^(m)) = σ² m^(-β), so plot log(Var(X^(m))) against log m.
Frequency-domain analysis:
estimate the power spectral density f(λ) using the discrete-time Fourier transform;
recall f(λ) ~ c λ^(-(1-β)) as λ → 0, so plot log(f(λ)) against log λ.
Combining Whittle's MLE approach with the aggregation method provides confidence intervals.
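The frequency-domain approach can be sketched with a plain-DFT periodogram (no Whittle machinery; the series length and the 64-frequency cutoff are illustrative choices). The low-frequency log-log slope estimates -(1-β) = -(2H-1), so H = (1 - slope)/2; on i.i.d. noise the spectrum is flat and H should come out near 0.5:

```python
import cmath
import math
import random

random.seed(3)

def periodogram(x):
    """I(lambda_j) at the Fourier frequencies lambda_j = 2*pi*j/n."""
    n = len(x)
    return [(2 * math.pi * j / n,
             abs(sum(x[t] * cmath.exp(-2j * math.pi * j * t / n)
                     for t in range(n))) ** 2 / n)
            for j in range(1, n // 2)]

x = [random.gauss(0, 1) for _ in range(512)]
low = periodogram(x)[:64]  # keep only the lowest frequencies (lambda -> 0)
lx = [math.log(lam) for lam, _ in low]
ly = [math.log(p) for _, p in low]

# least-squares slope of log I(lambda) versus log lambda
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
         / sum((a - mx) ** 2 for a in lx))
H = (1 - slope) / 2  # from f(lambda) ~ c * lambda^(-(2H - 1))
print(round(H, 2))   # near 0.5 for this short-range input
```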
Section 2: Ethernet Traffic is Self-Similar
Data Collection
Ethernet traffic of a large lab at Bellcore
100 µs time resolution
Packet length, interface status, and header info recorded
Mostly IP and NFS traffic
Four data sets over a three-year period
Over 100 million packets in the traces
Plots Showing Self-Similarity (Ⅰ)
(Plots with reference lines at H = 1 and H = 0.5, not reproduced.)
Estimated H ≈ 0.8
Plots Showing Self-Similarity (Ⅱ)
High traffic: 5.0%-30.7%
Mid traffic: 3.4%-18.4%
Low traffic: 1.3%-10.4%
Higher traffic, higher H
H : A Function of Network Utilization
Observation (contrary to Poisson models): H increases with network utilization.
As we shall see shortly, H measures traffic burstiness.
As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother.
H : Measuring "Burstiness"
An intuitive explanation uses the M/G/∞ model: as α → 1, the service time becomes more variable, making it easier to generate bursts, and H increases.
Wrong ways to measure the "burstiness" of a self-similar process:
peak-to-mean ratio
coefficient of variation (of interarrival times)
Summary
Ethernet LAN traffic is statistically self-similar.
H : the degree of self-similarity
H : a function of utilization
H : a measure of "burstiness"
Models like Poisson are not able to capture self-similarity.
Discussions
How to explain self-similarity? Heavy-tailed file sizes.
How does this impact existing performance? Limited effectiveness of buffering; effectiveness of FEC.
How to adapt to self-similarity? Prediction; adaptive FEC.
Thanks!