Self-Similarity in Network Traffic
On the Self-Similar Nature of Ethernet Traffic
Will E. Leland, Walter Willinger, and Daniel V. Wilson (Bellcore)
Murad S. Taqqu (Boston University)
Analysis and Prediction of the Dynamic Behavior of Applications, Hosts, and Networks
Overview
What is Self-Similarity?
Ethernet Traffic is Self-Similar
Source of Self-Similarity
Implications of Self-Similarity
Background
Network traffic did not obey the Poisson assumptions used in queueing analysis.
This paper, for the first time, provided an explanation and a systematic approach to modeling realistic data traffic patterns.
It sparked off research around the globe: results show self-similarity in ATM traffic, compressed digital video streams, and Web traffic.
Why is Self-Similarity Important?
In this paper, Ethernet traffic is identified as being self-similar.
Models like Poisson are not able to capture the self-similarity property.
This leads to inaccurate performance evaluation.
Section 1: What is Self-Similarity?
Intuition of Self-Similarity
Something "feels the same" regardless of the scale at which it is viewed (objects with this property are called fractals).
Self-Similarity in Traffic Measurement (Ⅰ): Traffic Measurement
What is Self-Similarity?
For stochastic objects like time series, self-similarity is used in the distributional sense.
Self-similar traffic exhibits bursts at a wide range of timescales.
This has important performance implications.
Self-Similarity in Traffic Measurement (Ⅱ): Network Traffic
Consequences of Self-Similarity
Traditionally, networks are described by generalized Markovian processes, which:
have limited memory of the past
reflect short-range dependence
Under such models, smoothing of bursty data is possible: averaging bursty traffic over a long period of time gives rise to a smooth data stream.
Consequences of Self-Similarity
Self-similar traffic has similar statistical properties at a range of timescales: ms, secs, mins, hrs, days.
Merging of traffic (as in a statistical multiplexer) does not result in smoothing of traffic:
bursty data streams --(aggregation)--> bursty aggregate streams
Properties of Self-Similarity
X = (X_t : t = 0, 1, 2, …) is a covariance-stationary random process (i.e., Cov(X_t, X_{t+k}) does not depend on t, for all k), with mean μ and variance σ².
Let X^(m) = {X_k^(m)} denote the new process obtained by averaging the original series X in non-overlapping sub-blocks of size m.
E.g., if X^(1) = 4, 12, 34, 2, -6, 18, 21, 35,
then X^(2) = 8, 18, 6, 28 and X^(4) = 13, 17.
Suppose the autocorrelation function satisfies r(k) ~ k^(-β), where 0 < β < 1.
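The block-averaging step above can be sketched in a few lines of Python (a minimal illustration; the function name `aggregate` is ours, not from the paper):

```python
def aggregate(x, m):
    """Average the series x over non-overlapping blocks of size m,
    producing the aggregated process X^(m)."""
    n = len(x) // m  # drop any incomplete trailing block
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n)]

x1 = [4, 12, 34, 2, -6, 18, 21, 35]  # X^(1) from the slide's example
print(aggregate(x1, 2))  # [8.0, 18.0, 6.0, 28.0]
print(aggregate(x1, 4))  # [13.0, 17.0]
```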
Self-Similarity
X is exactly second-order self-similar if the aggregated processes have the same autocorrelation structure as X, i.e.,
r^(m)(k) = r(k), for all k ≥ 0 and all m = 1, 2, …
X is asymptotically second-order self-similar if the above holds asymptotically: r^(m)(k) → r(k) as m → ∞.
The most striking feature of self-similarity: the correlation structure of the aggregated process does not degenerate as m → ∞.
This is in contrast to traditional models, where the correlation structures of the aggregated processes degenerate as m → ∞, i.e., r^(m)(k) → 0 as m → ∞, for k = 1, 2, 3, …
Example: (plots comparing a Poisson process with a self-similar process, not reproduced)
Long-Range Dependence
Processes with long-range dependence are characterized by an autocorrelation function that decays hyperbolically as the lag k increases.
Important property: Σ_k r(k) = ∞, also called non-summability of the correlations.
The intuition behind long-range dependence: while high-lag correlations are all individually small, their cumulative effect is important.
This gives rise to features drastically different from conventional short-range-dependent processes.
Intuition
Short-range-dependent processes have exponentially decaying autocorrelations:
r(k) ~ p^k as k → ∞, 0 < p < 1,
so the sum Σ_k r(k) is finite.
Non-summability is an important property: it guarantees a non-degenerate correlation structure of the aggregated processes X^(m) as m → ∞.
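The contrast between the two decay regimes is easy to check numerically. A small sketch: hyperbolic correlations r(k) ~ k^(-β) are non-summable (the partial sums keep growing), while exponential correlations r(k) = p^k sum to the finite limit p / (1 - p). The values β = 0.6 and p = 0.5 are illustrative choices, not taken from the paper.

```python
def partial_sum(r, n):
    """Sum r(k) for k = 1 .. n."""
    return sum(r(k) for k in range(1, n + 1))

hyperbolic = lambda k: k ** -0.6   # long-range dependent decay
exponential = lambda k: 0.5 ** k   # short-range dependent decay

for n in (100, 10_000, 1_000_000):
    print(n, round(partial_sum(hyperbolic, n), 1),
          round(partial_sum(exponential, n), 6))
# The hyperbolic sums grow without bound as n increases;
# the exponential sums settle at 0.5 / (1 - 0.5) = 1.0.
```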
The Measure of Self-Similarity
The Hurst parameter H, 0.5 < H < 1.
Three approaches to estimating H (based on properties of self-similar processes):
variance analysis of the aggregated processes
analysis of the rescaled-range (R/S) statistic for different block sizes
the Whittle estimator
Variance Analysis
The variance of the aggregated processes decays as
Var(X^(m)) = a m^(-β) as m → ∞,
whereas for short-range-dependent processes (e.g., a Poisson process),
Var(X^(m)) = a m^(-1) as m → ∞.
Plot Var(X^(m)) against m on a log-log plot: the slope gives the decay parameter, and a slope > -1 is indicative of self-similarity.
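The variance-time method above can be sketched directly: regress log Var(X^(m)) on log m; the slope estimates -β, and H = 1 - β/2. This is run on i.i.d. Gaussian noise (a short-range process), so the slope should come out near -1 and H near 0.5; the sample sizes and block sizes are illustrative choices.

```python
import math
import random

random.seed(0)

def aggregate(x, m):
    """Average x over non-overlapping blocks of size m."""
    n = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n)]

def variance(x):
    mu = sum(x) / len(x)
    return sum((v - mu) ** 2 for v in x) / len(x)

x = [random.gauss(0, 1) for _ in range(100_000)]
pts = [(math.log(m), math.log(variance(aggregate(x, m))))
       for m in (1, 2, 4, 8, 16, 32, 64)]

# least-squares slope of log Var(X^(m)) versus log m
mx = sum(a for a, _ in pts) / len(pts)
my = sum(b for _, b in pts) / len(pts)
slope = (sum((a - mx) * (b - my) for a, b in pts)
         / sum((a - mx) ** 2 for a, _ in pts))
H = 1 + slope / 2  # slope = -beta, and H = 1 - beta/2
print(round(slope, 2), round(H, 2))  # slope near -1, H near 0.5
```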
The Hurst Effect
For self-similar data, the rescaled-range (R/S) statistic grows according to c·n^H, where H is the Hurst parameter, H > 0.5.
For short-range processes, the R/S statistic grows as d·n^0.5.
History: the Nile river. In the 1940s-50s, Harold Edwin Hurst studied an 800-year record of flooding along the Nile river (yearly minimum water level) and found long-range dependence.
The R/S statistic
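The formula on this slide did not survive transcription. As a sketch, the standard rescaled-range statistic for a series X_1, …, X_n is R/S = (max_k W_k - min_k W_k) / S, where W_k is the k-th partial sum of the mean-centered series and S is the sample standard deviation:

```python
import math

def rescaled_range(x):
    """Standard R/S statistic; for self-similar data E[R/S] ~ c * n^H."""
    n = len(x)
    mean = sum(x) / n
    w, partial = [0.0], 0.0
    for v in x:                    # partial sums of the centered series
        partial += v - mean
        w.append(partial)
    r = max(w) - min(w)                                 # the range R
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # the scale S
    return r / s

print(rescaled_range([1, -1, 1, -1, 1, -1]))  # 1.0 for this alternating series
```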
Self-Similarity
X is exactly second-order self-similar with Hurst parameter H (= 1 - β/2) if, for all m,
Var(X^(m)) = σ² m^(-β), and
r^(m)(k) = r(k), for all k ≥ 0.
X is asymptotically second-order self-similar if the above holds asymptotically: r^(m)(k) → r(k) as m → ∞.
Properties of Self-Similarity
Var(X^(m)) (= σ² m^(-β)) decreases more slowly than m^(-1).
r(k) decreases hyperbolically (not exponentially), so that Σ_k r(k) = ∞ (long-range dependence).
The spectral density [the discrete-time Fourier transform of r(k)] obeys f(λ) ~ c λ^(-(1-β)) as λ → 0 (not bounded at the origin).
Modeling Self-Similarity
Fractional Gaussian noise (FGN):
a Gaussian process with mean μ, variance σ², and autocorrelation function r(k) = ½(|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), k > 0;
exactly second-order self-similar with 0.5 < H < 1.
Fractional ARIMA(p, d, q):
asymptotically second-order self-similar with H = d + 0.5, where 0 < d < 0.5.
Discrete-time M/G/∞ input model:
service time X has a heavy-tailed distribution (i.e., E[X] < ∞, σ² = ∞), e.g. a Pareto distribution with P(X > k) ~ k^(-α), 1 < α < 2;
N = {N_t, t = 1, 2, …} is self-similar with H = (3-α)/2, where N_t denotes the number of customers in service at time t.
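The M/G/∞ model above is easy to simulate. A minimal sketch, using inverse-CDF Pareto sampling and Knuth's Poisson sampler; the arrival rate, horizon, and helper names are our illustrative choices, not from the paper:

```python
import math
import random

random.seed(2)

def poisson(lam):
    """Knuth's Poisson sampler; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

def pareto_service(alpha, xm=1.0):
    """Inverse-CDF sample with P(X > k) = (xm / k)^alpha for k >= xm."""
    return xm / random.random() ** (1.0 / alpha)

def mg_infinity_counts(steps, rate=5.0, alpha=1.5):
    """N_t = number of sessions in service at step t.  With
    1 < alpha < 2 the count process is self-similar, H = (3 - alpha)/2."""
    departures, counts = [], []
    for t in range(steps):
        for _ in range(poisson(rate)):                 # new sessions arrive
            departures.append(t + pareto_service(alpha))
        departures = [d for d in departures if d > t]  # still in service
        counts.append(len(departures))
    return counts

counts = mg_infinity_counts(2000)
print(min(counts), max(counts))  # heavy-tailed holding times make N_t bursty
```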
Inference for Self-Similar Processes
Time-domain analysis based on the R/S statistic.
Variance analysis based on the aggregated process X^(m):
recall Var(X^(m)) = σ² m^(-β), so plot log(Var(X^(m))) against log m.
Frequency-domain analysis:
estimate the power spectral density f(λ) using the discrete-time Fourier transform;
recall f(λ) ~ c λ^(-(1-β)) as λ → 0, so plot log(f(λ)) against log λ.
Combining Whittle's MLE approach with the aggregation method provides confidence intervals.
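The frequency-domain approach can be sketched with a plain-DFT periodogram (no Whittle machinery; the series length and the 64-frequency cutoff are illustrative choices). The low-frequency log-log slope estimates -(1-β) = -(2H-1), so H = (1 - slope)/2; on i.i.d. noise the spectrum is flat and H should come out near 0.5:

```python
import cmath
import math
import random

random.seed(3)

def periodogram(x):
    """I(lambda_j) at the Fourier frequencies lambda_j = 2*pi*j/n."""
    n = len(x)
    return [(2 * math.pi * j / n,
             abs(sum(x[t] * cmath.exp(-2j * math.pi * j * t / n)
                     for t in range(n))) ** 2 / n)
            for j in range(1, n // 2)]

x = [random.gauss(0, 1) for _ in range(512)]
low = periodogram(x)[:64]  # keep only the lowest frequencies (lambda -> 0)
lx = [math.log(lam) for lam, _ in low]
ly = [math.log(p) for _, p in low]

# least-squares slope of log I(lambda) versus log lambda
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
         / sum((a - mx) ** 2 for a in lx))
H = (1 - slope) / 2  # from f(lambda) ~ c * lambda^(-(2H - 1))
print(round(H, 2))   # near 0.5 for this short-range input
```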
Section 2: Ethernet Traffic is Self-Similar
Data Collection
Ethernet traffic of a large lab at Bellcore
100 µs time resolution
Packet length, interface status, and header info recorded
Mostly IP and NFS traffic
Four data sets over a three-year period
Over 100 million packets in the traces
Plots Showing Self-Similarity (Ⅰ)
(Plots with reference lines at H = 1 and H = 0.5, not reproduced.)
Estimated H ≈ 0.8
Plots Showing Self-Similarity (Ⅱ)
High traffic: 5.0%-30.7%
Mid traffic: 3.4%-18.4%
Low traffic: 1.3%-10.4%
Higher traffic, higher H
H : A Function of Network Utilization
Observation (contrary to Poisson models): H increases with network utilization.
As we shall see shortly, H measures traffic burstiness.
As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother.
H : Measuring "Burstiness"
An intuitive explanation uses the M/G/∞ model: as α → 1, the service time becomes more variable, making it easier to generate bursts, and H increases.
Wrong ways to measure the "burstiness" of a self-similar process:
peak-to-mean ratio
coefficient of variation (of interarrival times)
Summary
Ethernet LAN traffic is statistically self-similar.
H : the degree of self-similarity
H : a function of utilization
H : a measure of "burstiness"
Models like Poisson are not able to capture self-similarity.
Discussions
How to explain self-similarity? Heavy-tailed file sizes.
How does this impact existing performance? Limited effectiveness of buffering; effectiveness of FEC.
How to adapt to self-similarity? Prediction; adaptive FEC.
Thanks!