Self-Similarity

Download Report

Transcript Self-Similarity

Self-Similarity
But First …
 Queuing Theory Examples

You know that the user tolerance for delay in displaying a web page is 8.3 seconds. You are
launching a new product and expect to have 1100requests/minute during the peak of your
offering. You have a cluster of web servers using DNS round robin load balancing. The
service time for each request averages .8 seconds with a standard deviation of .85 seconds.
You currently have a cluster of 15 web servers dedicated to your service.
•
•
•
•
•
•
•
1.What is the utilization of each server?
2.What is the mean response time for a request?
3.What is the standard deviation of the response time?
4.How much time does each request spend waiting in the queue?
5.How many requests are waiting in the queue at any time?
6.What is the 90th percentile of the response time (90% of the requests will be satisfied in less than
what time)?
7.How many servers would you need to have 90% of the requests satisfied in less than 8.3 seconds?
Example Performance Analysis

You are serving MP3 files to the web from your office in Los Angeles. You have a customer
in New York City that is complaining about his response time. He claims that the average
time for him to download a 1.5MByte MP3 file is 50 seconds. His connection to the internet
is a 512Kbps DSL line. Your connection to the internet is a 100Mbps ethernet line. The
speed of light is approximately 2*10^8m/sec and LA-NY = 2800 miles=4480km). The latency
through the protocol stack on each end is 100usec. The peak utilization of your web server
is 94% and your average service time is 1.2 seconds.
•
•
•
•
•
•
What is the total latency for this connection?
How long does it take for the file to be sent through the bottleneck bandwidth?
What is the response time of the server?
Why is the customer experiencing this long response time?
What one thing could you change that would cause the greatest improvement in download time?
What would the download time be with this improvement?
Queuing Theory Implications
 In queuing theory we assumed that if two
Poisson streams were merged, the
resulting stream would have a mean equal
to the sum of the two streams
 If the traffic is not Poisson, what happens
when we merge streams
 If it is self similar, the variance tends to
increase, causing buffer overflow
Self Similarity
 No natural length of a burst, at every time
scale, similar-looking traffic bursts are
evident.
 Aggregating streams of such traffic
intensifies the self-similarity instead of
smoothing it
 Aggregation causes more burstiness and
requires larger buffers
Cantor Set
656444656333333333656444656222222222222222222222222222656
444656333333333656444656
Data Comparison
7
6
5
4
Value
m=1
m=2
m=5
3
2
1
0
1 3
5
7
9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81
Sample
Data Comparison
7
6
5
4
Value
m=1
m=2
m=5
3
2
1
0
1
3
5 7
9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81
Sample
Variance
3.5
3
2.5
Variance
2
Canter
Random
1.5
1
0.5
0
1
2
3
Level of aggregation
4
5
Self Similar Intuition
 Structure repeats at all scales
 Queue size builds up more than would be
expected
 Aggregation of self-similar sources is not
smooth
 Just as Stochastic processes are invariant
to time, self-similar processes are
invariant to scale.
Self-Similarity
Formal Definition
 A stochastic process is self similar with
parameter H if:
for (0.5  H  1) a x(at) has the
same statistica l properties as x(t),
-H
The Mean, Variance and Autocorrel ation
are equal with a factor of a H , a 2H , a 2H
m-aggregated time series
(m)
 {x
x
(m)
k
km
1

xi

m i  km ( m 1)
x
( 3)
k
x
(m)
k
, k  0,1,2...}
x3k  2  x3k 1  x3k

3
Self-similar if autocorrelation remains constant for all m
Example
Xk(1)=65644465633333333365644465622222
22222222222222222222226564446563333
33333656444656
Xk(2)=65465333365464222222222222265465
333365464
Bellcore 1992 ethernet study
Bellcore Study
 4 years 1989-1992, IP Traffic NFS, email etc
Noticing Self-similarity
the variance of the sample mean decreases
more slowly than the reciprocal of the sample
size (slowly decaying variances), i.e., var(x (m)
) decreases more slowly than 1/m (1/mB), as
m , with 0 < B< 1;
 the autocorrelations decay hyperbolically
rather than exponentially fast, implying a nonsummable autocorrelation function
Sk r (k)
(long-rangedependence),

• Normally r (m) (k) 0, as m ( k 1, 2, . . . ). t
2N/K 1, . . .
Statistics
H=1
r=autocorrelation
H=.5 s=variance
Degree of aggregation
Estimate H=0.79
Variance decreases linearly on loglog plot indicating that the variance
is decreasing more slowly than the
reciprocal of m
Line with slope=-1 estimate
slope=-.4 estimate H=.8
Statistics
Periodgram plot, spectral
density centered around 0
Hurst parameter using three
previously mentioned methods
Statistics
Estimate from Pox diagram
Estimate from variance
Estimates of H for low,
medium, high use periods of
the day for 1989 August
Busy=15% (120 Hosts)
1989 October Busy=30%
after upgrade to RISC machines
(140 Hosts)
Statistics
1990 (1200 hosts) Link between
2 Labs at Bellcore and Internet
1992 More Routers, Firewall
(600 hosts outside of router)
Hurst Parameter
High traffic
Medium traffic
Low traffic
Number of packets
95%
Confidence
Interval
Number of bytes
Other Self-similar traffic
 WWW Traffic
 SS7 Traffic
 TCP, FTP, Telnet
 VBR Video
What about WWW traffic?
 Crovella and Bestavros analyze WWW logs
collected at clients over a 1.5 month period
• First WWW client study
• Instrumented MOSAIC
– ~600 students
– ~130K files transferred
– ~2.7GB data transferred
Self-Similar Aspects of Web traffic
 One difficulty in the analysis was finding
stationary, busy periods
• A number of candidate hours were found
 All four tests for self-similarity were employed
• 0.7 < H < 0.8
Mean Waiting Time
Poor Correlation
Self-similar Storage Model
Self-similar requires larger queues
Self-Similar Model
q

1
2 (1 H )
(1   )
H
1 H
For H=1/2, reduces to normal buffer
requirements
Joseph Effect
Joseph in Egypt, 7 years of plenty followed by 7
years of drought, self similar on Nile, difficult to
predict the size of storage tank necessary
Joseph Effect
 Although there were periodic fluxuations in
rainfall, in this case a minimum was
reflected at different time scales up to 7
years.
 What about the next scale up (2000 years)
• Can you apply the Hurst parameter to the
millenium? (1000 years of peace are coming)
• Could you model the amount of food storage to
prepare for?