Modeling Channel Conflict Probabilities between IEEE 802

Download Report

Transcript Modeling Channel Conflict Probabilities between IEEE 802

A Machine Learning-based
Approach for Estimating
Available Bandwidth
Ling-Jyh Chen1, Cheng-Fu Chou2 and Bo-Chun Wang2
1Academia Sinica
2National Taiwan University
Definition


Link Capacity: maximum IP-layer throughput that
a flow can get, without any cross traffic.
Available Bandwidth: maximum IP-layer
throughput that a flow can get, given (stationary)
cross traffic.
Related Work

Statistical cross-traffic models:



Self-induced congestion models:



Measure the time interval between the arrival of any two
successive probe packets at the receiver and use the
dispersion measurements to estimate the available
bandwidth.
E.g., Delphi, IGI, Spruce
Based on the intuition that if the probing rate is lower than
the available bandwidth, the probe packets will not
experience additional queueing delay during transmission.
E.g., TOPP, Pathload, pathChirp
However, these approaches are either inaccurate or
intrusive.
Our Contribution



We propose a machine learning-based
approach for accurate available bandwidth
estimation.
The proposed approach can estimate available
bandwidth even if there are no sample with
similar properties to the measured path in
the training dataset.
Using a set of simulations, we show the
proposed approach is fast, accurate and nonintrusive.
Basic Ideas

The fact:


Due to the diversity and dynamics of the Internet,
collecting and verifying the correctness of data of
such a large-scale network is hard.
Our ideas:



Create a representative network in the network
simulator using realistic network traces with wellestablished network traffic models.
Probe the network using effective probing model
and collect training data for the machine learning
tool.
Estimate the available bandwidth using the welltrained machine learning tool.
System Settings

Network Scenarios

Topology: Tiscali topology of
Rocketfuel’s trace [13]


750 links and 506 nodes (221 are end-users)
We assume




Propagation delay: uniformly distributed between
[10,20] ms
Buffer size of each link is 20
50% of end-users are ADSL (3M/1Mbps), and the
remainder are academic networks (100Mbps)
Core router links are 1Gbps
System Settings

Network Scenarios

Traffic: based on measurement results
[7] T. Karagiannis, K. Papagiannaki, and M. Faloutsos. Blinc: multilevel traffic classification
in the dark. In ACM SIGCOMM, 2005.
System Settings

Network Scenarios

Traffic: based on measurement results
[12] M. Roughan, S. Sen, O. Spatscheck, and N. Duffield. Class-of-service mapping for qos:
a statistical signature-based approach to ip traffic classification. In ACM IMC, 2004.
System Settings

Probing Models

Packet Train model



Send k packets in a burst (back-to-back)
k-1 dispersions observed
k = 11
System Settings

Probing Models

pathChirp-like model



Send a chirp of fifteen packets each time
The lowest sending rate is five percent of
the bottleneck capacity
Spread factor γ=1.2
System Settings

Machine Learning Tools



Unsupervised learning: EM, K-means clustering
Supervised learning: k-NN, SVM
We use SVM in this study, because



It can handle missing data caused by packet loss.
It can interpolate/extrapolate the system output.
The computation overhead is affordable.
Performance Evaluation




Packet Train model
pathChirp-like model
Comparison with other tools
Scale-Free approach
Evaluation: Packet Train Model

Each sample is the probing results of a randomly
selected node pair.

16,000 samples as the training data.

1,500 samples as the test data.


Each sample of the training data is comprised of
13 properties: 10 dispersions, hop count,
bottleneck capacity, and tightest link’s available
bandwidth.
Each sample of the test data contains all above
information, except the available bandwidth.
Evaluation: Packet Train Model

The results are divided into three groups
based on their bottleneck link capacity.
Evaluation: pathChirp-like Model

Each chirp consists of 15 packets with a spread
factor γ=1.2.

16,000 samples as the training data.

1,500 samples as the test data.


Each sample of the training data is comprised of
17 properties: 14 dispersions, hop count,
bottleneck capacity, and tightest link’s available
bandwidth.
Each sample of the test data contains all above
information, except the available bandwidth.
Evaluation: pathChirp-like Model

The results are divided into two groups
based on their bottleneck link capacity.
Comparing with Other Tools


Compare the proposed approach using
the pathChirp-like model with
pathChirp and Spruce.
Run 1,500 tests for both pathChirp
and Spruce in the same network
scenario.
Comparing with Other Tools

The results are divided into two groups
based on their bottleneck link capacity.
Scale-Free Approach



The proposed approach collects training
data from a very limited network scenario.
The cost of building a database covering all
types of Internet scenario is prohibitively
expensive.
We propose a Scale-Free approach to
normalize all properties in our system.


The dispersion measurements are divided b the
initial inter-packet gap.
The observed available bandwidth is replaced
with the utilization of the bottleneck link.
Scale-Free Approach


Two scenarios (6 Mbps and 50Mbps of the
bottleneck link capacity)
1,500 samples for each case
Conclusion



We propose a machine learning-based
approach for estimating the available
bandwidth of a network path.
We show that the pathChirp-like model
outperforms the packet train model in all
test cases.
By normalizing all attributes, we show this
novel approach is able to accurately
estimate available bandwidth, even if there
are no sample with similar properties to
the measured path in the training dataset.
Thanks!
http://www.iis.sinica.edu.tw/~cclljj/
http://nrl.iis.sinica.edu.tw/