Draft-geib-ippm-metrictest-01.
Incorporates the RFC2330 philosophy, draft-bradner, and draft-morton.
Inputs and ideas on which this draft is based:
 Draft-morton: compare a single implementation against the metric specification.
 Philosophy of RFC2330: IPPM metric implementations measuring simultaneously along an identical path should result in the same measurement. Validate this for a single implementation as well as for different compatible implementations. Apply the Anderson-Darling k-sample test with 95% confidence (see RFCs 2330 and 2679).
 To conform to a metric specification, publish the smallest resolution at which the Anderson-Darling k-sample test passed with 95% confidence.
 Document the chosen implementation options (and be aware of the possibly resulting limitations for a statistical test comparing different implementations).
 Draft-morton (and the IETF in general): improve IPPM metric specifications based on implementation experience before promoting them to standards.
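As a concrete illustration of the proposed check (my sketch, not part of the draft): a minimal Python example using scipy.stats.anderson_ksamp, SciPy's k-sample Anderson-Darling test. The delay samples are synthetic placeholders, and the pass criterion assumes the test's approximate p-value can be compared against 0.05 (i.e. 95% confidence).

```python
# Minimal sketch of the proposed ADK check; the delay samples below are
# synthetic placeholders, not measurement data from the draft.
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(42)
delays_a = 100.8 + 0.016 * rng.standard_normal(1000)  # flow A one-way delay [ms]
delays_b = 100.8 + 0.019 * rng.standard_normal(1000)  # flow B one-way delay [ms]

# anderson_ksamp returns (statistic, critical values, approximate p-value);
# SciPy caps the p-value to [0.001, 0.25] and warns when it does so.
stat, _, p_value = anderson_ksamp([delays_a, delays_b])
print(f"AD2 statistic: {stat:.3f}")
print("passed at 95% confidence:", p_value > 0.05)
```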
Geib / Morton / Hasslinger / Fardid, draft-geib-metrictest-01, Nov 2009
Draft-geib-ippm-metrictest-01.
Identical networking conditions for repeated measurements.
Metric implementations will be operated in real networks, so metric compliance should be tested under live network conditions too. Identical networking conditions for multiple flows can be reached by:
 Setting up a tunnel using IP/MPLS transport between two sites.
 Measuring simultaneously with five or more flows per implementation.
 Ensuring that the test setup doesn't interfere with the metric measurement.
Example: "repeating" measurements under identical network conditions with a single implementation by measuring with two parallel flows (see the figure and the simulation sketch below).
[Figure: two instances of metric implementation A measure across the Internet through a common tunnel between tunnel termination 1 and tunnel termination 2.]
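To make the parallel-flow idea tangible, a small simulation sketch (my own illustration, not from the draft): five flows sample one hypothetical delay distribution, standing in for five measurement flows sharing the tunnel; live measurements would replace the synthetic sampling, and the distribution parameters are assumptions.

```python
# Simulation stand-in for "5 or more flows per implementation" sharing a
# tunnel: all flows sample the same underlying delay distribution (assumed
# here purely for illustration), so the k-sample ADK test should pass.
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(7)
BASE_DELAY_MS = 100.0  # hypothetical propagation delay of the tunnel path

def measure_flow(n_samples: int) -> np.ndarray:
    """One measurement flow: base delay plus right-skewed queueing jitter."""
    return BASE_DELAY_MS + rng.gamma(shape=2.0, scale=0.01, size=n_samples)

flows = [measure_flow(1000) for _ in range(5)]
stat, _, p_value = anderson_ksamp(flows)
print("identical conditions, ADK passed:", p_value > 0.05)
```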
Geib/Morton/ Hasslinger/Fardid / draft-geib-metrictest-01
Nov 2009
2
Draft-geib-ippm-metrictest-01.
Some results with two instances of a single implementation.
Unless stated otherwise: two instances, partially sharing a path, same packet size and same queue. Resolution is 1 s. Data has been normalised to the same average value (ADK is sensitive to variations in averages too). AD2 (95%) test. For more details on the measurement setup see ietf_75_ippm_geib_metrictest.pdf.
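Since the ADK test reacts to a difference in averages, the data sets are shifted onto a common mean before testing. A sketch of that normalisation step (function and variable names are mine):

```python
import numpy as np

def normalise_to_common_mean(a: np.ndarray, b: np.ndarray):
    """Shift both sample sets onto their pooled mean, so the ADK test
    compares distribution shape rather than a difference in averages."""
    target = np.concatenate([a, b]).mean()
    return a - a.mean() + target, b - b.mean() + target
```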
 Single instance, different packet sizes, different queues, low load, normalised to the same mean.

Delay     Mean [ms]   Standard Dev. [ms]
Flow A    100.8       0.016
Flow B    100.8       0.019
AD2 test (A vs B): passed

 Two instances, same queue, same packet size, moderate load, not normalised.

Jitter    Mean [ms]   Standard Dev. [ms]
Path A    0.268       0.271
Path B    0.167       0.141
AD2 test (A vs B): passed
Draft-geib-ippm-metrictest-01.
More results (1).
 Two instances, same queue, same packet size, moderate load, not normalised, 32 samples only.

Pckt Loss   Mean [pckts]   Standard Dev. [pckts]
Path A      10.5           11.3
Path B      4.3            3.6
AD2 test (A vs B): passed
 Single instance, single queue, low load, results split into four contiguous sets of data ("repeated measurement with a single implementation"), not normalised.

Delay        Mean [ms]   Standard Dev. [ms]   ADK passed      AD2 failed
Interval A   100.40      0.186                with B, C       with D
Interval B   100.41      0.126                with A, C       with D
Interval C   100.44      0.136                with A, B, D
Interval D   100.44      0.115                with C          with A, B
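The interval comparison above can be reproduced mechanically: split one measurement series into four contiguous blocks and run the two-sample ADK test on every pair. A sketch (the delay series itself is a placeholder argument, not data from these slides):

```python
# Pairwise AD2 tests over contiguous intervals of one delay series.
import itertools
import numpy as np
from scipy.stats import anderson_ksamp

def pairwise_adk(series: np.ndarray, n_intervals: int = 4, alpha: float = 0.05):
    """Split `series` into contiguous intervals and AD2-test each pair."""
    intervals = np.array_split(series, n_intervals)
    labels = [chr(ord("A") + k) for k in range(n_intervals)]
    for (i, x), (j, y) in itertools.combinations(enumerate(intervals), 2):
        _, _, p_value = anderson_ksamp([x, y])
        verdict = "passed" if p_value > alpha else "failed"
        print(f"Interval {labels[i]} vs {labels[j]}: {verdict}")
```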
Draft-geib-ippm-metrictest-01.
More results (2).
 Two instances, same queue, same packet size, low load, data has been normalised.

Delay    Mean [ms]   Standard Dev. [ms]
Path A   100.9       22.9
Path B   100.9       22.6
AD2 test (A vs B): failed
ADK test passed after limiting temporal resolution to 25 s.
 Single instance, different packet sizes, different queues, low load, normalised to the same mean.

Delay    Mean [ms]   Standard Dev. [ms]
Flow A   100.9       0.0229
Flow B   100.9       0.0226
AD2 test (A vs B): failed
ADK test passed after limiting temporal resolution to 150 s.
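A sketch of one reading of "limiting the temporal resolution" (whether the slides quantise timestamps or, as assumed here, average the per-second samples over coarser windows is my interpretation): coarsen both series step by step and report the smallest resolution at which the AD2 test passes, which is the value the draft proposes to publish.

```python
# Coarsen two per-second delay series to windows of w seconds and report the
# smallest window at which the AD2 test passes with 95% confidence. That the
# slides aggregate this way is an assumption; they may quantise differently.
import numpy as np
from scipy.stats import anderson_ksamp

def coarsen(series: np.ndarray, window: int) -> np.ndarray:
    """Average non-overlapping windows of `window` consecutive samples
    (one sample per second -> a temporal resolution of `window` seconds)."""
    usable = len(series) - len(series) % window
    return series[:usable].reshape(-1, window).mean(axis=1)

def smallest_passing_resolution(a, b, windows=(1, 5, 25, 50, 150)):
    for w in windows:
        _, _, p_value = anderson_ksamp([coarsen(a, w), coarsen(b, w)])
        if p_value > 0.05:
            return w  # e.g. 25 or 150, as on these slides
    return None
```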
Draft-geib-ippm-metrictest-01.
Next steps.
 If the concept is comprehensible and makes sense, is there support to go on?
 If "yes": complete the draft by adding more of Al's ideas, add figures, and so on.
 Get a review by the design team (name volunteers).
 Improve the draft again, resubmit, and suggest it as a WG draft.
 If the answer is "no": read the draft and suggest changes.
Backup
Draft-geib-ippm-metrictest-01.
Prior work: RFC2330 repeatability (precision).
RFC2330: "A methodology for a metric should [be] repeatable: if the methodology is used multiple times under identical conditions, the same measurements should result."
Draft-geib: this demands high precision.
[Figure: target diagrams contrasting high precision with low accuracy and high accuracy with low precision. Source: Wikipedia.]
By measuring a metric multiple times, samples are drawn from the underlying (and unknown) distribution of networking conditions.
Geib/Morton/ Hasslinger/Fardid / draft-geib-metrictest-01
Nov 2009
8
Draft-geib-ippm-metrictest-01.
Prior work: RFC2330/2679 ADK k-sample test (95% confidence).
RFC2330: "A methodology for a given metric exhibits continuity if, for small variations in conditions, it results in small variations in the resulting measurements."
Using a different metric implementation under otherwise identical (network) conditions should be a "small variation".
The sample distribution of metric implementation A is taken as the "given" distribution against which the sample distribution of metric implementation B is compared by a goodness-of-fit test (proposal: Anderson-Darling k-sample test).
RFC2330 provides guidelines on testing goodness of fit for calibration (quotes):
 For summarizing measurements, the "EDF" is preferred over histograms.
 IPPM goodness-of-fit tests are done using 5% significance (see also RFC2679).
 …recommends the Anderson-Darling EDF test.
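For reference, the empirical distribution function that RFC2330 prefers over histograms is simple to compute; a minimal sketch (function name is mine):

```python
import numpy as np

def edf(samples: np.ndarray):
    """Empirical distribution function: for each sorted sample x_(i), the
    fraction of samples <= x_(i). Plot y against x for the EDF curve."""
    x = np.sort(samples)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y
```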