End-2-End Network Monitoring
What do we do? What do we use it for?
Richard Hughes-Jones, Manchester
Many people are involved.
GNEW2004, CERN, March 2004
DataGRID WP7: Network Monitoring Architecture for Grid Sites
[Architecture diagram]
Monitoring tools at each site: GridFTP, PingER (RIPE TTB), iperf, UDPmon, rTPL, NWS, etc.
Local network monitoring feeds a store & analysis layer (data access).
A monitor process pushes metrics; a backend LDAP script fetches metrics into the local LDAP server.
Grid applications access, via the LDAP schema:
- monitoring metrics;
- location of monitoring data.
Access to current and historic data and metrics via the Web (WP7 NM pages), including access to metric forecasts.
Robin Tasker
WP7 Network Monitoring Components
[Component diagram]
Clients: Web display (plots, tables, predictions) and Grid brokers, both served through an LDAP web interface.
Analysis layer: raw measurements processed into LDAP tables and plots.
Scheduler: cron scripts control each measurement tool.
Tools: Ping, Netmon, UDPmon, iperf, RIPE.
WP7 MapCentre: Grid Monitoring & Visualisation
The Grid network monitoring architecture uses LDAP & R-GMA (DataGrid WP7).
A central MySQL archive hosts all network metrics and GridFTP logging.
The Probe Coordination Protocol is deployed for scheduling tests.
MapCentre also provides site & node fabric health checks.
Franck Bonnassieux, CNRS Lyon
WP7 MapCentre: Grid Monitoring & Visualisation
[Example MapCentre time-series plots: CERN–RAL UDP, CERN–IN2P3 UDP, CERN–RAL TCP, CERN–IN2P3 TCP]
UK e-Science: Network Monitoring
Technology transfer: the DataGrid WP7 architecture (Manchester) was adopted for UK e-Science monitoring (Daresbury Laboratory).
[Diagram: DataGrid WP7 M/c and UK e-Science DL deployments sharing the WP7 architecture]
UK e-Science: Network Problem Solving
24 Jan to 4 Feb 04, TCP iperf RAL to HEP sites:
Only 2 sites achieved more than 80 Mbit/s
RAL -> DL 250-300 Mbit/s
24 Jan to 4 Feb 04, TCP iperf DL to HEP sites:
DL -> RAL ~80 Mbit/s
Tools: UDPmon – Latency & Throughput
UDP/IP packets are sent between end systems.
Latency
Round-trip times measured using request-response UDP frames.
Latency plotted as a function of frame size:
Slope s given by s = Σ over the data paths of 1/(db/dt), i.e. the sum of the inverse data rates along the path: mem-mem copy(s) + PCI + Gigabit Ethernet + PCI + mem-mem copy(s).
The intercept indicates processing times + hardware latencies.
Histograms of 'singleton' measurements.
UDP Throughput
Send a controlled stream of UDP frames (n bytes) spaced at regular wait-time intervals.
Vary the frame size and the frame transmit spacing & measure (a sketch of the analysis follows):
The time of first and last frames received
The number of packets received, lost, & out of order
Histogram of the inter-packet spacing of received packets
Packet loss pattern
1-way delay
CPU load
Number of interrupts
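The analysis behind these latency and throughput plots boils down to a straight-line fit and a rate calculation. A minimal sketch in Python, illustrative only and not the UDPmon code itself; the function names and the example numbers are invented:

```python
# Minimal sketch of the UDPmon-style analysis (not the actual UDPmon code).
# Assumes per-frame measurements are already available; names are illustrative.
import numpy as np

def latency_fit(frame_sizes_bytes, rtt_us):
    """Fit round-trip latency vs frame size: the slope ~ sum of inverse data
    rates along the path, the intercept ~ processing + hardware latencies."""
    slope, intercept = np.polyfit(frame_sizes_bytes, rtt_us, 1)
    return slope, intercept              # us/byte, us

def udp_throughput(n_received, frame_bytes, t_first_us, t_last_us):
    """Receive-side user-data throughput from the times of the first and
    last frames received, as used for the throughput plots."""
    elapsed_s = (t_last_us - t_first_us) * 1e-6
    return n_received * frame_bytes * 8 / elapsed_s / 1e6   # Mbit/s

# Example with made-up numbers: 10000 frames of 1472 bytes received in 0.12 s
print(udp_throughput(10000, 1472, 0.0, 120000.0))   # ~981 Mbit/s of user data
```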
UDPmon: Example 1 Gigabit NIC, Intel PRO/1000 XT
Motherboard: Supermicro P4DP6; chipset: E7500 (Plumas)
CPU: dual Xeon 2.2 GHz with 512 kB L2 cache; memory bus 400 MHz
PCI-X 64 bit 66 MHz; HP Linux kernel 2.4.19 SMP; MTU 1500 bytes
[Throughput plot, gig6-7 Intel PCI 66 MHz 27 Nov 02: received wire rate (Mbit/s) vs transmit spacing per frame (us) for frame sizes 50 to 1472 bytes]
[Latency plots: latency (us) vs message length (bytes), fits y = 0.0093x + 194.67 and y = 0.0149x + 201.75]
[Latency histograms N(t) for 64, 512, 1024 and 1400 byte frames]
[PCI bus activity traces: send transfer and receive transfer]
Tools: Trace-Rate – Hop-by-hop Measurements
A method to measure the hop-by-hop capacity, delay, and loss up to the path bottleneck.
Not intrusive; operates in a high-performance environment.
Does not need the cooperation of the destination.
Based on the packet-pair method (see the sketch below):
Send sets of back-to-back packets with increasing time-to-live
For each set, filter "noise" from the RTTs
Calculate the packet-pair spacing, and hence the bottleneck bandwidth
Robust in the presence of invisible nodes.
[Figure: effect of the bottleneck on a packet pair, where L is the packet size and C is the capacity; parameters are iteratively analysed to extract the capacity mode]
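The core of the packet-pair idea is that two back-to-back packets of size L leave a bottleneck of capacity C spaced by L/C. A minimal sketch, not the Trace-Rate implementation; the median here stands in for its iterative mode extraction and the example spacings are invented:

```python
# Illustrative packet-pair estimate of bottleneck capacity, after the idea
# used by Trace-Rate: not the authors' implementation, just the core relation.
import statistics

def bottleneck_capacity(packet_size_bytes, pair_spacings_s):
    """Two back-to-back packets of size L leave a bottleneck of capacity C
    separated by L/C, so C ~ L / spacing. Use a robust central value of the
    filtered spacings to reject 'noise' from cross traffic and queueing."""
    spacing = statistics.median(pair_spacings_s)   # crude stand-in for the
                                                   # iterative mode extraction
    return packet_size_bytes * 8 / spacing         # bit/s

# Example: 1472-byte pairs observed ~12 us apart -> ~0.98 Gbit/s bottleneck
print(bottleneck_capacity(1472, [12.1e-6, 11.9e-6, 12.0e-6, 30.5e-6]) / 1e9)
```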
Tools: Trace-Rate – Some Results
Capacity measurements as a function of load (Mbit/s) from tests on the DataTAG link.
Comparison of the number of packets required.
Validated by simulations in ns-2.
Linux implementation, working in a high-performance environment.
Research report: http://www.inria.fr/rrrt/rr-4959.html
Research paper: ICC 2004, International Conference on Communications, Paris, France, June 2004. IEEE Communications Society.
Network Monitoring as a Tool to Study:
Protocol behaviour
Network performance
Application performance
Tools include:
web100
tcpdump
Output from the test tools: UDPmon, iperf, ...
Output from the applications: GridFTP, bbcp, Apache
Protocol Performance: RUDP
Hans Blom: monitoring from the data-moving application & the network test program (DataTAG WP3 work).
Test setup:
Path: Ams-Chi-Ams, Force10 loopback
Moving data from the DAS-2 cluster with RUDP, a UDP-based transport
Apply 11*11 TCP background streams from iperf
Conclusions:
RUDP performs well
It does back off and share bandwidth
It rapidly expands when bandwidth is free
Performance of the GÉANT Core Network
Test setup:
Supermicro PCs in the London & Amsterdam GÉANT PoPs
Smartbits in the London & Frankfurt GÉANT PoPs
Long link: UK-SE-DE2-IT-CH-FR-BE-NL
Short link: UK-FR-BE-NL
Studies:
Network quality of service: LBE, IP Premium
High-throughput transfers: standard and advanced TCP stacks
Packet re-ordering effects
Jitter for IPP and BE flows under load (a sketch of the jitter calculation follows)
[Jitter histograms: frequency vs packet jitter (0-150 us) for a BE flow with 60% BE (1.4 Gbit) + 40% LBE (780 Mbit) background, an IPP flow with the same background, and an IPP flow with no background]
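The jitter histograms above come from comparing received inter-packet gaps with the nominal transmit spacing. A small illustrative sketch with an assumed method and made-up timestamps, not the measurement code used for these tests:

```python
# Sketch of how per-packet jitter can be histogrammed from receive timestamps
# (illustrative only; the GEANT tests used dedicated measurement kit).
import numpy as np

def jitter_us(recv_times_us, send_spacing_us):
    """Jitter as the deviation of each received inter-packet gap from the
    nominal transmit spacing."""
    gaps = np.diff(recv_times_us)
    return gaps - send_spacing_us

recv = np.array([0.0, 101.0, 199.0, 305.0, 400.0])    # made-up timestamps
counts, edges = np.histogram(jitter_us(recv, 100.0), bins=np.arange(-50, 55, 5))
print(dict(zip(edges[:-1], counts)))
```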
Tests on the GÉANT Core: Packet Re-ordering
Effect of LBE background, Amsterdam-London:
BE test flow: 1472-byte UDP packets at 10 us spacing (line speed), 10,000 sent
Packet loss ~0.1%
A sketch of how out-of-order statistics are derived follows.
[Plot, NL-UK 7 Nov 03: % out of order vs total offered rate (2 to 3.2 Gbit/s), with markers for hstcp, standard TCP, line speed and 90% line speed]
[Re-order distributions, uk-nl 21 Oct 03, 10,000 sent, 10 us wait: number of packets vs length out-of-order (1-9) for 1472-byte and 1400-byte frames, with LBE backgrounds of 0-80%]
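Out-of-order percentages and re-order lengths of this kind can be derived from the received sequence numbers. An illustrative sketch, not the code used to produce these plots:

```python
# Sketch of deriving out-of-order statistics from received sequence numbers.
def reorder_stats(seq_received):
    """Count packets arriving with a smaller sequence number than one already
    seen, and how far back they land ('length out-of-order')."""
    highest = -1
    lengths = []
    for s in seq_received:
        if s < highest:
            lengths.append(highest - s)     # displacement of the late packet
        else:
            highest = s
    pct = 100.0 * len(lengths) / len(seq_received)
    return pct, lengths

print(reorder_stats([0, 1, 2, 4, 3, 5, 7, 8, 6, 9]))   # (20.0, [1, 2])
```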
Application Throughput + Web100 (MB-NG)
2 Gbyte file transferred from RAID0 disks; Web100 output every 10 ms.
GridFTP: throughput alternates between 600-800 Mbit/s and zero.
Apache web server + curl-based client: steady 720 Mbit/s.
VLBI Project: Throughput, Jitter, 1-way Delay, Loss
1472-byte packets, Manchester -> Dwingeloo (JIVE).
[Throughput plot: received wire rate (Mbit/s) vs spacing between frames (us) for Gnt5-DwMk5 (11 Nov 03) and DwMk5-Gnt5 (13 Nov 03)]
[Jitter histogram, 1472 bytes w=50, Gnt5-DwMk5 28 Oct 03: FWHM 22 us (back-to-back 3 us)]
[1-way delay (us) vs packet number: note the packet loss, i.e. the points with zero 1-way delay]
[Packet loss distribution, 12 us bins: number in bin vs time between lost frames (us), measured vs Poisson; probability density function P(t) = λ e^(-λt), mean λ = 2360 /s (426 us)]
Passive Monitoring
Time-series data from routers and switches:
Immediate, but usually historical (MRTG)
Usually derived from SNMP (a sketch of the rate derivation follows)
Spots mis-configured / infected / misbehaving end systems (or users?)
Note data protection laws & confidentiality
Site, MAN and backbone topology & load:
Helps the user/sysadmin isolate a problem, e.g. a low TCP transfer rate
Essential for proof-of-concept tests or protocol testing
Trends used for capacity planning
Control of P2P traffic
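MRTG-style utilisation figures are simple arithmetic on successive SNMP interface counters. A sketch of that derivation; the polling itself, via an SNMP library or the net-snmp tools, is not shown, and the counter values are invented:

```python
# How MRTG-style utilisation is derived from two SNMP interface counter
# readings (ifInOctets/ifOutOctets). Pure arithmetic sketch; the SNMP polling
# itself is not shown here.
def link_utilisation(octets_t0, octets_t1, interval_s, link_capacity_bps,
                     counter_bits=64):
    """Rate and utilisation from two counter samples, handling counter wrap."""
    delta = (octets_t1 - octets_t0) % (2 ** counter_bits)
    rate_bps = delta * 8 / interval_s
    return rate_bps, rate_bps / link_capacity_bps

# Example: 5-minute poll on a 1 Gbit/s access link (made-up counter values)
rate, util = link_utilisation(10_000_000_000, 28_750_000_000, 300, 1e9)
print(f"{rate/1e6:.0f} Mbit/s, {util:.0%} utilised")   # 500 Mbit/s, 50%
```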
Users: The Campus & the MAN [1]
NNW to SJ4 access: 2.5 Gbit/s PoS; hits 1 Gbit/s, 50%
Manchester to NNW access: 2 * 1 Gbit Ethernet
Pete White, Pat Myers
Users: The Campus & the MAN [2]
[Traffic plots, 24-31 Jan 2004: ULCC-JANET traffic in/out from SJ4 (up to ~800-900 Mbit/s); LMN to site 1 and LMN to site 2, each on a 1 Gbit Ethernet access (in/out, up to ~250-350 Mbit/s)]
Message:
Not a complaint
Continue to work with your network group
Understand the traffic levels
Understand the network topology
VLBI Traffic Flows
Only testing so far – it could be worse!
Manchester – NetNorthWest – SuperJANET access links: two at 1 Gbit/s
Access links: SJ4 to GÉANT, GÉANT to SurfNet
GGF: Hierarchy Characteristics Document
Network Measurement Working Group: "A Hierarchy of Network Performance Characteristics for Grid Applications and Services"
The document defines terms & relations: network characteristics, measurement methodologies, observations; it discusses nodes & paths.
For each characteristic it:
Defines the meaning
Lists attributes that SHOULD be included
Discusses issues to consider when making an observation
[Hierarchy diagram (Characteristic - Discipline - Observation): bandwidth/capacity (capacity, utilized, available, achievable); queue length; delay (forwarding, round-trip, one-way; forwarding policy, forwarding table, forwarding weight); loss (round-trip, one-way, loss pattern); jitter; availability (MTBF, availability pattern); hoplist; closeness; others]
Status:
Originally submitted to GFSG as a Community Practice document: draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003
Revised to Proposed Recommendation: http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-02.pdf, 7 Jan 04
Now in 60-day public comment from 28 Jan 04 – 18 days to go
GGF: Schemata for Network Measurements
Request schema: ask for results / ask to make a test (an illustrative request is sketched below).
Schema requirements document produced:
XML test request sent to a network monitoring service; XML test results returned
Use DAMED-style names, e.g. path.delay.oneWay
Send: characteristic, time, subject (node | path), methodology, statistics
Response schema: interpret results; includes the observation environment.
Much work in progress:
Common components – skeleton request and publication schemas each include items (e.g. src & dest, methodology) drawn from a pool of common components
Drafts almost done
2 (3) proof-of-concept implementations:
2 implementations using XML-RPC, by Internet2 and SLAC
Implementation in progress using Document/Literal, by DL & UCL
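To make the request schema concrete, here is an illustrative XML request built in Python. The element and attribute names and the hosts are invented for this sketch and are not the working group's actual schema:

```python
# Illustrative only: a request in the spirit of the NMWG request schema,
# using a DAMED-style characteristic name (path.delay.oneWay). The element
# and attribute names here are invented and are NOT the actual schema.
import xml.etree.ElementTree as ET

req = ET.Element("networkMeasurementRequest")
ET.SubElement(req, "characteristic").text = "path.delay.oneWay"
subject = ET.SubElement(req, "subject", type="path")
ET.SubElement(subject, "source").text = "gw.site-a.example"        # hypothetical hosts
ET.SubElement(subject, "destination").text = "gw.site-b.example"
ET.SubElement(req, "time", start="2004-03-15T12:00:00Z", end="2004-03-15T13:00:00Z")
ET.SubElement(req, "methodology").text = "UDPmon"
ET.SubElement(req, "statistic").text = "mean"

print(ET.tostring(req, encoding="unicode"))
```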
So What Do We Use Monitoring For? A Summary
Detect or cross-check problem reports
Isolate / determine a performance issue
Capacity planning
Publication of data: network "cost" for middleware
- Resource brokers for optimised matchmaking
- WP2 Replica Manager
End-to-end time series: throughput UDP/TCP, RTT, packet loss
Passive monitoring: routers & switches, SNMP, MRTG
- Historical MRTG data
- Capacity planning
- SLA verification
Isolate / determine a throughput bottleneck – work with real user problems
Test conditions for protocol / hardware investigations
Packet/protocol dynamics
- Protocol performance / development
- Hardware performance / development
Application analysis: output from application tools, tcpdump, web100
- Input to middleware, e.g. GridFTP throughput
- Isolate / determine a (user) performance issue
- Hardware / protocol investigations
More Information – Some URLs
DataGrid WP7 MapCenter: http://ccwp7.in2p3.fr/wp7archive/ & http://mapcenter.in2p3.fr/datagrid-rgma/
UK e-Science monitoring: http://gridmon.dl.ac.uk/gridmon/
MB-NG project web site: http://www.mb-ng.net/
DataTAG project web site: http://www.datatag.org/
UDPmon / TCPmon kit + write-up: http://www.hep.man.ac.uk/~rich/net
Motherboard and NIC tests: http://www.hep.man.ac.uk/~rich/net
IEPM-BW site: http://www-iepm.slac.stanford.edu/bw
Network Monitoring to Grid Sites
Network Tools Developed
Using Network Monitoring as a Study Tool
Applications & Network Monitoring – real users
Passive Monitoring
Standards – Links to GGF
Data Flow: SuperMicro 370DLE, SysKonnect NIC
Motherboard: SuperMicro 370DLE; chipset: ServerWorks III LE
CPU: PIII 800 MHz; PCI: 64 bit 66 MHz; RedHat 7.1, kernel 2.4.14
[PCI logic-analyser traces: send CSR setup, send transfer, send PCI, receive PCI, packet on the Ethernet fibre, receive transfer (~36 us)]
1400 bytes sent, wait 100 us
~8 us for send or receive
Stack & application overhead ~10 us per node
10 Gigabit Ethernet: Throughput
A 1500-byte MTU gives ~2 Gbit/s; a 16114-byte MTU was used (max user length 16080 bytes). How "wire rate" is derived from the user data is sketched below.
DataTAG Supermicro PCs (dual 2.2 GHz Xeon, FSB 400 MHz, PCI-X mmrbc 512 bytes): wire-rate throughput of 2.9 Gbit/s.
SLAC Dell PCs (dual 3.0 GHz Xeon, FSB 533 MHz, PCI-X mmrbc 4096 bytes): wire rate of 5.4 Gbit/s.
CERN OpenLab HP Itanium PCs (dual 1.0 GHz 64-bit Itanium, FSB 400 MHz, PCI-X mmrbc 4096 bytes): wire rate of 5.7 Gbit/s.
[Plot, an-al 10GE Xsum 512kbuf MTU16114 27Oct03: received wire rate (Mbit/s) vs spacing between frames (us) for packet sizes 1472 to 16080 bytes]
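"Wire rate" here counts all on-the-wire bytes, not just user data. A sketch of that conversion, assuming standard Ethernet preamble, header, CRC and inter-frame-gap sizes plus IP/UDP headers; whether the plots used exactly these constants is not stated on the slide:

```python
# Sketch: converting a user-data rate into an on-the-wire rate by adding
# per-frame Ethernet/IP/UDP overheads (assumed standard values).
ETH_OVERHEAD = 8 + 14 + 4 + 12      # preamble, MAC header, CRC, inter-frame gap
IP_UDP_HEADER = 20 + 8

def wire_rate_mbps(user_bytes_per_frame, frames_per_second):
    on_wire = user_bytes_per_frame + IP_UDP_HEADER + ETH_OVERHEAD
    return on_wire * 8 * frames_per_second / 1e6

# e.g. 16080-byte frames every 22 us
print(wire_rate_mbps(16080, 1 / 22e-6))   # ~5870 Mbit/s (~5.9 Gbit/s) on the wire
```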
Tuning PCI-X: Variation of mmrbc, IA32
16080-byte packets every 200 us; Intel PRO/10GbE LR adapter.
PCI-X bus occupancy vs mmrbc (512, 1024, 2048, 4096 bytes): each PCI-X sequence shows CSR access, data transfer, and interrupt & CSR update.
Plotted: measured PCI-X transfer times, times based on PCI-X timing from the logic analyser, the expected throughput, and the maximum PCI-X throughput, as a function of the maximum memory read byte count (mmrbc). A toy model of the mmrbc effect follows.
[Plot: PCI-X transfer time (us) and transfer rate (Gbit/s) vs mmrbc, 0-5000 bytes]
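Why a larger mmrbc helps can be seen with a toy model: each PCI-X burst carries at most mmrbc bytes and pays a roughly fixed per-burst overhead, so bigger bursts waste a smaller fraction of the bus. The per-burst overhead value below is an assumption for illustration, not a number from the measurements:

```python
# Toy model (assumed numbers, not from the slides) of why a larger mmrbc
# raises the effective PCI-X transfer rate.
PCIX_RATE_GBPS = 8.5          # PCI-X maximum quoted on the summary slide
BURST_OVERHEAD_US = 0.25      # assumed per-burst setup/turnaround cost

def transfer_time_us(bytes_total, mmrbc):
    bursts = -(-bytes_total // mmrbc)                  # ceiling division
    data_us = bytes_total * 8 / (PCIX_RATE_GBPS * 1e3) # pure data time
    return data_us + bursts * BURST_OVERHEAD_US

for mmrbc in (512, 1024, 2048, 4096):
    t = transfer_time_us(16080, mmrbc)
    print(mmrbc, round(t, 1), "us ->", round(16080 * 8 / t / 1e3, 2), "Gbit/s")
```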
10 Gigabit Ethernet at the SC2003 Bandwidth Challenge
Three server systems with 10 Gigabit Ethernet NICs.
Used the DataTAG altAIMD stack, 9000-byte MTU.
Sent memory-to-memory iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
Palo Alto PAIX: rtt 17 ms, window 30 MB; shared with the Caltech booth; 4.37 Gbit/s HS-TCP (I=5%), then 2.87 Gbit/s (I=16%) – the fall corresponds to 10 Gbit/s on the link; 3.3 Gbit/s Scalable TCP (I=8%); tested 2 flows, sum 1.9 Gbit/s (I=39%)
Chicago Starlight: rtt 65 ms, window 60 MB; Phoenix CPU 2.2 GHz; 3.1 Gbit/s HS-TCP (I=1.6%)
Amsterdam SARA: rtt 175 ms, window 200 MB; Phoenix CPU 2.2 GHz; 4.35 Gbit/s HS-TCP (I=6.9%)
Very stable; both the Chicago and Amsterdam flows used Abilene to Chicago.
[Plots, 19 Nov 03 15:59-17:25: throughput (Gbit/s) vs time for Phoenix-PAIX (HS-TCP and Scalable TCP) with router traffic to LA/PAIX, and for Phoenix-Chicago and Phoenix-Amsterdam with router traffic to Abilene]
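The window sizes quoted above are essentially bandwidth-delay products. A quick check of that arithmetic against the slide's values:

```python
# TCP window needed to fill a pipe = bandwidth * round-trip time.
def window_mbytes(target_gbps, rtt_ms):
    return target_gbps * 1e9 * (rtt_ms * 1e-3) / 8 / 2**20

for name, rtt, window in (("PAIX", 17, 30), ("Chicago", 65, 60), ("Amsterdam", 175, 200)):
    print(f"{name}: {window_mbytes(10, rtt):.0f} MB needed for 10 Gbit/s "
          f"(slide used {window} MB)")
```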
Summary & Conclusions
The Intel PRO/10GbE LR adapter and driver gave stable throughput and worked well.
A large MTU is needed (9000 or 16114 bytes) – 1500 bytes gives ~2 Gbit/s.
PCI-X tuning: mmrbc = 4096 bytes increases throughput by 55% (3.2 to 5.7 Gbit/s).
PCI-X sequences are clear on transmit; gaps ~950 ns.
Transfers: transmission (22 us) takes longer than receiving (18 us).
Tx rate 5.85 Gbit/s, Rx rate 7.0 Gbit/s (Itanium); PCI-X max 8.5 Gbit/s.
CPU load is considerable: 60% Xeon, 40% Itanium.
The bandwidth of the memory system is important – the data crosses it 3 times!
Sensitive to OS / driver updates.
More study needed.
PCI Activity: Read Multiple Data Blocks, 0 Wait
Read 999424 bytes; each data block involves setting up the CSRs, the data movement, and updating the CSRs.
For 0 wait between reads: data blocks ~600 us long take ~6 ms, followed by a 744 us gap.
PCI transfer rate 1188 Mbit/s (148.5 Mbyte/s); read_sstor rate 778 Mbit/s (97 Mbyte/s); data block 131,072 bytes.
PCI bus occupancy: 68.44%.
Concern about Ethernet traffic: 64-bit 33 MHz PCI needs ~82% occupancy for 930 Mbit/s; expect ~360 Mbit/s.
[Logic-analyser trace: CSR access, data transfer, PCI bursts of 4096 bytes]
PCI Activity: Read Throughput
Flat, then a 1/t dependence; ~860 Mbit/s for read blocks >= 262144 bytes.
CPU load ~20%.
Concern about the CPU load needed to drive a Gigabit link.
BaBar Case Study: RAID Throughput & PCI Activity
3Ware 7500-8 RAID5, parallel EIDE; the 3Ware card forces the PCI bus to 33 MHz.
BaBar Tyan to MB-NG SuperMicro: network memory-to-memory 619 Mbit/s.
Disk-to-disk throughput with bbcp: 40-45 Mbyte/s (320-360 Mbit/s).
The PCI bus is effectively full!
[Logic-analyser traces: read from the RAID5 disks, write to the RAID5 disks]
BaBar: Serial ATA RAID Controllers
3Ware and ICP controllers on 66 MHz PCI, RAID5 with 4 SATA disks.
[Plots: read and write throughput (Mbit/s) vs file size (0-2000 Mbytes) for each controller, with readahead max settings of 31, 63, 127, 256, 512 and 1200]
VLBI Project: Packet Loss Distribution
Measure the time between lost packets in the time series of packets sent; 1410 packets were lost in 0.6 s.
Is it a Poisson process? Assume the process is stationary, λ(t) = λ, and use the probability density function P(t) = λ e^(-λt) with mean λ = 2360 /s [426 us]. (A sketch of this check follows.)
[Histogram, 12 us bins: number in bin vs time between lost frames (us), measured vs Poisson]
[Log plot: measured fit y = 41.832 e^(-0.0028x), expected y = 39.762 e^(-0.0024x)]
The measured slope is -0.0028 where -0.0024 is expected, so an additional process could be involved.
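The Poisson check described here can be reproduced on synthetic data: draw exponentially distributed gaps at the measured rate, histogram them in 12 us bins and fit the logarithm. A sketch using synthetic data, not the VLBI measurements:

```python
# Sketch of the Poisson check: for a stationary Poisson loss process the gaps
# between losses are exponential, so log(histogram) vs gap has slope -lambda.
import numpy as np

rate_per_s = 2360.0                               # mean loss rate from the slide
gaps_us = np.random.default_rng(0).exponential(1e6 / rate_per_s, 1410)

counts, edges = np.histogram(gaps_us, bins=np.arange(0, 2000, 12))  # 12 us bins
centres = edges[:-1] + 6.0
nonzero = counts > 0
slope, intercept = np.polyfit(centres[nonzero], np.log(counts[nonzero]), 1)
print(f"fitted slope {slope:.4f} per us, expected {-rate_per_s / 1e6:.4f}")
```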