Time measurement of network
data transfer
R. Fantechi, G. Lamanna
25/5/2011
Outline
• Motivations
• Hardware setup
• Software tools
• Measurements and their (possible) interpretation
• Prospects
Motivations
• Network transfers to L1 and L2 need low latency
– For both TEL62-PC and PC-PC transfers, do we know how large
it is?
– For which network protocol is it best?
– How does it depend on the computer HW?
– How does it depend on the network interface?
– How large are the latency fluctuations? GPUs are sensitive…
– The knowledge of fluctuations is important to stay within the
1 ms budget
• Standard software monitor tools give averages
• Try to use hardware signals, generated at strategic points inside
the software
• Correlate signals from a sender to those from a receiver
Hardware setup
• Two PCs with GbE interfaces
– A is a Pentium 4 2.4 GHz
• Called PCATE
– B is a 2×4-core Xeon
• Called PCGPU
– Direct Ethernet connection on a hidden network
– Each PC is equipped with a parallel port I/F
• It is used to generate timing pulses
• LeCroy scope
– Time measurements
– Histograms
– Storage of screenshots
[Photos: PCATE, PCGPU, and the adapter for the parallel port]
Software tools
• Investigate three “protocols”
– Raw Ethernet packets (socket PF_PACKET, SOCK_RAW)
– IP packets (socket PF_INET, SOCK_RAW)
– TCP packets (socket PF_INET, SOCK_STREAM)
• Three pairs of simple senders/receivers
– The sender
• Gets from the command line the packet size, number of packets,
delay between packets, and downscaling factor (see later)
• Initializes the socket and enters a tight loop, with a delay inside
• Inside the loop, before and after the send command, writes a
pulse on the parallel port
– The receiver
• After initialization, enters a receive loop and writes a pulse on the
parallel port after each received packet
Code example
• Sender
/* Create raw socket */
sock = socket(AF_INET, SOCK_RAW, PROBE_PROT);
if (sock < 0) {
    perror("opening raw socket");
    exit(1);
}
………………………….
if (iloop < 0) iloop = 1000000000;
for (i = 0; i < iloop; i++)
{
    if (i % 50 == 0) {                   /* downscaling factor */
        buf[0] = 0x01;
        out = 0x01; outb(out, 0x378);    /* send a pulse */
        out = 0x00; outb(out, 0x378);
    }
    else buf[0] = 0x00;
    if (sendto(sock, buf, buflen, 0,
               (struct sockaddr *)&server,
               sizeof(struct sockaddr_in)) < 0)
        perror("writing on stream socket");
    out = 0x02; outb(out, 0x378);        /* pulse after the send */
    out = 0x00; outb(out, 0x378);
    for (k = 0; k < conv_time; k++);     /* delay loop */
}
• Receiver
/* Create raw socket */
sock = socket(AF_INET, SOCK_RAW, PROBE_PROT);
if (sock < 0) {
    perror("opening raw socket");
    exit(1);
}
………………….
serv_size = sizeof(server);
do {
    if ((rval = recvfrom(sock, buf, BUFFER_SIZE, 0,
                         (struct sockaddr *)&server, &serv_size)) < 0)
        perror("reading stream message");
    if (rval == 0)
        printf("Ending connection\n");
    else {
        if (rval == BUFFER_SIZE) {
            outb(0x01, 0x378);           /* send a pulse */
            outb(0x00, 0x378);
        }
        printf("-->%d\n", rval);
    }
} while (rval != 0);
Software tools
• Maximum rate
– On the sender, some time is spent on code execution
– The minimum achievable spacing between packets varies
from ~6 µs to ~10 µs
• Depending on machine speed, type of protocol, etc.
• Downscaling factor
– Needed to operate the scope properly at high rates
• If the loop index modulo the downscaling factor is 0, send in the
packet the pattern to be written by the receiver on the parallel
port, otherwise 0
• Packets are sent at the specified rate, but the scope registers only
a fraction
• Additional tools used
– Wireshark and tcpdump to check packet arrival
– ifconfig and /proc/interrupts to count packet and interrupt loss
Basic method check
• Are these pulses reliable?
– A simple check: histogram the width of the pulse generated
by the sender
– Pulse width: ~1.22 µs, sdev 0.04 µs; watch out for the maximum
Parameters used in the tests
• Packet size
– Small packets (200 bytes) or large packets (1300 bytes)
• Protocols
– 3 as mentioned before
• Delay between packets
– Usually from 10 ms down to the minimum
– Typical sequence: 10, 5, 2, 1 ms, then 100, 50, 20, 10 µs
• Measurements
– Store interesting screenshots
– Record time difference, sigma, max value
• Time difference = time of rx pulse – time of tx pulse
Lost packets and interrupts
• No lost packets observed at any rate
– Checked with ifconfig at source and destination
• Interrupt behaviour via /proc/interrupts
– At high rates the number of interrupts decreases
• Well known phenomenon of “interrupt coalescence” in the driver
• Packets received too fast are buffered and the CPU interrupted
only once
• For TCP at high rates and 200-byte buffers, interrupts are
reduced also because TCP packs many buffers into one Ethernet packet
• Anyway, measuring TCP performance is more difficult, as the
protocol is free to segment user buffers as it likes
(i.e. flow control)
RX interrupts - PCGPU
Interrupt coalescence: two examples, at 15 µs (left) and 12 µs (right) delay
1300 bytes, PCATE->PCGPU
CPU usage
[Plots: CPU usage on the sender and on the receiver]
Time across sendto
Time difference between the pulse after sendto and
the one before – both pulses on the same machine
Time across sendto - Fluctuations
Count how many times the time is over 20 µs (wrt all times), on PCATE as sender
– Raw: ~5/26000
– IP: ~13/26000
– TCP: min ~8/20000 (1 ms delay), max ~402/20000 (100 µs delay) for 1300 bytes;
18/26000 for 200 bytes
Quiet example: only 15 – moving the mouse…: > 4500
Transfer time
As a function of time, different buffer sizes
Critical zone
Transfer time
As a function of packet size, different times, PCATE->PCGPU
Transfer time
PCATE -> PCGPU, raw, 1300 bytes
[Scope screenshots at packet delays of 5 ms, 500 µs, 2 ms, 200 µs, 1 ms, 100 µs]
Transfer time
PCGPU->PCATE, 5 ms delay
[Histograms for 200-byte and 1300-byte packets; difference ~8 µs]
Transfer time trending
PCGPU->PCATE, raw
[Trends: 200 bytes at 50 µs and 20 µs; 1000 bytes at 50 µs and 20 µs;
1300 bytes at 40 µs and 20 µs]
Summary
• Hardware timing system
– Reliable, not interfering with the measurement (at the level of
max 10 µs)
• Time spent in the sender
– A fraction (<10%) of the total transfer time
– Varies with the protocol type
– Stable with the packet rate
• Transfer time
– Down to a 50 µs delay it varies little as a function of packet rate
• Between 50 and 120 µs
– Below 20 µs it increases (up to 2 ms) for raw, but not for IP
• This setup does not work below ~10 µs
– Where we are most interested
To be done
• Complete the measurement
– Both directions
– All protocols (TCP, maybe new ones)
• Performance as a function of CPU power
– Use different PCs
– Add load on the machines
• Test multiple I/F and switches
• Change the sender to an object driven by an FPGA
– TEL62 or TALK
• Investigate different protocol features
– New protocols or switch features of the old ones
• Test more complex transfer sw (e.g. TDBIO)
• Some work hopefully done by USA summer students…