LHCb on-line / off-line computing
Domenico Galli, Bologna
INFN CSN1
Lecce, 24.9.2003
Off-line computing

We plan the LHCb-Italy off-line computing resources to be as centralized as possible:
put as much computing power as possible in the CNAF Tier-1,
to minimize system administration manpower,
to optimize the exploitation of resources.
"Distributed" for us means distributed among CNAF and the other European Regional Centres.
Possible drawback: strong dependence on CNAF resource sharing.
The benefit of setting up Tier-3 centres at the major INFN sites for parallel ntuple analysis will be evaluated later.
2003 Activities


In 2003 LHCb-Italy contributed to DC03 (production of MC samples for the TDR).
47 Mevt produced in 60 days:
32 Mevt minimum bias;
10 Mevt inclusive b;
50 signal samples of 50 to 100 kevt each.
18 computing centres involved.
Italian contribution: 11.5% (should be 15%).
2003 Activities (II)



The Italian contribution to DC03 has been obtained using limited resources (40 kSI2000, i.e. 100 1-GHz PIII CPUs).
Larger contributions (Karlsruhe, DE; Imperial College, UK) come from the huge, dynamically allocated resources of those centres.
DIRAC, the LHCb distributed MC production system, has been used to run 36600 jobs; 85% of them ran outside CERN with 92% mean efficiency.
2003 Activities (III)

DC03 has also been used to validate the LHCb distributed analysis model:
distribution to Tier-1 centres of the signal and background MC samples stored at CERN during production;
samples have been pre-reduced based on kinematic or trigger criteria;
selection algorithms for specific decay channels (~30) have been executed;
events have been classified by means of tagging algorithms.
LHCb-Italy contributed to the implementation of the selection algorithms for B decays into two charged pions/kaons.
2003 Activities (IV)


To analyse high-statistics data samples, the PVFS distributed file system has been used:
110 MB/s aggregate I/O using 100Base-T Ethernet connections (to be compared with the 50 MB/s of a typical 1000Base-T NAS).
2003 Activities (V)


The analysis work by LHCb-Italy has been included in the "Reoptimized Detector Design and Performance" TDR (2-hadron channel + tagging).
3 LHCb internal notes have been written:
CERN-LHCb/2003-123: Bologna group, "Selection of B/Bs → h+h- decays at LHCb";
CERN-LHCb/2003-124: Bologna group, "CP sensitivity with B/Bs → h+h- decays at LHCb";
CERN-LHCb/2003-115: Milano group, "LHCb flavour tagging performance".
Software Roadmap
DC04 (April-June 2004) – Physics Goals

Demonstrate performance of the HLTs (needed for the computing TDR):
large minimum bias sample + signal.
Improve B/S estimates of the optimisation TDR:
large bb sample + signal;
physics improvements to generators.
DC04 – Computing Goals

Main goal: gather information to be used for writing the LHCb computing TDR.
Robustness test of the LHCb software and production system,
using software as realistic as possible in terms of performance.
Test of the LHCb distributed computing model,
including distributed analyses.
Incorporation of the LCG application area software into the LHCb production environment.
Use of LCG resources as a substantial fraction of the production capacity.
DC04 – Production Scenario


Generate (Gauss, “SIM” output):
150 million events minimum bias;
50 million events inclusive b decays;
20 million exclusive b decays in the channels of interest.
Digitize (Boole, “DIGI” output):
all events, applying the L0+L1 trigger decision.
Reconstruct (Brunel, “DST” output):
minimum bias and inclusive b decays passing the L0 and L1 triggers;
the entire exclusive b-decay sample.
Store:
SIM+DIGI+DST of all reconstructed events.
Goal: Robustness Test of the LHCb
Software and Production System

First use of the simulation program Gauss, based on Geant4.
Introduction of the new digitisation program, Boole,
with HLTEvent as output.
Robustness of the reconstruction program, Brunel,
including any new tuning or other available improvements,
not including mis-alignment/calibration.
Pre-selection of events based on physics criteria (DaVinci),
AKA “stripping”,
performed by the production system after the reconstruction,
producing multiple DST output streams.
Further development of the production tools (DIRAC etc.),
e.g. integration of stripping,
e.g. book-keeping improvements,
e.g. monitoring improvements.
Goal: Test of the LHCb Computing Model

Distributed data production:
as in 2003, will be run on all available production sites,
including LCG1;
controlled by the production manager at CERN,
in close collaboration with the LHCb production site managers.
Distributed data sets:
CERN:
complete DST (copied from the production centres);
master copies of the pre-selections (stripped DST).
Tier-1:
complete replica of the pre-selections;
master copy of the DST produced at the associated sites;
master (unique!) copy of the SIM+DIGI produced at the associated sites.
Distributed analysis.
Goal: Incorporation of the LCG Software

Gaudi will be updated to:
use the POOL (hybrid persistency implementation) mechanism;
use certain SEAL (general framework services) services,
e.g. the plug-in manager.
All the applications will use the new Gaudi:
should be ~transparent, but must be commissioned.
N.B.:
POOL provides the existing functionality of ROOT I/O,
and more: e.g. location-independent event collections;
but it is incompatible with the existing TDR data:
we may need to convert it if we want just one data format.
Needed Resources for DC04

The CPU requirement is 10 times what was needed for DC03.
Current resource estimates indicate DC04 will last 3 months:
assumes that Gauss is twice as slow as SICBMC;
currently planned for April-June.
GOAL: use of LCG resources as a substantial fraction of the production capacity:
we can hope for up to 50%.
Storage requirement:
6 TB at CERN for the complete DST;
19 TB distributed among the Tier-1s for locally produced SIM+DIGI+DST;
up to 1 TB per Tier-1 for the pre-selected DSTs.
Resource Request to the Bologna Tier-1 for DC04

CPU power: 200 kSI2000 (500 1-GHz PIII CPUs).

Disk: 5 TB.

Tape: 5 TB.
Tier-1 Growth in the Next Years

        CPU [kSI2000]   Disk [TB]   Tape [TB]
2004    200             5           5
2005    200             20          20
2006    400             100         200
2007    800             200         600
Online Computing

LHCb-Italy is involved in the online group to design the L1/HLT trigger farm.

Sezione di Bologna:
G. Avoni, A. Carbone, D. Galli, U. Marconi, G. Peco, M. Piccinini, V. Vagnoni.

Sezione di Milano:
T. Bellunato, L. Carbone, P. Dini.

Sezione di Ferrara:
A. Gianoli.
Online Computing (II)


Lots of changes since the Online TDR:
abandoned Network Processors;
included the Level-1 DAQ;
now have Ethernet from the readout boards;
destination assignment by the TFC (Timing and Fast Control).
Main ideas are the same:
large gigabit Ethernet Local Area Network to connect detector sources to CPU destinations;
simple (push) protocol, no event manager;
commodity components wherever possible;
everything controlled, configured and monitored by the ECS (Experiment Control System).
DAQ Architecture

(Diagram of the Gb Ethernet readout network. Labels in the figure: Level-1 traffic from the front-end electronics (FE boards and TRM), 126-224 links, 44 kHz, 5.5-11.0 GB/s, into 62-87 switches; HLT traffic, 323 links, 4 kHz, 1.6 GB/s, into 29 switches; multiplexing layer, 64-137 links, 88 kHz, mixed traffic; 32 links for the L1-decision; the readout network delivers 94-175 links at 7.1-12.6 GB/s to 94-175 SFCs, each feeding a subfarm switch and its CPUs, ~1800 CPUs in total; TFC system, L1-decision sorter and storage system complete the picture.)
Following the Data-Flow

(Diagram tracing two example events, labelled 1 (Level-1 traffic) and 2 (HLT / mixed traffic), through the same architecture: fragments from the front-end electronics traverse the multiplexing switches and the readout network (Gb Ethernet, 94 links, 7.1 GB/s, 94 SFCs) to one SFC, which builds the event and forwards it to a farm CPU; the L1 trigger runs on the farm and its decision goes back through the L1-decision sorter and the TFC system ("L0 yes", "L1 yes"); accepted events are then processed by the HLT on the farm (a B → ΦKs candidate is sketched) and sent to the storage system; ~1800 CPUs in total.)
Design Studies


Items under study:

Physical farm implementation (choice of cases, cooling, etc.)

Farm management (bootstrap procedure, monitoring)

Subfarm Controllers (event-builders, load-balancing queue)

Ethernet Switches

Integration with TFC and ECS

System Simulation
LHCb-Italy is involved in Farm management,
Subfarm Controllers and their communication with
Subfarm Nodes.
Tests in Bologna


To begin the activity in Bologna, we started (August 2003) from scratch by trying to transfer data over 1000Base-T (gigabit Ethernet on copper cables) from PC to PC and to measure the performance.
Since we plan to use an unreliable protocol (raw Ethernet, raw IP or UDP), because reliable ones (like TCP, which retransmits unacknowledged datagrams) introduce unpredictable latency, we need to benchmark data loss as well as throughput and latency.
Tests in Bologna (II) – Previous results




In the IEEE 802.3 standard specifications, for 100 m long cat5e cables, the BER (Bit Error Rate) is said to be < 10^-10.
Previous measurements, performed by A. Barczyc, B. Jost and N. Neufeld using Network Processors (not real PCs) and 100 m long cat5e cables, showed a BER < 10^-14.
Recent measurements (presented by A. Barczyc at Zürich, 18.09.2003), performed using PCs, gave a frame drop rate of O(10^-6).
⇒ A lot of data (too much for L1!) gets lost inside the kernel network stack implementation of the PCs.
Tests in Bologna (III)

Transferring data over 1000Base-T Ethernet is not as trivial as it was for 100Base-TX Ethernet:
A new bus (PCI-X) and new chipsets (e.g. Intel E7601, 875P) have been designed to sustain the gigabit NIC data flow (the PCI bus and old chipsets do not have enough bandwidth to support a gigabit NIC at gigabit rate).
The Linux kernel implementation of the network stack has been rewritten twice since kernel 2.4 to support gigabit data flow (the networking code is 20% of the kernel source). The last modification implies a change of the kernel-to-driver interface (network drivers must be rewritten).
The standard Linux RedHat 9A setup uses the back-compatibility code and loses packets.
Not many people are interested in achieving very low packet loss (except for video streaming).
A DATATAG group is also working on packet losses (M. Rio, T. Kelly, M. Goutelle, R. Hughes-Jones, J.P. Martin-Flatin, "A map of the networking code in Linux Kernel 2.4.20", draft 8, 18 August 2003).
Tests in Bologna. Results Summary



Throughput was always higher than expected (957 Mb/s of IP payload measured), while data loss was our main concern.
We have understood, first (at least) within the LHCb collaboration, how to send IP datagrams at gigabit/second rate from Linux to Linux over 1000Base-T Ethernet without datagram loss (4 datagrams lost out of 2.0x10^10 datagrams sent).
This required:
using the appropriate software:
NAPI kernel (≥ 2.4.20);
NAPI-enabled drivers (for the Intel e1000 driver, recompilation with a special flag set was needed);
kernel parameter tuning (buffer & queue lengths);
1000Base-T flow control enabled on the NIC.
Test-bed 0

2 x PC with 3 x 1000Base-T interfaces each.
Motherboard: SuperMicro X5DPL-iGM:
dual Pentium IV Xeon 2.4 GHz, 1 GB ECC RAM;
chipset Intel E7501;
400/533 MHz FSB (front side bus);
bus controller hub Intel P64H2 (2 x PCI-X, 64 bit, 66/100/133 MHz);
Ethernet controller Intel 82545EM: 1 x 1000Base-T interface (supports jumbo frames).
Plugged-in PCI-X Ethernet card: Intel Pro/1000 MT Dual Port Server Adapter:
Ethernet controller Intel 82546EB: 2 x 1000Base-T interfaces (supports jumbo frames).
1000Base-T 8-port switch: HP ProCurve 6108:
16 Gbps backplane: non-blocking architecture;
latency: < 12.5 µs (LIFO, 64-byte packets);
throughput: 11.9 million pps (64-byte packets);
switching capacity: 16 Gbps.
Cat. 6e cables:
max 500 MHz (cf. the 125 MHz required by 1000Base-T).
Test-bed 0 (II)
(Diagram: the two PCs, lhcbcn1 and lhcbcn2, are connected through the 1000Base-T switch, which also has an uplink to the public network; their interfaces are 131.154.10.2, 10.10.0.2, 10.10.1.2 and 131.154.10.7, 10.10.0.7, 10.10.1.7.)

echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter

to use only one interface to receive the packets belonging to a given network (131.154.10, 10.10.0 and 10.10.1).
Test-bed 0 (III)
SuperMicro X5DPL-iGM Motherboard
(Chipset Intel E7501)

The chipset internal bandwidth is guaranteed:
6.4 Gb/s minimum.
Benchmark Software

We used two benchmark programs:
Netperf 2.2p14 (UDP_STREAM);
self-made basic sender & receiver programs using UDP & raw IP (a minimal sketch follows).
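The following is only a minimal sketch, not the actual benchmark code, of what such a sender/receiver pair can look like (the file name, ports and the 4-byte sequence-number framing used to count lost datagrams are our own assumptions; error checks are omitted for brevity):

/* udp_bench.c - minimal UDP streaming benchmark sketch.
 * Build:  gcc -O2 -o udp_bench udp_bench.c
 * Usage:  ./udp_bench recv <port>              (start first)
 *         ./udp_bench send <host> <port> <n>   (send n datagrams)
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PAYLOAD 4096                 /* 4096-byte datagrams, as in the tests */

int main(int argc, char *argv[])
{
    char buf[PAYLOAD];
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;

    if (argc >= 3 && strcmp(argv[1], "recv") == 0) {
        unsigned long expected = 0, received = 0, lost = 0;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(atoi(argv[2]));
        bind(s, (struct sockaddr *)&addr, sizeof(addr));
        for (;;) {
            if (recv(s, buf, sizeof(buf), 0) < 0)
                break;
            /* The first 4 bytes carry a sequence number: gaps are counted as
             * losses (on a point-to-point LAN, reordering is negligible). */
            unsigned long seq = ntohl(*(unsigned int *)buf);
            if (seq >= expected) {
                lost += seq - expected;
                expected = seq + 1;
            }
            if (++received % 1000000 == 0)
                printf("received %lu, lost %lu\n", received, lost);
        }
    } else if (argc >= 5 && strcmp(argv[1], "send") == 0) {
        unsigned long i, n = strtoul(argv[4], NULL, 10);
        addr.sin_addr.s_addr = inet_addr(argv[2]);
        addr.sin_port = htons(atoi(argv[3]));
        for (i = 0; i < n; i++) {
            *(unsigned int *)buf = htonl((unsigned int)i);
            sendto(s, buf, sizeof(buf), 0,
                   (struct sockaddr *)&addr, sizeof(addr));
        }
    }
    close(s);
    return 0;
}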
We discovered a bug in netperf on the Linux platform:
since on Linux the calls setsockopt(SO_SNDBUF) & setsockopt(SO_RCVBUF) set the buffer size to twice the requested size, while getsockopt(SO_SNDBUF) & getsockopt(SO_RCVBUF) return the actual buffer size, when netperf iterates to reach the requested precision in the results it doubles the buffer size at each iteration, because it uses the same variable for both system calls.
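The kernel behaviour behind the bug can be seen with a few lines of C (a stand-alone illustration, not netperf code; it assumes net.core.wmem_max is large enough not to cap the request):

/* sndbuf_check.c - setsockopt(SO_SNDBUF) stores twice the requested value,
 * and getsockopt reports that doubled value, so feeding the read-back value
 * into the next setsockopt call doubles the buffer at every iteration. */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int requested = 524288;          /* the send buffer size used in the tests */
    int actual = 0;
    socklen_t len = sizeof(actual);

    setsockopt(s, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested));
    getsockopt(s, SOL_SOCKET, SO_SNDBUF, &actual, &len);

    /* On Linux this prints "requested 524288, kernel reports 1048576". */
    printf("requested %d, kernel reports %d\n", requested, actual);

    close(s);
    return 0;
}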
Benchmark Environment

Kernel 2.4.20-18.9smp

GigaEthernet driver: e1000

version 5.0.43-k1 (RedHat 9A)

version 5.2.16 recompiled with NAPI flag enabled

System disconnected from public network

Runlevel 3 (X11 stopped)

Daemons stopped (crond, atd, sendmail, etc.)

Flow control on (on both NICs and switch)

Number of descriptors allocated for the driver rings: 256, 4096

IP send buffer size: 524288 (x2) Bytes

IP receive buffer size: 524288 (x2), 1048576 (x2) Bytes

Tx queue length 100, 1600
First Results. Linux RedHat 9A, Kernel 2.4.20, Default Setup, no Tuning

The first benchmark results on datagram loss showed big fluctuations which, in principle, can be due to packet queue resets, other CPU processes, interrupts, softirqs, broadcast network traffic, etc.
The resulting distribution is multi-modal.
Mean loss: 5.095x10^-5 = 1/19630, i.e. about 1 datagram lost every 20000 sent. Too much for the LHCb L1!
(Histogram: lost datagram fraction for 10^6-datagram runs, 2600 runs; peaks near 1/33300, 1/10300 and 1/3550. Test conditions: flow control on, switch uplink off, tx-ring 256 descriptors, rx-ring 256 descriptors, send buffer 524288 B, receive buffer 524288 B, tx queue length 100, driver e1000 5.0.43-k1, no NAPI.)
First Results. Linux RedHat 9A, Kernel 2.4.20, Default Setup, no Tuning (II)

(Same measurement shown with different binning and zoom: the distribution of the lost datagram fraction for 10^6-datagram runs, 2600 runs, is multi-modal, with peaks at loss fractions of about 1/186916, 1/74074, 1/46512, 1/45872, 1/44643, 1/42735, 1/33112, 1/33003 and 1/26596; mean value 5.095x10^-5 = 1/19630. Same test conditions as the previous slide.)

We think that the peak behaviour is due to kernel queue resets (all queued packets are silently dropped when the queue is full).
Changes in Linux Network Stack
Implementation



2.1 → 2.2: netlink, bottom halves, HFC (hardware flow control):
as little computation as possible while in interrupt context (interrupts disabled);
part of the processing deferred from the interrupt handler to bottom halves, to be executed at a later time (with interrupts enabled);
HFC (to prevent interrupt livelock): when the backlog queue is completely filled, interrupts are disabled until the backlog queue is emptied;
bottom-half execution strictly serialized among CPUs: only one packet at a time can enter the system.
2.3.43 → 2.4: softnet, softirqs:
softirqs are software threads that replace bottom halves;
possible parallelism on SMP machines.
2.5.53 → 2.4.20 (N.B.: a back-port): NAPI (new application program interface):
interrupt mitigation technology (a mixture of interrupt and polling mechanisms).
Interrupt livelock




Given the interrupt rate coming in, the IP processing thread never gets a chance to remove any packets from the system.
So many interrupts come into the system that no useful work is done.
Packets go all the way to being queued, but are dropped because the backlog queue is full.
System resources are abused extensively, but no useful work is accomplished.
NAPI (New API)

NAPI is an interrupt mitigation mechanism consisting of a mixture of interrupt and polling mechanisms.
Polling:
useful under heavy load;
introduces more latency under light load;
abuses the CPU by polling devices that have no packets to offer.
Interrupts:
improve latency under light load;
make the system vulnerable to livelock as the interrupt load exceeds the MLFFR (Maximum Loss-Free Forwarding Rate).
Packet Reception in the Linux Kernel ≤ 2.4.19 (softnet) and ≥ 2.4.20 (NAPI)

(Two data-flow diagrams.
Softnet, kernel ≤ 2.4.19: the NIC DMA engine fills the rx-ring and raises an interrupt; the interrupt handler disables interrupts, allocates an sk_buff, fetches the data via DMA, pushes the packet into the per-CPU backlog queue with netif_rx(), schedules a softirq with __cpu_raise_softirq() and re-enables interrupts; if the backlog queue is full, it is emptied completely (HFC, to avoid interrupt livelock); the softirq handler net_rx_action() pops packets from the backlog queue and passes them to ip_rcv() for further processing in a kernel thread.
NAPI, kernel ≥ 2.4.20: the interrupt handler disables the device interrupt and only adds a pointer to the device (eth0, eth1, ...) to the poll_list via netif_rx_schedule(), then schedules the softirq; net_rx_action() polls the devices on the poll_list, reading packets directly from their rx-rings, and passes them to ip_rcv(); the device interrupt is re-enabled once its ring has been drained.)
NAPI (II)


Under low load, before the MLFFR is reached, the system converges towards an interrupt-driven system: the packets/interrupt ratio is lower and latency is reduced.
Under heavy load, the system takes its time to poll the registered devices; interrupts are allowed only as fast as the system can process them: the packets/interrupt ratio is higher and latency is increased.
NAPI (III)

NAPI changes the driver-to-kernel interface:
all network drivers should be rewritten (see the sketch below).
In order to accommodate devices that are not NAPI-aware, the old interface (backlog queue) is still available for the old drivers (back-compatibility).
Backlog queues, when used in back-compatibility mode, are polled just like other devices.
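As a rough illustration of what the new interface asks of a driver (a sketch only: netif_rx_schedule(), netif_receive_skb(), netif_rx_complete() and eth_type_trans() are real kernel entry points of that era, while the mynic_* helpers, struct mynic_priv and all ring-handling details are hypothetical), a NAPI-aware poll callback for the 2.4.20-era interface looks roughly like this:

#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>

/* Sketch of a NAPI poll callback (2.4.20 back-port era interface). */
static int mynic_poll(struct net_device *dev, int *budget)
{
    struct mynic_priv *priv = dev->priv;            /* hypothetical private data */
    int work_to_do = min(*budget, dev->quota);
    int work_done = 0;

    /* Drain the rx ring while the NIC rx interrupt stays disabled. */
    while (work_done < work_to_do) {
        struct sk_buff *skb = mynic_rx_ring_next(priv);   /* hypothetical */
        if (!skb)
            break;
        skb->protocol = eth_type_trans(skb, dev);
        netif_receive_skb(skb);    /* hand the packet to the stack (softirq context) */
        work_done++;
    }

    *budget -= work_done;
    dev->quota -= work_done;

    if (work_done < work_to_do) {
        /* Ring empty: leave the poll_list and re-enable the NIC interrupt. */
        netif_rx_complete(dev);
        mynic_enable_rx_irq(priv); /* hypothetical */
        return 0;
    }
    return 1;                      /* quota used up: stay on the poll_list */
}

/* The interrupt handler, instead of pushing packets into the backlog queue
 * with netif_rx(), only disables further rx interrupts and calls
 * netif_rx_schedule(dev) to put the device on the poll_list. */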
True NAPI vs Back-Compatibility Mode

(Two data-flow diagrams.
NAPI kernel with a NAPI driver: the interrupt handler disables the interrupt and adds the device (eth0, eth1, ...) to the poll_list via netif_rx_schedule(); net_rx_action() then polls the device's rx-ring directly and passes the packets to ip_rcv() for further processing.
NAPI kernel with an old, not NAPI-aware driver: the interrupt handler still pushes packets into the per-CPU backlog queue with netif_rx(); the backlog queue itself is registered on the poll_list and is polled by net_rx_action() just like any other device before the packets reach ip_rcv().)
The Intel e1000 Driver


Even in the latest version of the e1000 driver (5.2.16), NAPI is turned off by default (to allow the use of the driver also with kernels ≤ 2.4.19).
To enable NAPI, the e1000 5.2.16 driver must be recompiled with the option:
make CFLAGS_EXTRA=-DCONFIG_E1000_NAPI
LHCb on-line / off-line computing. 42
D. Galli
Best Results


Maximum transfer rate (UDP, 4096-byte datagrams): 957 Mb/s.
Mean datagram loss fraction (at 957 Mb/s): 2.0x10^-10 (4 datagrams lost out of 2.0x10^10 4096-byte datagrams sent),
corresponding to a BER of 6.2x10^-15 (using 1 m cat6e cables) if the data loss is entirely due to hardware CRC errors.
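(As a sanity check of that last figure: if every lost datagram corresponds to a single bit error caught by the CRC, the bit error rate is roughly the datagram loss probability divided by the datagram size in bits, BER ≈ 2.0x10^-10 / (4096 x 8) ≈ 6x10^-15, consistent with the quoted 6.2x10^-15.)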
To be Tested to Improve Further

Kernel 2.5:
sysenter & sysexit (instead of int 0x80) for the context switch following system calls (3-4 times faster);
fully preemptive (real time).
Asynchronous datagram receiving.
Jumbo frames:
Ethernet frames whose MTU (Maximum Transmission Unit) is 9000 bytes instead of 1500: less fragmentation of IP datagrams into packets.
Kernel Mode Linux (http://web.yl.is.s.u-tokyo.ac.jp/~tosh/kml/):
KML is a technology that enables the execution of ordinary user-space programs inside kernel space;
protection by software (as in Java bytecode) instead of protection by hardware;
system calls become function calls (132 times faster than int 0x80, 36 times faster than sysenter/sysexit).
Milestones

8.2004 – Streaming benchmarks:
Maximum streaming throughput and packet loss using UDP, raw IP and raw Ethernet with a loopback cable.
Test of switch performance (streaming throughput, latency and packet loss, using standard frames and jumbo frames).
Maximum streaming throughput and packet loss using UDP, raw IP and raw Ethernet for 2 or 3 simultaneous connections on the same PC.
Test of event building (receive 2 message streams and send 1 joined message stream; a sketch follows this list).
12.2004 – SFC (Sub-Farm Controller) to nodes communication:
Definition of the SFC-to-nodes communication protocol.
Definition of the SFC queueing and scheduling mechanism.
First implementation of the queueing/scheduling procedures (possibly zero-copy).
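A minimal sketch of what the event-building test could look like (the ports, the output destination and the plain concatenation of the two payloads are our assumptions, not a defined LHCb protocol): two UDP streams are received and each pair of fragments is sent out as one joined message.

/* eb_sketch.c - "receive 2 streams, send 1 joined stream" test sketch.
 * Usage: ./eb_sketch <in_port1> <in_port2> <out_host> <out_port>
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_FRAG 4096

static int bind_udp(int port)
{
    struct sockaddr_in a;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_ANY);
    a.sin_port = htons(port);
    bind(s, (struct sockaddr *)&a, sizeof(a));
    return s;
}

int main(int argc, char *argv[])
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s in_port1 in_port2 out_host out_port\n", argv[0]);
        return 1;
    }
    int in1 = bind_udp(atoi(argv[1]));
    int in2 = bind_udp(atoi(argv[2]));
    int out = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = inet_addr(argv[3]);
    dst.sin_port = htons(atoi(argv[4]));

    char event[2 * MAX_FRAG];
    for (;;) {
        /* One fragment per input stream; blocking reads keep the sketch simple. */
        ssize_t n1 = recv(in1, event, MAX_FRAG, 0);
        if (n1 < 0)
            break;
        ssize_t n2 = recv(in2, event + n1, MAX_FRAG, 0);
        if (n2 < 0)
            break;
        /* Send the two concatenated fragments as one "built" event. */
        sendto(out, event, n1 + n2, 0, (struct sockaddr *)&dst, sizeof(dst));
    }
    return 0;
}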
Milestones (II)


OS tests (if performance needs to be improved):
kernel Linux 2.5.53;
KML (Kernel Mode Linux).
Design and test of the bootstrap procedures:
Measurement of the failure rate of the simultaneous boot of a cluster of PCs, using PXE/DHCP and TFTP.
Test of node switch on/off and power cycling using ASF.
Design of the bootstrap system (ratio of nodes to proxy servers to servers, software alignment among servers).
Definition of the requirements for the trigger software:
error trapping;
timeouts.