DCS-FEE during TPC commissioning


FeeCom software during TPC
commissioning (Benchmarks)
22-01-2007
Sebastian Bablok
Dag Toppe Larsen
Matthias Richter
Benjamin Schockert
Department of Physics and Technology,
University of Bergen, Norway
Center for Telecommunication and Technology Transfer,
University of Applied Science Worms, Germany
TOC
TPC commissioning: DCS-FEE part
  Setup overview
  Observations
  Conclusion
Benchmarks during commissioning
  Results
  Remarks
Future plans
Front-End-Electronics in DCS
Control and monitor channels
[Diagram: layered DCS architecture for the Front-End Electronics]
Supervisory Layer: PVSS II (FED Client)
Front-End Device Interface (FED)
Control Layer: InterComLayer (FED Server and FEE Client); loads configuration data from a config file OR a config database
Front-End Electronics Interface (FEE): Cmd / ACK Channel, Service Channel, Message Channel
Field Layer: FeeServers, connected via internal bus systems to the hardware devices
Schematic layout for commissioning
[Diagram: network layout]
External network: PVSS (incl. FedClient), switch
Gateway: tpcfee01 (ICL); test machine: tpcfee02 (Test-FedClient)
Internal network: switch, 6 DCS boards (FeeServer incl. TPC CE)
Link speeds shown: 100 MBit/s, 10 MBit/s, 100 MBit/s
DCS network setup
Based on standard protocols/tools: DHCP, DNS, NFS
DCS boards on private network 10.x.x.x
.feenet used as local TLD
Board number used for MAC and IP addresses (24 LSB) and hostname alias (dcs<board#>.feenet)
Gateway running ICL provides communication with outside world
Hostname in format tpc-fee_x_yy_z.feenet, dcs<board#>.feenet as alias
FeeServer name set from hostname
FeeServer stored on and run from external NFS share
Logs written to NFS share
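The addressing scheme above can be sketched as follows. This is a minimal illustration, assuming the board number is copied unchanged into the low 24 bits of both the IPv4 address (inside 10.0.0.0/8) and the MAC address, and that the alias uses the plain decimal board number. The MAC prefix `02:00:00` and the function name `board_addresses` are hypothetical and not taken from the slides.

```python
import ipaddress

# Hypothetical locally administered MAC prefix; the real prefix is not given in the slides.
MAC_PREFIX = bytes([0x02, 0x00, 0x00])


def board_addresses(board):
    """Derive IP address, MAC address and hostname alias from a DCS board number."""
    if not 0 < board < 2**24:
        raise ValueError("board number must be non-zero and fit into 24 bits")
    low = board.to_bytes(3, "big")                  # the 24 least significant bits
    ip = ipaddress.IPv4Address(bytes([10]) + low)   # private 10.x.x.x network
    mac = ":".join(f"{b:02x}" for b in MAC_PREFIX + low)
    # Assumption: the alias uses the decimal board number without padding.
    return {"ip": str(ip), "mac": mac, "alias": f"dcs{board}.feenet"}


example = board_addresses(258)
print(example)
```

With this convention a board can be located on the network from its number alone, which is presumably why the same number is reused for all three identifiers.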
DCS bootup
MAC address set to board number
DCS board sends MAC address to DHCP server, requesting IP address and
hostname
DHCP server looks up IP address for MAC address, then queries Domain
Name Server for hostname matching IP-address
DHCP server returns IP configuration and hostname to DCS board
DCS board mounts two NFS shares – one RO and one RW
Boot-script run from RO shared directory
May start update scripts
Starts FeeServer with the hostname as FeeServer name; logs are written to the RW share
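The boot sequence above amounts to a chain of lookups, which can be emulated in a few lines. This is only a sketch: the lease table, the reverse DNS table, the mount point `/mnt/rw`, the example hostname and both function names are hypothetical stand-ins for the real DHCP, DNS and NFS services.

```python
def dhcp_reply(mac, leases, reverse_dns):
    """Emulate the DHCP server: MAC -> IP (lease table), then IP -> hostname (DNS)."""
    ip = leases[mac]
    hostname = reverse_dns[ip]
    return {"ip": ip, "hostname": hostname}


def boot_plan(mac, leases, reverse_dns, rw_share="/mnt/rw"):
    """Collect what the boot script needs: IP, FeeServer name and log location."""
    reply = dhcp_reply(mac, leases, reverse_dns)
    return {
        "ip": reply["ip"],
        # The FeeServer name is taken from the hostname, as stated in the slides.
        "feeserver_name": reply["hostname"],
        # Assumption: one log file per host on the read-write share.
        "log_file": f"{rw_share}/{reply['hostname']}.log",
    }


# Hypothetical example data for one board.
leases = {"02:00:00:00:01:02": "10.0.1.2"}
reverse_dns = {"10.0.1.2": "tpc-fee_0_13_0.feenet"}
plan = boot_plan("02:00:00:00:01:02", leases, reverse_dns)
print(plan)
```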
Cables
DCS-side:
Uses non-standard connector without any locking
May easily fall out
Connectors are glued together, cable attached to cooling plate using cable ties
Switch-side:
Standard ethernet connector
Connectors not well made/attached, bad contact
Had to be re-crimped
Are still sensitive to twisting when plugged into switch/patch panel
Network problems during commissioning
Some boards were unreachable via the network: 90% packet drop
Switch indicated 100Mb/s – not 10 as expected
Most boards were affected: some always, some only rarely
However, a short power cycle seemed to help
Turned out there was a bug in the kernel driver: autonegotiation was not always enabled on boot
Ethernet interface switched to 100Mb/s operation
The electronics between ethernet chip and cable on DCS board does not support this
because of modifications due to the strong magnetic field
Only a few packets got through
After kernel update, problems gone
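The symptom pattern described above (wrong negotiated speed combined with heavy packet loss) lends itself to a simple automated check. The sketch below is illustrative only: the function name and the loss limit of 0.5 are hypothetical, while the expected 10 Mb/s link speed and the observed 90% loss come from the slides.

```python
def diagnose_link(negotiated_mbps, packet_loss, expected_mbps=10, loss_limit=0.5):
    """Classify a DCS board link from its negotiated speed and packet loss fraction."""
    if negotiated_mbps != expected_mbps:
        # E.g. the switch reports 100 Mb/s although the board hardware only supports 10 Mb/s.
        return "speed mismatch"
    if packet_loss > loss_limit:
        # Correct speed but still lossy: look at cabling or connectors instead.
        return "lossy link"
    return "ok"


print(diagnose_link(100, 0.9))  # the situation seen before the kernel update
print(diagnose_link(10, 0.0))   # the expected situation
```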
Temperature measurements
• All FECs have temperature sensors
– If temperature too high electronics
may be damaged
– The FeeServer will export
temperatures to higher layers
– High temperatures will cause
electronics to be switched off
• During commissioning temperature
was written continuously to log files
– A temperature cross section for
each partition was plotted
every 12 hours
– No alarming temperatures were seen
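The protective rule described above (switch electronics off when the temperature gets too high) can be sketched as a threshold check. The FEC names, the readings and the limit used here are hypothetical; the slides do not state the actual limit or how the FeeServer implements the check.

```python
def fecs_to_switch_off(temperatures, limit_celsius):
    """Return the FECs whose temperature exceeds the given limit, in sorted order."""
    return sorted(fec for fec, temp in temperatures.items() if temp > limit_celsius)


# Hypothetical readings in degrees Celsius; the limit is illustrative, not a TPC value.
readings = {"FEC_00": 31.5, "FEC_01": 47.2, "FEC_02": 29.8}
too_hot = fecs_to_switch_off(readings, limit_celsius=45.0)
print(too_hot)
```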
Software
Mostly OK
InterComLayer/FeeServers interplay is working
FeeServers sometimes “disappear” from DID, but not from ICL. It seems like
they are running, but not in a working state
FeeServers sometimes do not publish services – registration timeout
FeeServers crash (and restart) when FECs are turned on and off via DDL
The kernel update took care of most other problems (before it, it was “impossible” to get all
DCS boards running without “dirty tricks”)
Commissioning conclusion
Network based configuration worked as planned
Some initial network problems, OK after kernel update
No alarming electronics temperatures seen
Some minor FeeServer issues
Ethernet cables must be handled with care
Benchmarks during TPC
commissioning
Benchmark done with one patch and a complete slice of the TPC
Benchmark test performed on TPC side 0 (a), slice 13 (single cast on patch 0)
Setup:
6 FeeServers with TPC ControlEngine (CE)
Switch: NETGEAR 7300S Series Layer 3 Managed Switch
InterComLayer on P4 (3.4GHz, dual core, 512 MB RAM, SLC 3)
FedClient implementation for testing purposes on a different machine
Setup during commissioning and
benchmark tests
[Diagram: network layout]
PVSS (incl. FedClient), switch
tpcfee01 (ICL); tpcfee02 (Test-FedClient)
Switch, 6 DCS boards (FeeServer incl. TPC CE)
Link speeds shown: 100 MBit/s, 10 MBit/s, 100 MBit/s
Components used during benchmark
[Diagram: components of the layered architecture used in the benchmark]
Supervisory Layer: PVSS II (FED Client)
Front-End Device Interface (FED)
Control Layer: InterComLayer (FED Server and FEE Client); loads configuration data from a config file
Front-End Electronics Interface (FEE): Cmd / ACK Channel
Field Layer: FeeServers with CE
Benchmarks layout
Issued command:
Switching on / off of all Front-End-Cards of the patch
command size: 12 bytes (+ 12 bytes of FeePacket header = 24 bytes)
The CE emulated the execution of the “switch on/off FEC” command
Sent as:
Singlecast and Broadcast for a complete slice
from Test-FedClient and from PVSS
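The measurement procedure outlined above (issue a command repeatedly, time the interval until its ACK, then report average, maximum and minimum) can be sketched as a small timing harness. This is not the benchmark code used during commissioning; `benchmark` and the `send_command` callable are hypothetical, and the command is emulated here by a function that acknowledges immediately.

```python
import time
from statistics import mean


def benchmark(send_command, repetitions=100):
    """Time repeated command/ACK round trips; send_command() returns True on ACK."""
    durations = []
    lost = 0
    for _ in range(repetitions):
        start = time.perf_counter()
        acked = send_command()
        elapsed = time.perf_counter() - start
        if acked:
            durations.append(elapsed)
        else:
            lost += 1
    stats = {"count": len(durations), "lost": lost}
    if durations:  # statistics are only defined if at least one ACK arrived
        stats.update(average=mean(durations), max=max(durations), min=min(durations))
    return stats


# Emulated command that is always acknowledged at once.
result = benchmark(lambda: True, repetitions=5)
print(result)
```

Reporting the count of received ACKs alongside the timing statistics keeps lost ACKs visible instead of silently skewing the average.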
Benchmark results during TPC
commissioning
SingleCast ControlFero command:
time period [sec]                          average     max        min
Command in FedServer – ACK in FeeClient    0.358162    1.092122   0.243506
SEND – ACK in FeeClient                    0.3574644   1.091613   0.243026
Process time in ICL                        0.000698    0.000999   0.00048
FeeServer computing                        0.1118      0.84       0.02
Annotations:
command issued 100 times
no lost ACKs
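The single-cast figures can be cross-checked with simple arithmetic. The sketch below assumes that the ICL process time is the difference between the interval "command in FedServer to ACK in FeeClient" and the interval "SEND to ACK in FeeClient"; this reading is an interpretation of the numbers, not something the slides state explicitly. The variable names are hypothetical.

```python
# Values in seconds, as reported for the single-cast benchmark.
fed_to_ack = {"average": 0.358162, "min": 0.243506}
send_to_ack = {"average": 0.3574644, "min": 0.243026}

# Assumed relation: time spent inside the InterComLayer is the difference of both intervals.
icl_time = {key: fed_to_ack[key] - send_to_ack[key] for key in fed_to_ack}

for key, value in icl_time.items():
    print(f"{key}: {value:.6f} s")
```

The average and minimum agree with the reported ICL process times (0.000698 s and 0.00048 s). The maximum does not follow from subtracting the two maxima (1.092122 - 1.091613 = 0.000509 s versus the reported 0.000999 s), which is expected if the extreme values belong to different commands.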
Benchmark results during TPC
commissioning
BroadCast ControlFero command (FedServer – ACK in FeeClient):
[sec]     all        patch0     patch1     patch2     patch3     patch4     patch5
average   0.404874   0.267716   0.275715   0.303979   0.290279   0.313083   0.32129
max       1.012536   0.619624   0.847929   0.775591   1.011102   0.902006   0.848276
min       0.249206   0.235348   0.032372   0.236584   0.236367   0.064199   0.228168
count     96         84         92         91         95         92         90
Annotations:
command issued 96 times,
lost ACKs: 21 (no command was issued to FeeServers that were already missing)
Benchmark results during TPC
commissioning
FeeServer/CE benchmark (receive command – send ACK):
                  patch0     patch1     patch2     patch3     patch4     patch5
average [sec]     0.028901   0.041023   0.031837   0.042708   0.028316   0.027245
max [sec]         0.22       1.11       0.61       0.62       0.66       0.44
min [sec]         0.02       0.02       0.02       0.02       0.02       0.02
seg faults        3          4          0          0          1          1
duplicated ACKs   6          15         4          8          10         4
counts            91         88         98         96         95         98
Annotations:
command issued 100 times,
duplicated ACKs may indicate temporarily lost links to ICL and/or DIM-DNS
Remarks to Benchmark tests
Strongly delayed ACKs
a very small number of ACKs reached the FeeClient only after the ACK of the following
command had already been received
ACKs cannot overtake each other inside the FeeServer or the DIM framework
most likely the packets were temporarily stuck in the switch
duplicated ACKs
most likely due to a temporarily lost link to the FeeServer or the DIM-DNS
should not disturb the system, filtered out by InterComLayer
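The filtering of duplicated ACKs mentioned above can be illustrated with a small bookkeeping class. This is a sketch of the general technique, not the InterComLayer implementation: it assumes that every command carries a unique ID that is echoed in its ACK, and the class name `AckFilter` and its methods are hypothetical.

```python
class AckFilter:
    """Track outstanding commands and classify incoming ACKs by command ID."""

    def __init__(self):
        self._pending = set()    # commands sent, ACK not yet seen
        self._completed = set()  # commands whose ACK was already accepted

    def command_sent(self, command_id):
        self._pending.add(command_id)

    def ack_received(self, command_id):
        if command_id in self._pending:
            # First ACK for this command; accepted even if it arrives out of order.
            self._pending.remove(command_id)
            self._completed.add(command_id)
            return "accepted"
        if command_id in self._completed:
            return "duplicate"   # already handled, can be dropped safely
        return "unknown"         # ACK for a command that was never sent


ack_filter = AckFilter()
ack_filter.command_sent(1)
ack_filter.command_sent(2)
outcomes = [ack_filter.ack_received(i) for i in (2, 1, 1, 3)]
print(outcomes)
```

Because ACKs are matched by ID rather than by arrival order, a delayed ACK that arrives after the ACK of the following command is still assigned to the right command.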
Future Tests
Extended tests with more slices: 2, 9, 18 (one side), 36 (whole TPC, both sides)
A complete set of benchmark tests is being prepared for when the TPC is available again in
May 2007
Test with real commands, real configuration data and real execution in CE
Benchmarks of the Service Channels (fast triggered update of temp, etc.)
(usage of the CommandCoder during tests)
further investigation of delayed ACKs
verify that duplicated ACKs will not disturb the system
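To give a feeling for the scale of the extended tests, the number of FeeServers involved can be estimated from the slice counts listed above. The sketch assumes that every slice has 6 FeeServers, as in the slice used for the benchmark; the names are hypothetical.

```python
# From the benchmark setup: one complete slice was served by 6 FeeServers.
FEESERVERS_PER_SLICE = 6


def feeservers_for(slices):
    """Estimate the number of FeeServers, assuming 6 per slice for every slice."""
    return slices * FEESERVERS_PER_SLICE


planned = {slices: feeservers_for(slices) for slices in (1, 2, 9, 18, 36)}
print(planned)
```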