Drift chamber test DAQ software

Process and Data Flow Control in KLOE
E. Pasqualucci (INFN - Roma)
[email protected]
Outline
• System overview
• Process structure and local communication
• SNMP and remote communication
• Process control
• Data Flow Control system
• DFC monitor
DAQ system architecture
~ 23000 FEE channels @ 2.5 kHz φ + bckg (~10 kHz)
Bandwidth: ~ 50 Mbytes/s (5 Kbyte/ev.)
Storage: 200 Tbyte/y
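The quoted bandwidth follows directly from the peak rate and the event size. A minimal arithmetic check, assuming 1 Kbyte = 1000 bytes and taking the ~10 kHz figure as the total rate (the function name is illustrative):

```python
# Figures quoted on the slide; the unit convention (1 Kbyte = 1000 bytes) is an assumption.
EVENT_SIZE_BYTES = 5_000   # 5 Kbyte per event
PEAK_RATE_HZ = 10_000      # ~10 kHz total rate including background


def bandwidth_mbytes_per_s(rate_hz: float, event_size_bytes: float) -> float:
    """Return the data throughput in Mbytes/s for a given event rate and size."""
    return rate_hz * event_size_bytes / 1e6


print(bandwidth_mbytes_per_s(PEAK_RATE_HZ, EVENT_SIZE_BYTES))  # 50.0
```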
(Diagram; legible module labels: FDDI, CPU, VIC, ROCKM.)
Tested with peak rates of 10 kHz in multibunch mode.
Tested at maximum required throughput using non-zero-suppressed calorimeter data.
(Diagram labels: Trigger chain; DFC system; VIC; FDDI, CPU and VIC modules; ROCKM; FDDI; Run Control; FDDI Switch; Monitor System; CPU server; CBUS; Storage system; Level-2 crates.)
DAQ software organization
(Diagram of the DAQ processes and the flows between them: Data, Map data, Messages, Traps.)
Diagram labels: Level 1 chain, DFC system, VME, Chain tools, Collector, simulation, CmdSrv, Level 2, GeoVme map, Circ, Monitor system, Sender, SpyBuff, dmap, RSpyD, Didone, FDDI switch, Receiver, SpyD, RunCtl, Circ, Builder, Recorder, CmdSrv, SlowCtl system, Farm, Circ (Ybos), To Disk/Tape, Farm status, Spy dump
Process structure
• Initialization
– Msg Q creation
– Shmem subscription
– Shmem space allocation for variables
• Main Loop
– Process Event
– Process Command
– Idle time
• Interrupt Handler
– Extract command from Msg Q.
Shared memory mapping for all processes (Id: Contents)
• Header: Process number, Pointer to 1st process
• Proc. 1: Pointer to 2nd process, Process name, Process id, Message queue id, Process status, Last command, Last command status, Number of variables, Variable 1, Variable 2, …..
• Proc. 2: Pointer to 3rd process, …..
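The per-process record listed on this slide can be sketched as a small data structure. This is only an illustration: a Python list stands in for the shared-memory segment and its chain of pointers, and the class and method names (`ProcessEntry`, `ProcessTable`, `subscribe`, `locate`) are hypothetical, not taken from the KLOE software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProcessEntry:
    """One process record, with the fields named on the slide."""
    name: str
    pid: int
    msg_queue_id: int
    status: str = "idle"
    last_command: str = ""
    last_command_status: str = ""
    variables: Dict[str, object] = field(default_factory=dict)


@dataclass
class ProcessTable:
    """Stand-in for the shared-memory mapping: a header plus process records."""
    entries: List[ProcessEntry] = field(default_factory=list)

    @property
    def process_number(self) -> int:
        # Header field: number of subscribed processes.
        return len(self.entries)

    def subscribe(self, entry: ProcessEntry) -> ProcessEntry:
        # Done once at initialization ("Shmem subscription").
        self.entries.append(entry)
        return entry

    def locate(self, name: str) -> Optional[ProcessEntry]:
        # Walk the records in order, as one would follow the pointer chain.
        for entry in self.entries:
            if entry.name == name:
                return entry
        return None
```

Reading a variable then reduces to `table.locate(name).variables[var_name]`, which mirrors the "locate process, locate variable" steps.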
Local communication
• Getting a variable:
– Locate process
– Locate variable
• Sending a command:
– The sender:
• Locates the process
• Gets its id and message Q
• Puts command to Q
• Sends an interrupt (signal)
• Polls on command status
– The receiver:
• Reads the Q
• Writes the command and status (executing) and executes it
• Writes the command status (acknowledgement)
Example shown on the shared memory mapping: process name "My process", process id "My id", message queue id "My Q id", last command "Stop !", last command status "Executing" then "Success", Variable 2 "My variable = value".
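The sender and receiver steps can be sketched with standard-library primitives. This is a hedged illustration, not the KLOE code: `queue.Queue` stands in for the UNIX message queue, `threading.Event` for the interrupt signal, and the names `DaqProcess`, `serve_one` and `send_command` are hypothetical.

```python
import queue
import threading
import time


class DaqProcess:
    """Receiver side: owns a message queue and publishes its command status."""

    def __init__(self, name: str):
        self.name = name
        self.msg_q = queue.Queue()          # stand-in for the message queue
        self.interrupt = threading.Event()  # stand-in for the signal
        self.status = "running"
        self.last_command = ""
        self.last_command_status = ""

    def serve_one(self) -> None:
        """Wait for one interrupt, read the queue, execute, acknowledge."""
        self.interrupt.wait()
        self.interrupt.clear()
        command = self.msg_q.get()
        self.last_command = command
        self.last_command_status = "executing"
        if command == "stop":
            self.status = "stopped"
            self.last_command_status = "success"
        else:
            self.last_command_status = "fault"


def send_command(proc: DaqProcess, command: str,
                 timeout_s: float = 1.0, poll_s: float = 0.001) -> str:
    """Sender side: put the command, raise the interrupt, poll the status."""
    proc.last_command_status = "pending"
    proc.msg_q.put(command)
    proc.interrupt.set()
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.last_command_status in ("success", "fault"):
            return proc.last_command_status
        time.sleep(poll_s)
    return "timeout"
```

Polling on the status field keeps the sender free of any reply queue: the acknowledgement is simply the status written by the receiver.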
Managing the DAQ network
• SNMP (Simple Network Management Protocol)
• Widely used to manage network devices
• Defined as a standard by the IETF (Internet Engineering Task Force)
• Implemented over UDP; reliability comes from the request/response exchange, since UDP itself does not guarantee delivery
• Used to retrieve and/or set information about:
– network configuration
– traffic
– faults
– accounting
• Managed objects defined in a Management Information Base (MIB) defined by the IETF
• Private extensions of the standard MIB are allowed
• Public domain software allows the implementation of:
– dedicated agents
– utilities for remote access
SNMP client-server policy
• MIB
– Variables organized as a tree
• Primitives:
– get, get-next, set
• Each device runs a daemon able to:
– Understand MIB requests
– Obtain required information
– Execute required actions
• Trap mechanism
• KLOE uses SNMP to:
– Control DAQ devices and network
– Implement message distribution
– Implement process control
– Implement Data Flow Control (DFC)
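The three primitives act on variables ordered as a tree. A minimal in-memory sketch, assuming object identifiers are represented as integer tuples (the `MiniMib` class is hypothetical and does no network I/O):

```python
import bisect


class MiniMib:
    """Toy MIB: OID tuples kept in lexicographic order, as in an SNMP agent."""

    def __init__(self):
        self._values = {}  # OID tuple -> value
        self._oids = []    # sorted list of OID tuples

    def set(self, oid, value):
        """Create or overwrite the variable at this OID."""
        if oid not in self._values:
            bisect.insort(self._oids, oid)
        self._values[oid] = value

    def get(self, oid):
        """Return the value at exactly this OID, or None if absent."""
        return self._values.get(oid)

    def get_next(self, oid):
        """Return (oid, value) of the first variable after `oid`, or None."""
        i = bisect.bisect_right(self._oids, oid)
        if i == len(self._oids):
            return None
        nxt = self._oids[i]
        return nxt, self._values[nxt]
```

Repeated `get_next` calls starting from a subtree prefix walk every variable below it, which is how a manager discovers tables whose size it does not know in advance.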
The command server and
the KLOE MIB sub-tree
iso.org.dod.internet.mgmt.mib-2
  system(1)
    sysDescr(1)
    sysObjectID(2)
    sysUpTime(3)
    sysContact(4)
    sysName(5)
    sysLocation(6)
    sysServices(7)
  ….
  KLOE(13)
    kprocesses(1)
      kprocNumber(1)
      kprocTable(2)
        kprocEntry(1)
          kprocIndex(1)
          kprocName(2)
          kprocId(3)
          kprocMsgQId(4)
          kprocStatus(5)
          kprocLastCommand(6)
          kprocLastCommandStatus(7)
          kprocVarNumber(8)
      kprocVarTable(3)
        kprocVarEntry(1)
          kprocVarProcIndex(n,1)
          kProcVarIndex(n,2)
          kprocVarName(n,3)
          kprocVarSize(n,4)
          kprocVarType(n,5)
          kprocVarValue(n,6)
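A sketch of how numeric object identifiers could be composed for this sub-tree. The mib-2 prefix 1.3.6.1.2.1 is standard; everything else is an assumption: the nesting (kprocesses under KLOE, tables under kprocesses) is read off the numbers on the slide, and the index order (column first, then process index, then variable index) follows the usual SNMP table convention rather than anything stated here. The helper names are hypothetical.

```python
MIB2 = (1, 3, 6, 1, 2, 1)              # iso.org.dod.internet.mgmt.mib-2
KLOE = MIB2 + (13,)                    # KLOE(13), as numbered on the slide
KPROCESSES = KLOE + (1,)               # kprocesses(1)
KPROC_ENTRY = KPROCESSES + (2, 1)      # kprocTable(2).kprocEntry(1) (assumed nesting)
KPROC_VAR_ENTRY = KPROCESSES + (3, 1)  # kprocVarTable(3).kprocVarEntry(1) (assumed nesting)

KPROC_COLUMNS = {
    "kprocIndex": 1, "kprocName": 2, "kprocId": 3, "kprocMsgQId": 4,
    "kprocStatus": 5, "kprocLastCommand": 6, "kprocLastCommandStatus": 7,
    "kprocVarNumber": 8,
}
KPROC_VAR_COLUMNS = {
    "kprocVarProcIndex": 1, "kProcVarIndex": 2, "kprocVarName": 3,
    "kprocVarSize": 4, "kprocVarType": 5, "kprocVarValue": 6,
}


def kproc_oid(column: str, proc_index: int) -> tuple:
    """OID of one process-table cell (assumed order: column, then process index)."""
    return KPROC_ENTRY + (KPROC_COLUMNS[column], proc_index)


def kproc_var_oid(column: str, proc_index: int, var_index: int) -> tuple:
    """OID of one variable-table cell (assumed order: column, process, variable)."""
    return KPROC_VAR_ENTRY + (KPROC_VAR_COLUMNS[column], proc_index, var_index)


def dotted(oid: tuple) -> str:
    """Render an OID tuple in dotted-decimal notation."""
    return ".".join(str(part) for part in oid)
```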
Message system implementation
(Diagram: message exchange between two nodes.)
Node A: Run Control
Node B: Command Server, Msg Q, Shared Memory, DAQ Process
Arrow labels: locate process; send command; SNMP ack; put command; INT; get command; write last command and status (executing); execute command; write command status (success, fault); second ack req; get process variables; second ack
Remarks and performance
• Command server
– DAQ process
• receives commands and shares variables
– Command distributor
• Run and process control tools
– tcl/tk commands implemented
• get variable, send message
– Fortran interface for old-fashioned software
– Portable
• AIX, OSF1, HP-UX, Solaris, Linux, LynxOS supported
• Optimized library
– Parallel message distribution implemented
• Performance
– Local command ~1.2 ms
– Remote variable reading ~1.2 ms
– Remote command completion ~4 ms
Production process control
(Diagram labels: Control node, Production node, OffCtl, pcd, cmdsrv, locpc, Shmem (variables), Proc_1, Proc_2; arrow labels: command, command + start, trap, signal, check.)
DAQ system architecture
(Diagram repeated from the earlier slide of the same title. Labels: Trigger chain; DFC system; VIC; FDDI, CPU and VIC modules; ROCKM; FDDI; Run Control; FDDI Switch; Monitor System; CPU server; CBUS; Storage system; Level-2 crates.)
The DFC System
• Changes the packet distribution sequence
– Avoids slow-down in data transmission and blocking timeouts
• Keeps latency under control
(Diagram labels: DFC status, TS, VIC bus, shmem, Flow table, Network and trigger stat, Performance stat, Statistics, Commands, Traps, Flow table data, DFCd, DFC, latmon, Collector, Receiver, RunCtl.)
Receiver protocol
• Receives event sub-packets through the GigaSwitch
• Puts packets into multiple circular buffers
• Implements DFC and LatMon farm interface
• Dynamic thresholds
(Diagram: sub-packets arrive over TCP/IP on FDDI on several links labelled 0.5 MB/s and are passed on to EVB (1) ... EVB (n).)
• Select and copy sub-event packets
• If last # arrived: send LatMon trap (#) to LatMon
• Get max occupancy
• If “full”: send trap “full” to DFC system
• If “empty” after “full”: send trap “empty” to DFC system
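The full/empty logic amounts to a hysteresis on the maximum buffer occupancy. A minimal sketch, assuming occupancies are fractions between 0 and 1 and that traps are delivered through a callback; the class name, the threshold values and the callback are illustrative, not taken from the receiver code.

```python
class OccupancyMonitor:
    """Sends "full" when occupancy crosses the upper threshold,
    and "empty" only after a previous "full" once it falls back."""

    def __init__(self, full_threshold, empty_threshold, send_trap):
        self.set_thresholds(full_threshold, empty_threshold)
        self.send_trap = send_trap
        self.full_sent = False

    def set_thresholds(self, full_threshold, empty_threshold):
        # Thresholds may be redefined at run time ("dynamic thresholds").
        if empty_threshold >= full_threshold:
            raise ValueError("empty threshold must be below full threshold")
        self.full_threshold = full_threshold
        self.empty_threshold = empty_threshold

    def update(self, occupancies):
        """Check the circular buffers once and emit at most one trap."""
        max_occupancy = max(occupancies)
        if not self.full_sent and max_occupancy >= self.full_threshold:
            self.full_sent = True
            self.send_trap("full")
        elif self.full_sent and max_occupancy <= self.empty_threshold:
            self.full_sent = False
            self.send_trap("empty")
        return max_occupancy
```

Keeping the two thresholds apart prevents a buffer hovering near one level from generating a burst of alternating traps.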
DFC Protocol
• Initialization:
– Builds Network Map
– Builds DFC map (ordered list of RECV IP addresses)
– Creates the first table with infinite validity
• Main Loop:
– Wait for “trap”
– On trap (full/empty):
• Reads the last trigger number from Trigger Supervisor
• Creates next table
• Modifies the validity of the previous table
– Sends auto-test traps
DFC data in VME shared memory (diagram):
• DFC map: N. of RECV nodes, IP addresses
• Flow tables: Max number of tables; each table holds Flags (e.g. 111111…1111, 111101…1111) and a Trigger number validity
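The table handling on a trap can be sketched as follows. This is an illustration under assumptions: flags are modelled as one boolean per receiver, the validity is taken to be the last trigger number to which a table applies, and `margin` is a hypothetical allowance for triggers issued while the new table is being installed. The class names and the 192.0.2.x addresses are placeholders.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class FlowTable:
    flags: List[bool]           # one flag per receiver; True = may receive packets
    validity: float = math.inf  # last trigger number covered by this table


class FlowController:
    def __init__(self, receiver_ips):
        self.dfc_map = list(receiver_ips)  # ordered list of receiver addresses
        # First table: every receiver enabled, infinite validity.
        self.tables = [FlowTable([True] * len(self.dfc_map))]

    def on_trap(self, receiver_ip, kind, last_trigger, margin):
        """Close the current table and open a new one reflecting the trap."""
        current = self.tables[-1]
        flags = list(current.flags)
        flags[self.dfc_map.index(receiver_ip)] = (kind == "empty")
        current.validity = last_trigger + margin  # modify the previous validity
        self.tables.append(FlowTable(flags))      # new table, infinite validity
        return self.tables[-1]

    def table_for(self, trigger_number):
        """Return the table that applies to a given trigger number."""
        for table in self.tables:
            if trigger_number <= table.validity:
                return table
        return self.tables[-1]
```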
DFC algorithm and performance
• Validity:
– $v = t_0 + (t_{tr} + (t_{dfc} + k\sigma_{dfc})) \cdot (n + k\sigma_n) + t$
• k=5
– autotest
• DFCd reaction time (trap):
– 1.2 ms
• DFC reaction time:
– tlocal ~ 1.2 ms
– trigger interaction ~ 6-7 ms
– tdfc ~ O(10^-2) ms
– total 10 ms
• DFC-L2 interaction rate:
– ~ 1 table / 50 ms (sustained)
• DFC “dead time” implemented
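The validity expression can be evaluated directly once its terms are given values. The slide does not define the symbols, so the reading used here is an assumption: `t0` is the last trigger number read, `t_tr` and `t_dfc` are the trigger-interaction and DFC times in seconds, `n` is the trigger rate in Hz, the two sigma arguments are the spreads of `t_dfc` and `n`, and `t_extra` is the trailing term of the formula. Parameter names are illustrative.

```python
def validity(t0, t_tr, t_dfc, sigma_dfc, n, sigma_n, t_extra=0.0, k=5):
    """Evaluate v = t0 + (t_tr + (t_dfc + k*sigma_dfc)) * (n + k*sigma_n) + t_extra.

    Under the assumed reading, the product is the number of triggers that can
    occur while a new table is being installed, padded by k standard deviations.
    """
    return t0 + (t_tr + (t_dfc + k * sigma_dfc)) * (n + k * sigma_n) + t_extra


# Example with zero spreads: 6.5 ms trigger interaction, 0.01 ms DFC time, 2.5 kHz.
v = validity(t0=1000, t_tr=6.5e-3, t_dfc=1e-5, sigma_dfc=0.0, n=2500.0, sigma_n=0.0)
```

With k = 5 as quoted on the slide, non-zero spreads push the validity further ahead of the last trigger number.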
The DFC status monitor
Packet latency
• Latency measurements:
– SNMP traps sent to LatMon:
• Collector trap when the packet # is released for sender
• Receiver trap when all the sub-packets # arrived
• Test for receiver’s buffers
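A latency monitor built on these two traps only has to pair them by packet number. A minimal sketch, assuming each trap carries a packet number and a timestamp in seconds; the `LatencyMonitor` class and its method names are hypothetical.

```python
class LatencyMonitor:
    """Pairs collector and receiver traps to measure per-packet latency."""

    def __init__(self):
        self._released = {}  # packet number -> time of the collector trap
        self.latencies = {}  # packet number -> measured latency in seconds

    def on_collector_trap(self, packet_number, timestamp):
        # Packet number released to the sender.
        self._released[packet_number] = timestamp

    def on_receiver_trap(self, packet_number, timestamp):
        # All sub-packets with this number have arrived at the receiver.
        start = self._released.pop(packet_number, None)
        if start is None:
            return None  # no matching collector trap was seen
        latency = timestamp - start
        self.latencies[packet_number] = latency
        return latency
```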
Summary
• A fast and reliable message system has been implemented using standard UNIX mechanisms and the SNMP protocol
• Very simple to use
– process template + command definition
– Fortran and tcl/tk interface
• Allows full process control
• A Data Flow Control system has been developed using the message system and SNMP traps
• It allows network traffic to be redirected, taking into account the dynamics of the whole system
• Dynamic redefinition of thresholds
• It ran successfully during KLOE data acquisition