
Supervision of the ATLAS High Level Triggers
Sarah Wheeler
on behalf of the ATLAS Trigger/DAQ High Level Trigger group
CHEP, 24-28 March 2003
ATLAS Trigger and Data Acquisition
[Diagram: ATLAS Trigger/DAQ architecture. Detector data (Calo, MuTrCh and other detectors) arrive at 40 MHz; LVL1 (2.5 μs latency) accepts events at 75 kHz into the Read-Out Drivers (ROD) and Read-Out Buffers/Sub-systems (ROB/ROS) at 120 GB/s. The RoI Builder (ROIB) and LVL2 Supervisors (L2SV) dispatch RoI requests to the LVL2 Processing Units (L2PU) over the LVL2 network; RoI data amount to ~2% of the event (~3 GB/s), and the LVL2 decision (~10 ms) accepts ~2 kHz. The Dataflow Manager (DFM), Event Building Network (EBN) and Sub-Farm Inputs (SFI) build events at ~3+3 GB/s for the Event Filter Processors (EFP), which take of the order of a second per event and accept ~0.2 kHz (~200 Hz), giving ~300 MB/s through the Sub-Farm Outputs (SFO). The LVL2 and Event Filter stages together form the High Level Triggers.]
Supervision of the HLT

[Diagram: Sub-Farm Inputs (SFI) from the event building stage feeding an Event Filter Farm of Event Filter Processors (EFP).]
• HLT implemented as hundreds of software tasks running on large processor farms
• For reasons of practicality, the farms are split into sub-farms
• Supervision is responsible for all aspects of software task management and control (a rough sketch follows after this list):
  - Configuring
  - Controlling
  - Monitoring
• Supervision is one of the areas where commonality between Level-2 and Event Filter can be effectively exploited
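As a rough illustration of what "software task management" means in practice, the Python sketch below starts, polls and stops a set of trigger processes on the nodes of one sub-farm. The class name, the use of ssh and the one-task-per-node layout are simplifying assumptions made here for illustration; they are not the actual supervision implementation.

```python
import subprocess

class SubFarmTasks:
    """Minimal sketch: launch, poll and stop the trigger tasks of one sub-farm."""

    def __init__(self, nodes, trigger_cmd):
        self.nodes = nodes              # e.g. ["node01", "node02", ...]
        self.trigger_cmd = trigger_cmd  # command that starts one trigger process
        self.tasks = {}                 # node -> Popen handle

    def start_all(self):
        # One trigger process per node; a real farm runs several per node.
        for node in self.nodes:
            self.tasks[node] = subprocess.Popen(["ssh", node, self.trigger_cmd])

    def check_all(self):
        # Report which tasks are still alive (poll() returns None while running).
        return {node: (p.poll() is None) for node, p in self.tasks.items()}

    def stop_all(self):
        for p in self.tasks.values():
            p.terminate()
        for p in self.tasks.values():
            p.wait()
```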
Prototype HLT supervision system
• Prototype HLT supervision system has been implemented using tools from the ATLAS Online Software system (OnlineSW)
• OnlineSW is a system of the ATLAS Trigger/DAQ project
• Major integration exercise: OnlineSW provides generic services for TDAQ-wide configuration, control and monitoring
• Successfully adapted for use in the HLT
• For HLT control activities the following OnlineSW services are used:
  - Configuration Databases
  - Run Control
  - Supervisor (Process Control)
• Controllers based on a finite-state machine are arranged in a hierarchical tree, with one software controller per sub-farm and one top-level farm controller (a minimal sketch follows below)
• Controllers successfully customised for use in the HLT
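Purely as an illustration of the hierarchical finite-state-machine idea, here is a minimal Python sketch. The states, transitions and the Controller class are simplifications invented for this sketch; they are not the actual OnlineSW Run Control interfaces.

```python
# Run-control states and the commands that move between them (simplified).
TRANSITIONS = {
    "initial":    {"setup": "setup_done"},
    "setup_done": {"boot": "booted"},
    "booted":     {"load": "loaded"},
    "loaded":     {"configure": "configured"},
    "configured": {"start": "running"},
    "running":    {"stop": "configured"},
}

class Controller:
    """One finite-state-machine controller; sub-farm controllers are leaves,
    the farm controller holds one child controller per sub-farm."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.state = "initial"

    def command(self, cmd):
        # Propagate the command down the tree, then change the local state.
        for child in self.children:
            child.command(cmd)
        self.state = TRANSITIONS[self.state][cmd]

# One top-level farm controller with one controller per sub-farm.
farm = Controller("farm", [Controller(f"sub-farm-{i}") for i in range(3)])
for cmd in ("setup", "boot", "load", "configure", "start"):
    farm.command(cmd)
print(farm.state)  # -> "running"
```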
Controlling a Farm
[Diagram: the top-level Supervisor sends the commands setup, boot, load, configure and start to the Supervisor/Controller pair of each sub-farm (SubFarm 1, SubFarm 2), which apply them to their local processes.]
Monitoring Aspects
• Monitoring has been implemented using tools from OnlineSW
• Information Service
  - Statistical information is written by HLT processes to Information Service servers and retrieved by others, e.g. for display
• Error Reporting system
  - HLT processes use this service to issue error messages to any other TDAQ component, e.g. the central control console, where they can be displayed (both patterns are sketched below)
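A minimal sketch of the two monitoring patterns just described, written in Python with invented class and method names; the real OnlineSW Information Service and error-reporting interfaces are different.

```python
class InfoServer:
    """Processes publish named statistics; displays and other clients read them."""
    def __init__(self):
        self._values = {}
    def publish(self, name, value):
        self._values[name] = value
    def retrieve(self, name):
        return self._values.get(name)

class ErrorReporter:
    """Processes issue error messages that any subscribed component can show."""
    def __init__(self):
        self._subscribers = []
    def subscribe(self, callback):
        self._subscribers.append(callback)
    def report(self, severity, source, message):
        for cb in self._subscribers:
            cb(f"[{severity}] {source}: {message}")

# An Event Filter task publishes a counter and raises an error; the central
# control console would retrieve and display both.
info, errors = InfoServer(), ErrorReporter()
errors.subscribe(print)
info.publish("EFP-07.events_processed", 12345)
errors.report("WARNING", "EFP-07", "event rejected: missing RoI data")
```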
Monitoring a Farm
[Screenshot: example of an Event Filter monitoring panel.]
Scalability Tests (January 2003)
[Diagram: control tree of supervisors and controllers over the Event Filter Farm used in the tests.]
• Series of tests to determine the scalability of the control architecture
• Carried out on the 230-node IT LXPLUS cluster at CERN
• Configurations studied:
  - Constant total number of nodes split into a varying number of sub-farms
  - Constant number of sub-farms with the number of nodes per sub-farm varied
• Tests focused on the times to start up, prepare for data-taking and shut down these configurations (a rough timing sketch follows below)
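The transition times reported on the next slides were measured on the prototype itself; purely as an illustration of the kind of measurement, the Python sketch below times each control command. The send_to_farm function is a hypothetical stand-in for sending a command to the top-level farm controller.

```python
import time

def time_commands(send_command, commands):
    """Return the wall-clock time taken by each control command in sequence."""
    timings = {}
    for cmd in commands:
        t0 = time.perf_counter()
        send_command(cmd)
        timings[cmd] = time.perf_counter() - t0
    return timings

# Stand-in for sending a command to the whole control tree; in the real tests
# this was the top-level farm controller of the prototype supervision system.
def send_to_farm(cmd):
    time.sleep(0.001)  # placeholder for the actual command propagation

startup  = time_commands(send_to_farm, ("setup", "boot"))
cycle    = time_commands(send_to_farm, ("load", "configure", "start",
                                        "stop", "unconfigure", "unload"))
shutdown = time_commands(send_to_farm, ("shutdown",))
print(startup, cycle, shutdown)
```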
Generation of Configuration Database
A custom GUI was written to create the configuration database files (a toy example of such a file is sketched below).
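As a rough illustration of what such a generator produces, the Python sketch below writes a small XML description of a farm split into sub-farms. The element names, attributes and host names are invented for this sketch and are not the actual OnlineSW configuration database schema.

```python
import xml.etree.ElementTree as ET

def write_farm_config(filename, n_subfarms, nodes_per_subfarm):
    """Write a toy configuration file describing sub-farms and their nodes."""
    farm = ET.Element("farm", name="ef-farm")
    node_id = 0
    for s in range(n_subfarms):
        subfarm = ET.SubElement(farm, "subfarm", name=f"sub-farm-{s}")
        for _ in range(nodes_per_subfarm):
            ET.SubElement(subfarm, "node", host=f"lxplus{node_id:03d}")
            node_id += 1
    ET.ElementTree(farm).write(filename)

# e.g. one of the tested configurations: 10 sub-farms of 20 nodes each.
write_farm_config("farm.data.xml", n_subfarms=10, nodes_per_subfarm=20)
```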
Results – Constant number of Nodes
[Chart: Constant Farm Size (230 nodes). Time in seconds for setup, boot and shutdown versus the number of sub-farms (3, 8, 21).]
• Graph shows the times to start and stop the control infrastructure
• Times increase with the number of sub-farms
• More sub-farms mean more controller and supervisor processes
Results – Constant number of Nodes
[Chart: Constant Farm Size (230 nodes). Time in seconds for load, configure, start, stop, unconfigure and unload versus the number of sub-farms (3, 8, 21).]
• Graph shows the times to cycle through the run control sequence
• Times decrease with the number of sub-farms
• More sub-farms imply fewer nodes per sub-farm, and therefore fewer trigger processes to control per sub-farm
Results – Constant number of Sub-Farms
[Chart: Constant Number of Sub-Farms (10). Time in seconds for load, configure, start, stop, unconfigure and unload versus the number of nodes per sub-farm (5, 10, 20).]
• Times increase with increasing numbers of nodes and processes to control, as expected
Conclusions and future
• Results are very promising for the implementation of the HLT supervision system for the first ATLAS run
• All operations required to start up, prepare for data-taking and shut down configurations take of the order of a few seconds to complete
• The largest tested configurations represent 10-20% of the final system
• Future enhancements of the supervision system include:
  - Combined Run Control/Process Control component
  - Parallelised communication between control and trigger processes
  - Distributed configuration database