The Prompt Reconstruction system in BaBar


The BaBar Prompt Reconstruction
Manager: a Real Life Example of a
Constructive Approach to Software
Development.
Francesco Safai Tehrani
Istituto Nazionale di Fisica Nucleare, Sezione di Roma I
for the BaBar Prompt Reconstruction & Computing Groups
CHEP2000 7-11 Feb 2000, Padova, Italy
The Prompt Reconstruction System

The software structure of the BaBar experiment
requires all incoming data to be fully processed
within 8 hours of data taking. To achieve this,
the PR system requires:
– a high degree of automation of the processing
(automatic scheduling, fault-tolerant system)
– a sustained processing rate of 100 Hz.
Currently: peak ~55 Hz; the average processing rate can
be much lower due to startup times, run length...
For more information:
http://www.slac.stanford.edu/www/Computing/Online/PromptReco/
The Prompt Reconstruction System
(full view)
[Diagram: GFM, PRM, and PR instances with farm CPU, GFD, PRD and PRF boxes]
GFM: Global Farm Manager
PRM: Prompt Reco Manager
GFD: Global Farm Daemon
PRD: Prompt Reco Daemon
PRF: Prompt Reco Framework
What is the PRM?

Prompt Reconstruction Manager
Tasks:
– Scheduling of jobs
(user-request based or policy based)
– Bookkeeping
(Electronic logbook, Constants Block database)
– User/GFM interface
(simple command language interface)
– Automated data retrieval
(temporarily)
– Multiple instances
(to allow for parallel processing and reprocessing)
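
The scheduling task above can be illustrated with a minimal sketch. The slides do not say how the PRM arbitrates between user requests and policy decisions, so this sketch assumes that user requests are served first and that the policy only picks among runs not yet scheduled. The names `Job`, `Scheduler`, `request` and `next_job` are hypothetical, and the code is Python rather than the PERL used by the real system.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Job:
    run: int      # run number to process
    origin: str   # "user" or "policy"


class Scheduler:
    """Toy scheduler: user requests first, then a pluggable policy (assumption)."""

    def __init__(self, policy: Callable[[List[int]], int]) -> None:
        self._policy = policy                  # picks one run from the candidates
        self._requests: Deque[int] = deque()   # pending user requests, FIFO
        self._scheduled: Set[int] = set()      # runs already handed out

    def request(self, run: int) -> None:
        """Queue an explicit user request for a run."""
        self._requests.append(run)

    def next_job(self, available_runs: Iterable[int]) -> Optional[Job]:
        """Return the next job, or None when nothing is left to schedule."""
        if self._requests:
            run = self._requests.popleft()
            self._scheduled.add(run)
            return Job(run, "user")
        candidates = [r for r in available_runs if r not in self._scheduled]
        if not candidates:
            return None
        run = self._policy(candidates)
        self._scheduled.add(run)
        return Job(run, "policy")


if __name__ == "__main__":
    # Hypothetical policy: process the lowest-numbered run first.
    scheduler = Scheduler(policy=min)
    scheduler.request(12)
    for _ in range(4):
        print(scheduler.next_job([10, 11, 12]))
```

Keeping the policy as a plain callable means a different rule can be swapped in without touching the scheduler itself.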
The Prompt Reconstruction System
(PRM detailed view, current)
[Diagram: PRM, data retrieval server, bookkeeping server, “finalize” server, interface, Logging Manager, GFD]
The Necessity-driven Development Model

It is a response to various specific needs:
– Fast prototyping:
reliable working system on a short timescale.
– Flexibility:
ability to adapt to ever-changing requests and to the
lessons learned daily “in the field”.
– Design:
consistency with the Online Prompt Reconstruction (OPR) specifications.
– Maintainability:
easy to maintain even though it contains parts written with
different techniques/languages.
– Reliability:
robust design technique.
Technology (I): OOAD & iterations

“relaxed” OOAD
– use of well-known analysis and design techniques
– patterns → flexibility, reusability, robustness
– system evolution is simple and reliable
– migration to other OO languages (C++, Java) is simple

“relaxed” iteration development model
– iterations are a simple and powerful way to model the
software development cycle
but
– we need to be able to change the requests for the
current iteration as necessities arise → fast response
Technology (II): the language

use of scripting languages
– pre-existing core software: Bourne shell scripts
– scripts allow rapid prototyping

use of PERL as scripting language
PERL has several advantages:
– it is Object Oriented, thus allowing for a natural
extension of OOAD concepts
– it is widely available and distributed under an Open
Source license (GNU-like)
– it has high-quality libraries (“modules”) that make it
easy to implement powerful systems (OO socket library,
DBI database interface for SQL databases…)
– it can be easily embedded in C++/Java code
Technology (III): interface layers

The PRM system behaves like a dynamic
network of interacting “objects”:
– message-exchange communication
– complex behaviour: easy to modify

The communication between “objects” is
realized through interface layers:
– the implementation details (e.g. the implementation
language) are filtered out at the interface level
– it is easy to “evolve” parts of the system without
disrupting the existing (and working) ones
– it is easy to create “wrap” interfaces around existing
scripts, integrating them seamlessly in the system
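
As an illustration of a “wrap” interface, the sketch below presents an existing command-line script as an object that answers messages. It is not the PRM code: the `Message`, `Reply` and `ScriptWrapper` names and the message fields are hypothetical, and the example is written in Python rather than PERL. A short Python one-liner stands in for the legacy script so that the example runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    sender: str        # name of the object sending the message
    command: str       # what the receiver is asked to do
    args: List[str]    # command arguments, passed on as strings


@dataclass
class Reply:
    ok: bool           # True when the script exited with status 0
    output: str        # whatever the script wrote to standard output


class ScriptWrapper:
    """Hides an existing script behind a message-handling interface."""

    def __init__(self, name: str, argv_prefix: List[str]) -> None:
        self.name = name
        self.argv_prefix = list(argv_prefix)  # e.g. ["/bin/sh", "some_script.sh"]

    def handle(self, message: Message) -> Reply:
        # The caller never sees how (or in which language) the work is done.
        result = subprocess.run(
            self.argv_prefix + [message.command] + message.args,
            capture_output=True,
            text=True,
            check=False,
        )
        return Reply(ok=result.returncode == 0, output=result.stdout.strip())


if __name__ == "__main__":
    # Stand-in for a legacy script: it simply echoes its arguments.
    echo_script = [sys.executable, "-c",
                   "import sys; print(' '.join(sys.argv[1:]))"]
    wrapper = ScriptWrapper("echo", echo_script)
    print(wrapper.handle(Message("prm", "status", ["run", "42"])))
```

Because callers only depend on `handle`, the wrapped script could later be replaced by a native implementation without changing the rest of the system.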
Technology (IV): the future

Communication layer:
– current: TCP/IP
– future: CORBA

Language evolution
– current: PERL and Bourne shell scripts
– future: C++/Java/PERL

Dynamic resizing or partitioning of the
processing farm

Better protection/recovery mechanisms
(watchdogs, self-healing software...)
An Example of Client-Server Interaction
[Diagram: client → “local” query → proxy → “remote” query → server]

This architecture provides a high degree of
isolation from the communication layer:
– “smart” proxy → efficient use of resources, abstraction
– “painless” to switch from TCP/IP to CORBA
– the proxy can be as “smart” as needed
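
The isolation described above can be sketched as a proxy that talks to the server only through a small transport interface. Everything here is an assumption made for illustration: the `Transport`, `TcpTransport`, `LoopbackTransport` and `Proxy` names, the one-line JSON wire format, and the choice of reply caching as the “smart” behaviour. The real system is written in PERL and its protocol is not described in the slides.

```python
import json
import socket
from typing import Callable, Dict, Protocol


class Transport(Protocol):
    """Anything that can carry a request to the server and return its reply."""

    def send(self, request: dict) -> dict: ...


class TcpTransport:
    """One JSON message per connection over TCP/IP (hypothetical wire format)."""

    def __init__(self, host: str, port: int) -> None:
        self.host, self.port = host, port

    def send(self, request: dict) -> dict:
        with socket.create_connection((self.host, self.port)) as sock:
            sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
            with sock.makefile("r", encoding="utf-8") as reader:
                return json.loads(reader.readline())


class LoopbackTransport:
    """Calls a handler in-process; stands in for any other communication layer."""

    def __init__(self, handler: Callable[[dict], dict]) -> None:
        self.handler = handler
        self.calls = 0  # number of "remote" queries actually made

    def send(self, request: dict) -> dict:
        self.calls += 1
        return self.handler(request)


class Proxy:
    """Answers "local" queries, issuing a "remote" query only when needed."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._cache: Dict[int, dict] = {}

    def query(self, run: int) -> dict:
        if run not in self._cache:
            self._cache[run] = self._transport.send({"op": "info", "run": run})
        return self._cache[run]


if __name__ == "__main__":
    def fake_server(request: dict) -> dict:
        return {"run": request["run"], "state": "done"}

    transport = LoopbackTransport(fake_server)
    proxy = Proxy(transport)
    proxy.query(7)
    proxy.query(7)  # served from the proxy cache, no second remote query
    print(transport.calls)
```

The client code depends only on `Proxy.query`, so replacing `TcpTransport` with another transport class leaves it untouched.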
Current Status of the PRM

It has been successfully “in production” since
July 1999.
It has proven to be robust and reliable, requiring
little or no human intervention.
Its flexible design adapts easily to ever-changing
requests and fulfills the needs of the Online
Prompt Reconstruction project.
The watchdog is being implemented and will go
into production soon.
The multiple-instance version of the PRM was
recently released and is being tested.
The current structure of the PRM
[Diagram: PRM, data retrieval server, bookkeeping server, “finalize” server, interface, Logging Manager, GFD]
Some Data about BaBar

Input data from DAQ:
– ~30 KB/event at 100 Hz

Farm Architecture:
– Sun Ultra 5 → Sun T1
– 100-200 single CPU machines
– 256 MB RAM
– 100 Mb/s Ethernet
– 24 Objectivity servers (Sun E450)
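
The figures above imply a few back-of-the-envelope numbers, worked out in the short sketch below. It assumes 1 KB = 1000 bytes, continuous data taking at the nominal rate, and events spread evenly over the farm. None of these assumptions is stated in the slides, so the results are only indicative.

```python
EVENT_SIZE_KB = 30      # ~30 KB per event (from the slide)
EVENT_RATE_HZ = 100     # nominal input rate (from the slide)
FARM_SIZES = (100, 200) # 100-200 single-CPU machines (from the slide)


def input_rate_mb_per_s(event_size_kb: float, rate_hz: float) -> float:
    """Input data rate in MB/s, taking 1 MB = 1000 KB."""
    return event_size_kb * rate_hz / 1000


def volume_gb(rate_mb_per_s: float, hours: float) -> float:
    """Data volume in GB accumulated over the given number of hours."""
    return rate_mb_per_s * hours * 3600 / 1000


def seconds_per_event_per_node(rate_hz: float, nodes: int) -> float:
    """CPU time budget per event on each node, assuming an even spread."""
    return nodes / rate_hz


if __name__ == "__main__":
    rate = input_rate_mb_per_s(EVENT_SIZE_KB, EVENT_RATE_HZ)
    print(f"input rate: {rate:.1f} MB/s")
    print(f"volume in 8 hours: {volume_gb(rate, 8):.1f} GB")
    for nodes in FARM_SIZES:
        budget = seconds_per_event_per_node(EVENT_RATE_HZ, nodes)
        print(f"{nodes} nodes: {budget:.1f} s per event per node")
```

Under these assumptions the input stream is about 3 MB/s, and each node has between 1 and 2 seconds of CPU time per event to keep up with 100 Hz.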