Data Handling in KLOE

CHEP 2000: 7-11 February 2000
I. Sfiligoi
INFN LNF, Frascati, Italy

The KLOE experiment
• at DAFNE φ-factory
• main goal:
  • CP violation study
• other interesting fields:
  • kaon form factors
  • kaon rare decays
  • radiative φ decays
• example K_S K_L decay channels:
  • K_S → π+π−, K_L → π+π− (CP not conserved)
  • K_S → π+π−, K_L → 3π0 → 6γ

KLOE Requirements
• Data acquisition (at full DAFNE luminosity)
  • 10^11 events per year acquired
  • 50 MB/s sustained throughput
• Computing power
  • ALL the events need to be reconstructed
• Storage requirements
  • one petabyte of raw and reconstructed events
  • hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)
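
The figures above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal units and, as a further assumption not stated on the slide, that the petabyte corresponds to one year of events.

```python
# Back-of-the-envelope check of the requirement figures quoted above.
# Assumptions (not from the slides): decimal units (1 MB = 1e6 bytes,
# 1 PB = 1e15 bytes) and that the petabyte corresponds to one year of events.
EVENTS_PER_YEAR = 1e11
THROUGHPUT_BYTES_PER_S = 50e6
STORAGE_BYTES = 1e15

# Average stored size per event (raw plus reconstructed).
bytes_per_event = STORAGE_BYTES / EVENTS_PER_YEAR

# Time needed to write one petabyte at the sustained DAQ throughput.
seconds_to_fill = STORAGE_BYTES / THROUGHPUT_BYTES_PER_S
days_to_fill = seconds_to_fill / 86400

print(f"average size per stored event: {bytes_per_event / 1e3:.0f} kB")
print(f"time to write 1 PB at 50 MB/s: {days_to_fill:.0f} days")
```

Under these assumptions the three numbers are mutually consistent: roughly 10 kB per stored event, and most of a year of continuous writing to reach one petabyte.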

KLOE computing environment
• Based on a set of medium-sized servers
• Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet)
• Heterogeneous environment, several platforms:
  • IBM AIX on PowerPC
  • Sun Solaris on Sparc
  • Compaq Tru64 Unix on Alpha
  • HP-UX on PA-RISC

KLOE storage pool
• Different policies for different types of data:
  • raw and reconstructed events on tape libraries, with big disk pools for data caching
  • related data managed by a disk based database system
  • analysis output on disk pools

Disk pools
• Four categories of disk pools are present:
  • each data acquisition node in the farm has its own small disk pool
  • computing nodes write their output to centralized, NFS mounted disk pools
  • separate disk pools are used as a cache for the events on tape
  • analysis output is written to its own, central AFS mounted disk pool

Tape library
• Several automated tape libraries supported (at the moment the 5500 slot tape library is partitioned between two tape servers)
• Accessed using commercial software
  • IBM ADSM with the current tape library

KLOE software
• Three distinct categories
  • DAQ (or online): ANSI C
  • reconstruction and analysis (or offline): FORTRAN inside A_C
  • Monte Carlo: FORTRAN
• The interface to the Data Handling System must be compatible with all of them

KLOE Data Handling System
• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)

KLOE Data Handling System
• A mix of commercial and custom software
  • commercial software carries out all the vital functions
  • custom software mostly extends and coordinates the functionality of the commercial software
• The dependency on commercial software is minimized by the layers of custom software

KLOE Data Handling System
• Based on a set of multi-threaded non-privileged daemons and related libraries
• Distributed across several nodes
• Communication by means of TCP/IP sockets on high ports
  • bypasses TCP/IP filtering
  • flexible, programming language and operating system independent
  • no configuration needed on the client side
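
As an illustration of the daemon model described above, here is a minimal sketch of a multi-threaded, non-privileged TCP server and its client. The one-line text protocol, the handler name and the `OK` reply are assumptions made for this example; the slides do not describe the actual KLOE wire format.

```python
import socket
import socketserver
import threading


class CommandHandler(socketserver.StreamRequestHandler):
    """Handle one client connection: read one line, answer with one line."""

    def handle(self):
        command = self.rfile.readline().decode("utf-8").strip()
        # Hypothetical reply format; the real protocol is not given in the slides.
        self.wfile.write(f"OK {command}\n".encode("utf-8"))


def start_daemon(host="127.0.0.1", port=0):
    """Start a threaded server on a non-privileged port (0 lets the OS choose)."""
    server = socketserver.ThreadingTCPServer((host, port), CommandHandler)
    server.daemon_threads = True  # one thread per connection, none block shutdown
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(host, port, command):
    """Client side: nothing is needed besides the host name and the port."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        return conn.makefile("r", encoding="utf-8").readline().strip()


if __name__ == "__main__":
    server = start_daemon()
    host, port = server.server_address
    print(send_command(host, port, "ping"))
    server.shutdown()
    server.server_close()
```

Because the client only opens a plain socket and exchanges text, the same daemon could be reached from C or FORTRAN code through a thin library, which is the kind of language independence the slide refers to.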

Database System
• Two distinct database systems are used
  • offline database system
    • based on HepDB
    • data stored as ZEBRA banks
  • online database system
    • based on a Relational DBMS, extended for distributed environments
    • data are structured in fields

Online Database System
• data stored in a Relational DBMS
  • IBM DB2 Universal Database at the moment
• communication between the clients (user applications) and the RDBMS through a database daemon
[Diagram: several applications (app) reach the RDBMS through the database daemon (DD)]

Database Daemon
• The database daemon is the only link between the applications and the RDBMS
  • if the RDBMS is changed in the future, only the database daemon will need to be changed
• Different kinds of commands are managed by the daemon
  • general SQL commands
  • KLOE specific commands

Database Daemon
• Different kinds of commands are managed by the daemon
  • general SQL commands
    • passed directly to the RDBMS
    • example: select run_nr from run_logger where status = 'OK'
  • KLOE specific commands
    • managed by the daemon itself
    • the RDBMS is used to retrieve and store data needed by the daemon itself
    • example: log that I am starting to process the file for run 3
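
The two command kinds can be sketched as a small dispatcher. This is not the KLOE daemon: it uses SQLite as a stand-in for the RDBMS, and the `SQL` and `LOG_START` verbs and the `processing_log` table are hypothetical names invented for the example. Only the `run_logger` query comes from the slide.

```python
import sqlite3


class DatabaseDaemon:
    """Toy dispatcher: SQL goes to the RDBMS, specific commands are handled here."""

    def __init__(self, connection):
        self.db = connection
        # Hypothetical bookkeeping table used by the daemon itself.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS processing_log (run_nr INTEGER, action TEXT)"
        )

    def handle(self, command):
        verb, _, argument = command.partition(" ")
        if verb.upper() == "SQL":
            # General SQL command: passed directly to the RDBMS.
            return self.db.execute(argument).fetchall()
        if verb.upper() == "LOG_START":
            # Hypothetical specific command: validated and stored by the daemon.
            run_nr = int(argument)
            self.db.execute(
                "INSERT INTO processing_log VALUES (?, ?)", (run_nr, "start")
            )
            return [("logged", run_nr)]
        raise ValueError(f"unknown command: {command!r}")


def make_demo_daemon():
    """Build a daemon on an in-memory database with a few illustrative runs."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE run_logger (run_nr INTEGER, status TEXT)")
    db.executemany(
        "INSERT INTO run_logger VALUES (?, ?)", [(1, "OK"), (2, "BAD"), (3, "OK")]
    )
    return DatabaseDaemon(db)


if __name__ == "__main__":
    daemon = make_demo_daemon()
    print(daemon.handle("SQL select run_nr from run_logger where status = 'OK'"))
    print(daemon.handle("LOG_START 3"))
```

Since applications only ever see the command strings, replacing SQLite with another backend would change the daemon class and nothing else.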

Database Daemon
• The use of KLOE specific commands has several advantages
  • additional checks and restrictions are possible
  • data consistency management is centralized
  • fast central caches can be implemented
    • for example, the DAQ configuration cache reduces the typical access time from 4 to 0.1 s
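
A central cache of this kind might look like the read-through cache below. It is a generic sketch, not the KLOE implementation: the class, the `slow_config_lookup` function and its artificial delay are all assumptions chosen to show the idea.

```python
import time


class ConfigCache:
    """Read-through cache placed in front of a slow lookup function."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._entries = {}
        self.misses = 0  # number of times the slow path was taken

    def get(self, key):
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._slow_lookup(key)
        return self._entries[key]

    def invalidate(self, key):
        """Drop an entry, for example after a new configuration is stored."""
        self._entries.pop(key, None)


def slow_config_lookup(name):
    # Stand-in for an expensive RDBMS query; the delay is illustrative only.
    time.sleep(0.05)
    return {"name": name, "version": 1}


if __name__ == "__main__":
    cache = ConfigCache(slow_config_lookup)
    cache.get("daq")   # slow: goes to the lookup function
    cache.get("daq")   # fast: served from memory
    print("slow lookups performed:", cache.misses)
```

Such a cache is only safe because every write goes through the same daemon, which therefore knows exactly when an entry must be invalidated.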

A light version
• The RDBMS is used to ensure flexibility, reliability and performance
• Demanding in terms of computing resources and management effort
  • stand-alone environments often cannot afford it
• An RDBMS-independent version of the database daemon is under development

A light version
• An RDBMS-independent version of the database daemon is under development
  • limited to KLOE specific and the most frequently used SQL commands
  • based on use of flat files containing a small portion of the data
  • not suitable for a production environment, but enough for home use

KLOE Archiving System
• Expected event data managed by KLOE
  • 1 PB
• Tape libraries needed
  • data storage and retrieval non trivial
  • random access to data very inefficient
• Disk-based intermediate buffers used

KLOE Archiving System
• Two types of intermediate buffers
  • DAQ, offline and Monte Carlo output are structured as YBOS files and written on their disk output areas
  • event data needed by offline as input are read from the archiving system disk-cache

KLOE Archiving System
• Data needs to be migrated
  • from output areas to the tape library
    • as soon as possible (taking into account also efficiency concerns)
  • from the tape library to the disk cache
    • when an application needs it (or even better, a bit earlier)
• Migration is totally automated and transparent to the applications

KLOE Archiving System
• The Archiving System is made of four components
  • storage managers (archADSM)
  • disk space managers
    • output areas (spacekeeper)
    • cache areas (filekeeper)
  • archival director (archiver)
  • cache manager (retrieve)
• Communication by means of TCP/IP sockets
• Coordinated by the online database

Storage Managers
• One for each logical tape library
• Allows
  • queries about tape library content
  • file archival
  • file retrieval
• Transaction oriented (if the underlying tape library software supports it)

Storage Managers
• The only link between the tape library and the rest of the system
  • interface independent of the underlying archiving software
  • IBM ADSM is used with the current tape library
  • if other products are used in the future, only a specific storage manager will need to be developed
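
The idea of a backend-independent storage manager can be illustrated with an abstract interface and one stand-in implementation. The class and method names below are hypothetical and the backend simply keeps files in memory; a real storage manager would wrap the tape library software instead.

```python
from abc import ABC, abstractmethod


class StorageManager(ABC):
    """Backend-independent interface; the method names are illustrative."""

    @abstractmethod
    def list_files(self):
        """Return the names of the files held by the library."""

    @abstractmethod
    def archive(self, name, data):
        """Store a new file in the library."""

    @abstractmethod
    def retrieve(self, name):
        """Return the content of a stored file."""


class InMemoryStorageManager(StorageManager):
    """Stand-in backend used only to exercise the interface."""

    def __init__(self):
        self._files = {}

    def list_files(self):
        return sorted(self._files)

    def archive(self, name, data):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = bytes(data)

    def retrieve(self, name):
        return self._files[name]


def archive_all(manager, files):
    """Caller code depends only on the StorageManager interface."""
    for name, data in files.items():
        manager.archive(name, data)
    return manager.list_files()
```

Supporting a different tape product would then mean writing one more subclass, while `archive_all` and every other caller stay unchanged.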

Disk Space Managers
• One for each disk pool
• Create and delete files
  • unused files get deleted to make space for new ones
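
One common way to decide which unused files to delete is a least-recently-used policy. The slides do not say which policy KLOE uses, so the sketch below is an assumption, and the `DiskPool` class is invented for the example.

```python
from collections import OrderedDict


class DiskPool:
    """Toy disk pool that evicts the least recently used files when full."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._files = OrderedDict()  # name -> size, least recently used first

    def touch(self, name):
        """Mark an existing file as recently used."""
        self._files.move_to_end(name)

    def create(self, name, size):
        """Reserve space for a new file, deleting unused files if necessary."""
        if size > self.capacity:
            raise ValueError("file larger than the whole pool")
        deleted = []
        while self.used + size > self.capacity:
            victim, victim_size = self._files.popitem(last=False)
            self.used -= victim_size
            deleted.append(victim)
        self._files[name] = size
        self.used += size
        return deleted


if __name__ == "__main__":
    pool = DiskPool(capacity_bytes=100)
    pool.create("a", 40)
    pool.create("b", 40)
    pool.touch("a")               # "a" was read recently, "b" was not
    print(pool.create("c", 40))   # lists the files deleted to make room
```

A production manager would also have to refuse to delete files that are still open or, for output areas, not yet archived; that bookkeeping is left out here.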

Archival Director
• Fully automated
• Works in polling mode
  • from time to time looks for files ready to be archived
  • starts archiving only when enough data is available
• Files are ordered and grouped to minimize the expected retrieve time
• Several groups of files can be archived in parallel
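
The decision taken at each polling cycle can be sketched as a pure function. Grouping by run number and the size threshold are assumptions made for this example; the slides only state that files are ordered and grouped and that archiving waits for enough data.

```python
from collections import defaultdict


def plan_archival(ready_files, min_group_bytes):
    """Group ready files by run and return the groups large enough to archive.

    ready_files: iterable of (name, run_nr, size_bytes) tuples.
    Grouping by run number is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for name, run_nr, size in ready_files:
        groups[run_nr].append((name, size))

    plan = []
    for run_nr in sorted(groups):
        files = sorted(groups[run_nr])  # stable order within a group
        total = sum(size for _, size in files)
        if total >= min_group_bytes:
            plan.append((run_nr, [name for name, _ in files]))
    return plan


if __name__ == "__main__":
    ready = [("r5_b", 5, 60), ("r6_a", 6, 30), ("r5_a", 5, 50)]
    for run_nr, names in plan_archival(ready, min_group_bytes=100):
        print("archive run", run_nr, "files", names)
```

Each returned group is independent of the others, so a polling loop could hand the groups to separate workers and archive them in parallel.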

Cache Manager
• User driven
  • when a file is needed, the application asks the cache manager where it is located
  • a retrieve is performed by the manager if needed
• Several requests can be issued at the same time
  • the manager reorders them internally to minimize the tape mounts
• Communication by means of TCP/IP sockets
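
Reordering requests to limit tape mounts can be as simple as grouping the pending files by the tape that holds them. The function below is a sketch under that assumption; the file-to-tape mapping is passed in as a plain dictionary, whereas in the real system it would presumably come from the database.

```python
from collections import defaultdict


def order_requests(requests, tape_of, cached):
    """Return (cache_hits, retrieves) for a batch of requested file names.

    tape_of: mapping file name -> tape label (hypothetical lookup table).
    cached:  set of file names already present in the disk cache.
    Retrieves are grouped so that each tape has to be mounted only once.
    """
    hits = [name for name in requests if name in cached]
    by_tape = defaultdict(list)
    for name in requests:
        if name not in cached:
            by_tape[tape_of[name]].append(name)
    retrieves = [(tape, sorted(names)) for tape, names in sorted(by_tape.items())]
    return hits, retrieves


if __name__ == "__main__":
    tape_of = {"f1": "T2", "f2": "T1", "f3": "T2", "f4": "T1"}
    hits, retrieves = order_requests(["f1", "f2", "f3", "f4"], tape_of, {"f4"})
    print("already cached:", hits)
    for tape, names in retrieves:
        print("mount", tape, "and read", names)
```

In the demo, three files are missing from the cache but only two mounts are planned, because two of them sit on the same tape.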

KLOE Archival System
[Diagram. Components shown: Tape Library (two instances), archADSM (two instances), archiver, retrieve, spacekeeper and filekeeper with their Disk Pools, DB. Link types in the legend: NFS mount, local file system, TCP/IP socket.]

Spy System
• KLOE data acquisition software allows the event data to be read-out before they get written to disk
• The mechanism that reads those data is called Spy
• Based on use of shared memory buffers
  • DAQ processes are piped using this mechanism
  • the spy system reads data from the buffers without interfering with the DAQ
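
The non-interfering read can be pictured with a circular buffer in which the writer never waits and each reader keeps a private cursor. The sketch below is a single-process stand-in, not shared memory, and the `EventRing` and `Spy` classes are invented for the illustration; a real implementation would also need synchronization between processes.

```python
class EventRing:
    """Fixed-size circular buffer; the writer never waits for readers."""

    def __init__(self, slots):
        self._slots = [None] * slots
        self.written = 0  # total number of events written so far

    def write(self, event):
        self._slots[self.written % len(self._slots)] = event
        self.written += 1


class Spy:
    """Reader with a private cursor; it skips events that were overwritten."""

    def __init__(self, ring):
        self._ring = ring
        self._next = ring.written  # start from the next event to be written
        self.lost = 0              # events overwritten before they were read

    def read_available(self):
        ring = self._ring
        size = len(ring._slots)
        oldest = max(0, ring.written - size)
        if self._next < oldest:
            # The writer lapped this reader: count the loss and catch up.
            self.lost += oldest - self._next
            self._next = oldest
        events = [ring._slots[i % size] for i in range(self._next, ring.written)]
        self._next = ring.written
        return events


if __name__ == "__main__":
    ring = EventRing(slots=4)
    spy = Spy(ring)
    for event in range(10):
        ring.write(event)
    print(spy.read_available(), "lost:", spy.lost)
```

The design choice is that a slow spy loses events rather than slowing down the writer, which matches the requirement that monitoring must not disturb data acquisition.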

KLOE Integrated Dataflow (KID)
• Integration library
  • database accesses and retrieve operations hidden
• Offers a single point of access to all the services
  • URI-based selection
    • spy:/buffer
      open a spy channel and pass the events to the application
    • datarec:(run_nr=5000) and (stream='ksl')
      read the list from DB, ask the cache manager for the files, pass the events from the files to the application
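
The URI-based selection can be illustrated by a small parser that splits the scheme from the selection and maps each scheme to the steps listed above. The function names are hypothetical and no real I/O is performed; only the two example URIs come from the slide.

```python
def parse_kid_uri(uri):
    """Split a KID-style URI into (scheme, selection)."""
    scheme, separator, selection = uri.partition(":")
    if not separator or not scheme:
        raise ValueError(f"not a KID URI: {uri!r}")
    return scheme, selection


def describe_source(uri):
    """Map a URI to the steps named on the slide (no real I/O is done here)."""
    scheme, selection = parse_kid_uri(uri)
    if scheme == "spy":
        return ["open spy channel " + selection, "pass events to application"]
    if scheme == "datarec":
        return [
            "query database for files matching " + selection,
            "ask cache manager for the files",
            "pass events from the files to application",
        ]
    raise ValueError(f"unsupported scheme: {scheme!r}")


if __name__ == "__main__":
    for uri in ("spy:/buffer", "datarec:(run_nr=5000) and (stream='ksl')"):
        print(uri)
        for step in describe_source(uri):
            print("  -", step)
```

Keeping the source selection in a single string means an application can switch between live and archived data without any code change, which is the point of offering one access library.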

Management effort
• The entire system is managed by only a few people:
  • 3 people (2 full time) are engaged in KLOE computing system management (including storage)
  • 1 person is engaged in the development and management of the online database and the archiving system
  • 2 people spend a few percent of their time on the maintenance of the offline database