dag_wroclaw - Indico


Data production and virtualisation status
Dag Toppe Larsen
Wrocław, 2013-10-07
Outline

Data production status

Virtualisation status


Production database

Data production script

Web interface

Plan forward
Proposal for new production directory structure
Data production status

Data production team: Dag, Bartek, Kevin
So far this year:
17 mass productions
72 test productions
Castor sometimes slow and/or unresponsive
Typically lasts for a couple of days, then gets better
Also a problem for checking produced data, since nsls also hangs
Have contacted Castor support, but problem often goes away before properly diagnosed
Virtualisation status

Have requested and obtained new “NA61” project on final Lxcloud service
Same quota (200 VCPUs/instances) as before
Access controlled by new e-group “na61-cloud”
Migration completed
A few minor issues had to be worked out with IT
Latest software versions (13e legacy, v0r5p0) installed on CVMFS
Mass production of BeBe160 has started
Next step
Production DB

Production DB has grown a bit beyond what was originally intended
Complicated to access information from Castor and bookkeeping DB
Elog data not always consistent (needs to be standardised)
Elog data needed as input for data production (magnetic field)
Difficult to work with the production information without a proper SQL database
Created a sqlite DB with three tables: runs, productions and chunkproductions
Production DB schema

runs
All information for given run
Information imported from elog via bookkeeping DB
Primary key: run
Fields target, beam, momentum, year obtained from elog
runs table

Contains all information for given run
All elog information for run is imported
Elog information is used to fill the fields target, beam, momentum, year and magnet
Some normalising required to reduce elog entropy
Separate elog_* fields contain elog entries as extracted from elog
Can be used with SQL select queries, but entropy sometimes makes this challenging
Could be interesting to add additional fields to the table to contain “standardised” elog values, which would be easier to select on
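A minimal sketch of the kind of normalising meant here, assuming that trailing whitespace and empty-like entries (empty string, “None”, “No beam”) are the main sources of entropy. The helper names and the decision to map empty-like entries to None are assumptions for illustration, not the behaviour of the actual production scripts.

```python
from typing import Optional, Tuple

# Entries treated as "nothing recorded"; this set is an assumption.
_EMPTY_VALUES = {"", "none", "no beam"}


def normalise_elog_value(raw: Optional[str]) -> Optional[str]:
    """Strip surrounding whitespace and map empty-like entries to None."""
    if raw is None:
        return None
    value = raw.strip()
    if value.lower() in _EMPTY_VALUES:
        return None
    return value


def normalise_target(raw: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split an entry such as 'Be target IN ' into (material, position)."""
    value = normalise_elog_value(raw)
    if value is None:
        return (None, None)
    parts = value.split()
    # Expected shape: '<material> target <IN|OUT>'; anything else is kept whole.
    if len(parts) == 3 and parts[1].lower() == "target":
        return (parts[0], parts[2].upper())
    return (value, None)
```

Storing the results in additional “standardised” columns would leave the original elog_* entries untouched.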
runs.beam


Contains the beam type for given run
Derived from elog_beam_type
Not too much entropy (see output below), but some standardisation required
Used to determine “reaction”
sqlite3 prod.db "select elog_beam_type, count(*) from runs group by elog_beam_type"
|369
Be|1209
Be |15
K-|89
No beam|6
None|314
Pb|31
runs.target


Contains the target type of given run
Derived from elog_target_type
Some entropy (see output below), standardisation required
Field used to determine “reaction”
sqlite3 prod.db "select elog_target_type, count(*) from runs group by elog_target_type"
|369
2C target IN|116
2C target OUT|34
Be target IN|1003
Be target IN |11
Be target OUT|203
Be target OUT |4
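The trailing-space variants in the target output (for example “Be target IN” and “Be target IN ”) can be merged at query time with SQLite's built-in trim() function. The sketch below uses an in-memory database with a handful of made-up rows; only the table name runs and the column elog_target_type come from the slides, and the helper name is hypothetical.

```python
import sqlite3

# In-memory stand-in for prod.db; the rows are illustrative, not real run data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run INTEGER PRIMARY KEY, elog_target_type TEXT)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [
        (1, "Be target IN"),
        (2, "Be target IN "),   # trailing space, as seen in the elog
        (3, "Be target OUT"),
        (4, "Be target OUT "),  # trailing space, as seen in the elog
        (5, ""),
    ],
)


def count_by_target(connection):
    """Count runs per target type, ignoring leading/trailing whitespace."""
    query = (
        "SELECT trim(elog_target_type), count(*) "
        "FROM runs "
        "GROUP BY trim(elog_target_type) "
        "ORDER BY 1"
    )
    return connection.execute(query).fetchall()
```

This only hides the entropy when selecting; it does not replace standardising the stored values.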
runs.momentum


Contains beam momentum for given run
Derived from elog_beam_momentum
Some entropy (see output below), but not too much
Assume 30GeV!=31GeV
Assume 75GeV!=80GeV
sqlite3 prod.db "select elog_beam_momentum, count(*) from runs group by elog_beam_momentum"
|372
0 GeV/c|313
10 GeV/c|15
100 GeV/c|14
120 GeV/c|161
13 GeV/c|1137
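As a sketch of how the elog momentum strings could be turned into numbers, assuming they all follow the “<number> GeV/c” shape seen in the query output (the function name and the handling of unparsable entries are assumptions). In line with the assumptions on the slide, no rounding or merging of nearby values is done, so 30 and 31 stay distinct.

```python
import re
from typing import Optional

# Accepts '13 GeV/c', '30GeV', '120 gev/c'; anything else is rejected.
_MOMENTUM_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*GeV(?:/c)?\s*$", re.IGNORECASE)


def parse_momentum(raw: Optional[str]) -> Optional[float]:
    """Return the momentum in GeV/c, or None if the entry cannot be parsed."""
    if not raw:
        return None
    match = _MOMENTUM_RE.match(raw)
    if match is None:
        return None
    return float(match.group(1))
```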
productions table


A unique combination of target, beam, momentum, year, key, legacy, shine, mode, os, source, type is a production
Primary key: production
Auto-generated unique number

production: e.g. 1

target: e.g. Be

beam: e.g. Be

momentum: e.g. 158

year: e.g. 11

key: e.g. 040

legacy: e.g. 13c

shine: e.g. v0r5p0

mode: e.g. pp
chunkproductions table



Stores all chunks produced
Associated to production, run and chunk
production: e.g. 1
run: e.g. 123456
chunk: e.g. 123
Has potential to contain order of 10^6 rows
By far largest table in DB
Potential performance issue
rerun: number of times chunk has failed and been reprocessed
status: waiting / processing / checking / ok / failed (numeric values)
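The three tables described in these slides could be declared roughly as follows. This is a sketch under assumptions, not the actual prod.db schema: the column types, the composite primary key on chunkproductions, the numeric status codes and the function name are all guesses for illustration. Only the table and field names come from the slides.

```python
import sqlite3

# Numeric codes for the status field; the actual values are not given in the talk.
STATUS_CODES = {"waiting": 0, "processing": 1, "checking": 2, "ok": 3, "failed": 4}

SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    run INTEGER PRIMARY KEY,
    target TEXT, beam TEXT, momentum TEXT, year TEXT, magnet TEXT,
    elog_target_type TEXT, elog_beam_type TEXT, elog_beam_momentum TEXT
);
CREATE TABLE IF NOT EXISTS productions (
    production INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT, beam TEXT, momentum TEXT, year TEXT, "key" TEXT,
    legacy TEXT, shine TEXT, mode TEXT, os TEXT, source TEXT, type TEXT,
    -- a production is a unique combination of these eleven fields
    UNIQUE (target, beam, momentum, year, "key", legacy, shine, mode, os, source, type)
);
CREATE TABLE IF NOT EXISTS chunkproductions (
    production INTEGER NOT NULL REFERENCES productions(production),
    run INTEGER NOT NULL REFERENCES runs(run),
    chunk INTEGER NOT NULL,
    rerun INTEGER NOT NULL DEFAULT 0,
    status INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (production, run, chunk)
);
"""


def create_production_db(path=":memory:"):
    """Open (or create) the database file and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

With on the order of 10^6 rows in chunkproductions, the composite primary key doubles as an index for lookups by production, run and chunk.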
Magnetic field

Originally planned to store this information in a separate field in the runs table (extracted from elog)
Needed for KEY5 and residual corrections
However, Seweryn has now added this information to the same database as the global key
(but not part of global key)
Working on integrating this information into production scripts
Will make automatic data production much simpler
Database

Currently using sqlite
Pro:
DB contained in single file on file system
No need to set up database
Everybody can easily access it with custom SQL queries
Open format/code, we “really” own the data
Con:
Not sure if performance will be an issue
Backup via normal file system backup
Have also tried central Oracle database (na61_cloud@pdbr1)
Automated data production script commands
./prodna61-produce.sh
Usage:
./prodna61-produce.sh <command>
<command> one of:
reactions - list all reactions in database
productions - list all productions in database
./prodna61-produce.sh <command> <path_in>
<command> one of:
regreaction - register all
Data production command usage

prodna61-produce.sh regreaction /afs/cern.ch/11/Be/Be160
Will register all runs found at the path in the runs table
Obtains run information from bookkeeping database/elog
Only has to be done one time per reaction (path)
prodna61-produce.sh regproduction Be Be 158 11 040 13e v0r5p0 pp phys def "A new prod."
Creates a new production in the productions table, and inserts a new row in the chunkproductions table for each of the chunks of the reaction
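The effect of regproduction on the database can be sketched as below. The function and table-creation helper are hypothetical, the list of (run, chunk) pairs is taken as an input because the slides do not say how it is obtained, and the mapping of the command's positional arguments onto the os, source and type fields is not spelled out in the talk, so placeholder values are used for those in the usage note.

```python
import sqlite3

PRODUCTION_FIELDS = ("target", "beam", "momentum", "year", "key",
                     "legacy", "shine", "mode", "os", "source", "type")


def create_tables(conn):
    """Minimal stand-ins for the productions and chunkproductions tables."""
    columns = ", ".join(f'"{name}" TEXT' for name in PRODUCTION_FIELDS)
    conn.execute(
        f"CREATE TABLE productions (production INTEGER PRIMARY KEY, {columns})"
    )
    conn.execute(
        "CREATE TABLE chunkproductions (production INTEGER, run INTEGER, "
        "chunk INTEGER, rerun INTEGER DEFAULT 0, status INTEGER DEFAULT 0)"
    )


def register_production(conn, params, chunks):
    """Insert one production row and one chunkproductions row per (run, chunk)."""
    names = ", ".join(f'"{name}"' for name in PRODUCTION_FIELDS)
    marks = ", ".join("?" for _ in PRODUCTION_FIELDS)
    with conn:  # one transaction: either everything is registered or nothing
        cursor = conn.execute(
            f"INSERT INTO productions ({names}) VALUES ({marks})",
            [params[name] for name in PRODUCTION_FIELDS],
        )
        production = cursor.lastrowid
        conn.executemany(
            "INSERT INTO chunkproductions (production, run, chunk) VALUES (?, ?, ?)",
            [(production, run, chunk) for run, chunk in chunks],
        )
    return production
```

A call might look like `register_production(conn, {"target": "Be", "beam": "Be", "momentum": "158", "year": "11", "key": "040", "legacy": "13e", "shine": "v0r5p0", "mode": "pp", "os": "<os>", "source": "<source>", "type": "<type>"}, [(123456, 1), (123456, 2)])`, where the last three values are placeholders.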
Data production script status

Can in most cases produce data
Further work on standardisation of elog data needed
Sometimes reactions have “specialities” that have to be taken into account
Lxbatch and CernVM versions have diverged a bit, need to be (re-)unified
Could be nice to use key-value pairs of parameters
Need to add possibility to process range of runs (test productions)
Not expected to be difficult
Web interface (prototype)

Web interface to production DB
http://na61cld.web.cern.ch/na61cld/cgibin/start?reaction=Be|Be|158|11
Experimenting with best interface/usability for different use cases
Make it “intuitive” and easy to use
Have not put effort into making it “look” good
But should be easy to do, relies on style sheets (CSS) for design
Think “reaction” and “production” are the main entities to build around
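The reaction parameter in the URL above packs four values separated by “|”. A sketch of how a script could unpack it, assuming the order is target, beam, momentum, year (the slides do not state the order explicitly, and the function name is hypothetical):

```python
from urllib.parse import parse_qs, urlsplit

REACTION_FIELDS = ("target", "beam", "momentum", "year")  # assumed order


def parse_reaction(url):
    """Extract the reaction fields from a URL with a 'reaction=a|b|c|d' query."""
    query = parse_qs(urlsplit(url).query)
    values = query["reaction"][0].split("|")
    if len(values) != len(REACTION_FIELDS):
        raise ValueError(f"expected {len(REACTION_FIELDS)} fields, got {len(values)}")
    return dict(zip(REACTION_FIELDS, values))
```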
General plan forward



Complete CernVM BeBe160 test production
Finish outstanding issues with automatic production script
Finalise web interface for data production
Add functionality, improve performance
Proposal for production directory structure (after moving to Shine)
Preferably, all unique production parameters should be encoded in the path to avoid conflicts
A deep directory structure is however undesirable
Proposal: divide directory path into four levels: “type”, “reaction”, “reconstruction conditions” and “file type”

/castor/cern.ch/na61/
<type>/
<target>_<beam>_<momentum>_<year>/
<key>_<shine>_<mode>_<os>_<source>/
<file_type>/
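The proposed four-level layout can be sketched as a small path builder. The function name, argument names and the validation rule are assumptions; only the root and the ordering of the components come from the proposal. Rejecting “_” inside reaction and reconstruction-condition values is one possible way to keep the encoded parameters unambiguous.

```python
import posixpath


def production_path(prod_type, target, beam, momentum, year,
                    key, shine, mode, os_name, source, file_type,
                    root="/castor/cern.ch/na61"):
    """Build <root>/<type>/<reaction>/<reconstruction conditions>/<file_type>/."""
    reaction = [target, beam, momentum, year]
    conditions = [key, shine, mode, os_name, source]
    for part in [prod_type, file_type] + reaction + conditions:
        if not part or "/" in part:
            raise ValueError(f"unsuitable path component: {part!r}")
    for part in reaction + conditions:
        # '_' is the separator inside a level, so it must not occur in a value
        if "_" in part:
            raise ValueError(f"component must not contain '_': {part!r}")
    return posixpath.join(
        root, prod_type, "_".join(reaction), "_".join(conditions), file_type
    ) + "/"
```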