Data production and virtualisation status
Dag Toppe Larsen
Wrocław, 2013-10-07
Outline
Data production status
Virtualisation status
Production database
Data production script
Web interface
Plan forward
Proposal for new production directory structure
Data production status
Data production team:
Dag, Bartek, Kevin
So far this year
17 mass productions
72 test productions
Castor sometimes slow and/or unresponsive
Typically lasts for a couple of days, then gets better
Also a problem for checking produced data, since nsls also hangs
Have contacted Castor support, but the problem often goes away before it can be properly diagnosed
Virtualisation status
Have requested and obtained a new “NA61” project on the final Lxcloud service
Same quota (200 VCPUs/instances) as before
Access controlled by new e-group “na61-cloud”
Migration completed
A few minor issues had to be worked out with IT
Latest software versions (13e legacy, v0r5p0) installed on CVMFS
Next step: mass production of BeBe160 has started
Production DB
The production DB has grown a bit beyond what was originally intended
Complicated to access information from Castor and the bookkeeping DB
Elog data is not always consistent (needs to be standardised)
Elog data is needed as input for data production (magnetic field)
Difficult to work with the production information without a proper SQL database
Created an SQLite DB with three tables: runs, productions and chunkproductions
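The three tables described on the following slides could be sketched as an SQLite schema roughly like this; the column names follow the slides, but the column types, defaults and constraints are assumptions:

```python
import sqlite3

# Sketch of the three-table schema; column names follow the slides,
# the types and constraints are assumptions.
conn = sqlite3.connect(":memory:")  # the real DB lives in a file, e.g. prod.db
conn.executescript("""
CREATE TABLE runs (
    run INTEGER PRIMARY KEY,       -- run number
    target TEXT, beam TEXT,        -- standardised from the elog_* fields
    momentum TEXT, year INTEGER, magnet TEXT,
    elog_beam_type TEXT,           -- raw elog entries kept alongside
    elog_target_type TEXT,
    elog_beam_momentum TEXT
);
CREATE TABLE productions (
    production INTEGER PRIMARY KEY AUTOINCREMENT,  -- auto-generated number
    target TEXT, beam TEXT, momentum TEXT, year INTEGER,
    "key" TEXT, legacy TEXT, shine TEXT, mode TEXT,
    os TEXT, source TEXT, type TEXT,
    UNIQUE (target, beam, momentum, year, "key", legacy,
            shine, mode, os, source, type)         -- one row per production
);
CREATE TABLE chunkproductions (
    production INTEGER REFERENCES productions(production),
    run INTEGER REFERENCES runs(run),
    chunk INTEGER,
    rerun INTEGER DEFAULT 0,       -- times the chunk failed and was redone
    status INTEGER DEFAULT 0,      -- waiting/processing/checking/ok/failed
    PRIMARY KEY (production, run, chunk)
);
""")
```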
Production DB schema
runs
All information for a given run
Information imported from elog via the bookkeeping DB
Primary key: run
Fields target, beam, momentum, year obtained from elog
runs table
Contains all information for a given run
All elog information for the run is imported
Elog information is used to fill the fields target, beam, momentum, year and magnet
Some normalising required to reduce elog entropy
Separate elog_* fields contain the elog entries as extracted from elog
Can be used with SQL select queries, but the entropy sometimes makes this challenging
Could be interesting to add additional fields to the table to contain “standardised” elog values, easier to select
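The normalisation step could be sketched as a small mapping from raw elog strings to standardised values; the variant table below is purely illustrative, not the actual mapping used:

```python
def normalise(raw):
    """Reduce elog entropy: trim whitespace and map known variants.

    The variant table is a hypothetical example; in practice the real
    mapping has to be maintained by hand as new elog spellings appear.
    """
    if raw is None:
        return None
    value = raw.strip()              # "Be " and "Be" become the same value
    variants = {                     # hypothetical variant -> canonical map
        "No beam": None,
        "None": None,
        "": None,
    }
    return variants.get(value, value)

# A standardised "beam" field could then, conceptually, be filled as
# UPDATE runs SET beam = normalise(elog_beam_type) for each row.
```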
runs.beam
Contains the beam type for a given run
Derived from elog_beam_type
Not too much entropy (below), but some standardisation required
Used to determine the “reaction”

sqlite3 prod.db "select elog_beam_type, count(*) from runs group by elog_beam_type"
|369
Be|1209
Be |15
K-|89
No beam|6
None|314
Pb|31
runs.target
Contains the target type of a given run
Derived from elog_target_type
Some entropy (below), standardisation required
Field used to determine the “reaction”

sqlite3 prod.db "select elog_target_type, count(*) from runs group by elog_target_type"
|369
2C target IN|116
2C target OUT|34
Be target IN|1003
Be target IN |11
Be target OUT|203
Be target OUT |4
runs.momentum
Contains the beam momentum for a given run
Derived from elog_beam_momentum
Some entropy (below), but not too much
Assume 30GeV!=31GeV
Assume 75GeV!=80GeV

sqlite3 prod.db "select elog_beam_momentum, count(*) from runs group by elog_beam_momentum"
|372
0 GeV/c|313
10 GeV/c|15
100 GeV/c|14
120 GeV/c|161
13 GeV/c|1137
productions table
A unique combination of target, beam, momentum, year, key, legacy, shine, mode, os, source, type is a production
Primary key: production
Auto-generated unique number
production: e.g. 1
target: e.g. Be
beam: e.g. Be
momentum: e.g. 158
year: e.g. 11
key: e.g. 040
legacy: e.g. 13c
shine: e.g. v0r5p0
mode: e.g. pp
chunkproductions table
Stores all chunks produced
Associated to a production, run and chunk
production: e.g. 1
run: e.g. 123456
chunk: e.g. 123
rerun: number of times the chunk has failed and been reprocessed
status: waiting / processing / checking / ok / failed (numeric values)
Has the potential to contain of the order of 10^6 rows
By far the largest table in the DB
Potential performance issue
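A possible encoding of the numeric status values, together with the rerun bookkeeping, could look like this; the specific numbers and the max_rerun limit are assumptions, the slide only names the states:

```python
# Hypothetical numeric encoding of the chunk states named on the slide;
# the actual numbers used in the production DB are not given here.
STATUS = {
    0: "waiting",
    1: "processing",
    2: "checking",
    3: "ok",
    4: "failed",
}

def next_action(status, rerun, max_rerun=3):
    """Sketch of the retry bookkeeping: a failed chunk is re-queued and
    its rerun counter incremented, up to a hypothetical max_rerun limit."""
    if STATUS[status] == "failed" and rerun < max_rerun:
        return 0, rerun + 1          # back to "waiting", count the retry
    return status, rerun
```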
Magnetic field
Originally planned to store this information in a separate field in the runs table (extracted from elog)
Needed for KEY5 and residual corrections
However, Seweryn has now added this information in the same database as the global key (but not part of the global key)
Working on integrating this information into the production scripts
Will make automatic data production much simpler
Database
Currently using SQLite
Pro:
DB contained in a single file on the file system
No need to set up a database server
Everybody can easily access it with custom SQL queries
Open format/code, we “really” own the data
Backup via normal file system backup
Con:
Not sure if performance will be an issue
Have also tried the central Oracle database (na61_cloud@pdbr1)
Automated data production script commands
./prodna61-produce.sh
Usage:
./prodna61-produce.sh <command>
<command> one of:
reactions   - list all reactions in database
productions - list all productions in database
./prodna61-produce.sh <command> <path_in>
<command> one of:
regreaction - register all
Data production command usage
prodna61-produce.sh regreaction /afs/cern.ch/11/Be/Be160
Will register all runs found at the path in the runs table
Obtains run information from the bookkeeping database/elog
Only has to be done once per reaction (path)
prodna61-produce.sh regproduction Be Be 158 11 040 13e v0r5p0 pp phys def “A new prod.”
Creates a new production in the productions table, and inserts a new row in the chunkproductions table for each of the chunks of the reaction
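What regproduction does could be sketched roughly as follows; the table layout is simplified, and the example values for the os/source/type parameters are placeholders:

```python
import sqlite3

# Simplified stand-ins for the real tables (column set from the slides,
# types are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE productions (
    production INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT, beam TEXT, momentum TEXT, year INTEGER, "key" TEXT,
    legacy TEXT, shine TEXT, mode TEXT, os TEXT, source TEXT, type TEXT);
CREATE TABLE chunkproductions (
    production INTEGER, run INTEGER, chunk INTEGER,
    rerun INTEGER, status INTEGER);
""")

def regproduction(conn, params, chunks):
    """Insert one productions row, then one chunkproductions row per chunk.

    params: (target, beam, momentum, year, key, legacy, shine, mode,
             os, source, type); chunks: (run, chunk) pairs belonging to
    the reaction. A sketch of what the slide describes, not the script.
    """
    cur = conn.execute(
        'INSERT INTO productions (target, beam, momentum, year, "key", '
        'legacy, shine, mode, os, source, type) '
        'VALUES (?,?,?,?,?,?,?,?,?,?,?)', params)
    production = cur.lastrowid  # auto-generated production number
    conn.executemany(
        'INSERT INTO chunkproductions (production, run, chunk, rerun, status) '
        'VALUES (?,?,?,0,0)',   # every chunk starts as "waiting"
        [(production, run, chunk) for run, chunk in chunks])
    return production

# Example call mirroring the slide (os/source/type values are placeholders):
production = regproduction(
    conn,
    ("Be", "Be", "158", 11, "040", "13e", "v0r5p0", "pp", "slc6", "phys", "def"),
    [(123456, 123), (123456, 124)])
```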
Data production script status
Can in most cases produce data
Further work on standardisation of elog data needed
Sometimes reactions have “specialities” that have to be taken into account
Lxbatch and CernVM versions have diverged a bit, need to be (re-)unified
Could be nice to use key-value pairs of parameters
Need to add possibility to process a range of runs (test productions)
Not expected to be difficult
Web interface (prototype)
Web interface to the production DB
http://na61cld.web.cern.ch/na61cld/cgibin/start?reaction=Be|Be|158|11
Experimenting with the best interface/usability for different use cases
Make it “intuitive” and easy to use
Have not put effort into making it “look” good
But should be easy to do, as it relies on style sheets (CSS) for design
Think “reaction” and “production” are the main entities to build around
General plan forward
Complete the CernVM BeBe160 test production
Finish outstanding issues with the automatic production script
Finalise web interface for data production
Add functionality, improve performance
Proposal for production directory structure (after moving to Shine)
Preferably, all unique production parameters should be encoded in the path to avoid conflicts
A deep directory structure is however undesirable
Proposal: divide the directory path into four levels: “type”, “reaction”, “reconstruction conditions” and “file type”
/castor/cern.ch/na61/
  <type>/
    <target>_<beam>_<momentum>_<year>/
      <key>_<shine>_<mode>_<os>_<source>/
        <file_type>/
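The proposed four-level layout could be sketched as a small path builder; the field order is taken from the slide, but the example values (in particular for the <type>, os, source and <file_type> levels) are placeholders:

```python
# Sketch of the proposed four-level Castor path; field order from the
# slide, example values are placeholders.
def production_path(type_, target, beam, momentum, year,
                    key, shine, mode, os, source, file_type):
    """Build <type>/<reaction>/<reconstruction conditions>/<file type>."""
    reaction = f"{target}_{beam}_{momentum}_{year}"
    conditions = f"{key}_{shine}_{mode}_{os}_{source}"
    return f"/castor/cern.ch/na61/{type_}/{reaction}/{conditions}/{file_type}/"

path = production_path("production", "Be", "Be", "158", "11",
                       "040", "v0r5p0", "pp", "slc6", "phys", "reco")
```

Since every unique production parameter appears in either the reaction or the conditions level, two different productions can never collide on the same path.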