The Database Project

Initial work by
Arnauld Albert, Cristiano Bozza
Server technology: Oracle Enterprise Server (RAC cluster)
Well hosted and supported at CCIN2P3
Working licenses already available at several sites in Italy
Discounts possible through the CERN agreement
Good experience with Oracle technical support (also 24/7 for critical cases)
Huge (unequalled?) variety of tools and libraries (from Oracle and independent software producers)
Accessible through: C/C++, Java, C#, Python, PHP, VB, Perl, ODBC, ODP.NET, …
OS-independent
A. Albert, C. Bozza – KM3Net Collaboration Meeting – Marseille, Jan 2013
System: Symmetric datacenters
Continuous data synchronization
Schema for construction – detector description
[Diagram labels: Locations, Products, Users, Container, Mapping, PBS, Descriptions]
Logical and hierarchical relationships among products are stored
The structure of the detector (PBS, product breakdown structure) is also stored and documented in the DB
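As a rough illustration of how hierarchical relationships among products could be stored and queried, here is a minimal sketch. It uses Python's built-in SQLite as a stand-in for the Oracle server; the table name, column names and sample products are assumptions made for this example, not the actual schema.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical product hierarchy."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE products (
            product_id   INTEGER PRIMARY KEY,
            name         TEXT NOT NULL,
            container_id INTEGER REFERENCES products(product_id)  -- NULL for top level
        );
        INSERT INTO products VALUES
            (1, 'detection unit', NULL),
            (2, 'optical module', 1),
            (3, 'photomultiplier', 2),
            (4, 'electronics board', 2);
    """)
    return db

def contents_of(db, product_id):
    """Return the names of all products contained, directly or not, in product_id."""
    rows = db.execute("""
        WITH RECURSIVE inside(product_id, name) AS (
            SELECT product_id, name FROM products WHERE container_id = ?
            UNION ALL
            SELECT p.product_id, p.name
            FROM products p JOIN inside i ON p.container_id = i.product_id
        )
        SELECT name FROM inside ORDER BY name
    """, (product_id,)).fetchall()
    return [name for (name,) in rows]

db = build_demo_db()
print(contents_of(db, 1))
```

A single self-referencing column is enough to walk the whole containment tree with one recursive query; a separate mapping table would be an equally valid design.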
Schema for construction – activity management
[Diagram: tables Locations, Operations, Users, Decisions, Status]
Each operation corresponds to a well-defined task
An operation may contain one or more sub-operations, forming a tree
Each operation is linked to the place where it is performed and to the user who performs it or is responsible for it
At any given time, an operation is in a certain “status”; on completion, decisions may be taken concerning it
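The description above can be sketched as follows, again with Python's built-in SQLite standing in for Oracle. All table names, column names and sample rows are hypothetical; in particular, keeping one row per status change is an assumption of this sketch, chosen so that the status at any given time can be recovered.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users     (user_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE operations (
        operation_id INTEGER PRIMARY KEY,
        parent_id    INTEGER REFERENCES operations(operation_id),  -- NULL for a root task
        task         TEXT NOT NULL,
        location_id  INTEGER NOT NULL REFERENCES locations(location_id),
        user_id      INTEGER NOT NULL REFERENCES users(user_id)
    );
    -- One row per status change (assumption of this sketch).
    CREATE TABLE operation_status (
        operation_id INTEGER NOT NULL REFERENCES operations(operation_id),
        since        TEXT NOT NULL,   -- ISO 8601 timestamp
        status       TEXT NOT NULL,
        PRIMARY KEY (operation_id, since)
    );
    INSERT INTO locations VALUES (1, 'Site A');
    INSERT INTO users VALUES (1, 'operator1');
    INSERT INTO operations VALUES (10, NULL, 'assemble module', 1, 1),
                                  (11, 10,   'test component',  1, 1);
    INSERT INTO operation_status VALUES (11, '2013-01-10T09:00', 'started'),
                                        (11, '2013-01-12T17:30', 'completed');
""")

def status_at(db, operation_id, when):
    """Return the status an operation had at time `when`, or None if it had none yet."""
    row = db.execute("""
        SELECT status FROM operation_status
        WHERE operation_id = ? AND since <= ?
        ORDER BY since DESC LIMIT 1
    """, (operation_id, when)).fetchone()
    return row[0] if row else None

print(status_at(db, 11, '2013-01-11T12:00'))
print(status_at(db, 11, '2013-01-13T08:00'))
```

ISO 8601 timestamps stored as text compare correctly as strings, which keeps the lookup to a single ordered query.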
Schema for construction – activity management
[Diagram: tables Operation Types, Operations, Possible Decisions, Decisions]
The type of an operation contains links to programs and their operating parameters
This allows not only documentation but also management through the DB
Each operation type has a set of possible decisions
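One way to picture both ideas (programs and parameters attached to the operation type, and a closed set of possible decisions per type) is the sketch below. It assumes SQLite in place of Oracle, and the table names, the `run_test` program name and the JSON encoding of parameters are all hypothetical choices for this example.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
db.executescript("""
    CREATE TABLE operation_types (
        type_id    INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        program    TEXT,   -- program to run for this kind of operation
        parameters TEXT    -- operating parameters, stored here as JSON text
    );
    CREATE TABLE possible_decisions (
        type_id  INTEGER NOT NULL REFERENCES operation_types(type_id),
        decision TEXT NOT NULL,
        PRIMARY KEY (type_id, decision)
    );
    CREATE TABLE operations (
        operation_id INTEGER PRIMARY KEY,
        type_id      INTEGER NOT NULL REFERENCES operation_types(type_id),
        decision     TEXT,  -- NULL until a decision is taken
        FOREIGN KEY (type_id, decision)
            REFERENCES possible_decisions(type_id, decision)
    );
    INSERT INTO operation_types VALUES
        (1, 'component test', 'run_test', '{"voltage": 1000}');
    INSERT INTO possible_decisions VALUES (1, 'accept'), (1, 'reject');
    INSERT INTO operations VALUES (100, 1, NULL);
""")

def decide(db, operation_id, decision):
    """Record a decision; return False if it is not allowed for the operation type."""
    try:
        with db:  # commit on success, roll back on error
            db.execute("UPDATE operations SET decision = ? WHERE operation_id = ?",
                       (decision, operation_id))
        return True
    except sqlite3.IntegrityError:
        return False

def launch_command(db, operation_id):
    """Build the command line prescribed by the operation type (not executed here)."""
    program, parameters = db.execute("""
        SELECT t.program, t.parameters
        FROM operations o JOIN operation_types t ON t.type_id = o.type_id
        WHERE o.operation_id = ?
    """, (operation_id,)).fetchone()
    args = [f"--{key}={value}" for key, value in json.loads(parameters).items()]
    return [program] + args

print(launch_command(db, 100))
print(decide(db, 100, "accept"))
print(decide(db, 100, "postpone"))
```

The composite foreign key lets the database itself reject a decision that is not listed for the operation type, so client programs need no extra checks.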
Schema for construction – detailed bookkeeping information
[Diagram: tables Operations, Product bookkeeping, Integration, Tests, Test types, Sets of parameters, Parameter values]
Depending on the type of operation, additional information may be required
Detail tables are linked to the main operation table
For each kind of test, the list of parameters to be tested is defined
For each component, all the output of the testing is stored; it is easy to add new parameters as needed
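The claim that new parameters are easy to add can be illustrated with a minimal sketch in which parameters are rows rather than columns. SQLite stands in for Oracle here, and the table names, parameter names and values are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE test_types (test_type_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE parameters (
        parameter_id INTEGER PRIMARY KEY,
        test_type_id INTEGER NOT NULL REFERENCES test_types(test_type_id),
        name         TEXT NOT NULL,
        unit         TEXT,
        description  TEXT,
        UNIQUE (test_type_id, name)
    );
    CREATE TABLE parameter_values (
        operation_id INTEGER NOT NULL,   -- the test operation that produced the value
        parameter_id INTEGER NOT NULL REFERENCES parameters(parameter_id),
        value        REAL NOT NULL,
        PRIMARY KEY (operation_id, parameter_id)
    );
    INSERT INTO test_types VALUES (1, 'component acceptance');
    INSERT INTO parameters VALUES (1, 1, 'gain', NULL, 'illustrative parameter'),
                                  (2, 1, 'dark_rate', 'Hz', 'illustrative parameter');
""")

def add_parameter(db, test_type_id, name, unit=None):
    """Define a new parameter for a test type: a row insert, no schema change."""
    with db:
        cur = db.execute(
            "INSERT INTO parameters (test_type_id, name, unit) VALUES (?, ?, ?)",
            (test_type_id, name, unit))
    return cur.lastrowid

def store_results(db, operation_id, test_type_id, results):
    """Store {parameter name: value} for one test; unknown names raise KeyError."""
    ids = dict(db.execute(
        "SELECT name, parameter_id FROM parameters WHERE test_type_id = ?",
        (test_type_id,)))
    with db:
        db.executemany(
            "INSERT INTO parameter_values VALUES (?, ?, ?)",
            [(operation_id, ids[name], value) for name, value in results.items()])

add_parameter(db, 1, 'rise_time', 'ns')
store_results(db, 200, 1, {'gain': 5.0e6, 'dark_rate': 1200.0, 'rise_time': 2.5})
print(db.execute("SELECT COUNT(*) FROM parameter_values").fetchone()[0])
```

Because each parameter is a row, adding one requires no change to table definitions or to the programs that already store results.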
Schema for construction – Content
The DB is completely flexible about testing parameters
The DB can contain explanations of parameters, and the testing procedure can be described and documented
We are collecting information from experts about testing procedures
For the moment, we have some sample data (thanks to Tamas, Oleg and Emanuele)
Access
Three basic user roles have been defined, with obvious meaning: Administrator (km3net), Reader (km3read), Writer (km3write)
We distinguish between an “information author/consumer” and a “DB user”
Information author/consumer: any person who produces or uses data from the DB, directly or indirectly → stored in the Users table
DB user: a person or batch process that connects directly to the DB, using part of its memory, CPU and disk speed → Oracle DB Server authentication
Access
A DB user can have multiple connections if needed
A DB user can initiate transactions, lock rows/tables, run SQL queries
A person should have a DB user account if he/she is aware of the related responsibility in resource usage
User accounts can also be given on an institution or group basis
Most people will just need data, preferably in a friendly format
Many people will want to stay focused on data and not care about technicalities
A special DB user, named km3web, is used by a dedicated Web site to provide a friendly user interface
The Web site can provide information properly formatted and with detailed explanations
The Web site can also be used to upload information about results of tests
This can also be done by uploading data files
KM3Net Web DB access
Login provided through Users table
KM3Net Web DB access
For the moment, we are waiting for expert feedback to build useful pages
The Web site can also provide “raw SQL access” (technical tests, maintenance, etc.)
As recent Internet trends show, Web servers are increasingly becoming application servers for machine-to-machine data exchange
Outlook on Physics data storage
It is possible to use a relational DB to store not only “data about the detector” but also “data from the detector”
Moreover, intermediate output of reconstructions can also be stored, and specific datasets can be flagged and indexed to help and support analysis
There is recent fruitful experience with this technique (e.g. the OPERA DB, designed to range between 50 and 100 TB, currently 34 TB)
A first technical test has been successful for NEMO-phase-1 post-trigger data (thanks to Tommaso for data, discussion and help)
Operational settings (“datacards”), trigger configuration, detector status, and sampled waveforms are all easily stored
A C++ library has also been developed to store data without any knowledge of SQL
Outlook on Physics data storage
Test for NEMO-phase-1 post-trigger data
Outlook on Physics data storage
Test for NEMO-phase-1 post-trigger data: code snippet to store a full run
The only places where you know you’re storing to a DB are those highlighted in red on the slide: filling in the username/password to connect, and committing the transaction to tell the DB that all data were written and can be stored
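The same idea can be sketched in Python. This is not the C++ library mentioned earlier, whose interface is not shown in this text: the `RunWriter` class, its methods and the table layout are hypothetical, and an in-memory SQLite database stands in for the Oracle server. The point is only that the calling code sees the database in two places, the connection and the commit.

```python
import sqlite3

class RunWriter:
    """Hypothetical facade that hides SQL from the code producing the data."""

    def __init__(self, username, password):
        # A real deployment would authenticate against the DB server here;
        # this sketch ignores the credentials and opens an in-memory SQLite DB.
        self._db = sqlite3.connect(":memory:")
        self._db.executescript("""
            CREATE TABLE runs   (run_id INTEGER PRIMARY KEY, datacard TEXT);
            CREATE TABLE events (run_id INTEGER REFERENCES runs(run_id),
                                 event_no INTEGER, waveform BLOB,
                                 PRIMARY KEY (run_id, event_no));
        """)

    def begin_run(self, run_id, datacard):
        self._db.execute("INSERT INTO runs VALUES (?, ?)", (run_id, datacard))

    def add_event(self, run_id, event_no, samples):
        # Sampled waveform packed as raw bytes, one byte per sample (0..255).
        self._db.execute("INSERT INTO events VALUES (?, ?, ?)",
                         (run_id, event_no, bytes(samples)))

    def commit(self):
        self._db.commit()

    def event_count(self, run_id):
        return self._db.execute("SELECT COUNT(*) FROM events WHERE run_id = ?",
                                (run_id,)).fetchone()[0]

writer = RunWriter("username", "password")   # DB-specific step 1: connect
writer.begin_run(1, "threshold=0.3")
for event_no, samples in enumerate([[1, 2, 3], [4, 5, 6]]):
    writer.add_event(1, event_no, samples)
writer.commit()                              # DB-specific step 2: commit
print(writer.event_count(1))
```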
Outlook on Physics data storage
Detector data storage is also related to the whole computing model
Pros:
Reliable storage, with corruption check
OS-independent
Language- and technology-independent storage (C++, C#, Java, VB, Python, PHP, …)
Easy to manage using SQL
Consistency and integrity are automatically enforced
Views help present data effectively with few lines of code
In case of data model evolution, old programs still run without changes (no recompilation needed)
Cons:
Additional checks load CPU and disk
Data access requires linking a library (ODBC, OCI, ODP.NET, …)
Normally, file-system storage also uses an I/O library, so this is not a characteristic “con” of relational DBs
DB administrators of course also take care of developing and maintaining the I/O library
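The two points about views and data model evolution can be pictured with a small sketch. SQLite stands in for Oracle, and the `hits` table, its columns and the later `calibration` column are assumptions made for this example: a view with the old name keeps presenting the old layout after the underlying table changes, so the old client code runs unmodified.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE hits (hit_id INTEGER PRIMARY KEY, channel INTEGER, charge REAL);
    INSERT INTO hits VALUES (1, 7, 1.5), (2, 8, 0.5);
""")

def old_program(db):
    """Stands for an existing client that only knows the original layout."""
    return db.execute(
        "SELECT hit_id, channel, charge FROM hits ORDER BY hit_id").fetchall()

before = old_program(db)

# Data model evolution: the table is renamed and gains a calibration column;
# a view with the old name keeps presenting the old layout.
db.executescript("""
    ALTER TABLE hits RENAME TO hits_v2;
    ALTER TABLE hits_v2 ADD COLUMN calibration REAL DEFAULT 1.0;
    CREATE VIEW hits AS SELECT hit_id, channel, charge FROM hits_v2;
""")

after = old_program(db)
print(before == after)
```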
Conclusions
The work to set up the DB to document and support construction has already begun
Startup schema defined
User access defined
Web site to make access user-friendly already set up; it needs to be filled with useful pages (input from experts!)
WE NEED DATA!
Outlook
It is possible to store not only construction data, but also raw data and physics output
There is already know-how on that, and we can start a broad discussion