The History and Future of ATLAS Data
Management Architecture
D. Malon, S. Eckmann, A. Vaniachine (ANL),
J. Hrivnac, A. Schaffer (LAL), D. Adams (BNL)
CHEP’03
San Diego, California
24 March 2003
Outline
Persistent principles
Pre-POOL: The ATLAS event store architecture
Hybrid event stores
ATLAS and POOL
Non-event data in ATLAS
ATLAS data management and grids
ATLAS data management and other emerging technologies
Long-standing principles
Transient/persistent separation—not by any means unique to ATLAS—
means (in ATLAS, anyway):
Physics code does not depend on storage technology used for input or
output data
Physics code does not “know” about persistent data model
Selection of storage technology, explicit and implicit, involves only job
options specified at run time
Commitment to use common LHC-wide solutions wherever possible,
since at least the time of the ATLAS Computing Technical Proposal
Once this implied Objectivity/DB (RD45)
Now this implies LCG POOL
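A minimal C++ sketch of what this separation might look like in practice, with all class and option names invented for illustration: physics code writes through a technology-neutral interface, and the concrete backend is chosen from a job-option string at run time.

```cpp
#include <iostream>
#include <memory>
#include <string>

struct IPersistencySvc {                  // technology-neutral interface
    virtual ~IPersistencySvc() = default;
    virtual void write(const std::string& key) = 0;
};

struct RootSvc : IPersistencySvc {        // one hypothetical backend
    void write(const std::string& key) override { std::cout << "ROOT: " << key << '\n'; }
};

struct ObjySvc : IPersistencySvc {        // another hypothetical backend
    void write(const std::string& key) override { std::cout << "Objy: " << key << '\n'; }
};

// Factory keyed by a job option; physics code never names a technology.
std::unique_ptr<IPersistencySvc> makeSvc(const std::string& jobOption) {
    if (jobOption == "ROOT") return std::make_unique<RootSvc>();
    return std::make_unique<ObjySvc>();
}

int main() {
    auto svc = makeSvc("ROOT");   // value would come from run-time job options
    svc->write("MyEvent#42");     // physics code knows no persistent data model
}
```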
Pre-POOL: The ATLAS event store
architecture
Event collection is fundamental: processing model is “read one or more
collections, write one or more collections”
Note: collections are persistent realizations of something more general
Model allows navigational access, in principle, to all upstream data
Output “event headers” retain sufficient information to reach any data
reachable from input event headers
Architecture supports strategies for “sharing” data, so that writing
events to multiple streams, for example, does not require (but may
allow) replication of component event data
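A hedged sketch of the navigation idea, with Token and EventHeader invented here: an output header keeps opaque references to its own components and to the header it was derived from, so any upstream datum remains reachable without being copied.

```cpp
#include <map>
#include <memory>
#include <string>

using Token = std::string;   // stand-in for an opaque persistent address

struct EventHeader {
    std::map<std::string, Token> components;    // e.g. "Tracks" -> token
    std::shared_ptr<const EventHeader> input;   // header of the input event
};

// Navigational access: resolve a component here, else in upstream headers.
const Token* find(const EventHeader& h, const std::string& name) {
    if (auto it = h.components.find(name); it != h.components.end())
        return &it->second;
    return h.input ? find(*h.input, name) : nullptr;
}

int main() {
    auto raw = std::make_shared<EventHeader>();
    raw->components["RawCells"] = "rawfile#7";
    EventHeader esd{{{"Tracks", "esdfile#7"}}, raw};
    return find(esd, "RawCells") ? 0 : 1;   // reachable via the input header
}
```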
Pre-POOL: The ATLAS event store
architecture - II
Data selection model is, in relational terms, “SELECT … FROM …
WHERE …”
SELECT which components of event data?
FROM which event collection(s)?
WHERE qualifying events satisfy specified conditions
Designed to allow server-side and client-side selection implementation
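A small sketch of how such a selection might be expressed in code; Event, EventQuery, and the collection name are invented for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

struct Event { double missingEt = 0; int nJets = 0; };   // toy event summary

struct EventQuery {
    std::vector<std::string> components;      // SELECT: event components
    std::vector<std::string> collections;     // FROM:   input collections
    std::function<bool(const Event&)> where;  // WHERE:  qualifying events
};

int main() {
    EventQuery q{ {"Tracks", "CaloClusters"},
                  {"dc1.sample.jets"},        // hypothetical collection name
                  [](const Event& e) { return e.missingEt > 50 && e.nJets >= 2; } };
    (void)q;  // a selector service would evaluate q, server- or client-side
}
```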
Architecture also describes a placement service, though our
implementations were rather rudimentary
Mechanism for control of physical clustering of events (e.g., by stream), of
event components (e.g., by “type”), and for handling file/database allocation
and management
Interface could be satisfied by a within-job service, or by a common service
shared by many jobs (e.g., in a reconstruction farm)
“Extract and transform” paradigm for selection/distribution
Hybrid event stores
There were (and are) many questions about where genuine database
functionality may be required, or useful, in an event store
STAR, for example, had already successfully demonstrated a “hybrid”
approach to event stores:
File-based streaming layer for event data
Relational database to manage the files
A hybrid prototype (AthenaROOT) was deployed in ATLAS in parallel
with the ATLAS baseline (Objectivity-based)
Transient/persistent separation strategy supports peaceful coexistence;
physicists’ codes remain unchanged
…all of this was input to the LCG requirements technical assessment
group that led to initiation of POOL…
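As a rough illustration of the hybrid layout described above (table and column names invented): event data stream into files, and a relational catalog tracks the files and the collections they hold.

```cpp
#include <iostream>

// Illustrative relational schema for managing the streaming-layer files.
const char* kCatalogSchema = R"sql(
CREATE TABLE file_catalog (
  lfn        VARCHAR(255) PRIMARY KEY,   -- logical file name
  pfn        VARCHAR(255),               -- physical location
  technology VARCHAR(32)                 -- e.g. 'ROOT'
);
CREATE TABLE collection_files (
  collection VARCHAR(255),               -- event collection name
  lfn        VARCHAR(255) REFERENCES file_catalog(lfn)
);
)sql";

int main() { std::cout << kCatalogSchema; }
```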
ATLAS and POOL
ATLAS is fully committed to using LCG persistence software as its
baseline, and to contributing to its direction and its development
This means POOL: see earlier talks for POOL details
What POOL provides is closer to a framework than to an architecture
…though architectural assumptions, both implicit and explicit, go into
decisions about POOL components and their designs
The influence is bidirectional
Because POOL is still in its infancy, we do not fully understand its
implications for ATLAS data management architecture
Not a criticism: even if POOL delivered all the functionality of
Objectivity/DB (or of Oracle9i), LHC experiments would still need to
define their data management architectures
Non-event data in ATLAS
While event and conditions stores may be logically distinct and
separately managed at some levels, there is no reason they should not:
employ common storage technologies (POOL/ROOT, for example)
register their files in common catalogs
ATLAS currently has a variety of non-event data in relational databases
(e.g., the “primary numbers” that parameterize ATLAS detector
geometry)
Today this entails ATLAS-specific approaches, but in principle, a POOL
MySQL (ODBC?) Storage Manager implementation could be used, as could
the LCG SEAL dictionary for data definition
For interfaces unique to time-varying data, e.g., access based on
timestamps and intervals of validity, ATLAS again hopes to employ
common LHC-wide solutions where possible
Architectural principles for time-varying
data
Separate the interval-of-validity (IOV) database infrastructure from
conditions data storage.
Should be possible to generate and store conditions data in any
supported technology (POOL ROOT, MySQL, plain ASCII files, …)
without worrying about the interval-of-validity infrastructure
Generation of data, and assignment of intervals of validity, versions, and
tags, may be widely separated in time, and done by different people
Later register the data in the IOV database, assigning an interval of
validity, tag, version, …
Transient IOV service will consult IOV database to get pointer to correct
version of data, then invoke standard Athena conversion services to put
conditions objects in the transient store
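A minimal sketch of that separation, with all names invented: the IOV database stores only (tag, interval, version) → reference mappings, and the reference may point into any supported payload technology.

```cpp
#include <map>
#include <string>
#include <vector>

using Ref  = std::string;   // opaque pointer into POOL/ROOT, MySQL, ASCII, ...
using Time = long long;     // timestamp

struct IovEntry { Time since, until; int version; Ref payload; };

struct IovDb {
    std::map<std::string, std::vector<IovEntry>> byTag;
    // Registration may happen long after the payload was written.
    void add(const std::string& tag, IovEntry e) { byTag[tag].push_back(e); }
    // What the transient IOV service asks: timestamp + tag -> payload ref.
    const Ref* find(const std::string& tag, Time t) const {
        auto it = byTag.find(tag);
        if (it == byTag.end()) return nullptr;
        for (const auto& e : it->second)
            if (e.since <= t && t < e.until) return &e.payload;
        return nullptr;
    }
};

int main() {
    IovDb db;
    db.add("calib-nominal", {0, 1000, 1, "POOLROOT:calib.root#42"});
    return db.find("calib-nominal", 500) ? 0 : 1;  // then convert via Athena
}
```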
Transient Conditions Store
[Diagram: (1) a timestamp, tag, and version are presented to the IOV Database; (2) the IOV Database returns a ref to the conditions data; (4) a transient conditions object is built in the Transient Conditions Store.]
Nonstandard data?
Cannot expect that all useful non-event data are produced by ATLAS-standard tools
How do such data enter ATLAS institutional memory?
Is it as simple as registration in an appropriate file catalog (for files, anyway)
managed by the ATLAS virtual organization?
Is there a minimal interface such data must satisfy?
ATLAS “dataset” notions are relevant here
Ability to externalize a pointer to an object in this technology?
What is required in order for an application in the ATLAS control
framework (Athena) to access such data?
Provision of an LHCb/Gaudi-style conversion service?
LCG SEAL project may influence our answers to these questions
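One way the "minimal interface" question could be answered, sketched with invented names: every nonstandard datum can externalize a pointer to itself, and a Gaudi-style service resolves such pointers back.

```cpp
#include <string>

// The one capability the slide asks about: externalizing a pointer.
struct IExternalizable {
    virtual ~IExternalizable() = default;
    virtual std::string token() const = 0;   // externalized object pointer
};

// An LHCb/Gaudi-style conversion service would implement the inverse,
// turning a token back into a transient object for Athena clients.
struct IResolver {
    virtual ~IResolver() = default;
    virtual void* resolve(const std::string& token) = 0;
};

int main() {}   // interfaces only; implementations are technology-specific
```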
ATLAS data management and grids
With the current state of grid tools, grid data management has meant,
primarily, file replica cataloging and transfer, with a few higher-level
services
ATLAS has prototyped and used a range of grid replica catalogs
(GDMP, EDG, Globus (pre-RLS), RLS,…), grid file transport tools, grid
credentials
Principal tool for production purposes is MAGDA (ATLAS-developed)
MAGDA designed so that its components can be replaced by grid-standard
ones as they become sufficiently functional and mature
This has already happened with transport machinery; will happen with replica
catalog component as RLS implementations improve
Databases on grids
Need more detailed thinking on how to deal with database-resident data
on grids
Do Resource Brokers know about these?
Connections between grid replica tracking and management and database-provided replication/synchronization, especially when databases are
updateable
Have looked a bit at EDG Spitfire
Could, in some cases, transfer underlying database files via replication
tools, and register them (a la GDMP with Objectivity/DB databases)
Have done some prototyping with MySQL embedded servers
On-demand access over the net poses some challenges
Grid/web service interfaces (OGSA) should help
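For flavor, a rough sketch of the embedded-server prototyping mentioned above, using the libmysqld C API of that period (mysql_server_init/mysql_server_end); the database name and query are invented, and error handling is omitted.

```cpp
#include <mysql.h>   // link against libmysqld, the embedded server

int main(int argc, char** argv) {
    mysql_server_init(argc, argv, nullptr);   // start the in-process server
    MYSQL* db = mysql_init(nullptr);
    // A null host selects the embedded server: no network round trip.
    mysql_real_connect(db, nullptr, nullptr, nullptr, "conditions", 0, nullptr, 0);
    mysql_query(db, "SELECT payload FROM iov WHERE tag = 'nominal'");
    mysql_close(db);
    mysql_server_end();                       // shut the server down cleanly
}
```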
Recipe management, provenance, and
“virtual data”
Every experiment maintains, via software repositories and managed
releases, official versions of recipes used to produce data
Everyone logs the recipes (job options, scripts) used for official
collaboration data production in a bookkeeping database
ATLAS does this in its AMI bookkeeping/metadata database
Virtual data prototyping has been done in several ATLAS contexts
Parameterized recipe templates (transformations), with actual parameters
supplied and managed by a database in DC0/1 (derivations)
See Nevski/Vaniachine poster
Similar approach in AtCom (DC1)
GriPhyN project’s Chimera virtual data catalog and Pegasus planner, used
in SUSY data challenge, and (currently) for reconstruction stage of Data
Challenge 1
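The derivation idea, sketched as a toy data model (names invented, release number illustrative): a transformation is a parameterized recipe template, and a derivation binds actual parameters and files to it.

```cpp
#include <map>
#include <string>
#include <vector>

struct Transformation {                  // versioned, parameterized recipe
    std::string name, release;           // e.g. "atlsim", "6.0.4"
    std::vector<std::string> formals;    // formal parameters of the template
};

struct Derivation {                      // one concrete production step
    std::string transformation;                  // which template was run
    std::map<std::string, std::string> actuals;  // actual parameter values
    std::vector<std::string> inputs, outputs;    // logical file names
};

int main() {
    Transformation sim{"atlsim", "6.0.4", {"nEvents", "randomSeed"}};
    Derivation d{sim.name,
                 {{"nEvents", "1000"}, {"randomSeed", "1234"}},
                 {"lfn:gen.0001.root"}, {"lfn:simul.0001.root"}};
    (void)d;   // a database (as in DC0/1) would manage these records
}
```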
Provenance
“Easy” part of provenance is at the “job-as-transformation” level:
What job created this file?
What job(s) created the files that were input to that job?
…and so on…
But provenance can be almost fractal in its complexity:
An event collection has a provenance, but provenance of individual events
therein may be widely varying
Within each of those events, provenance of event components varies
Calibration data used to produce event component data have a provenance
Values passed as parameters to algorithms have a provenance
…
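The "easy" job-level part is essentially a recursive walk over a file → producing-job map, as in this invented sketch; the fractal object-level part is precisely what such a walk misses.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Job { std::string id; std::vector<std::string> inputs; };

std::map<std::string, Job> producerOf;   // file -> the job that wrote it

// What job created this file? What jobs created its inputs? ...and so on.
void trace(const std::string& file, int depth = 0) {
    auto it = producerOf.find(file);
    if (it == producerOf.end()) return;  // primary input: recursion stops
    std::cout << std::string(depth, ' ') << file
              << " <- job " << it->second.id << '\n';
    for (const auto& in : it->second.inputs)
        trace(in, depth + 2);
}

int main() {
    producerOf["esd.root"] = {"recon-42", {"raw.data"}};
    producerOf["raw.data"] = {"daq-7", {}};
    trace("esd.root");
}
```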
Some provenance challenges
CHALLENGES:
Genuinely browsable, queryable transformation/derivation catalogs,
with sensible notions of similarity and equivalence
Integration of object-level history tracking, algorithm version
stamping, … (currently experiment-specific) with emerging
provenance management tools
The metadata muddle
Ill-defined: One man’s metadata is another man’s data
Essential, though: a multi-petabyte event store will not be navigable
without a reasonable metadata infrastructure
Physicist should query a physics metadata database to discover what
data are available and select data of interest
Metadata infrastructure should map physics selections to, e.g., lists of
logical files, so that resource brokers can determine where to run the
job, what data need to be transferred, and so on
Logical files have associated metadata as well
Some metadata about provenance is principally bookkeeping, but some
is as useful as physics properties to physicists trying to select data of
interest
Metadata integration?
Current ATLAS data challenge work distinguishes, as a starting point:
1. physics metadata
2. metadata about logical and physical files
3. recipe/provenance metadata
4. permanent production bookkeeping
5. transient production log data
It is not too hard to build an integrated system when the components are all
under the experiment's control. But when replica metadata management comes
from one project, provenance metadata management from another, physics
metadata (perhaps) from the experiment itself, and bookkeeping from (perhaps)
still another source, a system that supports queries across layers is a CHALLENGE
…in ATLAS, we still do not quite have a system that lets a physicist choose a
physics sample as input and emits EDG JDL, for example, and this is just a
small step
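A toy end-to-end illustration of that missing small step: physics sample → logical files → EDG-style JDL text. The attribute style follows EDG JDL, but the lookup function, sample name, and file names are entirely invented placeholders.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Stand-in for queries against the physics and file metadata layers.
std::vector<std::string> filesForSample(const std::string& sample) {
    return {"lfn:dc1.simul.00001.root"};   // placeholder answer
}

// Emit EDG-style JDL naming the input data the broker must place near.
std::string makeJdl(const std::vector<std::string>& lfns) {
    std::string jdl = "Executable = \"athena.sh\";\nInputData = {";
    for (size_t i = 0; i < lfns.size(); ++i)
        jdl += (i ? ", \"" : "\"") + lfns[i] + "\"";
    return jdl + "};\n";
}

int main() { std::cout << makeJdl(filesForSample("susy_sample")); }
```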
Beyond persistence
Persistence—saving and restoring object states—is a minimalist view:
it is necessary, but is it sufficient?
Should the object model (in transient memory) of the writer determine
the view that clients can extract from an event repository?
“Schema evolution” does not suffice:
Schema evolution recognizes that, though I write objects {A, B, C, D} today,
the class definitions of A, B, C, and D may change tomorrow
It fails to recognize that my object model may use entirely different classes
{E, F, G} in place of {A, B, C, D} next year
Simple persistence fails to acknowledge that readers may not want the
same objects that writers used, and that not all readers share a single
view
Beyond persistence: a trivial example
Can a reader build a “simple” track (AOD track) from a “full track” data
object (the saved state of an object of a different class), without creating
an intermediate “full track” on the client side?
In a relational database, I can write a 500-column table but read and
transfer to clients the data from only three of the columns
Simplified view: Need an infrastructure that can answer the question,
“Can I build an object of type B from the data pointed to by this
persistent reference (perhaps the saved state of an object of type A)?”
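The question in that last bullet could be posed to a registry of build recipes keyed by (persistent shape, requested transient type); everything in this sketch, including the shape names, is invented.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

using Recipe = std::function<void*(const void* persistentData)>;
using Key    = std::pair<std::string, std::string>;  // {stored shape, wanted type}

std::map<Key, Recipe> registry;   // recipes to build B from the state of A

// "Can I build an object of type `wanted` from data of shape `stored`?"
bool canBuild(const std::string& stored, const std::string& wanted) {
    return registry.count({stored, wanted}) != 0;
}

int main() {
    // e.g. build an AOD track straight from a saved full-track state,
    // never materializing a client-side "full track"
    registry[{"FullTrack_p1", "AODTrack"}] =
        [](const void*) -> void* { return nullptr; };   // stub recipe
    return canBuild("FullTrack_p1", "AODTrack") ? 0 : 1;
}
```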
LCG Object Dictionary: Usage (diagram
thanks to Pere Mato)
[Diagram: ADL/GOD class descriptions (.adl, .xml) and headers (.h) feed GCC-XML (dict-generating code) and ROOTCINT (CINT-generated code), populating the LCG Dictionary and the CINT Dict, bridged by an LCG-to-CINT Dict gateway; conversion/streamer code over ROOT I/O handles population (1, in) and readback (2, out), and the LCG Dictionary also serves reflection for other clients (Python, GUI, etc.). Marked "Requires elaboration."]
Elaboration?
Selection of persistent representation and streamer generation can be
separated
More than one persistent representation may be supported
Custom streamers
Separation of transient object dictionaries and persistent layout dictionaries
On input, what one reads need not dictate what one builds in transient memory
Not “Ah! This is the state of a B; I’ll create a transient B!”
Rather, “Can I locate (or possibly create) a recipe to build a B from these
data?”
Other emerging ideas
Current U.S. ITR proposal is promoting knowledge management in
support of dynamic workspaces
One interesting aspect of this proposal is in the area of ontologies
An old term in philosophy (cf. Kant), a well-known concept in the (textual)
information retrieval literature, and a hot topic for semantic web folks
Can be useful when different groups define their own metadata, using
similar terms with similar meanings, but not identical terms with identical
meanings
Could also be useful in defining what is meant, for example, by “Calorimeter
data,” without simply enumerating the qualifying classes