Data-Centric Issues

Data-Centric Issues
Particle Physics and
Grid Data Management
Tony Doyle
University of Glasgow
Outline: Data to Metadata to Data
Introduction
Yesterday
“.. all my troubles seemed so far away”
 (non-Grid) Database Access
 Data Hierarchy
Today
“.. is the greatest day I’ve ever known”
 Grids and Metadata Management
 File Replication
 Replica Optimisation
Tomorrow
 Event Replication
 Query Optimisation
“.. never knows”
GRID Services: Context

Applications: chemistry, cosmology, biology, environment, high energy physics.

Application Toolkits: data-intensive applications toolkit, remote visualisation applications toolkit, distributed computing applications toolkit, collaborative applications toolkit, problem-solving applications toolkit, remote instrumentation applications toolkit.

Grid Services (Middleware): resource-independent and application-independent services, e.g. authentication, authorisation, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection.

Grid Fabric (Resources): resource-specific implementations of basic services, e.g. transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass.
Online Data Rate vs Size

[Figure: Level-1 trigger rate (Hz) versus event size (bytes) for LEP, UA1, H1, ZEUS, NA49, KLOE, HERA-B, CDF, CDF II, ALICE, LHCb, ATLAS and CMS. The LHC experiments sit in the demanding corner of the plot: high Level-1 trigger rate (1 MHz), high number of channels and bandwidth (500 Gbit/s), and high data archive volume (PetaBytes).]

"How can this data reach the end user?" It doesn't… a factor of O(1000) online data reduction is applied via trigger selection.
Offline Data Hierarchy
"RAW, ESD, AOD, TAG"

 RAW (~1 MB/event): recorded by the DAQ; triggered events; detector digitisation.
 ESD (~100 kB/event): reconstructed information; pseudo-physical information such as clusters and track candidates (electrons, muons), etc.
 AOD (~10 kB/event): physical information; transverse momentum, association of particles, jets, (best) identification of particles; physical information for the relevant "objects".
 TAG (~1 kB/event): selected analysis information; relevant information for fast event selection.
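As a rough illustration of how the hierarchy nests, the levels can be pictured as successively smaller per-event summaries, each pointing back to the level below. This is a hypothetical sketch; the class and field names are invented, not an experiment's actual event model.

    // Hypothetical sketch of the RAW/ESD/AOD/TAG hierarchy.
    public class EventHierarchySketch {

        static class Raw { byte[] detectorDigits; }               // ~1 MB/event, written by the DAQ
        static class Esd { double[] clusters; double[] tracks; }  // ~100 kB/event, reconstruction output
        static class Aod { double[] jetPt; int[] particleIds; }   // ~10 kB/event, physics objects
        static class Tag { double mee; int nJets; long eventRef; } // ~1 kB/event, fast-selection variables

        // Navigation is always "downwards": a TAG entry refers back to the AOD,
        // which refers back to the ESD, which refers back to the RAW event.
    }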
Physics Analysis
(ESD: data or Monte Carlo; increasing data flow downwards)

 Tier 0,1 (collaboration-wide): event tags, raw data and calibration data feed event selection, producing Analysis Object Data (AOD) and analysis skims.
 Tier 2 (analysis groups): analysis of the AOD produces physics objects.
 Tier 3,4 (physicists): physics analysis of the physics objects.

Data structure:
 Real data: the trigger system (Level 3 trigger, trigger tags) and data acquisition, with the run conditions, produce raw data; reconstruction, using calibration data, produces Event Summary Data (ESD) and event tags.
 Simulated data: physics models and detector simulation produce MC raw data; reconstruction produces MC Event Summary Data and MC event tags.

REAL and SIMULATED data required. Central and distributed production.
A running (non-Grid) experiment

Three steps to select an event today:
1. Remote access to O(100) TBytes of ESD data,
2. via remote access to 100 GBytes of TAG data,
3. using an offline selection, e.g. ZeusIOVariable (Ee>20.0)and(Ntrks>4).

 Access to the remote store via batch jobs
 1% database event-finding overhead
 O(1M) lines of reconstruction code
 No middleware
 20k lines of C++ "glue" from the Objectivity (TAG) to the ADAMO (ESD) database
 100 million selected events from 5 years' data
 TAG selection via 250 variables/event
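In spirit, the TAG-level selection is just a cut applied to a small record per event, producing an event list that is then used to fetch the ESD remotely. A minimal sketch follows; the record fields and selection are illustrative, not the ZEUS schema.

    import java.util.ArrayList;
    import java.util.List;

    // Minimal sketch of a TAG-level selection such as (Ee>20.0)and(Ntrks>4).
    public class TagSelectionSketch {

        static class TagRecord {
            long eventId;   // reference back to the ESD/RAW event
            double ee;      // electron energy
            int ntrks;      // number of tracks
        }

        // Scan the (small) TAG store and return the ids of events passing the cut.
        static List<Long> select(List<TagRecord> tagStore) {
            List<Long> selected = new ArrayList<>();
            for (TagRecord t : tagStore) {
                if (t.ee > 20.0 && t.ntrks > 4) {
                    selected.add(t.eventId);
                }
            }
            return selected;   // this event list drives the remote ESD access
        }
    }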
A future (Grid) experiment

1000 million events from 1 year's data-taking; TAG selection via 250 variables.

Three steps to (analysis) heaven:
1. 10 (1) PByte of RAW (ESD) data per year,
2. 1 TByte of TAG data (local access) per year,
3. an offline selection, e.g. ATLASIOVariable (Mee>100.0)and(Njets>4).

 Interactive access to the local TAG store
 Automated batch jobs to distributed Tier-0, -1, -2 centres
 O(1M) lines of reconstruction code
 O(1M) lines of middleware… NEW…
 O(20k) lines of Java/C++ provide the "glue" from the TAG to the ESD database

All working? Efficiently?
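One way to picture the TAG-to-ESD "glue" on a Grid: map the selected event ids to the logical files (LFNs) holding their ESD records, so that one batch job per file can be sent to a Tier centre with a replica. The sketch below is hypothetical; the real glue code is experiment-specific.

    import java.util.*;

    // Group selected events by the ESD file that holds them, ready for job submission.
    public class TagToEsdGlueSketch {

        // Hypothetical lookup: which logical file contains a given event's ESD record.
        interface EventIndex { String lfnForEvent(long eventId); }

        static Map<String, List<Long>> planJobs(List<Long> selectedEvents, EventIndex index) {
            Map<String, List<Long>> eventsByLfn = new HashMap<>();
            for (long id : selectedEvents) {
                eventsByLfn.computeIfAbsent(index.lfnForEvent(id), k -> new ArrayList<>()).add(id);
            }
            return eventsByLfn;  // one entry per LFN -> one batch job near a replica of that file
        }
    }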
Grid Data Management:
Requirements

1. "Robust" – software development infrastructure
2. "Secure" – via Grid certificates
3. "Scalable" – non-centralised
4. "Efficient" – optimised replication

Examples: GDMP, Spitfire, Reptor, Optor
1. Robust?
Development Infrastructure

 CVS Repository
   management of DataGrid source code
   all code available (some mirrored)
 Bugzilla
 Package Repository
   public access to packaged DataGrid code
 Development of Management Tools
   statistics concerning DataGrid code
   auto-building of DataGrid RPMs
   publishing of generated API documentation
   latest build = Release 1.2 (August 2002)

[Chart: testbed 1 source code lines by language – java, cpp, ansic, python, perl, sh, csh, sed, sql, makefile. 140,506 lines of code in 10 languages (Release 1.0).]
1. Robust?
Software Evaluation

[Table: evaluation status of the DataGrid testbed components – Resource Broker, Job Desc. Lang., Info. Index, User Interface, Log. & Book. Svc., Job Sub. Svc., Broker Info. API, SpitFire, FTree, R-GMA, Schema, Globus Rep. Cat., Rep. Cat. API, GDMP, EDG Globus Config., Globus2 Toolkit, PingER, UDPMon, IPerf, Image Install., LSF Info. Prov., File Elem. Script, Info. Prov. Config., RFIO, Mkgridmap & daemon, CRL update & daemon, Security RPMs, CCM, PBS Info. Prov., MSS Staging, LCFG, SE Info. Prov., GRM/PROVE, Archiver Module – each marked against the categories below.]

ETT = Extensively Tested in Testbed
UT  = Unit Testing
IT  = Integrated Testing
NI  = Not Installed
NFF = Some Non-Functioning Features
MB  = Some Minor Bugs
SD  = Successfully Deployed
1. Robust?
Middleware Testbed(s)

Testing activities run from EU-wide development through to validation and maintenance across a chain of testbeds:

 "WP specific" testbeds: the WPs add unit-tested code to the CVS repository.
 "Development" testbed: nightly builds and automated tests are run; any errors are fed back to the WPs and fixed.
 "Certification" testbed: candidate releases are installed and backward-compatibility tests run (ITeam/TSTG); once errors are fixed, a candidate beta release is made for testing by the applications.
 "Application" testbed: the applications groups (Apps/ATG) test the beta; if no errors remain, a candidate public release is made for use by the applications. Support moves from office hours to 24x7.

(B. Jones, July 2002)
1. Robust?
Code Development Issues

PRODUCTION: Simulation – the production team runs experiment-specific modules on the Grid, supported by a VO job submission/bookkeeping service:

 Login: if the actor is proxy certified, select the experiment-wide database (VO metadata configuration catalog).
 Get LFNs for database access; set output-file storage options and preferences (SE, MSS, closest…); allocate output LFNs.
 Display available resources/JDL; match job resources; define execution criteria (CE, priority…); allocate a Job Id; write the submission job (JDL?) and submit it to the Grid, recording the job parameters (JDL, input, …).
 Optimise the CE choice per VO, e.g. automatic file replication or file transfer with file catalog update; file management and PFN selection.
 Submit the job to a CE, then to a Worker Node; prepare the execution environment and associate PFN with LFN; execute the physics application. The application is never recompiled or relinked to run on the Grid – access to data is done via standard POSIX calls (???) – Open(LFN), Read/Write, Close – or a grid wrapper to POSIX calls, with VO database access via a Grid API (see the sketch after this list).
 Register/update attributes (LFN) in the VO metadata (data description) catalog; publish and manage job-related information; manage output files and update the file catalog (LFN-PFN) in the VO replica catalog; record execution information in the job execution accounting service.

Code development issues:
 Reverse engineering (C++ code analysis and restructuring; coding standards) => abstraction of existing code to UML architecture diagrams
 Language choice (currently 10 used in DataGrid)
   Java = C++ minus "features" (global variables, pointer manipulation, goto statements, etc.)
   Constraints (performance, libraries, legacy code)
 Testing (automation, object-oriented testing)
 Industrial strength?
 OGSA-compliant?
 O(20 year) future proof??
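The "grid wrapper to POSIX calls" idea mentioned above can be sketched as follows: the application opens a logical file name (LFN), the wrapper consults a replica catalog to pick a physical file name (PFN), and from then on reads look like ordinary file I/O. The ReplicaCatalog interface here is hypothetical, not a real EDG/Globus API.

    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // Sketch of a grid wrapper around POSIX-style open.
    public class GridPosixWrapperSketch {

        interface ReplicaCatalog {
            // Return the "closest" physical replica of a logical file.
            String bestPfn(String lfn);
        }

        private final ReplicaCatalog catalog;

        GridPosixWrapperSketch(ReplicaCatalog catalog) { this.catalog = catalog; }

        // open(LFN): resolve to a PFN, then open it like any local file.
        InputStream open(String lfn) throws IOException {
            String pfn = catalog.bestPfn(lfn);             // e.g. "/storage/se1/run1234.esd"
            return Files.newInputStream(Paths.get(pfn));   // read/close are then plain file I/O
        }
    }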
Data Management on the Grid

"Data in particle physics is centred on events stored in a database…
Groups of events are collected in (typically GByte) files…
In order to utilise additional resources and minimise data analysis time,
Grid replication mechanisms are currently being used at the file level."

 Access to a database via Grid certificates (Spitfire/OGSA-DAI)
 Replication of files on the Grid (GDMP/Giggle)
 Replication and optimisation simulation (Reptor/Optor)
2. Spitfire

"Secure?" – at the level required in particle physics.

Request flow (HTTP + SSL, request + client certificate):
 Servlet container (SSLServletSocketFactory): the TrustManager checks whether the certificate is signed by a trusted CA and whether it appears in the revoked-certificates repository; if not, the request is rejected.
 Security servlet (authorization module): does the user specify a role? If not, find the default; check the role against the role repository.
 Translator servlet: map the role to a connection id via the role-connection mappings and request a connection from the RDBMS connection pool.
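The same chain of checks, written as plain Java rather than servlet code, might look like the sketch below. The repositories and the role-to-connection mapping are stand-ins; the real service uses a servlet container, a TrustManager and an RDBMS connection pool.

    import java.util.Map;
    import java.util.Set;

    // Schematic of the certificate -> role -> connection checks.
    public class SecurityFlowSketch {

        static class Request { String caIssuer; String certSerial; String role; }

        private final Set<String> trustedCAs;            // trusted certificate authorities
        private final Set<String> revokedCerts;          // revoked-certificates repository
        private final Set<String> knownRoles;            // role repository
        private final Map<String, String> roleToConnId;  // role -> database connection id

        SecurityFlowSketch(Set<String> trustedCAs, Set<String> revokedCerts,
                           Set<String> knownRoles, Map<String, String> roleToConnId) {
            this.trustedCAs = trustedCAs;
            this.revokedCerts = revokedCerts;
            this.knownRoles = knownRoles;
            this.roleToConnId = roleToConnId;
        }

        // Returns the database connection id the request may use, or throws.
        String authorise(Request r) {
            if (!trustedCAs.contains(r.caIssuer)) throw new SecurityException("not signed by a trusted CA");
            if (revokedCerts.contains(r.certSerial)) throw new SecurityException("certificate revoked");

            String role = (r.role != null) ? r.role : "default";   // no role specified -> use default
            if (!knownRoles.contains(role)) throw new SecurityException("role not permitted");

            return roleToConnId.get(role);  // translator step: role -> connection from the pool
        }
    }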
2. Database client API

 A database client API has been defined
 Implemented as a grid service using standard web service technologies
 Ongoing development with OGSA-DAI

Talk: "Project Spitfire – Towards Grid Web Service Databases"
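From the application side, such a client API amounts to sending a query to a named remote database, authenticated with the caller's Grid certificate. The interface and method names below are invented for illustration and are not the actual Spitfire/OGSA-DAI API.

    import java.util.List;

    // Hypothetical use of a database client API exposed as a grid/web service.
    public class DatabaseClientSketch {

        interface GridDatabaseClient {
            // Execute a query against a named remote database.
            List<String[]> query(String database, String sql);
        }

        static void example(GridDatabaseClient client) {
            // Role-based access control and connection pooling happen on the server side.
            for (String[] row : client.query("tagdb", "SELECT eventId FROM tag WHERE ntrks > 4")) {
                System.out.println(row[0]);
            }
        }
    }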
3. GDMP and the Replica Catalogue

TODAY: a centralised, LDAP-based catalogue – StorageElement1, StorageElement2 and StorageElement3 all register their replicas in the Globus 2.0 Replica Catalogue (LDAP).

GDMP 3.0 = file mirroring/replication tool. Originally for replicating CMS Objectivity files for High Level Trigger studies; now used widely in HEP.
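Conceptually the catalogue maps each logical file name (LFN) to the physical file names (PFNs) of its replicas on the storage elements. The toy in-memory version below mimics that idea only; it is not the Globus LDAP schema or the GDMP API.

    import java.util.*;

    // Toy in-memory replica catalogue: LFN -> PFNs of its replicas.
    public class ReplicaCatalogueSketch {
        private final Map<String, List<String>> replicas = new HashMap<>();

        // Called by a storage element (or after a GDMP transfer) to publish a copy.
        void register(String lfn, String pfn) {
            replicas.computeIfAbsent(lfn, k -> new ArrayList<>()).add(pfn);
        }

        // Called by a job that wants to read a logical file.
        List<String> lookup(String lfn) {
            return replicas.getOrDefault(lfn, Collections.<String>emptyList());
        }

        public static void main(String[] args) {
            ReplicaCatalogueSketch cat = new ReplicaCatalogueSketch();
            cat.register("lfn://cms/run42.raw", "se1.example.org:/data/run42.raw");
            cat.register("lfn://cms/run42.raw", "se2.example.org:/data/run42.raw");
            System.out.println(cat.lookup("lfn://cms/run42.raw"));  // two replicas
        }
    }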
3. Giggle: "Hierarchical P2P"

"Scalable?" Trade-off: consistency versus efficiency.

Hierarchical indexing: each storage element has a Local Replica Catalog (LRC); the higher-level Replica Location Index (RLI) contains pointers to lower-level RLIs or LRCs.

RLI = Replica Location Index
LRC = Local Replica Catalog
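The idea can be sketched as a small tree of index nodes: an LRC is authoritative for one storage element, while an RLI only records which children might know an LFN, so a lookup walks down the hierarchy. Class names follow the acronyms; this is not a real Giggle API, and the staleness of the RLI summaries is exactly the consistency-versus-efficiency trade-off above.

    import java.util.*;

    // Sketch of hierarchical replica location (RLI over LRCs).
    public class GiggleSketch {

        interface Node { List<String> lookup(String lfn); }

        // Leaf: exact LFN -> PFN mappings for one storage element.
        static class Lrc implements Node {
            final Map<String, List<String>> pfns = new HashMap<>();
            public List<String> lookup(String lfn) {
                return pfns.getOrDefault(lfn, Collections.<String>emptyList());
            }
        }

        // Inner node: (possibly stale) summaries of which children know an LFN.
        static class Rli implements Node {
            final Map<String, List<Node>> children = new HashMap<>();
            public List<String> lookup(String lfn) {
                List<String> result = new ArrayList<>();
                for (Node child : children.getOrDefault(lfn, Collections.<Node>emptyList())) {
                    result.addAll(child.lookup(lfn));   // descend to RLIs or LRCs below
                }
                return result;
            }
        }
    }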
4. Reptor/Optor: File Replication / Simulation

Reptor: replica architecture. A User Interface and Resource Broker talk to the Replica Location Indices and a Replica Metadata Catalogue; each site runs a Replica Manager (with Core, Optimisation and Processing APIs) on top of a Local Replica Catalogue, an Optimiser, Pre-/Postprocessing, a Computing Element and a Storage Element.

Optor: tests file replication strategies, e.g. an economic model.

"Efficient?" Requires simulation studies…

Demo and Poster: "Studying Dynamic Grid Optimisation Algorithms for File Replication"
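The flavour of an "economic model" decision of the kind Optor simulates: replicate a file locally only if its predicted value (estimated here, very crudely, from recent access counts) exceeds the transfer cost plus what is lost by evicting the least valuable local file. The valuation below is deliberately simplistic and not the actual Optor algorithm.

    import java.util.Map;

    // Sketch of an economic-model replication decision.
    public class EconomicReplicationSketch {

        // Predicted value: here just the number of recent local accesses of the file.
        static double predictedValue(String lfn, Map<String, Integer> recentAccesses) {
            return recentAccesses.getOrDefault(lfn, 0);
        }

        // Should this storage element replicate the file locally?
        static boolean shouldReplicate(String lfn,
                                       Map<String, Integer> recentAccesses,
                                       double transferCost,
                                       double cheapestLocalFileValue) {
            double gain = predictedValue(lfn, recentAccesses);
            // Replicate if the expected gain beats the transfer cost plus the eviction loss.
            return gain > transferCost + cheapestLocalFileValue;
        }
    }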
Application Requirements

"The current EMBL production database is 150 GB, which takes over four hours to download at full bandwidth capability at the EBI. The EBI's data repositories receive 100,000 to 250,000 hits per day with 20% from UK sites; 563 unique UK domains with 27 sites have more than 50 hits per day." (MyGrid Proposal)

Suggests:
 Less emphasis on efficient data access and data hierarchy aspects (application specific).
 Large gains in biological applications from efficient file replication.
 Larger gains from application-specific replication?
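A rough cross-check of the quoted figures, assuming the four hours is limited purely by the link speed and taking 1 GB = 10^9 bytes:

    150 GB ≈ 150 × 8 × 10^9 bits = 1.2 × 10^12 bits
    1.2 × 10^12 bits / (4 × 3600 s) ≈ 83 Mbit/s

i.e. the quoted download time corresponds to a sustained rate of order 100 Mbit/s.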
Events.. to Files.. to Events

Events 1, 2 and 3 each have RAW, ESD, AOD and TAG representations. An "Interesting Events List" selects events, but replication works on files:

 RAW data files – Tier-0 (International)
 ESD data files – Tier-1 (National)
 AOD data files – Tier-2 (Regional)
 TAG data files – Tier-3 (Local)

Not all pre-filtered events are interesting…
Non pre-filtered events may be…
=> File replication overhead.
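The overhead arises because the selection is event-level while replication is file-level: whole GByte files must be copied even if they contain only a few interesting events. A small illustrative sketch (names hypothetical):

    import java.util.*;

    // Given an interesting-events list, find which whole files must be replicated.
    public class FileReplicationOverheadSketch {

        static Set<String> filesToReplicate(List<Long> interestingEvents,
                                            Map<Long, String> eventToLfn) {
            Set<String> files = new HashSet<>();
            for (long eventId : interestingEvents) {
                files.add(eventToLfn.get(eventId));   // the file holding this event's data
            }
            return files;   // every file here is replicated in full, interesting events or not
        }
    }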
Events.. to Events
Event Replication and Query Optimisation

Events 1, 2 and 3 live in a distributed (replicated) database:
 RAW – Tier-0 (International)
 ESD – Tier-1 (National)
 AOD – Tier-2 (Regional, replicated)
 TAG – Tier-3 (Local)

The "Interesting Events List" drives replication at the event level, turning data into knowledge. ("Stars in Stripes")
Data Grid for the Scientist

[Cartoon: the physicist (E = mc2) works through the Grid middleware (@#%&*!) in order to get back to the real (or simulated) data.]

An incremental process… At what level is the metadata? file?… event?… sub-event?…
Summary

Yesterday's data access issues are still here
 They just got bigger (by a factor of 100)
 A data hierarchy is required to access more data more efficiently… but is insufficient on its own

Today's Grid tools are developing rapidly
 They enable replicated file access across the Grid
 File replication is standard (lfn://, pfn://)
 Standards for Grid data access are emerging…

Tomorrow
".. never knows"
 Replicated "events" on the Grid?..
 Distributed databases?..
 or did that diagram look a little too monolithic?