Secure Grid Data Management Technologies in ATLAS
Miguel Branco (CERN)
D. Malon, A. Vaniachine (ANL)
CHEP 2004
Overview
 Introduction
 ATLAS Production System
 ATLAS Databases
 Conclusion
29/09/2004
M. Branco, D. Malon, A. Vaniachine - CHEP 2004
Introduction
 Security “requirements” from a Data Challenges production manager:
o A production manager cares about its data
• Not so much about the underlying middleware security
o we want our data to be available
o we don’t want to lose our data
o we don’t want it corrupted
o we want to remain good friends with site managers
o we don’t want to upset our physicists with security restrictions
o we want to be in charge of the production and of the data
o we want to audit data usage and data access
 … from a physicist:
o “don’t bother me”
o No visible security
o Especially no grid certificates
o Especially not wanting to request or renew grid certificates

ATLAS Data Challenges
 ATLAS decided to undertake a series of Data Challenges in order to validate its Computing Model, its software, its data model
 Started summer 2004:
o ATLAS DC-2
o Unsupervised production across many sites spread over three different Grids (US Grid3, NorduGrid, LCG-2)
 Introduced the new ATLAS Automatic Production System (see #501):
o 4 major components:
• Production Database
• Windmill: ATLAS Production Supervisor
• Job Executors: one executor per “grid-flavor”
• Common Data Management system: Don Quijote (see #142)

ATLAS Production System
 Brief overview:
o Windmill (the production supervisor) connects to the Oracle Database at CERN
• Several Windmill instances running world-wide
o Job executors connect to Windmill using Jabber
• On-going work regarding the usage of GSI and Jabber
o Each grid Executor has the user certificate of the production manager
o Job executors interact with grid middleware and with the data management service using grid certificates
o Windmill interacts with the data management service without grid certificates

ATLAS Production System
[Diagram with the labelled components: Prod. DB, Windmill, Don Quijote, Executor, a grid…, another grid…]

Don Quijote
 Access service for grid file-based data
o High-level interface for grid data management for the ATLAS Automatic Production System
o Allow transparent registration and movement of replicas between all grid “flavors” used by ATLAS
• Across different grid “islands” as well as within a given grid
• US Grid3, NorduGrid and LCG-2

Don Quijote
 Security model:
o Client
• Secure version (GSI) and insecure version
o Server
• Either forwards user credentials (if the client is using the secure version)
• Or acts on behalf of the user using a service certificate
o When does the server do what?
• Security needs depend on the action being taken, e.g.:
• Search requests can be done with the service certificate if the end-user didn’t supply credentials
• All other requests require a secure client, BUT:
– The service certificate can still be used (terrible, but pragmatic decision)

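The server-side decision described on the Don Quijote security slide can be sketched as a small policy function. This is only an illustration under stated assumptions: the names `Request`, `choose_credentials` and `SERVICE_CERT`, and the action strings, are hypothetical and are not the Don Quijote API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder for the service certificate held by the server.
SERVICE_CERT = "service-certificate"


@dataclass
class Request:
    action: str                      # e.g. "search", "register", "move" (assumed names)
    secure_client: bool              # True if the GSI-enabled client was used
    user_credentials: Optional[str]  # forwarded user credentials, if any


def choose_credentials(req: Request) -> str:
    """Return the credential the server acts with, or raise PermissionError."""
    if req.action == "search":
        # Searches may fall back to the service certificate
        # when the end-user supplied no credentials.
        return req.user_credentials or SERVICE_CERT
    if not req.secure_client:
        # Every other request requires the secure (GSI) client.
        raise PermissionError(f"{req.action!r} requires the secure client")
    # Secure client: forward the user's credentials when present;
    # the slide notes the service certificate can still be used.
    return req.user_credentials or SERVICE_CERT
```

For example, `choose_credentials(Request("search", False, None))` falls back to the service certificate, while a `"move"` request from the insecure client is rejected.
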
Problems encountered during ATLAS DC
 ATLAS has some long jobs (over 24 hours)
o Using MyProxy server on LCG
• Had some downtime which affected LCG production
 Replica catalogs from all 3 grids:
o Not supporting namespaces, ACLs, …
• Access is either fully granted or not granted at all
• New LCG File Catalog supports ACLs that are mapped to individual user accounts: does this scale?
o Any user can change the LCG catalog without any security requirements
o Looking forward to new replica catalogs

Problems encountered during ATLAS DC
 LCG:
o ATLAS jobs being submitted as atlassgm
o Problem was pointed out, but no one ever complained about it as a possible security breach
o Castor@CERN accessible from “grid” and from outside “grid”
• Very useful but…
 Grid3 and NorduGrid:
o Entire ATLAS Data Challenges production ran on behalf of a few users
 Only NorduGrid complained of the existence of the Don Quijote service certificate for data access
 “Single” ATLAS “VO”
o Not effectively used across 3 grids!!

Status on security for ATLAS Production System
 ATLAS Production System will be reviewed soon:
o Security will be taken into consideration for all components
o So far, the need to have a working system as quickly as possible delayed the process
 Security “gaps” derive mostly from the existing grid middleware
 Overall still a bit to go, but the usage of client tools provided by the Production System protects end-users
o Don Quijote client tools to access data files produced by the Data Challenges

What do we need?
 ATLAS will need to develop grid services on top of the existing ones
 Experiments need:
o Middleware (or at least a set of guidelines) on how to develop and host secure services
o Hoped to get this quickly from EGEE
• And from ARDA as well
o Not clear yet: no clear “standards”
• WS-Security components not completed yet
• GSI delegation not supported commercially
• Grid AuthZ needs to be standards-based

File-based data
 Despite current limitations, the grid security model and data transport mechanisms are suited for handling the file-based data
 Also true for database-resident file cataloguing and file-level metadata:
o stored in grid-based Replica Location catalogs and respective Metadata catalogs

Databases and the Grid
 In addition to file-based event data, LHC data processing applications traditionally require access to large amounts of valuable non-event data stored in relational databases
o detector conditions, calibrations, etc.
o For that purpose ATLAS Data Challenges exercise the Computing Model processing and managing data on three different grid flavors
 In contrast to the file-based data, this database-resident data flow has to be detailed further

ATLAS Data Challenge 2 Database Infrastructure
[Diagram]

Securing Database-resident data on the Grid
 ATLAS is evaluating several technologies for securing database-resident data
 Secure grid query engine technologies federating heterogeneous databases on the grid
o used at Fermilab Run II experiments
 Methods utilizing GSI data-transport channel for database services delivery to the grid clusters behind closed firewalls
 Grid certificate authorization technologies for database access control where the safety features are pushed into the database engine code

Database Grid Solution
 Prototype of the Database Grid Solution is in use in Fermilab’s Run II Data Handling system, servicing millions of queries per day

Penetrating Firewalls
 ATLAS applications require open TCP/IP channels to the database servers
o To deliver database-resident data
 To harness Grid computing resources that are not dedicated to ATLAS one must address the problem of data delivery to the computing nodes on the clusters behind the closed firewalls
 As partial solutions, in ATLAS we are implementing:
o Database server replica deployment on a dedicated node behind the firewall
o Network address translation (NAT) techniques providing TCP/IP conduits to the listed database servers’ ports/IP addresses
 All of these require considerable involvement of the cluster support personnel
 An alternative using GSI data-transfer channels, without requiring changes to the cluster configuration, is presented

Secure GSI Transport Channel
[Diagram: Extract-Transport-Install. Labels: Main Server, Extract & Transport, Transport & Install, Replica Servers]
 MySQL simplified the delivery of the extract-transport-install components of the ATLAS database
 This provides the database services needed for the Data Challenges for sites with Grid Compute Elements behind closed firewalls
o some sites on Grid3 and NorduGrid

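As a rough illustration of the extract-transport-install idea, the sketch below only builds the three command lines and does not run them. The choice of tools is an assumption, not taken from the slides: `mysqldump` for the extract, `globus-url-copy` over `gsiftp://` for the GSI transport, and the `mysql` client reading the dump on standard input for the install. The function name `eti_plan`, the host name and the paths are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    name: str
    argv: List[str]
    stdin_file: Optional[str] = None  # file to feed on stdin, if any


def eti_plan(database: str, dump_path: str,
             replica_host: str, remote_path: str) -> List[Step]:
    """Build (but do not execute) an extract-transport-install plan.

    dump_path and remote_path are assumed to be absolute paths.
    """
    return [
        # Extract: dump the database on the main server.
        Step("extract", ["mysqldump", f"--result-file={dump_path}", database]),
        # Transport: copy the dump over a GSI-authenticated GridFTP channel.
        Step("transport", ["globus-url-copy",
                           f"file://{dump_path}",
                           f"gsiftp://{replica_host}{remote_path}"]),
        # Install: load the dump into the replica server behind the firewall.
        Step("install", ["mysql", database], stdin_file=remote_path),
    ]
```

Each `Step` could then be handed to `subprocess.run` on the appropriate host; the point of the sketch is that only the transport step crosses the firewall, and it uses the data-transfer channel that grid sites already allow.
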
Database Access on the Grid
 Two different security models (and schools)
o Using a separate server:
• Spitfire (EDG WP2): SOAP/XML text-only data transport
• DAI (IBM UK): Spitfire technologies + XML binary extensions
• Perl DBI database proxy (ALICE): SQL data transport
• Oracle 10g (separate authorization layer)
o Integrated in database server:
• Instead of surrounding database with external secure layers the safety features are embedded inside of the code
– Open-source databases (MySQL, PostgreSQL)
– IBM DB2 loadable security modules techniques

External Security
 Providing security features in a separate layer
o Advantages:
• Proven traditional approach, used everywhere
o Disadvantages:
• Weak database authorization techniques behind the secure layer
• Clear-text passwords embedded in the code
• Limited control over the secure transport channel, cryptographic handshake overhead for every gSOAP message
• Requires protocol extensions (XML with binary attachments)

Embedded Security
 Instead of surrounding database with external secure layers the safety features are embedded inside of the code
o Advantages:
• Elimination of the clear-text passwords
• Integration of the same grid security model throughout all data flow channels
• Inefficient data transfer bottlenecks are eliminated
o Disadvantages:
• Pushing secure authorization into the database engine results in a monolithic system, and such systems are known to be more fragile

Grid-enabling Databases
 Grid-enabled MySQL server is deployed on the database development server tier in ATLAS
o Certificate authorization is supported directly on the database server
o Only authentication must be encrypted, the TCP/IP data-transfer channel can be used in un-encrypted mode (for data transfer efficiency)
 The technology was used extensively in DC2 pre-production on Grid3
o grid-proxy certificate authorization was used for processing of 7K jobs
 In addition, use of certificate credentials provided capabilities for an efficient locking mechanism to support chaotic mode of job submission on the Grid

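One way certificate credentials can support locking under chaotic job submission is to record the submitter's certificate subject as the owner of a job in a single atomic update. The sketch below is an assumption about how such a lock could look, not the ATLAS implementation: it uses an in-memory SQLite database as a stand-in for the grid-enabled MySQL server, and the table, column and function names (`jobs`, `owner_dn`, `make_db`, `claim_job`) are hypothetical.

```python
import sqlite3


def make_db(job_ids):
    """Create an in-memory jobs table with no owners assigned yet."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner_dn TEXT)")
    conn.executemany("INSERT INTO jobs (id) VALUES (?)", [(j,) for j in job_ids])
    conn.commit()
    return conn


def claim_job(conn, job_id, cert_dn):
    """Atomically claim job_id for the certificate subject cert_dn.

    Returns True only for the caller whose UPDATE actually changed the row,
    so two submitters racing for the same job cannot both win.
    """
    cur = conn.execute(
        "UPDATE jobs SET owner_dn = ? WHERE id = ? AND owner_dn IS NULL",
        (cert_dn, job_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the ownership check and the write happen in one statement, no separate lock table or client-side coordination is needed; the authenticated certificate subject doubles as the lock holder's identity.
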
Status on security for ATLAS Databases
 To overcome database access limitations one must go beyond the existing grid infrastructure
 We are evaluating the technologies laying a foundation of a new hyperinfrastructure:
o Secure grid query engine technologies federating heterogeneous databases on the grid
o Methods utilizing Grid Security Infrastructure data-transport channel for database services delivery to the grid clusters behind closed firewalls
o Grid certificate authorization technologies for database access control where the safety features are pushed into the database engine code

Conclusion
 ATLAS is taking security seriously into consideration
 Two main areas where security is being addressed:
o File-based data
o Database-resident data
 Security improvements for file-based data depend on new grid middleware; help from grid middleware providers is needed
 Developments on secure database access are also on-going; looking into new projects such as LCG3D
 Urgent need of grid certificates for all users
o where it all begins…