RAL Tier A - PPD - STFC Particle Physics Department


BaBar Grid
Tim Adye
Particle Physics Department
Rutherford Appleton Laboratory
PP Grid Team
Cosener's House
8th November 2002
Talk Plan
BaBar distributed computing model
RAL Tier A
Remote job submission
BaBar VO and Authorisation
Data distribution
The BaBar Collaboration
9 Countries
74 Institutions
566 Physicists
PEP-II e+e- Ring and BaBar Detector
LER (e+, 3.1 GeV), I(e+) = 2.1 A
Linear Accelerator
HER (e-, 9.0 GeV), I(e-) = 1.0 A
PEP-II ring: C=2.2 km
May 26, 1999: 1st events recorded by BaBar
BaBar’s Distributed Computing Model
• Goal is to spread the computing load more widely across
the collaboration
• Simulation production is already distributed – 75% in the UK!
• Now have three new “Tier A” centres
• Lyon – Objectivity (database) analysis (since last year)
• RAL – Kanga (ROOT microDST) analysis (from May 2002)
• Padova – Reprocessing (just starting)
• Also several “Tier C” sites (i.e. universities, 9 in the UK)
• Analysis data format (Kanga vs Objectivity) is a
matter of heated debate at the moment
• Whatever the future of Objectivity, Kanga (championed
in UK/Germany) looks set to continue
RAL Tier A
• UK MoU with BaBar reduces our common fund
contributions in exchange for providing a Tier A facility
• RAL has now relieved SLAC of all Kanga analysis
• Impressive take-up from UK and non-UK users
• See Andrew’s talk
• It is the primary repository of Kanga data
• ~20 TB on disk
• BaBar analysis environment tries to mimic SLAC so
external users feel at home
• Grid job submission should greatly reduce this requirement
Remote Job Submission
Short term (this month!)
• Allow SLAC or University users to submit BaBar
analysis jobs to RAL or Lyon Tier A sites from their
home machines
• dg-job-submit
• Simplifies local development and debugging, while providing
access to full dataset and large CPU farms
• RAL vs IN2P3 selected explicitly by user
• “canned” JDL Requirements; dataset selection left to user
• Why couldn’t we do this a year ago?
• BaBar authorisation (see later)
• Gatekeeper needed to be able to submit to production farm
• Define which BaBar configuration files to send with job
• Developed a procedure to merge all tcl files into one
• Resource Broker reliability – better with EDG 1.2.
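As an illustration of the "canned" JDL idea above, the following minimal Python sketch writes a job description in which only the target Tier A site changes. The attribute names (Executable, InputSandbox, Requirements and so on) are standard JDL; the `other.CEId` form of the requirement is assumed from EDG 1.x usage, and the CE identifiers, script name and tcl file name are hypothetical placeholders, not the real BaBar values.

```python
def quote(text):
    """Wrap a value in double quotes, as JDL string attributes require."""
    return '"' + text + '"'


def make_jdl(executable, arguments, ce_id, sandbox_files):
    """Return a JDL job description that pins the job to one CE."""
    sandbox = ", ".join(quote(name) for name in sandbox_files)
    lines = [
        "Executable = " + quote(executable) + ";",
        "Arguments = " + quote(arguments) + ";",
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        "InputSandbox = {" + sandbox + "};",
        'OutputSandbox = {"job.out", "job.err"};',
        # Assumption: the site is selected by matching the CE identifier.
        "Requirements = other.CEId == " + quote(ce_id) + ";",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical CE identifiers; the user picks RAL or IN2P3 explicitly.
TIER_A_CES = {
    "RAL": "ce.ral.example:2119/jobmanager-pbs-babar",
    "IN2P3": "ce.in2p3.example:2119/jobmanager-bqs-babar",
}

# The merged tcl file travels with the job in the input sandbox.
jdl = make_jdl(
    "run_analysis.sh",
    "merged.tcl",
    TIER_A_CES["RAL"],
    ["run_analysis.sh", "merged.tcl"],
)
print(jdl)
```

The resulting text would be saved to a file and handed to dg-job-submit; dataset selection is deliberately left out, since the slide leaves that to the user.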
Remote Job Submission
Medium term (early next year)
• Allow remote submission to UK Farms and SLAC
• In principle this is already set up
• Select site (CE) based on user requirements
• E.g. dataset available, software release, etc.
• Split job between sites based on available datasets
• Already have demonstrator for a canned analysis job
• http://www.hep.man.ac.uk/groups/slacb/gridtest.html
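The site selection and job splitting described above can be sketched as follows. This is not the demonstrator's actual logic: the site names, dataset names, release labels and the simple "fewest datasets assigned so far" tie-break are all assumptions made for illustration.

```python
def split_by_site(requested, site_datasets, site_releases, release):
    """Assign each requested dataset to one site that holds it and
    has the required software release installed.

    Returns (plan, unplaced): plan maps site -> list of datasets,
    unplaced lists datasets that no eligible site holds.
    """
    plan = {}
    unplaced = []
    for dataset in requested:
        candidates = sorted(
            site
            for site, held in site_datasets.items()
            if dataset in held and release in site_releases.get(site, ())
        )
        if not candidates:
            unplaced.append(dataset)
            continue
        # Prefer the site with the fewest datasets so far (ties: by name).
        best = min(candidates, key=lambda s: (len(plan.get(s, [])), s))
        plan.setdefault(best, []).append(dataset)
    return plan, unplaced


# Hypothetical catalogue of what each site holds and supports.
site_datasets = {
    "RAL": {"dataset-A", "dataset-B"},
    "IN2P3": {"dataset-B", "dataset-C"},
    "Manchester": {"dataset-A"},
}
site_releases = {
    "RAL": {"rel-1"},
    "IN2P3": {"rel-1"},
    "Manchester": {"rel-2"},
}

plan, unplaced = split_by_site(
    ["dataset-A", "dataset-B", "dataset-C", "dataset-D"],
    site_datasets,
    site_releases,
    "rel-1",
)
print(plan)
print(unplaced)
```

Each entry in the plan would then become one sub-job submitted to that site's CE, with the unplaced datasets reported back to the user.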
BaBar VO and Authorisation
• Use certificates from EDG and ESnet CAs for authentication
• Authorisation required to identify BaBar users
• Provides access to BaBar-specific facilities and environment
• Cannot maintain grid-mapfile by hand
• Doesn’t scale to 1202+ users
• Use existing SLAC BaBar user registration
• User provides certificate id at SLAC
• Automatic procedure checks AFS group and fills VO
• CEs use VO for authorisation
• Naturally handles people leaving the experiment
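The automatic procedure above can be pictured with a small Python sketch: registered certificate subjects are kept only while the user is still in the experiment's AFS group, and the survivors are written in the usual grid-mapfile form of a quoted subject followed by a local account. The user names, certificate subjects and the shared `babar` account are hypothetical, and the real procedure fills a VO rather than writing the map file directly.

```python
def build_gridmap(registrations, afs_group_members, local_account="babar"):
    """Return grid-mapfile text for users who are registered AND
    currently in the AFS group.

    registrations maps user name -> certificate subject (DN); an empty
    DN means the user has not registered a certificate.
    """
    lines = []
    for username, dn in sorted(registrations.items()):
        if username in afs_group_members and dn:
            lines.append('"%s" %s' % (dn, local_account))
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical registration data.
registrations = {
    "alice": "/O=Grid/O=Example/CN=Alice Example",
    "bob": "/O=Grid/O=Example/CN=Bob Example",
    "carol": "",
}
# bob has left the experiment, so he is no longer in the AFS group.
afs_group_members = {"alice", "carol"}

gridmap = build_gridmap(registrations, afs_group_members)
print(gridmap)
```

Because the list is regenerated from the AFS group each time, someone who leaves the experiment drops out without any manual edit.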
Analysis Metadata
• Currently have about a million Kanga files in a deep
directory tree
• Need a catalogue to facilitate data distribution and allow
analysis datasets to be defined.
• SQL database
• Locates ROOT files associated with each dataset
• Selections based on decay channel, run range, beam
energy, reconstruction processing version, etc.
• Each site has its own (MySQL or Oracle) database
• Includes a copy of the SLAC database with local information (e.g.
files on local disk, files to import, local tape backups)
• Some use of SRB for local Objectivity metadata at
SLAC and Lyon
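A minimal sketch of such a catalogue query is shown below. It uses Python's built-in sqlite3 module as a stand-in for the MySQL or Oracle databases mentioned above, and the table name, column names, file paths and channel labels are all invented for illustration; the real schema is not given in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per Kanga ROOT file.
conn.execute(
    """CREATE TABLE kanga_files (
           path TEXT PRIMARY KEY,
           decay_channel TEXT,
           run INTEGER,
           processing_version TEXT,
           on_local_disk INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO kanga_files VALUES (?, ?, ?, ?, ?)",
    [
        ("/kanga/a/run1000.root", "chanA", 1000, "v1", 1),
        ("/kanga/a/run1500.root", "chanA", 1500, "v1", 0),
        ("/kanga/a/run2500.root", "chanA", 2500, "v1", 1),
        ("/kanga/b/run1200.root", "chanB", 1200, "v1", 1),
        ("/kanga/a/run1100.root", "chanA", 1100, "v2", 1),
    ],
)


def select_files(conn, channel, run_min, run_max, version, local_only=True):
    """Return the ROOT file paths matching a dataset selection."""
    sql = (
        "SELECT path FROM kanga_files "
        "WHERE decay_channel = ? AND run BETWEEN ? AND ? "
        "AND processing_version = ?"
    )
    params = [channel, run_min, run_max, version]
    if local_only:
        # Site-local information: only files already on local disk.
        sql += " AND on_local_disk = 1"
    sql += " ORDER BY path"
    return [row[0] for row in conn.execute(sql, params)]


local = select_files(conn, "chanA", 1000, 2000, "v1")
everything = select_files(conn, "chanA", 1000, 2000, "v1", local_only=False)
print(local)
print(everything)
```

The difference between the two results is exactly the set of files a site would still need to import for that dataset.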
Data Distribution
• Kanga and Objectivity distribution currently handled
by homegrown procedures
• Currently use bbftp; bbcp soon; will look at GridFTP
• Next step is to run transfers using Grid job
• Web control pages under development
• Authorisation done using Grid certificates
• Looking at SRB and RLS for data distribution
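One way to picture a homegrown distribution step is the Python sketch below, which compares a master file list with what is already on local disk and groups the missing files into batches for transfer. The file names and batch size are hypothetical, and the actual transfer (bbftp today, possibly a Grid job later) is deliberately left out.

```python
def files_to_import(master_catalogue, local_files, batch_size):
    """Return the files missing locally, grouped into transfer batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    missing = sorted(set(master_catalogue) - set(local_files))
    return [
        missing[start:start + batch_size]
        for start in range(0, len(missing), batch_size)
    ]


# Hypothetical file lists.
master = ["a.root", "b.root", "c.root", "d.root", "e.root"]
on_disk = ["b.root"]

batches = files_to_import(master, on_disk, 2)
print(batches)
```

Each batch would be passed to whichever transfer tool the site uses, so the same bookkeeping works whether the copy is run by hand or as a Grid job.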
• BaBar already has a highly distributed analysis
• RAL Tier A saves BaBar!
• Want to use Grid job submission tools – now
• Looking at SRB and RLS