ILDG Middleware Status
Chip Watson
ILDG-6 Workshop
May 12, 2005
Outline
Status: small changes from Dec 2004
Quick review of architecture
Minimal implementation facts
Next steps
Status (quick look)
Only a small amount of middleware work
has been done in the last 6 months
– development of new metadata catalog prototype
at Adelaide based on XML database
– modifications to metadata catalog prototype at
Fermilab to conform to new interface
– small amount of work on replica catalog
prototypes at several sites (JLab, Adelaide,
Fermilab)
Architecture remains unchanged
Architecture (review)
Web Services
– Metadata Catalog maps metadata to a
global name
– Replica Catalog maps a global name to one
or more instances
– Storage Resource Manager (optional)
manages a disk, or disk + tape resource
Draft schemas (WSDL) for these services exist
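The two-step lookup described above (metadata to global name, then global name to instances) can be sketched as follows. This is only an illustration under stated assumptions: the real services are web services defined by the draft WSDL, whereas here both catalogs are plain in-memory dictionaries, and every name, metadata key, and URL (all under example.org) is hypothetical.

```python
from typing import Dict, List

# Hypothetical Metadata Catalog contents: GFN -> physics metadata.
METADATA_CATALOG: Dict[str, Dict[str, str]] = {
    "gfn://usqcd/ensemble-a/cfg-0100": {"beta": "6.0", "action": "wilson"},
    "gfn://usqcd/ensemble-a/cfg-0110": {"beta": "6.0", "action": "wilson"},
    "gfn://ukqcd/ensemble-b/cfg-0001": {"beta": "5.2", "action": "clover"},
}

# Hypothetical Replica Catalog contents: GFN -> one or more instances.
REPLICA_CATALOG: Dict[str, List[str]] = {
    "gfn://usqcd/ensemble-a/cfg-0100": [
        "http://files.usqcd.example.org/ensemble-a/cfg-0100",
        "ftp://mirror.example.org/usqcd/ensemble-a/cfg-0100",
    ],
    "gfn://usqcd/ensemble-a/cfg-0110": [
        "http://files.usqcd.example.org/ensemble-a/cfg-0110",
    ],
    "gfn://ukqcd/ensemble-b/cfg-0001": [
        "http://files.ukqcd.example.org/ensemble-b/cfg-0001",
    ],
}


def find_gfns(**criteria: str) -> List[str]:
    """Metadata Catalog step: return GFNs whose metadata match all criteria."""
    return sorted(
        gfn
        for gfn, metadata in METADATA_CATALOG.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


def find_replicas(gfn: str) -> List[str]:
    """Replica Catalog step: return every known instance of one GFN."""
    return list(REPLICA_CATALOG.get(gfn, []))


def locate(**criteria: str) -> Dict[str, List[str]]:
    """Chain the two steps: metadata -> GFNs -> instances."""
    return {gfn: find_replicas(gfn) for gfn in find_gfns(**criteria)}
```

A call such as locate(beta="6.0") would return each matching GFN together with its list of instances; a client would then fetch one instance directly or through an SRM.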
Architecture (review)
File-based directories contain:
– Master directory of all collaborations’ MDC, RC
and membership lists, stored as XML files
– Distributed group membership lists (XML)
Initial versions of the schemas (XML) exist
Implementation View
Master Directory
http://www.lqcd.org/<tbd>.xml
contains for each collaboration:
metadata catalog
replica catalog
group membership
MDC for UKQCD
MDC for USQCD
MDC for Japan
RC for UKQCD
RC for USQCD
RC for Japan
Japan group file
UKQCD group file
USQCD group file
subgroup A file
subgroup B file
file X
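To make the layout above concrete, here is a minimal sketch of how a client might read such a master directory. The element and attribute names (masterDirectory, collaboration, metadataCatalog, replicaCatalog, groupMembership) and all URLs are invented for illustration; the actual ILDG XML schema and the file's final location are not given here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical master directory document; NOT the real ILDG schema.
MASTER_DIRECTORY_XML = """\
<masterDirectory>
  <collaboration name="ukqcd">
    <metadataCatalog>http://mdc.ukqcd.example.org/</metadataCatalog>
    <replicaCatalog>http://rc.ukqcd.example.org/</replicaCatalog>
    <groupMembership>http://www.ukqcd.example.org/group.xml</groupMembership>
  </collaboration>
  <collaboration name="usqcd">
    <metadataCatalog>http://mdc.usqcd.example.org/</metadataCatalog>
    <replicaCatalog>http://rc.usqcd.example.org/</replicaCatalog>
    <groupMembership>http://www.usqcd.example.org/group.xml</groupMembership>
  </collaboration>
</masterDirectory>
"""


def parse_master_directory(xml_text: str) -> Dict[str, Dict[str, Optional[str]]]:
    """Return {collaboration name: {"mdc": url, "rc": url, "group": url}}."""
    root = ET.fromstring(xml_text)
    directory: Dict[str, Dict[str, Optional[str]]] = {}
    for collab in root.findall("collaboration"):
        directory[collab.get("name")] = {
            # findtext returns None when the child element is absent.
            "mdc": collab.findtext("metadataCatalog"),
            "rc": collab.findtext("replicaCatalog"),
            "group": collab.findtext("groupMembership"),
        }
    return directory
```

With a structure like this, a client needs only one well-known URL to discover every collaboration's services.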
MetaData Catalog
ILDG schema defines only a query interface
– multiple query languages (syntax) allowed for
now (no clear winner yet)
– queries map from physics metadata values to
Global File Name (GFN)
– proposed minor modification can also return the
full physics metadata
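The query behaviour above, including the proposed option of returning the full metadata, might look like the sketch below. It assumes a simple key/value equality match standing in for whichever query language is eventually chosen; the function name, the with_metadata flag, and the sample records are hypothetical and not part of the ILDG schema.

```python
from typing import Dict, List, Union

Metadata = Dict[str, str]

# Hypothetical catalog records: GFN -> physics metadata.
EXAMPLE_CATALOG: Dict[str, Metadata] = {
    "gfn://usqcd/ensemble-a/cfg-0100": {"beta": "6.0", "action": "wilson"},
    "gfn://ukqcd/ensemble-b/cfg-0001": {"beta": "5.2", "action": "clover"},
}


def query_catalog(
    catalog: Dict[str, Metadata],
    criteria: Metadata,
    with_metadata: bool = False,
) -> Union[List[str], Dict[str, Metadata]]:
    """Match records on metadata values.

    By default only the GFNs are returned (the basic query interface).
    With with_metadata=True, each GFN is returned with a copy of its
    full metadata (the proposed minor modification).
    """
    matches = {
        gfn: dict(metadata)
        for gfn, metadata in catalog.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    }
    if with_metadata:
        return matches
    return sorted(matches)
```

Returning the metadata alongside the GFN saves a client a second round trip when it wants to display or filter on fields it did not query.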
Minimal Implementation
Master XML directory to be held at
www.lqcd.org/<tbd>.xml
For each collaboration, need at least these:
– MetaDataCatalog (e.g. running at
www.usqcd.org/<tbd>)
– trivial Replica Catalog (does 1:1 name mapping)
– standard web or ftp server to serve files
Getting going...
(or, what must a collaboration do?)
First: Deploy a metadata catalog
1. choose an existing prototype & deploy
2. populate the catalog with QCDml v1.1-compliant
documents, with ILDG-compliant
GFNs (global file names)
Note: names must have collaboration name as part of
the string; this name matches the entry name in the
master directory:
gfn://collaboration/local-name
3. request [email protected] to add your MDC to
the master directory on www.lqcd.org
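The naming rule in step 2 (gfn://collaboration/local-name, with the collaboration matching a master directory entry) can be checked with a few lines of code. This is a sketch, not an official validator: the function names are hypothetical, and it assumes an exact, case-sensitive match against the directory entry names and that the local name may itself contain slashes.

```python
from typing import Iterable, Tuple

GFN_PREFIX = "gfn://"


def parse_gfn(gfn: str) -> Tuple[str, str]:
    """Split a GFN into (collaboration, local_name) or raise ValueError."""
    if not gfn.startswith(GFN_PREFIX):
        raise ValueError(f"not a GFN (missing {GFN_PREFIX!r}): {gfn!r}")
    # Split only on the first slash so the local name may contain slashes.
    collaboration, sep, local_name = gfn[len(GFN_PREFIX):].partition("/")
    if not collaboration or not sep or not local_name:
        raise ValueError(f"expected gfn://collaboration/local-name: {gfn!r}")
    return collaboration, local_name


def is_known_gfn(gfn: str, known_collaborations: Iterable[str]) -> bool:
    """True if the GFN is well formed and names a listed collaboration."""
    try:
        collaboration, _ = parse_gfn(gfn)
    except ValueError:
        return False
    return collaboration in set(known_collaborations)
```

Running a check like this while populating the catalog would catch names that could never be resolved through the master directory.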
Getting going...
(or, what must a collaboration do?)
Second: Deploy a replica catalog
1. (option 1) write a simple function which maps
your collaboration’s GFN naming convention
into a static URL pointing to the file
(i.e., no database, just string shuffling)
OR
2. (option 2) get / implement a true RC, with
multiple instance tracking (a database)
3. request [email protected] to add your RC to the
master directory on www.lqcd.org
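Option 1 above (no database, just string shuffling) could be as small as the following sketch. The base URL table, the host name under example.org, and the function name are assumptions made for illustration; a real deployment would expose this mapping through the Replica Catalog web service interface.

```python
from typing import Dict, List
from urllib.parse import quote

# Hypothetical static layout: one base URL per collaboration served here.
BASE_URLS: Dict[str, str] = {
    "usqcd": "http://files.usqcd.example.org/ildg/",
}


def trivial_replica_lookup(gfn: str) -> List[str]:
    """Map gfn://collaboration/local-name to a single static URL.

    Returns a one-element list so the result has the same shape as a
    true Replica Catalog answer (one or more instances).
    """
    prefix = "gfn://"
    if not gfn.startswith(prefix):
        raise ValueError(f"not a GFN: {gfn!r}")
    collaboration, _, local_name = gfn[len(prefix):].partition("/")
    if collaboration not in BASE_URLS or not local_name:
        raise ValueError(f"cannot map GFN: {gfn!r}")
    # quote() leaves "/" untouched, so directory structure is preserved.
    return [BASE_URLS[collaboration] + quote(local_name)]
```

Because the mapping is 1:1, this approach cannot record copies held elsewhere; that is what option 2 adds.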
Third: Serve the files (http, ftp, srm, ...)
Nice things to also do...
Deploy a real RC, which can track another
collaboration’s copies of your files
Populate a group membership file, to support group
read/write access (otherwise your collaboration is
relegated to “world” status)
Deploy an SRM (with protocol negotiation) and also
at least one file server that supports parallel
streams (gridftp, bbftp, ...) for higher performance
file retrieval
Implement a web interface to your metadata
catalog
Near Term Expectations
Adelaide will deploy an MDC and RC within the next
few months
USQCD will also try to match this within the next 6
months, but is currently distracted with getting
machines into production
Others have not yet committed
Middleware Working Group
Near Term Task List
Approve minor changes to MDC interface
Decide on the URL for, and deploy:
master directory file
master membership file
Collect official CA certificates from all collaborations
and post at www.lqcd.org for all to easily retrieve
(for configuring servers for strongly authenticated operations)
Most Significant
Challenges
Get data into ILDG compliant format
– create or automate creation of metadata
compliant with QCDml v1.1
– write files in ILDG format (or write a translation
program for on-the-fly translation)
Will LQCD application developers do this,
or will manpower need to be found for translation
programs?
Get the MDC operational and populated
(other tasks are comparatively easy)
Other Challenges
Manpower to implement a nice user
interface for browsing, and optionally
retrieving files
(once per collaboration, or shared, perhaps even
hosted at www.lqcd.org?)
Manpower to write some simple command
line client tools to be used in workflow
scripting
Goal of reaching an operational status by
June 2006 is still feasible!