SRB + Web Services = Datagrid Management System (DGMS)
Arcot Rajasekar
(Arun Jagatheesan)
San Diego Supercomputer Center
National Partnership for Advanced Computational Infrastructure
San Diego Supercomputer Center
Outline
SRB WSDL-based Services:
• What do we plan to accomplish?
• Why are we doing this?
• How do we plan to reach our goals?
• Design, Arch, Implementation
• Who is doing this?
• When?
• What are our timelines?
What?
• SRB = Storage Resource Broker + MCAT
• Transparent access to distributed storage and services
• Used and proven technology
• More than 3 years of deployment
• Over 2 dozen projects
• Mol Sci, Neuro Sci, Astronomy, Med, ESS, DigLibs, …
• ~18 TB of data handled until last year
• >7 million files
• Multiple Access Methods – APIs, GUIs, Commands, Web-HTTP
• Developed at SDSC/NPACI
Why
• SRB WSDL Services → Universal DataGrid Services
• Emergence of Service oriented architectures
• To be compliant with the Grid Requirements
• SRB as a Grid Component
• SRB as a Commodity service
• Data Management as a service in the Grid
• Project-driven
• TeraGrid, NVO, PPDG, GriPhyN, SCEC, …
• BIRN, RoadNet, IT2, ...
• Loose coupling, interoperability, standards based.
1 Min Tutorial
• Web Page (HTML)
• Searched and used by human beings
• Any computer
• HTML
• HTTP
• Google
• Web Service
• Searched and used by computer programs
• Any programming language, OS, etc.
• XML/WSDL – Web Service Description
• SOAP (HTTP/SMTP) – Transport/Access
• UDDI – Discover
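To make the web-service side of the comparison concrete, here is a minimal sketch of how a program might wrap a call in a SOAP 1.1 envelope and read it back. The envelope namespace is the standard SOAP 1.1 one; the `urn:example:srb-datagrid` namespace, the `ingestFile` operation name, and its parameters are hypothetical and are not taken from the SRB WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SRB_NS = "urn:example:srb-datagrid"  # hypothetical namespace, for illustration only


def build_soap_request(operation, params):
    """Wrap an operation name and its parameters in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SRB_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(call, f"{{{SRB_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_soap_request(payload):
    """Recover the operation name and parameters from an envelope."""
    root = ET.fromstring(payload)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


if __name__ == "__main__":
    # Hypothetical operation and parameters.
    request = build_soap_request("ingestFile", {"collection": "/home/demo", "name": "a.dat"})
    print(request.decode("utf-8"))
    print(parse_soap_request(request))
```

Because the message is plain XML, any language with an XML parser could produce or consume it, which is the point of the "any programming language, OS" bullet.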
How - Design
• SRB = Command Channel + Data Channel
• Protocol = set of rules for end points to
communicate with each other
• SRB Web services – Protocol for exchange of
SRB Command Channel ONLY.
• Client flexibility to choose any protocol (SRB-RPC, FTP, GFTP, HTTP) for local data channel
• Internal Data grid operations like replication –
always SRB-RPC
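The split between command channel and data channel can be sketched as a small planning function: the web-service call carries only the command, and the reply names the protocol to use for moving bytes. This is an illustrative sketch, not the SRB implementation; `TransferPlan` and `plan_transfer` are hypothetical names, and only the protocol list and the "internal operations always use SRB-RPC" rule come from the slide.

```python
from dataclasses import dataclass

# Protocol names as listed on the slide.
CLIENT_DATA_PROTOCOLS = {"SRB-RPC", "FTP", "GFTP", "HTTP"}
INTERNAL_PROTOCOL = "SRB-RPC"  # internal data grid operations always use this


@dataclass(frozen=True)
class TransferPlan:
    """Hypothetical reply to a command-channel request."""
    operation: str
    data_protocol: str
    internal: bool


def plan_transfer(operation, requested_protocol="SRB-RPC", internal=False):
    """Decide which data-channel protocol a command should use."""
    if internal:
        # Replication and similar internal operations ignore the client's choice.
        return TransferPlan(operation, INTERNAL_PROTOCOL, True)
    if requested_protocol not in CLIENT_DATA_PROTOCOLS:
        raise ValueError(f"unsupported data channel protocol: {requested_protocol}")
    return TransferPlan(operation, requested_protocol, False)


if __name__ == "__main__":
    print(plan_transfer("get", "HTTP"))
    print(plan_transfer("replicate", "FTP", internal=True))
```

Keeping the bulk transfer out of the SOAP message means large files never need to be encoded into XML.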
How – Design II
• All or nothing (Atomic)
• Value addition - Services apart from original
SRB Services
• Asynchronous – UUID based sessions
• Session and event management
• WSCAT – (We ♥ databases?), Status Query Service
• Security – CA, HTTPS, GSI-SOAP, etc.
• Open protocol for SRB Web services client.
• Each operation a service – not a single
aggregated SRB Web service.
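One way to picture the asynchronous, UUID-based sessions and the status query service is the sketch below: a call returns a session identifier immediately, and the client polls for status later. It is a minimal in-memory illustration under stated assumptions; `SessionManager`, its method names, and the status strings are hypothetical, and a real service would persist sessions in a catalog rather than in a dictionary.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class SessionManager:
    """Hypothetical in-memory session registry keyed by UUID."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._sessions = {}  # session id -> Future

    def submit(self, operation, *args):
        """Start an operation in the background and return its session id."""
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = self._pool.submit(operation, *args)
        return session_id

    def status(self, session_id):
        """Status query: UNKNOWN, RUNNING, FAILED, or DONE."""
        future = self._sessions.get(session_id)
        if future is None:
            return "UNKNOWN"
        if not future.done():
            return "RUNNING"
        return "FAILED" if future.exception() is not None else "DONE"

    def result(self, session_id, timeout=None):
        """Block until the operation finishes and return (or raise) its outcome."""
        return self._sessions[session_id].result(timeout=timeout)

    def shutdown(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    manager = SessionManager()
    sid = manager.submit(sum, [1, 2, 3])  # stand-in for a long-running grid operation
    print(sid, manager.result(sid, timeout=5), manager.status(sid))
    manager.shutdown()
```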
How - Architecture
[Architecture diagram. Components shown: SOAP Server; SRB service bean; Java Native Interface; SRB Native Methods; JDBC; Transaction or Event Database; Query Interface and Trigger Mechanism; Service Publisher and WSDL documents; SRB Server; Jasmine; GridFTP; SRM; Any DBMS or Data Manager; WSDL-DataGrid. Diagram note: "Current implementation. Will change soon."]
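As a rough illustration of the transaction or event database named in the diagram, the sketch below records the events of one session in a relational table, all or nothing, and answers status queries from it. It assumes SQLite as a stand-in for whatever database sits behind JDBC; the table layout and the function names are hypothetical.

```python
import sqlite3


def open_event_db(path=":memory:"):
    """Create the (hypothetical) events table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "operation TEXT NOT NULL, "
        "status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_events(conn, session_id, steps):
    """Insert every (operation, status) row in one transaction, or none of them."""
    try:
        with conn:  # commits on success, rolls back if any insert fails
            for operation, status in steps:
                conn.execute(
                    "INSERT INTO events (session_id, operation, status) VALUES (?, ?, ?)",
                    (session_id, operation, status),
                )
        return True
    except sqlite3.Error:
        return False


def query_status(conn, session_id):
    """Return the recorded (operation, status) rows for a session, oldest first."""
    cursor = conn.execute(
        "SELECT operation, status FROM events WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = open_event_db()
    record_events(db, "session-1", [("put", "STARTED"), ("put", "DONE")])
    print(query_status(db, "session-1"))
```

If any row in a batch is rejected, the whole batch is rolled back, so a status query never sees a half-recorded operation.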
Issues + Status
• SOAP client interoperability
• Asynchronous call – Management (UUID??)
• Usage of www.schemas.UniversalDatagrid.org?
• Some preliminary version of WSDL
• Ingest File to Grid, srbOperations.
• Core Datagrid Operations Document in creation
• Schemas for Datagrids
Who
• SDSC DAKS/NPACI DICE
• SDSC GridPortal Group
• Related Work:
• JLab
• LBNL
• …
• We plan to collaborate on common service definitions
When
• First Version (GGF5): Data Movement Functions
• Put, Get and Replicate
• Second Version (Jan 2003): Data Discovery
• Query, Browse, Attach Metadata, Extract Metadata,…
• Other Versions (2003-2004): More SRB functions
exposed through WSDL framework
• After that: Layers
What lies ahead?
Layers
Knowledge Management & Information Mediation Services
Data Mining Services
Data Management Services