Computing Sciences Directorate, L B N L
Storage Resource Management
In the Grid Environment
Alex Sim
Junmin Gu
Arie Shoshani
Scientific Data Management Group
Lawrence Berkeley National Laboratory
http://sdm.lbl.gov/srm
CHEP 2003
Outline
• What are Storage Resource Managers - Motivation
• General Analysis Scenario and the use of SRMs
• SRM functionality
• Real examples of working SRMs
• Advantages of using SRMs
• Conclusions and Future Work
Motivation
• Grid architecture needs to include reservation &
scheduling of:
• Compute resources
• Storage resources
• Network resources
• Storage Resource Managers (SRMs) role in the
data grid architecture
• Shared storage resource allocation & scheduling
• Especially important for data intensive applications
• Often files are archived on a mass storage system (MSS)
• Wide area networks – minimize transfers
• Large scientific collaborations (100’s of nodes,
1000’s of clients) – opportunities for file sharing
• File replication and caching may be used
• Need to support non-blocking (asynchronous) requests
General Analysis Scenario
[Figure: general analysis scenario.
Components shown: clients and a Disk Cache at the client’s site; Request Interpreter; Metadata catalog; request planning; Replica catalog; Network Weather Service; Request Executer; Execution DAG; the network; Site 1 and Site 2, each with a Storage Resource Manager, Disk Cache, Compute Resource Manager and Compute Engine; Site N with a Storage Resource Manager, Disk Cache and MSS.
Labeled flows: logical query; a set of logical files; execution plan and site-specific files; requests for data placement and remote computation; result files.]
SRM is a Service
• SRM functionality
• Manage space
• Negotiate and assign space to users
• Manage “lifetime” of spaces
• Manage files on behalf of a user
• Pin files in storage till they are released
• Manage “lifetime” of files
• Manage action when pins expire (depends on file types)
• Manage file sharing
• Policies on what should reside on a storage resource at any one time
• Policies on what to evict when space is needed
• Get files from remote locations when necessary
• Purpose: to simplify client’s task
• Manage multi-file requests
• A brokering function: queue file requests, pre-stage when possible
• Provide grid access to/from mass storage systems
• HPSS (LBNL, ORNL, BNL), Enstore (Fermi), JasMINE (Jlab), Castor
(CERN), MSS (NCAR), …
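The pinning and eviction bullets above can be illustrated with a small sketch. This is not the SRM implementation: the class and method names (`DiskCache`, `pin`, `release`, `add`) and the least-recently-used eviction order are assumptions chosen for illustration. The only behavior taken from the slide is that pinned files stay in storage until released or expired, and that a policy decides what to evict when space is needed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CachedFile:
    name: str
    size: int
    pin_expiry: float = 0.0  # file counts as pinned while now < pin_expiry
    last_used: float = 0.0


class DiskCache:
    """Toy cache: pinned files are never evicted; unpinned ones go LRU-first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.files: Dict[str, CachedFile] = {}

    def used(self) -> int:
        return sum(f.size for f in self.files.values())

    def pin(self, name: str, lifetime: float, now: float) -> None:
        f = self.files[name]
        f.pin_expiry = max(f.pin_expiry, now + lifetime)
        f.last_used = now

    def release(self, name: str) -> None:
        self.files[name].pin_expiry = 0.0

    def add(self, name: str, size: int, now: float) -> bool:
        """Admit a file, evicting unpinned files if space is needed."""
        if name in self.files:
            self.files[name].last_used = now
            return True
        if size > self.capacity:
            return False
        needed = self.used() + size - self.capacity
        evictable = sorted(
            (f for f in self.files.values() if f.pin_expiry <= now),
            key=lambda f: f.last_used,
        )
        victims, freed = [], 0
        for f in evictable:
            if freed >= needed:
                break
            victims.append(f)
            freed += f.size
        if freed < needed:
            return False  # remaining files are pinned; the request must wait
        for f in victims:
            del self.files[f.name]
        self.files[name] = CachedFile(name, size, last_used=now)
        return True
```

In this sketch a request that cannot be admitted simply returns `False`; a brokering SRM would instead queue it until pins are released or expire.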
Types of SRMs
• Types of storage resource managers
• Disk Resource Manager (DRM)
• Manages one or more disk resources
• Tape Resource Manager (TRM)
• Manages access to a tertiary storage system (e.g. HPSS)
• Hierarchical Resource Manager (HRM=TRM + DRM)
• An SRM that stages files from tertiary storage into its disk cache
• SRMs and File transfers
• SRMs DO NOT perform file transfer
• SRMs DO invoke file transfer service if needed
(GridFTP, FTP, HTTP, …)
• SRMs DO monitor transfers and recover from failures
• TRM: from/to MSS
• DRM: from/to network
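A minimal sketch of the split described above, where the manager owns the disk cache but delegates the actual data movement to a separate service. The class name `CachingManager` and the `fetch(name, local_path)` signature are hypothetical; in a DRM the fetch would invoke a network transfer service such as GridFTP, and in an HRM it would stage from the MSS.

```python
from typing import Callable, Dict

# Hypothetical signature: fetch(name, local_path) copies one file into the cache.
Fetch = Callable[[str, str], None]


class CachingManager:
    """Manages a disk cache; never moves bytes itself, only invokes `fetch`."""

    def __init__(self, fetch: Fetch, cache_dir: str):
        self._fetch = fetch
        self._cache_dir = cache_dir
        self._cached: Dict[str, str] = {}

    def get(self, name: str) -> str:
        """Return the cached path, invoking the transfer service on a miss."""
        if name not in self._cached:
            local_path = f"{self._cache_dir}/{name}"
            self._fetch(name, local_path)
            self._cached[name] = local_path
        return self._cached[name]
```

Passing a staging function yields HRM-like behavior and passing a network-transfer function yields DRM-like behavior, which is one way to read "HRM = TRM + DRM".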
Analysis Scenario
for Local Computation
[Figure: analysis scenario for local computation.
Components shown at the client’s site: clients; Request Interpreter; Metadata catalog; request planning; Replica catalog; Network Weather Service; Request Executer; Compute Engine; DRM with Disk Cache.
Components shown across the network: HRM with Disk Cache and tape system; DRMs with Disk Caches; all reached through a uniform SRM interface.
Labeled flows: logical query; logical files; site-specific files requests; site-specific files; pinning & file transfer requests.]
SRM works with disk caches
as well as legacy systems
[Figure: SC 2001 Demo for HENP – STAR Experiment.
Denver: client issuing a Logical Request; BIT-MAP Index; Request Manager; DRM with Disk Cache; File Transfer Monitoring.
Remote servers at Chicago, Livermore and Berkeley, each with a Disk Cache, accessed through GridFTP or FTP; some are fronted by a DRM or an HRM.
Legend: Control path; Data Path.]
Earth Science Grid Demo - SC 2002
[Figure: Earth Science Grid demo at SC 2002.
Sites shown: LBNL, ANL, NCAR, LLNL, USC-ISI, ORNL.
Components shown: HPSS (High Performance Storage System); NCAR-MSS (Mass Storage System); HRM and DRM (Storage Resource Management) with disks; gridFTP servers, including a striped gridFTP server; openDAPg server; CAS (Community Authorization Services); MyProxy server; GRAM gatekeeper; MCS (Metadata Cataloguing Services); RLS (Replica Location Services); Tomcat servlet engine hosting the MCS client, MyProxy client, RLS client and CAS client.
Labeled connections: gridFTP, SOAP, RMI.]
Uniformity of Interface 
Compatibility of SRMs
[Figure: a Client (USER/APPLICATIONS) talks through Grid Middleware to several SRMs, which front Enstore, JASMine, DCache, CASTOR and a Disk Cache.]
High Level View of SRM
setup in SC 2002
[Figure: a Client (USER/APPLICATIONS) talks to SRMs, which front JASMine and Enstore.]
Screen Dump of Demo at Fermi Booth
Where do SRMs belong
in the Grid architecture?
[Figure: layered Grid architecture, based on the Grid Architecture paper by the Globus Team.
Layers shown: FABRIC; CONNECTIVITY; RESOURCE; COLLECTIVE 1: GENERAL SERVICES FOR COORDINATING MULTIPLE RESOURCES; COLLECTIVE.
Boxes shown: Request Interpretation and Planning Services; Workflow or Request Management Services; Data Federation Services; Application-Specific Data Discovery Services; General Data Discovery Services; Data Filtering or Transformation Services; Data Transport Services; Community Authorization Services; Consistency Services (e.g., Update Subscription, Versioning, Master Copies); Storage Management (Brokering); Compute Scheduling (Brokering); Monitoring/Auditing Services; Storage Resource Manager; Compute Resource Management; Database Management Services; File Transfer Service (GridFTP); Resource Monitoring/Auditing; Communication Protocols (e.g., TCP/IP stack); Authentication and Authorization Protocols (e.g., GSI); Networks; Mass Storage System (HPSS); Other Storage systems; Compute Systems.]
SRMs provide a brokering service
by supporting multi-file requests
[Figure: the same layered Grid architecture, based on the Grid Architecture paper by the Globus Team.
Layers shown: FABRIC; CONNECTIVITY; RESOURCE: SHARING SINGLE RESOURCES; COLLECTIVE 1: GENERAL SERVICES FOR COORDINATING MULTIPLE RESOURCES; COLLECTIVE.
Boxes shown include Storage Management (Brokering) and Storage Resource Manager, alongside the request planning, data discovery, data transport, consistency, authorization, compute, monitoring, file transfer, protocol, network and storage system boxes.]
SRM use in STAR for
Robust Multi-file replication
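[Figure: multi-file replication between HRMs at BNL and LBNL.
An HRM-Client command-line interface, which can run anywhere, issues HRM-COPY (thousands of files). One HRM gets the list of files and issues SRM-GET (one file at a time) to the other. One HRM performs reads (stage files into its Disk Cache) and the other performs writes (archive files from its Disk Cache); the network transfer between the disk caches uses GridFTP GET (pull mode).
Annotations: recovers from staging failures; recovers from file transfer failures; recovers from archiving failures.]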
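The replication flow on this slide (stage, transfer, archive, one file at a time, with recovery at each step) might be sketched as follows. The names `replicate`, `run_with_retry` and `RecoverableError`, the retry count, and the callable-per-step interface are all assumptions for illustration; the slides do not describe how HRM implements recovery.

```python
class RecoverableError(Exception):
    """Hypothetical marker for failures worth retrying (not an SRM type)."""


def run_with_retry(step, name, max_attempts):
    """Call step(name), retrying on RecoverableError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            step(name)
            return
        except RecoverableError:
            if attempt == max_attempts:
                raise


def replicate(files, stage, transfer, archive, max_attempts=3):
    """Process files one at a time: stage, then transfer, then archive.

    Returns (done, failed) so a caller can resubmit only the failed files.
    """
    done, failed = [], []
    for name in files:
        try:
            for step in (stage, transfer, archive):
                run_with_retry(step, name, max_attempts)
            done.append(name)
        except RecoverableError:
            failed.append(name)
    return done, failed
```

Returning the failed list separately keeps a request for thousands of files from being abandoned because a few files hit a persistent error.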
Web-Based File Monitoring Tool
Shows:
- Files already transferred
- Files during transfer
- Files to be transferred
Also shows for each file:
- Source URL
- Target URL
- Transfer rate
GridFTP-HPSS
Access Provided through HRM
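[Figure: two access paths to HPSS.
Using HRM protocol: a Client calls the HRM through the SRM-API; the HRM uses the GridFTP-API and GridFTP.
New, GridFTP-HPSS through HRM: a Client uses the GridFTP-API; the GridFTP entry calls the HRM through the SRM-API, followed by the GridFTP move.]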
• No modifications to the MSS
• Managing queues of multiple requests to the MSS
• Minimizing tape mounts
• Recovers from MSS transient failures
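One way to picture "managing queues" together with "minimizing tape mounts" is to reorder queued requests so that files on the same tape are read together. The sketch below assumes a `tape_of` mapping from file name to tape label is available; that mapping, the function names, and the grouping strategy are illustrative assumptions, not a description of how HRM schedules requests.

```python
from collections import OrderedDict


def order_by_tape(requests, tape_of):
    """Group queued file requests by tape, keeping first-seen tape order."""
    groups = OrderedDict()
    for name in requests:
        groups.setdefault(tape_of[name], []).append(name)
    return list(groups.items())


def count_mounts(sequence, tape_of):
    """Count tape mounts needed to serve files in the given order."""
    mounts, current = 0, None
    for name in sequence:
        if tape_of[name] != current:
            mounts += 1
            current = tape_of[name]
    return mounts
```

For a queue that alternates between two tapes, serving requests in arrival order mounts a tape for every file, while the grouped order mounts each tape once.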
GridFTP-HRM-Layer
implementation detail
[Figure: a Client uses the GridFTP-API; inside GridFTP, the entry, move and exit stages communicate through shared memory with an FTP-HRM Layer, which talks to the HRM over Corba.]
Numbered steps in the figure:
1a: stor/retv
1b: hrm_get/hrm_put
2a: unblock semaphore
2b: call_back
3a: success_code
3b: hrm_release
Types of Spaces and Files
• Space reservation services
• Spaces and files: volatile, durable, permanent
• Lifetime, action at end of lifetime
• Volatile – SRM owned, files can be removed if space needed
• Durable – files cannot be removed, but administrator notified
• Permanent – can be removed by owner only
• Directory services
• Usual Unix semantics
• Any type of files in a directory
• Access control services
• Support owner/group/world permission
• Can only be assigned by owner
• File sharing for read-only files
• check with source for shared file permission
• File sharing for updatable files
• check with “master copy” for time of last update
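The three file types and their end-of-lifetime actions can be written down as a small decision function. The enum, the function name and the returned action strings are hypothetical; only the rules themselves (volatile files may be removed if space is needed, durable files are kept with the administrator notified, permanent files are removed by the owner only) come from the slide.

```python
from enum import Enum


class FileType(Enum):
    VOLATILE = "volatile"
    DURABLE = "durable"
    PERMANENT = "permanent"


def on_lifetime_expired(file_type: FileType, space_needed: bool) -> str:
    """Return the action an SRM-like service takes when a lifetime ends."""
    if file_type is FileType.VOLATILE:
        # SRM-owned: may be removed, but only if the space is actually needed.
        return "remove" if space_needed else "keep"
    if file_type is FileType.DURABLE:
        # Cannot be removed by the SRM; the administrator is notified instead.
        return "notify_administrator"
    # Permanent: only the owner can remove it, so expiry triggers nothing.
    return "keep"
```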
File movement functionality:
srmGet, srmPut, srmReplicate
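[Figure: file movement modes.
srmGet/srmPut: a Client works with an SRM and moves the file itself, using Client-FTP-get (pull) or Client-FTP-put (push).
srmReplicate: a Client asks an SRM to move files from or to another site (SRM or No-SRM); the SRM moves the file, using SRM-FTP-get (pull) or SRM-FTP-put (push).]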
SRM Methods
• File Movement: srm(Prepare)Get, srm(Prepare)Put, srmReplicate
• Lifetime management: srmReleaseFiles, srmPutDone, srmExtendFileLifeTime
• Terminate/resume: srmAbortRequest, srmAbortFile, srmSuspendRequest, srmResumeRequest
• Space management: srmReserveSpace, srmReleaseSpace, srmUpdateSpace, srmCompactSpace, srmGetCurrentSpace
• FileType management: srmChangeFileType
• Status/metadata: srmGetRequestStatus, srmGetFileStatus, srmGetRequestSummary, srmGetRequestID, srmGetFilesMetaData, srmGetSpaceMetaData
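A sketch of how a client might combine these methods for a non-blocking get: submit the request, poll for status, use the files, then release them. `FakeSRM` is an in-memory stand-in written for this example, not a real SRM client library; the method names are taken from the list above, but their arguments, return values and the `"pending"`/`"ready"` states are assumptions, as are the `srm://` URLs in the usage note.

```python
import itertools


class FakeSRM:
    """In-memory stand-in for an SRM endpoint (illustration only)."""

    def __init__(self, polls_until_ready=2):
        self._ids = itertools.count(1)
        self._requests = {}
        self._polls_until_ready = polls_until_ready

    def srmPrepareGet(self, surls):
        """Queue a multi-file request and return immediately with an ID."""
        request_id = next(self._ids)
        self._requests[request_id] = {
            "files": list(surls), "polls": 0, "state": "pending", "pinned": set(),
        }
        return request_id

    def srmGetRequestStatus(self, request_id):
        req = self._requests[request_id]
        if req["state"] == "pending":
            req["polls"] += 1
            if req["polls"] >= self._polls_until_ready:
                req["state"] = "ready"
                req["pinned"] = set(req["files"])  # files are pinned once staged
        return req["state"]

    def srmReleaseFiles(self, request_id, surls):
        self._requests[request_id]["pinned"] -= set(surls)

    def pinned(self, request_id):
        return set(self._requests[request_id]["pinned"])


def fetch(srm, surls, max_polls=10):
    """Submit, poll until ready, then release; returns the request ID."""
    request_id = srm.srmPrepareGet(surls)
    for _ in range(max_polls):
        if srm.srmGetRequestStatus(request_id) == "ready":
            break
        # A real client would sleep or do other work between polls.
    else:
        raise TimeoutError(f"request {request_id} not ready")
    # ... the client would read the pinned files here ...
    srm.srmReleaseFiles(request_id, surls)
    return request_id
```

For example, `fetch(FakeSRM(), ["srm://example/a", "srm://example/b"])` returns after two polls with both files released.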
Advantages of using SRMs
• Synchronization between storage resources
• Pinning files, releasing files
• Allocating space dynamically on an “as needed” basis
• Insulate clients from storage and network system failures
• Transient MSS failure
• Network failures
• Interruption of large file transfers
• Facilitate file sharing
• Eliminate unnecessary file transfers
• Support “streaming model”
• Use space allocation policies by SRMs: no reservations needed
• Use explicit release by client for reuse of space
• Control number of concurrent file transfers
• From/to MSS – avoid flooding MSS and thrashing
• From/to network – avoid flooding and packet loss
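Controlling the number of concurrent transfers can be sketched with a semaphore that caps how many transfers are in flight, regardless of how many requests are queued. `TransferLimiter`, `transfer_all` and the `transfer(name)` callable are hypothetical names for this example; the `active` and `peak` counters exist only so the cap can be observed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TransferLimiter:
    """Caps the number of transfers in flight, e.g. to avoid flooding an MSS."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def run(self, transfer, name):
        with self._slots:  # blocks while max_concurrent transfers are running
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return transfer(name)
            finally:
                with self._lock:
                    self.active -= 1


def transfer_all(transfer, names, limiter, workers=8):
    """Submit every file at once; the limiter decides how many run together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: limiter.run(transfer, n), names))
```

Separate limiters could be used for the MSS side and the network side, since the slide lists different reasons for throttling each.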
Ongoing and Future Work
• Ongoing work
• Developing Standard SRM interfaces
• Particle Physics Data Grid (PPDG) project
• LBNL, TJNAF, FNAL
• European Data Grid (EDG) project
• WP2 - data management
• WP5 – mass storage (CASTOR)
• Deployment
• LBNL, BNL, ORNL, TJNAF, FNAL, CERN, (SE-England)
• Use of SRM by other agents
• Storage Resource Broker (SDSC) calling HRM to stage files from HPSS
• GridFTP invoking HRM
• Future work
• Access authorization – Community Authorization Services (CAS)
• “On-demand” space allocation, accounting, and charging
• Replica management – invoke SRMs and RLS as a single service
• Request executer (e.g. DAGMAN) to invoke SRMs
• SRMs over NeST (Network STorage)