
Introduction to CNGrid GOS 3.0
OMII-Euro & CNGrid Joint Training Material
刘杰 (Liu Jie)
Jan. 11 2008
Outline
 CNGrid snapshot
 Motivation
 Architecture
 Components
– Core layer
– HPCG
 Summary
CNGrid snapshot
 Project Background
– CNGrid (China National Grid)
– CNGrid GOS 2.0
• Sponsored by the Ministry of Science and Technology of China (2002-2005) under the Tenth Five-Year Plan
– CNGrid GOS 3.0
• Sponsored by the Ministry of Science and Technology of China (2006-2009) under the Eleventh Five-Year Plan
• ICT CAS, Tsinghua University, Beihang University, etc.
CNGrid snapshot
 International cooperation
– OMII_EU/OMII_UK
• Provide software suite
• Integrated into OMII software stack
• Use OMII leading technology in CNGrid.
– XtreemOS
• Building and promoting a Linux-based operating system to support virtual organizations for next-generation grids
• WP2.1 Virtual Organization support in Linux
• WP3.5 Security in Virtual Organizations
Motivation
 Why CNGrid GOS?
– Need for Internet-based grid system software
• Manage large-scale distributed resources effectively
• Provide a uniform approach to accessing heterogeneous resources in the grid
• Enable Internet-based resource sharing and collaboration
– Need for an easy-to-use grid
• Low cost: hide the internal details of grid application development, deployment, management, and use
• Multiple access modes:
– Client/Server, Browser/Server and other modes
– Batch mode and interactive mode
Motivation
 Goals
– Develop a virtualized resource sharing mechanism and framework for computing, data, software, and combined resources
– Provide secure, unified, and user-friendly interfaces for accessing scientific computing and information services
– Support multiple domain-specific applications running on top of them
CNGrid GOS 3.0 Architecture
[Figure: layered architecture diagram of CNGrid GOS 3.0. The exact layout is not recoverable from the transcript; the labels are listed by layer, top to bottom.]
 Grid Portal, Gsh, GSML Workshop and Grid Apps
– HPCG Portal, System Mgmt Portal, DataGrid portal, Running & Mgmt Center, Batch mgmt
– Programming Env.: GSML Browser, GSML Composer, IDE, Debugger, Compiler
– Using Env.: Gsh & cmd tools, VegaSSH, Workflow, Tool/App
– Railway Info, Process Grid, Science Data Grid, other applications
 GOS Library (Batch, Message, File, etc.)
 GOS System Call (Resource mgmt, Agora mgmt, User mgmt, Grip mgmt, etc.)
 Core, System and App Level Services, on Tomcat (5.0.28) + Axis (1.2 rc2) and J2SE (1.4.2_07, 1.5.0_07)
– HPCG: metainfo mgmt, File mgmt, BatchJob mgmt, Account mgmt, MetaSchedule
– Message Service, DataGrid, GridWorkflow, CA Service, Axis Handlers for Message Level Security
– System: Dynamic DeployService
– Core: Agora, Grip, Naming, Res Mgmt, Security, Res AC & Sharing, Resource Space, User Mgmt, Agora Mgmt, Grip Instance Mgmt, Grip Runtime, RController, ServiceController, Other
 Hosting Environment
– Tomcat (Apache) + Axis, GT4, gLite, OMII
– Java J2SE
– OS (Linux/Unix/Windows)
– PC Server (Grid Server)
– Other third-party software and tools
 Work packages marked on the diagram: WP2, WP6, other WPs
Components overview
 Components
– Core layer
– HPCG (High Performance Computing Gateway)
• Deployment
• Management
• Usage: Job, File & Accounting Mgmt
• Application Development
Components: System software
 Core layer
– Agora service (a.k.a. VO)
• Organizes and manages related users and resources locally
• Serves as a trusted third party for resource providers and consumers to negotiate sharing policies
• Provides user mgmt, resource mgmt, and agora mgmt functions based on the underlying Naming layer
– A resilient, decentralized registry for various kinds of global objects
– Provides low-latency object location by object GUID
– Provides high-success-rate search by matching multiple attributes
– Provides a stable object view based on linked naming services to enable the effective-virtual-physical address space
• Uses RController to provide a uniform resource provisioning and management interface
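The two access paths described for the Naming layer, locating an object by GUID and searching by several attributes at once, can be illustrated with a small in-memory sketch. The names used here (`NamingRegistry`, `register`, `locate`, `search`) are hypothetical and are not GOS APIs, and a single dictionary stands in for what the slides describe as a decentralized registry.

```python
import uuid
from typing import Dict, List, Optional


class NamingRegistry:
    """Toy, single-process stand-in for a naming service (hypothetical API)."""

    def __init__(self) -> None:
        # GUID -> attribute dictionary of the registered object
        self._objects: Dict[str, Dict[str, str]] = {}

    def register(self, attributes: Dict[str, str]) -> str:
        """Store an object description and return its newly assigned GUID."""
        guid = str(uuid.uuid4())
        self._objects[guid] = dict(attributes)
        return guid

    def locate(self, guid: str) -> Optional[Dict[str, str]]:
        """Direct lookup by GUID: a single dictionary access."""
        return self._objects.get(guid)

    def search(self, **wanted: str) -> List[str]:
        """Return GUIDs of objects whose attributes match every given pair."""
        return [
            guid
            for guid, attrs in self._objects.items()
            if all(attrs.get(key) == value for key, value in wanted.items())
        ]


# Illustrative data only; the site names are made up.
registry = NamingRegistry()
cluster_guid = registry.register({"type": "cluster", "site": "siteA", "os": "linux"})
registry.register({"type": "cluster", "site": "siteB", "os": "linux"})
registry.register({"type": "user", "site": "siteA"})

matches = registry.search(type="cluster", site="siteA")
```

The GUID path is a direct key lookup, while the attribute path scans and filters, which is one plausible reason the slides treat the two as separate capabilities.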
Components: System software
 Core layer
– Grip
• Runtime abstraction: a grip is a single run of an application
• Create grips to run applications in a managed way, interact with an existing grip, kill a grip, and release the resources it consumes automatically
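The grip lifecycle listed above (create, interact, kill, release resources) might be modeled as follows. This is a minimal sketch under assumptions: `Grip`, `GripRuntime`, and their methods are hypothetical names, not the GOS interface, and resources are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Grip:
    """One run of an application (hypothetical structure, not a GOS type)."""
    grip_id: int
    application: str
    resources: List[str] = field(default_factory=list)
    state: str = "running"


class GripRuntime:
    """Tracks live grips and the resources they hold."""

    def __init__(self) -> None:
        self._grips: Dict[int, Grip] = {}
        self._next_id = 1
        self.in_use: Set[str] = set()  # resources held by live grips

    def create(self, application: str, resources: List[str]) -> Grip:
        """Start a managed run and record the resources it consumes."""
        grip = Grip(self._next_id, application, list(resources))
        self._next_id += 1
        self._grips[grip.grip_id] = grip
        self.in_use.update(resources)
        return grip

    def get(self, grip_id: int) -> Grip:
        """Interact with an existing grip by looking it up."""
        return self._grips[grip_id]

    def kill(self, grip_id: int) -> None:
        """Stop a grip and release its resources automatically."""
        grip = self._grips.pop(grip_id)
        grip.state = "killed"
        self.in_use.difference_update(grip.resources)
        grip.resources.clear()


# Illustrative values only.
runtime = GripRuntime()
grip = runtime.create("demo-app", ["node01", "license-1"])
```

Calling `runtime.kill(grip.grip_id)` removes the grip and empties `runtime.in_use`, so callers never release resources by hand.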
Components: HPCG
 HPCG motivation
– Aims to provide a high-performance business computing environment for enterprise users
– Features
• Easy to install, configure and use
• Provides the functions users really need
• High reliability
• Professional interface
• Based on GOS, but easy to port to other grid middleware
• Standards compliant
– JSDL (Job Submission Description Language)
– BES (OGSA Basic Execution Service)
– SAGA (Simple API for Grid Applications)
– SOA and plain Web services (WS-related standards)
– RUS (Resource Usage Service), based on WS-I Basic Profile 1.0
HPCG Components
 HPCG Client
– CML tools
– Portal
– Mgmt Portal
 HPCG Server
– Batch job mgmt
– Meta schedule
– File mgmt
– Account mgmt
– Metainfo Mgmt
• Static metainfo mgmt
• Dynamic metainfo mgmt
 Environment abstraction
– User
– Security
– Message
– Exception
– Database
Scenarios of HPCG
[Figure: deployment scenario. Labels: Enterprise Intranet, Enterprise user, HPC gateway server, Internet, GOS, Cluster, Grid Site (Grid Operation & Mgmt Center), Message Subscribe/Notification.]
 Requirements for a high performance computing gateway
– Uniform Web UI for HPC users and resource providers
– Many enterprise users share one HPC account
– Transparent job submission to different HPC sites
– Efficient job status acquisition
– File transfer without relay
– Computing resource accounting
HPCG - Deploy
 Several deployment styles
– Front-end and back-end
– All-in-one vs. split
– Relationship with clusters
• Deployed inside a cluster
• Deployed on a machine outside the clusters
HPCG - Deploy
 Prerequisites
– Software
• JDK 1.5
• Ant 1.6.5 or above
• MySQL 1.4.12 or above
• Standard FTP server
• OpenPBS (PBSPro or Torque), LSF, etc.
– Hardware
• CPU: P4 2.4 GHz
• Memory: 4 GB (at least 2 GB)
• Disk space: 160 GB (at least 80 GB)
– Network
• Two network cards
• FTP port: 21
• SSH port: 22
• HTTP ports: 8080, 18080
• Message port: 61616
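Before installation it can help to confirm that the ports listed above are reachable on the gateway host. The following is a minimal sketch using only the Python standard library; it is not part of HPCG, and the service labels in `REQUIRED_PORTS` are names chosen for this example.

```python
import socket
from typing import Dict

# Port numbers come from the prerequisites list; the keys are labels
# invented for this sketch.
REQUIRED_PORTS: Dict[str, int] = {
    "ftp": 21,
    "ssh": 22,
    "http": 8080,
    "http-alt": 18080,
    "message": 61616,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str, ports: Dict[str, int] = REQUIRED_PORTS,
                timeout: float = 1.0) -> Dict[str, bool]:
    """Probe every listed port and report which ones accept connections."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

For example, `check_ports("gateway.example.org")` (a placeholder host name) returns a dictionary such as `{"ftp": True, "ssh": True, ...}`, depending on which services are actually listening.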
HPCG - Portal
 HPCG Management portal
– Manages all meta-info, such as cluster info, job queue info, user mappings, software types, software instances, etc.
 HPCG Application portal
– Lets end users submit and manage jobs, manage temporary and output files, query historical accounting info, etc.
HPCG Management
 Several kinds of static meta-info
– Mapping of grid users to local cluster users
– Cluster meta-info
– Software type info
– Software instance info
– Job queue info
 Dynamic meta-info
– The number of pending jobs in each job queue
– The number of available licenses
 Supports scheduling
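One way static and dynamic meta-info could feed a scheduling decision is sketched below. The selection rule (choose the queue that has the requested software, at least one free license, and the fewest pending jobs) is an assumption made for illustration; the slides do not state the actual HPCG policy, and `QueueInfo` and `pick_queue` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueueInfo:
    cluster: str        # static: cluster the queue belongs to
    queue: str          # static: job queue name
    software: str       # static: software instance available on the queue
    pending_jobs: int   # dynamic: number of pending jobs
    free_licenses: int  # dynamic: number of available licenses


def pick_queue(queues: List[QueueInfo], software: str) -> Optional[QueueInfo]:
    """Return the least-loaded queue that can run the software, or None."""
    candidates = [
        q for q in queues
        if q.software == software and q.free_licenses > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q.pending_jobs)


# Illustrative meta-info only; cluster and software names are made up.
queues = [
    QueueInfo("clusterA", "long", "solverX", pending_jobs=12, free_licenses=2),
    QueueInfo("clusterB", "batch", "solverX", pending_jobs=3, free_licenses=1),
    QueueInfo("clusterC", "batch", "solverX", pending_jobs=0, free_licenses=0),
    QueueInfo("clusterC", "short", "solverY", pending_jobs=1, free_licenses=5),
]
chosen = pick_queue(queues, "solverX")
```

Here `clusterC` has the shortest queue for `solverX` but no free license, so the sketch selects `clusterB`.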
HPCG - Management
HPCG - Application portal
 Batch job management
– Submit job
– Manage job
 File management
 Accounting management
HPCG - Batch Job mgmt
 Submit jobs to the grid and schedule them across multiple HPC sites
 Monitor detailed job status
 Cancel or rerun jobs
 Query historical job information
 Subscribe to and receive notifications of job status changes
 Supports both the JSDL and BES standards
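For readers unfamiliar with JSDL, the sketch below builds a minimal JSDL 1.0 job description with Python's standard `xml.etree.ElementTree`. The element names and namespaces are those of the JSDL 1.0 specification; the helper `build_jsdl` and the sample job values are assumptions for illustration, and the slides do not show how HPCG itself produces or consumes JSDL.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def build_jsdl(job_name, executable, arguments, output_file):
    """Return a JSDL job description as an XML string (hypothetical helper)."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")

    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name

    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = output_file

    return ET.tostring(job, encoding="unicode")


# Sample values are made up.
document = build_jsdl("demo-job", "/bin/echo", ["hello", "grid"], "stdout.txt")
```

A BES-style service would receive such a document as the payload of a job creation request; that exchange is outside the scope of this sketch.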
Batch Job management: job status transition diagram
[Figure: job status transition diagram. States: Submitted, Staging In, Staged In, Active: Queuing, Active: Running, Active: Suspended, Executed, Staging Out, Staged Out, Done, Failed, Terminated. Labeled transitions: fail, terminate, Suspend, Re-run.]
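The job states named on this slide can be encoded as a small state machine. The state names come from the diagram, but the arrows did not survive in the transcript, so the transition table below is an assumption: a linear path from Submitted to Done, suspend and resume around Running, fail and terminate available from any non-final state, and Re-run leading from Done, Failed, or Terminated back to Submitted.

```python
from typing import Dict, Set

# Assumed transitions; only the state names are taken from the slide.
TRANSITIONS: Dict[str, Set[str]] = {
    "Submitted": {"Staging In"},
    "Staging In": {"Staged In"},
    "Staged In": {"Active:Queuing"},
    "Active:Queuing": {"Active:Running"},
    "Active:Running": {"Active:Suspended", "Executed"},
    "Active:Suspended": {"Active:Running"},
    "Executed": {"Staging Out"},
    "Staging Out": {"Staged Out"},
    "Staged Out": {"Done"},
    # Re-run (assumed to restart from Submitted)
    "Done": {"Submitted"},
    "Failed": {"Submitted"},
    "Terminated": {"Submitted"},
}
FINAL_STATES = {"Done", "Failed", "Terminated"}


def next_states(state: str) -> Set[str]:
    """Return the states reachable from `state` in one step."""
    allowed = set(TRANSITIONS[state])
    if state not in FINAL_STATES:
        # fail / terminate are assumed possible from any non-final state
        allowed |= {"Failed", "Terminated"}
    return allowed


def advance(state: str, new_state: str) -> str:
    """Validate a status change and return the new state."""
    if new_state not in next_states(state):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state
```

A job tracker built this way rejects out-of-order status updates, for example a jump from Submitted straight to Done.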
HPCG - Batch job mgmt
HPCG - File mgmt
 View, create, and delete working directories on compute nodes
 Zip and tar support for multiple output files
 Reliable transfer of large files (about 2 GB) between the gateway server and the working directory
 View text files (< 0.5 MB) and pictures in the working directory with a web browser
 Supports multiple FTP servers (wuftp, vsftp), with IPv6 support
 Pause and resume file transfers
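Pause and resume for large transfers usually relies on restarting from the byte offset already received. The sketch below shows that idea on local files, which stand in for the FTP endpoints; `resume_copy` and its `max_chunks` pause mechanism are hypothetical and do not reflect the HPCG implementation. With a real FTP server, the equivalent restart point could be passed through the `rest` argument of `ftplib.FTP.retrbinary`.

```python
import os
from typing import Optional


def resume_copy(src: str, dst: str, chunk_size: int = 1024 * 1024,
                max_chunks: Optional[int] = None) -> bool:
    """Copy src to dst, continuing from dst's current size.

    Returns True when dst is complete, False if the copy stopped early
    because max_chunks was reached (a simulated pause).
    """
    done = os.path.getsize(dst) if os.path.exists(dst) else 0
    total = os.path.getsize(src)
    copied_chunks = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(done)  # skip the bytes that were already transferred
        while done < total:
            if max_chunks is not None and copied_chunks >= max_chunks:
                return False  # paused; call again to resume
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            done += len(chunk)
            copied_chunks += 1
    return done >= total
```

Calling the function again with the same arguments continues where the previous call stopped, because the destination's size is used as the restart offset.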
HPCG - File mgmt
HPCG - Accounting mgmt
 Accounting info about jobs comes from both grid users and local users
 Standard Usage Record format
 Service to query, add, remove, update, and compute statistics on both local and global accounting info, with ACL
 Global accounting statistics
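A statistics query over accounting records with a simple access check might look like the sketch below. The `UsageRecord` fields are a simplified, hypothetical subset chosen for illustration and are not the element names of the standard Usage Record format; the ACL is reduced to a set of sites the caller is allowed to see.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass
class UsageRecord:
    grid_user: str      # grid identity that submitted the job
    local_user: str     # local cluster account the job ran under
    site: str           # HPC site where the job ran
    cpu_seconds: float  # consumed CPU time


def cpu_by_grid_user(records: Iterable[UsageRecord],
                     visible_sites: Set[str]) -> Dict[str, float]:
    """Sum CPU time per grid user over the sites the caller may see."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        if record.site in visible_sites:  # simplified ACL check
            totals[record.grid_user] += record.cpu_seconds
    return dict(totals)


# Illustrative records only.
records = [
    UsageRecord("alice", "hpc01", "siteA", 120.0),
    UsageRecord("alice", "hpc01", "siteB", 30.0),
    UsageRecord("bob", "hpc01", "siteA", 45.0),
]
local_view = cpu_by_grid_user(records, {"siteA"})
global_view = cpu_by_grid_user(records, {"siteA", "siteB"})
```

Note that both grid users map to the same local account `hpc01`, matching the scenario in which many enterprise users share one HPC account; keeping `grid_user` in each record is what makes per-user accounting possible.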
HPCG - Account mgmt
HPCG - Development
 HPCG Template
– Function
• Describes the common logic for submitting jobs
• Is independent of the grid site
• Every software package should have at least one Template
– Form
• XML file
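To make the idea of a site-independent XML template concrete, the sketch below defines a made-up template and turns it into a command line. The actual HPCG Template schema is not reproduced in this transcript, so every element and attribute name here (`template`, `parameter`, `command`, `default`) is hypothetical, as is the `render_command` helper.

```python
import xml.etree.ElementTree as ET
from string import Template
from typing import Dict

# Hypothetical template format, invented for this example only.
TEMPLATE_XML = """
<template software="demo-solver">
  <parameter name="input" />
  <parameter name="cpus" default="4" />
  <command>solver -n ${cpus} ${input}</command>
</template>
"""


def render_command(template_xml: str, values: Dict[str, str]) -> str:
    """Fill the template's command line from user values and defaults."""
    root = ET.fromstring(template_xml)
    params: Dict[str, str] = {}
    for param in root.findall("parameter"):
        name = param.get("name")
        if name in values:
            params[name] = values[name]
        elif param.get("default") is not None:
            params[name] = param.get("default")
        else:
            raise KeyError(f"missing required parameter: {name}")
    return Template(root.findtext("command")).substitute(params)


command = render_command(TEMPLATE_XML, {"input": "case.dat"})
```

Nothing in the template names a cluster, queue, or site, which is the property the slide attributes to HPCG Templates: the same description can be reused wherever the software is installed.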
HPCG - Development
 Schema of HPCG Template
HPCG - Development
 Benefits of the HPCG Template
– Easy to develop (no need to know GOS APIs)
– Easy to share the Template
– Shields the heterogeneity of resources
– Global job scheduling
– Sharing of software licenses
Summary
 Summary of CNGrid GOS 3.0
– A software suite that supports applications in multiple domains and enables resource sharing among HPC sites
– Major components: system software, HPCG
– Other components: programming and using environments, Grid workflow, and Data Grid
 Time schedule
– 2008.1: release of CNGrid GOS 3.0
– 2008.2: deployment on CNGrid
Thanks!