GRIDS Center Overview John McGee, USC/ISI


NSF Middleware Initiative
GRIDS Center Overview
John McGee, USC/ISI
June 26, 2002
Internet2 – Base CAMP
Boulder, Colorado
GRIDS Center
Grid Research Integration Development & Support
http://www.grids-center.org
USC/ISI - Chicago - NCSA - SDSC - Wisconsin
Agenda
• Vision for Grid Technology
• GRIDS Center Operations
• Software Components
• Packaging and Testing
• Documentation and Support
• Testbed
• Globus Security and Resource Discovery
• Campus Enterprise Integration
GRIDS Center overview for Base CAMP
3
Enabling Distributed Science
www.grids-center.org
Vision for Grid Technologies
Enabling Seamless Collaboration
• GRIDS help distributed communities pursue common goals
  - Scientific research
  - Engineering design
  - Education
  - Artistic creation
• Focus is on the enabling mechanisms required for collaboration
• Resource sharing as a fundamental concept
Grid Computing Rationale
• The need for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
  - See "The Anatomy of the Grid: Enabling Scalable Virtual Organizations" by Foster, Kesselman, Tuecke at http://www.globus.org (in the "Publications" section)
• The need for communities ("virtual organizations") to share geographically distributed resources as they pursue common goals while assuming the absence of:
  - central location
  - central control
  - omniscience
  - existing trust relationships
Elements of Grid Computing
• Resource sharing
  - Computers, storage, sensors, networks
  - Sharing is always conditional, based on issues of trust, policy, negotiation, payment, etc.
• Coordinated problem solving
  - Beyond client-server: distributed data analysis, computation, collaboration, etc.
• Dynamic, multi-institutional virtual organizations
  - Community overlays on classic org structures
  - Large or small, static or dynamic
Resource-Sharing Mechanisms
• Should address security and policy concerns of resource owners and users
• Should be flexible and interoperable enough to deal with many resource types and sharing modes
• Should scale to large numbers of resources, participants, and/or program components
• Should operate efficiently when dealing with large amounts of data & computational power
Grid Applications
• Science portals
  - Help scientists overcome steep learning curves of installing and using new software
  - Solve advanced problems by invoking sophisticated packages remotely from Web browsers or thin clients
  - Portals are currently being developed in biology, fusion, computational chemistry, and other disciplines
• Distributed computing
  - High-speed workstations and networks can yoke together an organization's PCs to form a substantial computational resource
Mathematicians Solve NUG30
• Looking for the solution to the NUG30 quadratic assignment problem
• An informal collaboration of mathematicians and computer scientists
• Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
• Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
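A quick back-of-envelope check, using only the figures on the slide, shows what that throughput means in sustained terms:

```python
# 3.46E8 CPU seconds delivered over a 7-day run implies roughly 572
# processors busy on average, against the reported peak of 1009.
cpu_seconds = 3.46e8
wall_seconds = 7 * 24 * 3600        # the 7-day run
avg_processors = cpu_seconds / wall_seconds
print(round(avg_processors))        # about 572
```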
More Grid Applications
• Large-scale data analysis
  - Science increasingly relies on large datasets that benefit from distributed computing and storage
• Computer-in-the-loop instrumentation
  - Data from telescopes, synchrotrons, and electron microscopes are traditionally archived for batch processing
  - Grids are permitting quasi-real-time analysis that enhances the instruments' capabilities
  - E.g., with sophisticated "on-demand" software, astronomers may be able to use automated detection techniques to zoom in on solar flares as they occur
Data Grids for
High Energy Physics
[Diagram: tiered data flow for high energy physics computing]
• Online System: there is a "bunch crossing" every 25 nsecs; 100 "triggers" per second; each triggered event is ~1 MByte in size; ~100 MBytes/sec flows to the offline systems (~PBytes/sec comes off the detector itself)
• Tier 0: CERN Computer Centre with Offline Processor Farm, ~20 TIPS (1 TIPS is approximately 25,000 SpecInt95 equivalents); ~622 Mbits/sec links, or Air Freight (deprecated), to Tier 1
• Tier 1: regional centres, e.g. FermiLab (~4 TIPS) and the France, Germany, and Italy Regional Centres; ~622 Mbits/sec to Tier 2
• Tier 2: centres of ~1 TIPS each, e.g. Caltech; ~622 Mbits/sec down to institutes
• Institutes: ~0.25 TIPS each, with a physics data cache; physicists work on analysis "channels", and each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
• Tier 4: physicist workstations, ~1 MBytes/sec
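The headline rates in the diagram follow directly from the stated trigger rate and event size:

```python
# Sanity-check the diagram's numbers: 100 triggers/sec at ~1 MByte per event
# gives the stated ~100 MBytes/sec, and a bunch crossing every 25 nsecs
# means a 40 MHz crossing rate at the detector.
triggers_per_sec = 100
event_size_mbytes = 1.0
offline_rate = triggers_per_sec * event_size_mbytes   # MBytes/sec to Tier 0
crossing_rate_hz = 1 / 25e-9                          # crossings per second
```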
Still More Grid Applications
• Collaborative work
  - Researchers often want to aggregate not only data and computing power, but also human expertise
  - Grids enable collaborative problem formulation and data analysis
  - E.g., an astrophysicist who has performed a large, multi-terabyte simulation could let colleagues around the world simultaneously visualize the results, permitting real-time group discussion
  - E.g., civil engineers collaborate to design, execute, & analyze shake table experiments
iVDGL:
International Virtual Data Grid Laboratory
[Map: iVDGL sites worldwide, showing Tier0/1, Tier2, and Tier3 facilities connected by 10 Gbps, 2.5 Gbps, 622 Mbps, and other links]
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay
www.ivdgl.org
The 13.6 TF TeraGrid:
Computing at 40 Gb/s
[Diagram: four TeraGrid sites, each with site resources, external network connections, and archival storage (HPSS; UniTree at NCSA)]
• NCSA/PACI: 8 TF, 240 TB
• SDSC: 4.1 TF, 225 TB
• Caltech and Argonne sites
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
Portal Example
• NPACI HotPage
  - https://hotpage.npaci.edu/
Software Components
General Approach
• Define Grid protocols & APIs
  - Protocol-mediated access to remote resources
  - Integrate and extend existing standards
  - "On the Grid" = speak "Intergrid" protocols
• Develop a reference implementation
  - Open source Globus Toolkit
  - Client and server SDKs, services, tools, etc.
• Grid-enable a wide variety of tools
  - Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …
• Learn through deployment and applications
Software Components
• GRIDS Center software is a collection of packages developed in the academic research community
  - Protocol and architecture approach
  - Reference implementations
• Each package has at least two production-level implementations before inclusion in the GRIDS Center Software Suite
The Hourglass Model
• Focus on architecture issues
  - Propose a set of core services as basic infrastructure
  - Use these to construct high-level, domain-specific solutions
• Design principles
  - Keep participation cost low
  - Enable local control
  - Support for adaptation
  - "IP hourglass" model
• Hourglass layers, top to bottom: Applications, Diverse global services, Core services, Local OS
Software Components
• Globus Toolkit
  - Core Grid computing toolkit
• Condor-G
  - Advanced job submission and management infrastructure
• Network Weather Service
  - Network capability prediction
• KX.509 / KCA (NMI-EDIT)
  - Kerberos to PKI
The Globus Toolkit™
• The de facto standard for Grid computing
  - A modular "bag of technologies" addressing key technical problems facing Grid tools, services and applications
  - Made available under a liberal open source license
  - Simplifies collaboration across virtual organizations
• Authentication
  - Grid Security Infrastructure (GSI)
• Scheduling
  - Globus Resource Allocation Manager (GRAM)
  - Dynamically Updated Request Online Coallocator (DUROC)
• Resource description
  - Monitoring and Discovery Service (MDS)
• File transfer
  - Global Access to Secondary Storage (GASS)
  - GridFTP
Condor-G
• NMI-R1 will include Condor-G, an enhanced version of the core Condor software optimized to work with the Globus Toolkit™ for managing Grid jobs
Network Weather Service
• From UC Santa Barbara, NWS monitors and dynamically forecasts the performance of network and computational resources
• Uses a distributed set of performance sensors (network monitors, CPU monitors, etc.) for instantaneous readings
• Numerical models' ability to predict conditions is analogous to weather forecasting, hence the name
• For use with the Globus Toolkit and Condor, allowing dynamic schedulers to provide statistical Quality-of-Service readings
• NWS forecasts end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage, and available non-paged memory
• NWS automatically identifies the best forecasting technique for any given resource
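The last point, picking the best forecasting technique per resource, can be sketched as follows. This is only an illustration of the idea, not the actual NWS code; the predictor set and the mean-absolute-error scoring are invented here:

```python
# Toy forecaster selection: run several simple predictors over a measurement
# history, score each by its one-step-ahead mean absolute error, and pick
# the winner for future forecasts.

def last_value(history):
    return history[-1]

def running_mean(history):
    return sum(history) / len(history)

def sliding_median(history, window=5):
    w = sorted(history[-window:])
    return w[len(w) // 2]

PREDICTORS = {"last": last_value, "mean": running_mean, "median": sliding_median}

def best_predictor(measurements):
    """Return (name, forecast) for the predictor with the lowest error."""
    errors = {name: 0.0 for name in PREDICTORS}
    for t in range(1, len(measurements)):
        for name, f in PREDICTORS.items():
            errors[name] += abs(f(measurements[:t]) - measurements[t])
    winner = min(errors, key=errors.get)
    return winner, PREDICTORS[winner](measurements)
```

On a steadily trending series the last-value predictor tends to win; on a noisy series with outliers the median usually does, which is why re-scoring per resource matters.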
KX.509 for Converting
Kerberos Certificates to PKI
• Stand-alone client program from the University of Michigan
• For a Kerberos-authenticated user, KX.509 acquires a short-term X.509 certificate that can be used by PKI applications
• Stores the certificate in the local user's Kerberos ticket file
• Systems that already have a mechanism for removing unused Kerberos credentials may also automatically remove the X.509 credentials
• A Web browser may then load a library (PKCS11) to use these credentials for https
• The client reads X.509 credentials from the user's Kerberos cache and converts them to PEM, the format used by the Globus Toolkit
GRIDS Software Packaging
• GRIDS Center software uses Grid Packaging Technology (GPT) 2.0
  - Perl-based tool eases user installation and setup
  - GPT2: new version enables creation of RPMs
    - Lets users install from binaries with familiar packaging
    - Includes database of all packages, useful for verifying installations
• Packaging enables:
  - Dependency checking
  - User customization of configuration
  - Easy upgrades, patches
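As an illustration of what the dependency checking amounts to (a sketch, not GPT's actual implementation; the package names are made up), a packaging tool must verify that every declared dependency exists and derive a safe install order:

```python
# Given each package's declared dependencies, check that all are present,
# detect cycles, and compute an order in which installation is safe.

def install_order(deps):
    """deps: {package: [required packages]}. Returns an install order."""
    for pkg, reqs in deps.items():
        for r in reqs:
            if r not in deps:
                raise ValueError(f"{pkg} depends on missing package {r}")
    order, seen = [], set()
    def visit(pkg, stack):
        if pkg in seen:
            return
        if pkg in stack:
            raise ValueError(f"dependency cycle at {pkg}")
        stack.add(pkg)
        for r in deps[pkg]:
            visit(r, stack)   # install dependencies first
        stack.discard(pkg)
        seen.add(pkg)
        order.append(pkg)
    for pkg in deps:
        visit(pkg, set())
    return order
```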
Software Testing
• University of Wisconsin is in charge of testing the GRIDS software for NMI releases
• Platforms to date:
  - RedHat 7.2 on IA-32
  - Solaris 8.0 on SPARC
• Release 2 additions:
  - RedHat 7.2 on IA-64
  - AIX-L
• Testing includes:
  - Builds
  - Quality assurance
  - Interoperability of GRIDS components
Technical Support
• First-level tech support handled at NCSA
• One-stop-shop address for users: [email protected]
• All queries go to NCSA, which responds within 24 hours
• Help requests that NCSA can't answer get forwarded to people responsible for each of the components:
  - Globus Toolkit (U. of Chicago/Argonne/ISI)
  - Condor-G (U. of Wisconsin)
  - Network Weather Service (UC Santa Barbara)
  - KX.509 (Michigan)
  - PubCookie (U. Washington)
  - CPM
Integration Issues
• NMI testbed sites will be early adopters, seeking integration of campus infrastructure and Grid computing
• Via NMI partnerships, GRIDS will help identify points of intersection and divergence between Grid and enterprise computing
  - Authorization, authentication and security
  - Directory services
• Emphasis is on open standards and architectures as the route to successful collaboration
A few specifics on the
Globus Toolkit
Grid Security Infrastructure (GSI)
• The Globus Toolkit implements GSI protocols and APIs to address Grid security needs
• GSI protocols extend standard public key protocols
  - Standards: X.509 & SSL/TLS
  - Extensions: X.509 Proxy Certificates & Delegation
• GSI extends the standard GSS-API
Generic Security Service API
• The GSS-API is the IETF draft standard for adding authentication, delegation, message integrity, and message protection to apps
  - For secure communication between two parties over a reliable channel (e.g. TCP)
• GSS-API separates security from communication, which allows security to be easily added to existing communication code
  - Effectively placing transformation filters on each end of the communication link
• Globus Toolkit components all use the GSS-API
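A toy sketch of the "transformation filter" idea (not the real GSS-API, whose calls are GSS_Init_sec_context, GSS_Wrap, and so on): once a security context exists, here reduced to a single shared key, each end wraps outgoing and unwraps incoming messages, and the communication code in between never changes.

```python
import hashlib
import hmac

def wrap(context_key: bytes, message: bytes) -> bytes:
    """Outgoing filter: attach a message integrity code (MIC)."""
    mic = hmac.new(context_key, message, hashlib.sha256).digest()
    return mic + message

def unwrap(context_key: bytes, token: bytes) -> bytes:
    """Incoming filter: verify and strip the MIC."""
    mic, message = token[:32], token[32:]
    expected = hmac.new(context_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(mic, expected):
        raise ValueError("message integrity check failed")
    return message
```

Any transport that moves opaque bytes (a TCP socket, a file, a queue) can carry the wrapped token, which is exactly the separation of security from communication the slide describes.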
Delegation
• Delegation = remote creation of a (second-level) proxy credential
  - New key pair generated remotely on server
  - Proxy cert and public key sent to client
  - Client signs proxy cert and returns it
  - Server (usually) puts proxy in /tmp
• Allows a remote process to authenticate on behalf of the user
  - Remote process "impersonates" the user
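The four steps above can be walked through in a toy sketch. This is not real GSI: the "signature" is a stand-in hash rather than an X.509 signature, and all names here are invented for illustration.

```python
import hashlib
import secrets

def fake_sign(signing_key: bytes, data: bytes) -> bytes:
    # Stand-in for an X.509 signature made with the client's proxy key.
    return hashlib.sha256(signing_key + data).digest()

def delegate(client_proxy_key: bytes) -> dict:
    # 1. New key pair generated remotely on the server (toy: one random key).
    server_key = secrets.token_bytes(32)
    # 2. Proxy cert and public key sent to the client (toy: a derived blob).
    unsigned_cert = b"proxy-cert:" + hashlib.sha256(server_key).digest()
    # 3. Client signs the proxy cert and returns it.
    signature = fake_sign(client_proxy_key, unsigned_cert)
    # 4. Server now holds a credential chained to the user's identity, so a
    #    remote process can authenticate ("impersonate") on the user's behalf.
    return {"cert": unsigned_cert, "sig": signature}
```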
Limited Proxy
• During delegation, the client can elect to delegate only a "limited proxy", rather than a "full" proxy
  - GRAM (job submission) client does this
• Each service decides whether it will allow authentication with a limited proxy
  - Job manager service requires a full proxy
  - GridFTP server allows either full or limited proxy to be used
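The per-service decision can be captured in a few lines (illustrative names only, not the actual Globus configuration):

```python
FULL, LIMITED = "full", "limited"

# Each service declares which proxy types it will accept, mirroring the
# examples above: the job manager insists on a full proxy, GridFTP takes either.
SERVICE_POLICY = {
    "job-manager": {FULL},
    "gridftp": {FULL, LIMITED},
}

def allows_authentication(service: str, proxy_type: str) -> bool:
    """True if `service` permits authentication with this proxy type."""
    return proxy_type in SERVICE_POLICY.get(service, set())
```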
Sample Gridmap File
• Gridmap file maintained by the Globus administrator
• Each entry maps a Grid-id (distinguished name) into local user name(s):

  # Distinguished name                                     Local username
  "/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup"          rpg
  "/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost"        frost
  "/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman"          u14543
  "/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"              itf
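A minimal sketch of consuming a file in the format shown above (illustrative only; the real gridmap format also supports comma-separated multiple local names, which this ignores):

```python
def parse_gridmap(text: str) -> dict:
    """Map each quoted distinguished name to its local username."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                        # skip blanks and comments
        dn, local = line[1:].split('"', 1)  # DN is quoted; username follows
        mapping[dn] = local.strip()
    return mapping
```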
Security Issues
• GSI handles authentication, but authorization is a separate issue
• Management of authorization on a multi-organization grid is still an interesting problem
• The grid-mapfile doesn't scale well, and works only at the resource level, not the collective level
• Data access exacerbates authorization issues, which has led us to CAS…