Tools and Services for Interactive Applications in CrossGrid

CrossGrid After the First Year:
A Technical Overview
Marian Bubak, Maciej Malawski, and Katarzyna Zając
Institute of Computer Science & ACC CYFRONET
AGH, Kraków, Poland
www.eu-crossgrid.org
CrossGrid Yearly Review, Brussels, March 12, 2003
Main Objectives
 A new category of Grid-enabled applications
• Compute- and data-intensive
• distributed
• near real-time response (person in a loop)
• layered
 New programming tools
 Grid more user-friendly, secure and efficient
 Interoperability with other Grids
 Implementation of standards
CrossGrid in a Nutshell
Interactive, Compute and Data Intensive Applications
 Interactive simulation and visualization of
a biomedical system
 Flooding crisis team support
 Distributed data analysis in HEP
 Weather forecasting and air pollution modeling
Application Specific Services
User Interactive Services
Grid Visualization Kernel
DataGrid Services
Tool Environment
 MPI code debugging and
verification
 Metrics and benchmarks
 Interactive and semiautomatic
performance evaluation tools
New Generic Grid Services
 Portals and roaming access
 Scheduling agents
 Application and Grid monitoring
 Optimization of data access
Globus Middleware
Fabric
Key Features of CG Applications
 Data
• Data generators and databases geographically
distributed
• Selected on demand
 Processing
• Interactive
• Requires large processing capacity; both HPC &
HTC
 Presentation
• Complex data requires versatile 3D visualisation
• Support interaction and feedback to other
components
Biomedical Application
 Adding small modifications to the proposed structure results in immediate changes in the blood flow.
 Online presentation of simulation results via a 3D environment.
 The progress of the simulation and the estimated time of convergence should be available for inspection.
Diagram labels: LB flow simulation; Visualization; Interaction; VE; WD; PC; PDA
Basic Characteristics of Flood Simulation
 Meteorological
• Intensive simulation (HPC), large input/output data sets, high availability of resources
 Hydrological
• Parametric simulations (HTC) may require different models (heterogeneous simulations)
 Hydraulic
• Many 1-D simulations (HTC); 2-D hydraulic simulations require HPC
Diagram labels: Data sources; Meteorological simulations; Hydrological simulations; Hydraulic simulations; Output visualization; Users
Distributed Data Analysis in HEP
 Objectives
• Distributed data access
• Distributed data mining
techniques with neural networks
 Issues
• Typical interactive requests will run on O(TB) of distributed data
• Transfer/replication times for the whole data set are on the order of one hour
• Data are transferred once, in advance of the interactive session
• Allocation, installation and setup of the corresponding database servers take place before the interactive session starts
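The transfer-time issue can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a data set of exactly 1 TB (10^12 bytes) and a window of exactly one hour; both values are illustrative readings of "O(TB)" and "on the order of one hour", not measurements, and the function name is hypothetical.

```python
def required_bandwidth_gbit_s(data_bytes: float, seconds: float) -> float:
    """Sustained bandwidth (Gbit/s) needed to move data_bytes in seconds."""
    return data_bytes * 8 / seconds / 1e9


# Assumed, illustrative values: 1 TB (decimal) moved in one hour.
ONE_TB = 1e12
ONE_HOUR = 3600.0

rate = required_bandwidth_gbit_s(ONE_TB, ONE_HOUR)
print(f"{rate:.2f} Gbit/s")  # about 2.22 Gbit/s
```

A sustained rate of this size is hard to guarantee on demand, which is consistent with moving the data once, before the interactive session begins.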
Diagram labels: Portal; XML in/out; On-line output; Resource Broker; Replica Manager; Interactive Session Manager; Interactive Session Workers (distributed processing); DB Installation; Database server
Weather Forecasting and Air Pollution
Modeling
 Distributed/parallel code on Grid
• Coupled Ocean/Atmosphere Mesoscale Prediction
System
• STEM-II Air Pollution Code
• Integration of distributed databases
 Data mining applied to downscaling weather forecasts
Initial version of X# architecture
Applications: 1.1 BioMed; 1.2 Flooding; 1.3 Interactive Distributed Data Access; 1.3 Data Mining on Grid (NN); 1.4 Meteo Pollution
Supporting Tools: 2.2 MPI Verification; 2.3 Metrics and Benchmarks; 2.4 Performance Analysis; 3.1 Portal & Migrating Desktop
Applications Development Support: MPICH-G; 1.1, 1.2 HLA and others
App. Spec Services: 1.1 User Interaction Services; 1.1 Grid Visualisation Kernel; 1.3 Interactive Session Services
Generic Services: 3.1 Roaming Access; 3.2 Scheduling Agents; 3.3 Grid Monitoring; 3.4 Optimization of Grid Data Access; DataGrid Job Submission Service; DataGrid Replica Manager; Globus Replica Manager; Replica Catalog (two instances); GRAM; GridFTP; GIS / MDS; GSI; Globus-IO
Fabric: Resource Manager (CE), CPU; Resource Manager (SE), Secondary Storage; Resource Manager, Instruments (Satellites, Radars); Resource Manager, 3.4 Optimization of Local Data Access, Tertiary Storage
Project Phases
M 1 - 3: requirements definition and merging
M 4 - 12: first development phase: design, 1st prototypes, refinement of requirements
M 13 - 24: second development phase: integration of components, 2nd prototypes
M 25 - 32: third development phase: complete integration, final code versions
M 33 - 36: final phase: demonstration and documentation
Tools
 MPI code debugging and verification
 Metrics and benchmarks for the Grid environment
 Grid-enabled Performance Measurement
 Performance Prediction Component
Diagram labels: Application source code; MPI Verification (MARMOT); Benchmarks; Applications executing on Grid testbed; Grid Monitoring; G-PM; Performance Measurement Component; High Level Analysis Component; Performance Prediction Component; User Interface and Visualization Component; RMD; PMD
MPI Verification
 verifies the correctness of parallel, distributed Grid applications (MPI)
 technical basis: MPI profiling interface, which allows a detailed analysis of the MPI application
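The profiling interface works by interposition: a tool supplies its own version of each MPI call, inspects the arguments, and then forwards to the library's real entry point (in C, a tool-defined MPI_Send that calls PMPI_Send). The sketch below illustrates only that idea in Python. ToyComm and CheckingComm are hypothetical classes invented for this example; they are not an MPI binding and not the verification tool's actual design.

```python
from collections import defaultdict, deque


class ToyComm:
    """Hypothetical stand-in for an MPI library (not a real MPI binding)."""

    def __init__(self, size):
        self.size = size
        self._queues = defaultdict(deque)  # (dest, tag) -> pending messages

    def send(self, data, dest, tag):
        self._queues[(dest, tag)].append(data)

    def recv(self, rank, tag):
        return self._queues[(rank, tag)].popleft()


class CheckingComm:
    """Interposition layer: same interface, checks arguments, then forwards."""

    def __init__(self, inner):
        self._inner = inner
        self.errors = []  # problems detected so far
        self.calls = []   # trace of intercepted calls

    def send(self, data, dest, tag):
        self.calls.append(("send", dest, tag))
        if not 0 <= dest < self._inner.size:
            self.errors.append(f"send: invalid destination rank {dest}")
            return  # do not forward an erroneous call
        if tag < 0:
            self.errors.append(f"send: negative tag {tag}")
            return
        self._inner.send(data, dest, tag)

    def recv(self, rank, tag):
        self.calls.append(("recv", rank, tag))
        return self._inner.recv(rank, tag)


comm = CheckingComm(ToyComm(size=2))
comm.send("hello", dest=1, tag=7)  # valid, forwarded
comm.send("oops", dest=5, tag=7)   # rank out of range, reported
print(comm.recv(rank=1, tag=7))    # hello
print(comm.errors)
```

Because the wrapper exposes the same interface as the wrapped object, the application code does not change; only the object it talks to does.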
Diagram labels: Application or Test Tool; Profiling Interface; Core Tool; MPI; Additional Process (Debug Server); Client Side; Server Side
Benchmark Categories
 Micro-benchmarks
• For identifying basic performance properties of Grid services, sites, and constellations
 Micro-kernels
• Generic HPC/HTC kernels, including general and often-used kernels in Grid environments
 Application kernels
• Characteristic of representative CG applications
Diagram labels: Portal; Embedding; gbView; Invocation; gbControl; gbRMP; Direct Invocation; Invocation/Collection through GPM; Grid; Bench suite; gbARC; Storage/Retrieval; Retrieval; SE storage
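A micro-benchmark in the sense used above times one basic operation many times and summarises the samples. The following is a minimal sketch under stated assumptions: fake_service_call is a hypothetical placeholder for a single Grid service invocation, and the choice of min/median/max as the summary is an illustration, not the metric set defined by the project.

```python
import statistics
import time


def micro_benchmark(operation, repetitions=20):
    """Time `operation` repeatedly and summarise the samples in seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "runs": len(samples),
    }


def fake_service_call():
    """Hypothetical stand-in for one Grid service invocation."""
    sum(range(10_000))


result = micro_benchmark(fake_service_call, repetitions=10)
print(result)
```

Reporting the median alongside the extremes keeps a single slow run (for example, a cold cache) from dominating the result.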
Performance Measurement Tool G-PM
 Components:
• performance measurement component (PMC),
• component for high-level analysis (HLAC),
• component for performance prediction (PPC) based on analytical performance models of application kernels,
• user interface and visualization component (UIVC).
Diagram labels: UIVC; PMC; HLAC; Measurement Interface; OCM-G Interface; OCM-G
User Interactive Service
Diagram labels: Interaction GridService; Visualisation GridService; Simulation GridService; RTIExec GridService; Registry; OGSA WSDL RTI Tuple Space functionality description + dynamic discovery of OGSA Services
Communication: Large On-line Data transfer; Short Messages and Events
Protocols: GridFTP; SOAP/IIOP; TCP or UDP/IP
 enables end users to run distributed simulations in
the Grid environment and to steer those simulations
in near real time
 uses OGSA mechanisms to call external resource brokers and job submission services (for efficient and transparent execution of the simulation on the Grid).
Grid Visualization Kernel
 addresses the problems of
distributed visualization on
heterogeneous devices
 allows Grid applications to be interconnected easily and transparently with existing visualisation tools (AVS, OpenDX, VTK, ...)
 handles multiple concurrent input data streams
 multiplexes compressed data and images efficiently across long-distance networks
Diagram labels: GVK Visualization Planner; GVK Portal Server; GRAM; GASS; MDS; GVK Visualization pipeline; Simulation; Simulation Data; Init Visualization; Update Visualization
New Grid Services
 Portals and roaming access
 Grid resource management
 Grid monitoring
 Optimization of data access
Roaming Access – Current Design
Diagram labels: Web Browser (two instances); Application Portal Server; Desktop Portal Server; LDAP DataBase; Roaming Access Server; Replica Manager; Scheduling Agent; Command Line; Benchmarks
 Portal - easier access and use of the Grid by applications
 Migrating Desktop - a transparent, independent user
environment
 Roaming Access Server - responsible for managing user
profiles, job submission, file transfers and Grid monitoring
Scheduling Agents - Current Design
 scheduling user jobs over the CrossGrid testbed infrastructure,
 submission based on Condor-G,
 support for sequential and MPI parallel jobs, batch jobs and interactive jobs,
 priorities and preferences determined by the user for each job
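How per-job priorities and preferences might drive resource selection can be sketched as a small greedy matchmaker. This is an illustration only: the classes, field names and ranking rule are assumptions made for the example, and it does not use or reproduce Condor-G or the actual Scheduling Agent logic.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    supports_mpi: bool


@dataclass
class Job:
    name: str
    priority: int  # higher value is scheduled first
    cpus: int = 1
    needs_mpi: bool = False
    preferred: list = field(default_factory=list)  # CE names, best first


def schedule(jobs, elements):
    """Greedy matchmaking: returns {job name: CE name or None}."""
    placement = {}
    for job in sorted(jobs, key=lambda j: -j.priority):
        candidates = [
            ce for ce in elements
            if ce.free_cpus >= job.cpus and (ce.supports_mpi or not job.needs_mpi)
        ]

        # Preferred CEs first (in the user's order), then most free CPUs.
        def rank(ce, job=job):
            if ce.name in job.preferred:
                pref = job.preferred.index(ce.name)
            else:
                pref = len(job.preferred)
            return (pref, -ce.free_cpus)

        candidates.sort(key=rank)
        if candidates:
            chosen = candidates[0]
            chosen.free_cpus -= job.cpus  # reserve the CPUs
            placement[job.name] = chosen.name
        else:
            placement[job.name] = None
    return placement


ces = [
    ComputingElement("ce-a", free_cpus=4, supports_mpi=True),
    ComputingElement("ce-b", free_cpus=8, supports_mpi=False),
]
jobs = [
    Job("viz", priority=1, cpus=2, preferred=["ce-a"]),
    Job("sim", priority=5, cpus=4, needs_mpi=True),
]
placement = schedule(jobs, ces)
print(placement)
```

In this run the high-priority MPI job takes all CPUs on the only MPI-capable element, so the lower-priority job falls back from its preferred element to the other one.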
Diagram labels: Web Portal; Resource list; Resource Broker; Scheduling Agent; Job monitoring; JSS commands; Logging & Bookkeeping; JSS / Condor-G; CE (three instances)
Application Monitoring
 OCM-G Components
• Service Managers
• Local Monitors
 Application processes
 Tool(s)
 External name service
• Component discovery
Diagram labels: Tool; OMIS; ServiceManager; ExternalLocalization; OMIS; LocalMonitor; SharedMemory; ApplicationProcess
Infrastructure Monitoring
Diagram labels: Infrastructure; Instruments; Jiro Services; Globus MDS; Non-invasive Monitoring; Static info; Jiro info; MDS info; Information DB; Performance Information; Post-processing System
 Infrastructure monitoring
• Invasive monitoring (based on Jiro technology)
• Non-invasive monitoring (Santa-G)
Data Access Design
 Selection of specialized components best suited for data access operations
 Estimation of data access latency and bandwidth inside the storage elements
 Faster access to large tape-resident files through fragmentation
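Latency and bandwidth estimates become useful once they are combined into an expected access time, which can then be compared across storage elements. The sketch below assumes the simple model time = latency + size / bandwidth. The Replica class, the site names and all numbers are hypothetical and chosen only to show the trade-off; the slide does not specify the estimator's model.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    location: str
    latency_s: float       # estimated time before the first byte arrives
    bandwidth_mb_s: float  # estimated sustained transfer rate


def estimated_access_time(replica, size_mb):
    """Latency plus transfer time, in seconds."""
    return replica.latency_s + size_mb / replica.bandwidth_mb_s


def best_replica(replicas, size_mb):
    """Pick the replica with the smallest estimated access time."""
    return min(replicas, key=lambda r: estimated_access_time(r, size_mb))


# Illustrative numbers only, not measurements from any storage element.
replicas = [
    Replica("tape-site", latency_s=120.0, bandwidth_mb_s=50.0),
    Replica("disk-site", latency_s=1.0, bandwidth_mb_s=10.0),
]

small = best_replica(replicas, size_mb=100)     # disk: 11 s, tape: 122 s
large = best_replica(replicas, size_mb=10_000)  # disk: 1001 s, tape: 320 s
print(small.location, large.location)
```

With these assumed figures the low-latency site wins for the small file and the high-bandwidth site wins for the large one, which is why both quantities need to be estimated.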
Diagram labels: Portal; Application; Storage Element; Replica Catalog; Replica Manager; GridFTP; GridFTP Plugin; Data Access Estimator; HSM; Components for Component Expert Subsystem; Storage Configuration; Component-Expert Subsystem; Metadata Catalog; TRLFM; Tape Storage; Disk Cache; MO Disk storage; Secondary Storage System
Current status of CG Architecture
Layers: Applications; Supporting Tools; Application Specific Services; Generic Services
Components: Application; Tools; Benchmarks; Portal and Migrating Desktop; Roaming Access; OCM-G; Infrastructure Monitoring; User Interaction Services; Grid Visualization Kernel; Scheduling Agent; Data Access; DataGrid Data Management; DataGrid Job Management; Globus Toolkit
Application-centric view
Diagram labels: Application; Application Plugin; Application Container; Grid Visualization Kernel; User Interaction Services; Portal and Roaming Access; Data Access; Globus Toolkit
The Current Testbed
 The current CrossGrid testbed is based on:
• EDG distribution release 1.2.2 and 1.2.3 (production)
• EDG distribution release 1.4.3 (validation)
 The current infrastructure permits:
• installation of initial prototypes of CrossGrid software
releases
(described in M12 Deliverables)
• testing applications using:
• Globus and EDG middleware
• MPI
• achieving compatibility with DataGrid and therefore
extending Grid coverage in Europe
Grid Service
 Transient, stateful Web Service (created
dynamically)
 Described by WSDL
 Identified by a Grid Service Handle (GSH) in the form of a URI
 Can be queried for configuration and state in a standard way – Service Data mechanism
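The properties listed above (created dynamically, stateful, identified by a handle, queryable through service data) can be illustrated with a toy model. The class and method names below are invented for this sketch and the URI is a placeholder; this is not the OGSA interface or any Globus Toolkit API.

```python
import itertools
import time


class GridServiceInstance:
    """Toy model of a transient, stateful service (not the OGSA API)."""

    def __init__(self, handle, lifetime_s):
        self.handle = handle  # plays the role of a GSH
        self.expires_at = time.monotonic() + lifetime_s
        self._service_data = {"state": "created"}

    def set_service_data(self, name, value):
        self._service_data[name] = value

    def find_service_data(self, name):
        """Uniform query for configuration and state."""
        return self._service_data.get(name)

    def is_alive(self):
        return time.monotonic() < self.expires_at


class Factory:
    """Creates instances on demand and resolves handles to live instances."""

    def __init__(self, base_uri):
        self._base_uri = base_uri
        self._counter = itertools.count(1)
        self._instances = {}

    def create_service(self, lifetime_s=60.0):
        handle = f"{self._base_uri}/instances/{next(self._counter)}"
        instance = GridServiceInstance(handle, lifetime_s)
        self._instances[handle] = instance
        return instance

    def resolve(self, handle):
        instance = self._instances.get(handle)
        if instance is not None and not instance.is_alive():
            del self._instances[handle]  # lifetime expired: destroy
            return None
        return instance


factory = Factory("http://example.org/simulation")  # placeholder URI
svc = factory.create_service(lifetime_s=60.0)
svc.set_service_data("state", "running")
print(svc.handle, factory.resolve(svc.handle).find_service_data("state"))

expired = factory.create_service(lifetime_s=0.0)
print(factory.resolve(expired.handle))  # None: the instance has expired
```

The point of the model is that clients hold only the handle and use one query method, regardless of what the service does internally.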
Why use OGSA
 Standards
 “to be part of the Grid = to implement OGSA Grid protocols”
 Interoperability in heterogeneous environments
 Possible contribution to future Grid activities
Grid Services – where?
 Dynamic service creation and lifetime management to
control the state of some process, e.g.:
• user session in a portal
• data transfer
• running simulation.
 Service data model can be applied to monitoring
systems that can be used as information providers for
other services.
 Service discovery – to solve the bootstrap problem:
• to connect the modules of a distributed simulation
• to connect the application to a monitoring system
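The bootstrap problem mentioned above is that modules started on different Grid nodes do not know each other's addresses in advance. A registry solves it by giving every module one well-known place to publish and look up handles. The sketch below is a toy illustration; the Registry class, the service type names and the URIs are hypothetical and do not represent an OGSA registry interface.

```python
class Registry:
    """Toy registry: services publish a handle under a service type."""

    def __init__(self):
        self._entries = {}  # service type -> list of handles

    def register(self, service_type, handle):
        self._entries.setdefault(service_type, []).append(handle)

    def find(self, service_type):
        """Return all handles currently registered for a service type."""
        return list(self._entries.get(service_type, []))


registry = Registry()

# Each module only needs to know the registry, not its peers.
registry.register("simulation", "http://example.org/sim/1")
registry.register("visualisation", "http://example.org/viz/1")
registry.register("monitoring", "http://example.org/mon/1")

# The visualisation module discovers the simulation it should attach to.
simulations = registry.find("simulation")
print(simulations)

# An application locates a monitoring system the same way.
monitors = registry.find("monitoring")
print(monitors)
```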
Steps towards OGSA
 Using Web Service interfaces and XML where
possible
 Experimenting with prototyping services using
OGSA alpha releases
 Applying Grid Service extensions to services
 Solving GT2 - GT3 transition and compatibility
issues
Summary
 Achievements of the first project year:
• Software Requirements Specifications together
with use cases written
• CrossGrid Architecture defined
• Detailed Design documents for tools and new
Grid services (OO approach, UML) written
• First prototype of software running and
documented
• Detailed description of the test and integration
procedures created
• Testbed set up