
iSERVO: International Solid Earth Research Virtual Observatory
Grid/Web Services and Portals Supporting Earthquake Science
December 15, 2004, AGU Fall Meeting, San Francisco
Geoffrey Fox, Marlon Pierce (Community Grids Lab, Pervasive Technologies Laboratories, Indiana University)
John Rundle (UC Davis)
Andrea Donnellan, Robert Granat, Greg Lyzenga, Jay Parker (JPL)
Don McLeod (USC), Lisa Grant (UC Irvine)
Grid of Grids: Research Grid and Education Grid

[Diagram: SERVOGrid as a Grid of Grids. A Sensor Grid (sensors, streaming data, field trip data) and a Database Grid (repositories, federated databases) feed data filter services, a Compute Grid (simulations, computer farm), GIS and Discovery Grid services, and customization services taking work from research to education; analysis and visualization reach users through a portal linking the Research Grid to the Education Grid.]
iSERVO in a nutshell

• Designed to link data sets (repositories and real time), computations, and earthquake scientists in the ACES (Asia Pacific) Cooperation
  – Australia, China, Japan, USA
• Exemplified by SERVOGrid in the USA, led by JPL
• Supports simulation and data mining as services
• Adopts conservative WS-I+ Web Service interoperability standards
• Builds a full "Grid" in a library fashion as a Grid of Grids
  – GIS (Geographic Information System) Grid built as a set of OGC-compatible Web Services "talking" GML
  – iSERVO federates separate Grids in each country/organization/function
  – A Grid is "just" a collection of Services, i.e., distributed programs
• Multi-scale simulations supported by Grid workflow
• Portals based on the NSF Middleware Initiative (NMI) Open Grid Computing Environment (OGCE)
Characteristics of Computing for Solid Earth Science

• Widely distributed datasets in various formats
  – GPS, fault data, seismic data sets, InSAR satellite data
  – Many available in "state of the art" tar files that can be FTP'd
  – Provenance problems: faults have controversial parameters, such as slip rates, which have to be estimated
• Distributed models and expertise
  – Lots of codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM)
  – The simplest challenges are just making these codes usable by other researchers
  – And hooking these codes to data sources
  – Some codes also have export or IP restrictions
  – Other codes are highly specialized to their deployment environments
• Decomposable problems requiring interoperability for linking full models
  – The fidelity of your fault modeling can vary considerably
  – Link codes (through data) to support multiple scales
(i)SERVO Web (Grid) Services

• Programs: all applications wrapped as Services using a proxy strategy
• Job Submission: supports remote batch and shell invocations (a sample request is sketched after this list)
  – Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo), and visualization packages (RIVA, GMT)
• File management:
  – Uploading, downloading, backend crossloading (i.e., moving files between remote machines)
  – Remote copies, renames, etc.
• Job monitoring
• Workflow: Apache Ant-based remote service orchestration
  – For coupling related sequences of remote actions, such as RIVA movie generation
• Data services: support remote databases and query construction
  – An XML data model is being adopted for common formats, with translation services to "legacy" formats
  – Migrating to Geography Markup Language (GML) descriptions
• Metadata services: for archiving user session information
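
As a concrete illustration of the service style, a job-submission request to such a service might look like the SOAP message below. This is a sketch only: the operation name, namespace, host, and arguments are hypothetical, not the actual SERVOGrid WSDL.

  <?xml version="1.0" encoding="UTF-8"?>
  <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
      <!-- Hypothetical operation: submit a batch run of the Disloc code -->
      <sg:submitJob xmlns:sg="http://example.org/servogrid/jobsubmit">
        <sg:application>Disloc</sg:application>
        <sg:host>gridnode.example.org</sg:host>
        <sg:arguments>disloc.input</sg:arguments>
      </sg:submitJob>
    </soap:Body>
  </soap:Envelope>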
SERVOGrid Applications

• Codes range from simple "rough estimate" codes to parallel, high-performance applications.
  – Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
  – Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.
  – GeoFEST: three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
  – Virtual California: program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
  – RDAHMM: time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Preprocessors, mesh generators: AKIRA suite
• Visualization tools: RIVA, GMT, IDL
SERVOGrid Codes, Relationships

[Diagram: linkages among the code families: Elastic Dislocation Inversion, Viscoelastic FEM, Viscoelastic Layered BEM, Elastic Dislocation, Pattern Recognizers, and Fault Model BEM.]

This linkage is called Workflow in Grid/Web Service parlance.
Role of Workflow

[Diagram: three services, Service-1, Service-2, and Service-3, linked by messages.]

• Programming the Grid: workflow describes the linkage between services
• As services are distributed, linkage must be by messages
• Linkage is two-way and carries both control and data
• Applies to multi-scale (complexity) linkage, multi-program linkage, linking visualization to simulation, GIS to simulations, and viz filters to each other
• The Microsoft-IBM specification BPEL is the currently preferred Web Service XML specification of workflow
• SERVOGrid uses Ant (the well-known XML build tool) to perform workflow, and this works well in our relatively simple cases (a sketch follows)
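
A minimal sketch of the Ant-based orchestration described above, using only standard Ant tasks; the target names, executables, and file names are hypothetical, and SERVOGrid's actual remote-service tasks are custom rather than local <exec> calls.

  <?xml version="1.0"?>
  <!-- Hypothetical two-step workflow: run a simulation, then render a movie.
       Ant's target dependency mechanism sequences the actions. -->
  <project name="servogrid-workflow" default="render-movie" basedir=".">

    <target name="run-simulation">
      <!-- Stand-in for a remote job-submission call -->
      <exec executable="/usr/local/bin/submit-job">
        <arg value="GeoFEST"/>
        <arg value="fault-model.input"/>
      </exec>
    </target>

    <!-- Runs only after the simulation step completes -->
    <target name="render-movie" depends="run-simulation">
      <exec executable="/usr/local/bin/riva-render">
        <arg value="simulation-output.dat"/>
      </exec>
    </target>
  </project>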
Applications and Observational Data

• Several SERVO codes work directly with observational data.
• Scenarios include:
  – GeoFEST, Virtual California, Simplex, and Disloc all depend upon fault models.
  – RDAHMM and Pattern Informatics codes use seismic catalogs.
  – RDAHMM is primarily used with GPS data.
• Problem: we need to provide a way to integrate these codes with the online data repositories.
  – QuakeTables Fault Database
  – Existing GPS and earthquake catalogs
• Solution: use databases to store catalog data; use XML (GML) as the exchange data format; use OGC- and WS-I+-compatible Web Services for data exchange, invoking queries, and filtering data.
  – Use the Web Feature Service and Web Map Service from OGC (a sample query is sketched below)
  – Use UDDI (discovery), WS-DAI (databases), and WS-Context (dynamic metadata) from WS-I+
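
For example, a Web Feature Service query for fault data might look like the following WFS 1.0.0 GetFeature request. The "fault" feature type name, the geometry property name, and the bounding box are assumptions for illustration, not the actual QuakeTables schema.

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Hypothetical query: fetch fault features inside a bounding box -->
  <wfs:GetFeature service="WFS" version="1.0.0"
      xmlns:wfs="http://www.opengis.net/wfs"
      xmlns:ogc="http://www.opengis.net/ogc"
      xmlns:gml="http://www.opengis.net/gml">
    <wfs:Query typeName="fault">
      <ogc:Filter>
        <ogc:BBOX>
          <ogc:PropertyName>coordinates</ogc:PropertyName>
          <gml:Box srsName="EPSG:4326">
            <gml:coordinates>-124.85,32.26 -113.36,42.75</gml:coordinates>
          </gml:Box>
        </ogc:BBOX>
      </ogc:Filter>
    </wfs:Query>
  </wfs:GetFeature>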
SERVOGrid and Semantic Grid

• SERVOGrid has many types of metadata
• We are designing RDFS descriptions for the following components:
  – Simulation codes, mesh generators, etc.
  – Visualization tools
  – Data types
  – Computing resources
  – …
• These are easily expressed as RDFS (actually DAML) "nuggets" of information
  – Create instances of these
  – Use properties to link instances
Some Sample Relationships

[Diagram of sample relationships: Disloc (Application) installedOn Danube (Computer); GMT (Viz Appl) installedOn Danube; Disloc visualizedBy GMT; Disloc usesInput Fault (DataType); Disloc createsOutput Stress Map (DataFormat); Fault storedIn USC Fault DB (Data Storage).]
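
A sketch of the relationships above as an RDF/XML "nugget"; the schema namespace and resource URIs are hypothetical (the deck notes the real descriptions are DAML rather than plain RDFS).

  <?xml version="1.0"?>
  <!-- Hypothetical instances linked by the properties from the diagram -->
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:sg="http://example.org/servogrid/schema#">
    <sg:Application rdf:about="http://example.org/servogrid/codes/Disloc">
      <sg:installedOn rdf:resource="http://example.org/servogrid/hosts/Danube"/>
      <sg:visualizedBy rdf:resource="http://example.org/servogrid/codes/GMT"/>
      <sg:usesInput rdf:resource="http://example.org/servogrid/data/Fault"/>
      <sg:createsOutput rdf:resource="http://example.org/servogrid/data/StressMap"/>
    </sg:Application>
    <sg:DataType rdf:about="http://example.org/servogrid/data/Fault">
      <sg:storedIn rdf:resource="http://example.org/servogrid/stores/USCFaultDB"/>
    </sg:DataType>
  </rdf:RDF>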
Expanding to iSERVO Strategy

• Agree on what (types of) resources and capabilities need to be put on the iSERVO Grid
  – Computers, instruments, databases, visualization, maps, job submittal …
• Agree on interfaces to resources, from OGSA-DAI (databases) to particular data structures (GML/OpenGIS) – specify these in XML
• Implement resources and capabilities as Services
  – Each user interface should be a portlet that can be integrated by the portal into a web interface
• Make certain that overarching Grid capabilities such as workflow, federation, and metadata are sufficient
• SERVOGrid is a prototype of this strategy using several US sites rather than several countries
  – Can be naturally extended to iSERVO, education, and emergency response by extending resources
• WS-I+ Web Service architecture ensures continued interoperability and extensibility
Grid Syntax Controversies

• There are several proposals for the Web Service extensions needed for Grids – OGSI (GT3), WSRF (GT4), WS-GAF (Newcastle)
  – We adopt a wait-and-see philosophy
• We use the WS-I+ "pure Web Services" approach, which adopts a minimum set of ~7 Web Service specifications chosen from the 60 or so proposed in the last few years:
  – Those adopted by the industry-wide WS-I Web Service Interoperability group
  – Those declared by IBM and Microsoft
  – Any extras that are absolutely essential
  – This approach is adopted by the next phase of the UK e-Science Program
Performance and Streaming

[Diagram: two Web Services, WS-1 and WS-2, exchanging messages.]

• Web Services are meant to exchange messages using SOAP, which is very interoperable but very slow
  – Drastically reduces effective bandwidth
• Most real programs exchange data by reading and writing binary files
  – Increases latency
• All control messages should use classic SOAP
• All data messages should use an optimal binary representation
  – Respect the "SOAP Infoset" (Header and Body of the message); a sketch follows this list
• Use streaming rather than file-based infrastructure, to get better latency and the same technology for files and streaming sensors
  – Similar to using UNIX pipes rather than files directly
  – http://www.naradabrokering.org
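
To make the control/data split concrete, a control message might carry only a reference to the out-of-band binary stream, as in this sketch; the element names, namespace, and topic are hypothetical.

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Control travels as classic SOAP; the bulk binary payload does not.
       The Body carries only a reference to an out-of-band stream. -->
  <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Header>
      <sg:context xmlns:sg="http://example.org/servogrid/streaming">
        <sg:sessionId>mesh-transfer-42</sg:sessionId>
      </sg:context>
    </soap:Header>
    <soap:Body>
      <sg:dataLocation xmlns:sg="http://example.org/servogrid/streaming">
        <!-- Receiver fetches the binary mesh via a messaging stream,
             e.g., a NaradaBrokering topic -->
        <sg:topic>servogrid/meshes/run42</sg:topic>
      </sg:dataLocation>
    </soap:Body>
  </soap:Envelope>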
SERVOGrid Web Portal

• Package every Web Service with its own user interface as a document fragment
• Portlets are the underlying technology
• OGCE (Open Grid Computing Environment) is developing lots of useful portlets
  – Computing
  – GIS
  – Access Grid, etc.
[Diagram: portal layer stack: Aggregate Portals; Portlet User Interface Components; Application Web Services and Workflow; Core Web Services.]
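
As a sketch of how a Web Service's user interface can be packaged and registered as a portlet, here is a minimal JSR 168-style portlet descriptor; the portlet name and class are hypothetical, and OGCE portlets of this era also used earlier Jetspeed-style APIs.

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Hypothetical registration of a Disloc user-interface portlet -->
  <portlet-app xmlns="http://java.sun.com/xml/ns/portlet/portlet-app_1_0.xsd"
               version="1.0">
    <portlet>
      <portlet-name>DislocPortlet</portlet-name>
      <portlet-class>org.servogrid.portlets.DislocPortlet</portlet-class>
      <supports>
        <mime-type>text/html</mime-type>
        <portlet-mode>VIEW</portlet-mode>
      </supports>
      <portlet-info>
        <title>Disloc Job Submission</title>
      </portlet-info>
    </portlet>
  </portlet-app>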
Portal Architecture

[Diagram: clients (pure HTML, Java applets, …) connect to a portal that performs aggregation and rendering over portlet classes (e.g., WebForm). The portal's internal services host local portlets and libraries, plus remote or proxy portlets (SERVOGrid (IU), GridPort, the (Java) COG Kit, etc.) that front Web/Grid services for computing, data stores, and instruments, in a hierarchical arrangement of services and resources.]
SERVOGrid Portal Screen Shots

[Screenshots: each Service has its own portlet, with an individual portlet for the Proxy Manager; tabs or portlet selection navigate through the interfaces to the different services. Further screenshots show two other portlets and the OGCE Consortium.]