Pierce-ScienceGateways-CGB

Building Science Gateways
Marlon Pierce
Community Grids Laboratory
Indiana University
What Is a Web Portal?
 Web container that
aggregates content from
multiple sources into a
single display.
 “Start Pages”
 Typically consume
RSS/Atom news feeds.
 More powerful versions
these days support Flickr,
calendars, games, etc.
 Gadgets, widgets
 Examples: iGoogle,
Netvibes, My Yahoo!
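The aggregation idea above can be sketched in a few lines of Java. This is only an illustration, assuming each source is an RSS 2.0 document already fetched as a string; the class and method names (FeedAggregator, itemTitles, aggregate) are hypothetical and not part of any portal product.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: merges item titles from several RSS 2.0 feeds.
class FeedAggregator {
    // Extract the <title> of every <item> in one RSS document.
    static List<String> itemTitles(String rssXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
            List<String> titles = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                NodeList t = ((Element) items.item(i)).getElementsByTagName("title");
                if (t.getLength() > 0) {
                    titles.add(t.item(0).getTextContent().trim());
                }
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parseable feed", e);
        }
    }

    // Combine several sources into a single list, as a start page would.
    static List<String> aggregate(List<String> feeds) {
        List<String> all = new ArrayList<>();
        for (String feed : feeds) {
            all.addAll(itemTitles(feed));
        }
        return all;
    }
}
```

A real start page would fetch the feeds over HTTP and render each one in its own gadget; only the merging step is shown here.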
Grid Computing Overview
 Grid computing software is designed to integrate large
supercomputing facilities.
 TeraGrid, Open Science Grid, EGEE, etc.
 This is done via network services
 Key Service Components
 Authentication and authorization framework (MyProxy)
 Remote process access and control (GRAM, Condor)
 Remote file, I/O access (GridFTP)
 Additional Services
 Information services, replica management, database federation,
storage management, schedulers, etc.
 Example Grid Software Stacks: CTSS and VDT
TeraGrid Supercomputing Resources (GPIR)
Science Portals and Gateways
 Science Gateways adapt Web portal
technology to build user interfaces to the
Grid.
 Science portals resemble standard portals, but
must also
 Support access to computing and storage resources.
 Allow users remote, Unix-like access to these
resources.
 Provide access to science applications and data sets.
 And we must provide value added services as
well as user interfaces.
Figure: My 2002 “octopus” SOA diagram, from the archives.
Browser Interface -> HTTP(S) -> Portlets + Client Stubs -> SOAP/HTTP -> WSDL-described services.
Services shown: DB Service; Job Sub/Mon and File Services; Visualization Service.
Back-end resources shown: two DBs (one reached via JDBC) and Operating and Queuing Systems.
The services are spread across Host 1, Host 2, and Host 3.
Terminology
 Portlet: this is a standard Java component that generates
HTML and can also act as a client to a remote service.
 Lives in a portal container.
 I will also use this term generically.
 Web Service: a remotely invokeable function on the
Internet.
 SOAP: the XML message envelope for carrying commands over
HTTP.
 WSDL: describes the service’s API in XML.
 REST: A variation of this approach.
 Lots more info:
http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt
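To make the SOAP and REST terms concrete, here is a minimal sketch that builds the same request in both styles. The SOAP 1.1 envelope namespace is the standard one; the service namespace, operation name, and resource path are made-up placeholders, and the argument values are assumed to need no XML escaping.

```java
// Hypothetical helper contrasting the two message styles.
class MessageStyles {
    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // SOAP: the operation and its argument travel inside an XML envelope
    // that is POSTed to a single service endpoint.
    static String soapRequest(String serviceNs, String operation,
                              String argName, String argValue) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\">"
             + "<soapenv:Body>"
             + "<m:" + operation + " xmlns:m=\"" + serviceNs + "\">"
             + "<" + argName + ">" + argValue + "</" + argName + ">"
             + "</m:" + operation + ">"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    // REST: the URL names the resource and the HTTP verb is the command.
    static String restRequest(String baseUrl, String resource, String id) {
        return "GET " + baseUrl + "/" + resource + "/" + id;
    }
}
```

In the SOAP style, a WSDL document would describe the operation name and argument types; in the REST style, the URL layout itself plays that role.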
But Why?
 Three-tiered Service Oriented Architecture is the
network equivalent of the famous Model-View-Controller design pattern.
 View: the user interface components.
 Controller: Web service middleware
 Model: the backend resources.
 Independence of tiers gives flexibility
 Services can be reused with alternative user interfaces
 Workflow composers like Taverna
 User interfaces can work with different service
implementations.
 Drawback: reliability and robustness are issues.
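The independence of tiers can be sketched with a single Java interface standing in for the middle tier. Everything here is hypothetical (FaultService, the two implementations, the two views); in a real gateway the implementations would be remote Web service clients rather than in-memory lists.

```java
import java.util.List;

// The contract the middle tier offers to any user interface.
interface FaultService {
    List<String> faultNames();
}

// One implementation (stands in for a remote Web service client).
class LocalFaultService implements FaultService {
    public List<String> faultNames() {
        return List.of("Fault A", "Fault B");
    }
}

// A second, interchangeable implementation that caches another service.
class CachedFaultService implements FaultService {
    private final List<String> cached;

    CachedFaultService(FaultService source) {
        this.cached = source.faultNames();
    }

    public List<String> faultNames() {
        return cached;
    }
}

// Two alternative views; neither knows which implementation it is given.
class HtmlView {
    static String render(FaultService service) {
        StringBuilder sb = new StringBuilder("<ul>");
        for (String name : service.faultNames()) {
            sb.append("<li>").append(name).append("</li>");
        }
        return sb.append("</ul>").toString();
    }
}

class TextView {
    static String render(FaultService service) {
        return String.join(", ", service.faultNames());
    }
}
```

Either view works with either service, which is the flexibility the bullets describe. The drawback also shows up here: once the service is remote, every faultNames() call can fail in ways a local call cannot.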
Two Approaches to the Middle Tier
Fat Client: Portal Client (with embedded Grid Client) -> Grid Protocol (SOAP) -> Grid Service -> Backend Resource
Thin Client: Portal Client -> HTTP + SOAP -> Web Service (with embedded Grid Client) -> Grid Protocol (SOAP) -> Grid Service -> Backend Resource
Screenshot: Disloc output converted to KML and plotted.
Screenshot: GeoFEST Finite Element Modeling portlet and plotting tools.
What’s In the Screenshots?
 GeoFEST and Disloc Portlets
 Live on gf7.ucs.indiana.edu
 Manage the user’s display: Web forms, links to output,
graphics.
 Save user session state persistently.
 QuakeTables Fault DB Web Service
 Lives on gf2.ucs.indiana.edu
 Contains geometric fault models.
 GeoFEST and Disloc Execution Web Services
 Live on gf19.ucs.indiana.edu
 Generates input files from fault models.
 Runs and manages codes.
Best Practice for Scientific Web
Services
 There are many tools to choose from.
 .NET, Apache Axis, Sun WS, Ruby on Rails, etc.
 Make them self-contained.
 If possible, generate input files within the service.
 Or have an input file generating service.
 Remember that they may be used by other people with other
client tools.
 Communicate data files with URLs.
 Be very careful about exposing the state of the service.
 Don’t assume persistent connections.
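Several of these practices can be shown together in one small sketch. The SimulationService class, its parameters, and the key=value input-file format are all invented for illustration; they are not the GeoFEST or Disloc service interfaces.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical service; names and the input-file format are illustrative only.
class SimulationService {
    private final String resultBaseUrl;
    private final Map<String, String> inputFiles = new HashMap<>();
    private int nextId = 1;

    SimulationService(String resultBaseUrl) {
        this.resultBaseUrl = resultBaseUrl;
    }

    // Self-contained: callers pass plain parameters and the service
    // generates the code's input file itself.
    String submit(double strikeDeg, double dipDeg, double depthKm) {
        String jobId = "job-" + nextId++;
        inputFiles.put(jobId, String.format(Locale.ROOT,
                "strike=%.1f\ndip=%.1f\ndepth_km=%.1f\n", strikeDeg, dipDeg, depthKm));
        return jobId;
    }

    // No persistent connection: each call identifies its job by ID only.
    String inputFile(String jobId) {
        String file = inputFiles.get(jobId);
        if (file == null) {
            throw new IllegalArgumentException("Unknown job " + jobId);
        }
        return file;
    }

    // Data files are communicated as URLs, not streamed back in the response.
    URI outputUrl(String jobId) {
        inputFile(jobId); // validates the job ID
        return URI.create(resultBaseUrl + "/" + jobId + "/output.txt");
    }
}
```

Because the client only ever holds a job ID and a URL, other people's client tools can use the service without knowing how the input file is laid out or where the code runs.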
Components for Portals
Open Grid Computing Environments
Examples. See http://www.collab-ogce.org/
Components for Science Portals
 OGCE is founded on the principle that portals
should be built out of reusable parts.
 Key standard in our first phase: the JSR 168 portlet
specification.
 Portlets can run in multiple containers
 uPortal, Sakai, GridSphere, Liferay, etc.
 Allows us to build Grid-specific components and
deploy them alongside other goodies: Sakai
collaboration tools, contributed portlets, etc.
 Future: OpenSocial-compliant Google Gadgets
OGCE GPIR portlet can interoperate
with TeraGrid and your own GPIR
services.
Manage TeraGrid MyProxy
credentials with the OGCE
ProxyManager portlets.
OGCE file management client
portlets interact with TeraGrid
GridFTP servers.
General-purpose batch and interactive job
submission to GRAM and WS-GRAM is supported.
Dashboard Portlet
The dashboard portlet allows users to track jobs on the
selected resource. The user can view either his own set
of jobs or get information on all submitted jobs.
Queue forecasting portlets work
with the NWS QBETS to predict
wait times and deadlines.
PURSe portlets manage user requests for
portal accounts and Grid credentials.
Condor and Condor-G
OGCE IFrame Portlet can be
used to integrate external
sites.
Client Libraries for Grid
Computing
Two Major Grid Client Efforts
 The Java CoG Kit
 Supports several versions of Globus and SSH.
 Condor-G
 Has a Web Service interface (BirdBath) and Java
client libraries.
 Supports Globus (v2 and v4) and several other Grid
middleware systems.
 You can build either portlets or Web services with
either of these.
 OGCE portlets primarily use CoG
 We prefer Condor-G based Web services for long
running jobs.
CoG Abstraction Layers
Figure (layers, top to bottom):
Applications (Nanomaterials, Bio-Informatics, Disaster Management), Portals, Development Support
CoG Gridfaces Layer; CoG GridIDE
CoG Data and Task Management Layer
CoG Abstraction Layer
CoG providers for GT2, GT3 (X), GT4 WS-RF, Condor, Unicore, SSH, Others
Figure: CoG task class diagram. Classes shown: Task Handler, Task, Task Specification, Service, Security Context, Service Contact.
The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
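The following sketch shows the general idea of hiding provider differences behind one task shape. It borrows class names from the diagram for readability, but it is not the Java CoG Kit API: TaskSketch and every nested type, field, and method here are simplified stand-ins written for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the classes in the diagram; NOT the CoG Kit API.
class TaskSketch {
    // What to run: identical regardless of the toolkit underneath.
    static class Specification {
        final String executable;
        final String arguments;

        Specification(String executable, String arguments) {
            this.executable = executable;
            this.arguments = arguments;
        }
    }

    // Where and how to run it: provider name, contact string, credential handle.
    static class Service {
        final String provider;        // e.g. "GT2" or "GT4", set as a parameter
        final String serviceContact;
        final String securityContext;

        Service(String provider, String serviceContact, String securityContext) {
            this.provider = provider;
            this.serviceContact = serviceContact;
            this.securityContext = securityContext;
        }
    }

    static class Task {
        final Specification spec;
        final Service service;

        Task(Specification spec, Service service) {
            this.spec = spec;
            this.service = service;
        }
    }

    // One handler per provider; callers never see which one is used.
    interface TaskHandler {
        String submit(Task task);
    }

    private final Map<String, TaskHandler> handlers = new HashMap<>();

    void register(String provider, TaskHandler handler) {
        handlers.put(provider, handler);
    }

    // Dispatch on the provider parameter carried by the task's service.
    String submit(Task task) {
        TaskHandler handler = handlers.get(task.service.provider);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for provider " + task.service.provider);
        }
        return handler.submit(task);
    }
}
```

Switching a job from one toolkit to another then means changing only the provider string on the Service object, which is the point the slide makes.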
Coupling CoG Tasks
 The CoG abstractions
also simplify creating
coupled tasks.
 Tasks can be
assembled into task
graphs with
dependencies.
 “Do Task B after
successful Task A”
 Graphs can be nested.
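A task graph with "B after successful A" semantics can be sketched generically as follows. This is not the CoG Kit's task graph implementation; TaskGraph, add, dependency, and run are hypothetical names, and each task is reduced to a function that reports success or failure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Generic sketch of dependent tasks; NOT the CoG Kit implementation.
class TaskGraph {
    private final Map<String, BooleanSupplier> tasks = new LinkedHashMap<>();
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    // Add a task; the supplier returns true on success.
    TaskGraph add(String name, BooleanSupplier work) {
        tasks.put(name, work);
        dependsOn.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    // Nesting: a whole graph counts as one task that succeeds
    // only if every task inside it succeeded.
    TaskGraph add(String name, TaskGraph sub) {
        return add(name, () -> sub.run().size() == sub.tasks.size());
    }

    // "Do task 'then' after successful task 'first'". Both must already be added.
    TaskGraph dependency(String first, String then) {
        dependsOn.get(then).add(first);
        return this;
    }

    // Run every task whose dependencies all succeeded.
    // Returns the names of the tasks that succeeded, in execution order.
    List<String> run() {
        List<String> succeeded = new ArrayList<>();
        List<String> pending = new ArrayList<>(tasks.keySet());
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String name : new ArrayList<>(pending)) {
                if (succeeded.containsAll(dependsOn.get(name))) {
                    pending.remove(name);
                    if (tasks.get(name).getAsBoolean()) {
                        succeeded.add(name);
                    }
                    progress = true;
                }
            }
        }
        return succeeded;
    }
}
```

Tasks whose dependency failed are never started; they simply remain pending when the loop stops making progress.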
Problems with Grid Client
Development
 Grid portlets typically wrap each single Grid capability in a
separate portlet
 Problem is that Grid portlets need to combine these operations
 Portlets are entire web applications, so we need a component model for
portlets: reusable portlet parts
 Even with the CoG Abstraction Layer, we must still do a lot of
coding to build new applications.
 To address these problems we have adopted Java Server Faces
 Provides several nice Model-View-Controller features
 JSF provides an extensible framework (tag libraries) for making
reusable components.
 Apache JSF portlet bridge allows you to convert standalone JSF
applications (development phase) into portlets (deployment phase).
GTLAB Example
<html>
<body>
<f:form>
  <o:submit id="test" action="next_page">
    <o:myproxy id="pr"
        hostname="gf1.ucs.indiana.edu"
        port="7512" lifetime="2" username="mnacar"
        password="***" />
    <o:jobsubmit id="task"
        hostname="cobalt.ncsa.teragrid.org"
        provider="GT4" executable="/bin/ls"
        stdout="tmp/result"
        stderr="tmp/error" />
  </o:submit>
</f:form>
</body>
</html>
Grid Tags        | Associated Grid Beans | Features
<submit/>        | ComponentBuilderBean  | Creating components, job handlers, submitting jobs
<handler/>       | MonitorBean           | Handling monitoring page actions
<multitask/>     | MultitaskBean         | Constructing simple workflow
<dependency/>    | MultitaskBean         | Defining dependencies among sub jobs
<myproxy/>       | MyproxyBean           | Retrieving myproxy credential
<fileoperation/> | FileOprationBean      | Providing Gridftp operations
<jobsubmission/> | JobSubmitBean         | Providing GRAM job submissions
<filetransfer/>  | FileTransferBean      | Providing Gridftp file transfer
(no tag)         | ResourceBean          | Describes common properties among all tags and beans. Passing values given by standard visual JSF components.
Managing Scientific
Workflows
Scientific Workflows
 Portal interfaces encode scientific use cases.
 If you have a rich set of services, it is a lot of work
to make portlets for all possible use cases.
 And power users will always want
something more.
 Example: our CICC project has dozens of chemical
informatics Web services.
 http://www.chembiogrid.org/wiki/
 Workflow composers can simplify this.
 Allow users to encode and execute their own use
cases.
Web Services and Workflows
 Perform a similarity
search on the NIH DTP
Human Tumor data.
 Filter the results based
on Pharmacokinetic
properties (FILTER)
 Convert to 3D (OMEGA)
 Docking into a predefined protein (FRED)
 Visualize (JMOL).
Taverna workflow connects
remote services.
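What a workflow composer does with such a chain can be sketched as plain function composition. The step methods below are local stand-ins invented for this example (they only tag strings); in the real workflow each step would be a call to a remote service such as FILTER, OMEGA, or FRED.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: each step stands in for one remote Web service.
class WorkflowSketch {
    // Chain steps so that each one consumes the previous step's output.
    @SafeVarargs
    static <T> Function<T, T> compose(Function<T, T>... steps) {
        Function<T, T> pipeline = Function.identity();
        for (Function<T, T> step : steps) {
            pipeline = pipeline.andThen(step);
        }
        return pipeline;
    }

    // Stand-in for the property filter: drops compounds marked "-toxic".
    static List<String> filter(List<String> ids) {
        return ids.stream().filter(id -> !id.endsWith("-toxic")).collect(Collectors.toList());
    }

    // Stand-in for 3D conversion.
    static List<String> to3d(List<String> ids) {
        return ids.stream().map(id -> id + ":3d").collect(Collectors.toList());
    }

    // Stand-in for docking.
    static List<String> dock(List<String> ids) {
        return ids.stream().map(id -> id + ":docked").collect(Collectors.toList());
    }

    // Run the whole chain on the hits returned by a similarity search.
    static List<String> run(List<String> searchHits) {
        Function<List<String>, List<String>> pipeline = WorkflowSketch.<List<String>>compose(
                WorkflowSketch::filter, WorkflowSketch::to3d, WorkflowSketch::dock);
        return pipeline.apply(searchHits);
    }
}
```

A composer such as Taverna lets users assemble this kind of chain graphically and swap steps without anyone writing a new portlet.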
OGCE’s XBaya
Workflow Composer
Future of Science
Gateways
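Updating the Octopus
Figure: the octopus diagram, updated.
Browser Interface -> HTTP(S) -> Social Gadgets + AJAX -> RSS, JSON/HTTP -> REST services (one WSDL interface remains in the diagram).
Services shown: DB Service; Job Sub/Mon and File Services; Visualization Service.
Back-end resources shown: two DBs (one reached via JDBC) and Operating and Queuing Systems.
The services are spread across Host 1, Host 2, and Host 3.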
Enterprise Approach                       | Web 2.0 Approach
JSR 168 Portlets                          | Gadgets, Widgets
Server-side integration and processing    | AJAX, client-side integration and processing, JavaScript
SOAP                                      | RSS, Atom, JSON
WSDL                                      | REST (GET, PUT, DELETE, POST)
Portlet Containers                        | OpenSocial Containers (Orkut, LinkedIn, Shindig); Facebook; StartPages
User Centric Gateways                     | Social Networking Portals
Workflow managers (Taverna, Kepler, etc.) | Mash-ups
Grid computing: Globus, Condor, etc.      | Cloud computing: Amazon WS Suite, Xen Virtualization
Semantic Web: RDF, OWL, ontologies        | Microformats, folksonomies
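As one illustration of the REST row, the four HTTP verbs can be mapped onto a job resource using the standard java.net.http API (Java 11 or later). The /jobs URL layout and the JSON bodies are assumptions made for this sketch; the requests are only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

// Hypothetical REST client for a job service; the URL layout is assumed.
class RestJobRequests {
    private final String base;

    RestJobRequests(String base) {
        this.base = base;
    }

    // POST to the collection creates a new job.
    HttpRequest submit(String jsonDescription) {
        return HttpRequest.newBuilder(URI.create(base + "/jobs"))
                .header("Content-Type", "application/json")
                .POST(BodyPublishers.ofString(jsonDescription))
                .build();
    }

    // GET reads the state of one job.
    HttpRequest status(String jobId) {
        return HttpRequest.newBuilder(URI.create(base + "/jobs/" + jobId)).GET().build();
    }

    // PUT replaces the description of an existing job.
    HttpRequest update(String jobId, String jsonDescription) {
        return HttpRequest.newBuilder(URI.create(base + "/jobs/" + jobId))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(jsonDescription))
                .build();
    }

    // DELETE cancels and removes a job.
    HttpRequest cancel(String jobId) {
        return HttpRequest.newBuilder(URI.create(base + "/jobs/" + jobId)).DELETE().build();
    }
}
```

Where a SOAP service would expose four named operations described in WSDL, the REST style reuses the verbs every HTTP client already understands, which is what makes it convenient for browser-side AJAX code.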
Screenshot: Microformats, KML, and GeoRSS feeds used to deliver SAR data to multiple clients.
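For readers unfamiliar with KML, a minimal document can be produced with a few string templates. This is a sketch only: KmlSketch and its methods are hypothetical, names are assumed to need no XML escaping, and the single point placemark is far simpler than real SAR data products.

```java
import java.util.Locale;

// Hypothetical helper that emits a minimal KML document.
class KmlSketch {
    // KML lists coordinates as longitude,latitude (in that order).
    static String placemark(String name, double lon, double lat) {
        return String.format(Locale.ROOT,
                "<Placemark><name>%s</name><Point><coordinates>%.4f,%.4f</coordinates></Point></Placemark>",
                name, lon, lat);
    }

    // Wrap placemarks in a KML 2.2 document element.
    static String document(String... placemarks) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
             + "<kml xmlns=\"http://www.opengis.net/kml/2.2\"><Document>"
             + String.join("", placemarks)
             + "</Document></kml>";
    }
}
```

Because the output is a plain XML file served over HTTP, the same feed can be opened by any KML-aware client, such as a map viewer or a mash-up page.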
More Information
 Contact me: [email protected]
 See what I’m up to:
http://communitygrids.blogspot.com/
 OGCE software: http://collab-ogce.org/
 QuakeSim: http://www.quakesim.org/
 CICC: http://www.chembiogrid.org/wiki/
 Lots of people worked on all of these.