Web 2.0 and Grids - Digital Science Center

Web 2.0 and Grids
March 4 2007
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401
http://www.infomall.org
Old and New (Web 2.0) Community Tools


del.icio.us, Connotea, Citeulike, Bibsonomy, Biolicious manage
shared bookmarks
MySpace, YouTube, Bebo, Hotornot, Facebook, or similar sites
allow you to create (upload) community resources and share
them; Friendster, LinkedIn create networks
• http://en.wikipedia.org/wiki/List_of_social_networking_websites
• http://www.slideshare.net http://www.gliffy.com



Google documents, Wikis and Blogs are powerful specialized
shared document systems
ConferenceXP and WebEx share general applications
Google Scholar tells you who has cited your papers while
publisher sites tell you about co-authors
• Windows Live Academic Search has similar goals


Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P
Collaboration – text, audio-video conferencing, files
Note that sharing resources creates (implicit) communities
• Social network tools study graphs to both define communities
and extract their properties
Connotea

Connotea is run by Nature and is useful for collecting research links
Here are 177 parallel computing links selected on Meeting
Useful extension of del.icio.us
“Best Web 2.0 Sites” -- 2006

Extracted from http://web2.wsj2.com/
Social Networking
Start Pages
Social Bookmarking
Peer Production News
Social Media Sharing
Online Storage (Computing)
Why Web 2.0 is Useful

Captures the incredible development of interactive Web
sites enabling people to create and collaborate
Web 2.0 v Grid I


Web 2.0 allows people to nurture the Internet Cloud, and such
people received Time’s Person of the Year award
Platt in his Blog (courtesy Hinchcliffe
http://web2.wsj2.com/the_state_of_web_20.htm) identifies key
Web 2.0 features as:
• The Web and all its connected devices as one global platform of reusable
services and data
• Data consumption and remixing from all sources, particularly user
generated data
• Continuous and seamless update of software and data, often very rapidly
• Rich and interactive user interfaces
• Architecture of participation that encourages user contribution

Whereas Grids support Internet scale Distributed Services
• Maybe Grids focus on (number of) Services (there aren’t many scientists)
and Web 2.0 focuses on number of People
• But they are basically the same!
Web 2.0 v Grid II

Web 2.0 has a set of major services like GoogleMaps or Flickr
but the world is composing Mashups that make new composite
services
• End-point standards are set by end-point owners
• Many different protocols covering a variety of de-facto standards





Grids have a set of major software systems like Condor and
Globus, and a different world is extending them with custom services
and linking them with workflow
Popular Web 2.0 technologies are PHP, JavaScript, JSON,
AJAX and REST with “Start Page” (e.g. Google Gadgets)
interfaces
Popular Grid technologies are Apache Axis, BPEL, WSDL and
SOAP with portlet interfaces
Robustness of Grids demanded by the Enterprise?
Not so clear that Web 2.0 won’t eventually dominate other
application areas and with Enterprise 2.0 it’s invading Grids
The world does itself in large numbers!
Mashups v Workflow?





Mashup Tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63
Workflow Tools are reviewed by Gannon and Fox
http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf
Both include scripting in PHP, Python, sh etc. as both implement
distributed programming at the level of services
Mashups use all types of service interfaces and do not have the
potential robustness (security) of the Grid service approach
Typically “pure” HTTP (REST)
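The point that mashups and workflows are both distributed programming at the level of services can be illustrated with a short script. This is only a minimal sketch: `geocode` and `nearby_photos` are hypothetical stand-ins for two remote services (not real APIs), implemented locally so the example runs without a network.

```python
from typing import Callable, Dict, List, Tuple

def geocode(place: str) -> Tuple[float, float]:
    """Hypothetical stand-in for a geocoding service (a tiny lookup table)."""
    table = {"Bloomington IN": (39.17, -86.53)}
    return table[place]

def nearby_photos(lat: float, lon: float) -> List[str]:
    """Hypothetical stand-in for a photo-search service."""
    return [f"photo near {lat:.2f},{lon:.2f} #{i}" for i in range(2)]

def mashup(
    place: str,
    locate: Callable[[str], Tuple[float, float]] = geocode,
    photos: Callable[[float, float], List[str]] = nearby_photos,
) -> Dict[str, object]:
    """Compose two services: the output of the first feeds the second."""
    lat, lon = locate(place)
    return {"place": place, "position": (lat, lon), "photos": photos(lat, lon)}

if __name__ == "__main__":
    print(mashup("Bloomington IN"))
```

Passing the services in as parameters mirrors the idea that the composition script is independent of which end-point (REST, SOAP or other) sits behind each step.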
Grid Workflow Datamining in Earth Science

Work with Scripps Institute
Grid services controlled by workflow process real time
data from ~70 GPS Sensors in Southern California
[Figure: workflow diagram. Labels include NASA GPS, Earthquake, Streaming Data Support, Archival, Transformations, Data Checking, Hidden Markov Datamining (JPL), Real Time Display (GIS)]
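A workflow of this shape (archive, check, transform, analyze) can be sketched as chained generator stages. This is an illustrative sketch under stated assumptions, not the actual services: the `Reading` type, the plausibility limit, the baseline correction and the threshold test are all hypothetical, and the simple threshold stands in for the Hidden Markov datamining step.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple

class Reading(NamedTuple):
    station: str
    t: int
    displacement_mm: float

def check(stream: Iterable[Reading]) -> Iterator[Reading]:
    """Data checking: drop physically implausible values (assumed limit)."""
    for r in stream:
        if abs(r.displacement_mm) < 1000.0:
            yield r

def transform(stream: Iterable[Reading], baseline: Dict[str, float]) -> Iterator[Reading]:
    """Transformation: subtract a per-station baseline offset."""
    for r in stream:
        yield r._replace(displacement_mm=r.displacement_mm - baseline.get(r.station, 0.0))

def flag_events(stream: Iterable[Reading], threshold_mm: float) -> Iterator[Tuple[Reading, bool]]:
    """Stand-in for the datamining stage: a plain threshold, not an HMM."""
    for r in stream:
        yield r, abs(r.displacement_mm) > threshold_mm

def run_workflow(readings: Iterable[Reading], baseline: Dict[str, float],
                 threshold_mm: float = 5.0) -> Tuple[List[Reading], List[Tuple[Reading, bool]]]:
    archive = list(readings)  # archival: keep the raw stream untouched
    flagged = list(flag_events(transform(check(archive), baseline), threshold_mm))
    return archive, flagged
```

Each stage only consumes and yields readings, so stages can be reordered or replaced by remote services without changing the others.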
Web 2.0 uses all types of Services

Here a Gadget Mashup uses a 3 service workflow with
a JavaScript Gadget Client
Web 2.0 APIs


http://www.programmableweb.com/apis currently
(March 3 2007) lists 388 Web 2.0 APIs, with GoogleMaps the
most used in Mashups
This site acts as a “UDDI” for Web 2.0
The List of Web 2.0 APIs

Each site has API and its features
Divided into broad categories
Only a few used a lot (34 APIs used in more than 10 mashups)
RSS feed of new APIs
3 more Mashups each day, for a total of 1609 on March 3 2007
Growing number of commercial Mashup Tools
Note ClearForest runs Semantic Web Services Mashup competitions (not workflow competitions)
Some Mashup types: aggregators, search aggregators, visualizers, mobile, maps, games
Indiana Map Grid (Mashup)
GIS Grid of “Indiana Map” and ~10 Indiana counties with accessible Map (Feature)
Servers from different vendors. Grids federate different data repositories (cf Astronomy
VO federating different observatory collections)
[Figure: Indiana Map Grid architecture. Components: Google Maps Server; Marion County Map Server (ESRI ArcIMS); Hamilton County Map Server (AutoDesk); Cass County Map Server (OGC Web Map Server); one Adapter per Map Server; Tile Server; Cache Server; Browser + Google Map API]
Must provide adapters for each Map Server type.
Tile Server requests map tiles at all zoom levels with all layers. These are converted to uniform projection, indexed, and stored. Overlapping images are combined.
Browser client fetches image tiles for the bounding box using Google Map API.
The cache server fulfills Google map calls with cached tiles that fill the requested bounding box.
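The adapter, tile server and cache server roles described here can be sketched in a few lines. This is a toy illustration under assumptions: tiles are plain strings rather than images, the tile grid is a simple 2^zoom by 2^zoom layout, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

TileKey = Tuple[int, int, int]             # (zoom, x, y) in one uniform tile grid
Adapter = Callable[[int, int, int], str]   # returns a stand-in "image" for one tile

def make_adapter(server_name: str) -> Adapter:
    """One adapter per map server type hides that server's own interface."""
    def fetch(zoom: int, x: int, y: int) -> str:
        return f"{server_name}@{zoom}/{x}/{y}"
    return fetch

def build_tile_cache(adapters: List[Adapter], max_zoom: int) -> Dict[TileKey, str]:
    """Tile server: request every tile at every zoom level, combine, index, store."""
    cache: Dict[TileKey, str] = {}
    for zoom in range(max_zoom + 1):
        n = 2 ** zoom
        for x in range(n):
            for y in range(n):
                # Overlapping layers are "combined" by joining the stand-in images.
                cache[(zoom, x, y)] = "+".join(a(zoom, x, y) for a in adapters)
    return cache

def tiles_for_bbox(cache: Dict[TileKey, str], zoom: int,
                   x_min: int, x_max: int, y_min: int, y_max: int) -> List[str]:
    """Cache server: answer a bounding-box request purely from stored tiles."""
    return [cache[(zoom, x, y)]
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]
```

Because all tiles are fetched and stored ahead of time, browser requests never touch the county map servers directly.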
Mash Planet Web 2.0 Architecture
http://www.imagine-it.org/mashplanet
Display too large to be a Gadget
Searched on Transit/Transportation
Grid-style portal as used in Earthquake Grid
The Portal is built from portlets, which provide user interface
fragments for each service that are composed into the full
interface. It uses OGCE technology, as does the planetary
science VLAB portal with University of Minnesota
Note the many competitions powering Web 2.0 Mashup Development
Portlets v. Google Gadgets





Portals for Grid Systems are built using portlets with
software like GridSphere integrating these on the
server-side into a single web-page
Google (at least) offers the Google sidebar and Google
home page which support Web 2.0 services and do not
use a server side aggregator
Google is more user friendly!
The many Web 2.0 competitions are an interesting model
for promoting development in the world-wide
distributed collection of Web 2.0 developers
I guess Web 2.0 model will win!
Typical Google Gadget Structure
Google Gadgets are an example of
Start Page technology
See http://blogs.zdnet.com/Hinchcliffe/?p=8

… Lots of HTML and JavaScript </Content> </Module>
Portlets build User Interfaces by combining fragments in a standalone Java Server
Google Gadgets build User Interfaces by combining fragments with JavaScript on the client
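To make the gadget structure concrete, here is a minimal sketch that generates a gadget specification and checks that it is well-formed XML. It assumes the legacy Google Gadgets layout (a `Module` element holding `ModulePrefs` and an HTML `Content` section wrapped in CDATA); the helper name `gadget_spec` and the sample title and HTML are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

def gadget_spec(title: str, body_html: str) -> str:
    """Wrap HTML and JavaScript in a gadget-style XML module (assumed layout)."""
    if "]]>" in body_html:
        raise ValueError("body_html must not contain the CDATA terminator ']]>'")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<Module>\n"
        f"  <ModulePrefs title={quoteattr(title)} />\n"
        '  <Content type="html"><![CDATA[\n'
        f"{body_html}\n"
        "  ]]></Content>\n"
        "</Module>\n"
    )

if __name__ == "__main__":
    spec = gadget_spec("Hello Gadget", "<div id='out'></div>\n<script>var n = 1 < 2;</script>")
    root = ET.fromstring(spec.encode("utf-8"))  # raises ParseError if malformed
    print(root.tag, root.find("ModulePrefs").get("title"))
```

The CDATA wrapper is what lets arbitrary HTML and JavaScript sit inside the XML without escaping, which is why the fragment is composed in the browser rather than on a server.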
APIs/Mashups per Protocol Distribution
[Figure: two charts, Number of APIs and Number of Mashups, broken down by protocol (REST; SOAP; XML-RPC; REST, XML-RPC; REST, XML-RPC, SOAP; REST, SOAP; JS; Other). APIs labeled in the figure: google maps, del.icio.us, virtual earth, 411sync, yahoo! search, yahoo! geocoding, technorati, netvibes, yahoo! images, trynt, yahoo! local, amazon ECS, google search, flickr, ebay, youtube, amazon S3, live.com]
HTTP v SOAP v WS-* v Grid

Quote from user trying to use ClearForest SOAP API
when first released:
• “How about a REST interface or at least a simpler web
interface with a GET or POST form (minus the frames). This
would be a preferable option for many mashup environments,
compared to SOAP.”
• ClearForest offered a REST API within the week.

Microsoft DSS is an interesting high performance
service infrastructure supporting SOAP and HTTP
http://msdn.microsoft.com/robotics/.
• Runs well on multicore as well as distributed systems

Mashups can support multiple protocols, but the
“equilibrium” is an evolution to the simplest protocols, as
the advantages of complicated protocols get thrown away
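The difference in weight between the two styles shows up when the same call is written both ways. This is a minimal sketch, assuming a hypothetical tagging service: the URL, operation name, parameter names and service namespace are all invented for illustration; only the SOAP 1.1 envelope namespace is standard.

```python
from typing import Dict
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def rest_request(base_url: str, params: Dict[str, object]) -> str:
    """REST style: the whole request is one URL that can be sent with GET."""
    return f"{base_url}?{urlencode(params)}"

def soap_request(operation: str, params: Dict[str, object], service_ns: str) -> str:
    """SOAP style: the same call wrapped in an XML envelope to be POSTed."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

if __name__ == "__main__":
    params = {"text": "grid computing", "max": 5}
    print(rest_request("http://example.org/tag", params))
    print(soap_request("Tag", params, "http://example.org/tagging"))
```

The REST form can be pasted into a browser or a script tag, which is one plausible reading of why mashup builders asked for it.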
DSS Service Measurements
[Figure: average run time (microseconds, axis from 0 to 350) plotted against round trips (axis ticks at 1, 10, 100, 1000, 10000)]
Timing of HP Opteron Multicore as a function of number of simultaneous two-way
service messages processed (November 2006 DSS Release)
Measurements of Axis 2 show about 500 microseconds; DSS is substantially faster
Mashups are workflow (and vice versa)
Portals are start pages and portlets could be gadgets
So there is more or less no architecture
difference between Grids and Web 2.0 and we
can build e-infrastructure or Cyberinfrastructure
with either architecture (or mix and match)
We should bring Web 2.0 People capabilities to Grids (eScience,
Enterprises)
We should use robust Grid (motivated by Enterprise) technologies in
Mashups
See Enterprise 2.0 discussion at http://blogs.zdnet.com/Hinchcliffe/
OGF Activities


http://www.semanticgrid.org/OGF/ogf19/
White paper on Web 2.0 and Grids
• Use Web 2.0 Services like YouTube, MySpace, Maps
• Build e(Cyber)infrastructure with Web 2.0 Technologies like
Ajax, JSON, Gadgets

Two Web 2.0 OGF21 workshops on
• Commercial Web 2.0 (Catlett)
• Web 2.0 and Grids (De Roure, Fox, Gentzsch, Kielmann)
• Sessions (each one invited plus contributed papers) on:
  – Implications of Web2.0 on eScience
  – Implications of Web2.0 on OGSA (Grids)
  – Implications of Web2.0 on Enterprise
  – Implications of Web2.0 on Digital Libraries/repositories