Grid Computing – Overview and Research Issues
Grid Computing
Overview and Research Issues
Peter Kelly
Adelaide University, Australia
[email protected]
Supervisors:
Paul Coddington
Andrew Wendelborn
What is grid computing?
Grid computing is many things to many people
At its core, it’s about
Sharing computing resources between
organisations
Enabling more complex and demanding
applications by providing widespread access
to powerful computers and storage
Integrating existing systems together
What is grid computing?
In some respects it’s similar to cluster computing,
however each computer may
Be located in a different country
Use a different CPU architecture
Run a different operating system
Be owned by a different organisation
Have a different amount of memory, disk space,
computing power, and network bandwidth
Not be available all of the time
Thus grids are much more complex than clusters!
Why is it useful?
Demand for computing power is growing
rapidly
– In industry, science, government, engineering,
entertainment, defence… everywhere
Need ways to harness the large amount of
computing power available around the world
Organisations often want to collaborate on
projects and share resources with each other
Grids provide the infrastructure to integrate
different applications that need to collaborate
with each other to get useful work done
Types of grid computing
Service Oriented Architecture (SOA)
Job submission (supercomputer access)
Cycle stealing
Service Oriented Architecture
(SOA)
Applications are exposed as services, which provide
a well-defined interface and are accessed through
standard protocols
Clients use remote procedure calls to access these
services
[Diagram: the client sends a request to the service, which returns a response]
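The request/response exchange can be sketched in a few lines. This is only an illustration: it uses XML-RPC marshalling from the Python standard library as a stand-in for whichever standard protocol a real deployment would use, the `add` operation is hypothetical, and a direct function call takes the place of the network round trip.

```python
import xmlrpc.client


def add(a, b):
    # Hypothetical operation offered by the service.
    return a + b


# The service's well-defined interface: operation name -> implementation.
SERVICE_METHODS = {"add": add}


def handle_request(request_xml):
    """Service side: decode the request, run the operation, encode the response."""
    params, method = xmlrpc.client.loads(request_xml)
    result = SERVICE_METHODS[method](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)


def call(method, *args):
    """Client side: encode the call, 'send' it, decode the response."""
    request_xml = xmlrpc.client.dumps(args, methodname=method)
    response_xml = handle_request(request_xml)  # stands in for an HTTP round trip
    (result,), _ = xmlrpc.client.loads(response_xml)
    return result


print(call("add", 2, 3))
```

The client only ever sees the operation name and the XML messages, so the service could be reimplemented in another language without the client changing.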
Benefits of SOA
SOA is platform agnostic
– Client doesn’t need to know how service is implemented
– Service doesn’t need to know how client is implemented
SOA is vendor independent
– Based on open standards – no “lock in”
– All SOA vendors support the same standards to enable
interoperability
SOA is widely supported
– Many companies are getting behind it
– Being adopted widely in commercial and scientific
organisations
Job submission
Many organisations have large supercomputers (SMP or
clusters) that they want users to be able to submit jobs to
This can be achieved by installing middleware on each
supercomputer which interfaces to the local job queue
– e.g. Globus GRAM - allows users to submit to job queues
such as PBS, LSF, etc.
Users submit jobs to a superscheduler which manages a
“higher level” queue and dispatches jobs to resources
The grid middleware handles tasks such as copying files to
and from the execution node and monitoring job progress,
abstracting these details away from clients
Job submission
[Diagram: three clients submit jobs to a superscheduler, which dispatches them to two clusters and an SMP machine]
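A minimal sketch of the superscheduler idea, assuming a very simple policy: each job goes to the resource with the earliest estimated finish time. The `Resource` and `Superscheduler` classes, the `speed` figures, and the notion of "work units" are all hypothetical; real middleware such as Globus GRAM exposes a different and much richer interface.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    speed: float  # hypothetical: work units processed per second
    queue: list = field(default_factory=list)  # (job_name, work) pairs

    def estimated_finish(self, work):
        # Time to drain the local queue plus the new job.
        queued = sum(w for _, w in self.queue)
        return (queued + work) / self.speed


class Superscheduler:
    """Higher-level queue that dispatches jobs to local job queues."""

    def __init__(self, resources):
        self.resources = resources

    def submit(self, job_name, work):
        best = min(self.resources, key=lambda r: r.estimated_finish(work))
        best.queue.append((job_name, work))
        return best.name


scheduler = Superscheduler([Resource("cluster", 2.0), Resource("smp", 1.0)])
first = scheduler.submit("job-a", 4)   # cluster: 2s, smp: 4s
second = scheduler.submit("job-b", 2)  # cluster: 3s (busy), smp: 2s
```

The client never names a resource; it only submits to the superscheduler, which is the point of the abstraction.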
Benefits of Job Submission Grids
Users do not have to worry about differences
between job submission systems running on
different resources
Superschedulers make it possible to automatically
find resources that will execute the job quicker
A user submits a job to a grid, it runs, and they get
the results back later
Job submission can be implemented on top of SOA
by providing a service with methods for submitting
and monitoring jobs, as well as notifying clients of
failures or completion
– e.g. Globus MMJFS – provides a web service interface to
allow users to submit jobs
Cycle stealing
The use of large numbers of desktop PCs to run
“embarrassingly parallel” applications
A master node coordinates execution and hands out
tasks to workers
The worker process on each machine polls the
master for work to do, and then executes the tasks
as they become ready
Worker detects when the machine is being used by
a user and suspends/aborts the active task
This model is inherently fault tolerant; if a machine
dies or a task is aborted it can just be sent to
another worker
Cycle stealing
[Diagram: one master connected to four workers]
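The master/worker protocol can be sketched as follows. This is an illustrative, single-process simulation under stated assumptions: the `Master` and `Worker` classes and their method names are hypothetical, round-robin polling stands in for independent desktop machines, and "the user is active" is modelled as a callable.

```python
from collections import deque


class Master:
    """Hands out tasks and collects results (hypothetical API)."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.results = {}

    def request_task(self):
        # Called by a polling worker; None means nothing is ready.
        return self.pending.popleft() if self.pending else None

    def report_result(self, task, result):
        self.results[task] = result

    def report_abort(self, task):
        # Re-queue the task so that another worker can pick it up.
        self.pending.append(task)


class Worker:
    def __init__(self, master, compute, user_active):
        self.master = master
        self.compute = compute
        self.user_active = user_active  # callable: is the PC's owner using it?

    def poll_once(self):
        task = self.master.request_task()
        if task is None:
            return
        if self.user_active():
            self.master.report_abort(task)
        else:
            self.master.report_result(task, self.compute(task))


def run(master, workers):
    # Round-robin polling stands in for independent machines.
    while master.pending:
        for worker in workers:
            worker.poll_once()


def square(n):
    return n * n


master = Master(range(5))
busy_once = iter([True])  # the first worker's user is active for one poll
workers = [
    Worker(master, square, lambda: next(busy_once, False)),
    Worker(master, square, lambda: False),
]
run(master, workers)
print(master.results)
```

Task 0 is aborted by the first worker and later completed by the second, which is the fault-tolerance property described above: because tasks are independent, re-running one elsewhere is always safe.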
Benefits of cycle stealing
Organisations can use their existing infrastructure to
run computationally demanding applications
– No need to invest in large SMP systems or clusters
Large-scale internet projects can get free computing
power
– …provided they can convince users to donate CPU time
– e.g. SETI@Home
Cheap supercomputing
Generally easy to deploy
So what really is grid computing?
Not really one specific technology or concept
More of an umbrella term, like “networking” or “operating
system”
Any (concrete) discussion of grid computing requires all
parties involved to agree on a definition of what features they
are focusing on
Very much dependent upon what you want to do – different
types of organisations have different requirements
Sometimes the lines are blurred and numerous systems
support multiple “types” of grid computing
Lots of hype – can be very confusing at first!
– it took me about a year to understand it enough to be able
to figure out what I wanted to do in my project
Web services
Web services are a particular type of SOA
Based on standards from W3C and others:
– WSDL - language for defining service interfaces
– SOAP - format used for exchange of messages
– UDDI - directory mechanism for locating services
– XML - the standard encoding mechanism used by
WS protocols
– … and many more
Web services are supported by all major programming
languages
– either as part of built-in APIs or add-on libraries
Today web services are the most popular mechanism for
integrating systems together in and between organisations
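To make the role of SOAP and XML concrete, here is a small sketch that builds a SOAP 1.1 request message with the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace `http://example.org/primes`, the `nprime` operation, and its `n` parameter are assumptions made up for illustration. In practice the message layout would be dictated by the service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.org/primes"  # hypothetical service namespace


def build_request(operation, **params):
    """Return a SOAP envelope (as a string) that invokes one operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


print(build_request("nprime", n=2000))
```

Any client able to produce this XML can call the service, regardless of the language either side is written in.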
Web service composition
A programming model based on composing together
functionality provided by multiple web services
Similar to the use of shared libraries/DLL files
– common functionality provided by shared entity (service)
– composition program builds additional functionality by
making use of one or more services
Service composition programs can themselves be exposed
as web services
– Can then be accessed by clients
– Or used as part of even higher-level service compositions
Most popular language at present is BPEL (Business Process
Execution Language)
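As a rough illustration of composition (written in Python rather than BPEL), the sketch below builds one higher-level operation out of two lower-level ones. Both `lookup_price` and `convert` are hypothetical local stubs standing in for remote web services, and the prices are invented sample data.

```python
def lookup_price(item):
    # Stand-in for a remote "catalogue" service (sample data only).
    return {"widget": 10.0, "gadget": 25.0}[item]


def convert(amount, rate):
    # Stand-in for a remote "currency conversion" service.
    return round(amount * rate, 2)


def price_in_currency(item, rate):
    # The composition: new functionality built from the two services.
    # It could itself be exposed as a service and composed further.
    return convert(lookup_price(item), rate)


print(price_in_currency("widget", 1.5))
```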
SOA programming vs. remote
execution
Web services allow you to invoke programs already
installed on a remote machine
Remote code execution allows you to execute
arbitrary code on a remote machine
The latter is used for job submission and cycle
stealing systems
Our research investigates a combination of these
approaches
– Provides ability to invoke and expose web services
– Provides a distributed execution environment
Execution Environments
Problem: Need a standard way of executing
arbitrary code remotely
SOA doesn’t give you this
– it only standardises the protocols for different applications
to interact with each other
Job submission systems don’t give you this
– only standardise the means of submitting and monitoring
jobs – but not how they are actually executed
Cycle stealing requires this
– existing cycle stealing systems these days typically
specify Java or .NET, or use app-specific worker code
– but there is no standard which allows us to do this on an
internet scale
What is an execution environment
Instruction set
– e.g. x86, PPC, SPARC, Java bytecode, .NET bytecode
API library
– e.g. WIN32, POSIX, Java class libraries, .NET class
libraries
Applications are always compiled for a specific execution
environment
Can have different implementations of that environment
– x86 - AMD, Intel
– Java - Sun, IBM, various open source efforts
– .NET - Microsoft, Mono project
Applications compiled for a particular execution environment
can run on any implementation of it
Virtual machines
Common way of implementing an execution environment
Abstracts away from underlying hardware/OS, providing
platform independence
In a grid containing machines of different CPU architectures
and operating systems this is necessary to provide seamless
access
To enable code to be executed anywhere, each machine on
the grid must provide the same execution environment
Currently popular virtual machines:
– Java Virtual Machine (JVM)
– .NET Common Language Runtime (CLR)
A grid execution environment?
Problem: No standard execution environment
supported by the popular grid middleware
Standardisation efforts (GGF) to date have focused
only on service interfaces, not implementation
Each grid middleware system provides its own set of
APIs, and is targeted at different VMs/OSs
Applications are not yet portable between different
middleware systems
– At least not in the same sense that bytecode-compiled
code is portable
– Compatibility exists only at the service interface level
Standardisation?
My belief:
We won’t see the full potential of grid computing until we
have agreement on a standard execution environment
Currently only SOA aspects are standardised
– But this goes only half way to solving the problem
This is very much an open research issue
Obvious candidates are Java and .NET
– But are they sufficient? Should they be extended?
What about other alternatives?
– Much research already done into VM technology
– But not so much in the grid community
– IMHO a very important issue! More research needed here
Standardisation?
It’s just like the web
Early web pages were static, as there was no support for
executing code in the browser; code only ran server-side
– In the grid world this corresponds to SOA
Then came early versions of JavaScript/DHTML
– Lack of standardisation, browsers were incompatible
Now we have a standard, widely supported, platform
independent execution environment on 300+ million
computers worldwide (JavaScript/ECMAScript)
– And look what happened… client side web apps, AJAX,
Google maps, “Web 2.0” and the rest
I predict grid will go through the same evolution
Our current research
Investigating how to combine SOA and remote code
execution programming models
Development of a new virtual machine + language
implementation targeted at grid applications
GridXSLT
An implementation of the XSLT programming
language
– Supports web service composition
– Executes programs across a grid in parallel
– Provides a natural way to deal with XML data
Why XSLT?
Ideal for manipulating XML data
Has a “semantic match” with many properties of web
services
Is a functional language and can be automatically
parallelised
W3C standard with a sizable existing user base
– We wanted to avoid the challenges of trying to
design a new language and introduce it to the
world
– Better to just develop a new implementation of an
existing one which is already popular
Support for XML data
XSLT is specifically designed for dealing with XML data
All web services exchange data in XML format
Java, C#, C++ etc. are less suitable for manipulating XML
because they are not designed for this (and in fact pre-date
XML)
– XML data is a “second class citizen” in these languages
and must be accessed through library functions or
converted into objects
– APIs like DOM, SAX, etc. are less intuitive than built-in
language constructs
– Conversion to objects carries significant overheads and
risks losing information (e.g. element ordering)
We argue that XSLT is therefore a useful approach to
developing composite web services
Pass by value semantics
Another mismatch between OO languages and web services
is the way in which function arguments are handled
OO languages typically pass objects by reference - allowing a
function to modify its arguments and the caller to see those
changes
Web services use pass by value - where a new copy of each
argument is made and a function can only transfer
information to its caller through the return value
When using an OO language for WS development, the
programmer must be aware of this and it can sometimes lead
to mistakes
As a side effect-free functional language, XSLT uses pass by
value, avoiding this problem
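The mismatch can be shown with a short sketch, using Python as a stand-in for an OO language. The `add_item` function and the order data are hypothetical, and `call_by_value` only imitates what happens when arguments are serialised and sent to a web service: the callee works on a copy.

```python
import copy


def add_item(order, item):
    # Mutates its argument, as is common in OO code.
    order["items"].append(item)
    return order


def call_by_value(func, *args):
    # Imitates a web service call: the callee receives copies, so the
    # only way information flows back is through the return value.
    return func(*copy.deepcopy(args))


local_order = {"items": ["a"]}
add_item(local_order, "b")  # local call: the caller sees the change

remote_order = {"items": ["a"]}
returned = call_by_value(add_item, remote_order, "b")  # caller's object untouched
```

Code that relies on the first behaviour silently breaks when the call is moved behind a service interface, which is the kind of mistake referred to above.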
Parallel execution
XSLT is a functional language
Functions and loops do not have side effects - there
is no global state that can be modified
This enables automatic parallelisation
– All arguments to a function call can be evaluated in
parallel
– All iterations of a loop can be evaluated in parallel
The programmer never needs to even know that
their program will be run in parallel
– No dealing with threads, synchronisation, critical sections,
message passing, race conditions etc…
– The underlying runtime system deals with all these issues
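The following sketch illustrates why side effect-free loops are safe to parallelise. It is written in Python rather than XSLT, the parallelism is requested explicitly rather than discovered automatically, and `word_count` and `parallel_for_each` are hypothetical names. Threads are used only to keep the example self-contained; they show the independence of iterations, not a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(document):
    # Pure function: the result depends only on the argument.
    return len(document.split())


def parallel_for_each(func, items, max_workers=4):
    # Iterations are independent, so they may run in any order;
    # Executor.map still returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


documents = ["a b c", "d e", "f"]
counts = parallel_for_each(word_count, documents)
print(counts)
```

Because `word_count` touches no shared state, the parallel result is guaranteed to equal the sequential one, which is what lets a runtime make this choice on the programmer's behalf.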
Implementing XSLT
We use a technique called graph reduction, a
common way of implementing functional
languages
A program is represented as a graph
Execution proceeds by performing a series of
transformations on the graph
Graph reduction: Example
[Figure sequence: the graph for 2*(3+4), built from @ (application) nodes, is reduced step by step]
2*(3+4)
2*7
14
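The 2*(3+4) example can be reproduced with a very small reducer. This is a simplified sketch, not the GridXSLT implementation: it assumes only two built-in binary operators, always fully applied, and the `Node`, `app`, `leaf`, and `reduce_node` names are hypothetical. It does keep the essential feature of graph reduction, which is that each reducible node is overwritten in place with its result.

```python
import operator

BUILTINS = {"+": operator.add, "*": operator.mul}


class Node:
    """Either a leaf (value set) or an application node '@' (fun and arg set)."""

    def __init__(self, value=None, fun=None, arg=None):
        self.value, self.fun, self.arg = value, fun, arg

    @property
    def is_app(self):
        return self.fun is not None


def app(fun, arg):
    return Node(fun=fun, arg=arg)


def leaf(value):
    return Node(value=value)


def reduce_node(node):
    """Reduce the graph rooted at node in place and return its value."""
    if not node.is_app:
        return node.value
    # Unwind the spine: walk down the 'fun' pointers, collecting arguments.
    args, head = [], node
    while head.is_app:
        args.append(head.arg)
        head = head.fun
    args.reverse()
    op = BUILTINS[head.value]
    left, right = (reduce_node(a) for a in args)
    # Overwrite the root of the redex so shared references see the result.
    node.value, node.fun, node.arg = op(left, right), None, None
    return node.value


inner = app(app(leaf("+"), leaf(3)), leaf(4))  # 3+4
expr = app(app(leaf("*"), leaf(2)), inner)     # 2*(3+4)
print(reduce_node(expr))
```

After the call, `inner` has been overwritten with the leaf 7 and `expr` with the leaf 14, matching the intermediate and final graphs in the example.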
Parallel graph reduction
Graph reduction permits the possibility of
parallel execution by allowing multiple parts of
the graph to be reduced in parallel
Each processor in a parallel computer or
cluster can manipulate a separate portion of
the graph
Parallel graph reduction
[Figure sequence: the two nprime applications are reduced on separate processors, then the results are added]
+ (nprime 2000) (nprime 2001)
Processor 1: nprime 2000
Processor 2: nprime 2001
+ 17389 17393
34782
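The nprime example can be imitated with ordinary Python. This sketch makes several assumptions: `nprime` is implemented by naive trial division, `parallel_apply` is a hypothetical helper, and two threads play the part of the two processors. It demonstrates that the arguments can be evaluated independently before the addition is applied; it is not a parallel graph reducer and is not expected to run faster than the sequential version.

```python
import operator
from concurrent.futures import ThreadPoolExecutor


def nprime(n):
    """Return the n-th prime (1-indexed) using trial division."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


def parallel_apply(op, *thunks):
    # Evaluate every argument concurrently, then apply the operator.
    with ThreadPoolExecutor(max_workers=len(thunks)) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        return op(*(future.result() for future in futures))


total = parallel_apply(operator.add, lambda: nprime(2000), lambda: nprime(2001))
print(total)
```

Because `nprime` has no side effects, neither evaluation can affect the other, so the order in which the two "processors" finish does not matter.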
Functional programming for grids?
It permits
Automatic, seamless parallelism
Automatic, seamless fault tolerance
Automatic, seamless distribution
But…
Some programs are based on state, which is in conflict with
the pure functional programming model
– Although there are ways to get around this, e.g. monads
Different programming style to what most people are used to
– Involves a learning curve
– But might be worth it to get the above benefits
– …depending on your needs
Summary
Grid computing is a very diverse area
– Many different types of systems
– Many different requirements
– Useful in many areas
Different “types” of grid computing
– SOA, job submission, cycle stealing
– Others as well that I haven’t discussed here
Lots of challenges and open research questions
– e.g. defining a suitable execution environment for grid
applications
– This is just one of many!
Summary
Our research project - GridXSLT
An attempt to combine different grid
computing models
– SOA
– Remote code execution/cycle stealing
Aims to make the programmer’s job easier
– Parallelisation handled by the compiler
– Suited to dealing with XML data exchanged by
web services and stored in XML databases
– High-level language which hides underlying
details
Websites of interest
Global Grid Forum
– http://www.ggf.org
Grid Café (introduction to grid computing)
– http://www.gridcafe.org
IBM - grid computing
– http://www.ibm.com/grid
GridXSLT
– http://gridxslt.sourceforge.net
Updates on my research
– http://pmkelly.blogspot.com