
COMP3019
Coursework:
Introduction to M-grid
Steve Crouch
[email protected], stc@ecs
School of Electronics and Computer Science
Objectives

To equip students to drive a lightweight grid
implementation to solve a problem that can
benefit from using grid technology.

To develop an understanding of the basic
mechanisms used to solve such problems.

To develop a general architectural and
operational understanding of typical
production-level grid software.

To develop the programming skills required
to drive typical services on a production-level grid.
Overview
• Part 1: m-grid
  – m-grid: lightweight software illustrating grid concepts in use
  – Develop a program with m-grid's Java API to solve a simple problem, submit it to m-grid with input data, collect results
• Part 2: Google MapReduce & GridSAM
  – MapReduce: framework for distributed processing of large datasets using many computers
  – GridSAM: job submission web service interface to a computational resource (e.g. compute cluster, single machine)
  – Extend code stubs to submit jobs to GridSAM and monitor them to completion
  – Extend pseudocode that implements a basic MapReduce framework (see the sketch after this list)
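
To give a feel for the MapReduce part before you see the coursework stubs, here is a rough single-machine sketch of the pattern in Java. The names Mapper, Reducer and SimpleMapReduce are illustrative assumptions, not the actual stub code:

    import java.util.*;
    import java.util.AbstractMap.SimpleEntry;

    // Illustrative single-machine MapReduce sketch; a real framework
    // would spread the map and reduce calls across many machines.
    interface Mapper<K, V> {
        // Emit intermediate (key, value) pairs for one input record.
        List<SimpleEntry<K, V>> map(String record);
    }

    interface Reducer<K, V> {
        // Combine all intermediate values that share a key.
        V reduce(K key, List<V> values);
    }

    class SimpleMapReduce {
        // Map every record, group the emitted pairs by key, reduce each group.
        static <K, V> Map<K, V> run(List<String> input,
                                    Mapper<K, V> mapper, Reducer<K, V> reducer) {
            Map<K, List<V>> groups = new HashMap<>();
            for (String record : input)
                for (SimpleEntry<K, V> pair : mapper.map(record))
                    groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                          .add(pair.getValue());
            Map<K, V> results = new HashMap<>();
            for (Map.Entry<K, List<V>> g : groups.entrySet())
                results.put(g.getKey(), reducer.reduce(g.getKey(), g.getValue()));
            return results;
        }

        // Word count, the classic MapReduce example.
        public static void main(String[] args) {
            Map<String, Integer> counts = SimpleMapReduce.<String, Integer>run(
                Arrays.asList("the cat", "the dog"),
                record -> {
                    List<SimpleEntry<String, Integer>> out = new ArrayList<>();
                    for (String w : record.split("\\s+"))
                        out.add(new SimpleEntry<>(w, 1));
                    return out;
                },
                (word, ones) -> ones.size());
            System.out.println(counts);  // e.g. {the=2, cat=1, dog=1} (order may vary)
        }
    }

The distributed version has the same shape; the framework's job is to run the map calls on many executors, shuffle the grouped pairs between machines, and run the reduces in parallel.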
Where to get stuff/help?
• Can obtain coursework materials from website
  – Ready for Wednesday
• Software documentation
• Coursework help lecture 19th March
• Myself: [email protected]
• Building 32: Level 4 lab 4067 Bay 23
Background
The Problem
• Basically, want to run a compute-intensive task
• Don't have enough resources to run the job locally
  – At least, not to return results within a sensible timeframe
• Would like to use another, more capable resource
Distributed Computing in Olden Times
• Small number of 'fast' computers
  – Very expensive
  – Centralised
  – Used nearly all the time
  – Time allocations for users
  – Not updated often
• Mainframes
  – Cray-1 1976 - $8.8 million, 160 megaflops, 8MB memory
• Punched cards
• Wait time huge
• MailNet, SneakerNet, TyperNet, etc…
[Images: Cray X-MP, Cray-1 successor (credit: brewbooks); Univac 1710 (credit: Michael L. Umbricht and Carl R. Friend)]
The Present…
• Now… large number of slow computers:
  – Cheap
  – Distributed
    • Computation
    • Ownership
  – Not used all the time
  – Exclusive access to users
  – Updated often
  – e.g. desktop computers, PDAs, mobile phones
• Low utilisation of computing power
• e.g.: institutional/university resources…
It’s About Scaling Up…
• Then… the march towards localisation of computation, the Personal Computer
• Computational Science develops in laboratories
• Is this changing again?
• Compute and data – you need more, you go somewhere else to get it
[Diagram: resource scale widening from Local to Institutional, National and International]
Images: nasaimages, Extra Ketchup, Google Maps, Dave Page
The Grid - a Reminder
• The grid – many definitions!

"Grid computing offers a model for solving massive computational problems by making use of the unused CPU cycles of large numbers of disparate, often desktop, computers treated as a virtual cluster embedded in a distributed telecommunications infrastructure" – Wikipedia

"A service for sharing computer power and data storage capacity over the Internet." – CERN (European Organisation for Nuclear Research)

• Two components of grid computing:
  – Computational/data resource – e.g. computational cluster, supercomputer, desktop machine
  – Infrastructure for externalising that resource to others
Some Examples…
• Grid (i.e. internet-accessible) examples:
  – SETI@Home http://setiathome.ssl.berkeley.edu/
    • Process data from Arecibo Radio Telescope, Puerto Rico
    • 2 million volunteers installed software
  – Univa.org – http://www.univa.org/
    • Projects such as Cancer Research, Smallpox
    • 2.5 million volunteer systems
    • Sells processing time to organisations
• Computational resource (i.e. intranet-accessible):
  – Cluster managers, supercomputer, single machine
The Idea - as a Provider…
• Goal: I want others to access my resources & applications
• I want to provide secure controlled access to:
  – My applications:
    • Specify who can access which applications
  – My computational or data resources:
    • I can limit external usage of my resources
• Provides an interface that allows remote users to access my resources
• Enable collaboration with other partners
The Idea - as a User (or Client)
• Goal: I want to use other resources & applications
• Through a network of service providers I can…:
  – Gain access to applications that I do not have installed locally
  – Use remote machines [larger resource] with more CPU, memory or storage
    • Process larger problem sizes
  – Transparently switch between different service providers
    • No exposure to underlying OS, queuing policy, disk layout etc.
Cluster Computing & the Grid
[Diagram: clusters at University A, University B and University C linked by the Grid]
Grid is predominantly built on Cluster Computing solutions
The General Idea…
[Diagram: multiple Clients submit to a Coordinator, which dispatches work to multiple Executors]
• Abstract 'virtualisation' of local network resources
  – Infrastructure manages many machines
  – Visualised as a single resource
  – Submitted jobs get put on queue(s)
Condor – Background
• Begun in 1988, based on the Remote-Unix (RU) project
• Predominantly makes use of idle cycles on machines
Condor Components
• Four main machine 'roles' (daemons):
  – Submit Client (condor_schedd): used to submit resource requests, monitor, modify and delete jobs.
  – Central Manager, Server
    • condor_collector: collects information about pool resources.
    • condor_negotiator: negotiates (match-makes) between resources and resource requests.
  – Job Executor (condor_startd): executes jobs, advertises resources. Enforces local policy.
  – (Checkpoint Server (condor_ckpt_server): services requests to store and retrieve checkpoint files.)
Condor Architecture
[Diagram: Submit client (condor_schedd, condor_shadow…) with a local queue; Server running the Negotiator (condor_negotiator) and Collector (condor_collector); Executors (condor_startd, condor_starter…) with shared disk. Numbered arrows correspond to the steps below.]
1. Client submits job (executable + input data) to local queue
2. Client schedd advertises job request to server collector
3. Server negotiator gets next priority request from collector
4. Negotiator negotiates w/ client schedd to match resource/job
5. Client removes job from queue and sends it to executor
6. Job runs on executor
7. Job output results returned to client
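
To make step 1 concrete, a minimal Condor submit description file looks roughly like this (the file names are illustrative, not from the coursework):

    # Minimal Condor submit description file (illustrative names)
    universe   = vanilla        # standard execution environment
    executable = myprog         # the program to run
    input      = input.dat      # file fed to the job's stdin
    output     = myprog.out     # where the job's stdout is written
    error      = myprog.err     # where the job's stderr is written
    log        = myprog.log     # Condor's own record of job events
    queue                       # place one copy of the job on the queue

Handed to condor_submit, this puts the job on the local schedd's queue; condor_q then shows it waiting while the collector and negotiator match it to an executor.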
M-grid
An overview
Computational Grids - in General
[Diagram: multiple Clients, a Coordinator, and multiple Executors]
• Users supply tasks to be performed via client
• Execution nodes contribute processing power
• Coordinator node sends tasks to execution nodes, ensuring results are returned
• Existing grid technology is sophisticated -> significant complexity
  – To what extent can this be reduced?
Java Applets?
• How about Java applets as a program unit?
  – Browsers could act as execution nodes
• Security concerns?
  – Web browsers execute foreign code
  – Java applets executed within a 'sandbox' virtual machine
  – Stringent security restrictions imposed
  – In-built security configuration in browsers
  – Applet can only contact originating server
• Risk significantly reduced
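
As a small illustration of the 'originating server only' rule: inside the sandbox an applet may open connections back to the host it was loaded from, obtained via getCodeBase(), while an attempt to contact any other host throws a SecurityException. A minimal sketch (the file name input.txt is hypothetical):

    import java.applet.Applet;
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    // Sketch: fetch a file from the applet's originating server.
    // The sandbox allows this because the URL is relative to getCodeBase();
    // the same call against any other host would throw SecurityException.
    public class FetchFromOrigin extends Applet {
        public void start() {
            try {
                URL data = new URL(getCodeBase(), "input.txt");  // hypothetical file
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(data.openStream()));
                System.out.println(in.readLine());
                in.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }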
M-grid: A Lightweight Grid I
[Diagram: Clients, a Coordinator and Executors, as before]
• M-grid:
  – Execution node = Java-applet enabled browser
  – Client = browser
  – Coordinator = web server
  – Tasks distributed as Applets in web pages
• Execution node browser opens web page on server
• Runs task applet
• Uploads results to server
M-grid: Overview
• Implemented on:
  – Microsoft's IIS (Internet Information Server) using ASP
  – Apache Tomcat – we'll use this one!
• Client
  – Develops applet class as extension to MGridApplet class
  – Can run applet locally in appletviewer for testing
  – Compiles and packages applet with input parameters file into a jar file
  – Submits jar to web server via JobSubmit web page
  – Eventually collects results via ViewJobs web page
• Execution node
  – Requests a job via JobRequest page
  – Applet submits results from job using SubmitResults page
• Security provided by session authentication
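
Putting the client-side steps together, a task class might look roughly as follows. MGridApplet is the class the slide names, but the readInput() and submitResult() methods below are purely hypothetical placeholders; check the m-grid API documentation for the real method names and signatures:

    // Hedged sketch only: extends the MGridApplet class named above, but
    // readInput() and submitResult() are HYPOTHETICAL placeholders, not
    // the real m-grid API -- consult the documentation for actual names.
    public class SquareTask extends MGridApplet {
        public void start() {
            int n = Integer.parseInt(readInput());    // hypothetical: value from the jar's input parameters file
            long result = (long) n * n;               // the actual computation
            submitResult(Long.toString(result));      // hypothetical: upload via the SubmitResults page
        }
    }

Packaged with its input parameters file into a jar and submitted via the JobSubmit page, this would be fetched and run by an executor browser as in the list above.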
Architecture