Globus Virtual Workspaces

Cloud Computing with Nimbus
FNAL, January 2009
Kate Keahey
([email protected])
University of Chicago
Argonne National Laboratory
Cloud Computing
Elastic computing,
Pay-as-you-go,
Capital expense
Science Clouds
operational expense
10/20/08
The Nimbus Toolkit: http://workspace.globus.org
Everything-as-a-Service
SaaS
PaaS
IaaS
The Quest Begins

Code complexity

Resource control
“Workspaces”


Dynamically provisioned environments

Environment control

Resource control
Hardware implementations vs virtualization
A Brief History of Nimbus
[Figure: timeline with axis marks at 2003, 2006 and 2009; event labels listed in the order they were extracted]
STAR production runs on EC2
Xen released (2003)
Research on agreement-based services
EC2 goes online (2006)
First Workspace Service release
Nimbus Cloud comes online
EC2 gateway available
Support for EC2 interfaces

Nimbus Overview

Goal: open source, extensible, IaaS implementation and tools
  Specifically targeting scientific community
  A platform for experimentation with features for scientific needs
  Set up private clouds (privacy, expense considerations)
Tools
  IaaS layer (Workspace Service)
  Orchestration layer (Context Broker, gateway)
http://workspace.globus.org/

The Workspace Service
[Figure: the VWS Service in front of a pool of twelve nodes]

The Workspace Service
The workspace service publishes information on each workspace as standard WSRF Resource Properties.
Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to).
Users can interact directly with their workspaces the same way they would with a physical machine.
[Figure: the VWS Service in front of a pool of twelve nodes, with a region labeled Trusted Computing Base (TCB)]

Workspace Service
Interfaces and Clients


Web Services based
Web Service Resource Framework (WSRF)
  GT-based
Elastic Compute Cloud (EC2)
  Supported: ec2-describe-images, ec2-run-instances, ec2-describe-instances, ec2-terminate-instances, ec2-reboot-instances, ec2-add-keypair, ec2-delete-keypair
  Unsupported: availability zones, security groups, elastic IP assignment, REST
  Used alongside WSRF interfaces
    E.g., the University of Chicago cloud allows you to connect via the cloud client or via the EC2 client

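The supported/unsupported split on this slide can be captured as a small lookup table. The following is only an illustrative sketch: the command and feature names are taken from the slide, while `SUPPORTED_EC2_COMMANDS`, `UNSUPPORTED_EC2_FEATURES` and `check_ec2_command` are hypothetical names that are not part of Nimbus.

```python
# Hypothetical helper, not part of Nimbus: it only encodes the slide's list
# of EC2 client commands that the Workspace Service accepted at the time.
SUPPORTED_EC2_COMMANDS = frozenset({
    "ec2-describe-images",
    "ec2-run-instances",
    "ec2-describe-instances",
    "ec2-terminate-instances",
    "ec2-reboot-instances",
    "ec2-add-keypair",
    "ec2-delete-keypair",
})

# EC2 features the slide lists as unsupported.
UNSUPPORTED_EC2_FEATURES = frozenset({
    "availability zones",
    "security groups",
    "elastic IP assignment",
    "REST",
})


def check_ec2_command(command: str) -> bool:
    """Return True if the slide lists `command` as supported."""
    return command.strip().lower() in SUPPORTED_EC2_COMMANDS
```

A client wrapper could call `check_ec2_command` before forwarding a request, so that unsupported commands fail early with a clear message.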
Security


GSI authentication and authorization
  PKI credential required
  Works with Grid proxies
  VOMS, Shibboleth (via GridShib), custom PDPs
Secure access to VMs
  EC2 key generation or accessed from .ssh
Validating images and image data
  Collaboration with Vienna University of Technology

Networking

Network configuration
  External: public IPs or private IPs (via VPN)
  Internal: private network via a local cluster network
  Each VM can specify multiple NICs mixing private and public networks (WSRF only)
    E.g., cluster worker nodes on a private network, headnode on both public and private network

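The cluster example on this slide (workers on a private network, headnode on both) can be written down as a small data model. This is a sketch under stated assumptions: `Nic`, `VmSpec` and `build_cluster` are hypothetical names, and the interface names `eth0`/`eth1` are illustrative; none of this is the actual Nimbus request format.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Nic:
    """One virtual network interface (hypothetical model)."""
    name: str
    network: str  # "public" or "private"


@dataclass
class VmSpec:
    """A VM together with the NICs it asks for."""
    name: str
    nics: List[Nic] = field(default_factory=list)

    def networks(self) -> Set[str]:
        """Return the set of networks this VM is attached to."""
        return {nic.network for nic in self.nics}


def build_cluster(n_workers: int) -> List[VmSpec]:
    """Headnode on public and private networks, workers on private only."""
    head = VmSpec("headnode", [Nic("eth0", "public"), Nic("eth1", "private")])
    workers = [
        VmSpec(f"worker{i}", [Nic("eth0", "private")])
        for i in range(1, n_workers + 1)
    ]
    return [head] + workers
```

For example, `build_cluster(3)` yields one headnode reachable from outside and three workers that are only reachable over the cluster's private network.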
The Back Story
Workspace WSRF front-end that allows clients to deploy and manage virtual workspaces
Workspace back-end: resource manager for a pool of physical nodes; deploys and manages workspaces on the nodes
Each node must have a VMM (Xen) installed, as well as the workspace control program that manages individual nodes
[Figure: the VWS Service in front of a pool of twelve nodes, with a region labeled Trusted Computing Base (TCB)]

Workspace Components
[Figure: component diagram with labels EC2, WSRF, workspace service, workspace resource manager, workspace control, workspace pilot, workspace client]

Workspace Control


VM image propagation
Image management and reconstruction
  Creating blank partitions, sharing partitions
VM control
  Starting, stopping, pausing, etc.
Integrating a VM into the network
  Assigning MAC addresses and IP addresses
  DHCP delivery tool
  Building up a trusted (non-spoofable) networking layer
Contextualization information management
Talks to the workspace service via ssh
Standalone component
Some functionality overlap with libvirt
Implementations in Xen and KVM (queued up for release)

The Workspace Resource Manager
Basic slot fitting
  Implements “immediate leases”
  Extensible vehicle to experiment with different leases
Open source resource manager for multiple different VMMs
Datacenter technology equivalent
  Can be replaced by OpenNebula or other datacenter technologies
Deployment
  University of Chicago, University of Florida, Purdue, Masaryk University and all the other Science Cloud sites

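To make "basic slot fitting" with "immediate leases" concrete, here is a minimal sketch. It assumes that an immediate lease is either granted at once or rejected, and that fitting is first-fit on free memory; the slide does not spell out either detail, and `Node` and `ImmediateLeaseManager` are hypothetical names rather than Nimbus classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A physical pool node with some free memory (assumed fitting metric)."""
    name: str
    free_mem_mb: int


class ImmediateLeaseManager:
    """Toy resource manager: grant a slot right now or reject the request."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        # vm id -> (node name, memory reserved on that node)
        self.placements: Dict[str, Tuple[str, int]] = {}

    def request(self, vm_id: str, mem_mb: int) -> Optional[str]:
        """Place the VM on the first node with room; return None if rejected."""
        if vm_id in self.placements:
            raise ValueError(f"{vm_id} is already placed")
        for node in self.nodes:
            if node.free_mem_mb >= mem_mb:
                node.free_mem_mb -= mem_mb
                self.placements[vm_id] = (node.name, mem_mb)
                return node.name
        return None  # no queueing: an immediate lease fails if nothing fits

    def release(self, vm_id: str) -> None:
        """Return the VM's memory to the node that hosted it."""
        node_name, mem_mb = self.placements.pop(vm_id)
        for node in self.nodes:
            if node.name == node_name:
                node.free_mem_mb += mem_mb
                return
```

Other lease types could be tried by replacing the `request` policy, for instance by queueing requests that do not fit instead of rejecting them.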
The Workspace Pilot


Challenge: how can I provide a virtualization solution without disrupting the current operation of my cluster?
Flying Low: the Workspace Pilot
  Integrates with popular LRMs (such as PBS, SGE)
  Implements “best effort” leases
  Glidein approach: submits a “pilot” program that claims a resource slot
  Includes administrator tools
Deployment
  Testing @ U of Victoria (Atlas), Ian Gable and collaborators
  Adapting for the use of the Atlas experiment @ CERN, Omer Khalid

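The glidein idea can be illustrated with a toy queue: the pilot is submitted like any other batch job, and only once the local resource manager starts it does its slot become available for VMs, which is why the resulting lease is "best effort". This is a simplified sketch; `ToyLrm`, `leased_slots` and the `pilot-` naming convention are assumptions made for illustration and do not model PBS, SGE or the real pilot.

```python
from collections import deque
from typing import Deque, List


class ToyLrm:
    """Toy stand-in for an LRM: a fixed number of slots and a FIFO queue."""

    def __init__(self, slots: int) -> None:
        self.free_slots = slots
        self.queue: Deque[str] = deque()
        self.running: List[str] = []

    def submit(self, job: str) -> None:
        """Queue a job and start it if a slot is free."""
        self.queue.append(job)
        self._schedule()

    def finish(self, job: str) -> None:
        """Mark a running job as done and hand its slot to the next job."""
        self.running.remove(job)
        self.free_slots += 1
        self._schedule()

    def _schedule(self) -> None:
        # Start queued jobs in order while slots remain.
        while self.free_slots > 0 and self.queue:
            self.running.append(self.queue.popleft())
            self.free_slots -= 1


def leased_slots(lrm: ToyLrm) -> List[str]:
    """Running pilots hold slots that can host VMs (assumed naming scheme)."""
    return [job for job in lrm.running if job.startswith("pilot-")]
```

In this model ordinary batch jobs keep running undisturbed, and a workspace request simply waits until its pilot reaches the front of the queue.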
Cloud Closure
[Figure: component diagram with labels EC2, WSRF, storage service, workspace service, cloud client, workspace resource manager, workspace control, workspace pilot, workspace client]

IaaS Gateway

Goals
  Access to different IaaS infrastructures
  Account management
  Facilitate movement between academic and commercial clouds and creation of meta-clouds
  Combine higher-level tools and IaaS
Released as service, not as code
First online in June 2007, currently in a rewrite
Used to move e.g., HEP STAR experiments between Science Clouds and EC2

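The gateway's first goal, one access point to different IaaS infrastructures, amounts to dispatching the same request to interchangeable backends. The sketch below shows that pattern only; `IaasBackend`, `FakeCloud` and `ToyGateway` are hypothetical names, the backends fabricate instance ids instead of contacting a cloud, and nothing here reflects the gateway's real interface.

```python
from typing import Dict, Protocol


class IaasBackend(Protocol):
    """Minimal interface every backend is assumed to offer."""

    def run_instance(self, image: str) -> str: ...


class FakeCloud:
    """Stand-in backend that only fabricates instance ids for the demo."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.count = 0

    def run_instance(self, image: str) -> str:
        self.count += 1
        return f"{self.prefix}-{image}-{self.count}"


class ToyGateway:
    """Routes a launch request to the named cloud."""

    def __init__(self, backends: Dict[str, IaasBackend]) -> None:
        self.backends = backends

    def run(self, cloud: str, image: str) -> str:
        try:
            backend = self.backends[cloud]
        except KeyError:
            raise ValueError(f"unknown cloud: {cloud}") from None
        return backend.run_instance(image)
```

With such a layer, moving a workload between a Science Cloud and EC2 changes only the `cloud` argument, not the calling code.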
The IaaS Gateway
[Figure: component diagram with labels EC2, WSRF, storage service, workspace service, workspace control, workspace pilot, IaaS gateway (to EC2 and potentially other providers), cloud client, workspace resource manager, workspace client]

One-click Virtual Clusters

Parameterizable appliance

Tightly-coupled clusters
[Figure: three cluster nodes labeled IP1/HK1, IP2/HK2 and IP3/HK3, connected for MPI]
Reciprocal exchange of information: networking and security

Context Broker
[Figure: animation in which three nodes labeled IP1/HK1, IP2/HK2 and IP3/HK3 exchange their IP and HK values through the Context Broker]

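The reciprocal exchange pictured on these slides can be sketched as a small rendezvous service: each node reports its own values, and once all expected nodes have reported, each can retrieve what its peers sent. This is a simplified model, assuming "HK" stands for a host key and that every node receives every peer's data; `NodeIdentity` and `ToyContextBroker` are hypothetical names and not the Context Broker's real protocol.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NodeIdentity:
    """What a node publishes about itself (HK assumed to be a host key)."""
    ip: str
    host_key: str


class ToyContextBroker:
    """Collects identities and releases them once the cluster is complete."""

    def __init__(self, expected_nodes: int) -> None:
        self.expected_nodes = expected_nodes
        self.identities: Dict[str, NodeIdentity] = {}

    def register(self, node_id: str, ip: str, host_key: str) -> None:
        """Called by a node after boot to publish its own IP and key."""
        self.identities[node_id] = NodeIdentity(ip, host_key)

    def retrieve(self, node_id: str) -> Optional[Dict[str, NodeIdentity]]:
        """Return the peers' identities, or None while nodes are missing."""
        if node_id not in self.identities:
            raise KeyError(f"{node_id} has not registered")
        if len(self.identities) < self.expected_nodes:
            return None
        return {k: v for k, v in self.identities.items() if k != node_id}
```

A node would poll `retrieve` until it returns a result, then use the peers' addresses and keys to fill in files such as a hosts list before the cluster software starts.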
Goals for Context Broker

Can work with every appliance
  Appliance schema, can be implemented in terms of many configuration systems
Can work with every cloud provider
  Simple and minimal conditions on generic context delivery
Can work across multiple cloud providers, in a distributed environment

Status for Context Broker




Release history:
  In alpha testing since August ’07
  First released July ’08 (v 1.3.3)
  Latest update January ’09 (v 2.2)
Used to contextualize 100s of nodes for EC2 STAR runs
Contextualized images on workspace marketplace
Working with rPath to make contextualization easier for the user

End of Nimbus Tour
[Figure: full component diagram with labels EC2, WSRF, context broker, storage service, workspace service, workspace control, workspace pilot, workspace resource manager, IaaS gateway (to EC2 and potentially other providers), context client, cloud client, workspace client]

Science Clouds

Make it easy for scientific projects to experiment with cloud computing
  Can cloud computing be used for science?
Evolve software in response to the needs of scientific projects
  Start with EC2-like functionality and evolve to serve scientific projects: virtual clusters, diverse resource leases
  Federating clouds: moving between cloud resources in academic and commercial space

Science Cloud Resources

University of Chicago (Nimbus):
  first cloud, online since March 4th 2008
  16 nodes of UC TeraPort cluster, public IPs
University of Florida
  Online since 05/08
  16-32 nodes, access via VPN
Other Science Clouds
  Masaryk University, Brno, Czech Republic (08/08), Purdue (09/08)
  Installations in progress: IU, Grid5K, others
Using EC2 for overflow
Minimal governance model
http://workspace.globus.org/clouds

Cloud Use

~100 DNs
Utilization:
  Overall: 16%
  Peak pw: 86% (week of 7/14)
Requests rejected:
  None until 7/14
  Lots afterwards ;-)
[Figure: Nimbus utilization (%) by month, 3/08 through 1/09, y-axis from 0 to 60; data scaled to the number of days]

Who Runs on Nimbus?
Hadoop
AliEn
GT-scalability
STAR
Montage workflows
GridFTP testing
workspace-team
Testing
OSG
geofest
bioinformatics
Other
Project diversity: Science, CS, education, build&test…
Hadoop over ManyClouds
[Figure: U of Florida and U of Chicago sites, each with a ViNE router]
CS research: investigate latency-sensitive apps, e.g. Hadoop
Need access to distributed resources, and high level of privilege to run a ViNE router
Virtual workspace: ViNE router + application VMs
Paper: “CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications” by Andréa Matsunaga, Maurício Tsugawa and José Fortes. eScience 2008.

Alice HEP Experiment at
CERN

CHEP paper in preparation
STAR


STAR: a high-energy physics experiment
Need resources with the right configuration
  Complex environments: correct versions of operating systems, libraries, tools, etc. all have to be installed
  Consistent environments: require validation
A virtual OSG STAR cluster
  OSG cluster
    OSG CE (headnode), gridmapfiles, host certificates, NFS, PBS
  STAR worker nodes: SL4 + STAR conf
Requirements
  One-click virtual cluster deployment
  Migration: Science Clouds -> EC2

STAR (cntd)

From proof-of-concept to production runs
  ~2 years ago: proof-of-concept
  Last September: EC2 runs of up to 100 nodes (production scale, non-critical codes)
  Testing for critical production deployment
Performance
  Within 10% of expected performance for applications
Work by Jerome Lauret, Doug Olson, Leve Hajdu, Lidia Didenko

Scalability Testing

Motivation
  Test scalability of various Globus components
  Test on different platforms
Workspaces
  Globus 101 + others
Requirements
  Very short-term but flexible access to diverse platforms
Work by various members of the Globus community (Tom Howe and John Bresnahan)
Resulted in provisioning a private cloud for Globus
Typically very short-lived communities of one

Montage Workflows

Evaluating a cloud from the user’s perspective
Paper: “Exploration of the Applicability of Cloud Computing to Large-Scale Scientific Workflows”, C. Hoffa, T. Freeman, G. Mehta, E. Deelman, K. Keahey, SWBES08: Challenging Issues in Workflow Applications

Cloud Computing Ecosystem
[Figure: layered ecosystem diagram with the labels below]
Appliance Providers: marketplaces, commercial providers, communities
Deployment Orchestrator: orchestrate the deployment of environments across possibly many cloud providers
User Environments
VMM/datacenter/IaaS

Open Source IaaS Implementations
OpenNebula
  Open source datacenter implementation
  University of Madrid, I. Llorente & team, 03/2008
Eucalyptus
  Open source implementation of EC2
  UCSB, R. Wolski & team, 06/2008
Cloud-enabled Nimrod-G
  Open source implementation of EC2
  Monash University, MeSsAGE Lab, 01/2009
Industry efforts
  openQRM, Enomalism

Friends and Family



Committers: Kate Keahey & Tim Freeman (ANL/UC), Ian Gable (UVIC)
A lot of help from the community, see: http://workspace.globus.org/people.html
Collaborations:
  Cumulus: S3 implementation (Globus team)
  EBS implementation with IU
  Appliance management: rPath and Bcfg2 project
  Virtual network overlays: University of Florida
  Security: Vienna University of Technology

To the Future and Beyond

Increasing Importance of Appliance Providers
Cloud computing tools
Increased interest in cloud interoperability
  Standards: “rough consensus & working code”
  Image formats, contextualization capabilities, cloud interfaces, etc.
Cloud markets