PlanetLab Architecture
Larry Peterson
Princeton University
Issues
• Multiple VM Types
– Linux vservers, Xen domains
• Federation
– EU, Japan, China
• Resource Allocation
– Policy, markets
• Infrastructure Services
– Delegation
Need to define the PlanetLab Architecture
Key Architectural Ideas
• Distributed virtualization
– slice = set of virtual machines
• Unbundled management
– infrastructure services run in their own slice
• Chain of responsibility
– account for behavior of third-party software
– manage trust relationships
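A slice is just a set of virtual machines, one per participating node. As a minimal illustration of that data relationship (the class and field names below are invented, not part of any PlanetLab API):

from dataclasses import dataclass, field

@dataclass(frozen=True)
class VM:
    # one virtual machine on one node (illustrative)
    node: str       # hostname of the hosting node
    vm_type: str    # e.g. "linux_vserver" or a Xen domain

@dataclass
class Slice:
    # a slice: a named set of VMs spanning many nodes
    authority: str              # slice authority, e.g. "plc"
    name: str                   # e.g. "princeton_codeen"
    vms: set = field(default_factory=set)

s = Slice("plc", "princeton_codeen")
s.vms.add(VM("planetlab1.cs.princeton.edu", "linux_vserver"))
print(f"{s.authority}_{s.name} spans {len(s.vms)} VM(s)")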
Trust Relationships
[Diagram: participating sites (Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …) on one side and slices (princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …) on the other, with PLC acting as a trusted intermediary so the parties need not maintain NxN pairwise trust relationships.]
Principals
• Node Owners
– host one or more nodes (retain ultimate control)
– select an MA and approve one or more SAs
• Service Providers (Developers)
– implement and deploy network services
– responsible for their services’ behavior
• Management Authority (MA)
– installs and maintains software on nodes
– creates VMs and monitors their behavior
• Slice Authority (SA)
– registers service providers
– creates slices and binds them to a responsible provider
Trust Relationships
(1) Owner trusts MA to map network activity to the responsible slice
(2) Owner trusts SA to map a slice to its responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs and not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure
Architectural Elements
[Diagram: the MA maintains a node database and the SA maintains a slice database; each node runs a VMM plus node manager (NM), an Owner VM, a slice creation service (SCS) VM, and per-slice VMs; the Node Owner registers the node with the MA, and the Service Provider registers with the SA.]
Narrow Waist
• Name space for slices
< slice_authority, slice_name >
• Node Manager Interface
rspec = < vm_type = linux_vserver,
cpu_share = 32,
mem_limit = 128MB,
disk_quota = 5GB,
base_rate = 1Kbps,
burst_rate = 100Mbps,
sustained_rate = 1.5Mbps >
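A hedged sketch of how a slice creation service might hand this rspec to the Node Manager; the nm_create_vm helper and its signature are assumptions for illustration, not the actual NM interface:

# Illustrative only: dict keys mirror the slide's rspec fields, but
# nm_create_vm is an invented stand-in for the real NM call.

slice_name = ("plc", "princeton_codeen")   # < slice_authority, slice_name >

rspec = {
    "vm_type":        "linux_vserver",
    "cpu_share":      32,            # scheduler shares
    "mem_limit":      "128MB",
    "disk_quota":     "5GB",
    "base_rate":      "1Kbps",       # guaranteed link rate
    "burst_rate":     "100Mbps",     # short-term peak
    "sustained_rate": "1.5Mbps",     # long-term average cap
}

def nm_create_vm(slice_name, rspec):
    # stand-in for a Node Manager call that instantiates a sliver
    authority, name = slice_name
    print(f"creating {rspec['vm_type']} VM for {authority}_{name}")

nm_create_vm(slice_name, rspec)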
Node Boot/Install Process
Actors: Node, Boot Manager, PLC Boot Server

1. Boots from BootCD (Linux loaded)
2. Hardware initialized
3. Read network config from floppy
4. Contact PLC (MA)
5. PLC sends boot manager
6. Execute boot manager
7. Node key read into memory from floppy
8. Invoke Boot API
9. Verify node key, send current node state
10. State = “install”: run installer
11. Update node state via Boot API
12. Verify node key, change state to “boot”
13. Chain-boot node (no restart)
14. Node booted
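The Boot API exchange (steps 8 through 13) behaves like a small state machine keyed by the node key; a minimal sketch, with every name below assumed for illustration rather than taken from PLC’s real Boot API:

class BootServer:
    # stand-in for the PLC Boot Server side of the Boot API
    def __init__(self):
        self.state = {"node-key-123": "install"}   # per-node state table

    def get_state(self, key):              # steps 8-9: verify key, return state
        return self.state[key]             # KeyError models a failed key check

    def set_state(self, key, new_state):   # steps 11-12: verify key, update state
        self.state[key] = new_state

def boot(key, plc):
    if plc.get_state(key) == "install":
        print("running installer")                       # step 10
        plc.set_state(key, "boot")                       # steps 11-12
    print("chain-booting installed node (no restart)")   # steps 13-14

boot("node-key-123", BootServer())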
PlanetFlow
• Logs every outbound IP flow on every node
– accesses ulogd via Proper
– retrieves packet headers, timestamps, context ids (batched)
– used to audit traffic
• Aggregated and archived at PLC
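What auditing those batched logs might look like, as a sketch; the record layout and field names are assumptions, since the slide only says packet headers, timestamps, and context ids are retrieved:

from collections import Counter

# (timestamp, src_ip, dst_ip, dst_port, context_id, bytes) -- invented layout
flows = [
    (1096588800, "128.112.139.1", "203.0.113.7", 80, "princeton_codeen", 15000),
    (1096588805, "128.112.139.1", "198.51.100.2", 53, "arizona_stork", 300),
]

# aggregate outbound bytes per slice, roughly what PLC might do when archiving
per_slice = Counter()
for ts, src, dst, port, slice_id, nbytes in flows:
    per_slice[slice_id] += nbytes

for slice_id, total in per_slice.items():
    print(f"{slice_id}: {total} bytes outbound")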
Chain of Responsibility
Join Request: PI submits Consortium paperwork and requests to join
PI Activated: PLC verifies PI, activates account, enables site (logged)
User Activated: Users create accounts with keys, PI activates accounts (logged)
Slice Created: PI creates slice and assigns users to it (logged)
Nodes Added to Slices: Users add nodes to their slice (logged)
Slice Traffic Logged: Experiments run on nodes and generate traffic (logged by PlanetFlow)
Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes

Network Activity → Slice → Responsible Users & PI
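Because every step is logged, an audit can walk the chain backwards from a flow to a slice to its responsible users and PI. A minimal lookup sketch, with all tables and values invented for illustration:

# invented data: a PlanetFlow log entry plus SA registration records
flow_log = {"203.0.113.7:80 @ 1096588800": "princeton_codeen"}
slice_users = {"princeton_codeen": ["alice", "bob"]}
slice_pi = {"princeton_codeen": "pi@princeton.edu"}

def responsible_parties(flow_key):
    s = flow_log[flow_key]                 # network activity -> slice
    return slice_users[s], slice_pi[s]     # slice -> users & PI

print(responsible_parties("203.0.113.7:80 @ 1096588800"))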
Slice Creation
[Diagram: the PI calls SliceCreate( ) and SliceUsersAdd( ) at PLC (SA); a user then calls SliceNodesAdd( ), SliceAttributeSet( ), and SliceInstantiate( ); each node’s NM periodically calls SliceGetAll( ) (downloading slices.xml) and instantiates the slice’s VMs on the local VMM.]
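A sketch of this centralized flow using the call names from the slide; the in-memory PLC stand-in and all signatures are assumptions:

class PLC:
    # stand-in for PLC acting as slice authority (SA)
    def __init__(self):
        self.slices = {}

    def SliceCreate(self, name):
        self.slices[name] = {"users": [], "nodes": [], "attrs": {}}

    def SliceUsersAdd(self, name, users):
        self.slices[name]["users"] += users

    def SliceNodesAdd(self, name, nodes):
        self.slices[name]["nodes"] += nodes

    def SliceAttributeSet(self, name, key, value):
        self.slices[name]["attrs"][key] = value

    def SliceGetAll(self):
        return self.slices    # what each NM fetches as slices.xml

plc = PLC()
plc.SliceCreate("princeton_codeen")                                     # PI
plc.SliceUsersAdd("princeton_codeen", ["alice"])                        # PI
plc.SliceNodesAdd("princeton_codeen", ["planetlab1.cs.princeton.edu"])  # user
plc.SliceAttributeSet("princeton_codeen", "cpu_share", 32)              # user
print(plc.SliceGetAll())    # each NM polls this and instantiates VMs locally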
Slice Creation
[Diagram: as above, the PI calls SliceCreate( ) and SliceUsersAdd( ) and the user calls SliceAttributeSet( ) at PLC (SA); the user then calls SliceGetTicket( ) and distributes the ticket to a slice creation service, which calls SliverCreate(ticket) at each node’s NM to create the VMs on the VMM.]
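In this decentralized variant PLC issues a ticket that is redeemed directly at each node; the sketch below fakes the signing, and SliverCreate’s behavior is an assumption:

def SliceGetTicket(slice_name, rspec):
    # PLC (SA) issues a ticket binding a slice name to an rspec
    return {"body": {"slice": slice_name, "rspec": rspec},
            "sig": "signed-by-PLC"}            # fake signature

def SliverCreate(ticket):
    # Node Manager verifies the ticket, then creates a local VM
    assert ticket["sig"] == "signed-by-PLC"    # fake verification
    print(f"node: created sliver for {ticket['body']['slice']}")

t = SliceGetTicket("princeton_codeen", {"cpu_share": 32})
SliverCreate(t)    # the slice creation service calls this on each node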
Brokerage Service
[Diagram: the PI calls SliceCreate( ) and SliceUsersAdd( ), then SliceAttributeSet( ) and SliceGetTicket( ) are invoked at PLC (SA); the ticket is distributed to a brokerage service, which redeems it at each node’s NM via rcap = PoolCreate(ticket), obtaining a capability for a pool of VM resources on the VMM.]
Brokerage Service (cont)
[Diagram: a user calls BuyResources( ) at the Broker; the broker contacts the relevant nodes and calls PoolSplit(rcap, slice, rspec) at each NM to carve resources out of its pool for the user’s slice VMs, with PLC (SA) remaining the slice authority.]
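End to end, the broker redeems a ticket into a resource pool and later splits capacity off for paying slices; the rcap representation and all semantics below are assumptions for illustration:

pools = {}    # rcap -> remaining cpu_share held in a node's pool

def PoolCreate(ticket):
    # NM redeems a ticket into a pool, returning a resource capability
    rcap = f"rcap-{len(pools)}"
    pools[rcap] = ticket["rspec"]["cpu_share"]
    return rcap

def PoolSplit(rcap, slice_name, rspec):
    # NM carves resources out of the broker's pool for a slice
    want = rspec["cpu_share"]
    if pools[rcap] < want:
        raise ValueError("pool exhausted")
    pools[rcap] -= want
    print(f"assigned {want} shares to {slice_name}")

def BuyResources(rcap, slice_name, shares):
    # what the broker does when a user buys resources
    PoolSplit(rcap, slice_name, {"cpu_share": shares})

rcap = PoolCreate({"rspec": {"cpu_share": 64}})   # broker redeems its ticket
BuyResources(rcap, "ucsd_share", 16)              # user purchase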
Policy Proposals
• Suspend a site’s slices while its nodes are down
• Resource allocation to
– brokerage services
– long-running services
• Encourage measurement experiments via ScriptRoute
– lower scheduling latency for select slices
• Distinguish PL versus non-PL traffic
– remove per-node burst limits
– replace with sustained rate caps
– limit slices to 5GB/day to non-PL destinations (with exceptions)
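The last proposal implies per-slice, per-day byte accounting toward non-PlanetLab destinations; a minimal sketch of such a cap check, keeping only the 5GB figure from the slide and assuming everything else:

DAILY_CAP = 5 * 1024**3          # the proposed 5GB/day cap
PL_NODES = {"128.112.139.1"}     # stand-in for the real PL node list

usage = {}                       # (slice, day) -> bytes to non-PL destinations

def account(slice_name, day, dst_ip, nbytes):
    # returns False once the slice exceeds its daily non-PL cap
    if dst_ip in PL_NODES:
        return True              # PL-to-PL traffic is not capped
    key = (slice_name, day)
    usage[key] = usage.get(key, 0) + nbytes
    return usage[key] <= DAILY_CAP

print(account("umd_scriptroute", "2004-10-01", "203.0.113.7", 10**9))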