
CS7701: Research Seminar on Networking
http://arl.wustl.edu/~jst/cse/770/
Review of:
Operating System Support for
Planetary-Scale Network Services
• Paper by:
– Andy Bavier, Scott Karlin, Steve Muir, Larry Peterson, Tammo Spalink, Mike
Wawrzoniak (Princeton)
– Mic Bowman, Brent Chun, Timothy Roscoe (Intel)
– David Culler (Berkeley)
• Published in:
– First Symposium on Network Systems Design and Implementation (NSDI),
2004
• Additional presentation information:
– www.planet-lab.org
• Reviewed by:
– Chip Kastner
• Discussion Leader:
– Manfred Georg
Outline
• Introduction and Virtualization
• The Operating System
– Motivation and overview
– Virtualization
– Isolation / Resource Management
– Evolution
• Evaluation
• Conclusions
Introduction
• This paper examines the design of
PlanetLab’s operating system
• What is PlanetLab?
– 441 machines at 201 sites in 25 countries
– Supporting 450 research projects
– Each machine runs a Linux-based OS and other tools
• Why does it exist?
– Mainly a testbed for planetary-scale network services
– Lets researchers test services in real-world conditions
at a large scale
– Also a deployment platform
• Who uses it?
– AT&T labs, HP labs, Intel
– Many prestigious universities – including Wash U
Introduction
• A PlanetLab user sees a private set of
machines (nodes) on which he can run his
applications
[Figure: User 1 and User 2 each see their own private set of nodes]
Introduction
• Reality is a bit different
• PlanetLab is an overlay network
[Figure: PlanetLab nodes at Intel Labs, Princeton, Wash U, and
Tel-Aviv University connected as an overlay across the Internet]
Introduction
• Users are granted slices – Sets of nodes on
which users get a portion of the resources
[Figure: a slice occupying a portion of the nodes at Intel Labs,
Princeton, Wash U, and Tel-Aviv University]
Introduction
• So, a PlanetLab user is managing a set of
distributed Virtual Machines (VMs)
[Figure: User 1 and User 2 each managing VMs distributed across
PlanetLab nodes]
The Operating System
• The operating system was designed around
two high-level goals
– Distributed virtualization: Each service runs in
an isolated section of PlanetLab’s resources
– Unbundled management: The OS is separate
from services that support the infrastructure
The Operating System
• Researchers wanted to use PlanetLab as
soon as the first machines were set up
• No time to build a new OS
• PlanetLab designers chose to deploy Linux
on the machines
– Linux acts as a Virtual Machine Monitor
– Designers slowly transform Linux via kernel
extensions
The Operating System
• A node manager runs on top of Linux
– A root VM that manages other VMs on a node
– Services create new VMs by calling the node
manager
– Services can directly call only the local
node manager
• The node manager is hard-coded
– Local control can be added
• Infrastructure services can be given
privileges to perform special tasks
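A rough sketch of the kind of interface this implies is below; the class
and method names are invented for illustration and are not the actual
PlanetLab node manager API.

    # Hypothetical sketch of a node manager: the root VM component that
    # creates and tracks the other VMs on a node. Names are invented.
    class NodeManager:
        def __init__(self):
            self.vms = {}

        def create_vm(self, slice_name, rspec):
            # Callable only by services running on this node; a remote
            # service must contact the node manager on each node it needs.
            if slice_name in self.vms:
                raise ValueError("slice already has a VM on this node")
            self.vms[slice_name] = {"rspec": rspec, "state": "created"}
            return self.vms[slice_name]

    # A privileged slice-creation service running locally might call:
    nm = NodeManager()
    nm.create_vm("example_slice", {"cpu_shares": 10, "disk_mb": 512})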
The Operating System
• Node architecture, from top to bottom:
– Unprivileged slices: P2P networks, embedded
network storage
– Privileged slices: slice creation, monitoring,
environment services
– Node manager: resource allocation, auditing,
bootstrapping
– Linux w/ kernel extensions: acts as the
virtual machine monitor
The Operating System
• PlanetLab Central (PLC) is a service
responsible for creating slices
– It maintains a database of slice information on
a central server
– Users request slices through the PLC
• The PLC communicates with a resource
manager on each node to start up slices
and bind them to resources
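As a hedged illustration of this flow (the data layout and names below
are invented, not PLC's actual schema): the central database maps a
slice name to its nodes and resource description, and the resource
manager on each of those nodes is asked to instantiate the slice.

    # Toy model of central slice state; fields are invented for clarity.
    slice_db = {
        "example_slice": {
            "nodes": ["planetlab1.wustl.edu", "planetlab2.cs.princeton.edu"],
            "rspec": {"cpu_shares": 5, "disk_mb": 256},
        }
    }

    def instantiate_slice(name, node):
        # In the real system this is a call over the network to the
        # resource manager on 'node'; here it is just a stub.
        return (node, slice_db[name]["rspec"])

    for node in slice_db["example_slice"]["nodes"]:
        print(instantiate_slice("example_slice", node))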
Virtualization
• Virtualization can be done at several levels
• Virtualization of hardware would allow each
VM to run its own OS
– This is infeasible due to performance/space
constraints
• PlanetLab uses system-call level
virtualization
– Each VM sees itself as having exclusive access
to an OS
– All VMs on a node are actually making system
calls to the same OS
Virtualization
• How is virtualization maintained?
– It must schedule clock cycles, bandwidth,
memory, and storage for VMs and provide
performance guarantees
– It must separate name spaces – such as
network addresses and file names – so VMs
can’t access each other’s resources
– It must provide a stable base so that VMs can’t
affect each other (no root access)
Virtualization
• Virtualization is handled by a Linux kernel
extension called vserver
• Vservers are given their own file system
and a root account that can customize that
file system
• Hardware resources, including network
addresses, are shared on a node
• Vservers are isolated from one another: a
vserver’s root account cannot see or modify
files outside its own file system
• VMs are implemented as vservers
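A simplified illustration of the per-VM file system idea (this is not
the vserver implementation; the directory layout and names are made
up): each VM gets its own directory tree to use as its root, so file
names in one VM cannot refer to files in another.

    import os

    def make_vm_filesystem(base, slice_name):
        # Create a private root directory tree for one VM under 'base'.
        root = os.path.join(base, slice_name)
        for d in ("bin", "etc", "home", "tmp"):
            os.makedirs(os.path.join(root, d), exist_ok=True)
        # The vserver mechanism confines the VM's processes beneath this
        # root, so its root account can only customize its own tree.
        return root

    print(make_vm_filesystem("/tmp/plab-demo", "example_slice"))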
Isolation
• Buggy/malfunctioning or even malicious
software might be run on PlanetLab
• This can cause problems at many levels
[Figure: PlanetLab sites (Wash U, Intel Labs, Princeton, Tel-Aviv U)
and outside sites (Yahoo, Google, SlashDot, AOL) all connected through
the Internet]
Isolation
• The problem of isolation is addressed in two ways
– Resource usage must be tracked and limited
– Slice activity must be auditable, so that the
actions a slice performed can be traced later
Resource Management
• Non-Renewable Resources are monitored
and controlled
– System calls are wrapped and intercepted
– The OS keeps track of resources allocated to
VMs
– Resource requests are accepted or denied
based on a VM’s current resource usage
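The check described above might look roughly like the following (the
limits, usage figures, and function name are invented for the example):
an intercepted allocation request is accepted only if it keeps the VM
within its limit.

    # Per-slice disk limits and current usage, in MB (example values).
    limits = {"example_slice": 256}
    usage  = {"example_slice": 200}

    def request_disk(slice_name, mb):
        # Intercepted allocation request: deny if it would exceed the limit.
        if usage[slice_name] + mb > limits[slice_name]:
            return False
        usage[slice_name] += mb
        return True

    print(request_disk("example_slice", 40))   # True: 240 MB <= 256 MB
    print(request_disk("example_slice", 40))   # False: 280 MB > 256 MB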
Resource Management
• Renewable Resources can be guaranteed
– A VM that requires a certain amount of a
renewable resource will receive it
– A fair best-effort method allocates resources to
remaining VMs
• If there are N VMs on a node, each VM receives at
least 1 / N of the available resources
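A toy illustration of the fairness claim above (the numbers are made
up): with no reservations in play, N best-effort VMs splitting a
resource each receive 1/N of it; a VM holding a guarantee would be
served its reservation before this split.

    def best_effort_split(capacity, vms):
        # Evenly divide the remaining capacity among best-effort VMs.
        return {vm: capacity / len(vms) for vm in vms}

    print(best_effort_split(100.0, ["vm_a", "vm_b", "vm_c", "vm_d"]))
    # Each of the 4 VMs gets 25.0, i.e. at least 1/N of the 100 units.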
Resource Management
• Scheduling & Management are performed
by Scout on Linux Kernel (SILK)
– Linux CPU scheduler cannot provide fairness
and guarantees between vservers
• Uses a Proportional Sharing method
– Each vserver receives a number of CPU shares
– Each share is a guarantee for a portion of the
CPU time
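The proportional-share arithmetic is simple; in the sketch below (share
counts invented for the example), a vserver's CPU fraction is its share
count divided by the total shares outstanding on the node.

    # Example share assignments; each share guarantees a slice of CPU time.
    cpu_shares = {"vs_monitor": 1, "vs_codeen": 4, "vs_experiment": 5}
    total = sum(cpu_shares.values())

    fractions = {vs: s / total for vs, s in cpu_shares.items()}
    print(fractions)
    # {'vs_monitor': 0.1, 'vs_codeen': 0.4, 'vs_experiment': 0.5}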
Auditing
• The PlanetLab OS provides “safe” sockets
– Each socket must be specified as TCP or UDP
– Each socket must be bound to a port
– Only one socket can send on each port
– Outgoing packets are filtered to ensure this
• Privileged slices can sniff packets sent
from each VM
– Anomalies can be caught before they become
disruptive
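A hedged sketch of the send-side filter (the table and function below
are invented, not the actual kernel code): the OS records which slice
bound each protocol/port pair and drops outgoing packets sent on that
port by any other slice.

    # Which slice bound each (protocol, port); filled in at bind time.
    bindings = {("udp", 5000): "example_slice"}

    def allow_send(slice_name, proto, port):
        # Outgoing packets pass only if this slice owns the bound port.
        return bindings.get((proto, port)) == slice_name

    print(allow_send("example_slice", "udp", 5000))   # True
    print(allow_send("another_slice", "udp", 5000))   # False: filtered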
Evolution
• PlanetLab can be easily extended
• Researchers can create their own
infrastructure services for PlanetLab
– Services that help maintain PlanetLab itself
– Slice creation, performance monitoring,
software distribution, etc.
• It is possible for multiple researchers to
develop similar services in parallel
– An unusual system evolution problem
Evolution
• What does this mean for PlanetLab’s OS?
– It must be responsible only for the basic tasks
– It must be fair in granting privileges to
infrastructure services
– It must provide an interface for creating a VM
Evolution
[Figure: the layered node architecture again – unprivileged slices,
privileged slices, node manager, Linux w/ kernel extensions]
• New services should be implemented at the
highest level possible
• Services should be given only the privileges
necessary to perform their task
Evaluation
• Scalability
– A VM’s file system requires 29 MB
– 1,000 VMs have been created on a single node
• Slice creation
– Nodes look for new slice info every 10 minutes
– Creating a new VM from scratch takes over a
minute
• Service initialization
– Upgrading a package takes 25.9 sec on a
single node
– Slower when many nodes update a package at
once
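A rough back-of-the-envelope check on the scalability figures above:
at 29 MB per VM file system, the 1,000-VM test implies roughly 29 GB
of disk devoted to VM file systems on a single node.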
Conclusion
• PlanetLab is an excellent option for
researchers to test new network services
• Virtualization provides an easy interface
• Services are reasonably well-protected
from one another
• Opportunities exist to expand PlanetLab’s
functionality