ECI, July 2005
Process Migration
Checkpoint/Restart
Process Migration
Process migration benefits:
Tool for load balancing
Data access locality
Improved system administration
Mobile computing
Process Migration Issues
Execution model: home, remote
Migrating virtual memory
Minimizing downtime
Cost of migration
Run time cost (home, remote)
Migration operation
Limitations of migration
Checkpoint / Restart
Checkpoint/restart benefits:
Like migration plus …
Fault resilience
Fault recovery
High availability
Gang scheduling
Debugging, testing, developing
Security (honey-pot)
Checkpoint/restart goals
Transparency
Support parallel programs
Multi-process
Multi-node
Security
Minimize required state
Minimize required storage
CKPT: Application Level
Application level
Efficient
Non-preemptive
Lack of common API
Source code changes
Possible compiler support
Examples ?
CKPT: Library Level
Library level
Typically use a signal handler (callback)
Common API
Restricts functionality (e.g., no IPC)
Relatively portable
Examples…
CKPT: Library (contd)
Libckpt
Memory exclusion, incremental, forked
Modify source code, link statically
Condor
Support memory mapping, shared libraries
Relink to special library (needs object file)
SCore, CoCheck
Parallel applications
Modify communication layer
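A minimal sketch of the library-level approach (checkpoint from a signal handler), assuming a Unix system and a single process whose state fits in one picklable dictionary. The names `CHECKPOINT_PATH`, `state`, `install_checkpoint_handler` and `restart` are hypothetical; they are not the API of Libckpt or Condor.

```python
import os
import pickle
import signal
import tempfile

# Hypothetical location for the checkpoint file.
CHECKPOINT_PATH = os.path.join(tempfile.gettempdir(), "ckpt_demo.pkl")

# Application state the "library" knows how to save (an assumption of this sketch).
state = {"iteration": 0, "partial_sum": 0}


def _checkpoint_handler(signum, frame):
    """Callback run on SIGUSR1: write the state to storage atomically."""
    tmp_path = CHECKPOINT_PATH + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, CHECKPOINT_PATH)  # rename, so no half-written file is left


def install_checkpoint_handler():
    signal.signal(signal.SIGUSR1, _checkpoint_handler)


def restart():
    """Reload the last saved state, or return None if no checkpoint exists."""
    if not os.path.exists(CHECKPOINT_PATH):
        return None
    with open(CHECKPOINT_PATH, "rb") as f:
        return pickle.load(f)


def run(steps):
    """Stand-in for the application's main loop."""
    for _ in range(steps):
        state["iteration"] += 1
        state["partial_sum"] += state["iteration"]
```

The handler can fire between the two updates in `run`, so a real library would likely defer the save to a safe point; this sketch does not.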
Implementation (contd)
Kernel level
Loadable kernel module vs. change kernel
Preemptive / cooperative
Access to entire process state
Complex, less portable
Examples: Sprite, Zap
Virtual machines
(soon)
Multi-process Checkpoint
Global state
A set of states from all processes
Consistent global state
If the state of A reflects a message received from B, then the state of B reflects sending it
If the state of A reflects a message sent to B but not yet received, the message must be part of the channel state
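The two rules above can be checked mechanically. The sketch below assumes each process numbers its local events 1, 2, 3, ... and that a saved state is identified by the number of the last event it reflects; `Message` and `classify` are hypothetical names.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    send_time: int   # sender's local event number when the message was sent
    receiver: str
    recv_time: int   # receiver's local event number when it was received


def classify(cut, messages):
    """Check a global state for consistency and collect the channel state.

    cut maps each process to the last local event reflected in its saved state.
    Returns (consistent, in_transit), where in_transit lists the messages sent
    before the cut but not yet received; they belong to the channel state.
    """
    consistent = True
    in_transit = []
    for m in messages:
        sent = m.send_time <= cut[m.sender]
        received = m.recv_time <= cut[m.receiver]
        if received and not sent:
            consistent = False        # receive is reflected, send is not
        elif sent and not received:
            in_transit.append(m)      # must be saved with the channel state
    return consistent, in_transit
```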
Consistent Global State
Multi-process Checkpoint
Uncoordinated checkpoint
Inspect data to find recovery line
Processes are independent, efficient
Domino effect, much storage
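A sketch of how a recovery line might be found after a failure, and of how the domino effect arises. It assumes each process logs, for every message, the checkpoint interval in which it was sent and received; `recovery_line` and the tuple layout are hypothetical.

```python
def recovery_line(latest, messages):
    """Roll back until no saved state reflects a message whose send was undone.

    latest: dict process -> index of its most recent checkpoint (0 = initial state).
    messages: list of (sender, send_interval, receiver, recv_interval), where
    interval i is the execution between checkpoint i and checkpoint i + 1.
    Checkpoint c therefore reflects exactly the events in intervals < c.
    """
    line = dict(latest)
    changed = True
    while changed:
        changed = False
        for sender, s_int, receiver, r_int in messages:
            send_undone = s_int >= line[sender]
            recv_kept = r_int < line[receiver]
            if send_undone and recv_kept:
                line[receiver] = r_int   # roll back to just before the receive
                changed = True
    return line


# Ping-pong traffic that crosses every checkpoint: each rollback forces another.
domino = [
    ("P", 0, "Q", 0),
    ("Q", 1, "P", 0),
    ("P", 1, "Q", 1),
    ("Q", 2, "P", 1),
    ("P", 2, "Q", 2),
]
```

With `latest = {"P": 2, "Q": 2}` the `domino` log drives both processes back to checkpoint 0, which is why all checkpoints have to be kept.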
Multi-process Checkpoint
Coordinated checkpoint
Centrally managed
Blocking
All processes suspended
Flush communication channels
Non blocking
Delay in triggers may yield inconsistency
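A toy, in-memory simulation of the blocking variant: suspend every process, flush the channels, save, resume. `Process` and `blocking_checkpoint` are hypothetical names, and a real coordinator would do this over the network rather than by direct calls.

```python
from collections import deque


class Process:
    def __init__(self, name):
        self.name = name
        self.counter = 0          # stand-in for application state
        self.inbox = deque()      # messages still in the channel to this process
        self.suspended = False

    def send(self, other, value):
        if self.suspended:
            raise RuntimeError("suspended processes must not send")
        other.inbox.append(value)

    def deliver_all(self):
        """Flush the incoming channel by consuming every pending message."""
        while self.inbox:
            self.counter += self.inbox.popleft()


def blocking_checkpoint(processes):
    """Coordinator: suspend everyone, flush channels, save, then resume."""
    for p in processes:
        p.suspended = True
    for p in processes:
        p.deliver_all()           # channels are now empty: no channel state to save
    snapshot = {p.name: p.counter for p in processes}
    for p in processes:
        p.suspended = False
    return snapshot
```

Because nothing can be sent while all processes are suspended, the saved states are consistent by construction.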
Multi-process Checkpoint
Communication-induced
Piggyback process checkpoint status and
requests on messages
May require enforcing global checkpoint
Unpredictable checkpoint times
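A sketch of one index-based way to piggyback checkpoint status on messages; it is an illustration, not necessarily the protocol the slide has in mind. Each message carries the sender's checkpoint index, and a receiver that is behind takes a forced checkpoint before delivery. `CicProcess` is a hypothetical name.

```python
class CicProcess:
    def __init__(self, name):
        self.name = name
        self.index = 0             # sequence number of the latest local checkpoint
        self.state = 0             # stand-in for application state
        self.checkpoints = {0: 0}  # checkpoint index -> saved state

    def _save(self):
        self.checkpoints[self.index] = self.state

    def basic_checkpoint(self):
        """Checkpoint taken on the process's own schedule."""
        self.index += 1
        self._save()

    def send(self, payload):
        # Piggyback the current checkpoint index on the outgoing message.
        return (payload, self.index)

    def receive(self, message):
        payload, sender_index = message
        if sender_index > self.index:
            # Forced checkpoint, taken before delivery so that the saved state
            # does not reflect a message sent after the sender's checkpoint.
            self.index = sender_index
            self._save()
        self.state += payload
```

The forced checkpoint happens whenever a message arrives, not when the receiver chooses, which is the "unpredictable checkpoint times" cost noted above.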
Multi-process Checkpoint
Summary:
                        Uncoordinated   Coordinated   Communication-induced
Domino effect           Possible        No            No
Management overhead     None            More          Less
Decision making         Local           Central       Local/central
Checkpoint data stored  All             Latest only   Several
Virtual Machines
“Any problem in computer science can be
solved by another layer of indirection”
What is a Virtual Machine ?
An indirection layer below the execution
environment seen by applications and OS
Decouple architecture and user perceived
behavior of SW and HW resources from
their physical implementation
Provide a uniform view of the underlying
resources
Multiplex multiple virtual systems on a
single (physical) resource
VM History
1960’s – Hypervisors (mainframes)
Time-share expensive hardware
No change to legacy software
1980-90’s – Obsolete
Proliferation of cheap hardware
Hardware support neglected
Later 1990’s – Reincarnation
For complex MPP lacking OS infrastructure
2000 - Today: Renaissance
Consolidation, isolation, reliability
VM Benefits
Performance
Server consolidation
Efficient HW utilization
Adaptive resource balancing
Checkpoint/restart and migration
Security
Simple (reduced complexity)
Encapsulation and isolation
Mediation
VM benefits (contd)
Reliability
Redundancy through replication
Disaster recovery
Deployment testing
And…
Quality of service
Transparent (for legacy SW)
Enhanced interoperability
Development & testing
Server utilization
Cumulative usage of 28 servers:
Memory
45% of RAM not used 99.9% of time
25% of RAM never used concurrently
CPU
85% of CPU not used 99.9% of time
81% of CPU never used concurrently
Disk
68% of storage space never used
Virtualization levels
HOST entity: encapsulates the guest
GUEST entity: managed by the host
Layers (top to bottom) and the interface each exposes to the layer above:
Application programs
Libraries (API)
Operating system (ABI)
Hardware (ISA)
Process & System VM
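Process VM (top to bottom): Application, VMM, OS, Hardware
The VMM, OS and hardware together form the process virtual machine seen by the application
System VM (top to bottom): Application, OS, VMM, Hardware
The VMM and hardware together form the virtual machine seen by the OS and the application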
VM at different levels
HW level
VMware, Xen, Denali, Virtual PC, UML
OS level
Virtual Servers, BSD Jail, Zap
Programming language level
Java, .NET
Network
VLAN, VPN
VM Taxonomy
Process VM - virtual platform that
exists solely to support the process
Unix
Emulators (interpreters)
Dynamic binary translators
Optimize by block translation and caching
Java – “compile once run everywhere”
Intermediate machine code
Optimize by native compilation on-the-fly
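A toy illustration of block translation and caching as used by dynamic binary translators. The three-instruction guest ISA (`add`, `jnz`, `halt`) is invented for this sketch, and Python source stands in for host machine code; guest programs are assumed to end with `halt`.

```python
def translate_block(program, start):
    """Translate the basic block starting at `start` into one Python function.

    The function updates the register dict and returns the next guest pc,
    or None when the guest halts.
    """
    lines = ["def block(regs):"]
    pc = start
    while True:
        op = program[pc]
        pc += 1
        if op[0] == "add":        # ("add", register, immediate)
            lines.append(f"    regs[{op[1]!r}] += {op[2]}")
        elif op[0] == "jnz":      # ("jnz", register, target): ends the block
            lines.append(f"    return {op[2]} if regs[{op[1]!r}] != 0 else {pc}")
            break
        elif op[0] == "halt":     # ("halt",): ends the block
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode {op[0]!r}")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["block"]


def run(program, regs):
    """Execute the guest program; return how many blocks had to be translated."""
    cache = {}                    # guest pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        if pc not in cache:
            cache[pc] = translate_block(program, pc)
            translations += 1
        pc = cache[pc](regs)
    return translations
```

A loop body is translated once and then reused from the cache on every later iteration, which is where the speedup over plain interpretation comes from.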
VM Taxonomy (contd)
System VM - complete persistent
system environment providing access to
virtual hardware
Classic - bare HW
Hosted VM
Easy install and maintenance
Leverage native services of underlying OS
Multiprocessor virtualization
Hardware Virtualization
Challenges to build virtual machines
Performance isolation
Scheduling priority
Memory demand
Network traffic
Disk Access
Support for various OS platforms
Small performance overhead
Lack of Hardware Support
Ring aliasing
Non-faulting access to privileged state
Address space compression
Where does the VMM reside ?
Impact on transitions
Does the guest see the right state ?
Traps, SYSENTER, SYSEXIT
Interrupts masking
Hidden state
Now What ?
Hardware extensions
Change semantics to support VM
Intel, AMD
Software virtualization
Translate code to emulate desired behavior
VMware
Paravirtualization
Xen, Denali
Hardware Extensions for VM
Root mode
Runs VMM
Like ring-0 before
Non-Root mode
Runs guest OS
Less privileged
Mask of events to trap
VMware
Hardware virtualization
CPU, memory, I/O
Suspend/resume
Live migration
Design goals:
Compatibility
Performance
Simplicity
VMware: CPU Virtualization
CPU Virtualization
Challenge: lack of HW support
Execute guest on bare hardware while
retaining control by the VMM
Traps privileged ops & emulates their action
POPF and read access to privileged state
Solution: fast binary translation
Only kernel mode code
Eliminate unnecessary traps
VMware: Memory Virtualization
Memory virtualization
Challenges:
Shadow page tables
Inefficient page replacement
Oversized due to replication
Solutions:
Ballooning
Content based sharing
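A toy model of content-based sharing: pages with identical contents are stored once, and a write detaches the page first (copy-on-write). `SharedMemory` is a hypothetical name, and sharing eagerly at write time is a simplification of this sketch rather than a description of how VMware does it.

```python
import hashlib


class SharedMemory:
    """Toy machine memory in which identical pages are stored once."""

    def __init__(self):
        self.frames = {}      # content hash -> [page bytes, reference count]
        self.mapping = {}     # (vm, page number) -> content hash

    def _release(self, key):
        digest = self.mapping.pop(key, None)
        if digest is not None:
            self.frames[digest][1] -= 1
            if self.frames[digest][1] == 0:
                del self.frames[digest]

    def write(self, vm, page_no, data):
        """Copy-on-write: detach the page from any shared frame, then store."""
        self._release((vm, page_no))
        digest = hashlib.sha256(data).hexdigest()
        frame = self.frames.setdefault(digest, [data, 0])
        if frame[0] != data:
            # Same hash, different content: compare full pages before sharing.
            raise ValueError("hash collision")
        frame[1] += 1
        self.mapping[(vm, page_no)] = digest

    def read(self, vm, page_no):
        return self.frames[self.mapping[(vm, page_no)]][0]

    def frames_in_use(self):
        return len(self.frames)
```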
VMware: I/O Virtualization
Challenge: wide variety of devices and
interfaces
Solution:
Hosted architecture
Trap through the VMM
Export special devices
Xen: Paravirtualization
Provide some exposure to the
underlying hardware
Better performance
Must modify OS to adapt
No modifications to applications
Xen (contd)
Downgrade privilege of guest OS
Guest registers syscall and page-fault
handlers with Xen
Partial access to page tables
Fast handlers for most exceptions
Expose set of simple device abstractions
Xen (contd)
The cost of porting an OS to Xen:
Privileged instructions
Page table access
Network driver
Block device driver
<2% of code-base
Denali
Lightweight protection domains
Minimalistic method geared for performance
Changes:
Idle loops - avoid busy wait
Interrupt queueing - save context switch
Interrupt semantics – “just”/“recent”
No virtual memory (!)
No BIOS – no legacy “crap”
Generic I/O devices
Virtual Machine Migration
Optimizations:
Reduce memory state before snapshot
Ballooning
Reduce total cost by incremental updates
COW hierarchy
Reduce start-up time by paging on-demand
Reduce transfer time relying on common data
Use hash functions to identify common blocks
Virtual Machine Migration
Minimizing down time
Reduce size of VM state
Pre-copy static parts (or..)
Demand-copy static parts
Hot-copy dynamic parts
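A small simulation of one pre-copy strategy: copy memory while the VM runs, re-send what it dirtied, and pause only for the last few pages. `precopy_migrate`, `max_rounds` and `threshold` are hypothetical, and `dirtied_per_round` stands in for real dirty-page tracking.

```python
def precopy_migrate(source, dirtied_per_round, max_rounds=5, threshold=2):
    """Simulate pre-copy migration of `source` (page number -> contents).

    dirtied_per_round[i] maps page -> new contents written by the still-running
    VM during copy round i. Returns (destination memory, number of pages copied
    while the VM was paused).
    """
    dest = {}
    to_send = set(source)                 # round 0 sends every page
    for round_no in range(max_rounds):
        for page in to_send:
            dest[page] = source[page]
        # The VM keeps running during the copy and dirties some pages.
        if round_no < len(dirtied_per_round):
            writes = dirtied_per_round[round_no]
        else:
            writes = {}
        source.update(writes)
        to_send = set(writes)
        if len(to_send) <= threshold:
            break
    # Stop-and-copy: the VM is paused, so the remaining dirty pages are final.
    for page in to_send:
        dest[page] = source[page]
    return dest, len(to_send)
```

Downtime is proportional to the second return value, so it shrinks as long as each round dirties fewer pages than the one before.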
OS Virtualization
Confine applications in containers
Advantages:
Fine granularity
Low overhead
Easier maintenance
Challenges
Transparency
Correctness
Extend OS:
Modify kernel, loadable module, library
Isolation – BSD Jail
Create an isolated execution environment via software means.
Uses chroot (private root per jail)
Processes in a jail are isolated from
files, processes, or network services in
other jails.
A jail can be restricted to a single IP
address.
Specialized Virtualization –
Linux VServer
Hosting (consolidation)
Experimentation
Education (do you trust students … ?)
Personal security box
Manage several “versions”
Applications
Virtual servers
Per user firewall
Fail over servers
Honey-pots
Specialized Virtualization –
Linux VServer
Isolation
Kernel patch
Processes, file system, IPC, network,
super user capabilities
Add a “context” tag per process/resource
syscalls to handle contexts (irreversible)
Challenges
Capture all holes (indirect access !)
Efficient storage
General Virtualization – Zap
Virtualization for isolation
POD – PrOcess Domain
Private namespace
Virtualization for migration
Decouple process from OS
Capture state and reconstruct state
Zap – virtualization
Process environment
File system
Rely on “chroot” environment
Network
Interpose on system calls
Per protocol methods
Challenges
Race conditions (SMP)
Life-span of objects
Fast translation
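A sketch of the idea behind interposing on system calls to give a pod a private namespace, shown for process IDs only. `Pod` and its methods are hypothetical and do not reflect Zap's actual interfaces; `host_kill` is a callable standing in for the real system call.

```python
class Pod:
    """Toy private namespace: processes inside see only virtual PIDs."""

    def __init__(self):
        self._v2r = {}        # virtual pid -> real (host) pid
        self._r2v = {}        # real pid -> virtual pid
        self._next_vpid = 1

    def add_process(self, real_pid):
        vpid = self._next_vpid
        self._next_vpid += 1
        self._v2r[vpid] = real_pid
        self._r2v[real_pid] = vpid
        return vpid

    def getpid(self, caller_real_pid):
        """Interposed getpid(): return the caller's virtual pid."""
        return self._r2v[caller_real_pid]

    def kill(self, target_vpid, host_kill):
        """Interposed kill(): translate the pid, refuse targets outside the pod."""
        real = self._v2r.get(target_vpid)
        if real is None:
            raise ProcessLookupError(f"no such process in pod: {target_vpid}")
        return host_kill(real)

    def migrate(self, new_real_pids):
        """After restart on another host, rebind virtual pids to new real pids."""
        self._v2r = dict(new_real_pids)
        self._r2v = {real: vpid for vpid, real in self._v2r.items()}
```

Processes keep the same virtual PIDs across `migrate`, which is what decouples them from the host they happen to run on.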
Zap – Migration
Checkpoint – outside process context
Capture process tree
Capture pod state
Capture per-process state
Restart – inside process context
Restore process tree
Restore processes
Example issues
Sharing
Deleted files