Live Migration of Virtual Machines

Christopher Clark, Keir Fraser, Steven Hand,
Jacob Gorm Hansen, Eric Jul, Christian Limpach,
Ian Pratt, Andrew Warfield
University of Cambridge Computer Laboratory;
Department of Computer Science, University of Copenhagen
Presentation by Marty Krogel
Outline

Goal

Why VM migration?

Related Work

Design Choices


Implementation

Handling Local Resources

Design Overview
Testing the Implementation

Benchmarks

Future Work

Conclusion
Goal
To find a quick and efficient way to transfer
services between physical servers.
Challenges

Minimizing downtime.

Keeping total migration time down.

Avoiding disruption of active services.
Benefits of Migrating Virtual Machines Instead of Processes

Avoids 'residual dependencies'.

Can transfer in-memory state information.

Allows separation of concerns between the users
and the operator of a data center or cluster.

Related Work

The Collective project

Zap

NomadBIOS

VMotion

Process migration

Sprite, MOSIX

Even Java and .NET suffer from residual dependencies
Memory Migration Options

Push phase

Stop-and-copy phase

Pull phase
Implementation

Pre-copy migration

Bounded iterative push phase

Rounds

Writable Working Set

Short stop-and-copy phase

Careful to avoid service degradation
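
The pre-copy loop above can be sketched as a small simulation. This is a minimal model, not the Xen implementation: the function name, the page-count units, and the `max_rounds` and `stop_threshold` parameters are all assumptions made for illustration. Each round pushes the pages dirtied during the previous round, the writable working set caps how many pages can be dirty at once, and the final stop-and-copy phase sends whatever is left while the VM is paused.

```python
def simulate_precopy(total_pages, bandwidth, dirty_rate, wws_pages,
                     max_rounds=30, stop_threshold=64):
    """Simulate pre-copy migration with integer page counts.

    bandwidth and dirty_rate are in pages per second (hypothetical units).
    Returns (pages sent in each push round, pages left for stop-and-copy,
    estimated downtime in seconds).
    """
    to_send = total_pages            # round 1 pushes every page
    rounds = []
    while len(rounds) < max_rounds and to_send > stop_threshold:
        rounds.append(to_send)
        # Pages dirtied while this round was on the wire:
        # dirty_rate * (to_send / bandwidth), capped by the working set.
        dirtied = dirty_rate * to_send // bandwidth
        to_send = min(wws_pages, dirtied)
    # Stop-and-copy: the VM is paused while the remaining pages are sent.
    downtime = to_send / bandwidth
    return rounds, to_send, downtime
```

With `simulate_precopy(100000, 25000, 5000, 20000)` the rounds shrink geometrically and only a few dozen pages remain for the stop-and-copy phase. If `dirty_rate` exceeds `bandwidth`, the loop never converges and stops only at `max_rounds`, which is why the push phase has to be bounded.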
Handling Local Resources

Open network connections

The migrating VM can keep its IP and MAC addresses.

It broadcasts an unsolicited ARP reply advertising its new location.

Some routers might ignore the reply to prevent spoofing.

A guest OS that is aware of the migration can avoid this problem.

Local storage

Network Attached Storage
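
The unsolicited ARP reply mentioned above can be illustrated by building the raw frame by hand. This is only a sketch of the standard Ethernet and ARP layout: the function name is hypothetical, the MAC and IP values in the usage note are placeholders, and actually transmitting the frame (which needs a raw socket and privileges) is left out.

```python
import socket
import struct

def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build a broadcast ARP reply announcing that `ip` is at `mac`."""
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac + struct.pack("!H", 0x0806)
    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 2 (reply).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender and target protocol addresses are the same IP,
    # which is what makes the reply "gratuitous".
    arp_body = mac + ip_bytes + broadcast + ip_bytes
    return eth_header + arp_header + arp_body
```

For example, `build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.10")` returns a 42-byte frame that the destination host could send after the VM resumes.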
Design Overview
Tracking Writable Working Set

Xen inserts shadow page tables under the guest OS,
populated from the guest OS's own page tables.

All entries in the shadow page tables are initially
marked read-only.

If the guest OS tries to write to a page, the resulting
page fault is trapped by Xen.

Xen checks the OS's original page table and
forwards the appropriate write permission.

If the page is not read-only in the OS's PTE,
Xen marks the page as dirty.
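
The write-fault path described above can be modelled in a few lines. This is a toy model, not Xen code: the class and method names are hypothetical, pages are plain integers, and a dictionary stands in for the page tables. It shows the key idea that the first write to each page in a round faults once, gets recorded as dirty, and then proceeds without further faults until the next round resets the shadow permissions.

```python
class ShadowTracker:
    """Toy model of dirty-page tracking with read-only shadow entries."""

    def __init__(self, guest_writable):
        # guest_writable maps page number -> write permission in the guest PTE.
        self.guest_writable = dict(guest_writable)
        # Every shadow entry starts read-only so the first write faults.
        self.shadow_writable = {page: False for page in self.guest_writable}
        self.dirty = set()

    def write(self, page):
        """Return True if the write succeeds, False if the guest forbids it."""
        if self.shadow_writable[page]:
            return True                      # already writable, no fault
        # Write fault trapped by the hypervisor: consult the guest's own PTE.
        if not self.guest_writable[page]:
            return False                     # genuine fault, left to the guest
        self.dirty.add(page)                 # record the page as dirty
        self.shadow_writable[page] = True    # forward the write permission
        return True

    def start_round(self):
        """Return pages dirtied since the last round and re-arm tracking."""
        dirtied, self.dirty = self.dirty, set()
        for page in dirtied:
            self.shadow_writable[page] = False
        return dirtied
```

Calling `start_round()` at the beginning of each pre-copy round yields the set of pages that must be resent in that round.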
Writable Working Set
Linux Kernel Compile
OLTP Database
Quake 3
SPECweb
Implementation Issues

Managed Migration vs. Self Migration

Dynamic Rate-Limiting

Rapid Page Dirtying

Paravirtualized Optimizations

Stunning Rogue Processes

Freeing Page Cache Pages
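
Dynamic rate-limiting can be sketched as follows. The slides name the technique without giving a formula, so the policy here is an assumption: each round's bandwidth is the previous round's measured dirtying rate plus a fixed headroom, clamped between administrator-chosen limits, and pre-copy is told to stop once the upper limit is reached. The function name, parameters, and units (Mbit and Mbit/s) are all hypothetical.

```python
def next_round_rate(dirtied_mbit, round_seconds, min_rate, max_rate, headroom):
    """Pick the bandwidth for the next pre-copy round (assumed policy).

    Returns (rate in Mbit/s, stop), where stop is True when the wanted rate
    exceeds max_rate and the push phase should give way to stop-and-copy.
    """
    dirty_rate = dirtied_mbit / round_seconds   # how fast the guest dirtied memory
    wanted = dirty_rate + headroom              # stay slightly ahead of the guest
    if wanted > max_rate:
        return max_rate, True
    return max(min_rate, wanted), False
```

Starting at `min_rate` and raising the rate only as the dirtying rate demands keeps the early rounds from competing with the running service for network capacity.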
Stunning Rogue Process
Test Setup

2 Dell PE-2650 server-class machines

Dual Xeon 2GHz CPUs

Only used 1 CPU

HyperThreading enabled

2GB Memory

Broadcom TG3 network interface

Gigabit Ethernet switch


NetApp F840 network-attached storage server using the iSCSI protocol

XenLinux 2.4.27 OS
Simple Web Server
Complex Web Workload: SPECweb99
Low-Latency Server: Quake 3
Diabolical Workload: MMuncher
Conclusion
By integrating live OS migration into the Xen
virtual machine monitor, rapid movement of
interactive workloads across a cluster or data
center is possible. By pre-copying memory and
dynamically adapting the network bandwidth used,
total downtime can be reduced to imperceptible
levels, even for complex interactive services.
Future Work

Cluster Management

Wide Area Network Redirection

Migrating Block Devices
Questions?