
Disco: Running Commodity
Operating Systems on Scalable
Multiprocessors
Bugnion et al.
Presented by: Ahmed Wafa
Overview
•
Shared Memory Multiprocessor Machines
•
System Software for Shared Memory Multiprocessors
•
Disco as a virtual machine monitor
•
Performance
•
Conclusions
Shared Memory Multiprocessors
•
UMA (Uniform Memory Access)
•
NUMA (Non Uniform Memory Access)
•
ccNUMA (Cache-Coherent NUMA)
The Stanford FLASH Multiprocessor
Fig1. The FLASH system architecture
System Software for Shared
Memory Multi-processors
•
Invest a large development effort in
designing and developing a custom OS
•
Statically partition the machine and run
commodity systems that communicate
using distributed protocols
•
Run multiple commodity systems on top of a
virtual machine monitor that provides dynamic
resource sharing
Disco: A Virtual Machine Monitor
Fig.2 Disco's Architecture
Challenges facing Virtual Machines

Overheads.

Resource Management.

Communication and Sharing.
Disco

Interface.

Implementation.

Running Commodity Operating systems.
Disco's Interface

Processors
–
Virtual CPUs that abstract the MIPS R10000
Physical Memory
–
Contiguous space starting at address 0
–
Near Uniform memory access time
I/O Devices
–
Virtualize access to devices.
–
Virtual subnet to allow communication
Disco's Implementation

Virtual CPUs

Virtual Physical Memory

NUMA Memory Management

Virtual I/O Devices

Copy-on-write Disks

Virtual Network Interfaces
Implementation: Virtual CPUs
•
Direct Execution when possible
•
MIPS modes
–
Kernel
–
Supervisor
–
User
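As a rough illustration of direct execution with traps, the sketch below runs ordinary instructions "directly" and diverts privileged ones to the monitor, which applies them to per-VM virtual state instead of the real hardware. The instruction names, the VirtualCPU class, and the run function are hypothetical and chosen only for this example; a real monitor decodes MIPS opcodes and is not structured like this.

```python
from dataclasses import dataclass, field

# Hypothetical set of privileged operations (illustrative names only).
PRIVILEGED = {"tlbwr", "mtc0", "eret"}

@dataclass
class VirtualCPU:
    regs: dict = field(default_factory=dict)              # ordinary register state
    privileged_state: dict = field(default_factory=dict)  # virtualized privileged state
    traps: int = 0                                         # number of traps into the monitor

def run(vcpu, program):
    """Execute (op, arg) pairs; privileged ops trap and are emulated."""
    for op, arg in program:
        if op in PRIVILEGED:
            # Trap: the monitor emulates the effect on the VM's virtual state.
            vcpu.traps += 1
            vcpu.privileged_state[op] = arg
        else:
            # Stand-in for direct execution on the real processor.
            vcpu.regs[op] = arg
    return vcpu
```

In this toy model only one of three instructions in a program such as [("add", 1), ("mtc0", 5), ("sub", 2)] reaches the monitor, which is the reason direct execution keeps overhead low.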
Implementation: Virtual Physical
Memory
•
Fast and Correct Mapping from the VM virtual
address space to the real machine address
space
•
Handling of TLB misses and the pmap data
structures
•
TLB miss cost and Disco's Second-Level TLB
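The two-step mapping and the second-level TLB can be sketched as follows. This is a simplified model under stated assumptions: the Monitor class, its method names, the dictionary-based pmap, and the eviction rule are all hypothetical, and the real miss path (where the guest OS handles the miss and the monitor rewrites the TLB entry) is collapsed into one function.

```python
class Monitor:
    def __init__(self, pmap, l2_capacity=4):
        self.pmap = pmap                # VM physical page -> machine page
        self.l2_tlb = {}                # (asid, virtual page) -> machine page
        self.l2_capacity = l2_capacity  # hypothetical size, for illustration
        self.slow_path_misses = 0       # misses that needed the full translation

    def tlb_miss(self, asid, vpage, guest_page_table):
        key = (asid, vpage)
        # Fast path: a recent virtual-to-machine translation is cached.
        if key in self.l2_tlb:
            return self.l2_tlb[key]
        # Slow path: guest mapping (virtual -> physical), then pmap
        # (physical -> machine).
        self.slow_path_misses += 1
        ppage = guest_page_table[vpage]
        mpage = self.pmap[ppage]
        if len(self.l2_tlb) >= self.l2_capacity:
            # Evict the oldest inserted entry (dicts keep insertion order).
            self.l2_tlb.pop(next(iter(self.l2_tlb)))
        self.l2_tlb[key] = mpage
        return mpage
```

A repeated miss on the same (asid, virtual page) pair is served from the second-level cache, so the slow path runs only once for that page.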
Implementation: NUMA Memory
Management
•
Satisfying Cache misses from local memory
•
Page Migration and Replication
•
FLASH cache miss counter and Hot Pages
•
Disco memmap and TLB shootdowns
Fig.3 Page Replication
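A minimal sketch of how per-node cache miss counts could drive the choice between migration and replication is shown below. The function name, the threshold value, and the dictionary input are assumptions made for illustration; the policy shown (migrate pages used by a single remote node, replicate read-shared pages, leave write-shared pages alone) is a simplification and omits the memmap updates and TLB shootdowns a real move requires.

```python
def choose_action(misses_by_node, home_node, written, hot_threshold=64):
    """Pick an action for one page from its per-node cache miss counts.

    hot_threshold is a hypothetical cutoff for calling a page "hot".
    """
    total = sum(misses_by_node.values())
    if total < hot_threshold:
        return ("none", None)              # not hot: leave the page in place
    sharers = sorted(n for n, c in misses_by_node.items() if c > 0)
    if len(sharers) == 1:
        node = sharers[0]
        if node != home_node:
            return ("migrate", node)       # used by one remote node only
        return ("none", None)              # already local to its only user
    if not written:
        # Read-shared by several nodes: give each remote sharer a copy.
        return ("replicate", [n for n in sharers if n != home_node])
    return ("none", None)                  # write-shared: moving does not help
```

For example, a hot page touched only by node 1 while living on node 0 would be migrated to node 1, whereas a hot read-only page touched by nodes 0 and 1 would be replicated on node 1.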
Implementation: I/O
•
Virtual I/O Devices
–
Intercepting device access and DMA requests
–
Special device drivers, single trap per request
•
Copy-on-Write Disks
–
Used for shared disks
–
Speed up disk reads by sharing memory
Fig.4 Copy-on-Write Page Sharing
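The page sharing behind copy-on-write disks can be sketched as below. The CowDisk class, its fields, and the tuple-based page maps are hypothetical names for this example; the sketch only shows the idea that a second read of the same disk block maps the already-loaded machine page read-only, and that a write gives the writer a private page.

```python
class CowDisk:
    def __init__(self):
        self.cache = {}       # disk block -> machine page that holds it
        self.next_page = 0    # trivial machine page allocator
        self.disk_reads = 0   # reads that actually went to the disk

    def _alloc(self):
        page = self.next_page
        self.next_page += 1
        return page

    def read(self, vm_map, ppage, block):
        """Handle a DMA read of `block` into a VM's physical page `ppage`."""
        if block not in self.cache:
            self.disk_reads += 1               # first reader pays for the disk access
            self.cache[block] = self._alloc()
        # Later readers share the same machine page, mapped read-only.
        vm_map[ppage] = (self.cache[block], "ro")

    def write_fault(self, vm_map, ppage):
        """A VM wrote to a shared page: remap it to a private writable page."""
        private = self._alloc()                # a real monitor would copy the data here
        vm_map[ppage] = (private, "rw")
        return private
```

With two VMs reading the same block, only one disk read occurs and both maps point at the same machine page until one of them writes.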
Virtual Network Interface
•
Virtual subnet to allow VM communication
•
No MTU limit for packets
•
Aligned messages that span complete pages
will be re-mapped rather than copied
•
Sharing vs Replicating for maximizing data
locality
Fig.5 NFS sharing, replacing bcopy to remap data
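The remap-versus-copy decision for network messages can be illustrated with a small planner that walks a message page by page. The 4096-byte page size, the plan_transfer name, and the per-chunk granularity are assumptions for this sketch; the slides only state that aligned messages spanning complete pages are remapped rather than copied.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only

def plan_transfer(addr, length):
    """Split a message into per-page chunks and label each remap or copy."""
    plan = []
    end = addr + length
    while addr < end:
        page_end = (addr // PAGE_SIZE + 1) * PAGE_SIZE
        chunk_end = min(page_end, end)
        # Only a chunk that starts on a page boundary and fills the whole
        # page can be remapped; anything else has to be copied.
        whole_page = addr % PAGE_SIZE == 0 and chunk_end - addr == PAGE_SIZE
        plan.append(("remap" if whole_page else "copy", addr, chunk_end - addr))
        addr = chunk_end
    return plan
```

An aligned two-page message yields two remaps and no copies, while a misaligned message of the same size is copied in full.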
Running Commodity Operating
Systems

IRIX 5.3, an SVR4-based UNIX from SGI

Changes for IRIX

MIPS and KSEG0

Device Drivers

HAL

Network mbuf sharing
Experimental Results

Setup

Execution Overheads

Memory Overheads

Scalability

Dynamic Page Migration and Replication
Experimental Setup
•
The FLASH machine was not available at the
time of development
•
Hardware emulation using SimOS
Execution Overheads
•
3-16% slowdown
–
TLB misses
–
System calls
Fig.2 Virtualization Overhead
Fig.6 Service Breakdown for pmake workload
Memory Overheads
Fig.7 Service Breakdown for pmake workload
Scalability
•
IRIX and memory management
•
Using 8 VMs reduces execution time by 40%
Fig.8 Workload Scalability Under Disco
Dynamic Page Migration and
Replication
•
The NUMA problem
•
IRIX (non-NUMA-aware) vs. Disco
Fig.9 Performance Benefits of Page Migration
Conclusions
•
Developing system software for shared
memory multiprocessor machines can be
costly
•
Disco can present such a machine as a
cluster of connected machines through
virtualization
•
Disco allows commodity operating systems
to be used on such machines
Conclusions
•
Virtualization overhead ranges from 3% to 16%
•
Optimization techniques can enhance
scalability and hide NUMAness from
commodity OS
References
•
“The Stanford FLASH Multiprocessor”, Kuskin
et al.
•
MIPS R4400 manual
http://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_book_Ed2.pdf