Disco: Running Commodity
Operating Systems on
Scalable Multiprocessors
Presented by: Pierre LaBorde, Jordan Deveroux, Imran Ali,
Yazen Ghannam, Tzu-Wei Kuo
Paper by: Edouard Bugnion, Scott Devine, Kinshuk Govil,
Mendel Rosenblum
Introduction
Pierre LaBorde
Introduction
• CC-NUMA
o Cache-Coherent Non-Uniform Memory Access
• Coupling with standard distributed protocols
o TCP/IP
o NFS
• Global Buffer Cache
Introduction
• Hide NUMA-ness
o Page placement
o Dynamic page migration
o Dynamic page replication
Problem
• Operating systems for innovative hardware
o Scalable shared memory multiprocessors
• Significant changes required
o Operating systems typically have millions of lines of code
Solution
Virtual Machine Monitors
• Instead of modifying existing OS
o Additional layer of software between hardware and OS
o Multiple copies of existing operating systems
Support a variety of workloads
o Virtualizes all of the resources
Exports conventional hardware interface
o Schedules virtual resources on the physical
Processor
Memory
Virtual Machine Monitor
• Monitor and distributed protocols need to scale
o Simplicity of the monitor
o Fault-containment
o NUMA memory management issues
• Global policies
o Fine-grained resource sharing
Challenges
• Overheads
o Privileged instructions
o I/O Devices
• Resource Management
o Instruction execution stream
Idle loop
Lock busy-waiting
• Communication and Sharing
o Virtual disk
Disco: A Virtual Machine
Monitor
Jordan Deveroux
Disco's Interface
• Processors
o Abstraction of MIPS R10000 processor
o Does not support complete virtualization of kernel
virtual address space
o Extends architecture to support efficient access to
some processor functions
• Physical Memory
o Abstraction of main memory that resides in contiguous
physical address space
o Uses dynamic page migration and replication to export
nearly uniform memory architecture to the software
• I/O Devices
o Each virtual machine has specified set of I/O devices
o Disco intercepts all communication to and from its I/O devices for
translation or emulation
o Virtualizes access to the networking devices of the
underlying system
Implementing Disco
• Multithreaded, shared memory program
• Disco vs. Other Systems
o NUMA memory placement
o Cache-aware data structures
o Interprocessor communication patterns
• NUMA memory management
o Replicate Disco's code in every memory of the FLASH
machine, so instruction misses are served locally
• Cache-aware data structures
o Partitioned so that parts accessed only by a
certain processor are in memory near that
processor
• Interprocessor communication patterns
o Very few locks
o Wait-free synchronization
Implementing Disco: Virtual CPUs
• Emulates virtual CPUs by direct
execution on the real CPUs
• Most operations run at the same speed as
on the real CPUs
• Each virtual CPU has a data structure like
a process table entry in a traditional OS
o Contains the saved state of the virtual CPU
• Disco runs in kernel mode with full access
to the hardware
• Simple scheduler allows virtual
processors to be time-shared across the
physical processors
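The per-CPU state record and the simple scheduler can be sketched roughly as follows. This is only an illustration: the field names, the 32-register layout, and the round-robin policy are assumptions made for the sketch, not details from the slides.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualCPU:
    """Saved state of one virtual CPU (all field names are hypothetical)."""
    vm_id: int
    vcpu_id: int
    pc: int = 0
    registers: list = field(default_factory=lambda: [0] * 32)
    privileged: dict = field(default_factory=dict)  # emulated privileged registers


class RoundRobinScheduler:
    """Time-shares virtual CPUs on one physical CPU (assumed policy)."""

    def __init__(self):
        self.run_queue = deque()

    def add(self, vcpu):
        self.run_queue.append(vcpu)

    def next_vcpu(self):
        # The virtual CPU picked now moves to the back of the queue.
        vcpu = self.run_queue.popleft()
        self.run_queue.append(vcpu)
        return vcpu


sched = RoundRobinScheduler()
for i in range(3):
    sched.add(VirtualCPU(vm_id=0, vcpu_id=i))

# Five scheduling decisions over three virtual CPUs.
order = [sched.next_vcpu().vcpu_id for _ in range(5)]
print(order)
```

A real monitor would restore the saved registers and jump to the saved program counter before resuming direct execution. The sketch only shows the bookkeeping.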
Implementing Disco: Virtual
Physical Memory
• Adds a level of address translation and maintains
physical-to-machine address mappings
• Translation is performed using the translation-lookaside
buffer (TLB)
• Once a TLB entry is installed, memory references are
translated through this mapping with no further overhead
• Each TLB entry is tagged with an address space
identifier (ASID) to avoid flushing the TLB on
context switches
• Each TLB miss is more expensive because of
o emulation of the trap architecture
o emulation of privileged instructions
o remapping of physical addresses
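The extra translation level can be illustrated with a small sketch. It assumes a per-VM table (called `pmap` here) from physical page numbers to machine page numbers and a 4 KB page size. The class and method names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class VirtualMachineMemory:
    def __init__(self):
        self.pmap = {}  # physical page number -> machine page number
        self.tlb = {}   # (asid, virtual page number) -> machine page number

    def handle_tlb_write(self, asid, vpn, ppn):
        """The guest OS asks to map vpn -> ppn; the monitor installs vpn -> mpn instead."""
        mpn = self.pmap[ppn]         # physical-to-machine lookup
        self.tlb[(asid, vpn)] = mpn  # the real TLB receives the corrected entry
        return mpn

    def translate(self, asid, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        mpn = self.tlb[(asid, vpn)]  # a KeyError here models a TLB miss
        return mpn * PAGE_SIZE + offset


mem = VirtualMachineMemory()
mem.pmap[7] = 42                            # physical page 7 lives in machine page 42
mem.handle_tlb_write(asid=1, vpn=3, ppn=7)  # guest maps virtual page 3 to physical page 7
maddr = mem.translate(asid=1, vaddr=3 * PAGE_SIZE + 0x10)
```

Keying the TLB on `(asid, vpn)` mirrors the tagged entries mentioned above: entries belonging to different address spaces can coexist without a flush.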
Implementing Disco: NUMA
Memory Management
• Optimization that enhances data locality
• Fast translation of physical-to-machine
addresses
• Allocation of real memory to virtual
machines
• Only migrates or replicates pages when a
performance benefit is expected
• Contains a memmap data structure with
an entry for each real machine memory
page
Two different virtual processors of the same virtual
machine logically read-share the same physical page,
but each virtual processor accesses a local copy
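One plausible shape for the migrate-or-replicate decision is sketched below. It assumes per-page, per-node cache-miss counts are available, that a page used heavily by a single remote node is migrated, that a read-shared page is replicated, and that a write-shared page stays put. The thresholds and all names are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field

HOT_THRESHOLD = 64  # hypothetical miss count that makes a node "hot" for a page
MAX_MOVES = 4       # hypothetical cap on how often one page may be moved


@dataclass
class PageStats:
    home_node: int
    misses: dict = field(default_factory=dict)  # node -> cache-miss count
    written: bool = False
    moves: int = 0


def decide(page):
    """Return (action, target_nodes) for one page."""
    hot = sorted(n for n, c in page.misses.items() if c >= HOT_THRESHOLD)
    hot_remote = [n for n in hot if n != page.home_node]
    if not hot_remote or page.moves >= MAX_MOVES:
        return ("keep", [])
    if len(hot) == 1:
        return ("migrate", hot_remote)    # a single remote node dominates
    if not page.written:
        return ("replicate", hot_remote)  # read-shared by several nodes
    return ("keep", [])                   # write-shared: moving would not help


print(decide(PageStats(home_node=0, misses={1: 100})))
print(decide(PageStats(home_node=0, misses={0: 80, 1: 90, 2: 70})))
```

After a replication, each virtual processor is mapped to its node-local copy, which matches the read-sharing case described above.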
Implementing Disco: Virtual I/O
• Intercepts all device accesses from the
virtual machine and forwards them to the
physical devices
• Each Disco device defines a monitor call
used by the device driver to pass all
command arguments
• Disks and network interfaces include a
DMA map as part of their arguments
o list of address pairs that specify the
source and destination of I/O
operations
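Translating such a map can be sketched as below. The sketch assumes each pair is `(device_offset, physical_address)` and that the monitor rewrites only the memory side using a physical-to-machine page table. The function name and the pair layout are assumptions.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate_dma_map(dma_map, pmap):
    """Rewrite the guest's physical addresses into machine addresses.

    dma_map: list of (device_offset, physical_address) pairs (assumed layout)
    pmap:    physical page number -> machine page number
    """
    translated = []
    for device_offset, paddr in dma_map:
        ppn, offset = divmod(paddr, PAGE_SIZE)
        translated.append((device_offset, pmap[ppn] * PAGE_SIZE + offset))
    return translated


pmap = {0: 10, 1: 3}
result = translate_dma_map([(0, 0), (4096, PAGE_SIZE + 8)], pmap)
print(result)
```

Only after this rewrite would the request be forwarded to the physical device, since the device knows nothing about the guest's view of physical memory.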
VM Sharing
Imran Ali
Copy-on-Write Disks
• Uses virtual memory addressing to map
disk data into physical memory
• Multiple virtual machines (VMs) share
machine memory
• Copy-on-write means that a VM is unaware
that machine memory is being shared
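A minimal sketch of the copy-on-write idea follows. It assumes the monitor keeps a cache from disk block to machine page, maps cached pages read-only into each VM, and gives a VM a private copy on its first write. All class names and the `"ro"`/`"rw"` tags are hypothetical.

```python
class MachineMemory:
    def __init__(self):
        self.pages = []  # contents of each machine page

    def alloc(self, data):
        self.pages.append(bytearray(data))  # always stores a fresh copy
        return len(self.pages) - 1          # machine page number


class VM:
    def __init__(self, memory):
        self.memory = memory
        self.pmap = {}  # physical page number -> (machine page number, mode)

    def write(self, ppn, offset, value):
        mpn, mode = self.pmap[ppn]
        if mode == "ro":
            # Copy-on-write fault: this VM gets a private, writable copy.
            mpn = self.memory.alloc(self.memory.pages[mpn])
            self.pmap[ppn] = (mpn, "rw")
        self.memory.pages[mpn][offset] = value


class CowDisk:
    def __init__(self, image, memory):
        self.image = image  # block number -> bytes
        self.memory = memory
        self.cache = {}     # block number -> machine page already holding it

    def read(self, vm, ppn, block):
        if block not in self.cache:
            # Only the first request for a block touches the disk image.
            self.cache[block] = self.memory.alloc(self.image[block])
        vm.pmap[ppn] = (self.cache[block], "ro")  # shared, read-only mapping


memory = MachineMemory()
disk = CowDisk({0: b"kernel"}, memory)
a, b = VM(memory), VM(memory)
disk.read(a, ppn=5, block=0)
disk.read(b, ppn=9, block=0)
shared = a.pmap[5][0] == b.pmap[9][0]  # both VMs point at one machine page
a.write(5, 0, ord("K"))                # triggers the private copy for VM a
```

Neither VM sees anything unusual: `b` still reads the original block, while `a` sees its own modified copy.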
VM Sharing Pages
Virtual Network Interfaces
• Virtual machines are not allowed to
communicate with each other directly
• They use standard protocols to communicate
through Ethernet-like addressing
• All read-only pages can be shared between
virtual machines, reducing memory
overhead
• Pages are shared whenever possible and
are replicated when needed to improve
performance
Transparent Sharing of Pages
Experimental Results
Yazen Ghannam
Experimental Setup
• Experiments are simulated, not run on real
hardware
• Used four different workloads
o Software Development (Pmake)
OS and I/O intensive
o Hardware Development (Engineering)
OS light; large memory footprint
o Scientific Computing (Raytrace, Radix)
OS light; uses shared memory regions
o Commercial Database
I/O light; single memory-intensive process
Execution Overheads
Memory Overheads
Scalability
Page Migration and
Replication
Experiences and Related
Work
Tzu-Wei Kuo
Experiences on Real Hardware
• Disco was ported to run on real
hardware in order to confirm the
simulation test results
• Runs on an SGI Origin200 board, which forms
the basis of the FLASH machine
o Single 180 MHz MIPS R10000
processor
o 128 MB of memory
Experiences on Real Hardware
• Overheads of Virtualization
• Two workloads
o Pmake: compiles Disco itself using the
SGI development tools, two files at a
time
o Engineering: simulates the memory
system of the FLASH machine
Experiences on Real Hardware
• This table shows a breakdown of the execution
time for the two workloads and a comparison
between IRIX and Disco running IRIX. The
execution time is broken down into the user,
system, and idle time.
Related Work
• System Software for Scalable Shared
Memory Machines
• Virtual Machine Monitors
• Other System Software Structuring
Techniques
• CC-NUMA Memory Management
Conclusion
• Developing system software for scalable
shared memory multiprocessors without
massive development effort
• Experimental results show that the overhead
of virtualization is modest in both processing
time and memory footprint
• Disco provides a simple solution for scalability
and reliability
• Lower implementation cost