The Multikernel:
A new OS architecture for scalable multicore systems
SOSP’09
Andrew Baumann et al.
2009. 10. 08.
CS530 Graduate Operating System
Presented by Jaeung Han, Changdae Kim
1
Introduction
• Machines now mix diverse cores, caches, and interconnect links
– This increases scalability and correctness challenges for OS designers
– It is no longer acceptable to tune a general-purpose OS design for one particular hardware model
• Rethinking the structure of the OS
– Build the OS as a distributed system
• Multikernel
– Allows us to apply insights from distributed systems
2/26
Observation
• The architecture of future computers
– Rising core counts
– Increasing hardware diversity
3/26
Future computer - Many cores
• Many cores
– Sharing within the OS is becoming a problem
• Cache-coherence protocol limits scalability
– Prevents effective use of heterogeneous cores
• Scaling existing OSes
– Increasingly difficult to scale conventional OSes
• Removing the dispatcher lock in Windows 7 changed 6k lines of code in 58 files
– Optimizations are specific to hardware platforms
• Cache hierarchy, consistency model, access costs
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
4/26
Future computer – Increasing hardware diversity
• Non-uniformity
– Memory hierarchy becomes more complicated
• NUMA..
• Many levels of cache sharing
– Device access
– Interconnect increasingly looks like a network
• Core diversity
– Architectural differences on a single die:
• Streaming instructions (SIMD, SSE, etc.)
• Virtualization support, power management
– Within a system
• Programmable NICs
• GPUs
• FPGAs (in CPU sockets)
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
5/26
Future computer – Increasing hardware diversity
• System diversity
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
6/26
Observation
• The architecture of future computers
– Rising core counts
– Increasing hardware diversity
→ A monolithic OS must delicately balance its use of these resources
• Increasing node heterogeneity
– Prevents optimizing memory structures at the source-code level
– The OS needs to adapt its communication patterns at run time
→ Future general-purpose systems will have limited support for cache coherence or shared memory
→ Time to reconsider how the OS should be structured
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
9/26
The multikernel model
• Multikernel:
– OS as a distributed system of cores
• Communicate using messages
• No memory is shared
– Design principles
• Make all inter-core communication explicit
• Make OS structure hardware-neutral
• View state as replicated instead of shared
10/26
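The model on this slide can be illustrated with a toy simulation. This is a minimal sketch, not Barrelfish code: the `Core` class, its `inbox`, and the `("hostname", "bf0")` message are hypothetical names chosen for illustration. Each simulated core keeps private state and changes it only in response to explicit messages.

```python
from collections import deque

class Core:
    """One OS node: private state plus an inbox; nothing is shared."""
    def __init__(self, core_id):
        self.core_id = core_id
        self.inbox = deque()
        self.state = {}          # private replica, never touched by other cores

    def send(self, other, msg):
        other.inbox.append((self.core_id, msg))   # explicit inter-core message

    def poll(self):
        """Handle all pending messages; return how many were processed."""
        handled = 0
        while self.inbox:
            _sender, (key, value) = self.inbox.popleft()
            self.state[key] = value
            handled += 1
        return handled

cores = [Core(i) for i in range(4)]
for peer in cores[1:]:
    cores[0].send(peer, ("hostname", "bf0"))   # core 0 informs every other core
for c in cores[1:]:
    c.poll()
```

The sender never writes into a peer's `state` directly; every update is visible as a message, which is the property the design principles rely on.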
Traditional OS vs. multikernel
• Traditional OSes scale up by:
– Reducing lock granularity
– Partitioning state
• Multikernel
– State partitioned/replicated by default rather than shared
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
11/26
Why message-passing?
• Decouples system structure from inter-core
communication mechanism
– Communication patterns explicitly expressed
– Naturally supports heterogeneous cores
– Naturally supports non-coherent interconnects
• Better match for future hardware
– With cheap explicit message passing
– Without cache-coherence
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
12/26
Make inter-core communication explicit
• No memory is shared between cores
• Explicit communication facilitates:
– Reasoning about the use of the system interconnect
– Deploying well-known networking optimizations in the OS
– Providing isolation and resource management on heterogeneous cores
– Decoupling requests from responses (split-phase operation)
– Human or automated analysis of the system
13/26
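Decoupling requests from responses can be sketched as follows. This is an illustrative toy under stated assumptions: `Channel`, `remote_core_step`, and the squaring "work" are hypothetical stand-ins, and the remote cores are simulated by ordinary function calls. The point is only that the requester issues all requests before waiting for any reply.

```python
from collections import deque

class Channel:
    """Hypothetical one-way message queue between two cores."""
    def __init__(self):
        self.q = deque()

    def send(self, msg):
        self.q.append(msg)

    def recv(self):
        return self.q.popleft() if self.q else None

def remote_core_step(req_ch, resp_ch):
    """Remote side: answer one pending request, if any."""
    req = req_ch.recv()
    if req is not None:
        req_id, n = req
        resp_ch.send((req_id, n * n))   # stand-in for real work

# One (request, response) channel pair per remote core.
peers = [(Channel(), Channel()) for _ in range(3)]

# Phase 1: issue every request without blocking on any reply.
for req_id, (req_ch, _resp_ch) in enumerate(peers):
    req_ch.send((req_id, req_id + 2))

local_work = sum(range(10))             # useful work while requests are in flight

for req_ch, resp_ch in peers:           # remote cores make progress independently
    remote_core_step(req_ch, resp_ch)

# Phase 2: collect the responses.
results = dict(resp_ch.recv() for _req_ch, resp_ch in peers)
```

With shared-memory access the requester would stall on each remote cache line in turn; with split-phase messages the latencies overlap.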
Make OS structure hardware-neutral
• Separate the OS structure as much as possible from
the hardware
– Adapting the OS to run on hardware with new performance
characteristics will not require extensive changes to the code
base
– Isolate the distributed communication algorithms from
hardware implementation details
– Enable late binding of both the protocol implementation &
message transport
14/26
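Late binding of the message transport can be illustrated with a small sketch. Everything here is an assumption for illustration: the two transport classes, the `cache_coherent` flag, and `pick_transport` are hypothetical, and both transports are simulated with an in-process queue. The protocol code (`ping`) is written once and does not know which transport it runs on.

```python
from collections import deque

class _QueueTransport:
    """Shared toy implementation; real transports would differ internally."""
    def __init__(self):
        self.buf = deque()

    def send(self, msg):
        self.buf.append(msg)

    def recv(self):
        return self.buf.popleft()

class SharedMemoryTransport(_QueueTransport):
    """Stand-in for a channel built on cache-coherent shared memory."""
    name = "shared-memory"

class HardwareMessageTransport(_QueueTransport):
    """Stand-in for a hypothetical hardware message-passing primitive."""
    name = "hw-message"

def pick_transport(hw):
    """Late binding: choose the transport from a run-time hardware description."""
    if hw["cache_coherent"]:
        return SharedMemoryTransport()
    return HardwareMessageTransport()

def ping(transport, payload):
    """Protocol logic written once, independent of the transport underneath."""
    transport.send(("ping", payload))
    return transport.recv()             # loopback in this toy example

coherent = pick_transport({"cache_coherent": True})
non_coherent = pick_transport({"cache_coherent": False})
reply_a = ping(coherent, 1)
reply_b = ping(non_coherent, 1)
```

Porting to new hardware then means adding a transport and a selection rule, while `ping` and any other protocol code stays unchanged.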
View state as replicated
• State is replicated, and consistency is maintained by exchanging messages
– Improves system scalability by reducing:
• Load on the system interconnect
• Contention for memory
• Overhead for synchronization
• Replication is:
– Required to support domains that do not share memory
– A useful framework within which to support changes to the set of running cores in an OS
15/26
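The sketch below illustrates replicated state and a core joining the running set. It is a toy under assumptions, not the Barrelfish protocol: the `Replica` class, the version counter, the `"snapshot"` message, and the `cap42`/`cap43` keys are all hypothetical, and direct method calls stand in for message delivery.

```python
class Replica:
    """Per-core copy of some OS state, updated only by messages."""
    def __init__(self, core_id):
        self.core_id = core_id
        self.version = 0
        self.table = {}

    def handle(self, msg):
        kind, version, payload = msg
        if kind == "update" and version == self.version + 1:
            key, value = payload
            self.table[key] = value     # apply updates strictly in order
            self.version = version
        elif kind == "snapshot":
            self.table = dict(payload)  # a joining core adopts a full copy
            self.version = version

    def read(self, key):
        return self.table.get(key)      # local read, no interconnect traffic

def broadcast(replicas, msg):
    for r in replicas:
        r.handle(msg)                   # stand-in for one message per core

replicas = [Replica(0), Replica(1)]
broadcast(replicas, ("update", 1, ("cap42", "mapped")))

# A new core comes online: bring it up to date with a snapshot message.
newcomer = Replica(2)
newcomer.handle(("snapshot", replicas[0].version, replicas[0].table))
replicas.append(newcomer)

broadcast(replicas, ("update", 2, ("cap43", "free")))
```

Reads stay local to each core, and adding a core is just another message exchange rather than a change to shared data structures.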
Barrelfish
• A substantial prototype operating system structured
according to the multikernel model
• Goals for Barrelfish
– Give performance comparable to existing commodity OSes
– Demonstrate evidence of scalability
– Can be re-targeted to different hardware without refactoring
– Can exploit the message-passing abstraction
– Can exploit the modularity of the OS
16/26
Implementation of Barrelfish (1/4)
• System structure
– Factored the OS instance on each core into a privileged-mode
CPU driver and a distinguished user mode monitor process
17/26
Implementation of Barrelfish (2/4)
• CPU drivers
– Enforce protection, perform authorization, time-slice processes, and mediate access to the core and its hardware
– Serially handle traps and exceptions
– Share no state with other cores
• Completely event-driven, single-threaded, non-preemptable
• Monitors
– Collectively coordinate system-wide state
– Encapsulate much of the mechanism and policy that would be found in the kernel of a traditional OS
– Mediate local operations on global state
– Keep replicated data structures globally consistent
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
18/26
Implementation of Barrelfish (3/4)
• Process structure
– Represented by a collection of dispatcher objects, one per core the process may run on
– Communication occurs between dispatchers
• Inter-core communication
– All communication occurs with messages
• Currently carried over cache-coherent shared memory
• Memory management
– Physical memory must be managed as a global resource
• All memory management is performed explicitly through system calls
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
19/26
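Messages carried over shared memory can be modelled as a single-writer, single-reader ring. This is only a sketch of the queue discipline: the `UrpcChannel` name, the slot count, and the `head`/`tail` counters are assumptions for illustration, and it ignores cache-line layout and memory ordering, which a real implementation has to get right.

```python
class UrpcChannel:
    """Single-writer, single-reader ring of fixed-size message slots."""
    def __init__(self, slots=4):
        self.ring = [None] * slots
        self.head = 0   # next slot to fill; written only by the sender
        self.tail = 0   # next slot to drain; written only by the receiver

    def try_send(self, msg):
        if self.head - self.tail == len(self.ring):
            return False                      # ring full; sender retries later
        self.ring[self.head % len(self.ring)] = msg
        self.head += 1
        return True

    def poll(self):
        if self.tail == self.head:
            return None                       # nothing new; receiver keeps polling
        msg = self.ring[self.tail % len(self.ring)]
        self.tail += 1
        return msg

ch = UrpcChannel(slots=2)
sent = [ch.try_send("a"), ch.try_send("b"), ch.try_send("c")]  # third one is refused
first = ch.poll()
retry = ch.try_send("c")                      # space is available again
rest = [ch.poll(), ch.poll(), ch.poll()]
```

Because each counter has exactly one writer, the two cores never contend for a lock; the only shared traffic is the message slots themselves.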
Implementation of Barrelfish (4/4)
• System knowledge base
– Maintains knowledge of the underlying hardware
– Runs as an OS service
– Used by OS to derive system policies
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
20/26
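How an OS might derive a policy from hardware knowledge can be sketched as below. This is a plain-Python stand-in, not the knowledge base's actual query interface: the `facts` list, the `socket` attribute, and `multicast_plan` are hypothetical. It derives a two-level message plan in which one core per socket forwards to its neighbours.

```python
from collections import defaultdict

# Hypothetical facts, as hardware discovery might record them.
facts = [
    {"core": 0, "socket": 0}, {"core": 1, "socket": 0},
    {"core": 2, "socket": 1}, {"core": 3, "socket": 1},
]

def multicast_plan(facts, sender):
    """Derive a two-level send plan: one leader per socket forwards locally."""
    by_socket = defaultdict(list)
    for f in facts:
        if f["core"] != sender:
            by_socket[f["socket"]].append(f["core"])
    plan = {}
    for _socket, cores in by_socket.items():
        leader, *rest = sorted(cores)
        plan[leader] = rest      # sender -> leader, leader -> rest of its socket
    return plan

plan = multicast_plan(facts, sender=0)
```

The same function yields a different plan on a machine with a different topology, so the policy follows from the recorded facts instead of being hard-coded.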
Evaluation
• Unmap (TLB shootdown)
– Send a message to every core with a mapping, wait for all to
be acknowledged
– Linux/Windows:
1. Kernel sends IPIs
2. Spins on acknowledgement
– Barrelfish:
1. User request to local monitor
2. Single-phase commit to remote monitors
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
21/26
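The Barrelfish unmap path described above can be sketched as a toy protocol. This is an illustration under assumptions, not the real implementation: the `Monitor` class, its `tlb` set, and `handle_invalidate` are hypothetical, and a direct method call stands in for sending a message and receiving its acknowledgement.

```python
class Monitor:
    """Per-core monitor holding a model of that core's TLB contents."""
    def __init__(self, core_id):
        self.core_id = core_id
        self.tlb = set()           # virtual pages cached in this core's TLB

    def handle_invalidate(self, page):
        self.tlb.discard(page)
        return ("ack", self.core_id)

def unmap(local, monitors, page):
    """Local monitor invalidates, messages each remote monitor, gathers all acks."""
    local.tlb.discard(page)
    pending = {m.core_id for m in monitors if m is not local}
    for m in monitors:
        if m is not local:
            _kind, core_id = m.handle_invalidate(page)   # stand-in for send + reply
            pending.discard(core_id)
    return not pending             # True once every remote core has acknowledged

monitors = [Monitor(i) for i in range(4)]
for m in monitors:
    m.tlb.add(0x1000)

done = unmap(monitors[0], monitors, 0x1000)
```

The operation completes in a single round: one invalidate message out to each remote monitor and one acknowledgement back, with no second commit phase.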
Results of unmap (TLB shootdown)
22/26
Evaluation
• IP loopback
23/26
Evaluation
• Compute-bound workloads
– NAS OpenMP, SPLASH-2
24/26
Evaluation
• IO workloads
– Network throughput
• UDP echo: 951.7 Mbit/s (Barrelfish) vs. 951 Mbit/s (Linux)
– Web server and relational DB
• 18697 requests per second (Barrelfish) vs. 8924 requests per second for lighttpd/Linux
25/26
Conclusion
• Current OS structure is poorly suited for future
hardware architectures
– Poor at managing diversity and scale
• Multicore machines resemble networked systems
– Need to view the OS as a distributed system
• Concurrency, communication, heterogeneity
• Tailor messaging mechanisms and algorithms to the machine
• Hide sharing as an optimization
Borrowed from The Barrelfish operating system for heterogeneous multicore systems
26/26
• Heterogeneous cores work better than homogeneous cores
– http://cseweb.ucsd.edu/users/tullsen/isca04.pdf
• Heterogeneous cores consume less power than homogeneous cores
– http://cseweb.ucsd.edu/users/tullsen/micro03.pdf
27/26