Multiple Processor Systems I

Download Report

Transcript Multiple Processor Systems I

Multiple Processor Systems I
CS 423
Klara Nahrstedt/Sam King
4/3/2016
1
Content
• Administrative
• Introduction
• Multiprocessors
–
–
–
–
4/3/2016
Hardware
OS Types
Synchronization
Scheduling
2
Administrative
• MP4 is out
• If you are changing groups for MP4, let staff
(profs, TAs) and your partners know
– We have to change multiple state machines to get
to a consistent state about your group status
– If you take the extra days, you don’t have to
ask/inform TAs/Profs about taking them since we
see the status in the compass
4/3/2016
3
Introduction
• Goal of computer architects, engineers and scientists to get more and more
•
computing power
Problem with the goal:
– Previous approaches: make clock run faster to speed up
– We are beginning to hit some fundamental limits on clock speed
– Reason: Einstein’s special theory of relativity
•
•
No electrical signal can propagate faster than the speed of light: 30 cm/ns in a
vacuum and 20 cm/ns over copper or optical fiber
This means:
– with 10 GHz clock, signals cannot travel more than 2 cm total,
– with 100 GHz clock, total signal path length at most 2 mm, and
– at 1000 GHz (1 THz) computer would have to be smaller than 100 µm
– Reason: heat dissipation
•
4/3/2016
making computers this small may be possible, but the faster the computer runs, the
more heat it generates, and the smaller the computer is, the harder it is to get rid of
the heat.
4
Computer Speedup
Moore’s Law: “The density of transistors on a chip doubles every 18 months, for the
same cost” (1965)
Image: Tom’s Hardware
Multiprocessor Systems
Tightly coupled
• Continuous need for faster computers
– shared memory model
– message passing multiprocessor
4/3/2016 – wide area distributed system
Loosely coupled
6
Multiprocessors
• Shared-memory multiprocessor:
A computer system in which two or more CPUs share
full access to a common RAM
• Inter-processor communication: one CPU writes some
data into memory and another one reads the data
out
• Multiprocessor OS performs:
•
– Regular OS functions: handling system calls, memory management, file
system access, management of I/O devices
– Unique to multiprocessor OS: process synchronization, resource
management, scheduling
Uniform vs. non-uniform memory access (UMA,
NUMA) multiprocessors
4/3/2016
7
Multiprocessor Hardware (1)
UMA Bus-based Multiprocessors
Problem: If the bus is busy when a CPU wants to read or write memory,
(a) the CPU waits until the bus becomes idle. As CPU speeds scale up, this
imposes an unacceptable performance penalty.
What is the solution?
4/3/2016
8
Multiprocessor Hardware (2)
UMA Multiprocessors Using Crossbar Switch
Problem: with best caching we can only support 16 or 32 CPUs
Solution: crossbar switch with non-blocking network
4/3/2016
9
Multiprocessor Hardware (3)
• (a) 256-node directory based multiprocessor
• (b) Fields of 32-bit memory address
• (c) Directory at node 36
4/3/2016
10
Multiprocessor OS Types (1)
Each CPU has its own OS
bus
Characteristics:
• share single copy of OS code
• partition multiprocessor memory among four CPUs
• sharing of set of disks and other I/O devices
Design aspects:
• when a process makes a system call, system call is handled by its own CPU
• no sharing of processes – consequence: no load balancing on CPUs
• no sharing of pages – consequence: no load balancing on memory
• independent buffer caching of disk blocks
• problem: no load balancing on CPU, memory and inconsistencies
4/3/2016
11
Multiprocessor OS Types (2)
Master-Slave Multiprocessors
Characteristics:
bus
• one copy of OS and its tables are present on one CPU and not the others
• all system calls are redirected to the one CPU for processing
• CPU with OS is called master, all other CPUs running applications are called slaves
Design aspects:
• there is only data structure that keeps track of ready processes – good for load
balancing
• pages can be allocated among all processes dynamically
• there is only one buffer cache so inconsistencies never occur
What are the problems with this OS type?
4/3/2016
12
Multiprocessor OS Types (3)
Symmetric Multiprocessors
Characteristics:
bus
• eliminates the master/slave asymmetry
• there is one copy of OS, but any CPU can run it
• when a system call is made, the CPU traps to the kernel, which processes the system
call
Design aspects:
• dynamic balance of processes and memory, since there is only one set of OS tables
• eliminates master as a bottleneck
What are the problems with this OS Type?
4/3/2016
13
Parallel Programming
Parallelization Idea
• Parallelization is “easy” if processing can be
cleanly split into n units:
work
Partition
problem
w1
w2
w3
Parallelization Idea (2)
w1
w2
w3
Spawn worker threads:
thread
thread
thread
In a parallel computation, we would like to have as many threads
as we have processors. e.g., a four-processor computer would be
able to run four threads at the same time.
Parallelization Idea (3)
Workers process data:
thread
w1
thread
w2
thread
w3
Parallelization Idea (4)
thread
w1
thread
w2
thread
w3
Report
results
results
Parallelization Pitfalls
But this model is too simple!
•
•
•
•
•
How do we assign work units to worker threads?
What if we have more work units than threads?
How do we aggregate the results at the end?
How do we know all the workers have finished?
What if the work cannot be divided into completely
separate tasks?
Parallelization Pitfalls (2)
• Each of these problems represents a point at
which multiple threads must communicate
with one another, or access a shared resource.
• Golden rule: Any memory that can be used by
multiple threads must have an associated
synchronization system!
The Moral: Be Careful!
• Synchronization is hard
– Need to consider all possible shared state
– Must keep locks organized and use them
consistently and correctly
• Knowing there are bugs may be tricky; fixing
them can be even worse!
• Keeping shared state to a minimum reduces
total system complexity
Multiprocessor
Synchronization (1)
• To ensure atomic execution of TSL,
• we have to be able to lock the bus
4/3/2016
21
Multiprocessor
Synchronization (2)
• Multiple locks used to avoid cache thrashing
4/3/2016
22
Multiprocessor
Synchronization (3)
• Spinning versus Switching
• In some cases CPU must wait
– waits to acquire ready list
• In other cases a choice exists
– spinning wastes CPU cycles
– switching uses up CPU cycles also
– possible to make separate decision each time
locked mutex encountered
4/3/2016
23
Multiprocessor Scheduling (1)
Time Sharing
• Scheduling of independent processes
–
use of single data structure for scheduling
• Priorities and quantum expiration need to be carefully considered
–
What are the differences between uniprocessor and multiprocessor when quantum expires?
• Smart scheduling, affinity scheduling/two-level scheduling
4/3/2016
24
Multiprocessor Scheduling (2)
Space Sharing
• Processes are related to each other, or
• Process consisting of multiple threads
• Simple algorithm:
•
•
•
– all threads are created at the same time,
– each thread is given a dedicated CPU and start at same time across multiple CPUs
– if there are not enough CPUs to start all threads, wait until CPUs become available
Consider a central server that keeps track of available CPUs
Advantage: no multi-programming
Disadvantage: Waste of CPU if it blocks
4/3/2016
25
Multiprocessor Scheduling (3)
Gang Scheduling
• Problem: communication between two threads
– both belong to process A
– both running out of phase
4/3/2016
26
Multiprocessor Scheduling (4)
Gang Scheduling
• Solution: Gang Scheduling
1.Groups of related threads scheduled as a unit, or
gang
2.All members of gang run simultaneously on
different timeshared CPUs
3.All gang members start and end their time slices
together
4/3/2016
27
Multiprocessor Scheduling (5)
Gang Scheduling
4/3/2016
28
Conclusion
• Multiprocessors
– Popular and attractive
– Offer simple communication model—all CPUs share a
common memory
– Processes write messages to memory which can be
read by others
– Synchronization using mutexes, semaphores, monitors
and other well-established schemes
• Problem:
– Large multiprocessors are difficult to build and are
expensive
4/3/2016
29