CS 519 -- Operating Systems -

Download Report

Transcript CS 519 -- Operating Systems -

CS519: Lecture 7
Uniprocessor and Multiprocessor Scheduling
What and Why?
 What is processor scheduling?
 Why?
At first to share an expensive resource – multiprogramming
Now to perform concurrent tasks because processor is so
powerful
Future looks like past + now
Rent-a-computer approach – large data/processing centers use
multiprogramming to maximize resource utilization
Systems still powerful enough for each user to run multiple
concurrent tasks
CS 519 – Operating Systems Theory
2
DCS, Rutgers University
Assumptions
 Pool of jobs contending for the CPU
CPU is a scarce resource
 Jobs are independent and compete for resources
(this assumption is not always used)
 Scheduler mediates between jobs to optimize some
performance criteria
CS 519 – Operating Systems Theory
3
DCS, Rutgers University
Types of Scheduling
We’re mostly
concerned with
short-term
scheduling
CS 519 – Operating Systems Theory
4
DCS, Rutgers University
What Do We Optimize?
 System-oriented metrics:
Processor utilization: percentage of time the processor is
busy
Throughput: number of processes completed per unit of time
 User-oriented metrics:
Turnaround time: interval of time between submission and
termination (including any waiting time). Appropriate for
batch jobs
Response time: for interactive jobs, time from the
submission of a request until the response begins to be
received
Deadlines: when process completion deadlines are specified,
the percentage of deadlines met must be promoted
CS 519 – Operating Systems Theory
5
DCS, Rutgers University
Design Space
 Two dimensions
Selection function
Which of the ready jobs should be run next?
Preemption
Preemptive: currently running job may be interrupted and moved
to Ready state
Non-preemptive: once a process is in Running state, it continues to
execute until it terminates or it blocks for I/O or system service
CS 519 – Operating Systems Theory
6
DCS, Rutgers University
Job Behavior
CPU
 I/O-bound jobs
 Jobs that perform lots of
I/O
 Tend to have short CPU
bursts
 CPU-bound jobs
 Jobs that perform very
little I/O
 Tend to have very long CPU
bursts
 Distribution tends to be
hyper-exponential
 Very large number of very
short CPU bursts
 A small number of very long
CPU bursts
Disk
CS 519 – Operating Systems Theory
7
DCS, Rutgers University
Scheduling Algorithms
 FIFO: non-preemptive
 Round-Robin: preemptive
 Shortest Job Next: non-preemptive
 Shortest Remaining Time: preemptive at arrival
 Highest Response Ratio (turnaround/service time)
Next: non-preemptive
 Priority with feedback: preemptive at time quantum
CS 519 – Operating Systems Theory
8
DCS, Rutgers University
Example Job Set
Arrival
Time
Service
Time
1
0
3
2
2
6
3
4
4
4
6
5
5
8
2
Process
CS 519 – Operating Systems Theory
9
DCS, Rutgers University
Behavior of Scheduling Policies
CS 519 – Operating Systems Theory
10
DCS, Rutgers University
Behavior of Scheduling Policies
CS 519 – Operating Systems Theory
11
DCS, Rutgers University
Priority with Feedback Scheduling
After each preemption,
process moves to
lower-priority queue
CS 519 – Operating Systems Theory
12
DCS, Rutgers University
Scheduling Algorithms
 FIFO is simple but leads to poor average response times. Short
processes are delayed by long processes that arrive before
them
 RR eliminate this problem, but favors CPU-bound jobs, which
have longer CPU bursts than I/O-bound jobs
 SJN, SRT, and HRRN alleviate the problem with FIFO, but
require information on the length of each process. This
information is not always available (although it can sometimes be
approximated based on past history or user input)
 Feedback is a way of alleviating the problem with FIFO without
information on process length
CS 519 – Operating Systems Theory
13
DCS, Rutgers University
It’s a Changing World
 Assumption about bi-modal workload no longer holds
Interactive continuous media applications are sometimes
processor-bound but require good response times
 New computing model requires more flexibility
How to match priorities of cooperative jobs, such as
client/server jobs?
How to balance execution between multiple threads of a
single process?
CS 519 – Operating Systems Theory
14
DCS, Rutgers University
Lottery Scheduling
 Randomized resource allocation mechanism
 Resource rights are represented by lottery tickets
 Have rounds of lottery
 In each round, the winning ticket (and therefore the
winner) is chosen at random
 The chances of you winning directly depends on the
number of tickets that you have
P[wining] = t/T, t = your number of tickets, T = total number
of tickets
CS 519 – Operating Systems Theory
15
DCS, Rutgers University
Lottery Scheduling
 After n rounds, your expected number of wins is
 E[win] = nP[wining]
 The expected number of lotteries that a client must wait
before its first win
 E[wait] = 1/P[wining]
 Lottery scheduling implements proportional-share resource
management
 Ticket currencies allow isolation between users, processes, and
threads
 OK, so how do we actually schedule the processor using lottery
scheduling?
CS 519 – Operating Systems Theory
16
DCS, Rutgers University
Implementation
CS 519 – Operating Systems Theory
17
DCS, Rutgers University
Performance
Allocated and observed
execution ratios between
two tasks running the
Dhrystone benchmark.
With exception of 10:1
allocation ratio, all observed
ratios are close to allocations
CS 519 – Operating Systems Theory
18
DCS, Rutgers University
Short-term Allocation Ratio
CS 519 – Operating Systems Theory
19
DCS, Rutgers University
Isolation
Five tasks running the Dhrystone
benchmark. Let amount.currency
denote a ticket allocation of amount
denominated in currency. Tasks
A1 and A2 have allocations 100.A and
200.A, respectively. Tasks B1 and B2
have allocations 100.B and 200.B,
respectively. Halfway thru experiment
B3 is started with allocation 300.B.
This inflates the number of tickets in B
from 300 to 600. There’s no effect on
tasks in currency A or on the aggregate
iteration ratio of A tasks to B tasks.
Tasks B1 and B2 slow to half their
original rates, corresponding to the
factor of 2 inflation caused by B3.
CS 519 – Operating Systems Theory
20
DCS, Rutgers University
Thread Scheduling for Cache Locality
 Traditionally, each resource (CPU, memory, I/O) has
been managed separately
 Resources are not independent, however
 Policy for one resource can affect how another
resource is used. For instance, the order in which
threads are scheduled can affect performance of
memory subsystem
 Neat paper that uses a very simple scheduling idea to
enhance memory performance
CS 519 – Operating Systems Theory
21
DCS, Rutgers University
Main Idea
 When working with a large array, want to tile (block)
for efficient use of the cache
What is tiling? Restructuring loops for data re-use.
 Tiling by hand is a pain and is error-prone
 Compilers can automatically tile but not always. For
instance, when program contains dynamically
allocated or indirectly accessed data
 So, use threads and hints to improve cache utilization
CS 519 – Operating Systems Theory
22
DCS, Rutgers University
Example
Thread ti is denoted by ti(ai1,…,aik),
where aij is the address of the jth
piece of data reference by thread ti.
Simplify by using just 2 or 3 addresses
or hints. Hints might be elements of
rows or columns of a matrix, for ex.
CS 519 – Operating Systems Theory
23
DCS, Rutgers University
Algorithm
 Hash algorithm should
assign threads to bins so
that threads with similar
hints fall in the same bin
 Threads in each bin are
scheduled together
 2-D plane of bins with 2
hints
 Key insight: the sum of the
two dimensions of a bin is
less than the cache size C
 Easy to extend to k hints
CS 519 – Operating Systems Theory
24
DCS, Rutgers University
Performance
Fork = create and schedule null thread
Run = execute and terminate null thread
CS 519 – Operating Systems Theory
25
DCS, Rutgers University
More Complex Examples
Partial differential equation solver
CS 519 – Operating Systems Theory
26
DCS, Rutgers University
Multiprocessor Scheduling
 Load sharing: single ready queue; processor dequeues
thread at the front of the queue when idle;
preempted threads are placed at the end of queue
 Gang scheduling: all threads belonging to an
application run at the same time
 Hardware partitions: chunk of the machine is
dedicated to each application
 Advantages and disadvantages?
CS 519 – Operating Systems Theory
27
DCS, Rutgers University
Multiprocessor Scheduling
 Load sharing: poor locality; poor synchronization
behavior; simple; good processor utilization. Affinity
or per processor queues can improve locality.
 Gang scheduling: central control; fragmentation -unnecessary processor idle times (e.g., two
applications with P/2+1 threads); good
synchronization behavior; if careful, good locality
 Hardware partitions: poor utilization for I/Ointensive applications; fragmentation – unnecessary
processor idle times when partitions left are small;
good locality and synchronization behavior
CS 519 – Operating Systems Theory
28
DCS, Rutgers University