CS519: Lecture 7
Uniprocessor and Multiprocessor Scheduling
What and Why?
What is processor scheduling?
Why?
At first, to share an expensive resource – multiprogramming
Now, to run concurrent tasks because the processor is so powerful
The future looks like the past + now:
Rent-a-computer approach – large data/processing centers use multiprogramming to maximize resource utilization
Systems are still powerful enough for each user to run multiple concurrent tasks
Assumptions
Pool of jobs contending for the CPU
CPU is a scarce resource
Jobs are independent and compete for resources
(this assumption is not always used)
Scheduler mediates between jobs to optimize some
performance criteria
Types of Scheduling
We’re mostly concerned with short-term scheduling
What Do We Optimize?
System-oriented metrics:
Processor utilization: percentage of time the processor is busy
Throughput: number of processes completed per unit time
User-oriented metrics:
Turnaround time: interval between submission and termination (including any waiting time); appropriate for batch jobs
Response time: for interactive jobs, the time from the submission of a request until the response begins to be received
Deadlines: when process completion deadlines are specified, maximize the percentage of deadlines met
Design Space
Two dimensions
Selection function
Which of the ready jobs should be run next?
Preemption
Preemptive: the currently running job may be interrupted and moved to the Ready state
Non-preemptive: once a process is in the Running state, it continues to execute until it terminates or blocks for I/O or a system service
Job Behavior
I/O-bound jobs: perform lots of I/O and tend to have short CPU bursts
CPU-bound jobs: perform very little I/O and tend to have very long CPU bursts
The distribution of CPU-burst lengths tends to be hyper-exponential: a very large number of very short CPU bursts and a small number of very long CPU bursts
Scheduling Algorithms
FIFO: non-preemptive
Round-Robin: preemptive
Shortest Job Next: non-preemptive
Shortest Remaining Time: preemptive at arrival
Highest Response Ratio Next (response ratio = turnaround time / service time): non-preemptive
Priority with feedback: preemptive at time quantum
Example Job Set

Process   Arrival Time   Service Time
   1            0              3
   2            2              6
   3            4              4
   4            6              5
   5            8              2
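To make the policies above concrete, here is a minimal simulation sketch in Python (illustrative code, not from the lecture; all function and variable names are ours) that runs FIFO, non-preemptive SJN, and Round-Robin with a quantum of 1 over this job set and prints each process's turnaround time:

from collections import deque

# (process id, arrival time, service time) from the example job set
JOBS = [(1, 0, 3), (2, 2, 6), (3, 4, 4), (4, 6, 5), (5, 8, 2)]

def fifo(jobs):
    """Non-preemptive: run jobs to completion in arrival order."""
    time, finish = 0, {}
    for pid, arrival, service in sorted(jobs, key=lambda j: j[1]):
        time = max(time, arrival) + service
        finish[pid] = time
    return finish

def sjn(jobs):
    """Non-preemptive Shortest Job Next among jobs that have arrived."""
    pending = sorted(jobs, key=lambda j: j[1])
    time, finish = 0, {}
    while pending:
        # if nothing has arrived yet, fall back to the earliest arrival
        ready = [j for j in pending if j[1] <= time] or [pending[0]]
        job = min(ready, key=lambda j: j[2])        # shortest service time
        pending.remove(job)
        time = max(time, job[1]) + job[2]
        finish[job[0]] = time
    return finish

def round_robin(jobs, quantum=1):
    """Preemptive: each job runs at most `quantum` time units per turn."""
    jobs = sorted(jobs, key=lambda j: j[1])
    remaining = {pid: service for pid, _, service in jobs}
    queue, time, i, finish = deque(), 0, 0, {}
    while len(finish) < len(jobs):
        while i < len(jobs) and jobs[i][1] <= time:   # admit new arrivals
            queue.append(jobs[i][0]); i += 1
        if not queue:
            time = jobs[i][1]                          # idle until next arrival
            continue
        pid = queue.popleft()
        run = min(quantum, remaining[pid])
        time += run
        remaining[pid] -= run
        while i < len(jobs) and jobs[i][1] <= time:   # arrivals during the slice
            queue.append(jobs[i][0]); i += 1
        if remaining[pid]:
            queue.append(pid)                          # preempted: back of the queue
        else:
            finish[pid] = time
    return finish

for name, policy in (("FIFO", fifo), ("SJN", sjn), ("RR", round_robin)):
    finish = policy(JOBS)
    print(name, {pid: finish[pid] - arr for pid, arr, _ in JOBS})

On this job set, FIFO gives process 5 a turnaround time of 12, while SJN cuts it to 3 by running it as soon as process 2 finishes; this is the short-job favoritism discussed below.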
Behavior of Scheduling Policies
[figures not reproduced in this transcript]
Priority with Feedback Scheduling
After each preemption, the process moves to a lower-priority queue
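A minimal sketch of the feedback mechanism, under assumed parameters (three priority levels, a quantum of one time unit, and new jobs entering at the top level; none of these specifics come from the slide):

from collections import deque

def feedback(jobs, levels=3):
    """Priority with feedback: jobs = [(pid, arrival, service)].
    A job preempted at level k is demoted to level k+1; the scheduler
    always serves the highest-priority non-empty queue."""
    jobs = sorted(jobs, key=lambda j: j[1])
    remaining = {pid: service for pid, _, service in jobs}
    queues = [deque() for _ in range(levels)]
    time, i, finish = 0, 0, {}
    while len(finish) < len(jobs):
        while i < len(jobs) and jobs[i][1] <= time:
            queues[0].append(jobs[i][0]); i += 1     # new jobs enter at the top
        level = next((k for k, q in enumerate(queues) if q), None)
        if level is None:
            time = jobs[i][1]                         # idle until next arrival
            continue
        pid = queues[level].popleft()
        time += 1                                     # quantum = 1 time unit
        remaining[pid] -= 1
        while i < len(jobs) and jobs[i][1] <= time:
            queues[0].append(jobs[i][0]); i += 1
        if remaining[pid] == 0:
            finish[pid] = time
        else:
            queues[min(level + 1, levels - 1)].append(pid)   # demote
    return finish

print(feedback([(1, 0, 3), (2, 2, 6), (3, 4, 4), (4, 6, 5), (5, 8, 2)]))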
Scheduling Algorithms
FIFO is simple but leads to poor average response times: short processes are delayed by long processes that arrive before them
RR eliminates this problem, but favors CPU-bound jobs, which have longer CPU bursts than I/O-bound jobs
SJN, SRT, and HRRN alleviate the problem with FIFO, but require information on the length of each process. This information is not always available (although it can sometimes be approximated from past history or user input)
Feedback is a way of alleviating the problem with FIFO without information on process length
It’s a Changing World
The assumption of a bi-modal workload no longer holds
Interactive continuous-media applications are sometimes processor-bound but require good response times
New computing model requires more flexibility
How to match priorities of cooperative jobs, such as
client/server jobs?
How to balance execution between multiple threads of a
single process?
Lottery Scheduling
Randomized resource allocation mechanism
Resource rights are represented by lottery tickets
Lotteries are held in rounds; in each round, the winning ticket (and therefore the winner) is chosen at random
Your chance of winning depends directly on the number of tickets you hold:
P[winning] = t/T, where t = your number of tickets and T = total number of tickets
Lottery Scheduling
After n rounds, your expected number of wins is
E[wins] = n × P[winning]
The expected number of lotteries that a client must wait before its first win is
E[wait] = 1/P[winning]
Lottery scheduling implements proportional-share resource
management
Ticket currencies allow isolation between users, processes, and
threads
OK, so how do we actually schedule the processor using lottery
scheduling?
Implementation
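The implementation figure is not reproduced in this transcript. As a stand-in, here is a minimal sketch of the core selection step, assuming a flat ticket list without currencies (names are ours): draw a random ticket number in [0, T) and walk the client list until the running ticket sum passes it.

import random

def draw_winner(clients):
    """clients: dict mapping client id -> ticket count.
    Returns the holder of the randomly chosen winning ticket."""
    total = sum(clients.values())
    winner = random.randrange(total)      # winning ticket number in [0, total)
    count = 0
    for client, tickets in clients.items():
        count += tickets
        if winner < count:                # running sum passed the winning ticket
            return client

# Each quantum, hold one lottery; over many rounds the observed
# allocation converges to the ticket ratio (here 1:2:3).
clients = {"A": 100, "B": 200, "C": 300}
wins = {c: 0 for c in clients}
for _ in range(60_000):
    wins[draw_winner(clients)] += 1
print(wins)   # roughly {'A': 10000, 'B': 20000, 'C': 30000}

Over many quanta the observed allocation converges to the ticket ratio, which is exactly the proportional-share property claimed above.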
Performance
Allocated and observed execution ratios between two tasks running the Dhrystone benchmark. With the exception of the 10:1 allocation ratio, all observed ratios are close to the allocations.
Short-term Allocation Ratio
[figure not reproduced in this transcript]
Isolation
Five tasks running the Dhrystone benchmark. Let amount.currency denote a ticket allocation of amount denominated in currency. Tasks A1 and A2 have allocations 100.A and 200.A, respectively; tasks B1 and B2 have allocations 100.B and 200.B, respectively. Halfway through the experiment, B3 is started with allocation 300.B. This inflates the number of tickets in B from 300 to 600. There is no effect on tasks in currency A or on the aggregate iteration ratio of A tasks to B tasks. Tasks B1 and B2 slow to half their original rates, corresponding to the factor-of-2 inflation caused by B3.
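A small sketch of the currency arithmetic behind this experiment (the base funding of currencies A and B is not given on the slide, so equal funding is an assumption): a task's global share is its fraction of its own currency multiplied by that currency's share of the base.

def global_shares(funding, tasks):
    """funding: base tickets backing each currency; tasks: {name: (amount, currency)}.
    A task's global share = (its amount / currency total) * currency's base fraction."""
    base = sum(funding.values())
    totals = {}
    for amount, cur in tasks.values():
        totals[cur] = totals.get(cur, 0) + amount
    return {name: (amount / totals[cur]) * (funding[cur] / base)
            for name, (amount, cur) in tasks.items()}

funding = {"A": 100, "B": 100}                 # assume A and B equally funded
before = {"A1": (100, "A"), "A2": (200, "A"),
          "B1": (100, "B"), "B2": (200, "B")}
after = dict(before, B3=(300, "B"))            # B3 inflates currency B 300 -> 600
print(global_shares(funding, before))   # B1 gets 1/6 of the CPU
print(global_shares(funding, after))    # B1 gets 1/12 (halved); A tasks unchanged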
Thread Scheduling for Cache Locality
Traditionally, each resource (CPU, memory, I/O) has
been managed separately
Resources are not independent, however
Policy for one resource can affect how another
resource is used. For instance, the order in which
threads are scheduled can affect performance of
memory subsystem
Neat paper that uses a very simple scheduling idea to
enhance memory performance
Main Idea
When working with a large array, want to tile (block)
for efficient use of the cache
What is tiling? Restructuring loops for data re-use.
Tiling by hand is a pain and is error-prone
Compilers can tile automatically, but not always: for instance, when the program contains dynamically allocated or indirectly accessed data
So, use threads and hints to improve cache utilization
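As a concrete picture of tiling (illustrative code, not from the lecture; Python is used only for consistency with the other sketches, since the cache payoff really shows up in compiled code), compare a naive transpose with a tiled one that visits the matrix in blocks small enough for the source and destination blocks to stay cache-resident:

def transpose_naive(a):
    """Column-order reads of `a`: strides across all rows, poor locality."""
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]

def transpose_tiled(a, tile=64):
    """Visit the matrix in tile x tile blocks so each block of `a` and the
    corresponding block of `out` are reused while they are in the cache."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    out[j][i] = a[i][j]
    return out

a = [[i * 8 + j for j in range(8)] for i in range(8)]
assert transpose_tiled(a, tile=4) == transpose_naive(a)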
Example
Thread t_i is denoted by t_i(a_i1, …, a_ik), where a_ij is the address of the jth piece of data referenced by thread t_i. Simplify by using just two or three addresses as hints. Hints might be elements of rows or columns of a matrix, for example.
Algorithm
The hash algorithm should assign threads to bins so that threads with similar hints fall into the same bin
Threads in each bin are scheduled together
With 2 hints, the bins form a 2-D plane
Key insight: the sum of the two dimensions of a bin is less than the cache size C
Easy to extend to k hints
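A minimal sketch of the binning idea with two hints (all parameters are assumed for illustration): each hint address is divided by half the cache size, so the data touched along the two dimensions of any one bin fits within a cache of size C, and threads hashed to the same bin are run back to back.

def schedule_by_hints(threads, cache_size):
    """threads: {tid: (hint1, hint2)}, where hints are data addresses.
    Hash each thread into a 2-D bin; each bin spans cache_size // 2
    addresses per hint, so one bin's data fits in the cache.
    Threads that share a bin are scheduled consecutively."""
    span = cache_size // 2            # per-dimension extent of a bin
    bins = {}
    for tid, (h1, h2) in threads.items():
        bins.setdefault((h1 // span, h2 // span), []).append(tid)
    order = []
    for key in sorted(bins):          # visit bins; run each bin's threads together
        order.extend(bins[key])
    return order

# Threads touching nearby data land in the same bin and run back to back.
threads = {1: (0, 0), 2: (100, 40), 3: (70_000, 0), 4: (50, 65_000)}
print(schedule_by_hints(threads, cache_size=64 * 1024))   # [1, 2, 4, 3]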
Performance
Fork = create and schedule null thread
Run = execute and terminate null thread
More Complex Examples
Partial differential equation solver
Multiprocessor Scheduling
Load sharing: single ready queue; an idle processor dequeues the thread at the front of the queue; preempted threads are placed at the end of the queue
Gang scheduling: all threads belonging to an application run at the same time
Hardware partitions: a chunk of the machine is dedicated to each application
Advantages and disadvantages?
Multiprocessor Scheduling
Load sharing: poor locality; poor synchronization behavior; simple; good processor utilization. Affinity or per-processor queues can improve locality (a small sketch follows below)
Gang scheduling: central control; fragmentation, i.e., unnecessary processor idle time (e.g., two applications with P/2+1 threads each); good synchronization behavior; if careful, good locality
Hardware partitions: poor utilization for I/O-intensive applications; fragmentation, i.e., unnecessary processor idle time when the remaining partitions are small; good locality and synchronization behavior
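As a small illustration of the affinity refinement mentioned under load sharing (an illustrative sketch; the data layout is ours): an idle processor scans the shared ready queue for a thread that last ran on it and falls back to the queue head only when no queued thread is cache-warm.

from collections import deque

def pick_next(ready, cpu):
    """Shared ready queue with affinity: prefer a thread whose last_cpu
    matches this processor; otherwise take the thread at the head."""
    for i, thread in enumerate(ready):
        if thread["last_cpu"] == cpu:
            del ready[i]
            return thread
    return ready.popleft() if ready else None

ready = deque([{"tid": 1, "last_cpu": 0},
               {"tid": 2, "last_cpu": 1},
               {"tid": 3, "last_cpu": None}])
print(pick_next(ready, cpu=1))   # picks tid 2 (cache-warm on CPU 1)
print(pick_next(ready, cpu=1))   # falls back to the head: tid 1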