Transcript lec08-part1

Operating Systems Engineering
OS Scheduling
By Dan Tsafrir, 27/4/2011
What’s OS scheduling





OS policies & mechanisms to allocate resources to entities
 Enforce a policy (e.g., “shortest first”)
 Using an OS mechanism (e.g., “thread chooser”); see the sketch at the end of this slide
Entities
 Threads, processes, process groups, users, I/O ops (web requests, disk accesses…)
Resources
 CPU cycles, memory, I/O bandwidth (network, disk)
Dynamic setting
 Scheduling typically matters when resource reassignment can happen with some frequency
 Not applicable to more static resources (like disk space)
Varying scale
 From shared memory accesses (HW) to parallel jobs (SW)
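
For concreteness, here is a minimal sketch of the policy/mechanism split mentioned above (illustrative only; the names are made up and this is not from the lecture): a generic “thread chooser” mechanism parameterized by a pluggable “shortest first” policy.

```c
/* Mechanism vs. policy, in miniature (hypothetical code, not any real OS):
 * the mechanism is a generic "choose the next thread" routine; the policy
 * is a pluggable comparison, here "shortest (expected) job first". */
#include <stddef.h>
#include <stdio.h>

struct thread { int id; int expected_runtime; };

/* Policy: should thread a run before thread b? */
typedef int (*policy_fn)(const struct thread *a, const struct thread *b);

int shortest_first(const struct thread *a, const struct thread *b)
{
    return a->expected_runtime < b->expected_runtime;
}

/* Mechanism: scan the ready list and return the policy's favorite. */
struct thread *choose_next(struct thread *ready, size_t n, policy_fn better)
{
    struct thread *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (!best || better(&ready[i], best))
            best = &ready[i];
    return best;
}

int main(void)
{
    struct thread ready[] = { {1, 40}, {2, 5}, {3, 25} };
    struct thread *next = choose_next(ready, 3, shortest_first);
    printf("dispatch thread %d\n", next->id);   /* thread 2 (shortest) */
    return 0;
}
```
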
Overload



Scheduling unavoidable
 When resources are shared
But it’s mostly interesting in a state of overload
 When demand exceeds available resources
 E.g., if |threads| <= |cores|, scheduling is trivial
 Well, not really, as thread placement might impact performance greatly (caches are shared)
 For parallel jobs on, say, the BlueGene supercomputer, this example is more or less true
 Likewise, in the cloud, when there are more physical servers than VMs (and you’re willing to pay for power)
A good scheduling policy
 Gives the most important entity what it needs
 Equally important => fair share
Popularity of OS scheduling research




Especially popular in the days of time-sharing
 When there was a shortage of resources all around
Many scheduling problems become uninteresting
 When you can just cheaply buy more/faster resources
But there were always important exceptions
 Web servers handling peak demands (flash crowds, attacks, prioritizing paying customers)
And nowadays…
 Embedded (power/performance considerations on your handheld device)
 Cloud servers
Key challenges




Knowing what’s important
 Can’t read clients’ minds; often unrealistic to explicitly ask
Many relevant, often conflicting performance metrics
 Throughput vs. latency (e.g., network packets)
 Throughput vs. fairness (e.g., DRAM accesses)
 Power vs. speed (e.g., DVFS)
 Soft/hard realtime
 …
Many schedulers
 CPU, disk, I/O, memory,…
 Interaction not necessarily healthy
Countless domain-specific, workload-dependent, ad-hoc solutions
 No generic solution
Addressing challenges – baseline
1. Understand where scheduling is occurring
2. Expose scheduling decisions, allow control, allow different policies
3. Account for resource consumption to allow intelligent control
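
As a concrete, Linux-flavored illustration of points 2 and 3 (not part of the slides), the standard interfaces for switching policies and reading per-process CPU accounting are sched_setscheduler() and getrusage():

```c
/* Sketch: the kernel exposes control over the scheduling policy (point 2)
 * and accounts for per-process CPU consumption (point 3).
 * Assumes Linux/POSIX; SCHED_FIFO typically requires privileges. */
#include <sched.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

int main(void)
{
    /* 2. Control: request a different policy for ourselves (pid 0 = self).
     *    Fails without the appropriate privileges (e.g., CAP_SYS_NICE). */
    struct sched_param sp = { .sched_priority = 10 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");

    /* ... burn some CPU ... */
    for (volatile long i = 0; i < 100000000L; i++)
        ;

    /* 3. Accounting: how much CPU did we actually consume? */
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    printf("user: %ld.%06ld s  sys: %ld.%06ld s\n",
           (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
           (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
    return 0;
}
```
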
Multilevel priority queue

Running reduces your priority to run more (see the code sketch at the end of this slide)
 Multiple ready queues, ordered by importance
 Run the most important first, in a round-robin fashion
 Use preemption if a more important process enters the system
 The negative feedback loop ensures no starvation
 If you sleep a lot => you consume little CPU => you’re important => when awakened, you’d typically get a CPU immediately

Used by all general-purpose OSes
 Unix family, Windows family

Problematic in many respects
 What if you run a lot but are still very important to the user?
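
A minimal sketch of the multilevel feedback idea (toy code, not the actual Unix/Windows scheduler): a task that burns a whole time slice sinks to a lower-priority queue, while a task that wakes from sleep is reinserted at the top, so interactive tasks get the CPU as soon as they wake.

```c
/* Toy multilevel feedback queue: NLEVELS ready queues, queue 0 most
 * important; the "negative feedback loop" is task_used_full_slice()
 * (demote CPU hogs) vs. task_woke_up() (sleepers return to the top). */
#include <stdio.h>

#define NLEVELS 4
#define QSIZE   16            /* per-level capacity; no overflow handling */

struct task { int id; };

struct queue {
    struct task *buf[QSIZE];
    int head, tail, len;
} ready[NLEVELS];

static void enqueue(int level, struct task *t)
{
    struct queue *q = &ready[level];
    q->buf[q->tail] = t;
    q->tail = (q->tail + 1) % QSIZE;
    q->len++;
}

static struct task *dequeue(int level)
{
    struct queue *q = &ready[level];
    struct task *t = q->buf[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->len--;
    return t;
}

/* Run the most important first: take from the highest non-empty queue;
 * within one queue this degenerates to round-robin. */
static struct task *pick_next(int *level)
{
    for (int l = 0; l < NLEVELS; l++)
        if (ready[l].len > 0) {
            *level = l;
            return dequeue(l);
        }
    return NULL;                              /* nothing to run: idle */
}

/* Feedback: running a full slice demotes you... */
static void task_used_full_slice(struct task *t, int level)
{
    enqueue(level + 1 < NLEVELS ? level + 1 : level, t);
}

/* ...while sleeping a lot (little CPU consumed) makes you "important". */
static void task_woke_up(struct task *t)
{
    enqueue(0, t);
}

int main(void)
{
    struct task a = {1}, b = {2};
    task_woke_up(&a);                 /* a just finished waiting for I/O   */
    enqueue(NLEVELS - 1, &b);         /* b is a CPU hog stuck at the bottom */

    int level;
    struct task *next = pick_next(&level);
    printf("run task %d from queue %d\n", next->id, level); /* task 1, queue 0 */
    task_used_full_slice(next, level);        /* burned its slice => demote */
    return 0;
}
```
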
Pitfall: priority inversion



Example
 One CPU
 P_low holds a lock
 P_high waits for it
 P_med becomes runnable => OS preempts P_low
 (From real life: exactly what happened on the 1997 Mars Pathfinder mission)
Example
 Many CPU-bound background processes
 X server serves multiple clients => CPU quota might run out
Possible solution: priority inheritance
 P_high lends its priority to P_low until the lock is released
 X clients lend their priority to X server until serviced
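
POSIX exposes priority inheritance directly: a mutex created with the PTHREAD_PRIO_INHERIT protocol makes the current lock holder (P_low) run at the priority of the highest-priority waiter (P_high). A minimal sketch, assuming the platform supports _POSIX_THREAD_PRIO_INHERIT (compile with -pthread):

```c
#include <pthread.h>
#include <stdio.h>

pthread_mutex_t lock;

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);

    /* The owner of 'lock' temporarily runs at the priority of the
     * highest-priority thread blocked on it, so P_med can no longer
     * preempt P_low while P_high is waiting. */
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0)
        fprintf(stderr, "priority inheritance not supported here\n");

    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    /* P_low would take the lock here; if P_high then blocks on it,
     * P_low inherits P_high's priority until pthread_mutex_unlock(). */
    pthread_mutex_lock(&lock);
    /* ... critical section ... */
    pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    return 0;
}
```
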
Pitfall: uncoordinated schedulers

Example
 Even though the CPU scheduler favors emacs (it sleeps a lot), the disk I/O scheduler does not, preferring higher throughput over emacs’s I/O ops

Example
 Emacs needs memory => memory is tight => emacs must wait
 Other processes have dirty pages => these must be written to disk to free memory
 The disk I/O scheduler doesn’t know these writes are important (to emacs)
Active field of research

“No justified complaints:
On fair sharing of multiple resources”
[Dec 2010; Dolev, Feitelson, Linial et al.; TR]
“We define fairness in such a scenario as the situation where
every user either gets all the resources he wishes for, or else
gets at least his entitlement on some bottleneck resource, and
therefore cannot complain about not getting more.
We then prove that a fair allocation according to this definition
is guaranteed to exist for any combination of user requests and
entitlements. The proof, which uses tools from the theory of
ordinary differential equations, is constructive and provides a
method to compute the allocations numerically.”
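
One way to read that definition operationally (my own sketch, not code from the paper): an allocation leaves user u with no justified complaint if u either received its full demand, or receives at least its entitlement on some saturated (bottleneck) resource.

```c
#include <stdbool.h>

#define R 3   /* number of resource types (illustrative) */

/* Check the "no justified complaints" condition for one user, given its
 * allocation, demand, entitlement (a fraction of each resource), the total
 * allocated amounts, and the resource capacities. */
bool no_justified_complaint(const double got[R], const double want[R],
                            double entitlement,
                            const double total_used[R],
                            const double capacity[R])
{
    bool got_all = true;
    for (int r = 0; r < R; r++)
        if (got[r] < want[r])
            got_all = false;
    if (got_all)
        return true;           /* got everything it wished for */

    for (int r = 0; r < R; r++) {
        bool bottleneck = total_used[r] >= capacity[r];   /* saturated */
        if (bottleneck && got[r] >= entitlement * capacity[r])
            return true;       /* at least its entitlement on a bottleneck */
    }
    return false;              /* a justified complaint exists */
}
```
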
Active field of research

“RSIO:
Automatic User Interaction Detection and Scheduling”
[Jun 2010; Zheng, Viennot, Nieh; SIGMETRICS]
“We present RSIO, a processor scheduling framework for
improving the response time of latency-sensitive applications
by monitoring accesses to I/O channels and inferring when
user interactions occur.
RSIO automatically identifies processes involved in a user
interaction and boosts their priorities at the time the interaction
occurs to improve system response time.
RSIO also detects processes indirectly involved in processing
an interaction, automatically accounting for dependencies and
boosting their priorities accordingly.”
Active field of research

“Secretly monopolizing the CPU without superuser
privileges”
[Aug 2007; Tsafrir, Etsion, Feitelson; USENIX Security]

See next presentation…