
CS 620 Advanced Operating Systems
Lecture 5 – Processes
Professor Timothy Arndt
BU 331
Threads
• Motivation
 Keep programs interactive.
• Programs may block waiting for I/O.
• This is a bad thing if the program is interactive.
 Background save.
• Consider saving a large file to a slow disk system.
• The user has to wait for the save operation to finish before
proceeding.
 Multiple clients.
• Consider also a server program with multiple clients.
• We must fork off a new process for each client.
Threads
• Basic idea
 Multiple lightweight processes.
• Traditional processes are slow to create and use many system resources.
• Threads are similar to processes.
 They have their own stack and PC.
 Single address space.
• By sharing a single address space, threads avoid the cost of creating and maintaining separate address spaces.
• Since threads of a single program cooperate (unlike processes
of separate programs) it is possible for them to share an
address space.
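A minimal sketch of this sharing, using POSIX threads (the pthread API is an assumption here; the slides do not name a particular thread package). Each thread gets its own stack, but both touch the same global variable in the single shared address space:

```c
#include <pthread.h>
#include <stdio.h>

/* One global in the single address space shared by all threads. */
static int shared_counter = 0;

static void *worker(void *arg) {
    int id = *(int *)arg;        /* 'id' lives on this thread's own stack */
    shared_counter++;            /* unsynchronized on purpose: this race is
                                    exactly why the synchronization
                                    primitives on a later slide are needed */
    printf("thread %d sees counter = %d\n", id, shared_counter);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int id1 = 1, id2 = 2;
    pthread_create(&t1, NULL, worker, &id1);
    pthread_create(&t2, NULL, worker, &id2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("final counter = %d\n", shared_counter);
    return 0;
}
```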
Thread Usage in Nondistributed
Systems
• Context switching as the result of IPC
Threads
• Threads also share:
 open files
 child processes
 timers
 etc.
 Synchronization primitives available.
• Necessary since threads share a common memory.
• Semaphores, mutexes, etc.
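As one concrete instance of these primitives, a small sketch with a POSIX mutex (again an assumed API) protecting data in the common memory:

```c
#include <pthread.h>

static long balance = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Any thread may call this; the mutex serializes access to the
   shared balance so concurrent deposits cannot be lost. */
void deposit(long amount) {
    pthread_mutex_lock(&lock);
    balance += amount;
    pthread_mutex_unlock(&lock);
}
```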
Threads
• So what’s a thread?
 Stack
 Program counter
 What about I/O, signals, and global variables (like errno)?
Threads
• Is it managed from user or kernel level?
 User threads have a very cheap task context
switch
 Kernel threads handle blocking I/O cleanly
 In order for user threads not to block, we need extended models for I/O
• E.g. select() indicates which files are ready to transfer data so we don’t block (see the sketch below)
 Hybrid is also possible
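A sketch of the select() check (the POSIX interface; the zero timeout and the helper's name are illustrative assumptions). A user-level thread library could call this before a read() and switch to another thread when the descriptor is not ready, so the whole process never blocks:

```c
#include <sys/select.h>

/* Returns 1 if a read() on fd would not block, 0 if it would,
   -1 on error. A zero timeout turns select() into a poll. */
int ready_to_read(int fd) {
    fd_set rfds;
    struct timeval tv = {0, 0};
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```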
Thread Implementation
• Combining kernel-level lightweight processes and user-level threads.
Threads
• Preemption can be implemented via signal
• Should user-level threads be preempted?
 Easier programming model if threads yield() the processor.
 But it is a nuisance to program with extra yield() calls.
 Preemption can be controlled with special no-preempt regions.
Threads
• So how do you use threads?
 User interface/computation/I/O handled
separately (think of a browser)
 Pop-up server threads
 On multiprocessor systems, we can have
threads working in parallel on the multiple
processors as an alternative to shared memory
IPC
Multithreaded Servers (1)
• A multithreaded server organized in a dispatcher/worker
model.
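A sketch of this dispatcher/worker organization in C with pthreads (the queue size, worker count, and the serve_client() handler are assumptions for illustration): the dispatcher queues incoming client descriptors, and sleeping workers pick them up.

```c
#include <pthread.h>

#define QSIZE    64
#define NWORKERS 4

void serve_client(int fd);        /* hypothetical per-request handler */

/* Bounded queue of client descriptors, shared by dispatcher and workers. */
static int queue[QSIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Dispatcher thread: accept a request, hand it to the worker pool. */
void dispatch(int client_fd) {
    pthread_mutex_lock(&qlock);
    if (count < QSIZE) {          /* drop the request if the queue is full */
        queue[tail] = client_fd;
        tail = (tail + 1) % QSIZE;
        count++;
        pthread_cond_signal(&nonempty);
    }
    pthread_mutex_unlock(&qlock);
}

/* Worker thread: block until a request arrives, then serve it. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (count == 0)
            pthread_cond_wait(&nonempty, &qlock);
        int fd = queue[head];
        head = (head + 1) % QSIZE;
        count--;
        pthread_mutex_unlock(&qlock);
        serve_client(fd);
    }
    return NULL;
}

void start_workers(void) {
    for (int i = 0; i < NWORKERS; i++) {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
    }
}
```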
Threads in Windows
 Each process in Windows contains one or more threads.
• Threads are the executable units in Windows, not processes.
 Threads have the following attributes:
• The PID of the process that owns it.
• A numeric base priority specifying its importance relative to
other threads.
• A dynamic priority.
• Its execution time so far.
• An allowed processor set.
• An exit status.
Threads in Windows
 The Windows Kernel schedules threads and
handles interrupts and exceptions.
• The Kernel schedules or dispatches threads for the
processor in order of priority.
• It also preempts threads of lower priority in favor
of threads of higher priority.
• It can force context switches, directing the
processor to drop one task and pick up another.
• Therefore code operating in this system must be
reentrant. (Able to be interrupted and resumed
unharmed and shared by different threads executing
the code on different processors.)
Threads in Windows
 The Kernel’s own code does not, technically, run in
threads.
• Hence it is the only part of the OS that is not preemptible or
pageable.
• The rest of the threads in Windows are preemptible and fully
reentrant.
• Code which is non-reentrant can cause serialization, damaging
the performance of the OS on SMP machines.
 The Kernel schedules ready threads for processor time
based upon their dynamic priority, a number from 1 to
31.
Threads in Windows
• The highest priority thread always runs on the processor, even
if this requires that a lower-priority thread be interrupted.
 The base priority class of a process establishes a range
for the base priority of the process and its thread. The
base priority classes are:
• Idle
• Normal
• High
• Real-Time
 The base priority of a process varies within the range
established by its base priority class.
Threads in Windows
• When a user interacts with a process (the process window is at
the top of the window stack), Windows boosts the priority of
the process to maximize its response.
• The base priority of a thread is a function of the base priority
of the process in which it runs. It varies within +/- 2 from the
base priority of the process.
• The dynamic priority of a thread is a function of its base
priority. Windows continually adjusts the dynamic priority of
threads within the range established by its base priority.
• The base priority class of a running process can be changed by
using Task Manager.
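Besides Task Manager, priorities can also be changed programmatically. A minimal Win32 sketch (the particular class and level picked here are arbitrary examples):

```c
#include <windows.h>

/* Raise the current process's base priority class, then nudge one
   thread's base priority within the +/-2 band described above. */
void boost_current_thread(void) {
    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL);
}
```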
Threads in Windows
 The Windows Kernel takes maximum advantage of
multiprocessor configurations by implementing
symmetric multiprocessing (SMP) and soft affinity.
• SMP allows the threads of any process, including the OS, to
run on any processor.
• The threads of a single process can run on different processors
at the same time.
• With soft affinity, the Kernel attempts to run the thread on the
last processor it ran on.
• Applications can restrict threads to run only on certain
processors (hard affinity).
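A minimal sketch of hard affinity using the Win32 call for it (pinning to processor 0 is an arbitrary choice; by default the Kernel's soft affinity applies):

```c
#include <windows.h>

/* Restrict the calling thread to processor 0. The mask is a bit
   vector: bit i set means the thread may run on processor i. */
void pin_to_cpu0(void) {
    SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)1);
}
```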
Threads in Windows
 The Kernel manages two types of objects:
• Dispatcher objects have a signal state (signaled or
nonsignaled) and control dispatching and
synchronization of system operations.
 Semaphores, mutexes, events, etc.
• Control objects are used to control the operation of
the Kernel but do not affect dispatching.
 Processes, interrupts, etc.
 The I/O Manager (a component of the
Windows Executive) supports Asynchronous
I/O.
• Asynchronous I/O allows an application to continue
working while an I/O operation completes.
Threads in Windows
• A thread may wait for the I/O to complete, or we may use an asynchronous procedure call (APC) that the I/O Manager calls when the I/O completes, or we may use a synchronization object (e.g. an event) that the I/O system sets to the signaled state when the I/O completes (sketched below).
 The Process Manager creates and deletes
processes and tracks process objects and thread
objects.
• The subsystems define the rules for threads and
processes.
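A sketch of the event-object variant of asynchronous I/O described above (assumes the handle was opened with FILE_FLAG_OVERLAPPED; do_other_work() is a hypothetical placeholder for the computation overlapped with the I/O):

```c
#include <windows.h>

void do_other_work(void);   /* hypothetical work to overlap with the I/O */

/* Start an overlapped read, keep computing, then wait on the event
   the I/O system sets to the signaled state when the read completes. */
BOOL read_async(HANDLE file, void *buf, DWORD len) {
    DWORD transferred;
    OVERLAPPED ov = {0};
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);  /* manual-reset */
    if (!ReadFile(file, buf, len, NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING) {
        CloseHandle(ov.hEvent);
        return FALSE;                 /* the read failed outright */
    }
    do_other_work();                  /* runs while the I/O is in flight */
    WaitForSingleObject(ov.hEvent, INFINITE);
    GetOverlappedResult(file, &ov, &transferred, FALSE);
    CloseHandle(ov.hEvent);
    return TRUE;
}
```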
System Model
• We look at three models:
 Workstations (zero cost solution)
 Clusters (a.k.a. NOW, a.k.a. COW, a.k.a. LAMP, a.k.a. Beowulf, a.k.a. pool)
 Hybrid
• Workstation model
 Connect workstations in department via LAN
 Includes personal workstations and public ones
 We often have dedicated file servers
Workstation Model
Workstation Model - UMich
Workstations
 The workstations can be diskless
• Not so popular anymore (disks are cheap)
• Maintenance is easy
• Must have some startup code in ROM
 If you have a disk on the workstation you can use it for
• 1. Paging and temporary files
• 2. 1. + (some) system executables
• 3. 2. + file caching
• 4. full file system
Workstations
 Case 1 is often called dataless
• Just as easy to maintain (software) as diskless
• We still need startup code in ROM
 Case 2
• Reduces load more and speeds up program start time
• Adds maintenance since new releases of programs must be
loaded onto the workstations
 Case 3
• We can have a very few executables permanently on the disk
• Must keep the caches consistent
 Not trivial for data files with multiple writers
 This issue comes up for NFS as well
• Should you cache whole files or blocks?
Workstations
 Case 4
• You can work if just your machine is up
• But you lose location transparency
• Also requires the most maintenance
 Using idle workstations
• Early systems did this manually via rsh
 Still used today.
• How do you find idle workstations?
Workstations
• Idle = no mouse or keyboard activity and low load
average
• Workstation can announce it is idle and this is
recorded by all
• A job looking for a machine can inquire
• Must worry about race conditions
• Some jobs want a bunch of machines so they look
for many idle machines
• Can also have centralized solution, processor server
 Usual tradeoffs apply here
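A hypothetical version of this idle test (getloadavg() is a BSD/glibc extension, not POSIX; the input-activity timestamp and both thresholds are assumptions):

```c
#include <stdlib.h>
#include <time.h>

extern time_t last_input_time;  /* assumed: updated by the window system
                                   or tty driver on mouse/keyboard input */

/* Idle = no recent input AND a low load average. */
int workstation_is_idle(void) {
    double load[1];
    if (getloadavg(load, 1) < 1)
        return 0;                               /* can't tell: assume busy */
    int no_input = (time(NULL) - last_input_time) > 300;  /* 5 minutes */
    return no_input && load[0] < 0.3;           /* thresholds are arbitrary */
}
```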
 What about the local environment?
Workstations
• Files on servers are no problem
• Requests for local files must be sent home
 ... but not needed for temporary files
• System calls for memory or process management probably
need to be executed on the remote machine
• Time is a bit of a mess unless we have clocks synchronized by a system like NTP.
• If a program is interactive, we must deal with devices
 mouse, keyboard, display
 What if the borrowed machine becomes non-idle (i.e.
the owner returns)?
Workstations
• Detect presence of user.
• Kill off the guest processes.
 Helpful if we made checkpoints (or ran short jobs)
• Erase files, etc.
• We could try to migrate the guest processes to other hosts but
this must be very fast or the owner will object.
• Our goal is to make the owner unaware of our presence.
 This may not be possible, since we may have paged out the basic environment (shell, editor, X server, window manager) that the owner left running when s/he stopped using the machine.
Clusters
 Bunch of workstations without displays in
machine room connected by a network.
 They are quite popular now.
 Indeed some clusters are packaged by their
manufacturer into a serious compute engine.
• Ohio Supercomputing Center replaced MPP and
Vector supercomputers with clusters
 Used to solve large problems using many
processors at one time
• Pluses of large time sharing system vs. small
individual machines.
A Cluster System
A Cluster System
Clusters
• Also the minuses of timesharing.
• We can use simple queuing theory to show that a large fast server is better in some cases than many slower personal machines (a sketch follows).
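One standard M/M/1 calculation behind this claim (a sketch, assuming Poisson arrivals at rate λ to each of n workstations, each with service rate μ, versus a single server n times as fast handling the combined arrival stream):

```latex
T_{\text{workstation}} = \frac{1}{\mu - \lambda}
\qquad\text{vs.}\qquad
T_{\text{fast server}} = \frac{1}{n\mu - n\lambda}
                       = \frac{1}{n}\cdot\frac{1}{\mu - \lambda}
```

Under these assumptions the pooled fast server gives a mean response time n times smaller.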
• Hybrid
 Each user has a workstation and uses the pool
for big jobs.
 It is the dominant model for cluster-based machines.
Virtualization
• Virtualization has a long history.
 It was important in the 1960s/70s
 Faded during the 1980s/90s
 Increasing importance nowadays
• Security
• Ease of management
 Different types of virtualization
• Process virtual machine
• Virtual machine monitor (hardware virtual machine)
Virtualization
• Process virtual machine
 JVM
 Macromedia Flash Player
 Wine
• VMM
 VMWare
 Parallels
 VirtualBox
 Microsoft Virtual PC
The Role of Virtualization in
Distributed Systems
• (a) General organization between a program,
interface, and system. (b) General organization of
virtualizing system A on top of system B.
Architectures of Virtual
Machines
• Interfaces at different levels
• An interface between the hardware and software
consisting of machine instructions
 that can be invoked by any program.
• An interface between the hardware and software,
consisting of machine instructions
 that can be invoked only by privileged programs, such
as an operating system.
Architectures of Virtual
Machines
• Interfaces at different levels
• An interface consisting of system calls as offered by
an operating system.
• An interface consisting of library calls
 generally forming what is known as an application
programming interface (API).
 In many cases, the aforementioned system calls are
hidden by an API.
Architectures of Virtual
Machines
• Figure 3-6. Various interfaces offered by
computer systems.
Architectures of Virtual
Machines
• A process virtual machine, with multiple
instances of (application, runtime)
combinations.
Architectures of Virtual
Machines
• A virtual machine monitor, with multiple
instances of (applications, operating system)
combinations.
Processor Allocation
• Processor Allocation
 Decide which processes should run on which
processors.
 Could also be called process allocation.
 We assume that any process can run on any
processor.
Processor Allocation
 Often the only difference between different
processors is:
• CPU speed
• CPU speed and amount of memory
 What if the processors are not homogeneous?
• Assume that we have binaries for all the different
architectures.
 What if not all machines are directly connected?
• Send process via intermediate machines
Processor Allocation
• If we have only PowerPC binaries, restrict the
process to PowerPC machines.
• If we need machines very close for fast
communication, restrict the processes to a group of
close machines.
 Can you move a running process or are
processor allocations done at process creation
time?
• Migratory allocation algorithms vs. non-migratory.
Processor Allocation
 What is the figure of merit, i.e. what do we
want to optimize in order to find the best
allocation of processes to processors?
• Similar to CPU scheduling in centralized operating
systems.
 Minimize response time is one possibility.
Processor Allocation
• We are not assuming all machines are equally fast.
 Consider two processes: P1 executes 100 million instructions, P2 executes 10 million instructions.
 Both processes enter the system at time t=0.
 Consider two machines: A executes 100 MIPS, B 10 MIPS.
 If we run P1 on A and P2 on B, each takes 1 second, so the average response time is 1 sec.
 If we run P1 on B and P2 on A, P1 takes 10 seconds and P2 takes .1 sec., so the average response time is 5.05 sec.
 If we run P2 then P1, both on A, they finish at times .1 and 1.1, so the average response time is .6 seconds!!
Processor Allocation
 Minimize response ratio.
• Response ratio is the time to run on some machine
divided by time to run on a standardized
(benchmark) machine, assuming the benchmark
machine is unloaded.
• This takes into account the fact that long jobs should
take longer.
 Maximize CPU utilization
 Throughput
• Jobs per hour
• Weighted jobs per hour
Processor Allocation
 If the weighting is CPU time, we get CPU utilization.
 This is the way to justify CPU utilization (user-centric).
• Design issues
 Deterministic vs. Heuristic
• Use deterministic for embedded applications, when
all requirements are known a priori.
 Patient monitoring in hospital
 Nuclear reactor monitoring
 Centralized vs. distributed
• We have a tradeoff of accuracy vs. fault tolerance
and bottlenecks.
Processor Allocation
 Optimal vs. best effort
• Optimal normally requires off line processing.
• Similar requirements as for deterministic.
• Usual tradeoff of system effort vs. result quality.
 Transfer policy
• Does a processor decide to shed jobs based just on its own load, or does it have (and use) knowledge of other loads?
• Also called local vs. global
• Usual tradeoff of system effort (gather data) vs.
result quality.
Processor Allocation
• Location policy
 Sender vs. receiver initiated.
• Sender initiated - uploading programs to a compute
server
• Receiver initiated - downloading Java applets
 Look for help vs. look for work.
 Both are done.
Processor Allocation
• Implementation issues
 Determining local load
• Normally use a weighted mean of recent loads with
more recent weighted higher.
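A sketch of one common form of this, an exponentially weighted moving average (the smoothing constant here is an arbitrary assumption):

```c
/* Weighted mean of recent load samples; more recent samples count
   more. ALPHA close to 1 forgets history quickly. */
#define ALPHA 0.7

static double smoothed_load = 0.0;

void record_load_sample(double sample) {
    smoothed_load = ALPHA * sample + (1.0 - ALPHA) * smoothed_load;
}
```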
Processor Allocation
• Example algorithms
• Min cut deterministic algorithm
 Define a graph with processes as nodes and IPC
traffic as arcs
 Goal: Cut the graph (i.e. some arcs) into pieces so that
• All nodes in one piece can be run on one processor
 Memory constraints
 Processor completion times
• Values on cut arcs are minimized
Processor Allocation
• Minimize the max
 minimize the maximum traffic for a process pair
• Minimize the sum
 minimize total traffic
 Minimize the sum to/from a piece
 don't overload a processor
• Minimize the sum between pieces
 minimize traffic for processor pair
 Tends to get hard as you get more realistic
Processor Allocation
• Up-down centralized algorithm
 A centralized table keeps "usage" data for each user, where the users are defined to be the workstation owners. Call this the score for the user.
 The goal is to give each user a fair share.
 When user requests a remote job, if a
workstation is available it is assigned.
 For each process a user has running remotely,
the user's score increases by a fixed amount
each time interval.
Processor Allocation
 When a user has an unsatisfied request pending (and
none being satisfied), the score decreases (it can go
negative).
 If no requests are pending and none are being satisfied,
the score is bumped towards zero.
 When a processor becomes free, assign it to a
requesting user with the lowest score.
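A compact sketch of this bookkeeping (the table size, the increment, and the array indexing by user are illustrative assumptions):

```c
#define NUSERS 100
#define DELTA  1

static int score[NUSERS];          /* the per-user score; may go negative */
static int running_remote[NUSERS]; /* processes user u has running remotely */
static int pending[NUSERS];        /* unsatisfied requests of user u */

/* Called once per time interval to update every user's score. */
void tick(void) {
    for (int u = 0; u < NUSERS; u++) {
        if (running_remote[u] > 0)
            score[u] += DELTA * running_remote[u];  /* using processors */
        else if (pending[u] > 0)
            score[u] -= DELTA;                      /* waiting, unserved */
        else if (score[u] > 0)
            score[u]--;                             /* bump toward zero */
        else if (score[u] < 0)
            score[u]++;
    }
}

/* When a processor frees up, give it to the requesting user with the
   lowest score; returns -1 if nobody is waiting. */
int pick_user(void) {
    int best = -1;
    for (int u = 0; u < NUSERS; u++)
        if (pending[u] > 0 && (best < 0 || score[u] < score[best]))
            best = u;
    return best;
}
```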
Processor Allocation
• Hierarchical algorithm
 Goal - assign multiple processors to a job
 Quick idea of algorithm
• Processors arranged in a tree
• Requests go up the tree until a subtree has enough resources
• Request is split and parts go back down the tree
 Arrange processors in a hierarchy (tree)
• This is a logical tree, independent of how the machines are physically connected
• Each node keeps (imperfect) track of how many available
processors are below it.
 If a processor can run more than one process, must be more
sophisticated and must keep track of how many processes can be
allocated (without overload) in the subtree below.
Processor Allocation
• If a new request appears in the tree, the current node sees if it
can be satisfied by the processors below (plus itself).
 If so, do it.
 If not, pass the request up the tree.
 Actually, since machines may be down or the data on availability may be out of date, you try to find more processors than requested.
• Once a request has gone high enough to be satisfied, the
current node splits the request into pieces and sends each piece
to appropriate child.
• What if a node dies?
 Promote one of its children, say C.
 Now C's children are peers with the previous peers of C
Processor Allocation
 If this is considered too unbalanced, we can promote one of C's children to take C's place.
• How can we decide which child C to promote?
 Peers of dead node have an election
 Children of dead node have an election
 Parent of dead node decides
• What if the root dies?
 Must use children since no peers or parent
 If we want to use peers, then we do not have a single root
 I.e. the top level of the hierarchy is a collection of roots that
communicate. This is a forest, not a tree
 What if multiple requests are generated simultaneously?
Processor Allocation
• This gets hard fast, as information gets stale and race conditions and deadlocks become possible.
• Distributed heuristic algorithm
 Goal - find a lightly loaded processor to migrate the job to
 Send a probe to a random processor
 If the remote load is low, ship the job
 If the remote load is high, try another random probe
 After k (a parameter of the implementation) probes all say the load is too high, give up and run the job locally.
 Modelled analytically and seen to work fairly well
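A sketch of that probing loop (probe_load(), ship_job(), random_host(), the threshold, and K are all hypothetical placeholders):

```c
#define K 3                    /* give up after K probes */
#define LOAD_THRESHOLD 0.5

double probe_load(int host);          /* hypothetical: ask host its load */
void   ship_job(int host, int job);   /* hypothetical: migrate the job */
int    random_host(void);             /* hypothetical: pick a random host */
void   run_locally(int job);

void place_job(int job) {
    for (int i = 0; i < K; i++) {
        int host = random_host();
        if (probe_load(host) < LOAD_THRESHOLD) {
            ship_job(host, job);      /* remote load is low: ship it */
            return;
        }
    }
    run_locally(job);                 /* all K probes reported high load */
}
```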
Scheduling
 General goal is to have processes that
communicate frequently run simultaneously
 If they don’t and we use busy waiting for
messages, we will have a huge disaster.
 Even if we use context switching, we may have a small disaster, as only one message transfer can occur per scheduling time slot.
 Co-scheduling (a.k.a. gang scheduling).
Processes belonging to a job are scheduled
together
Scheduling
• Time slots are coordinated among the processors.
• Some slots are for gangs; other slots are for regular
processes.