CS 519 -- Operating Systems -
Download
Report
Transcript CS 519 -- Operating Systems -
CS 519: Lecture 2
Processes, Threads, and Synchronization
Assignment 1
Design and implement a threads package (details left out on
purpose!)
Check the Web page and the newsgroup for further details
Groups of 3 students. Anybody still looking for a group?
Due 2/12/2001 at 5pm
CS 519
2
Operating System Theory
Introduction: Von Neuman Model
Both text (program) and data reside in
memory
Execution cycle
Fetch instruction
Decode instruction
Execute instruction
CPU
OS is just a program
OS text and data reside in memory too
Invoke OS functionality through
procedure calls
CS 519
3
Memory
Operating System Theory
Execution Mode
Most processors support at least two modes of execution for
protection reasons
Privileged - kernel-mode
Non-privileged - user-mode
The portion of the OS that executes in kernel-mode is called
the kernel
Access hardware resources
Protected from interference by user programs
Code running in kernel-mode can do anything - no protection
User code executes in user-mode
OS functionality that does not need direct access to hardware
may run in user-mode
CS 519
4
Operating System Theory
Interrupts and traps
Interrupt: an asynchronous event
External events which occur independently of the instruction
execution in the processor, e.g. DMA completion
Can be masked (specifically or not)
Trap: a synchronous event
Conditionally or unconditionally caused by the execution of the
current instruction, e.g. floating point error, trap instruction
Interrupt and trap events are predefined (typically as integers)
Each interrupt and trap has an associated interrupt vector
Interrupt vector specifies a handler that should be called when the
event occurs
Interrupts and traps force the processor to save the current
state of execution and jump to the handler
CS 519
5
Operating System Theory
Process
An “instantiation” of a program
System abstraction – the set of resources required for executing
a program
Execution context(s)
Address space
File handles, communication endpoints, etc.
Historically, all of the above “lumped” into a single abstraction
More recently, split into several abstractions
Threads, address space, protection domain, etc.
OS process management:
Supports creation of processes and interprocess communication (IPC)
Allocates resources to processes according to specific policies
Interleaves the execution of multiple processes to increase system
utilization
CS 519
6
Operating System Theory
Process Image
The physical representation of a process in the OS
A process control structure includes
Identification info: process, parent process, user
Control info: scheduling (state, priority), resources (memory,
opened files), IPC
Execution contexts – threads
An address space consisting of code, data, and stack
segments
CS 519
7
Operating System Theory
Process Address Space
A:
OS
Code
C
Globals
B
B:
A’s activation
frame
Stack
…
C()
…
C:
Heap
CS 519
…
B()
…
8
Operating System Theory
Virtual Memory
Process addresses are virtual memory addresses
Mapping from virtual memory to physical memory
Details next lecture
Translation: Translation Look-aside Buffer (TLB), page table
Will cover more next time …
CS 519
9
Operating System Theory
User Mode
When running in user-mode, a process can only access
its virtual memory and processor resources
(registers) directly
All other resources can only be accessed through the
kernel by “calling the system”
System call
A system call is a call because it looks like a procedure call
It’s a system call because, in reality, it’s a software trap
Why is a system call a trap instead of a procedure
call?
CS 519
10
Operating System Theory
User Mode
When running in user-mode, a process can only access
its virtual memory and processor resources
(registers) directly
All other resources can only be accessed through the
kernel by “calling the system”
System call
A system call is a call because it looks like a procedure call
It’s a system call because, in actuality, it’s a software trap
Why is a system call a trap instead of a procedure
call? Need to switch to kernel mode
CS 519
11
Operating System Theory
Mode switching
One or several bits in the processor status word (PSW) indicate
the current mode of execution
Sometimes also the previous mode
(Other bits in PSW: interrupt enabled/disabled flags, alignment
check bit, arithmetic overflow bit, parity bit, carry bit, etc.)
Mode bits can be changed explicitly while in kernel-mode but
not in user-mode
The processor status word can be loaded from the interrupt
vector along with the address of the interrupt handler
Setting of kernel mode bit in the interrupt vector allows
interrupt and trap handlers to be “automatically” executed in
kernel mode
CS 519
12
Operating System Theory
System Call In Monolithic OS
interrupt vector for trap instruction
PC
PSW
in-kernel file system(monolithic OS)
code for read system call
kernel mode
trap
iret
user mode
read(....)
CS 519
13
Operating System Theory
Signals
OS may need to “upcall” into user processes
Signals
UNIX mechanism to upcall when an event of interest occurs
Potentially interesting events are predefined: e.g.,
segmentation violation, message arrival, kill, etc.
When interested in “handling” a particular event (signal), a
process indicates its interest to the OS and gives the OS a
procedure that should be invoked in the upcall.
CS 519
14
Operating System Theory
Signals (Cont’d)
When an event of interest occurs
the kernel handles the event first,
then modifies the process’s stack to
look as if the process’s code made a
procedure call to the signal handler.
When the user process is scheduled
next it executes the handler first
From the handler the user process
returns to where it was when the
event occurred
CS 519
15
Handler
B
B
A
A
Operating System Theory
Process Creation
How to create a process? System call.
In UNIX, a process can create another process using the fork()
system call
int pid = fork()
The creating process is called the parent and the new process is
called the child
The child process is created as a copy of the parent process
(process image and process control structure) except for the
identification and scheduling state
Parent and child processes run in two different address spaces
by default no memory sharing
Process creation is expensive because of this copying
CS 519
16
Operating System Theory
Example of Process Creation Using Fork
The UNIX shell is command-line interpreter whose basic
purpose is for user to run applications on a UNIX system
cmd arg1 arg2 ... argn
CS 519
17
Operating System Theory
Inter-Process Communication
Most operating systems provide several abstractions
for inter-process communication: message passing,
shared memory, etc
Communication requires synchronization between
processes (i.e. data must be produced before it is
consumed)
Synchronization can be implicit (message passing) or
explicit (shared memory)
Explicit synchronization can be provided by the OS
(semaphores, monitors, etc) or can be achieved
exclusively in user-mode (if processes share memory)
CS 519
18
Operating System Theory
Inter-Process Communication
More on shared memory and message passing later
Synchronization after we talk about threads
CS 519
19
Operating System Theory
Threads
Why limit ourselves to a single execution context?
For example, have to use select() to deal with multiple outstanding
events. Having multiple execution contexts is more natural.
Multiprocessor systems
Multiple execution contexts threads
All the threads of a process share the same address space and
the same resources
Each thread contains
An execution state: running, ready, etc..
An execution context: PC, SP, other registers
A per-thread stack
CS 519
20
Operating System Theory
Process Address Space Revisited
OS
OS
Code
Code
Globals
Globals
Stack
Stack
Stack
Heap
Heap
(a) Single-threaded address space (b) Multi-threaded address space
CS 519
21
Operating System Theory
Threads vs. Processes
Why multiple threads?
Can’t we use multiple processes to do whatever it is that we do with
multiple threads?
Sure, but we would need to be able to share memory (and other resources)
between multiple processes
This sharing is directly supported by threads – see later in the lecture
Operations on threads (creation, termination, scheduling, etc) are
cheaper than the corresponding operations on processes
This is because thread operations do not involve manipulations of other
resources associated with processes
Inter-thread communication is supported through shared memory
without kernel intervention
Why not? Have multiple other resources, why not execution
contexts
CS 519
22
Operating System Theory
Process state diagram
ready
(in
memory)
swap in
swap out
suspended
(swapped out)
CS 519
23
Operating System Theory
Thread State Diagram
thread scheduling
dispatch
ready
running
timeout
event occurred
suspend
activate
suspend
suspended
(swapped out)
wait for
event
blocked
process
scheduling
CS 519
24
Operating System Theory
Thread Switching
Typically referred to as context switching
Context switching is the act of taking a thread off of
the processor and replacing it with another one that
is waiting to run
A context switch takes place when
Time quota allocated to the executing thread expires
The executing thread performs a blocking system call
A memory fault due to a page miss
CS 519
25
Operating System Theory
Context Switching
How to do a context switch?
Very carefully!!
Save state of currently executing thread
Copy all “live” registers to thread control block
Need at least 1 scratch register – points to area of memory
in thread control block that registers should be saved to
Restore state of thread to run next
Copy values of live registers from thread control block to
registers
How to get into and out of the context switching
code?
CS 519
26
Operating System Theory
Context Switching
OS is just code stored in memory, so …
Call context switching subroutine
The subroutine saves context of current thread, restores context
of the next thread to be executed, and returns
The subroutine returns in the context of another (the next) thread!
Eventually, will switch back to the current thread
To thread, it appears as if the context switching subroutine just
took a long while to return
CS 519
27
Operating System Theory
Thread Switching (Cont’d)
What if switching to a thread of a different
process? Context switch to process.
Implications for caches, TLB, page table, etc.?
Caches
Physical addresses: no problem
Virtual addresses: cache must either have process tag or must
flush cache on context switch
TLB
Each entry must have process tag or flush TLB on context switch
Page table
Typically have page table pointer (register) that must be reloaded
on context switch
CS 519
28
Operating System Theory
Process Swapping
May want to swap out entire process
Thrashing if too many processes competing for resources
To swap out a process
Suspend all of its threads
Must keep track of whether thread was blocked or ready
Copy all of its information to backing store (except for PCB)
To swap a process back in
Copy needed information back into memory, e.g. page table,
thread control blocks
Restore each thread to blocked or ready
Must check whether event(s) has (have) already occurred
CS 519
29
Operating System Theory
Threads & Signals
(Usual Unix Implementation)
A signal is directed to a single thread (we don’t want a signal
delivered multiple times), but the handler is associated with the
process
Signals can be ignored/masked by threads. Common strategy:
one thread responsible to field all signals (calls sigwait() in an
infinite loop); all other threads mask all signals
What happens if kernel wants to signal a process when all of its
threads are blocked? When there are multiple threads, which
thread should the kernel deliver the signal to?
OS writes into PCB that a signal should be delivered to the process
If one thread is responsible for all signals, it is scheduled to run
and fields the signal
Otherwise, the next time any thread that does not mask the signal
is allowed to run, the signal is delivered to that thread as part of
the context switch
CS 519
30
Operating System Theory
POSIX Threads (Pthreads) API Standard
thread creation and termination
pthread_create(&tid,NULL,start_fn,arg);
pthread_exit(status)’
thread join
pthread_join(tid, &status);
mutual exclusion
pthread_mutex_lock(&lock);
pthread_mutex_unlock(&lock);
condition variable
pthread_cond_wait(&c,&lock);
pthread_cond_signal(&c);
CS 519
31
Operating System Theory
Thread Implementation
Kernel-level threads (lightweight processes)
Kernel sees multiple execution contexts
Thread management done by the kernel
User-level threads
Implemented as a thread library which contains the code for
thread creation, termination, scheduling and switching
Kernel sees one execution context and is unaware of thread
activity
Can be preemptive or not
CS 519
32
Operating System Theory
User-Level vs. Kernel-Level Threads
Advantages of user-level threads
Performance: low-cost thread operations and context
switching, since it is not necessary to go into the kernel
Flexibility: scheduling can be application-specific
Portability: user-level thread library easy to port
Disadvantages of user-level threads
If a user-level thread is blocked in the kernel, the entire
process (all threads of that process) are blocked
Cannot take advantage of multiprocessing (the kernel assigns
one process to only one processor)
CS 519
33
Operating System Theory
User-Level vs. Kernel-Level Threads
user-level
threads
kernel-level
threads
threads
thread
scheduling
user
kernel
threads
process
thread
scheduling
process
scheduling
processor
processor
CS 519
process
scheduling
34
Operating System Theory
User-Level vs. Kernel-Level Threads
No reason why we shouldn’t
have both (e.g. Solaris)
Most systems now support
kernel threads
User-level threads are
available as linkable libraries
user-level
threads
thread
scheduling
user
kernel
kernel-level
threads
thread
scheduling
process
scheduling
processor
CS 519
35
Operating System Theory
Kernel Support for User-Level Threads
Even kernel threads are not quite the right
abstraction for supporting user-level threads
Mismatch between where the scheduling information
is available (user) and where scheduling on real
processors is performed (kernel)
When the kernel thread is blocked, the corresponding
physical processor is lost to all user-level threads although
there may be some ready to run.
CS 519
36
Operating System Theory
Why Kernel Threads Are Not The Right
Abstraction
user-level threads
user-level scheduling
user
kernel
kernel thread
blocked
kernel-level scheduling
physical processor
CS 519
37
Operating System Theory
Scheduler Activations
Kernel allocates processors to processes as scheduler
activations
Each process contains a user-level thread system
which controls the scheduling of the allocated
processors
Kernel notifies a process whenever the number of
allocated processors changes or when a scheduler
activation is blocked due to the user-level thread
running on it
The process notifies the kernel when it needs more
or fewer scheduler activations (processors)
CS 519
38
Operating System Theory
User-Level Threads On Top of
Scheduler Activations
user-level threads
user-level scheduling
blocked
active
user
kernel
scheduler activation
kernel-level scheduling
blocked active
physical processor
CS 519
39
Operating System Theory
Thread/Process Operation Latencies
Operation
User-level
Threads
(s)
Kernel
Threads
(s)
Processes
(s)
Null fork
34
948
11,300
Signal-wait
37
441
1,840
VAX uniprocessor running UNIX-like OS, 1992.
CS 519
40
Operating System Theory
Synchronization
Why synchronization?
Problem
Threads share data
Data integrity must be maintained
Example
Transfer $10 from account A to account B
A A + 10
B B - 10
If was taking snapshot of account balances, don’t want to
read A and B between the previous two statements
CS 519
41
Operating System Theory
Terminology
Critical section: a section of code which reads or
writes shared data
Race condition: potential for interleaved execution of
a critical section by multiple threads
Results are non-deterministic
Mutual exclusion: synchronization mechanism to avoid
race conditions by ensuring exclusive execution of
critical sections
Deadlock: permanent blocking of threads
Livelock: execution but no progress
Starvation: one or more threads denied resources
CS 519
42
Operating System Theory
Requirements for Mutual Exclusion
No assumptions on hardware: speed, # of processors
Execution of CS takes a finite time
A thread/process not in CS cannot prevent other
threads/processes to enter the CS
Entering CS cannot de delayed indefinitely: no
deadlock or starvation
CS 519
43
Operating System Theory
Synchronization Primitives
Most common primitives
Mutual exclusion
Condition variables
Semaphores
Monitors
Need
Semaphores
Mutual exclusion and condition variables
CS 519
44
Operating System Theory
Mutual Exclusion
Mutual exclusion used when want to be
only thread modifying a set of data items
Have three components:
Lock, Unlock, Waiting
Example: transferring $10 from A to B
Lock(A)
Lock(B)
A A + 10
B B - 10
Unlock(B)
Unlock(A)
CS 519
Function Transfer (Amount, A, B)
Lock(Transfer_Lock)
A A + 10
B B - 10
Unlock(Transfer_Lock)
45
Operating System Theory
Implementation of Mutual Exclusion
Many read/write algorithms for mutual exclusion
See any OS book
Disadvantages: difficult to get correct and not very general
Simple with a little help from the hardware
Atomic read-modify-write instructions
Test-and-set
Fetch-and-increment
Compare-and-swap
CS 519
46
Operating System Theory
Atomic Read-Modify-Write Instructions
Test-and-set
Read a memory location and, if the value is 0, set it to 1 and
return true. Otherwise, return false
Fetch-and-increment
Return the current value of a memory location and increment
the value in memory by 1
Compare-and-swap
Compare the value of a memory location with an old value,
and if the same, replace with a new value
CS 519
47
Operating System Theory
Simple Spin Lock
Test-and-set
Spin_lock(lock)
{
while (test-and-set(lock) == 0);
}
Spin_unlock(lock)
{
lock = 0;
}
CS 519
48
Operating System Theory
Load-Locked, Store-Conditional
Load-Linked, Store-Conditional (LLSC)
Pair of instructions may be easier to implement in hardware
Load-linked (or load-locked) returns the value of a memory
location
Store-conditional stores a new value to the same memory
location if the value of that location has not been changed
since the LL. Returns 0 or 1 to indicate success or failure
If a thread is switched out between an LL and an SC, then
the SC automatically fails
CS 519
49
Operating System Theory
What To Do While Waiting?
Spinning
Waiting threads keep testing location until it changes value
Doesn’t work on uniprocessor system
Blocking
OS or RT system deschedules waiting threads
Spinning vs. blocking becomes an issue in
multiprocessor systems (with > 1 thread/processor)
What is the main trade-off?
CS 519
50
Operating System Theory
What To Do While Waiting?
Spinning
Waiting threads keep testing location until it changes value
Doesn’t work on uniprocessor system
Blocking
OS or RT system deschedules waiting threads
Spinning vs. blocking becomes an issue in
multiprocessor systems (with > 1 thread/processor)
What is the main trade-off? Overhead of blocking vs.
expected waiting time
CS 519
51
Operating System Theory
Deadlock
CS 519
Lock A
Lock B
A
B
52
Operating System Theory
Deadlock
CS 519
Lock A
Lock B
A
B
53
Operating System Theory
Deadlock
CS 519
Lock A
Lock B
A
B
54
Operating System Theory
Deadlock (Cont’d)
Deadlock can occur whenever multiple parties are competing for
exclusive access to multiple resources
How can we deal deadlocks?
Deadlock prevention
Design a system without one of mutual exclusion, hold and wait, no
preemption or circular wait (four necessary conditions)
To prevent circular wait, impose a strict ordering on resources. For instance,
if need to lock variables A and B, always lock A first, then lock B
Deadlock avoidance
Deny requests that may lead to unsafe states (Banker’s algorithm)
Running the algorithm on all resource requests is expensive
Deadlock detection and recovery
Check for circular wait periodically. If circular wait is found, abort all
deadlocked processes (extreme solution but very common)
Checking for circular wait is expensive
CS 519
55
Operating System Theory
Condition Variables
A condition variable is always associated with a
condition and a lock
Typically used to wait for a condition to take on a
given value
Three operations:
cond_wait(lock, cond_var)
cond_signal(cond_var)
cond_broadcast(cond_var)
CS 519
56
Operating System Theory
Condition Variables
cond_wait(lock, cond_var)
Release the lock
Sleep on cond_var
When awaken by the system, reacquire the lock and return
cond_signal(cond_var)
If at least 1 thread is sleeping on cond_var, wake 1 up
Otherwise, no effect
cond_broadcast(cond_var)
If at least 1 thread is sleeping on cond_var, wake everyone up
Otherwise, no effect
CS 519
57
Operating System Theory
Producer/Consumer Example
Producer
Consumer
lock(lock_bp)
while (free_bp.is_empty())
cond_wait(lock_bp, cond_freebp_empty)
buffer free_bp.get_buffer()
unlock(lock_bp)
lock(lock_bp)
while (data_bp.is_empty())
cond_wait(lock_bp, cond_databp_empty)
buffer data_bp.get_buffer()
unlock(lock_bp)
… produce data in buffer …
… consume data in buffer …
lock(lock_bp)
data_bp.add_buffer(buffer)
cond_signal(cond_databp_empty)
unlock(lock_bp)
lock(lock_bp)
free_bp.add_buffer(buffer)
cond_signal(cond_freebp_empty)
unlock(lock_bp)
CS 519
58
Operating System Theory
Condition Variables
Condition variables are implemented using locks
Implementation is tricky because it involves multiple
locks and scheduling queue
Implemented in the OS or run-time thread systems
because they involve scheduling operations
Sleep/Wakeup
CS 519
59
Operating System Theory
Semaphores
Synchronized counting variables
A semaphore is comprised of:
An integer value
Two operations: P() and V()
P()
Decrement value
While value < 0, sleep
V()
Increments value
If there are any threads sleeping, waiting for value to
become zero, wakeup 1 thread
CS 519
60
Operating System Theory
Monitors
Semaphores have a few limitations: unstructured,
difficult to program correctly. Monitors eliminate
these limitations and are as powerful as semaphores
A monitor consists of a software module with one or
more procedures, an initialization sequence, and local
data (can only be accessed by procedures)
Only one process can execute within the monitor at
any one time (mutual exclusion)
Synchronization within the monitor implemented with
condition variables (wait/signal) => one queue per
condition variable
CS 519
61
Operating System Theory
Lock and Wait-free Synchronization (or
Mutual Exclusion Considered Harmful)
Problem: Critical sections
If a process is delayed or halted in a critical section, hard or
impossible for other processes to make progress
Lock-free (aka non-blocking) concurrent objects
Some process will complete an operation on the object in a
finite number of steps
Wait-free concurrent objects
Each process attempting to perform an operation on the
concurrent object will complete it in a finite number of steps
Essential difference between these two?
CS 519
62
Operating System Theory
Lock and Wait-free Synchronization (or
Mutual Exclusion Considered Harmful)
Problem: Critical sections
If a process is delayed or halted in a critical section, hard or
impossible for other processes to make progress
Lock-free (aka non-blocking) concurrent objects
Some process will complete an operation on the object in a
finite number of steps
Wait-free concurrent objects
Each process attempting to perform an operation on the
concurrent object will complete it in a finite number of steps
Essential difference between these two? Latter
definition guarantees lack of starvation
CS 519
63
Operating System Theory
Herlihy’s Paper
What’s the point?
Lock and wait-free synchronization can be very useful for
building multiprocessor systems; they provide progress
guarantees
Building lock and wait-free concurrent objects is HARD, so a
methodology for implementing these objects is required
Lock and wait-free synchronization provide guarantees but
can be rather expensive – lots of copying
M. Greenwald and D. Cheriton. The Synergy between
Non-blocking Synchronization and Operating System
Structure. In Proceedings of OSDI ’96, Oct, 1996.
CS 519
64
Operating System Theory
Single Word Protocol
Fetch_and_add_sync(obj: integer, v: integer) returns(integer)
success: boolean := false
loop exit when success
old: integer := obj
new: integer := old+v
success := compare_and_swap(obj, old, new)
end loop
return old
end Fetch_and_add_sync
Lock-free but not wait-free!!
CS 519
65
Operating System Theory
Small Objects
Small object: occupies 1 block (or less)
Basic idea of lock-free object implementation
1.
2.
3.
4.
Allocate a new block
Copy object to new block, making sure copy is consistent
Modify copied block, i.e. perform operation on copied block
Try to atomically replace pointer to old block with new one
Basic idea of wait-free object implementation
Same as before, except that processes should announce
their operations in a shared data structure
Any process in step 3 should perform all previously
announced operations, not just its own operations
CS 519
66
Operating System Theory
Single Word Protocol
Fetch_and_add_sync(obj: integer, v: integer) returns(integer)
success: boolean := false
loop exit when success
old: integer := obj
new: integer := old+v
// allocate, copy and modify
// single word so consistent
success := compare_and_swap(obj, old, new) // replace
end loop
return old
end Fetch_and_add_sync
Lock-free but not wait-free!!
CS 519
67
Operating System Theory
Large Objects
I’ll leave for you to read
Basic idea is to make local changes in the data
structure using small object protocol
Complicated but not different than locking: e.g., if
you lock an entire tree to modify some nodes,
performance will go down the drain
CS 519
68
Operating System Theory