Synchronization

Download Report

Transcript Synchronization

CSC 660: Advanced OS
Synchronization
CSC 660: Advanced Operating Systems
Slide #1
Topics
1.
2.
3.
4.
5.
6.
7.
8.
Race Conditions
Critical Sections
Kernel Concurrency
What needs Protection?
Atomic Operations
Spin Locks
Semaphores
Deadlock
CSC 660: Advanced Operating Systems
Slide #2
Race Condition Example
1. Process A read in, stores in local var slot.
2. Timer interrupt.
3. Scheduler switches to process B.
4. Process B reads in, stores in local var slot.
5. Process B stores filename in slot 7 (slot).
6. Process B updates in to be 8.
7. Scheduler eventually switches to process A.
8. Process A writes filename in slot 7 (slot).
9. Process A computes in = slot + 1.
10. Process A updates in to be 8.
CSC 660: Advanced Operating Systems
Slide #3
Race Condition Example
Process A
4
abc.pdf
5
prog.c
6
as2.txt
7
out = 4
in = 7
Process B
CSC 660: Advanced Operating Systems
Slide #4
Critical Sections
How can we prevent race conditions?
Prohibit multiple processes from accessing shared
resource at the same time.
Critical section: code that accesses shared resource.
CSC 660: Advanced Operating Systems
Slide #5
Critical Sections
1. No two processes may be simultaneously
inside their critical sections.
2. No assumptions may be made about speed
or number of CPUs.
3. No process running outside its critical
section may block other processes from
entering the critical section.
4. No process should have to wait forever to
enter its critical section.
CSC 660: Advanced Operating Systems
Slide #6
Critical Sections
CSC 660: Advanced Operating Systems
Slide #7
Why do we need atomicity?
Problem: Two processes incrementing i.
Uniprocessor Version
A: read i(7)
A: incr i(7 -> 8)
B: read i(8)
B: incr i(8 -> 9)
Process A
read i(7)
incr i(7 -> 8)
write i(8)
CSC 660: Advanced Operating Systems
Process B
read i(7)
incr i(7 -> 8)
write i(8)
Slide #8
Atomic Operations
Atomic operations are indivisible.
Process A
atomic_inc i (7->8)
Process B
atomic_inc i (8->9)
Provided by atomic_t in the kernel.
x86 assembly: lock byte preceding opcode makes atomic.
CSC 660: Advanced Operating Systems
Slide #9
Atomicity doesn’t provide Ordering
One atomic order of operations:
Process A
atomic_inc i (7->8)
Process B
atomic_inc i (8->9)
Another atomic order of operations:
Process A
Process B
atomic_inc i (8->9)
atomic_inc i (7->8)
CSC 660: Advanced Operating Systems
Slide #10
Locking
What if operation too complex to be atomic?
CPU can’t have a dedicated instruction for every
operation that might need to be atomic.
Ex: adding a node to a doubly-linked list.
Locking
Require that thread lock the object before access.
Other threads must wait until object unlocked.
Locking is simple enough to be atomic.
Ex: Acquire lock before modifying linked list.
CSC 660: Advanced Operating Systems
Slide #11
Locking Example with 5 Processes
CSC 660: Advanced Operating Systems
Slide #12
Deadlocks
Single process deadlock:
A: spin_lock(A)
A: spin_lock(A)
Two process deadlock:
Process A
Process B
down(A)
down(B)
down(B)
down(A)
CSC 660: Advanced Operating Systems
Slide #13
Deadlock Prevention
Lock Ordering
If you acquire multiple locks, you must always
acquire them in the same order.
Don’t double acquire a lock.
Always use interrupt-disabling variants of
spin_lock() in interrupt context.
Always release locks.
Be sure that your code will always make
appropriate calls to release locks.
CSC 660: Advanced Operating Systems
Slide #14
Sources of Concurrency
1.
2.
3.
4.
5.
Interrupts
Softirqs and tasklets
Kernel preemption
Sleeping
SMP
CSC 660: Advanced Operating Systems
Slide #15
Kernel Preemption
Processes can be preempted while in kernel functions.
Process A is running an exception handler. It can be
preempted if a higher priority process B becomes
runnable.
Process A is running an exception handler and its time
quantum expires. The scheduler will schedule another
process to run.
Preemption disabled when preempt_count > 0:
Interrupt handlers.
SoftIRQs and tasklets.
Code that explicitly sets preempt_count.
CSC 660: Advanced Operating Systems
Slide #16
What needs Protection?
What needs protection?
Global kernel data structures.
Shared data between process/interrupt context.
Shared data between interrupt handlers.
What doesn’t?
Data local to executing thread (i.e., stack.)
Local variables.
Dynamically allocated data w/ only local ptrs.
Per-CPU data doesn’t need SMP protection.
CSC 660: Advanced Operating Systems
Slide #17
Where is Synchronization Unnecessary?
• An interrupt handler cannot be interrupted by
the same interrupt that it handles.
• Kernel control path performing interrupt
handling cannot be interrupted by a bottom
half or system call service routine.
• SoftIRQs and tasklets cannot be interleaved
on the same CPU.
• The same tasklet cannot be executed
simultaneously on multiple CPUs.
CSC 660: Advanced Operating Systems
Slide #18
Kernel Synchronization Techniques
Technique
Per-CPU vars
Description
Scope
Each CPU has data All CPUs
Atomic operation
Memory barrier
Spin lock
Atomic operation
Avoid re-ordering
Lock w/ busy wait
All CPUs
Local or All CPUs
All CPUs
Semaphore
Seqlocks
Interrupt disabling
SoftIRQ disabling
Lock w/ sleep wait
Lock on access ctr
cli on a single CPU
Forbid SoftIRQs
All CPUs
All CPUs
Local CPU
Local CPU
Read-copy-update
(RCU)
Lock-free access to All CPUs
shared data via ptrs.
CSC 660: Advanced Operating Systems
Slide #19
Per-CPU Variables
One variable exists for each CPU in system.
Avoid need for synchronization between CPUs.
Potential sources of concurrency
Interrupts
Kernel-preemption
Must disable kernel-preemption
What if process pre-empted after initial read,
then moved to another CPU before write?
CSC 660: Advanced Operating Systems
Slide #20
Per-CPU Variables
int cpu;
/* Disable kernel preemption and
set cpu to current processor # */
cpu = get_cpu();
/* Manipulate Per-CPU data */
put_cpu();
CSC 660: Advanced Operating Systems
Slide #21
Atomic Operations
atomic_t guarantees atomic operations
atomic_t v;
atomic_t u = ATOMIC_INIT(0);
Atomic operations
atomic_set(&v, 4);
atomic_add(2, &v);
atomic_inc(&v);
printk(“%d\n”, atomic_read(&v));
atomic_dec_and_test(&v);
CSC 660: Advanced Operating Systems
Slide #22
Barriers provide Ordering
Optimization Barriers
Prevent compiler from re-ordering instructions.
Compiler doesn’t know when interrupts or other
processors may read/write your data.
Kernel provides barrier() macro.
Memory Barriers
Read/write barriers prevent loads/stores from
being re-ordered across barrier.
Kernel provides rmb(), wmb() macros.
All syncronization primitives act as barriers.
CSC 660: Advanced Operating Systems
Slide #23
Using a Spin Lock
spinlock_t my_lock =
SPIN_LOCK_UNLOCKED;
spin_lock(&my_lock);
/* critical section */
spin_unlock(&my_lock);
CSC 660: Advanced Operating Systems
Slide #24
Spin Locks
If lock “open”
Sets lock bit with atomic test-and-set.
Continues into critical section.
else lock “closed”
Code “spins” in busy wait loop until available.
Waits are typically much less than 1ms.
Kernel-preemption can run other processes while
task is busy waiting.
CSC 660: Advanced Operating Systems
Slide #25
spinlock_t
slock
Spin lock state.
1 is unlocked.
<1 is locked.
break_lock
Flag signals that task is busy waiting for this lock.
CSC 660: Advanced Operating Systems
Slide #26
Inside a Spin Lock
1. Disables kernel pre-emption.
2. Atomic test-and-sets lock.
3. If old value positive
Lock acquired.
4. Else
Enables pre-emption.
If break_lock is 0, sets to 1 to indicate a task is waiting.
Busy wait loop
while (spin_is_locked(lock))
cpu_relax(); # pause instruction on P4
Goto 1.
CSC 660: Advanced Operating Systems
Slide #27
Spin Lock Functions
spin_lock_init(spinlock_t *lock)
Initialize spin lock to 1 (unlocked).
spin_lock(spinlock_t *lock)
Spin until lock becomes 1, then set to 0 (locked).
spin_lock_irqsave(spinlock_t *l, u flags)
Like spin_lock() but disables and saves interrupts.
Always use an IRQ disabling variant in interrupt context.
spin_unlock(spinlock_t *lock)
Set spin lock to 1 (unlocked).
spin_lock_irqrestore(spinlock_t *l, u flags)
Like spin_lock(), but restores interrupt status.
spin_trylock(spinlock_t *lock)
Set lock to 0 if unlocked and return 1; return 0 if locked.
CSC 660: Advanced Operating Systems
Slide #28
Read/Write Spinlocks
•
•
•
•
Multiple readers can acquire lock simultaneously.
Only one writer can have the lock at a time.
Increase concurrency by allowing many readers.
Example use: task list
CSC 660: Advanced Operating Systems
Slide #29
Seqlocks
Read/write lock where writers never wait.
Seqlock = spinlock + sequence counter.
typedef struct {
unsigned sequence;
spinlock_t lock;
} seqlock_t
Writers gain immediate access.
Readers have to check sequence before and
after their read of the data to detect writes.
CSC 660: Advanced Operating Systems
Slide #30
Using Seqlocks
Writers
write_seqlock();
/* ... critical section ... */
write_sequnlock();
Readers
unsigned int seq;
do {
seq = read_seqbegin(&seqlock);
/* ... critical section ... */
} while (read_seqretry(&seqlock, seq));
CSC 660: Advanced Operating Systems
Slide #31
Semaphores
If semaphore “open”
Task acquires semaphore.
else
Task placed on wait queue and sleeps.
Task awakened when semaphore released.
CSC 660: Advanced Operating Systems
Slide #32
Semaphores
Integer value S with atomic access.
If S>0, semaphore prevents access.
Using a semaphore for mutual exclusion:
down(S);
/* critical section */
up(S);
CSC 660: Advanced Operating Systems
Slide #33
Semaphores
Down (P): Request to enter critical region.
If S > 0, decrements S, enters region.
Else process sleeps until semaphore is released.
Up (V): Request to exit critical region.
Increments S.
If S > 0, wakes sleeping processes.
CSC 660: Advanced Operating Systems
Slide #34
Linux Semaphores
DECLARE_MUTEX(sem);
Static declares a mutex semaphore.
void init_MUTEX(struct semaphore *sem);
Dynamic declaration of a mutex semaphore.
void down(struct semaphore *sem);
Decrements semaphore and sleeps.
int down_interruptible(struct semaphore *sem);
Same as down() but returns on user interrupt.
int down_trylock(struct semaphore *sem);
Same as down() but returns immediately if not available.
void up(struct semaphore *sem);
Releases semaphore.
CSC 660: Advanced Operating Systems
Slide #35
Linux Semaphores
#include <asm/semaphore.h>
struct semaphore sem;
init_MUTEX(&sem);
if (down_interruptible(&sem))
return –ERESTARTSYS; /* user interrupt */
/*
* critical section
*/
up(&sem);
CSC 660: Advanced Operating Systems
Slide #36
Read/Write Semaphores
One writer or many readers can hold lock.
static DECLARE_RWSEM(my_rwsem);
down_read(&my_rwsem);
/* critical section (read only) */
up_read(&my_rwsem);
down_write(&my_rwsem);
/* critical section (read/write) */
up_write(&my_rwsem);
CSC 660: Advanced Operating Systems
Slide #37
Spin Locks vs. Semaphores
Spin Locks
Busy waits waste CPU cycles.
Can use in interrupt context, as does not sleep.
Cannot use when code sleeps while holding lock.
Use for locks held a short time.
Semaphores
Context switch on sleep is expensive.
Sleeps, so cannot use in interrupt context.
Can use when code sleeps while holding lock.
Use for locks that held a long time.
CSC 660: Advanced Operating Systems
Slide #38
Read-Copy Update (RCU)
Lock-free synchronization technique.
Allows multiple readers and writers at once.
RCU can only be used when
All data structures are dynamically allocated and
referenced by pointers.
No kernel control path can sleep in critical
region protected by RCU.
CSC 660: Advanced Operating Systems
Slide #39
Read-Copy Update (RCU)
Reader
rcu_read_lock(); /* disables kernel preemption */
/* Critical region (read only) */
rcu_read_unlock();
Writer
Makes a copy of data structure.
Modifies the copy.
Switches pointer to point to copy.
Deallocates old struct when all readers are done.
CSC 660: Advanced Operating Systems
Slide #40
Key Points
• Synchronization essential for using shared kernel data.
– Sources of concurrency: interrupts, bottoms, kernel preemption, SMP
– Atomic operations required to provide synchronization.
• Kernel provides two major types w/ RW variants
– Spin Locks (non-sleeping busy wait lock)
– Semaphores (sleeping lock)
• 2.6 kernel provides new synchronization constructs
– Seqlocks
– RCU
• System performance depends strongly on the type of
synchronization used.
• Locks must be acquired in consistent order to avoid
deadlocks.
CSC 660: Advanced Operating Systems
Slide #41
References
1.
2.
3.
4.
5.
6.
Daniel P. Bovet and Marco Cesati, Understanding the
Linux Kernel, 3rd edition, O’Reilly, 2005.
Johnathan Corbet et. al., Linux Device Drivers, 3rd
edition, O’Reilly, 2005.
Robert Love, Linux Kernel Development, 2nd edition,
Prentice-Hall, 2005.
Claudia Rodriguez et al, The Linux Kernel Primer,
Prentice-Hall, 2005.
Peter Salzman et. al., Linux Kernel Module Programming
Guide, version 2.6.1, 2005.
Andrew S. Tanenbaum, Modern Operating Systems, 3rd
edition, Prentice-Hall, 2005.
CSC 660: Advanced Operating Systems
Slide #42