Transcript ppt

CS510 Advanced OS Seminar
Class 10
A Methodology for Implementing
Highly Concurrent Data Objects
by Maurice Herlihy
Motivation

A concurrent object is a data structure shared by concurrent
processes
o
o
o

It is blocking if delays in one process can cause delays in others
(all processes might fail to complete)
Non-blocking means some process will complete in finite number of
steps
Wait-free means it is free from starvation, ie. All processes will
complete in a finite number of steps
General methodology for constructing non-blocking and waitfree concurrent objects
o
o
Automatic transformation from sequential implementation
Using universal primitives such as LL/SC or CAS
•
CAS algorithms are more complex and less efficient than LL/SC ones
CS533 - Concepts of Operating Systems
2
Overview

Methodology
o
o
o
Start with sequential objects and operations
Apply synchronization and memory management algorithms
to transform sequential objects and operations into
concurrent objects and operations
Simple enough to be applied by a compiler
CS533 - Concepts of Operating Systems
3
Small objects


Slightly different approach for small and large
objects
Small objects
o
o

can be copied efficiently
object occupies a fixed size contiguous region of memory
called a “block”
Restrictions
o
Sequential operations must be “total”
•
ie. well-defined for all states of the object
CS533 - Concepts of Operating Systems
4
Small object transformation

The basic idea
o
o
o
o
Load-linked pointer to current version
Copy version to new (local) block of memory
Apply the sequential operation to the copy
Store-conditional pointer to current version to point to new
version
•
o
retry (to load-linked) on failure
Reclaim memory of old version
CS533 - Concepts of Operating Systems
5
Problem!

What if you are slow and another process reclaims
and reuses the memory while you are making your
copy?
o
o
o
o
You may generate an inconsistent copy
During your modifications you may later dereference a
corrupted pointer
You may get a divide by zero error …
All before you reach the store-conditional that would let
you know to fail and retry
CS533 - Concepts of Operating Systems
6
Solution


Check the consistency of the copy before you use it
But how can you know?
o
Use two checks
•
•
•
Updaters increment check 0 before updating and increment
check 1 after updating
– The two checks have the same value unless an update is in
progress
Copiers read check 1 before copying and check 0 after copying
– If they are the same then the copy is consistent
– If not retry the copy
This is a classic application for memory barriers
CS533 - Concepts of Operating Systems
7
Concurrent priority queue
Load linked queue pointer
CS533 - Concepts of Operating Systems
8
Concurrent priority queue
Read first part of consistency check
CS533 - Concepts of Operating Systems
9
Concurrent priority queue
Copy object
CS533 - Concepts of Operating Systems
10
Concurrent priority queue
Read second part of consistency check
CS533 - Concepts of Operating Systems
11
Concurrent priority queue
If object is consistent
CS533 - Concepts of Operating Systems
12
Concurrent priority queue
Perform sequential operation on copy
CS533 - Concepts of Operating Systems
13
Concurrent priority queue
Commit changes by overwriting queue
pointer with pointer to new version.
Exit loop on success … retry on failure
CS533 - Concepts of Operating Systems
14
Concurrent priority queue
Take ownership of old queue’s memory to
replace the memory given away for new
version.
CS533 - Concepts of Operating Systems
15
Performance evaluation


How expensive is the copying overhead?
What kind of contention characteristics does this
approach have?
CS533 - Concepts of Operating Systems
16
Benchmark: million enqueue/dequeue pairs
CS533 - Concepts of Operating Systems
17
Performance results
CS533 - Concepts of Operating Systems
18
Contention and fairness
CS533 - Concepts of Operating Systems
19
Problem


How to reduce contention?
Introduce delay via exponential back-off
o
Like in spin-locks
CS533 - Concepts of Operating Systems
20
Concurrent priority queue with backoff
Set wait time
Busy wait
CS533 - Concepts of Operating Systems
21
Performance impact back-off
CS533 - Concepts of Operating Systems
22
More problems

Performance
o

Requires backoff to reduce contention
Fairness
o
Non-blocking algorithm is open to starvation
•
•
o
Especially of enqueue’s which are slower (longer) and
frequently have to retry
If a short fast thing runs concurrently with a long slow thing,
the short fast thing nearly always wins and the long slow thing
nearly always has to retry
– It may never complete!
How can we make the approach wait-free?
CS533 - Concepts of Operating Systems
23
Wait-free algorithm

Processes register their attempted invocations in an
n-element “announce” array
o
o
o

Processes register their results in an n-element
“responses” array
o
o

one element per process
fields are <operation-name> <arguments> <toggle>
toggle field is complemented on each invocation
one element per process
fields are <result> <value> <toggle>
Processes help each other complete invocations by
running the “apply” function before performing own
operation
CS533 - Concepts of Operating Systems
24
Apply function – basic idea

Before performing an operation
o
o

look to see if any other processes have operations in
progress
if so, try to perform those operations for them and place
the result somewhere they can pick it up
Guarantees that whoever finishes first will succeed
in performing every process’ operations
o
o
no processes operations can be starved
sounds like lots of wasted work and lots of contention!
CS533 - Concepts of Operating Systems
25
Apply function
For all processes
CS533 - Concepts of Operating Systems
26
Apply function
Are any other operations
active concurrently?
CS533 - Concepts of Operating Systems
27
Apply function
If so, complete the
operation on behalf of
the other process (in
case its slower than you
and would be forced to
retry by the operation
you are about to do)
CS533 - Concepts of Operating Systems
28
Apply function
Mark operation as having
completed
CS533 - Concepts of Operating Systems
29
Wait-free concurrent pqueue
CS533 - Concepts of Operating Systems
30
Wait-free concurrent pqueue
Announce your attempted operation
CS533 - Concepts of Operating Systems
31
Wait-free concurrent pqueue
Distinguish it from the last one, which
has a result already registered
(… is one bit enough??)
CS533 - Concepts of Operating Systems
32
Wait-free concurrent pqueue
Check (twice!) that someone else has
not completed the operation for you??
(… hmm, why does it help to read this
twice?)
CS533 - Concepts of Operating Systems
33
Wait-free concurrent pqueue
Apply pending operations ??
… is there code missing here???
Should include apply(announce, new_pqueue);
CS533 - Concepts of Operating Systems
34
Issues with the wait-free approach


Aside from the bugs in this tech report version of
the paper ….
Avoids starvation by having all processes attempt to
complete any concurrent activity of all other
processes
o
o
One will succeed and commit all results
All others will fail to commit their version, but should
notice that their operation was completed by someone else
•
•
How do they pick up the result?
How is concurrency managed for the announce and result data
structures?
CS533 - Concepts of Operating Systems
35
Large objects

Copying the entire object is too costly

Programmers must construct logical versions
o
o
o
Memory is shared among versions
New memory is allocated for unique parts of new versions
Memory of old versions must be freed
CS533 - Concepts of Operating Systems
36
Memory management for large objects

Per-process pool of memory
o

3 states: committed, allocated and freed
Operations:
o
o
o
o
o
set_alloc moves block from committed (freed?) to allocated
and returns address
set_free moves block to freed
set_prepare marks blocks in allocated as consistent
set_commit sets committed to union of freed and
committed
set_abort sets freed and allocated to the empty set
CS533 - Concepts of Operating Systems
37
Memory management for large objects


May require a global memory pool
Problem:
o
How to prevent this from becoming a synchronization
bottleneck?
CS533 - Concepts of Operating Systems
38
ACM version – corrected simply algorithm
CS533 - Concepts of Operating Systems
39
ACM version – corrected delay algorithm
CS533 - Concepts of Operating Systems
40
ACM version – corrected apply function
CS533 - Concepts of Operating Systems
41
ACM version – corrected wait-free
algorithm
CS533 - Concepts of Operating Systems
42