Wael Yehia - York University

Download Report

Transcript Wael Yehia - York University

Scalable lock-free Stack
Algorithm
Wael Yehia
York University
February 8, 2010
A scalable lock-free stack algorithm
• To reduce contention on the stack (i.e. on the top pointer) we used:
– Elimination as a Backoff mechanism
•
•
•
•
Example:
t1: push(3), t2: pop(), t3: push(1), t4: push(2)
All but t4 fail to modify the top pointer
Stack
So they try to collide:
top
top
2
5
t1 t2
t3
5
1
Collision Array
1
Implementation
• The algorithm is non-blocking so no locks
• 4 classes created
stack<T>
StackThread
AtomicReference<cell<T>> top
T pop(threadInfo<T> p)
void push(T value,
threadInfo<T> p)
threadInfo<T> info
threadInfo<T>
cell<T>
cell<T> next
T value
final int id
final int spin
OP op
cell<T> cell
Key variables
• We have three shared variables:
– two arrays used during collision:
• void * location[] - for data exchange and synch
 java.util.concurrent.atomic.AtomicReferenceArray
• int collision[] - for finding a collision partner
 java.util.concurrent.atomic.AtomicIntegerArray
– one top pointer:
• Cell * top - points to the top of the stack
 java.util.concurrent.atomic.AtomicReference
Correctness Tests
• Tested for correctness by:
– Counting the number of pushes, pops and
failed pops
– Compare the expected stack size to the
actual size
• So far all expected == actual
Timing tests
• Ran the stack under different number of threads
and total operations
• Compared to Treiber’s stack (the one discussed
in class)
• Ours ran 2-3 times faster.
• The rate increased as # of threads increased
Our stack vs Treiber’s
2 threads
8 threads
4500
25000
4000
20000
3500
2500
generic
2000
simple
time (ms)
time (ms)
3000
15000
generic
simple
10000
1500
1000
5000
500
0
0
100,000
1,000,000
2,000,000
5,000,000
100,000
# of Ops
1,000,000
2,000,000
5,000,000
# of Ops
256 threads
64 threads
200000
800000
180000
700000
160000
600000
120000
generic
100000
simple
80000
60000
time (ms)
time (ms)
140000
500000
generic
400000
simple
300000
200000
40000
100000
20000
0
0
100,000
1,000,000
2,000,000
# of Ops
5,000,000
100,000
1,000,000
2,000,000
# of Ops
5,000,000
Possible improvement and
the next step
• Will test against:
– improved implementations of our stack
algorithm
– synchronized stack
• Try to prove some properties of our
algorithm