accelerated cascading

Download Report

Transcript accelerated cascading

Accelerated Cascading
Advanced Algorithms & Data Structures
Lecture Theme 16
Prof. Dr. Th. Ottmann
Summer Semester 2006
Fast computation of maximum
Input: An array A holding p elements from a linearly
ordered universe S. We assume that all the elements in A
are distinct.
Output: The maximum element from the array A.
We use a boolean array M such that M(k)=1 if and only if
A(k) is the maximum element in A.
Initialization: We allocate p processors to set each entry in
M to 1.
2
Fast computation of maximum: Step1
•Step 1: Assign p processors for each element in A, p2
processors overall.
• Consider the p processors allocated to A(j).
We name these processors as P1, P2,..., Pi,..., Pp.
• Pi compares A(j) with A(i) :
If A(i) > A(j) then M(j) := 0
else do nothing.
3
Fast computation of maximum: Step2
Step 2: At the end of Step 1, M(k) , 1  k  p will be 1
if and only if A(k) is the maximum element.
• We allocate p processors, one for each entry in M.
• If the entry is 0, the processor does nothing.
• If the entry is 1, it outputs the index k of the maximum
element.
4
Processor requirement and PRAM model
Complexity: The processor requirement is p2 and the time
complexity is O(1).
• We need concurrent write facility and hence the
Common CRCW PRAM model.
5
Optimal computation of maximum
• This is the same algorithm which we used for adding n
numbers.
6
Optimal computation of maximum: Analysis
• This algorithm takes O(n) processors and O(log n) time.
• We can reduce the processor complexity to O(n / log n).
Hence the algorithm does optimal O(n) work.
7
An O(log log n) time algorithm (1)
• Instead of a binary tree, we use a more complex tree.
2k
Assume that n  2 .
2k 1
• The root of the tree has 2  n children.
2k i1
• Each node at the i-th level has 2
children
for
.
• Each node at level k has two children.
0  i  k 1
8
An O(log log n) time algorithm (2)
Some Properties
2k
• The depth of the tree is k. Since n  2 , k   loglog n

2k 2k i
• The number of nodes at the i-th level is 2

,for 0  i  k.
Prove this by induction.
9
An O(log log n) time algorithm (3)
The Algorithm
• The algorithm proceeds level by level, starting from the
leaves.
• At every level, we compute the maximum of all the
children of an internal node by the O(1) time algorithm.
• The time complexity is O(log log n) since the depth of the
tree is O(log log n).
10
An O(log log n) time algorithm: Work complexity
Total Work:
• Recall that the O(1) time algorithm needs O(p2) work for
p elements.
• Each node at the i-th level has 2
2k i1
children.
• So the total work for each node at the i-th level
is O(22k i.1 )2
11
Total Work
Total Work:
2k  2k i
• There are 2
nodes at the i-th level. Hence the total
work for the i-th level is:
2k i1 2
O(2
2k 2k i
) 2
 O(2 )  O(n)
2k
• For O(log log n) levels, the total work is O(n log log n) .
This is suboptimal.
12
Accelerated cascading: Idea
• The first algorithm which is based on a binary tree, is
optimal but slow.
• The second algorithm is suboptimal, but very fast.
• We combine these two algorithms through the
accelerated cascading strategy.
• We start with the optimal algorithm until the size of the
problem is reduced to a certain value.
• Then we use the suboptimal but very fast algorithm.
13
Accelerated cascading: Phase 1
Phase 1.
• We apply the binary tree algorithm, starting from the leaves
and upto log log log n levels.
• The number of candidates reduces to
n
2log log log n
n

.
log log n
• The total work done so far is O(n) and the total time is
O(log log log n) .
14
Accelerated cascading: Phase 2
Phase 2.
• In this phase, we use the fast algorithm on the
n
n  O(
)
log log n
• The total work is
remaining candidates.
O(n log log n)  O(n)
• The total time is O(log log n)  O(log log n)
• Theorem: Maximum of n elements can be computed
in O(log log n) time and O(n) work on the Common
CRCW PRAM.
15