Transcript CS300-07

Selection and Adversary arguments
Sung Yong Shin
CS Dept., KAIST
Contents
1. Introduction
2. Finding max. and min.
3. Finding the 2th largest key
4. The Selection Problem
5. A lower bound for finding the median
1. Introduction
SP : (Selection Problem)
Given a set of n real numbers, find the kth smallest one, 1  k  n.
How can you solve it?
well, …
(1) Sort the numbers.
(2) Pick the kth smallest one.
O(nlogn)
Any better way?
What is a trivial lower bound in time complexity for solving SP?
TL(n) = (n)
Why?
What if only considering comparisons?
well, ...
P : Given a set S of n real numbers, find the largest one.
n=3
S = {x1, x2, x3}
1:2
L = {11, 12, 13, 14}
<
>
1:3
2:3
<
>
<
11
12
x3
(x1, x2, x3)
x2
(?, ?, x2)
>
13
x3
(x2, x1, x3)
|W|  |L|
W = {(?, ?, x1), (?, ?, x2), (?, ?, x3)}
|W| = n
TL(n) = log2|W| = log2n
However, this is not tight !!!
Why?
14
x1
(?, ?, x1)
Adversary Arguments
Z1000 = {0, 1, …… , 999}
Guess what number in Z1000 I have in mind?
Guessing Game !!!
Maximize the number of leaves in a decision tree.
You can change your mind as long as your answers(responses)
are consistent !!!
2. Finding Max. and Min.
MM : Given a set of n real numbers, find max and min.
max = the largest number
min = the smallest number
How can you solve MM?
x1 x2 x3 x4 …… x2n-1 x2m
W  {x11, x21, x31,……, xm1}  max
L  {x12, x22, x32,……, xm2}  min
How many comparisons?
m
…… dividing
m-1 …… finding max
m-1 …… finding min
3
n2
2
Any better way?
3m  2 
n = 2m
What information is needed for finding max and min?
Finding max :
All numbers except max itself must lose at least once in some comparisons.
(n-1 losses)
Similarly,
Finding min :
All numbers except min itself must win at least once in some comparisons.
(n-1 wins)
1 unit of information  (1 win) or (1 loss)
(2n - 2) units of information is needed !!!
Is that right?
Yes !!!
Why?
x1
L
W
x2
L
W
x3
L
W
max
x4
W
x5
L
W
min
x6
L
Status of a number
(xi, si)
Status
W  at least one win, no lost
L  at least one lost, no win
WL  wins and losts
N  no comparisons
Status of keys x and y
compared by an algorithm
N,N
W,N or WL,N
L,N
W,W
L,L
W,L or WL,L or W,WL
WL,WL
Adversary response
New Status
Units of new
information
x>y
x>y
x<y
x>y
x>y
x>y
Consistent with
assigned values
W,L
W,L or WL,L
L,W
W,WL
WL,L
No change
No change
2
1
1
1
1
0
0
Example
Theorem : Any algorithm to find max and min of n numbers must do at least
3n/2-2 comparisons in the worst case
[proof]
n  2m (for simplicity)
(N, N)
n
2
……
2m information
?
?
……
2m-2, since 2n-2(4m-2) needed
n
n
3n
 (2m  2)   (n  2) 
2
2
2
2
What if n = 2m + 1 ?
Exercise.
3. Finding the 2nd largest key
2L : Given a set of n real numbers find max and max2
max2  the 2nd largest one
19
19
7*
19
19
1*
15*
2
15
7
(n-1) + (log2 n - 1)
= n + log2 n - 2 comparisons
9
6
15
3
6
x1 x2 x3  xn
W L L L
W L  L Do we need all those “L” ?
Any better algorithm ?
max --- n - 1 comparisons
max2 --- ?
How many numbers are compared directly with max.
w(xi) = 1, i = 1, 2, ……, n
Comparing xi and xj
w(xi) > w(xj)
xi > xj
w(xi) = w(xj) > 0
w(xi) := w(xi) + w(xj)
w(xj) := 0
``
``
w(xi) < w(xj)
xi < xj
w(xi) = w(xj) = 0
consistent
w(xj) := w(xj) + w(xi)
w(xi) := 0
no change
w(x2)
w(x3)
w(x1)
(x1,x2)
1
1
2
0
1
2
(x3,x5)
3
0
1
w(x5)
1
x1 > x2
(x3,x4)
(x1,x3)
w(x4)
5
0
x3 > x4
0
x3 > x5
x1 < x3
Lemma : # of direct loser to max = log2n
[Proof]
max  xi for some i
w(xi) = n
wk(xi)  w(xi) after the kth win against a previously undefeated key
wk(xi)  2  wk-1(xi), w0(xi) = 1
Why ?
n = wt(xi)  2tw0(xi)
n = 2t
Why?
t = log2n
Theorem : Any algorithm to find the 2nd largest number in a set of n real numbers
must do at least n + log2n - 2 comparisons.
4. Selection Problem
SP : Given a set S of n real numbers, find the kth smallest one.
n - k numbers > Nk
Nk
k - 1 numbers < Nk
a
b
b is less than a (b < a)
In order to know the kth smallest number Nk, the relation to each number in S to
Nk must be known !!!
Why ?
y
Nk
y
x
x
An adversary could change the value of y
which is not related to Nk !!!
 n - 1 Comparisons !!!
Why ?
Theorem : Finding the kth smallest element in S requires at least |S| - 1
comparisons.
How to find the kth smallest one
A straightforward approach
(1) Sort S
(2) Pick the kth smallest one
O(nlogn)
Far from optimality !!!
Any better idea ?
well, ….
Try “Divide and Conquer !!!”
S = { 21, 15, 13, 8, 7, 29, 22, 2, 5, 10,
3, 26, 4, 19, 12, 20, 18, 24, 16, 23,
11, 1, 25, 14, 27, 6, 17, 9, 28 }
21
15
13
8
7
29
22
2
5
10
3
26
4
19
12
20
18
24
16
23
11
1
25
14
27
6
17
9
28
21
15
13
8
7
29
22
10
5
2
26
19
12
4
3
24
23
20
18
16
27
25
14
11
1
6
17
9
28
Divide S into  |S|/5 
sequences of S
element each with up
to 4 leftover elements
Sort each 5-element
sequence
A
29
22
10
5
2
C
26
19
12
4
3
21
15
13
8
7
B
27
25
14
11
1
D
24
23
20
18
16
6
17
9
28
M=
m = the median of M
S1 = {s | s < m and s  S}
S2 = {s | s = m and s  S}
S3 = {s | s > m and s  S}
3
|S1|  |S|
4
Why?
3
|S3|  |S|
4
.
.
.
.
.
C
A
.
.
.
.
.
.
.
.
.
.
. .
. m.
. .
. .
. .
B
.
.
.
.
.
D
. .
. .
. .
. .
. .
S1 = {s | s < m and s  S}
S2 = {s | s = m and s  S}
S3 = {s | s > m and s  S}
3
3
|S1|  |S| and |S3|  |S|
4
4
m
if |S1|  k then
select (S1, k)
else if |S1| + |S2|  k then
m is the kth smallest one
else
select (S3, k - |S1| - |S2|)
end
T ( n)  T (
3n
)  c  n  T (n / 5)
4
Why?
Algorithm ( finding the kth smallest element in S )
procedure SELECT(k,S)
begin
if |S| < 50 then
sort S;
SELECT := the kth smallest one
end {if}
else
divide S into  |S|/5  sequences of 5 elements
each with up to 4 leftover elements;
sort each 5-element sequences;
Let M be the sequence of medians of
5-elements sets (sequence);
|M |
, M );
m := SELECT(
2
S1 = {s | s < m and s  S};
S2 = {s | s = m and s  S};
S3 = {s | s > m and s  S};
if |S1|  k then
SELECT ( k, S1 )
else if |S1| + |S2|  k then SELECT := m
else
SELECT ( k - |S1| - |S2|, S3 )
end
end
end
c1  n
Why?
c2  n
n
T 
5
c3  n
 3n 
T 
 4 
if n  50
c  n
 T ( n)  
T (n / 5)  T (3n / 4)  c  n if n  50
Show that T(n)  20cn .
How?
By induction!!!
n = 50
150
)  c  50
4
 T (10)  T (38)  c  50
 c 10  c  38  c  50
 c  98  20  c  50
T (50)  T (10)  T (
50 < m  k
T(m)  20cm
n = k+1
T(k+1)


T((k+1) / 5) + T(3(k+1) / 4) + c·(k+1)
20·c·(k+1) / 5 + 20·c·3(k+1) / 4 + c(k+1)
4c(k+1) + 15·c·(k+1) + c(k+1)

20·c·(k+1)
Finding the Median
# of comparisons
16n
Blum
5.4n
Hyafile & Schonhage
Paterson & Pippenger
The
3n + o(n)
best
known
little “o”
algorithm
5. A Lower Bound for Finding the Median
k = n/2
 n-1 comparisons
Can you find any tighter lower bound?
well, …
Why not using an adversary argument?
Observation
median
Crucial Comparisons
non-crucial comparisons
Def’n : A comparison involving an element x is said to be a crucial comparison
for x if it is the first comparison with y satisfying one of the following
conditions :
(1) x > y for some y  median.
(2) x < y for some y  median.
Note :
(i) A crucial comparison for x establishes the relation of x to median.
(ii) The relation of y to median is not necessarily known at the time the crucial
comparison for x is done.
Adversary Strategy
“Force an algorithm to perform as many non-crucial comparisons as possible.”
How ?
Assigning values to variables.
(xi, si)
Status
L: assigned a value larger than median
S: assigned a value smaller than median
N: not yet in comparison
Comparing xi and xj, i  j
{N, N} -- xi > median > xj
{L, N} -- make the unassigned one smaller than median
{S, N} -- reverse the above
{L, L} consistent with previous responses
{S, S}
Unless there are already (n - 1) / 2 elements with status S (or L), keep the strategy
previously stated !!!
Otherwise, make the balance for the number of both status.
(n - 1) / 2 non-crucial comparisons possible !!!
Why ?
3
(n  1) / 2  (n  1)  (n  1)
2
Comparisons
crucial
non-crucial
Theorem : Any algorithm to find the median of n numbers must do
at least 3 (n  1) comparisons.
2
3
( n  1)
2
1.75n  log n
1.8n
2n
Best lower bound
currently known !!!
(n - 1) comparisons are tight lower bound only for k = 1 and n !!!
Project 2
•
“Graph-related algorithms”
–
Both directed and undirected graphs
–
Menu-driven
1) Initialize Graphs
2) Min-Cost spanning tree
3) Dijkstra’s shortest path
4) Depth-first Search
5) Breadth-first Search
6) Biconnected components
7) Strongly connected components
Due: Nov. 29 (Friday)