
Internet Algorithms:
Design and Analysis
Alex Kesselman, MPI
MiniCourse, Oct. 2004
Algorithms for Networks
• Networking provides a rich new context for algorithm design
  – algorithms are used everywhere in networks
    – at the end-hosts for packet transmission
    – in the network: switching, routing, caching, etc.
  – many new scenarios and very stringent constraints
    – high speed of operation
    – large-sized systems
    – cost of implementation
  – require new approaches and techniques
2
Methods
In the networking context
– we also need to understand the “performance” of an algorithm: how well does a network or a component that uses a particular algorithm perform, as perceived by the user?
– performance analysis is concerned with metrics like delay, throughput, loss rates, etc.
– the metrics of the designer and of the theoretician are not necessarily the same
3
Recent Algorithm Design Methods
• Motivated by the desire
  – for simple implementations
  – and for robust performance
• Several methods of algorithm design can be used in the networking context
  – randomized algorithms
  – approximation algorithms
  – online algorithms
  – distributed algorithms
4
In this Mini Course…
• We will consider a number of problems in networking
• Show various methods for algorithm design and for performance analysis
5
Network Layer Functions
• transport packet from
sending to receiving hosts
• network layer protocols in
every host, router
important functions:
• path determination: route
taken by packets from
source to dest.
• switching: move packets
from router’s input to
appropriate router output
[Figure: protocol stacks. End hosts run application, transport, network, data link and physical layers; routers run network, data link and physical layers.]
6
The Internet
The Internet Core
Edge Router
7
Internet Routing Algorithms
Balaji Prabhakar
Network looks like Graph !
9
Routing
Routing protocol
Goal: determine a “good” path (sequence of routers) through the network from source to dest.
Graph abstraction for routing algorithms:
• graph nodes are routers
• graph edges are physical links
  – link cost: delay, $ cost, or congestion level
• “good” path:
  – typically means minimum cost path
  – other definitions possible
[Figure: example graph with nodes A–F and link costs]
10
Routing Algorithms Classification
Global or decentralized
information?
Global:
• all routers have complete
topology, link cost info
• “link state” algorithms
Decentralized:
• router knows physically-connected neighbors, link costs to neighbors
• iterative process of info
exchange with neighbors
• “distance vector”
algorithms
Static or dynamic?
Static:
• routes change slowly
over time
Dynamic:
• routes change more
quickly
– periodic update
– in response to link cost
changes
11
Link-State Routing Algorithms: OSPF
Compute least cost paths from a node to all
other nodes using Dijkstra’s algorithm.
– advertisement carries one entry per neighbor router
– advertisements disseminated via flooding
12
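As a side illustration (not part of the slides), here is a minimal Python sketch of the least-cost-path computation that OSPF performs with Dijkstra's algorithm. The function and variable names are my own, and the link costs in the example dictionary are taken from the example graph reconstructed on the next slide.

import heapq

def dijkstra(graph, source):
    """Least-cost paths from `source`; graph maps node -> {neighbor: link cost}."""
    dist, prev = {source: 0}, {}
    heap, visited = [(0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        for v, cost in graph[u].items():
            if v not in dist or d + cost < dist[v]:
                dist[v] = d + cost        # better path to v found
                prev[v] = u               # predecessor on the least-cost path
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

# Link costs as in the example on the next slide (an assumption of this sketch).
graph = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}
dist, prev = dijkstra(graph, "A")
print(dist)   # expected: {'A': 0, 'B': 2, 'C': 3, 'D': 1, 'E': 2, 'F': 4}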
Dijkstra’s algorithm: example
Step   N         D(B),p(B)   D(C),p(C)   D(D),p(D)   D(E),p(E)   D(F),p(F)
0      A         2,A         5,A         1,A         infinity    infinity
1      AD        2,A         4,D                     2,D         infinity
2      ADE       2,A         3,E                                 4,E
3      ADEB                  3,E                                 4,E
4      ADEBC                                                     4,E
5      ADEBCF

[Figure: example graph with link costs A–B = 2, A–C = 5, A–D = 1, B–C = 3, B–D = 2, C–D = 3, C–E = 1, C–F = 5, D–E = 1, E–F = 2]
13
Route Optimization
Improve user performance and network efficiency by tuning
OSPF weights to the prevailing traffic demands.
[Figure: AT&T backbone network, with customers or peers attached at the edges]
14
Route Optimization
• Traffic engineering
– Predict influence of weight changes on traffic flow
– Minimize objective function (say, of link utilization)
• Inputs
– Network topology: capacitated, directed graph
– Routing configuration: routing weight for each link
– Traffic matrix: offered load for each pair of nodes
• Outputs
– Shortest path(s) for each node pair
– Volume of traffic on each link in the graph
– Value of the objective function
15
Example
[Figure: example network with nodes A–E and OSPF link weights]
Links AB and BD are overloaded
Change weight of CD to 1 to improve routing (load balancing)!
16
References
1. Anja Feldmann, Albert Greenberg, Carsten Lund, Nick Reingold, Jennifer Rexford, and Fred True, "Deriving traffic demands for operational IP networks: Methodology and experience," IEEE/ACM Transactions on Networking, pp. 265-279, June 2001.
2. Bernard Fortz and Mikkel Thorup, "Internet traffic engineering by optimizing OSPF weights," in Proc. IEEE INFOCOM, pp. 519-528, 2000.
17
Distance Vector Routing: RIP
• Based on the Bellman-Ford algorithm
– At node X, the distance to Y is updated by
  D_X(Y) = min over Z in N(X) of [ c(X,Z) + D_Z(Y) ]
where D_X(Y) denotes the current estimate at X of the distance from X to Y, N(X) is the set of neighbors of node X, and c(X,Z) is the distance of the direct link from X to Z (a sketch of one such update follows below)
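A small sketch (my own illustration, not from the slides) of one round of this distance-vector update; the neighbor vectors below correspond to node E in the example on the next slide, and the function name is hypothetical.

def dv_update(my_costs, neighbor_vectors, destinations):
    """One distance-vector update: D_X(Y) = min over neighbors Z of c(X,Z) + D_Z(Y).

    my_costs: {Z: c(X,Z)} for each neighbor Z of X
    neighbor_vectors: {Z: {Y: D_Z(Y)}} as last advertised by each neighbor
    """
    INF = float("inf")
    table = {}
    for y in destinations:
        best_cost, best_hop = INF, None
        for z, c_xz in my_costs.items():
            d = c_xz + neighbor_vectors[z].get(y, INF)
            if d < best_cost:
                best_cost, best_hop = d, z
        table[y] = (best_cost, best_hop)
    return table

# Node E from the example slide: neighbors A, B, D with c(E,A)=1, c(E,B)=8, c(E,D)=2.
my_costs = {"A": 1, "B": 8, "D": 2}
neighbor_vectors = {
    "A": {"A": 0, "B": 7},
    "B": {"A": 7, "B": 0, "C": 1},
    "D": {"C": 2, "D": 0},
}
print(dv_update(my_costs, neighbor_vectors, ["A", "B", "C", "D"]))
# {'A': (1, 'A'), 'B': (8, 'A'), 'C': (4, 'D'), 'D': (2, 'D')}
# (B costs 8 via either neighbor; the slide's table happens to route it via B)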
18
Distance Table: Example
Below is just one step! The algorithm repeats for ever!

[Figure: link costs A–B = 7, B–C = 1, C–D = 2, E–A = 1, E–B = 8, E–D = 2]

E's distance table (distance tables received from neighbors A, B, D; E's computation; and the distance table E sends to its neighbors):

dest   from A   from B   from D   via A, B, D    E's entry   E sends
A        0        7        ∞       1, 15,  ∞       1, A        A: 1
B        7        0        ∞       8,  8,  ∞       8, B        B: 8
C        ∞        1        2       ∞,  9,  4       4, D        C: 4
D        ∞        ∞        0       ∞,  ∞,  2       2, D        D: 2

(the computation uses c(E,A) = 1, c(E,B) = 8, c(E,D) = 2; E also sends E: 0)
19
Link Failure and Recovery
• Distance vectors: exchanged every 30 sec
• If no advertisement heard after 180 sec -->
neighbor/link declared dead
– routes via neighbor invalidated
– new advertisements sent to neighbors
– neighbors in turn send out new advertisements
(if tables changed)
– link failure info quickly propagates to entire net
20
The bouncing effect
[Figure: nodes A, B, C; link A–B cost 1, link B–C cost 1, link A–C cost 25]
A's routing table: dest B, cost 1; dest C, cost 2
B's routing table: dest A, cost 1; dest C, cost 1
C's routing table: dest A, cost 2; dest B, cost 1
21
C sends routes to B
[Figure: the link A–B has failed; C advertises its routes to B]
A's routing table: dest B, cost 1; dest C, cost 2
B's routing table: dest A, cost ~ (unreachable); dest C, cost 1
C's routing table: dest A, cost 2; dest B, cost 1
22
B updates distance to A
A's routing table: dest B, cost 1; dest C, cost 2
B's routing table: dest A, cost 3 (via C); dest C, cost 1
C's routing table: dest A, cost 2; dest B, cost 1
23
B sends routes to C
A's routing table: dest B, cost 1; dest C, cost 2
B's routing table: dest A, cost 3; dest C, cost 1
C's routing table: dest A, cost 4 (via B); dest B, cost 1
24
How are these loops caused?
• Observation 1:
– B’s metric increases
• Observation 2:
– C picks B as next hop to A
– But, the implicit path from C to A includes itself!
25
Solutions
• Split horizon/Poisoned reverse
– B does not advertise route to C or advertises it
with infinite distance (16)
• Works for two node loops
– does not work for loops with more nodes
26
Example where Split Horizon fails
[Figure: nodes A, B, C, D; links A–B, A–C, B–C, C–D, each of cost 1; the link C–D breaks]
• When link breaks, C marks D
as unreachable and reports
that to A and B
• Suppose A learns it first. A
now thinks best path to D is
through B. A reports a route
of cost=3 to C.
• C thinks D is reachable
through A at cost 4 and
reports that to B.
• B reports a cost 5 to A who
reports new cost to C.
• etc...
27
Comparison of LS and DV algorithms
Message complexity
• LS: with n nodes, E links,
O(nE) msgs sent
• DV: exchange between
neighbors only
– larger msgs
Speed of Convergence
• LS: requires O(nE) msgs
– may have oscillations
• DV: convergence time varies
– routing loops
– count-to-infinity problem
Robustness: what
happens if router
malfunctions?
LS:
– node can advertise
incorrect link cost
– each node computes only
its own table
DV:
– DV node can advertise
incorrect path cost
– error propagates thru
network
28
Hierarchical Routing
Our routing study thus far - idealization
• all routers identical
• network “flat”
… not true in practice
scale: with 50 million
destinations:
• can’t store all dest’s in
routing tables!
• routing table exchange
would swamp links!
administrative autonomy
• internet = network of
networks
• each network admin may
want to control routing in
its own network
29
Hierarchical Routing
• aggregate routers
into regions,
“autonomous
systems” (AS)
• routers in same AS
run same routing
protocol
– “intra-AS” routing
protocol
gateway routers
• special routers in AS
• run intra-AS routing
protocol with all other
routers in AS
• also responsible for
routing to destinations
outside AS
– run inter-AS routing
protocol with other
gateway routers
30
Internet AS Hierarchy
Inter-AS border (exterior gateway) routers
Intra-AS interior (gateway) routers
31
Intra-AS and Inter-AS routing
[Figure: ASes A, B and C with gateway routers A.a, A.c, B.a and C.b and interior routers a, b, c, d. Host h1 attaches to AS C, host h2 to AS B. Inter-AS routing runs between A and B; intra-AS routing runs within AS A and within AS B.]
32
Peer-to-Peer Networks: Chord
Balaji Prabhakar
A peer-to-peer storage problem
• 1000 scattered music enthusiasts
• Willing to store and serve replicas
• How do you find the data?
34
The Lookup Problem
[Figure: the publisher at node N4 holds (Key=“title”, Value=MP3 data…); nodes N1–N6 are connected over the Internet; a client issues Lookup(“title”)]
35
Centralized lookup (Napster)
[Figure: the publisher at N4 registers SetLoc(“title”, N4) with a central DB; the client's Lookup(“title”) queries the DB]
Simple, but O(N) state and a single point of failure
36
Flooded queries (Gnutella)
[Figure: the client's Lookup(“title”) is flooded to neighboring nodes until it reaches the publisher at N4]
Robust, but worst case O(N) messages per lookup
37
Routed queries (Freenet, Chord, etc.)
[Figure: the client's Lookup(“title”) is forwarded hop by hop toward the node holding the key]
38
Chord Distinguishing Features
• Simplicity
• Provable Correctness
• Provable Performance
39
Chord Simplicity
• Resolution entails participation by
O(log(N)) nodes
• Resolution is efficient when each node
enjoys accurate information about
O(log(N)) other nodes
40
Chord Algorithms
• Basic Lookup
• Node Joins
• Stabilization
• Failures and Replication
41
Chord Properties
• Efficient: O(log(N)) messages per lookup
– N is the total number of servers
• Scalable: O(log(N)) state per node
• Robust: survives massive failures
42
Chord IDs
• Key identifier = SHA-1(key)
• Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• How to map key IDs to node IDs?
43
Consistent Hashing
[Karger 97]
• Target: web page caching
• Like normal hashing, assigns items to
buckets so that each bucket receives
roughly the same number of items
• Unlike normal hashing, a small change in
the bucket set does not induce a total
remapping of items to buckets
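A small sketch (my own illustration, not from the slides) of consistent hashing on a circular ID space: keys and nodes are hashed onto the same ring, and each key is assigned to the first node at or after it. The node addresses and the 7-bit ring size are assumptions made for readability.

import hashlib
from bisect import bisect_left

ID_BITS = 7                     # a tiny ring, like the 7-bit example in the slides
RING = 2 ** ID_BITS

def ring_id(name):
    """Map a key or an IP address onto the ID circle (SHA-1, truncated here)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

def successor(node_ids, key_id):
    """Node responsible for key_id: the first node at or after it, clockwise."""
    node_ids = sorted(node_ids)
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]          # wrap past the highest ID to the lowest

nodes = [ring_id("10.0.0.%d" % i) for i in range(1, 6)]   # hypothetical node addresses
key = ring_id("title")
print(sorted(nodes), key, successor(nodes, key))

Removing or adding one node only remaps the keys between it and its predecessor, which is the small-change property described above.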
44
Consistent Hashing
[Karger 97]
[Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80 (e.g. Key 5, Node 105)]
A key is stored at its successor: the node with the next higher ID
45
Basic lookup
[Figure: N10 asks “Where is key 80?”; the query is forwarded around the ring via N32 and N60 to N90; “N90 has K80”]
46
Simple lookup algorithm
Lookup(my-id, key-id)
  n = my successor
  if my-id < n < key-id
    call Lookup(key-id) on node n   // next hop
  else
    return my successor             // done
• Correctness depends only on successors
47
“Finger table” allows log(N)-time
lookups
[Figure: node N80's fingers cover ½, ¼, 1/8, 1/16, 1/32, 1/64 and 1/128 of the ring]
48
Finger i points to successor of n+2^i
[Figure: on the ring, N80's finger for 112 = 80 + 2^5 points to its successor N120]
49
Lookup with fingers
Lookup(my-id, key-id)
  look in local finger table for
    highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n   // next hop
  else
    return my successor             // done
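A compact Python rendering of the lookup logic above (my own sketch, not the Chord paper's code); build_ring, in_interval and the node IDs are assumptions of this illustration, and the termination test is made explicit.

RING = 2 ** 7                       # small ID space, for illustration only

def in_interval(x, a, b):
    """True if x lies in the circular interval (a, b] of the ring."""
    a, b, x = a % RING, b % RING, x % RING
    if a < b:
        return a < x <= b
    return x > a or x <= b          # the interval wraps past zero

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self       # fixed up by build_ring below
        self.fingers = []           # fingers[i] ~ successor(id + 2**i)

    def lookup(self, key_id):
        # If the key falls between us and our successor, the successor owns it.
        if in_interval(key_id, self.id, self.successor.id):
            return self.successor
        # Otherwise forward to the highest finger that still precedes key_id.
        for f in reversed(self.fingers):
            if in_interval(f.id, self.id, key_id - 1):
                return f.lookup(key_id)          # next hop
        return self.successor.lookup(key_id)     # fall back to the successor

def build_ring(ids):
    ids = sorted(set(ids))
    nodes = {i: Node(i) for i in ids}
    def succ_of(x):                 # node with the next higher (or equal) ID
        return nodes[min(ids, key=lambda j: (j - x) % RING)]
    for i in ids:
        nodes[i].successor = succ_of((i + 1) % RING)
        nodes[i].fingers = [succ_of((i + 2 ** b) % RING) for b in range(7)]
    return nodes

ring = build_ring([10, 32, 60, 80, 90, 105, 120])   # hypothetical node IDs
print(ring[10].lookup(80).id)   # resolves key 80 (stored at N80 in this made-up ring)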
50
Lookups take O(log(N)) hops
[Figure: ring with nodes N5, N10, N20, N32, N60, N80, N99, N110; Lookup(K19) issued at N32 reaches the node storing K19 in O(log N) hops]
51
Node Join
Linked List Insert
[Figure: ring segment N25 → N40, where N40 stores K30 and K38; the joining node N36 performs step 1: Lookup(36)]
52
Node Join (2)
2. N36 sets its own successor pointer
[Figure: N36 now points to N40; N25 still points to N40]
53
Node Join (3)
3. Copy keys 26..36 from N40 to N36
[Figure: K30 is now copied to N36; K38 stays at N40]
54
Node Join (4)
4. Set N25's successor pointer
[Figure: the ring is now N25 → N36 → N40, with K30 at N36 and K38 at N40]
Update finger pointers in the background
Correct successors produce correct lookups
55
Stabilization
• Case 1: finger tables are reasonably fresh
• Case 2: successor pointers are correct;
fingers are inaccurate
• Case 3: successor pointers are inaccurate
or key migration is incomplete
• Stabilization algorithm periodically
verifies and refreshes node knowledge
– Successor pointers
– Predecessor pointers
– Finger tables
56
Failures and Replication
[Figure: ring with nodes N10, N80, N85, N102, N113, N120; N80 issues Lookup(90) but its successors have failed]
N80 doesn’t know correct successor, so incorrect lookup
57
Solution: successor lists
• Each node knows r immediate successors
• After failure, will know first live successor
• Correct successors guarantee correct lookups
• Guarantee is with some probability
58
Choosing the successor list length
• Assume 1/2 of nodes fail
• P(successor list all dead) = (1/2)^r
  – i.e. P(this node breaks the Chord ring)
  – depends on independent failures
• P(no broken nodes) = (1 – (1/2)^r)^N
  – r = 2 log(N) makes prob. = 1 – 1/N
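A quick numeric check of these formulas (my own arithmetic, with a hypothetical network size N):

from math import log2

N = 1000                      # hypothetical network size
r = round(2 * log2(N))        # successor-list length suggested above (about 20)
p_cut = 0.5 ** r              # P(all r successors of one node are dead)
p_ring_ok = (1 - p_cut) ** N  # P(no node's successor list is entirely dead)
print(r, p_cut, p_ring_ok)    # p_ring_ok is roughly 1 - 1/N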
59
References
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, and Hari Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications," IEEE/ACM Transactions on Networking, Vol. 11, No. 1, pp. 17-32, February 2003.
60
Switch Scheduling Algorithms
Balaji Prabhakar
Basic Architectural Components
1. Forwarding Decision (routing-table lookup at each input)
2. Interconnect
3. Output Scheduling
62
Switching Fabrics
[Figure: switching fabric families: Output Queued; Input Queued; Combined Input and Output Queued; Multi-stage (Batcher sorter followed by a self-routing network); Parallel Packet Switches]
63
Input Queueing
[Figure: input-queued switch: data in, scheduler sets the crossbar configuration, data out]
64
Background
1. [Karol et al. 1987] Throughput limited to 2 − √2 ≈ 58% by head-of-line blocking for Bernoulli IID uniform traffic.
2. [Tamir 1989] Observed that with “Virtual Output Queues” (VOQs) head-of-line blocking is reduced and throughput goes up.
65
Head of Line Blocking
66
67
68
Input Queueing
Virtual output queues
69
Background
Scheduling via Matching
3. [Anderson et al. 1993] Observed analogy to maximum size matching in a bipartite graph – O(N^2.5).
4. [McKeown et al. 1995]
   (a) Maximum size match can not guarantee 100% throughput.
   (b) But maximum weight match can – O(N^3).
70
Background
Speedup
5. [Chuang, Goel et al. 1997] Precise emulation of a central shared memory switch is possible with a speedup of two and a “stable marriage” scheduling algorithm.
6. [Prabhakar and Dai 2000] 100% throughput possible for maximal matching with a speedup of two.
71
Simulation
Input Queueing
Output Queueing
72
Using Speedup
73
Scheduling algorithms to achieve 100%
throughput
1. Basic switch model.
2. When traffic is uniform (Many algorithms…)
3. When traffic is non-uniform.
   • Technique: Birkhoff-von Neumann decomposition.
4. Load balancing.
   • Technique: 2-stage switch.
   • Technique: Parallel Packet Switch.
74
Basic Switch Model
[Figure: N×N input-queued switch. Arrivals A1(n), …, AN(n) are demultiplexed into virtual output queues Aij(n) with occupancies L11(n), …, LNN(n); the scheduler picks a matching S(n); departures are D1(n), …, DN(n).]
75
Some Definitions
1. Traffic matrix:
   Λ = [λij], where λij = E[Aij(n)].
   If Σi λij ≤ 1 and Σj λij ≤ 1, we say the traffic is “admissible”.
2. Service matrix:
   S = [sij], where sij ∈ {0,1} and S is a permutation matrix.
3. Queue occupancies: L11(n), …, LNN(n).
76
Some possible performance goals
1. Work conservation
2. “100% throughput”
3. Lij(n) ≤ C, ∀n
4. E[Lij(n)] ≤ C
5. lim_{n→∞} Dij(n)/n = lim_{n→∞} Aij(n)/n = λij
   (goals 2–5: when traffic is admissible)
6. Other metrics…?
77
Scheduling algorithms to achieve 100%
throughput
1. Basic switch model.
2. When traffic is uniform (Many algorithms…)
3. When traffic is non-uniform.
   • Technique: Birkhoff-von Neumann decomposition.
4. Load balancing.
   • Technique: 2-stage switch.
   • Technique: Parallel Packet Switch.
78
Algorithms that give 100% throughput
for uniform traffic
• Quite a few algorithms give 100%
throughput when traffic is uniform
• “Uniform”: the destination of each cell is picked independently and uniformly at random (uar) from the set of all outputs.
79
Maximum size bipartite match
• Intuition: maximizes instantaneous
throughput
[Figure: VOQs with Lij(n) > 0 form a “request” graph; the maximum size match is a bipartite matching in it]
• Gives 100% throughput for uniform
traffic.
80
Some Observations
• A maximum size match (MSM) maximizes
instantaneous throughput.
• But an MSM is complex – O(N^2.5).
• In general, maximal matching is much
simpler to implement, and has a much
faster running time.
• A maximal size matching is at least half
the size of a maximum size matching.
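A sketch of how little work a maximal matching takes (my own illustration; requests are assumed to be given as (input, output) pairs):

def maximal_matching(requests):
    """Greedy maximal matching: scan the requests once and take an edge whenever
    both its input and its output are still free. The result is at least half
    the size of a maximum matching."""
    used_in, used_out, match = set(), set(), []
    for i, j in requests:
        if i not in used_in and j not in used_out:
            match.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return match

print(maximal_matching([(1, 1), (1, 2), (2, 1), (3, 3)]))   # -> [(1, 1), (3, 3)]
# A maximum matching here would have size 3, e.g. [(1, 2), (2, 1), (3, 3)].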
81
Maximal vs. Maximum Matching
[Figure: bipartite request graph on inputs A–F and outputs 1–6, showing a maximal matching (left) and a larger maximum matching (right)]
82
TDM Scheduling Algorithm
If arriving traffic is i.i.d with destinations picked uar
across outputs, then a “TDM” schedule gives 100%
throughput.
[Figure: a TDM schedule cycling inputs A–D across outputs 1–4]
Permutations are picked uar from
the set of N! permutations.
83
Why doesn’t maximizing instantaneous throughput
give 100% throughput for non-uniform traffic?
 11   12  1 / 2  
Three possible
matches, S(n):
 21  1 / 2  
 32  1 / 2  
Assume that at time n, L11(n)  0, L12(n)  0, and Q21 (n) and Q32 (n) both
have arrivals (w.p. (1/ 2-δ ) 2 )  Input 1 is serviced w.p. 2 / 3.

 
The total rate at which input 1 is served is at most 2 / 3  (1/ 2- ) 2  1 1  (1/ 2   ) 2
 1  1/ 3  (1/ 2   ) 2 .
But λ1  1-2  1 - 1/ 3  (1/ 2 -δ ) 2 .
And so if   0.0358 switch is not stable (throughpu t  100%).
84
Scheduling algorithms to achieve 100%
throughput
1. Basic switch model.
2. When traffic is uniform (Many algorithms…)
3. When traffic is non-uniform.
   • Technique: Birkhoff-von Neumann decomposition.
4. Load balancing.
   • Technique: 2-stage switch.
   • Technique: Parallel Packet Switch.
85
Example: With random arrivals, but
known traffic matrix
• Assume we know the traffic matrix, and the arrival pattern is random:

  Λ = [1/2 1/2 0 0; 1/2 1/2 0 0; 0 0 1 0; 0 0 0 1]

• Then we can simply choose:

  S(odd) = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1],   S(even) = [0 1 0 0; 1 0 0 0; 0 0 1 0; 0 0 0 1]
86
Birkhoff - von Neumann Decomposition
Intuitively, we can pick a set of constants (a1, …, ar) and a set of service matrices (M1, …, Mr) such that
  Σ_{i=1..r} ai·Mi ≥ Λ   (element by element).
Then pick the sequence of service matrices
  S(n) = (M1, M2, M2, M3, …, Mr, M1, …)
so that the number of occurrences of Mi in a period T is ai·T.
In other words, (Σ_{i=1..T} S(i)) − Λ·T ≥ 0, and the departure rate exceeds the arrival rate.
Turns out, any admissible Λ can always be decomposed into a linear (convex) combination of permutation matrices (M1, …, Mr) by Birkhoff-von Neumann.
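A small sketch of the decomposition idea (my own illustration, not the algorithm from the cited papers): repeatedly peel off a permutation that covers positive entries, in the spirit of Birkhoff's construction. It brute-forces over permutations, so it is only meant for tiny N, and the decomposition it finds need not match the one on the next slide (decompositions are not unique).

from itertools import permutations

def bvn_decompose(rates, eps=1e-9):
    """Decompose a doubly (sub)stochastic matrix into sum_i a_i * P_i.

    Returns (coefficient, permutation) pairs, where a permutation is a tuple p
    with p[row] = column. Brute force over permutations: small N only.
    """
    n = len(rates)
    rates = [row[:] for row in rates]              # work on a copy
    terms = []
    while True:
        best = None
        for p in permutations(range(n)):           # permutation with all entries > 0
            coeff = min(rates[i][p[i]] for i in range(n))
            if coeff > eps and (best is None or coeff > best[0]):
                best = (coeff, p)
        if best is None:
            return terms
        coeff, p = best
        for i in range(n):                          # peel it off
            rates[i][p[i]] -= coeff
        terms.append((coeff, p))

R = [[0.6, 0.3, 0.1],
     [0.3, 0.2, 0.5],
     [0.1, 0.5, 0.4]]                               # the matrix from the example slide
for a, p in bvn_decompose(R):
    print(round(a, 3), p)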
87
Birkhoff ‘1946 Decomposition
Example

R̃ = [0.6 0.3 0.1; 0.3 0.2 0.5; 0.1 0.5 0.4]

  = 0.2·[1 0 0; 0 1 0; 0 0 1] + [0.4 0.3 0.1; 0.3 0 0.5; 0.1 0.5 0.2]

  = 0.2·[1 0 0; 0 1 0; 0 0 1] + 0.4·[1 0 0; 0 0 1; 0 1 0] + [0 0.3 0.1; 0.3 0 0.1; 0.1 0.1 0.2]

  = 0.2·[1 0 0; 0 1 0; 0 0 1] + 0.4·[1 0 0; 0 0 1; 0 1 0] + 0.2·[0 1 0; 1 0 0; 0 0 1] + 0.1·[0 1 0; 0 0 1; 1 0 0] + 0.1·[0 0 1; 1 0 0; 0 1 0]
88
In practice…
• Unfortunately, we usually don’t know the traffic matrix Λ a priori, so we can:
– measure or estimate Λ, or
– use the current queue occupancies.
89
Scheduling algorithms to achieve 100%
throughput
1.
2.
3.
4.
•
•
•
Basic switch model.
When traffic is uniform (Many algorithms…)
When traffic is non-uniform.
Technique: Birkhoff-von Neumann decomposition.
Load balancing.
Technique: 2-stage switch.
Technique: Parallel Packet Switch.
90
2-stage Switch
Motivation:
1. If traffic is uniformly distributed, then
even a simple TDM schedule gives 100%
throughput.
2. So why not force non-uniform traffic to
be uniformly distributed?
91
2-stage Switch
[Figure: 2-stage switch. A bufferless load-balancing stage with schedule S1(n) spreads the arrivals A1(n), …, AN(n) across the VOQs (occupancies L11(n), …, LNN(n)) of a buffered switching stage with schedule S2(n), which produces the departures D1(n), …, DN(n).]
92
2-stage Switch
Main Result [Chang et al.]:
1. Consider a periodic sequence of permutation matrices π(n) = σ^n̂, where σ is a one-cycle permutation matrix (for example, a TDM sequence) and n̂ = n mod N.
2. If the 1st stage is scheduled by the sequence of permutation matrices π1(n) = π(n + φ1), where φ1 is a random starting phase, and
3. the 2nd stage is scheduled by the sequence of permutation matrices π2(n) = π(n + φ2),
4. then the switch gives 100% throughput for a very broad range of traffic types.
Observation 1:
1st stage makes non-uniform traffic uniform,
and breaks up burstiness. For bursty traffic, delay can be
lower than for an output queued switch!
Observation 2:
Cells can become mis-sequenced.
93
Parallel Packet Switches
Definition:
A PPS is comprised of multiple identical lower-speed packet-switches operating independently and in parallel. An incoming stream of packets is spread, packet-by-packet, by a demultiplexor across the slower packet-switches, then recombined by a multiplexor at the output.
We call this “parallel packet switching”
94
Architecture of a PPS
[Figure: Parallel Packet Switch with N = 4 external ports at rate R and k = 3 lower-speed OQ switches; each demultiplexor spreads arriving packets across the k switches over internal links of rate sR/k, and each multiplexor recombines them at rate R]
95
Parallel Packet Switches
Results
[Iyer et al.] If S >= 2 then a PPS can
precisely emulate a FIFO output queued
switch for all traffic patterns, and hence
achieves 100% throughput.
96
References
1. C.-S. Chang, W.-J. Chen, and H.-Y. Huang, "Birkhoff-von Neumann input buffered crossbar switches," in Proceedings of IEEE INFOCOM '00, Tel Aviv, Israel, 2000, pp. 1614-1623.
2. N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, "Achieving 100% Throughput in an Input-Queued Switch," IEEE Transactions on Communications, 47(8), Aug 1999.
3. A. Mekkittikul and N. W. McKeown, "A practical algorithm to achieve 100% throughput in input-queued switches," in Proceedings of IEEE INFOCOM '98, March 1998.
4. L. Tassiulas, "Linear complexity algorithms for maximum throughput in radio networks and input queued switches," in Proc. IEEE INFOCOM '98, San Francisco, CA, April 1998.
5. C.-S. Chang, D.-S. Lee, and Y.-S. Jou, "Load balanced Birkhoff-von Neumann switches," in Proceedings of IEEE HPSR '01, May 2001, Dallas, Texas.
6. S. Iyer and N. McKeown, "Making parallel packet switches practical," in Proc. IEEE INFOCOM '01, April 2001, Alaska.
97
Competitive Analysis: Theory and
Applications in Networking
Balaji Prabhakar
Decision Making Under Uncertainty:
Online Algorithms and Competitive Analysis
• Online Algorithm:
– Inputs arrive online (one by one)
– Algorithm must process each input as it arrives
– Lack of knowledge of future arrivals results in
inefficiency
• Malicious, All-powerful Adversary:
– Omniscient: monitors the algorithm
– Generates “worst-case” inputs
• Competitive Ratio:
– Worst ratio of the “cost” of online algorithm to
the “cost” of optimum algorithm
99
Competitive Analysis: Discussion
• Very Harsh Model
– All powerful adversary
• But..
– Can often still prove good competitive ratios
– Really tough Testing-Ground for Algorithms
– Often leads to good rules of thumb which can be
validated by other analyses
– Distribution independent: doesn’t matter whether
traffic is heavy-tailed or Poisson or Bernoulli
100
Competitive Analysis in Networking:
Outline
• Shared Memory Switches
• Multicast Trees
– The Greedy Strategy
• Routing and Admission Control
– The Exponential Metric
• More Restricted Adversaries
– Adversarial Queueing Theory
• Congestion Control
101
Interconnects
Output Queueing
Individual Output Queues
Centralized Shared Memory
102
Buffer Model
• We consider an N×N switch
• Shared memory able to hold M bytes
• Packets may be either:
– accepted/rejected
– preempted
• All packets have the same size
M
103
Shared Memory Example
104
Competitive Analysis
Aim: maximize the total number of packets
transmitted
For each packet sequence S denote,
• VOPT(S): value of best possible solution,
• VA(S): value obtained by algorithm A
Throughput-Competitive Ratio:
MAXS {VOPT(S) / VA(S)}
Uniform performance guarantee
105
Longest Queue Drop Policy
When a packet arrives:
– Always accept if the buffer is not full
– Otherwise we accept the packet and drop a
packet from the tail of the longest queue
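A sketch of the admission step (my own rendering of the policy just described), with the shared memory modeled as per-output FIFO queues whose total occupancy is capped at M:

from collections import deque

def lqd_admit(queues, M, packet, out_port):
    """Longest Queue Drop: always accept if the shared buffer (size M) has room;
    otherwise accept and push out the packet at the tail of the longest queue."""
    room = sum(len(q) for q in queues.values()) < M
    queues.setdefault(out_port, deque()).append(packet)
    if room:
        return None                               # nothing dropped
    longest = max(queues, key=lambda p: len(queues[p]))
    return queues[longest].pop()                  # preempt from the tail

# Tiny demo: port 0 receives a burst, then port 1 trickles; the burst is preempted.
qs = {}
for pkt, port in [("a%d" % i, 0) for i in range(8)] + [("b%d" % i, 1) for i in range(4)]:
    lqd_admit(qs, M=6, packet=pkt, out_port=port)
print({p: list(q) for p, q in qs.items()})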
106
Longest Queue Drop Policy
M=9
107
LQD Policy Analysis
Theorem 1 (UB): The competitive ratio of the
LQD Policy is at most 2.
Theorem 2 (LB): The competitive ratio of the
LQD policy is at least √2.
Theorem 3 (LB): The competitive ratio of any
online policy is at least 4/3.
108
Proof Outline (UB)
Definition: An OPT packet p sent at time t is
an extra packet if the LQD port is idle.
Claim: There exists a matching of each
packet in EXTRA to a packet in LQD.
OPT
LQD
EXTRA
109
Matching Construction
• For each unmatched OPT packet p in a
higher position than the LQD queue length:
• When p arrives and it is accepted by both
OPT and LQD then match p to itself
• Otherwise, match p to any unmatched packet
in LQD
• If a matched LQD packet p is preempted,
then the preempting packet replaces p.
110
Proof Outline (UB)
OPT
LQD
111
Proof Outline (UB)
Lemma:
The matching process never fails.
• Notice: V(OPT) ≤ V(LQD) + V(EXTRA)
• Existence of matching implies: V(EXTRA) ≤ V(LQD)
• We obtain that: V(OPT) ≤ 2·V(LQD)
112
Proof Outline (LB)
Scenario (active ports 1 & 2):
• At t = 0 two bursts of M packets to 1 & 2 arrive.
• The online retains at most M/2, say 1’s packets.
• During the following M time slots one packet
destined to 2 arrives.
• The scenario is repeated.
113
Proof Outline (LB-LQD)
Scenario:
• the switch memory: M = A²/2 + A
• the number of output ports: N = 3A (A active, A overloaded, A idle)
114
Proof Outline (LB-LQD)
Active output ports:
• have an average load of 1 with period A
• the bursts to successive ports are evenly
staggered in time
Overloaded output ports:
• receive exactly 2 packets every time slot
115
Proof Outline (LB-LQD)
OPT ensures that both the active and
overloaded output ports are completely
utilized.
At the same time the throughput of the
active output ports in LQD is (√2 − 1)·A.
116
Other Policies
Complete Partition: N-competitive
– Allocate to each output port M/N buffer
space
Complete Sharing: N-competitive
– Admit packets into the buffer if there is
some free space
117
Other Policies Cont.
Static Threshold: N-competitive
– Set the threshold for a queue length to M/N
– A packet is admitted if the threshold is not violated
and there is a free space
Dynamic Threshold: open problem
– Set the threshold for a queue length to the amount
of the free buffer space
– All packets above the threshold are rejected
118
Competitive Analysis in Networking:
Outline
• Shared Memory Switches
• Multicast Trees
– The Greedy Strategy
• Routing and Admission Control
– The Exponential Metric
• More Restricted Adversaries
– Adversarial Queueing Theory
• Congestion Control
119
Steiner Tree Problem
Objective: find a minimum cost tree
connecting S.
120
KMB Algorithm (Offline)
Due to [Kou, Markowsky and Berman 81’]
• Step 1: Construct a complete directed distance
graph G1=(V1,E1,c1).
• Step 2: Find the min spanning tree T1 of G1.
• Step 3: Construct a subgraph GS of G by replacing
each edge in T1 by its corresponding shortest path
in G.
• Step 4: Find the min spanning tree TS of GS.
• Step 5: Construct a Steiner tree TH from TS by
deleting edges in TS if necessary, so that all the
leaves in TH are Steiner points.
121
KMB Algorithm Cont.
Worst case time complexity O(|S||V|2).
Cost is no more than 2(1 − 1/l) × (optimal cost),
where l is the number of leaves in the Steiner tree.
122
KMB Example
[Figure: KMB example on a graph with destination nodes (to be connected) and intermediate nodes: the complete distance graph G1 over the destinations, its minimum spanning tree T1, and the corresponding shortest paths back in G]
123
KMB Example Cont.
[Figure: KMB example continued: the subgraph GS formed by the shortest paths, its minimum spanning tree TS, and the resulting Steiner tree TH]
124
Incremental Construction of Multicast
Trees
• Fixed Multicast Source s
– K receivers arrive one by one
– Must adapt multicast tree to each new arrival
without rerouting existing receivers
– Malicious adversary generates bad requests
– Objective: Minimize total size of multicast tree
[Figure: source s, receiver r1, and two candidate connecting paths a and b; the online choice gives competitive ratio C = 3/2]
Can create worse sequences
125
Dynamic Steiner Tree (DST)
• G=(V,E) weighted, undirected,
connected graph.
• Si ⊆ V is the set of terminal nodes to
be connected at step i.
126
Two Classes of Online Algorithms
• Shortest Path Algorithm
– Each receiver connects using shortest path to
source (or to a core)
• DVMRP [Waitzman, Partridge, Deering ’88]
• CBT
[Ballardie, Francis, Crowcroft ‘93]
• PIM
[Deering et al. ’96]
• Greedy Algorithm [Imase and Waxman ‘91]
– Each receiver connects to the closest point on the
existing tree
– Independently known to the Systems community
• The “naive” algorithm [Doar and Leslie ‘92]
• End-system multicasting [Faloutsos, Banerjea, Pankaj ’98;
Francis ‘99]
127
Shortest Path Algorithm: Example
[Figure: source s, receivers r1, r2, r3, …, rK, and a long path of length N]
• Receivers r1, r2, r3, … , rK join in order
128
Shortest Path Algorithm
[Figure: each receiver's shortest path to s uses its own branch of length N]
• Cost of shortest path tree ≈ K·N
129
Shortest Path Algorithm
Competitive Ratio
[Figure: the optimum tree connects r1, …, rK to s using the length-N path only once]
• Optimum cost ≈ K + N
• If N is large, the competitive ratio is K
130
Greedy Algorithm
• Theorem 1: For the greedy algorithm,
competitive ratio = O(log K)
• Theorem 2: No algorithm can achieve a
competitive ratio better than log K
[Imase and Waxman ’91]
Greedy algorithm is the optimum strategy
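A sketch of the greedy strategy (my own illustration): each arriving receiver runs a shortest-path search until it hits the existing tree and grafts that path on. The graph format and function name are assumptions.

import heapq

def greedy_join(graph, tree_nodes, tree_edges, receiver):
    """Attach `receiver` to the closest point of the existing multicast tree.

    graph: {u: {v: cost}}; tree_nodes/tree_edges describe the current tree.
    Dijkstra runs outward from the receiver until it reaches any tree node,
    then that shortest path is added to the tree. Returns the amount paid.
    """
    dist, prev = {receiver: 0}, {}
    heap, seen = [(0, receiver)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u in tree_nodes:                       # nearest point on the tree
            path = [u]
            while path[-1] != receiver:
                path.append(prev[path[-1]])
            for a, b in zip(path, path[1:]):      # graft the path onto the tree
                tree_nodes.update((a, b))
                tree_edges.add((a, b))
            return d
        for v, c in graph[u].items():
            if v not in dist or d + c < dist[v]:
                dist[v] = d + c
                prev[v] = u
                heapq.heappush(heap, (d + c, v))
    raise ValueError("receiver not connected to the tree")

g = {"s": {"a": 1}, "a": {"s": 1, "b": 1}, "b": {"a": 1}}
nodes, edges = {"s"}, set()
print(greedy_join(g, nodes, edges, "b"))   # pays 2: the path b-a-s is added to the tree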
131
Proof of Theorem 1
[Alon and Azar ’93]
• L = Size of the optimum multicast tree
• pi = amount paid by online algorithm for ri
– i.e. the increase in size of the greedy multicast
tree as a result of adding receiver ri
• Lemma 1: The greedy algorithm pays 2L/j or more for at most j receivers
  – Assume the lemma
  – Total cost ≤ 2L (1 + 1/2 + 1/3 + … + 1/K) ≈ 2L log K
132
Proof of Lemma 1
Suppose towards a contradiction that there
are more than j receivers for which the
greedy algorithm paid more than 2L/j
– Let these be r1, r2, … , rm, for m larger than j
– Each of these receivers is at least 2L/j away from
each other and from the source
133
Tours and Trees
[Figure: a tour visiting s, r1, r2, …, rm, and the optimum tree over the same nodes]
Each segment is ≥ 2L/j and there are more than j of them, so the tour cost > 2L.
But one can construct a tour from the optimum tree by repeating edges at most twice, so the tour cost ≤ 2L, a contradiction.
134
Competitive Analysis in Networking:
Outline
• Shared Memory Switches
• Multicast Trees
– The Greedy Strategy
• Routing and Admission Control
– The Exponential Metric
• More Restricted Adversaries
– Adversarial Queueing Theory
• Congestion Control
135
The Exponential Cost Metric
• Consider a resource with capacity C
• Assume that a fraction λ of the resource has been consumed
• Exponential cost “rule of thumb”: the cost of the resource is given by m^λ, for an appropriately chosen m
• Intuition: cost increases steeply with λ
  – Bottleneck resources become expensive
[Figure: cost grows steeply as a function of λ]
136
Applications of Exponential Costs
• Exponential cost “rule of thumb” applies to
  – Online Routing
  – Online Call Admission Control
  – Stochastic arrivals
  – Stale Information
  – Power aware routing
137
The Online Routing Problem
• Connection establishment requests arrive online
in a VPN (Virtual Private Network)
• Must assign a route to each connection and
reserve bandwidth along that route
– PVCs in ATM networks
– MPLS + RSVP in IP networks
• Oversubscribing is allowed
– Congestion = the worst oversubscribing on a link
• Goal: Assign routes to minimize congestion
• Assume all connections have identical b/w
requirement, all links have identical capacity
138
Online Routing Problem: Example
[Figure: source s, receiver r1, and two candidate routes a and b; the online choice gives congestion ratio C = 2]
Can create worse sequences
139
Online Algorithm for Routing
• λL = fraction of the bandwidth of link L that has already been reserved
• m = N, the size of the network
• The Exponential Cost Algorithm (a sketch follows below):
  – Route each incoming connection on the currently cheapest path from src to dst, where link L costs m^λL
  – Reserve bandwidth along this path
[Aspnes et al. ‘93]
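A sketch of the exponential-cost routing rule (my own illustration, with unit-bandwidth requests; load tracks the reserved fraction of each directed link, and m would be set to N as above):

import heapq

def route_request(graph, load, src, dst, m):
    """Route one connection on the cheapest path under exponential link costs.

    graph: {u: iterable of neighbors}; load[(u, v)]: fraction of (u, v) reserved.
    Cost of link (u, v) is m ** load[(u, v)]; Dijkstra finds the cheapest path,
    then one unit of (unit-capacity) bandwidth is reserved along it.
    """
    dist, prev = {src: 0.0}, {}
    heap, done = [(0.0, src)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v in graph[u]:
            c = m ** load.get((u, v), 0.0)
            if v not in dist or d + c < dist[v]:
                dist[v] = d + c
                prev[v] = u
                heapq.heappush(heap, (d + c, v))
    v = dst                                   # reserve bandwidth along the path
    while v != src:
        u = prev[v]
        load[(u, v)] = load.get((u, v), 0.0) + 1.0
        v = u
    return dist[dst]

g = {"s": {"a", "b"}, "a": {"s", "t"}, "b": {"s", "t"}, "t": {"a", "b"}}
load = {}
for _ in range(3):
    route_request(g, load, "s", "t", m=4)
print(load)   # the connections spread over the two parallel routes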
140
Online Algorithm for Routing
• Theorem 1: The exponential cost algorithm
achieves a competitive ratio of O(log N)
for congestion
• Theorem 2: No algorithm can achieve
competitive ratio better than log N in
asymmetric networks
This simple strategy is optimum!
141
Applications of Exponential Costs
• Exponential cost “rule of thumb” applies to
  – Online Routing
  – Online Call Admission Control
  – Stochastic arrivals
  – Stale Information
  – Power aware routing
142
Online Admission Control and Routing
• Connection establishment requests arrive
online
• Must assign a route to each connection
and reserve bandwidth along that route
• Oversubscribing is not allowed
– Must perform admission control
• Goal: Admit and route connections to
maximize total number of accepted
connections (throughput)
143
Exponential Metric and Admission
Control
• When a connection arrives, compute the
cheapest path under current exponential
costs
• If the cost of the path is less than m then
accept the connection; else reject
[Awerbuch, Azar, Plotkin ’93]
• Theorem: This simple algorithm admits at
least O(1/log N) as many calls as the
optimum
144
Applications of Exponential Costs
• Exponential cost “rule of thumb” applies to
  – Online Routing
  – Online Call Admission Control
  – Stochastic arrivals
  – Stale Information
  – Power aware routing
145
Assume Stochastic Arrivals
• Connection arrivals are Poisson, durations are memoryless
• Assume fat links (capacity >> log N)
• Theorem: The exponential cost algorithm results in
  1. Near-optimum congestion for the routing problem
  2. Near-optimum throughput for the admission problem
  [Kamath, Palmon, Plotkin ’96]
Near-optimum: competitive ratio = (1 + ε) for ε close to 0
146
Versatility of Exponential Costs
• Guarantees of log N for Competitive ratio
against malicious adversary
• Near-optimum for stochastic arrivals
• Near-optimum given fixed traffic matrix
[Young ’95; Garg and Konemann ’98]
147
Applications of Exponential Costs
• Exponential cost “rule of thumb” applies to
  – Online Routing
  – Online Call Admission Control
  – Stochastic arrivals
  – Stale Information
  – Power aware routing
148
Exponential Metrics and Stale
Information
• Exponential metrics continue to work well if
– Link states are a little stale
– Shortest paths are reused over small intervals
rather than recomputed for each connection
– No centralized agent
[Goel, Meyerson, Plotkin ’01]
• Caveat: Still pretty hard to implement
149
Applications of Exponential Costs
• Exponential cost “rule of thumb” applies to
  – Online Routing
  – Online Call Admission Control
  – Stochastic arrivals
  – Stale Information
  – Power aware routing
150
Power Aware Routing
• Consider a group of small mobile nodes eg.
sensors which form an adhoc network
– Bottleneck Resource: Battery
– Goal: Maximize the time till the network partitions
• Assign a cost to each mobile node which is
m^λ, where λ = fraction of battery consumed
– Send packets over the cheapest path under this
cost measure
• O(log n) competitive against an adversary
– Near-optimum for stochastic/fixed traffic
151
Competitive Analysis in Networking:
Outline
• Shared Memory Switches
• Multicast Trees
– The Greedy Strategy
• Routing and Admission Control
– The Exponential Metric
• More Restricted Adversaries
– Adversarial Queueing Theory
• Congestion Control
152
Adversarial Queueing Theory
Motivation
• Malicious, all-knowing adversary
– Injects packets into the network
– Each packet must travel over a specified route
• Suppose adversary injects 3 packets per second
from s to r
– Link capacities are one packet per second
s
r
– No matter what we do, we will have unbounded queues and
unbounded delays
– Need to temper our definition of adversaries
153
Adversarial Queueing Theory
Bounded Adversaries
• Given a window size W, and a rate r < 1
– For any link L, and during any interval of duration
T > W, the adversary can inject at most rT
packets which have link L in their route
• Adversary can’t set an impossible task!!
– More gentle than competitive analysis
• Will study packet scheduling strategies
– Which packet to forward if more than one packets
are waiting to cross a link?
154
Some Interesting Scheduling Policies
• FIFO: First In First Out
• LIFO: Last In First Out
• NTG: Nearest To Go
– Forward a packet which is closest to destination
• FTG: Furthest To Go
– Forward a packet which is furthest from its destination
• LIS: Longest In System
– Forward the packet that got injected the earliest
– Global FIFO
• SIS: Shortest In System
– Forward the packet that got injected the last
– Global LIFO
155
Stability in the Adversarial Model
• Consider a scheduling policy (eg. FIFO,
LIFO etc.)
• The policy is universally stable if, for all
networks and all “bounded adversaries”,
the packet delays and queue sizes remain
bounded
• FIFO, LIFO, NTG are not universally
stable [Borodin et al. ‘96]
• LIS, SIS, FTG are universally stable
[Andrews et al. ‘96]
156
Adversarial Queueing Model: Routing
Using the Exponential Cost Metric
• Adversary injects packets into the
network but gives only the src, dst
– The correct routes are hidden
• Need to compute routes
– Again, use the exponential cost metric
– Reset the cost periodically to zero
– Use any stable scheduling policy
• Theorem: The combined routing and
scheduling policy is universally stable
[Andrews et al. ’01]
157
Competitive Analysis in Networking:
Outline
• Shared Memory Switches
• Multicast Trees
– The Greedy Strategy
• Routing and Admission Control
– The Exponential Metric
• More Restricted Adversaries
– Adversarial Queueing Theory
• Congestion Control
158
The Problem
[Figure: sources on the left send traffic through the network to sinks on the right]
• What rates should the users use to send
their data?
• How to keep the network efficient and
fair?
• Goal: match the available bandwidth !
159
Model Description
• Model
  – Time divided into steps
  – Oblivious adversary
  – At each step i, the adversary chooses the available bandwidth bi; the source (algorithm) picks and sends xi
• Severe cost function: the algorithm gets credit for xi only when xi ≤ bi, and nothing otherwise
[Figure: timeline of adversary choices bi and algorithm choices xi]
160
Competitive Ratio
• An algorithm achieves ALG = Σi xi·1[xi ≤ bi]
• Optimal (offline) achieves OPT = Σi bi
• Seek to minimize the competitive ratio OPT/ALG
161
Adversary Model
• Unrestricted Adversary
– Has too much power
• Fixed Range Adversary
• µ-multiplicative adversary
• {α,β}-additive adversary
162
Fixed Range Model
• Adversary selects any value in the range [c, d]
• Deterministic Algorithm
  – An optimal deterministic algorithm would never select a rate > c
    • If it did, the adversary could select c, causing the algorithm to send 0
  – So it selects c
  – In that case, the adversary selects d
  – Competitive ratio is d/c
163
Fixed range – Randomized Algorithm
• No randomized algorithm can achieve
competitive ratio better than 1+ln(d/c) in
the fixed range model with range [c,d]
• Proof :
– Yao’s minimax principle
– Consider a randomized adversary against
deterministic algorithms
– Adversary can choose g(y) = c/y^2 in [c,d)
– With probability c/d chooses d
164
Proof continued ….
• If the algorithm picks xi = x
• The expected optimal is at most
165
µ-multiplicative model – Randomized
Algorithm
• No randomized algorithm can achieve
competitive ratio better than ln(µ) + 1
• Proof:
– Adversary can always choose bi in [bi, µbi]
166
Randomized Algorithm 4 log(µ) + 12
• Assumptions (relaxed later): µ is a power of 2,
  and b1 is in the range [1, 2µ)
• Algorithm (MIMD)
  – At step 1, pick x1 at random among the powers of 2 between 1 and 2µ
  – On failure, xi+1 = xi/2
  – On success, xi+1 = 2µ·xi
• Claim:
  – Competitive ratio of 4 log(µ) + 12
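A sketch of the MIMD rule above (my own illustration; the bandwidth sequence fed to it is a placeholder standing in for the adversary's choices):

import random

def mimd_transmit(bandwidths, mu):
    """Multiplicative-increase / multiplicative-decrease probing of an unknown
    bandwidth sequence. Assumes mu is a power of 2 and b1 is in [1, 2*mu)."""
    # Step 1: pick x1 uniformly among the powers of 2 between 1 and 2*mu.
    x = 2 ** random.randint(0, int(mu).bit_length())
    sent = 0
    for b in bandwidths:
        if x <= b:                  # success: everything sent goes through
            sent += x
            x = 2 * mu * x          # on success, x_{i+1} = 2*mu*x_i
        else:                       # failure: nothing goes through
            x = x / 2               # on failure, x_{i+1} = x_i / 2
    return sent

print(mimd_transmit([4, 8, 8, 2, 16], mu=8))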
167
Proof outline
• Think about choosing one deterministic
algorithm from log(2µ) + 1 choices
• Think about the algorithms as an ensemble
running in parallel
• Will show that the ensemble manages to
send at least opt/4. [A bit of work]
• Once this is done, picking one algorithm
gives opt/4(log(µ)+2)
168
Proof (1/3)
• The algorithms in the ensemble pick a consecutive sequence of rates (powers of 2)
• Suppose the ensemble is successful at step i
  – bi falls in the picked range
  – ei : largest value sent by any algorithm
  – bi < 2ei
  – At the next step, if the bandwidth increases or stays constant, the ensemble will succeed again
    • bi < 2ei, bi+1 < µbi => bi+1 < 2µei
    • so the bandwidth still lies in the range covered by the ensemble
169
Proof (2/3)
• Need to worry about decreasing bandwidth
– May decrease very fast
– Ensemble achieved ei at step i
– Now it was unsuccessful at step i + 1
• Could not have been more than ei available
– At step i+2, they all divide their rates by 2
• Could not have been more than ei/2 available
– By induction, one can show that :
• ei + ei/2 + ei/4 + …. = 2ei
170
Proof (3/3)
• The optimal algorithm could have achieved at most 4ei
  – up to 2ei at step i, because it is not constrained to choose a power of 2
  – at most 2ei over the following steps where the ensemble was not successful
• Summing over all time steps, the ensemble transmits at least opt/4
• µ assumption: round µ up to the next power of 2; this results in log(µ) + 3 algorithms
171
References
1. N. Alon and Y. Azar. On-line Steiner trees in the Euclidean plane. Discrete and Computational Geometry, 10(2), 113-121, 1993.
2. M. Andrews, B. Awerbuch, A. Fernandez, J. Kleinberg, T. Leighton, and Z. Liu. Universal stability results for greedy contention-resolution protocols. Proceedings of the 37th IEEE Conference on Foundations of Computer Science, 1996.
3. M. Andrews, A. Fernandez, A. Goel, and L. Zhang. Source Routing and Scheduling in Packet Networks. To appear in the proceedings of the 42nd IEEE Foundations of Computer Science, 2001.
4. J. Aspnes, Y. Azar, A. Fiat, S. Plotkin, and O. Waarts. On-line load balancing with applications to machine scheduling and virtual circuit routing. Proceedings of the 25th ACM Symposium on Theory of Computing, 1993.
5. B. Awerbuch, Y. Azar, and S. Plotkin. Throughput competitive online routing. Proceedings of the 34th IEEE Symposium on Foundations of Computer Science, 1993.
6. A. Ballardie, P. Francis, and J. Crowcroft. Core Based Trees (CBT) - An architecture for scalable inter-domain multicast routing. Proceedings of the ACM SIGCOMM, 1993.
172
References [Contd.]
7. A. Borodin, J. Kleinberg, P. Raghavan, M. Sudan, and D. Williamson. Adversarial queueing theory. Proceedings of the 28th ACM Symposium on Theory of Computing, 1996.
8. S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei. The PIM architecture for wide-area multicast routing. IEEE/ACM Transactions on Networking, 4(2), 153-162, 1996.
9. M. Doar and I. Leslie. How bad is Naïve Multicast Routing? IEEE INFOCOM, 82-89, 1992.
10. M. Faloutsos, A. Banerjea, and R. Pankaj. QoSMIC: quality of service sensitive multicast Internet protocol. Computer Communication Review, 28(4), 144-153, 1998.
11. P. Francis. Yoid: Extending the Internet Multicast Architecture. Unrefereed report, http://www.isi.edu/div7/yoid/docs/index.html .
12. N. Garg and J. Konemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. Proceedings of the 39th IEEE Foundations of Computer Science, 1998.
173
References [Contd.]
13. A. Goel, A. Meyerson, and S. Plotkin. Distributed Admission Control, Scheduling, and Routing with Stale Information. Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms, 2001.
14. A. Goel and K. Munagala. Extending Greedy Multicast Routing to Delay Sensitive Applications. Short abstract in proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, 2000. Long version to appear in Algorithmica.
15. M. Imase and B. Waxman. Dynamic Steiner tree problem. SIAM J. Discrete Math., 4(3), 369-384, 1991.
16. C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: A scalable and robust communication paradigm for sensor networks. Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCOM), 2000.
17. A. Kamath, O. Palmon, and S. Plotkin. Routing and admission control in general topology networks with Poisson arrivals. Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms, 1996.
18. D. Waitzman, C. Partridge, and S. Deering. Distance Vector Multicast Routing Protocol. Internet RFC 1075, 1988.
19. N. Young. Randomized rounding without solving the linear program. Proceedings of the 6th ACM-SIAM Symposium on Discrete Algorithms, 1995.
174
References [Contd.]
20. R. Karp, E. Koutsoupias, C. Papadimitriou, and S. Shenker,
“Optimization problems in congestion control”. In Proceedings of the
41st Annual IEEE Symposium on Foundations of Computer Science.
21. S. Arora, B. Brinkman, “A Randomized Online Algorithm for Bandwidth
Utilization ”
175
Non-Preemptive Scheduling
of Optical Switches
Balaji Prabhakar
Optical Fabric
[Figure: tunable lasers at the inputs and receivers at the outputs of the optical fabric]
Switching is achieved by tuning lasers to
different wavelengths
The time to tune the lasers can be much
longer than the duration of a cell
177
Model Description
 Input-queued switch.
 Scheduler picks a new configuration
(matching).
 There is a configuration delay C.
 Then the configuration is held for a
pre-defined period of time.
178
The Bipartite Scheduling Problem
 The makespan of the schedule:
• total holding time +
• the configuration overhead.
 Goal: minimize the makespan.
 Preemptive: cells from a single queue can be
scheduled in different configurations.
 Non-preemptive: all cells from a single queue
are scheduled in just one configuration.
179
Non-Preemptive Scheduling
Minimizes the number of reconfigurations.
Allows the design of low-complexity schedulers,
which can operate at high speeds.
Handles variable-size packets efficiently:
no need to keep packet reassembly buffers.
180
Greedy Algorithm
The weight of each edge is the occupancy of
the corresponding input queue.
1. Create a new matching.
2. Go over uncovered edges in order of non-increasing weight. Add the edge to the
matching if possible, marking it as covered.
3. If there are uncovered edges, goto Step 1.
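A sketch of this greedy construction (my own rendering of the three steps above; the demand matrix in the demo is made up):

def greedy_schedule(demand):
    """Non-preemptive greedy: repeatedly build a matching by scanning the still
    uncovered entries of the demand matrix in order of non-increasing weight.

    demand: {(input, output): number of queued cells}. Returns a list of
    (holding_time, matching) pairs; the holding time is the heaviest edge served.
    """
    uncovered = dict(demand)
    schedule = []
    while uncovered:
        used_in, used_out, matching = set(), set(), {}
        for (i, j), w in sorted(uncovered.items(), key=lambda e: -e[1]):
            if i not in used_in and j not in used_out:
                matching[(i, j)] = w
                used_in.add(i)
                used_out.add(j)
        for e in matching:
            del uncovered[e]
        schedule.append((max(matching.values()), matching))
    return schedule

demand = {(1, 1): 7, (1, 2): 3, (2, 1): 4, (2, 2): 2, (3, 3): 5}
for hold, m in greedy_schedule(demand):
    print(hold, m)
# Makespan = sum of the holding times + (number of matchings) * C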
181
Greedy Algorithm Example
[Figure: an example demand matrix scheduled by Greedy using three matchings]
Total holding time: 7 + 3 + 2
Configuration overhead: 3C
182
Analysis of Greedy: Complexity
Theorem 1: Greedy needs at most 2N-1
configurations.
Proof outline:
• Consider all VOQi* and all VOQ*j
• There can be at most 2N-1 such queues
• At each iteration, at least one of the
corresponding edges is covered
• Thus, after 2N-1 iterations VOQij
must be served.
183
Analysis of Greedy: Makespan
Theorem 2 (UB):
Greedy achieves an approximation
factor of at most 2 for all values of C.
Theorem 3 (Greedy-LB):
Greedy achieves an approximation
factor of at least 2 for C = ∞.
184
Proof of Theorem 2
Consider the k-th matching and let (i,j) be the
heaviest edge of weight w.
Lemma 1: There are at least k/2 edges of
weight ≥ w incident to either input i or
output j.
Proof outline:
In all iterations 1,...,k-1 Greedy chose an edge
of weight ≥ w incident to i or j.
185
Proof of Theorem 2 Cont.
Observation 1: OPT’s schedule
contains at least k/2
configurations.
Observation 2: The k/2-th largest
holding time in OPT’s schedule is
at least w.
The theorem follows !
186
Hardness Results
Theorem 4 (General-LB):
The NPBS problem is NP-hard for all
values of C and hard to approximate
within a factor better than 7/6.
Proof outline: [GW85, CDP01]
• Reduction from the Restricted Timetable
Design problem (assignment of teachers for 3 hours).
• Encoding as a demand matrix, C = ∞.
• There is an optimal non-preemptive
schedule that contains 3 matchings.
• Works for all values of C !
187
Offline vs. Online
 We considered Greedy in the offline case
 What if packets constantly arrive ?
 We use the idea of batch scheduling
 Avoids starvation since all queued cells
are included in the next batch
188
Batch Scheduling
[Figure: batch scheduling: cells arriving at rate R per port are collected into batch (k+1) while batch (k) is scheduled across the crossbar]
189
Requirements
 We have shown that the makespan of
Greedy is at most twice that of OPT
 A moderate speedup of 2 will allow us to
provide strict delay guarantees for any
admissible traffic
190
Open Problems
• Close the gap between the upper
and the lower bound (2 vs. 7/6),
• Consider packet-mode scheduling
191
Literature
 Preemptive scheduling:
 [Inukai79] Inukai. An Efficient SS/TDMA Time Slot Assignment Algorithm. IEEE Trans. on Communication, 27:1449-1455, 1979.
 [GW85] Gopal and Wong. Minimizing the Number of Switchings in a SS/TDMA System. IEEE Trans. Communication, 33:497-501, 1985.
 [BBB87] Bertossi, Bongiovanni and Bonuccelli. Time Slot Assignment in SS/TDMA systems with intersatellite links. IEEE Trans. on Communication, 35:602-608, 1987.
 [BGW91] Bonuccelli, Gopal and Wong. Incremental Time Slot Assignment in SS/TDMA satellite systems. IEEE Trans. on Communication, 39:1147-1156, 1991.
 [GG92] Ganz and Gao. Efficient Algorithms for SS/TDMA scheduling. IEEE Trans. on Communication, 38:1367-1374, 1992.
 [CDP01] Crescenzi, Deng and Papadimitriou. On Approximating a Scheduling Problem. Journal of Combinatorial Optimization, 5:287-297, 2001.
 [TD02] Towles and Dally. Guaranteed Scheduling for Switches with Configuration Overhead. Proc. of INFOCOM'02.
 [LH03] Li and Hamdi. -Adjust Algorithm for Optical Switches with Reconfiguration Delay. Proc. of ICC'03.
 ... many others
 Non-preemptive scheduling:
 [PR00] Prais and Ribeiro. Reactive GRASP: An Application to a Matrix Decomposition Problem in TDMA Traffic Assignment. INFORMS Journal on Computing, 12:164-176, 2000.
192