Reflections on the Development of A/P Nets


Principles in Practice
CS 685 Network Algorithmics
Spring 2006
Scheduling for ATM VC's
Background:
• Asynchronous Transfer Mode:
virtual-circuit service
• Individual circuits may be flow-controlled for various reasons:
– bandwidth allocation
– end-to-end flow control
• At each output port, each v.c. has
its own queue of packets waiting
for transmission
• A scheduler selects which packet
to transmit next, according to
per-v.c. state
[Figure: per-v.c. packet queues feeding a scheduler that selects the next packet to transmit]
Scheduling for ATM VC's
Background:
• To be eligible for selection a v.c.
must have:
– at least one "credit" (i.e. it has
not yet used up its allocation)
– at least one packet queued
• "Credits" may be accumulated by:
– passage of time (for BW
allocation)
– transmission from upstream (for
end-to-end flow control)
Scheduling for ATM VC's
Problem:
• Too expensive to find next ready
queue via linear search, if there are
1000's of virtual circuits
• How can we avoid spending time
looking at many queues that are
not ready?
Hint: P12 "add state for speed"
Scheduling for ATM VC's
Solution (version 0):
• Maintain a queue of ready v.c.'s
• When a flow becomes ready, add it
via tail pointer
– Flow becomes ready when
• #pkts > 0 and #credits goes 0 → 1
• #credits > 0 and #pkts goes 0 → 1
• When a flow is serviced (i.e. a
packet is transmitted), remove it
via head pointer
– If still ready after xmission, place it
back in queue via tail
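
A C sketch of this version-0 scheme; every name here (vc_t, ready_enqueue, service_next) is an illustrative assumption, not something from the slides:

    #include <stddef.h>

    typedef struct vc {
        int npkts;           /* packets queued on this v.c. */
        int ncredits;        /* credits currently available */
        struct vc *next;     /* link in the ready list */
    } vc_t;

    static vc_t *ready_head = NULL, *ready_tail = NULL;

    /* Append a v.c. that just became ready (a 0 -> 1 transition). */
    static void ready_enqueue(vc_t *v) {
        v->next = NULL;
        if (ready_tail) ready_tail->next = v; else ready_head = v;
        ready_tail = v;
    }

    /* The scheduler picks the next v.c. in O(1): no linear scan. */
    static vc_t *ready_dequeue(void) {
        vc_t *v = ready_head;
        if (v && !(ready_head = v->next)) ready_tail = NULL;
        return v;
    }

    /* Service one v.c.: consume a packet and a credit, then
       re-append via the tail if it is still ready. */
    static void service_next(void) {
        vc_t *v = ready_dequeue();
        if (!v) return;
        v->npkts--;          /* transmit_packet(v) would go here */
        v->ncredits--;
        if (v->npkts > 0 && v->ncredits > 0)
            ready_enqueue(v);
    }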
Scheduling for ATM VC's
Problem:
• Credits and packets arrive
asynchronously
• How to ensure v.c. is not added to
the list multiple times?
• Ensure that events that may cause
transition between "ready" states
are "atomic"
– But locking is expensive!
Scheduling for ATM VC's
Solution 1:
• Add a ready/not ready bit to each
v.c.'s state:
– bit set means the v.c. is already in
the queue
– test the bit before adding: if clear, set it and enqueue
– clear the bit when removing from the queue
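
One lock-free way to realize this bit is an atomic test-and-set, sketched below with C11 atomics (an anachronism for 2006, and all names are assumptions; the point is only that the winner of the test-and-set is the unique enqueuer):

    #include <stdatomic.h>

    typedef struct vc2 {
        atomic_flag in_queue;    /* set = v.c. is already on the list */
        /* ... packet count, credit count, queue link ... */
    } vc2_t;

    /* Called from both the packet-arrival and credit-arrival paths.
       Only the caller that finds the flag clear enqueues, so the
       v.c. can never be added to the list twice. */
    void maybe_enqueue(vc2_t *v) {
        if (!atomic_flag_test_and_set(&v->in_queue)) {
            /* add v to the ready list via the tail pointer */
        }
    }

    /* Called by the scheduler after removing v via the head pointer.
       If v is still ready, re-adding goes through maybe_enqueue()
       again, preserving the no-duplicates invariant. */
    void on_dequeue(vc2_t *v) {
        atomic_flag_clear(&v->in_queue);
    }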
Scheduling for ATM VC's
Problem:
• Extend the scheme to allow some
v.c.'s to get "more opportunities to
send", based on administrativelyassigned "weights"
– E.g. a v.c.'s with a weight of 2
would get twice as many
opportunities as a v.c. with a
weight of 1
Possible Solution:
• Multiply credits by weight when
they arrive
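
A sketch of this possible solution, reusing vc_t from the earlier sketch (add_credits and its parameters are assumed names):

    /* Scale each arriving credit by the v.c.'s weight: a weight-2
       circuit accumulates credits, and therefore transmission
       opportunities, twice as fast as a weight-1 circuit. */
    void add_credits(vc_t *v, int arrived, int weight) {
        int before = v->ncredits;
        v->ncredits += arrived * weight;
        if (before == 0 && v->npkts > 0)
            ready_enqueue(v);    /* 0 -> 1 transition: now ready */
    }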
Passive Monitor Using Bridge Hardware
Background:
• Ethernet bridge: box with connections to multiple
Ethernets
• Keeps track of which interface a 48-bit MAC address
lives on
• Forwards incoming packets out the right i/f for the
destination address
• Contains special "CAM" hardware:
– Keeps track of up to 64000 (addr, i/f #) pairs
– Given addr, returns i/f # in just 1.4 μsec
Bridge Hardware Lookup
Address: 00:06:20:12:CD:BB

    CAM (up to 64000 entries):
    Address              I/F
    20:AB:0C:71:6F:A8    4
    02:00:20:6A:F3:19    1
    00:06:20:12:CD:BB    6

Address: 00:06:20:12:CD:BB → I/F 6
Passive Monitor Using Bridge Hardware
Problem:
• Want to build a monitor to count the number of packets sent
between source, destination MAC address pairs
• Keep a counter per (S,D) pair, where S, D are 48-bit MAC
addresses
• Need to exploit the existing bridge hardware (P4c) to do the
lookup in less than 64 μsec (minimum inter-packet time on
10Mbps Ethernet)
• There are about 1000 possible sources and 1000 destinations,
but only around 20000 pairs active at any time
Passive Monitor Using Bridge Hardware
Naive Solution:
• Map each 48-bit address to a smaller one (say 10 bits) using
one lookup each
• Store 2-D array indexed by 10-bit quantities
• Bad: array size is about 1,000,000 counters (1024 × 1024) for only ~20000 active pairs
Better:
• Map each 48-bit address to a smaller one (say 24 bits) using
one lookup each
• Concatenate 24-bit IDs to get a single 48-bit ID for the
connection
• Retrieve counter using another hardware lookup with that ID
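
A sketch of the better scheme; cam_lookup() is a hypothetical wrapper standing in for the bridge CAM, not a real API:

    #include <stdint.h>

    /* Assumed hardware primitive: one CAM lookup (~1.4 usec),
       mapping a 48-bit key to a stored value. */
    extern uint32_t cam_lookup(uint64_t key48);

    /* Two lookups map S and D to 24-bit IDs; their concatenation is
       itself a 48-bit quantity, so one more CAM lookup can fetch the
       counter for the (S, D) pair. */
    uint64_t counter_key(uint64_t src_mac, uint64_t dst_mac) {
        uint64_t sid = cam_lookup(src_mac) & 0xFFFFFFu;  /* 24 bits */
        uint64_t did = cam_lookup(dst_mac) & 0xFFFFFFu;  /* 24 bits */
        return (sid << 24) | did;           /* 48-bit connection ID */
    }

Three lookups at ~1.4 μsec each fit comfortably inside the 64 μsec per-packet budget.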
Passive Monitor Solution
Src = 12:34:56:78:9A:BC  →  SrcID = 99:98:1B
Dest = 12:34:56:78:9A:BC  →  DestID = 01:22:FE
CounterID = 99:98:1B:01:22:FE  →  Counter = 322
Refinement I
• Set of active source-destination pairs changes over
time. How to handle this without storing state for
every pair that ever communicated since power-on?
• Solution:
– Run a "sweeper" process in the background, which does the
following:
• Mark every counter periodically
– Mark is cleared when counter is incremented
• Once per period, dump marked counters to secondary storage
and reclaim them (make them available for use with new pairs)
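
A sketch of the sweeper's mark-and-reclaim logic (the pair_counter_t layout and function names are assumptions):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint64_t key;      /* 48-bit (SrcID, DestID) concatenation */
        uint32_t count;
        bool     marked;   /* set by the sweeper, cleared on use */
    } pair_counter_t;

    void on_packet(pair_counter_t *c) {
        c->count++;
        c->marked = false;            /* activity clears the mark */
    }

    /* Once per period: counters still marked saw no traffic since the
       last sweep, so dump them and reclaim their slots for new pairs. */
    void sweep(pair_counter_t *tbl, int n) {
        for (int i = 0; i < n; i++) {
            if (tbl[i].marked) {
                /* dump tbl[i] to secondary storage; free its slot */
                tbl[i].count = 0;
            }
            tbl[i].marked = true;     /* re-mark for the next period */
        }
    }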
Challenge
Can we solve the problem using only two lookups with
bridge hardware, without requiring extra memory?
Tries With Node Compression
Background:
• Trie: tree data structure
[Figure: trie of three nodes whose occupied entries hold keys Q, X, X or pointers to child nodes]
– Each node is an array of M = 2^c elements (M < 32)
– Each entry is either a key (value) or a pointer to another trie
node, or empty
– Empty space is wasted
5 out of 24 entries used → apply P1
Tries With Node Compression
To search the trie:
– Input string broken into c-bit "chunks"
– First chunk used as index into root node's array
– If the indexed entry is:
• null: search terminates (not found)
• Key: search terminates (found, result = Key)
• Pointer to another node: search continues there with next chunk
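
The search loop is short enough to sketch directly in C (the node layout below, with a kind tag per entry, is an assumed representation):

    #include <stdint.h>

    enum { EMPTY, KEY, PTR };

    typedef struct tnode tnode_t;
    typedef struct {
        int kind;                        /* EMPTY, KEY, or PTR */
        union { int key; tnode_t *child; } u;
    } tentry_t;

    struct tnode { tentry_t slot[1 << 3]; };   /* M = 2^c, here c = 3 */

    /* Consume the input string c bits at a time, MSB chunk first;
       returns the key, or -1 if the search hits a null entry. */
    int trie_search(const tnode_t *root, uint32_t s, int nbits, int c) {
        const tnode_t *n = root;
        for (int off = nbits - c; off >= 0; off -= c) {
            const tentry_t *e = &n->slot[(s >> off) & ((1u << c) - 1)];
            if (e->kind == EMPTY) return -1;        /* not found */
            if (e->kind == KEY)   return e->u.key;  /* found */
            n = e->u.child;                         /* next chunk */
        }
        return -1;
    }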
Tries With Node Compression
Problem: how to compress the tree to remove wasted
space without slowing search more than a small
factor?
Can we replace the node array with a linked list?
Not good: takes up to factor of M longer to search
Tries With Node Compression
Idea:
– Use a bitmap (P14) to indicate which entries are missing
– Replace M-element array with array containing only non-null
entries
– To search with c-bit index k (indices start at 1):
[Figure: compressed trie nodes with bitmaps 00010000, 00000100, and 10010010; original index 4 maps to new index 2]
• Convert c-bit index k to smaller index
– If bit k of bitmap is 0 → 0, else → # of 1's in first k bits
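
The index conversion in C, for an 8-bit bitmap whose position 1 is the most significant bit (the function name and convention are assumptions):

    #include <stdint.h>

    /* Returns 0 if slot k is empty, else the 1-based index into the
       compressed array: the number of 1 bits in positions 1..k.
       E.g. bitmap 10010010, k = 4 -> 2, matching the figure. */
    int compressed_index(uint8_t bitmap, int k) {
        uint8_t first_k = bitmap >> (8 - k);   /* keep positions 1..k */
        if ((first_k & 1) == 0) return 0;      /* bit k is 0: empty */
        int ones = 0;
        while (first_k) { ones += first_k & 1; first_k >>= 1; }
        return ones;
    }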
Tries With Node Compression
Efficiency:
– Can count # 1's in bitmap using special hardware (or s/w)
– Requires two memory accesses:
• one for bitmap
• one to read array entry
– Slowdown: factor of 2+
Challenge I
Challenge: Can we use table lookup to speed up counting
# of bits set in bitmap in software (P2a, P14)?
Solution (for M=8, c=3):
– There are 256 possible bitmap values
– Keep a table indexed by [bitmap value, position], containing the
(precomputed) number of 1 bits up to that position (3 bits per
table entry)
– Total: 256 × 8 × 3 = 6144 bits → probably doable in hardware
Example rows (number of 1's in positions 1..p, position 1 = MSB):

    bitmap      p=1  2  3  4  5  6  7  8
    00000000      0  0  0  0  0  0  0  0
    ...
    01001100      0  1  1  1  2  3  3  3
    ...
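
A software sketch of the precomputed table (P2a); all names are assumptions:

    #include <stdint.h>

    /* ones_upto[b][p-1] = number of 1 bits in positions 1..p of the
       8-bit bitmap b (position 1 = MSB). 256 x 8 entries, each
       carrying 3 bits of information; filled once at startup. */
    static uint8_t ones_upto[256][8];

    void build_table(void) {
        for (int b = 0; b < 256; b++) {
            int ones = 0;
            for (int p = 0; p < 8; p++) {
                ones += (b >> (7 - p)) & 1;    /* scan MSB first */
                ones_upto[b][p] = (uint8_t)ones;
            }
        }
    }

    /* Counting 1's in positions 1..k becomes a single table lookup. */
    int count_ones(uint8_t bitmap, int k) {
        return ones_upto[bitmap][k - 1];
    }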
Challenge II
Challenge: What if the bitmap is really huge? (e.g. c=16)
– Table in previous solution would have 2^64K rows!
How can we speed up counting bits for such a large
bitmap, with at most one additional memory access?
Solution: Apply P12 "add state for speed" and P12a
"compute incrementally"
– Logically divide bitmap into chunks (say 32 bits each)
– For each node, keep an array B with one element per chunk
• For a 64K bitmap in 32-bit chunks: B has 2K elements of 16 bits
each = 1K 32-bit words (still much smaller than 64K entries!)
• B[j] contains # of 1 bits in positions 1 through 32(j-1)
• To count 1's up through position k: B[j] + # of 1's within chunk j
up through position k, where chunk j is the chunk containing k
– Count 1's within chunk j individually
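
A sketch of the chunked scheme in C (0-indexed, so B[j] here holds the count for everything before chunk j; all names assumed):

    #include <stdint.h>

    #define CHUNK_BITS 32
    #define NCHUNKS (65536 / CHUNK_BITS)  /* 64K-bit bitmap: 2K chunks */

    uint32_t bitmap[NCHUNKS];  /* chunk j covers positions 32j+1..32j+32 */
    uint16_t B[NCHUNKS];       /* B[j] = # of 1's in positions 1..32j,
                                  maintained incrementally (P12a) */

    static int popcount32(uint32_t x) {
        int n = 0;
        while (x) { x &= x - 1; n++; }     /* clear lowest set bit */
        return n;
    }

    /* # of 1's in positions 1..k: one access for B[j], one for the
       chunk itself, which is counted individually. */
    int ones_upto_k(int k) {
        int j = (k - 1) / CHUNK_BITS;            /* chunk holding k */
        int within = k - j * CHUNK_BITS;         /* 1..32 inside it */
        return B[j] + popcount32(bitmap[j] >> (CHUNK_BITS - within));
    }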
Packet Classification
Background:
• TCP/IP "flows" identified by some set of fields in
packet headers, including:
– Source, destination IP address
– Source, destination port number
– Protocol (e.g. TCP or UDP)
– Higher-level (application) information
• Packet "filter" = specification of fields that define a set
of packets "of interest" to some receiver
– Example (simple):
Source IP = X, Dest IP = *
Protocol = UDP
Source port = *, Dest port = Z
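
As a concrete sketch, a filter like the example above could be represented with a wildcard sentinel per field (the representation, including the WILD value, is an assumption):

    #include <stdbool.h>
    #include <stdint.h>

    #define WILD 0xFFFFFFFFu    /* assumed sentinel meaning "*" */

    typedef struct {
        uint32_t src_ip, dst_ip;        /* WILD = match anything */
        uint32_t src_port, dst_port;
        uint32_t proto;                 /* e.g. 17 = UDP */
    } filter_t;

    static bool field_ok(uint32_t want, uint32_t got) {
        return want == WILD || want == got;
    }

    /* True iff a packet's header fields satisfy the filter; the
       example filter sets src_ip = X, dst_port = Z, proto = UDP,
       and dst_ip = src_port = WILD. */
    bool filter_matches(const filter_t *f, uint32_t sip, uint32_t dip,
                        uint32_t sport, uint32_t dport, uint32_t proto) {
        return field_ok(f->src_ip, sip) && field_ok(f->dst_ip, dip) &&
               field_ok(f->src_port, sport) &&
               field_ok(f->dst_port, dport) && field_ok(f->proto, proto);
    }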
Packet Classification
Background:
• Consider a "router" that serves a group of receivers
– Receivers specify packets they want to receive by giving filters
– Router takes each incoming packet, matches it against all
filters, and forwards to each receiver that has some matching
filter
• Example:
– To receive NBC's video transmission from Winter Olympics:
specify source address of NBC host in Turin, UDP protocol,
and a specific destination port
Packet Classification
Problem:
• How to compare each packet's headers against many
(possibly thousands) of filter specifications?
Observation:
• Caching (P11a) is not a very good solution
– Caching is good for storing pairs of the form (x, f(x)) in order
to avoid re-computing (or looking up) f(x)
• Store is keyed by x, which is typically small (48 bits or less)
– Here the key would be the full set of header fields, which is large
How could we solve this if header contained a "flow
identifier" field F in the network layer header?
Note: IPv6 has such a field
Packet Classification
Problem: How to assign/use "flow ID" field?
Solution:
• Let the sender assign it!
– E.g. keep a local counter, increment for each flow
• Keep (in addition to the regular list of filters) a "fast"
mapping from (source address, flow ID) to sets of
receivers
• When a packet arrives, first search the "fast" list
– If present, dispatch to indicated set of receivers
– Otherwise search the "slow" list to determine set G of
receivers
– Then add (source address, flow ID, G) to fast list
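
A sketch of the fast/slow path in C; a real fast list would likely be a hash table keyed on (source address, flow ID), but a linked list keeps the sketch short (all names assumed):

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct fast_entry {
        uint32_t src_addr;          /* sender's address */
        uint32_t flow_id;           /* sender-assigned flow ID */
        void    *receivers;         /* the set G for this flow */
        struct fast_entry *next;
    } fast_entry_t;

    static fast_entry_t *fast_list;

    /* Assumed stand-in for full filter matching over the slow list. */
    extern void *slow_classify(const void *pkt_headers);

    void *classify(uint32_t src, uint32_t flow_id, const void *hdrs) {
        for (fast_entry_t *e = fast_list; e; e = e->next)
            if (e->src_addr == src && e->flow_id == flow_id)
                return e->receivers;         /* fast-path hit */
        void *G = slow_classify(hdrs);       /* match all filters */
        /* cache (src, flow_id, G) so later packets take the fast path */
        fast_entry_t *e = malloc(sizeof *e);
        if (e) {
            e->src_addr = src; e->flow_id = flow_id;
            e->receivers = G; e->next = fast_list;
            fast_list = e;
        }
        return G;
    }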
Classification: Tricky Bits
• If a source crashes and restarts flows with different
flow IDs, chaos could result
– Solution:
• Time out "Fast list" entries periodically
• Ensure rebooting host does not re-use old entries (e.g. start with
last used flowID + 1)
• Each "slow list" entry has a pointer to its "fast list" entry
• When a packet fails to match any "fast list" entry, but does
match a "slow list" entry, existing fast list filter should be
invalidated
• When a receiver adds a new filter, "fast list" entries
may need to be updated
– How to do this?