Transcript NP_ch05

Chapter 5
Basic Packet Processing:
Algorithms And Data Structures
1
Outline












2

State information and Resource Exhaustion
Packet Buffer Allocation
Packet Buffer Size And Copying
Protocol Layering And Copying
Heterogeneity And Network Byte Order
Bridge Algorithm
Table Lookup And Hashing
IP Datagram Fragmentation And Reassembly
IP Datagram Forwarding
IP Fragment Algorithm
High-Speed IP Forwarding
TCP Connection Recognition Algorithm
TCP Splicing Algorithm
State information and Resource
Exhaustion



3
Information stored indefinitely in memory is
known state information
The amount of state information depends on
traffic
To allow it to run arbitrarily long, a network
system must be designed with limits on all
resources, and the limits must be fixed
independent of arriving traffic; designs that
violate this principle will not be considered
Packet Buffer Allocation




4
Memory is perhaps the most obvious
resource:Asymmetric traffic is common
Buffer management is closely related to Qos
Network system usually allocate fixed-size
buffer:Variable-size buffer allocation does
not work well → memory fragmentation
How large buffer be:Choose a size
appropriate for the network technology
Packet Buffer Size And Copying


5
Using a single set of buffers helps avoid an
inefficiency:packet copying
Because copying an entire packet from one
locate to another requires time proportional
to the entire packet and because memory
speed is often a bottleneck, network systems
avoid copying whenever possible
Protocol Layering And Copying


6
Higher layer protocols also affect the buffer
allocation decision
Layering implies encapsulation
Two buffer allocation schemes

Large buffer
–
–

Linked List
–
–
7
Input:Place the incoming frame in a buffer, pass the buffer
address among layers, and arrange for each layer to locate the
appropriate header in the buffer
Output:Leave sufficient space in the buffer before the payload,
pass the address of the buffer among layers, and allow each layer
to prepend another header
Output:Allocate a separate buffer for each header and for the
data; construct a packed by forming a linked list where each node
on the list points to one of the header buffers or the data buffer
Input:Create an initial list of one node that points to the frame
buffer; as the packet passed up the stack, append a node to the
list that points to the next header
Heterogeneity And Network Byte Order

To ensure interoperability among heterogeneous
computers, many protocol specify representation
and transmission details

Two integer representations
Little endian
Increasing memory adresses
1
Big endian
8
2
3
4
Increasing memory adresses
4
3
2
1
Integer Conversion




Needed when heterogeneous computers communicate
Protocols define network byte order
Computers convert to network byte order
Typical library functions
Function Data size Translation
ntohs
16 bits
Network bytes order to host’s byte order
htons
16 bits
Host’s byte order to network bytes order
9
ntohl
htonl
32 bits
32 bits
Network bytes order to host’s byte order
Host’s byte order to network bytes order
Examples Of Algorithms Implemented
With Software-Based Systems

Layer 2
–

Layer 3
–
–

TCP connection recognition and splicing
Other
–
10
IP forwarding
IP fragmentation and reassembly
Layer 4
–

Ethernet bridge
Hash table lookup
Ethernet Bridge



11
Commercial Ethernet bridges are often implemented
in software on conventional hardware:the code run
from ROM on an embedded system that consist of a
microprocessor, memory, and two network interfaces
The bridge need receives a copy of any fame:the
interface operate in promiscuous mode
A physical interface on a bridge must not change the
source address in outgoing frames because the
bridge must replicate an exact copy the frame
Bridge Filtering


12
Uses source address in frames to identify
computers on each network
Uses destination address to decide whether
to forward frame
Bridge Algorithm
13
Assume: two network interfaces each operating in promiscuous
mode.
Create an empty list, L, that will contain pairs of values;
Do forever {
Acquire the next frame to arrive;
Set l to the interface over which the frame arrived;
Extract the source address, S;
Extract the destination address, D;
Add the pair (S, I) to list L if not already present.
If the pair (D, l) appears in list L {
Drop the frame;
} Else {
Forward the frame over the other interface;
}
}
Table Lookup And Hashing


Many network systems use a table to store
state information or auxiliary data needed for
processing, and lookup must be table for
each packet
Software-based systems typically use
hashing for table lookup
–
14
Double hashing works well if the hash table
contains many empty slots
Table Lookup Alogrithm
Given: a key, a table in memory, and the table size N.
Produce: a slot in the table that corresponds to the key or an
empty table slot if the key is not in the table.
Method: double hashing with open addressing
Choose P1 and P2 to be prime numbers;
Fold the key to produce an integer, K;
Compute table pointer Q equal to (P1 * K) modulo N;
Compute increment R equal to (P2 * K) modulo N;
While (table slot Q not equal to K and nonempty) {
Q  (Q + R) modulo N;
}
At this point, Q either points to an empty table slot or to the slot
containing the key
15
IP Datagram Fragmentation And
Reassembly





16
Maximum Transmission Unit (MTU):the largest
payload a given network can accommodate
Fragmentation and Reassembly:a router can divide
a large datagram into a set of smaller datagram
called fragments
Each fragment travels independently
Ultimate destination reassembles fragments to
reproduce the original datagram
Implemented in software:The task is
straightforward
Summarizes



Each fragment is a datagram that begins with a datagram
header
Header fields in a fragment are derived from the original
datagram
In a fragment, the following fields differ from the original
datagram
–
–
–
–


17
TOTAL LENGTH
FLAGS
FRAGMENT OFFSET
HEADER CHECKSUM
The size of a fragment is determined by the MTU of the
outgoing network
The FLAGS and FRAGMENT OFFSET fields together identify a
datagram as a fragment; if both contain zero the datagram is
not a fragment
Uses FLAGS Bits in Datagram
Header
0
D
M
FLAGS bits
0 = last fragment; 1 = more fragments
0 = many fragment; 1 = do not fragment
Reserved (must be 0)
18
Interpretation Of The Fragment Offset
Field





19
The FRAGMENT OFFSET fields the relationship
between data in the fragment and data in the original
datagram
The FRAGMENT OFFSET specifies how far into the
original payload the data in the fragment belongs
A receiver uses the offset during reassembly to put
data from each fragment in its original position
The amount of data in a frame must be a multiple of
eight octets
The limit is imposed even if the network MTU allows
extra octets
IP Fragment Algorithm
20
Given: IP datagram, D , and a network MTU.
Produce: a set of fragments for D.
If the DO THE FRAGMENT bit is set {
Stop and report an error ;
}
Compute the size of the datagram header, H;
Chooses N to be the largest multiple of 8 such
that H + K ≤ MTU ;
Initialize an offset counter , O , to zero ;
Repeat until datagram empty {
Create a new fragment that has a copy of D’s header;
Extract up to the next N octets of data from D and Place the data in the
fragment ;
Set the MORE FRAGMENTS bit in fragment header;
Set TOTAL LENGTH field in fragment header to be H + K;
Set FRAGMENT OFFSET field in fragment header to be O ;
Computer and set the CHECKSUM field in fragment header;
Increment O by N/8;
}
Fragmenting A Fragment




21
A datagram may pass over networks with various size MTUs
All offset refer to the original datagram:each fragment is
handled independently
M bit in
original
0
M bit in last
fragment
0
M bit in other
fragments
1
1
1
1
The assignment of the MORE FRAGMENTS bit in fragments
The same rules apply to fragment or a datagram
IP Reassembly





22
When all fragments arrive, the system reassembles
them to produce the original datagram
Out-Of-Order Delivery:routes can change at any
time
Duplication:overlap
Loss:Not retransmit
Concurrent Reception:A receiving system must be
prepared to accept incoming fragment from multiple
datagrams concurrently
Grouping Fragment Together


23
The identification field (ID) in a datagram
header is unique to a given source
A receiving system uses both the ID and IP
source address fields to determine whether
two fragment came from the same datagram
Fragment Position


The receiver must be able to accommodate a fragment that carries an
arbitrary amount of data with an arbitrary offset
Reassembly maintains two pieces of information
–
–

The data octets themselves
Information about which data has been received
Uses a reassembly buffer
40
Reassembly buffer
24
40
40
^
Fragment in
reassenbly buffer
IP Reassembly Algorithm
Give: a fragment, F ,add to a partial reassembly
Method: maintain a set of fragment for each datagram
Extract the IP source address, S , and ID fields from F;
Combine S and ID to produce a lookup key, K;
Find the fragment set with key K or create a new set;
Insert F into the set;
If the set combine all the data for the datagram {
From a completely reassembled datagram and
process it;
}
25
IP Datagram Forwarding




IP forwarder uses the address in each incoming
packet to determine how to forward the packet
IP forwarding operates in layer 3, forwards IP
datagrams, and use layer 3 (i.e., IP) address
Use routing table for each forwarding decision
Conceptual mapping
–
26
(next hop, interface)  f(datagram, routing table)
IP Routing Table


One entry per destination
Entry contains
–
–
–
–
27
32-bit IP address of destination
32-bit address mask
32-bit next-hop address
N-bit interface number
Example IP Routing Table
Destination
Address
192.5.48.0
128.10.0.0
Address
Mask
255.255.255.0
255.255.0.0
Next-Hop
Address
128.210.30.5
128.210.141.12
Interface
Number
2
1
0.0.0.0
0.0.0.0
128.210.30.5
2



28
Values stored in binary
Interface number is for internal use only
Zero mask produces default route
IP Forwarding Algorithm
Given: destination address A and routing table R.
Find: a next hop and interface used to route datagrams to A.
For each entry in table R {
Set MASK to the Address Mask in the entry;
Set DEST to the Destination Address in the entry;
If (A & MASK) == DEST {
Stop; use the next hop and interface in the entry;
}
}
If this point is reached, declare error: no route exists;
29
High-Speed IP Forwarding




30
Binary trie is the most popular
Uses individual bits of the lookup key to navigate
through the trie — the search starts at the root and
proceeds toward the leaves
At each node, a bit of the lookup key specifies
whether to follow the left subtree or the right subtree
Each terminal node must contain a complete IP
Address
Binary Trie
string prefix
01011 0
10000
10001
10101
11001
31
10000
10001
101
1100
node
A
0
A
1
0
B
C
D
E
11010 11010 F
11011 11011 G
11101 111
H
1
0
1
0
1
D
0
H
1
0
E
0
B
1
C
0
F
1
G
TCP Connection





Keep information about the state of each
connection
Involves a pair of endpoints
Started with SYN segment
Terminated with FIN or RESET segment
Identified by 4-tuple
–
32
( src addr, dest addr, src port, dest port )
TCP Connection Recognition
Algorithm
33
Given: a copy of traffic passing across a network.
Produce: a record of TCP connections present in the traffic.
Initialize a connection table, C, to empty;
For each IP datagram that carries a TCP segment {
Extract the IP source, S, and destination, D, addresses;
Extract the source, P1, and destination, P2, port numbers;
Use (S,D,P1,P2) as a lookup key for table C and
create a new entry, if needed;
If the segment has the RESET bit set, delete the entry;
Else if the segment has the FIN bit set, mark the connection
closed in one direction, removing the entry from C if
the connection was previously closed in the other;
Else if the segment has the SYN bit set, mark the connection as
being established in one direction, making it completely
established if it was previously marked as being
established in the other;
}
TCP Splicing





34
Join two TCP connections
Allow data to pass between them
To avoid termination overhead translate
segment header fields
Acknowledgement number
Sequence number
Illustration Of TCP Splicing
TCP connection #1
Host A
Sequence 200
Connection
&Direction
35
TCP connection #2
splicer
Sequence 50
Sequence
Number
Sequence 800
Connection
&Direction
Host B
Sequence 1200
Sequence
Number
Income #1
200
Outgoing #2 860
Income #2
1200
Outgoing #1 50
Change
Change
660
-1150
TCP Splicing Algorithm
36
Given: two TCP connections.
Produce: sequence translations for splicing the connection.
Compute D1, the difference between the starting sequences
on incoming connection 1 and outgoing connection 2;
Compute D2, the difference between the starting sequences
on incoming connection 2 and outgoing connection 1;
For each segment {
If segment arrived on connection 1 {
Add D1 to sequence number;
Subtract D2 from acknowledgement number;
} else if segment arrived on connection 2 {
Add D2 to sequence number;
Subtract D1 from acknowledgement number;
}
}
QUESTION?
37