Network Layer
Instructor: Rob Nash
Readings: Chapter 3, P&D
We have a limited number of hosts so far
Also, a limited geographical distance
◦ As broadcast can only take us so far
We can connect two distant nodes (or
networks) via point-to-point connections
◦ But we don’t service any nodes in between
We’d like to build a global network, so we
must consider hosts that aren’t directly
connected.
“Nature seems […] to reach many of her ends by
long circuitous routes.” – Rudolph Lotze
“Packets are able to reach many different ends by
(sometimes) long circuitous routes”
◦ But imagine this dilemma for a second:
◦ How are packets able to navigate an unknown topology?
Ethernet is simple: send to everybody, but again, that doesn’t scale
Your phone isn’t directly connected to all other
phone users
Rather, you’re connected to a switch
An operator will provide the “directly
connected” illusion by configuring a
(temporary) link for use in the call
In the same vein, computer networks have
packet switches
◦ For use in forwarding/switching packets
Routing is the process of building a forwarding table (4)
Very broadly defined here as either:
Connection-oriented: Like a telephone call, with
temporary state stored at each switch
◦ X.25
◦ ATM
Connectionless: Like the postal service, with even less
recourse for problems (no RTS, etc.)
◦ IP, UDP
Also, we’ll focus on two specific examples of
switching
◦ Ethernet & ATM
Forwarding is a table lookup
◦ Given the input port and ID, what is the output port
and outgoing ID?
Routing is the algorithm that builds the table
A distributed algorithm by nature of the domain
Should be fair
Consider offering a QoS
This has evolved over the history of networks
LAN Switching is an evolution of Ethernet
Bridging with performance augmentations
Switch Function:
◦ Connects two or more network segments
◦ Forwards packets from input port to output port
◦ Selects a port based on address in packet header
[Figure: a switch with input ports (T3, T3, STS-1) and output ports (T3, T3, STS-1)]
CSS432: Switching and Forwarding
Covers a large geographic area (> 2500m in
Ethernet)
Support large numbers of hosts (>1024
hosts in Ethernet)
Maintain performance (>two packets
through a switch)
◦ And for n input ports each with buffer b, we can
provide n x b queuing simultaneously
Contrast this to Ethernet, where two hosts will
compete for the line
Point-to-Point
Ethernet MAC
Rings
A switch adds the star topology to our set
◦ Also, the ability to interconnect any of the above
networking technologies
As switches may be connected to hosts, or other
switches
Switched networks are more scalable than shared-media networks
◦ Directly due to their ability to support many hosts at full
speed (limited to memory capacity)
And, we can use a switch to combine two disparate
networks
◦ A SONET STS-3 link and a few T3s
◦ Each port runs the appropriate link layer protocol
Switching (or forwarding): receiving incoming packets
on an input port and selecting the appropriate output
port on which to forward the data
How does a switch make its decision?
◦ This depends on the approach {connectionless, etc}
◦ In general, look at the header of the packet for an
identifier (could be a local id, could be an IP addr)
Use this to make your decision by looking up the ID in
a table, and forward accordingly
We’ll start simply with the datagram approach
We can provide unique identifiers to each
host on the network (e.g., an address)
We also will be interested in providing
identifiers to label each input and output port
in a switch
Each packet contains enough information to
enable any switch to forward it
How? Just including the complete destination
address in every packet.
Each switch will use the destination address as
the key in the lookup
No connection state (thus no setup)
All packets are forwarded independently
Node failure and reroute is possible
[Figure: datagram forwarding example with hosts A to H attached to Switch 1, Switch 2, and Switch 3, each switch with ports 0 to 3]
Table at Switch 2
Dest  Port
A     3
B     0
C     3
D     3
E     2
F     1
G     0
H     0
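The lookup at Switch 2 can be sketched as a dictionary keyed by destination address. This is a minimal illustration: the table values are the ones on the slide, while the `forward` function and the packet layout are hypothetical names chosen for the sketch.

```python
# Forwarding table for Switch 2, copied from the slide: destination -> output port.
SWITCH2_TABLE = {"A": 3, "B": 0, "C": 3, "D": 3, "E": 2, "F": 1, "G": 0, "H": 0}


def forward(table, packet):
    """Return the output port for a packet, or None if the destination is unknown.

    `packet` is a hypothetical dict; only its "dest" field is consulted, which is
    the point of the datagram model: the full destination address is the key.
    """
    return table.get(packet["dest"])


packet = {"dest": "E", "payload": b"hello"}
print(forward(SWITCH2_TABLE, packet))  # 2
```

No per-connection state appears anywhere: each packet is looked up independently, so replacing the table (after a reroute, say) changes where later packets go without any setup.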
In a simple and static environment, one
network operator may know the topology
◦ And, manually install this in switches in the network
In a distributed and dynamic environment, no
one operator knows the complete topology
◦ Multiple pathways, failing nodes, etc.
◦ This harder problem is routing (Section 4.2)
◦ For now: routing is an assumed background
process, and forwarding is a simple lookup
Hosts can send packets at any time (and to
anywhere)
◦ No setup or teardown
◦ All switches can immediately forward this packet,
assuming a correct routing table
Hosts don’t know (or care) about the health of the
intermediary network or destination node
◦ You could send a packet to a machine that just lost power
◦ Or, you could send a packet through a network whose
switches just lost power
Failures may not catastrophically affect
communications if alternate routes exist around
failed nodes (and the network updates its tables)
A connection-oriented approach
◦ With a setup, communicate, and teardown phase
◦ This may seem like TCP over IP, but we’ll see that TCP is
implemented on top of the connectionless approach
Setup: establishing connection state and path
through the network
◦ Each subsequent packet will follow this path
Forwarding tables use VCIs – Virtual Circuit
Identifiers – that help uniquely identify
connections at a local switch
Each switch maintains a VC table
The Input Port & VCI uniquely determine a connection
[Figure: a virtual circuit from Host A to Host B through Switch 1, Switch 2, and Switch 3; the links along the path are labeled VCI = 5, VCI = 11, VCI = 7, and VCI = 4]
Switch 1
Port (in)  VCI  Port (out)  VCI
2          5    1           11
Switch 2
Port (in)  VCI  Port (out)  VCI
3          11   0           7
Switch 3
Port (in)  VCI  Port (out)  VCI
0          7    3           4
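A small sketch of how a packet's VCI is rewritten hop by hop. The table entries are the values read from the slide's three VC tables; the `LINKS` wiring between switches and the function name are assumptions made only so the example can run.

```python
# VC tables as read from the slide:
# (input port, incoming VCI) -> (output port, outgoing VCI).
VC_TABLES = {
    "Switch 1": {(2, 5): (1, 11)},
    "Switch 2": {(3, 11): (0, 7)},
    "Switch 3": {(0, 7): (3, 4)},
}

# Hypothetical wiring for this sketch: the (switch, input port) a packet
# arrives on after leaving a given (switch, output port).
LINKS = {
    ("Switch 1", 1): ("Switch 2", 3),
    ("Switch 2", 0): ("Switch 3", 0),
}


def send_on_circuit(first_switch, in_port, vci):
    """Follow one packet along its circuit; return the hops and the final VCI."""
    hops = []
    switch = first_switch
    while True:
        out_port, out_vci = VC_TABLES[switch][(in_port, vci)]
        hops.append((switch, in_port, vci, out_port, out_vci))
        next_hop = LINKS.get((switch, out_port))
        if next_hop is None:  # no further switch: the packet leaves toward the host
            return hops, out_vci
        switch, in_port = next_hop
        vci = out_vci


hops, final_vci = send_on_circuit("Switch 1", in_port=2, vci=5)
print(final_vci)  # 4
```

The VCI only has meaning on one link, which is why the key is the pair (input port, VCI) rather than the VCI alone.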
PVCs – “permanent Virtual Circuits”, which are
long-lived (or network operator configured)
table entries
Signaled: a host may set up or delete a VC
dynamically and autonomously
Oracle: How do switches know which
outgoing VCI they should use?
◦ This data is literally downstream of the current
switch!
Answer: We fill this data in “in reverse”, after
we’ve built a path from A to B.
◦ Then, a setup/connection packet from B to A is sent
informing each upstream hop of the VCI it should
use
We signal to set up (reserve a VCI entry) and
signal to reclaim these resources when done
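The two passes can be sketched as follows. This is a simplified model, not a real signaling protocol: the function name, the path format, and the rule "pick the lowest unused VCI" are all assumptions of the sketch. On the forward pass each switch picks an incoming VCI; on the reverse pass each switch learns its outgoing VCI from the hop downstream of it.

```python
import itertools


def setup_circuit(path, destination_vci, used_vcis):
    """Build VC table entries for a path of (switch, in_port, out_port) hops.

    used_vcis maps (switch, in_port) to the set of VCIs already in use there.
    Returns ({switch: {(in_port, in_vci): (out_port, out_vci)}}, vci_for_sender).
    """
    # Forward pass (A -> B): each switch picks an unused incoming VCI.
    # The outgoing VCI is not known yet; it is chosen downstream.
    pending = []
    for switch, in_port, out_port in path:
        taken = used_vcis.setdefault((switch, in_port), set())
        in_vci = next(v for v in itertools.count() if v not in taken)
        taken.add(in_vci)
        pending.append((switch, in_port, in_vci, out_port))

    # Reverse pass (B -> A): each hop tells its upstream neighbor which VCI to use.
    tables = {}
    downstream_vci = destination_vci
    for switch, in_port, in_vci, out_port in reversed(pending):
        tables[switch] = {(in_port, in_vci): (out_port, downstream_vci)}
        downstream_vci = in_vci
    return tables, downstream_vci


# Example values chosen so the lowest free VCIs are 5, 11, and 7.
used = {
    ("Switch 1", 2): set(range(5)),
    ("Switch 2", 3): set(range(11)),
    ("Switch 3", 0): set(range(7)),
}
path = [("Switch 1", 2, 1), ("Switch 2", 3, 0), ("Switch 3", 0, 3)]
tables, vci_for_a = setup_circuit(path, destination_vci=4, used_vcis=used)
print(vci_for_a)  # 5
```

Teardown would be the mirror image: remove each table entry and return its VCI to the free set.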
At least one RTT delay before any payload is
communicated…
◦ Why?
Setup packets differ from payload packets
◦ Since setup contains the full GuID for the destination
◦ So, per-packet overhead is reduced relative to the
datagram approach
When we do get to send data, much network
topology is known in advance
◦ There is a receiver and route to that receiver, and the
receiver is ready to accept data
Resources are reserved in advance to avoid
contention
The sliding window protocol (SWP) is used between
each pair of nodes along the circuit
Flow control is used to prevent congestion,
and new circuits are declined if not enough
resources at a switch
Popular with telephony companies in the 80s
Physical medium : POTS links or ISDN
◦ ISDN integrates speech and data on the same line
◦ Pre-DSL
From Wiki:
“X.25 is today to a large extent replaced by
less complex protocols, especially the
Internet protocol (IP)”
We see the datagram approach is minimal
and doesn’t reserve resources in advance
◦ But, it also cannot make the same guarantee that
X.25 can
We can implement a QoS concept using the
connection model, as we set the service level
per connection
◦ QoS here: a performance or resource guarantee
My packets shouldn’t be delayed (queued) too long
My packets will always be accommodated at each
switch
Frame Relay is used for VPN construction
(4.1.8)
ATM is used to link telephony systems across
wide areas in a point-to-point configuration
Consider a pair of Ethernets you’d like to
connect
We could just place a “repeater” terminal that
collects all packets on one net and broadcasts
them to the other
◦ Shout louder!
◦ This forms an extended LAN
◦ The simplest version does no optimization
Note that a “bridge” here could be a host, but
it meets our definition of a switch.
Consider a shared-media example
Consider the star topology offered by
switching
◦ Note that each host has its own dedicated link
In the MAC example, link contention is an
issue
◦ In the switching example, I can send as much as the
switch can forward (or buffer) on my own link
Connecting two or more LANs
◦ Repeater
L1 – Physical Layer
Limitations: <= 2500m and <= 1024 nodes
◦ Bridge (or LAN switch)
L2 – Link Layer
No physical limitations
Forwarding frames using MAC address
Static configuration + partial dynamic configuration (Spanning Tree Protocol)
◦ Router
L3 – Network Layer
Routing IP packets using IP address
Dynamic configuration
[Figure: a bridge with hosts A, B, and C on Port 1 and hosts X, Y, and Z on Port 2]
Learning Bridges
Do not forward when unnecessary
◦ Ex. A frame sent from A to B
Maintain forwarding table
Based on datagram switching
[Figure: a bridge with hosts A, B, and C on Port 1 and hosts X, Y, and Z on Port 2]
Host  Port
A     1
B     1
C     1
X     2
Y     2
Z     2
Learn table entries based on source address
◦ Ex. An entry for A is registered upon receiving a frame from A
◦ Ex. When receiving a frame from B, don’t forward to Port 2
Table is an optimization; need not be complete
Entries are expired after a specific period of time
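The learning behavior can be sketched in a few lines. This is a minimal model under stated assumptions: the class name, the 300-second default timeout, and the injectable clock are choices made for the sketch, not values from the slides.

```python
import time


class LearningBridge:
    """Toy learning bridge: learns host locations from source addresses."""

    def __init__(self, ports, ttl=300.0, clock=time.monotonic):
        self.ports = set(ports)
        self.ttl = ttl          # assumed entry lifetime in seconds
        self.clock = clock      # injectable so the sketch can be tested
        self.table = {}         # host address -> (port, time learned)

    def receive(self, src, dst, in_port):
        """Learn from the source address, then return the ports to forward on."""
        now = self.clock()
        self.table[src] = (in_port, now)
        # Expire entries that have not been refreshed within the lifetime.
        self.table = {
            host: (port, seen)
            for host, (port, seen) in self.table.items()
            if now - seen <= self.ttl
        }
        entry = self.table.get(dst)
        if entry is None:
            return self.ports - {in_port}   # unknown destination: flood
        out_port = entry[0]
        if out_port == in_port:
            return set()                    # same segment: no need to forward
        return {out_port}


bridge = LearningBridge(ports=[1, 2])
print(bridge.receive("A", "B", in_port=1))  # {2}: B not learned yet, so flood
print(bridge.receive("B", "A", in_port=1))  # set(): A is known to be on port 1
```

Because an unknown destination is simply flooded, an incomplete table costs bandwidth but never correctness, which is why the table is only an optimization.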
How could a network come to have cycles in
it?
◦ Perhaps it’s a multi-site distributed net where no
one administrator knows the complete topology
◦ Introduced by accident?
◦ More likely: introduced for redundancy!
However, Learning Bridges can fail if a cycle
exists, so we need a strategy to address
graph cycles.
Algorithm deactivates ports to remove cycles
◦ The spanning tree determines which bridges to use, and
which bridges should “sit out”
Note that a bridge may forward on some ports, but not
others
Formalized in the IEEE 802.1 Specification
◦ Bridges adopt this distributed algorithm (as we’ll see)
Concept: remove edges from your graph until no
cycles exist (the tree is a subset of the graph)
◦ Oddity: vertices in this graph are both hosts and
switches
When the network has settled, certain bridges
will be designated to forward packets over
their IO ports based on their distance to the
root (or ID number if a tie)
Other bridges or ports will simply be disabled
Each bridge decides the ports over which it
will and will not forward frames
Elect the smallest ID as the root
◦ Roots always forward over all ports
Each bridge computes the distance between it
and the root
◦ Usually a per-hop count
Trades this information with its neighbors,
keeping track of “best” paths
◦ I.e., shortest hop count in this context
◦ Bridges that offer the best paths become designated
Finally all bridges determine the root feeder,
which is the only bridge that forwards to the root
◦ Chosen so it is closest to the root
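The outcome of these steps can be computed centrally for a small network, which is useful for checking intuition. This is a sketch of the result the bridges converge to, not of the distributed protocol itself; the three-bridge topology, the function name, and the use of integers as bridge IDs are assumptions for illustration.

```python
from collections import deque


def spanning_tree(attachments):
    """attachments: {bridge_id: set of LAN names the bridge is attached to}.

    Returns (root, hops, designated) where hops maps each bridge to its hop
    count from the root and designated maps each LAN to its designated bridge.
    """
    root = min(attachments)  # smallest ID becomes the root

    # Breadth-first search: two bridges are neighbors if they share a LAN.
    hops = {root: 0}
    queue = deque([root])
    while queue:
        bridge = queue.popleft()
        for other, lans in attachments.items():
            if other not in hops and lans & attachments[bridge]:
                hops[other] = hops[bridge] + 1
                queue.append(other)

    # On each LAN, the bridge closest to the root is designated; ID breaks ties.
    designated = {}
    for bridge, lans in attachments.items():
        if bridge not in hops:
            continue  # not reachable from the root
        for lan in lans:
            best = designated.get(lan)
            if best is None or (hops[bridge], bridge) < (hops[best], best):
                designated[lan] = bridge
    return root, hops, designated


# Hypothetical topology with one cycle: bridge 1 on X and Y, 2 on X and Z, 3 on Y and Z.
root, hops, designated = spanning_tree({1: {"X", "Y"}, 2: {"X", "Z"}, 3: {"Y", "Z"}})
print(root)             # 1
print(designated["Z"])  # 2
```

Bridges 2 and 3 are both one hop from the root, so LAN Z goes to bridge 2 on the ID tie-break. Bridge 3 stops forwarding onto Z, and the cycle is broken.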
STP Overview
Each bridge has unique id (e.g., B1, B2, B3)
Select a bridge with smallest id as root
Select a bridge on each LAN closest to root as designated
bridge (use id to break ties)
Each bridge forwards frames over each LAN if it is a
designated bridge
[Figure: extended LAN with LANs A to K and bridges B1 to B7. B1 is the root; bridges are annotated with their hop counts to the root, and ties are broken by id (B5 < B7, B4 < B6)]
STP Details (use p. 191)
Bridges exchange configuration messages (Y, d, X)
◦ Y: the id of the bridge believed to be the root
◦ d: #hops from X to Y
◦ X: the sending bridge id
Initially, each bridge believes it is the root
When a bridge learns it is not the root, it stops generating
configuration messages
◦ In steady state, only the root generates messages
When a bridge learns it is not a designated bridge, it stops
forwarding configuration messages
◦ In steady state, only designated bridges forward configuration
messages
If any bridge does not receive a configuration message after a period of
time, it starts generating configuration messages claiming to be the root.
[Figure: the same extended LAN with example configuration messages: (1, 0, 1) sent by B1, (1, 1, 2) sent by B2, and (1, 1, 5) sent by B5]
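How a bridge decides whether a received (Y, d, X) message beats what it already knows can be sketched with tuple comparison. Smaller root id wins, then smaller distance, then smaller sender id, which is exactly how Python orders tuples. The `Bridge` class is a hypothetical simplification: it keeps one best message overall rather than one per port, and it does not model the designated-bridge check.

```python
def better(msg, best):
    """True if (root, hops, sender) message `msg` beats the current `best`."""
    return msg < best


class Bridge:
    def __init__(self, bridge_id):
        self.id = bridge_id
        # Initially each bridge believes it is the root, zero hops away.
        self.best = (bridge_id, 0, bridge_id)

    def is_root(self):
        return self.best[0] == self.id

    def receive(self, msg):
        """Process a message; return the message to pass on, or None to drop it."""
        if not better(msg, self.best):
            return None
        self.best = msg
        root, hops, _sender = msg
        # Forward with one more hop and this bridge as the sender.
        return (root, hops + 1, self.id)


b3 = Bridge(3)
print(b3.receive((2, 0, 2)))  # (2, 1, 3): bridge 2 is a better root than 3
print(b3.receive((1, 1, 2)))  # (1, 2, 3): bridge 1 is better still
print(b3.is_root())           # False
```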
STP:
◦ It won’t forward frames over alternative paths for the sake of:
Routing around a congested bridge
Routing along a shorter path like one from a node on B to
another node on K
◦ Scales linearly, and uses broadcast mechanism
Bridges in general:
◦ Not scalable (“tens of”)
STP
Broadcast (forwarding all broadcast/multicast frames in the
current practice)
◦ Homogeneous networks only (uses network’s frame header)
Ethernet to Ethernet
Token ring to Token ring
ATM to ATM
Idea: Partition LANs using coloring/tiling to limit the number
of network segments that will broadcast
“It is never safe to design network software
under the assumption that it will run over a
single Ethernet segment.”
“Bridges happen.”
◦ Drop frames if congested (rare on Ethernet alone)
◦ Frames could be reordered in an extended LAN
Not in a singular Ethernet segment
Many ways to build economy & high-end switches
◦ More advanced fabrics are implemented in high-end (core)
switches
The high level concepts overlap, however
One idea: Get a box and a few NICs (DMA)
◦ Not a bad experimentation setup for new protocols
◦ Or cross-protocol examination
Not so hot for performance
Another idea: Custom Hardware
◦ A shared-memory switch
memory with dual ports
Crossbar switch
Switches that attempt to self-route (3.4-3.5, Batcher & Banyan)
[Figure: a workstation used as a switch: CPU, main memory, and an I/O controller on an I/O bus, with three NICs attached to LAN A, LAN B, and LAN C]
Advantage: flexible because a workstation has a CPU.
Example
◦ 33MHz 32bit I/O bus
1Gbps for one way from NIC to main memory
500Mbps for a round trip between NIC and main memory
Enough to support five 100Mbps Ethernets
◦ What if a packet is very small, like 64 bytes?
Suppose the workstation can handle 500,000 packets per second (pps).
Throughput: 500,000 x 64 x 8 = 256Mbps
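The arithmetic behind these figures, worked through as a short script. The bus and packet numbers come from the example above; the assumption that the bus moves one 32-bit word per clock cycle, and all variable names, are part of the sketch.

```python
# Figures taken from the example.
BUS_CLOCK_HZ = 33_000_000      # 33 MHz I/O bus
BUS_WIDTH_BITS = 32            # 32 bits wide
ETHERNET_BPS = 100_000_000     # one 100 Mbps Ethernet

# Assumes one bus transfer per clock cycle.
one_way_bps = BUS_CLOCK_HZ * BUS_WIDTH_BITS
# Each packet crosses the bus twice: NIC -> memory, then memory -> NIC.
round_trip_bps = one_way_bps // 2
ethernets_supported = round_trip_bps // ETHERNET_BPS


def throughput_bps(packets_per_second, packet_bytes):
    """Throughput when the packet rate, not the bus, is the limit."""
    return packets_per_second * packet_bytes * 8


print(one_way_bps)                   # 1056000000, roughly 1 Gbps
print(round_trip_bps)                # 528000000, roughly 500 Mbps
print(ethernets_supported)           # 5
print(throughput_bps(500_000, 64))   # 256000000, i.e. 256 Mbps
```

With small packets the per-packet processing cost dominates, so the switch delivers about half of what the bus alone would allow.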
[Figure: a switch built from a control processor, input ports, and output ports joined by a shared bus (DMA from port to port) or by shared memory]
A simple design
Shared bus or memory becomes a bottleneck. (Max. 16 bus masters)
Without a collision, all inputs are delivered to each output
All inputs may go to the same output, which causes a
collision in the output buffer.
Connection-oriented packet switching
◦ Uses signaling (Protocol Q.2931)
WAN, but more recently LANs
Runs on various physical media
◦ SONET
◦ Shared Media such as Wireless
◦ Shared-Media like Ethernet (with LANE)
Packets are called cells, which are fixed
length (48 + 5 Bytes)
LAN packets vs. ATM cells
◦ Consider also CISC vs. RISC
In this light, certain features of ATM shine
Observations for a short and simple approach:
◦ It’s easier to build HW to do simple (short) jobs
◦ The processing of data is simpler when fixed format
RISC ISA commonly has only a few instruction formats
Off topic: 802.5 & DEC-Intel-Xerox (DIX) Ethernet standard
Meaning: Compatibility can be simpler with a common format
◦ Simple and short data {frames, instructions} can often be
“trained” or “pipelined”
Observation: homogeneous packet length
lends itself to homogeneous switching structures
◦ Short and uniform structures can make the task of
exploiting parallelism easier
Either at the hardware level
See simultaneous multithreading
Or along protocol stack (simultaneous packet
processing, self-organizing streams, etc.)
◦ Uniformity at higher levels tends to promote
uniform hardware designs
Since this is not custom, often cheaper to build this
fast, scalable hardware
Fixed length instructions help to align, fetch,
prefetch, optimize, synchronize, reorder etc.
◦ See the original 360 and Robert Tomasulo
Variable length instructions are more complex by
design,
◦ possibly requiring multiple cycles to fetch a longer
instruction
And/or more trips across the bus to and from memory
All said and done, Ethernet LANs are just as
convincing due to their speed, cost, success &
adoption rate
Error detection is implemented at endpoints
◦ End-to-end but not at each switch (i.e., at data
link layer)
Congestion control
◦ Admission control
If switches are completely reserved, decline
connections
Fixed-size cells can make this easier
One Approach: use some SONET overhead to
point to the start of the cell in the payload
Another Approach: CRC every 5 bytes
◦ If you see no error, you’re likely at an ATM header
Repeat this approach looking for the same results
every 53 bytes
See p.199 for the frame format
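The second approach (check a 5-byte window, then confirm every 53 bytes) can be sketched as a search over byte offsets. The sketch assumes the ATM header check is a CRC-8 with generator x^8 + x^2 + x + 1 over the first four header bytes, XORed with 0x55; treat those constants, the function names, and the choice of three confirmations as assumptions to verify against the specification.

```python
CELL_SIZE = 53     # 5-byte header + 48-byte payload
HEADER_SIZE = 5


def crc8(data):
    """Bitwise CRC-8 with generator x^8 + x^2 + x + 1 (0x07), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def header_checksum(first_four_bytes):
    """Check byte for a header; the 0x55 offset is an assumption of this sketch."""
    return crc8(first_four_bytes) ^ 0x55


def looks_like_header(stream, offset):
    """True if the 5 bytes at `offset` pass the header check."""
    header = stream[offset:offset + HEADER_SIZE]
    return len(header) == HEADER_SIZE and header_checksum(header[:4]) == header[4]


def find_cell_start(stream, confirmations=3):
    """Return the first offset where `confirmations` headers, 53 bytes apart, all check out."""
    for start in range(min(CELL_SIZE, len(stream))):
        if all(looks_like_header(stream, start + k * CELL_SIZE)
               for k in range(confirmations)):
            return start
    return None


# Build three identical cells preceded by 7 bytes of unrelated data.
header = bytes([0x12, 0x34, 0x56, 0x78])
cell = header + bytes([header_checksum(header)]) + bytes(48)
stream = bytes(7) + cell * 3
print(find_cell_start(stream))  # 7
```

A single match can happen by chance inside payload data, which is why the search only accepts an offset after the check also passes at the next cell boundaries.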
Not exhaustive
ATM offers QoS features
ATM offers flow control, LANs are “best effort”
ATM is conservative resource-wise
◦ Connectionless protocols are minimalist
ATM can guarantee resources ahead of time
◦ Useful esp. for voice-grade guarantees
Fixed length vs. variable length packets
No broadcast (natively) vs. only broadcast
Layers were built on top of ATM to support
other styles of networks and services
◦ AAL 1-2 is for voice grade guaranteed bit rates
◦ AAL 3-4 is for packet data over ATM
This requires segmentation and reassembly (S&R), since MTU for Ethernet >> 53B
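A rough sketch of what segmentation and reassembly involves, counting only the 48-byte cell payload. Real adaptation layers add their own framing information, which this sketch omits; the function names and the explicit `original_length` argument are assumptions made to keep the example short.

```python
CELL_PAYLOAD = 48   # payload bytes per cell
CELL_HEADER = 5     # header bytes per cell


def segment(packet):
    """Split a packet into 48-byte cell payloads, zero-padding the last one."""
    return [
        packet[i:i + CELL_PAYLOAD].ljust(CELL_PAYLOAD, b"\x00")
        for i in range(0, len(packet), CELL_PAYLOAD)
    ]


def reassemble(payloads, original_length):
    """Join the payloads and strip the padding (length supplied by the caller here)."""
    return b"".join(payloads)[:original_length]


packet = (bytes(range(256)) * 6)[:1500]   # a 1500-byte example packet
cells = segment(packet)
print(len(cells))                                  # 32
print(len(cells) * (CELL_PAYLOAD + CELL_HEADER))   # 1696 bytes on the wire
print(reassemble(cells, len(packet)) == packet)    # True
```

Carrying 1500 bytes costs 1696 bytes of cells in this model, so the fixed cell size is paid for with header and padding overhead.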
Congestion: when packets are being discarded frequently
due to lack of resources
◦ arrivalRate > sendRate + bufferSpace for some t
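The inequality can be explored with a tiny discrete-time queue model. This is only an illustration: the function name, the per-tick units, and the order of operations within a tick (arrive, send, then drop the overflow) are assumptions of the sketch.

```python
def simulate(arrivals, send_rate, buffer_space):
    """Return the number of packets dropped at one output port.

    arrivals:     packets arriving in each time step
    send_rate:    packets the output link can send per time step
    buffer_space: packets that can wait in the queue between steps
    """
    queued = 0
    dropped = 0
    for arriving in arrivals:
        queued += arriving
        queued -= min(queued, send_rate)          # forward what the link allows
        overflow = max(0, queued - buffer_space)  # whatever does not fit is lost
        dropped += overflow
        queued -= overflow
    return dropped


print(simulate([10] * 5, send_rate=10, buffer_space=4))  # 0: arrivals match the link
print(simulate([20], send_rate=10, buffer_space=4))      # 6: 20 > 10 + 4
print(simulate([12] * 5, send_rate=10, buffer_space=4))  # 6: a sustained overload fills the buffer
```

The last case shows why the buffer only delays the problem: a small but sustained excess of arrivals over the send rate still ends in drops once the queue is full.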