datagram network
INFO 203
IT for Engineers
The Network Layer
Dr. Jennifer Booker
INFO 203 week 7
1
www.ischool.drexel.edu
The Network Layer
• So, the transport layer provides
process to process communication
• The network layer is expected to provide
host to host communication
• Cool.
• Um, how?
The Network Layer
• The Network Layer has to do two things:
– Forwarding is the process within a single
router to determine which outgoing link a
packet has to take
– Routing is the process (and algorithm) of
choosing the best path (route) between
source and destination
• Forwarding is like deciding which turn to make
at one intersection
• Routing is deciding which roads to take
The Network Layer
• Recall the network layer is expected to
– Receive segments from the transport layer
– Encapsulate them into datagrams (how much does data weigh?)
– And pass them through the network
• The job of most routers is to look at the network
header information, and determine which link to
pass the datagram
– The application and transport layer information are
invisible and irrelevant to routers
The Network Layer
• A router has a forwarding table which tells
which link to take, based on the header’s
destination address
• The forwarding table is written based on
output from a routing algorithm
– Routing algorithms may be centrally
controlled and then downloaded to each
router; or each router may follow their own
algorithm
The Network Layer
• A packet switch is a device that transfers a
packet from an input link to an output link
– Some are link-layer switches, which use the
link layer header info
– The rest we call routers, which use network
layer header info
• Another function in the network layer can
be connection setup
– Only for virtual circuit networks (ATM, X.25)
Network Service Model
• What services could we expect from a
network layer?
– Guaranteed delivery of all packets
– Delivery within a specified time (bounded delay)
– Delivery of packets in order
– Guaranteed minimal bandwidth
– Guaranteed maximum jitter (delay variation)
– Security services
• Would be nice, huh?
Network Service Model
• What do we get from the Internet?
– Best-effort service
• Meaning, none of the above!!
• Some VC networks, such as ATM, can
provide many of the ideal services
– VC setup and teardown involve the hosts and
all routers along the path, whereas TCP connection
setup only involves the hosts
Network Service Model
• Refining our earlier definition, the network
layer can provide connection-based or
connection-less service
– A network that provides only a connection-based
service at the network layer is a virtual
circuit (VC) network
– A network that provides only connectionless
service at the network layer is a
datagram network
Datagram Networks
• Datagram networks stamp each packet
with the address of the destination host,
and send it into the network
– There is no state information about
connections, because there aren’t any
connections within the network!
Datagram Networks
• Each router between hosts uses the
address to forward the packet using a
forwarding table
– If our addresses had 32 bits, there could be
4,294,967,296 entries in that table!
Datagram Networks
• Fortunately, we don’t need to look at ALL
of the address to determine its correct link
(a key observation!)
– Instead, match the address’ prefix with
forwarding table entries
– Use the longest prefix matching rule
• Match the longest prefix possible in the
forwarding table
• For this to be practical, large ranges of addresses
should go to each link, or the table will be huge!
Longest prefix matching rule
• The router just finds the longest matching prefix and
uses that entry in the forwarding table to
forward the packet
Prefix                                Link
11001000 00010111 00010               0
11001000 00010111 00011000            1
11001000 00010111 00011               2
Otherwise                             3
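As a rough illustration of the longest prefix matching rule, here is a minimal Python sketch built on the table above. The names FORWARDING_TABLE, to_bits and lookup are made up for this example, and the simple linear scan is only for readability, not how a real router does it.

```python
# Forwarding table from the slide: (prefix bits, outgoing link).
# Spaces are stripped so each prefix is one continuous bit string.
FORWARDING_TABLE = [
    ("11001000 00010111 00010".replace(" ", ""), 0),
    ("11001000 00010111 00011000".replace(" ", ""), 1),
    ("11001000 00010111 00011".replace(" ", ""), 2),
]
DEFAULT_LINK = 3  # the "Otherwise" row


def to_bits(dotted):
    """Convert a dotted-decimal IPv4 address into a 32-character bit string."""
    return "".join(f"{int(octet):08b}" for octet in dotted.split("."))


def lookup(dotted):
    """Return the link of the longest matching prefix, or the default link."""
    bits = to_bits(dotted)
    best_len, best_link = -1, DEFAULT_LINK
    for prefix, link in FORWARDING_TABLE:
        if bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_link = len(prefix), link
    return best_link


for address in ("200.23.17.5", "200.23.24.9", "200.23.25.1", "10.0.0.1"):
    print(address, "-> link", lookup(address))
```

The address 200.23.24.9 matches both the 21-bit and the 24-bit prefix; the longer match wins, so it goes out on link 1.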
Datagram Networks
• So even though there is no connection
data, routers in datagram networks need
to maintain the forwarding tables
– The routing algorithm typically updates them
every 1-5 minutes
– Hence it’s quite possible for the later packets
of a long session to follow a different path
than the earlier packets!
More History
• The VC network came about because of
its similarity to telephone networks
• But the Internet was connecting complex
computers, so the datagram network was
created because the computers could handle
more complex operations than the routers
(recall our IMP friends from Chapter 1)
– This also makes it easier to connect dissimilar
networks, and create many new applications
– “Hosts are smart, routers are stupid”
Router Innards
• Now look at forwarding in more detail
• A router has four kinds of parts
– Input ports
– Output ports
– Switch fabric between the inputs and outputs
– And a routing processor to control the switch
fabric, using the routing protocols
Router Innards
Router forwarding plane (HW)
Router control plane (SW)
Router
• Notice that the router forwarding plane is
done in hardware to speed processing
– For a 10 Gbps connection and 64-byte
datagrams, the input port only has 51.2 ns to
process each packet!
• In contrast, router control plane functions
(processing) are done at the ms time scale
or slower, so they can be executed on a
traditional CPU
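The 51.2 ns figure is just the time it takes one 64-byte datagram to arrive at line rate. A quick check in Python (the function name per_packet_budget_ns is invented for this sketch):

```python
def per_packet_budget_ns(link_bps, datagram_bytes):
    """Time to receive one datagram at line rate, in nanoseconds."""
    bits = datagram_bytes * 8
    return bits / link_bps * 1e9


# 64-byte datagrams on a 10 Gbps link, as on the slide
budget = per_packet_budget_ns(10e9, 64)
print(f"{budget:.1f} ns per packet")
```

That is 512 bits divided by 10^10 bits per second, so the lookup has to finish in roughly that window to keep up.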
Router Innards
• The input and output ports include
– The physical connection to the network, and
– Take the signal through the data link layer
• The input ports also look up the
destination address, decide how to
forward the packet, and pass control
packets on to the routing processor
– The three boxes represent the physical layer,
data link layer, and lookup/forward module
Input Ports
• The routing processor determines the
forwarding table contents, and shadow
copies it to each input port
– This avoids a processing bottleneck
• Looking up where to forward packets is
simple in concept – the challenge is
maintaining line speed
– Want to process each packet in less time
than it takes to receive the next one
Switching Fabric
• The input ports determine the output port
needed; switching fabric makes it happen
• Switches handle staggering data rates
(e.g. 60 Tbps for the Cisco Nexus 9516),
so their technology is constantly being
pushed
• Many approaches for switching fabric have
been used
Output Ports
• The output ports take packets from the
output port memory (queue) and transmit
them over the outgoing link
• Hence the three functions of output ports
are
– Queuing
– Data link processing
– Physical line termination
Queuing
• We’ve discussed buffers in connection
with output ports, but they also exist
with input ports
• Packet loss can occur at input or output
queues, depending on
– Input traffic load
– Switching fabric speed
– Line speed
Switching Fabric Speed
• For a router with n input and n output ports
• If the switching fabric has a speed n times as
fast as the input line speed, no queuing can
occur at the inputs
– But the output ports can easily become overloaded
if many inputs all feed the same output port
• A packet scheduler at the output port decides
which packet is next for transmission
Packet Scheduler
• The packet scheduler needs rules
– Could use first come, first served
(FCFS) approach
– Could use weighted fair queuing (WFQ)
• The packet scheduler affects the quality
of service of the connection
• Dropping and marking strategies are
Active Queue Management (AQM)
algorithms
Incoming Buffer
• If the switch fabric is too slow, packets
have to wait in the input queue before
moving to an output queue
– Head-of-the-line (HOL) blocking is when a
queued packet must wait for the packet ahead of it
to cross the fabric, even though its own output port is free
The Internet Protocol (IP)
• Now see how all this applies to the Internet
– We’ll cover both the existing IPv4 and IPv6 (versions
4 and 6)
• The network layer has three major parts
– Internet Protocol, which handles addressing
– Routing protocols (e.g. RIP, OSPF, BGP), which
choose the best path for packets
– Internet Control Message Protocol (ICMP),
which handles error reporting and signaling
Datagram Format
• A segment in the transport layer becomes
one or more datagrams in the network
layer
– First discuss IPv4, then show how IPv6
is different
Datagram Format
• The IPv4 datagram header has at least
five 4-byte (32-bit) rows of fields, like TCP
– Version number, header length, type of service, and
datagram length in bytes
– Identifier, some flags, and fragmentation offset
– Time-to-live, upper layer protocol, and
header checksum
– Source IP address (32 bits)
– Destination IP address (32 bits)
– Then options, followed by the segment data
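To make the layout concrete, here is a small Python sketch that unpacks the fixed 20-byte part of an IPv4 header with the standard struct module. The function name parse_ipv4_header and the sample header values are assumptions made up for this example; options are ignored.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte part of an IPv4 header (options ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                  # top 4 bits
        "header_bytes": (ver_ihl & 0x0F) * 4,     # length field counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,   # low 13 bits
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical header: version 4, 5 words long, TTL 64, protocol 6 (TCP)
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"),
                     socket.inet_aton("223.1.2.2"))
fields = parse_ipv4_header(sample)
print(fields)
```

Each line of the struct format string corresponds to one 32-bit row of the header described above.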
Datagram Format
• Version number is 4 bits for the IP version
• Header length is 4 bits for the number of 32-bit words in
the IP header (usually 5 words = 20 B)
• Type of service (TOS) is 8 bits which allow one
to specify different levels of service
(real time or not)
• Datagram length in bytes is the total of the
header plus the actual data segment
– Is a 16 bit field, but typical length is under 1500 B
Datagram Format
• The Identifier, flags, and fragmentation
offset all relate to IP fragmentation
(breaking a segment into multiple
datagrams)
• Time-to-live (TTL) is a countdown integer,
to prevent packets from wandering in the
network for 40 years
– It is decremented by one at each router, and the
datagram is dropped when it gets to zero
Datagram Format
• Protocol is the transport layer protocol
– Only used once the datagram gets to the destination host
– E.g. 6=TCP, 17=UDP; see RFC 3232 for others
• Header checksum – hey, didn’t we have a
transport checksum?
– Yes, but this only covers the IP header, not the
segment data
– And TCP might be run over other network protocols,
e.g. our VC buddy, ATM
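The IP header checksum is the usual Internet checksum: a one's-complement sum of the header's 16-bit words (see RFC 1071). A minimal Python sketch follows; the function name internet_checksum and the sample header bytes are illustrative choices, not something from the course material.

```python
import struct


def internet_checksum(data):
    """One's-complement sum of 16-bit words, then inverted."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example 20-byte header with the checksum field (bytes 10-11) set to zero
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)

# Write the checksum into the header; a receiver recomputing it gets zero
filled = header[:10] + struct.pack("!H", checksum) + header[12:]
print(hex(checksum), internet_checksum(filled))
```

Only the header is summed here, which is why the transport layer still needs its own checksum over the segment data.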
Datagram Format
• Source and destination IP addresses
we’ll discuss in more detail soon
• Option fields allow for rarely used
functions, but slow IP processing
– Hence IPv6 drops them from the basic header and uses optional extension headers instead
• The Data in the datagram can be the
TCP or UDP segment, or contain other
message formats such as ICMP
Fragmentation
• A frame can hold up to the Maximum
Transmission Unit (MTU) bytes of data
– But not all link-layer protocols can handle the
same size packets
• Ethernet handles up to 1500 B frames
• Some WAN protocols only handle 500 B frames
• Fragmentation breaks up a datagram that is too big for the next link into
multiple fragments, which are reassembled at
the destination host
– Fragmentation by routers was removed in IPv6
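Here is a sketch of how fragment sizes and offsets could be worked out. It assumes a 20-byte header with no options, and relies on the fact that IPv4 fragment offsets are counted in 8-byte units. The function name fragment and the example sizes (3980 bytes of data, 1500 B MTU) are assumptions for illustration.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return (offset in 8-byte units, payload bytes, more-fragments flag) tuples."""
    # Every fragment except the last must carry a multiple of 8 bytes
    max_payload = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_payload, payload_len - offset)
        more = offset + size < payload_len
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


pieces = fragment(3980, 1500)
for offset_units, size, more in pieces:
    print(f"offset={offset_units} bytes={size} more_fragments={more}")
```

With these numbers the data is split into three fragments; the identifier field (the same in all three) lets the destination host match them up again.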
IPv4 Addressing
• Recall that hosts have to have interfaces to the
network, over which to send datagrams
• Routers need many interfaces, since they are
connected to multiple links
• Therefore every IP address is associated with an
interface, not a host or router
– IPv4 addresses are 32 bits (4 bytes), written in dotted
decimal notation (byte.byte.byte.byte)
IPv4 Addressing
• Every interface visible to the Internet must have a
unique IP address
– Local networks can hide many systems behind one IP
using network address translation (NAT)
• IP addresses are given out as hierarchically as
possible, so many local addresses have the
same prefix or subnet (leftmost bits in the IP
address)
– Subnet = IP network = network in much literature
(terms vary)
IPv4 Addressing
• How many bits of the address are used to
define the subnet is given as a suffix after
a slash, e.g. 213.1.3.0/24 means the first
24 bits of the address identify the subnet (the /24 is the subnet mask)
– Often the links of a router each point to a
different subnet, e.g. in Fig 4.15
– Subnets also can be defined for the interfaces
between routers
– A subnet is essentially an isolated part of a
larger network
Fig 4.15 – Subnet example
[Figure: one router connecting three subnets]
Subnet 223.1.1.0/24: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4
Subnet 223.1.2.0/24: 223.1.2.1, 223.1.2.2, 223.1.2.9
Subnet 223.1.3.0/24: 223.1.3.1, 223.1.3.2, 223.1.3.27
Pre-CIDR
• Internet domains originally had prefixes of
– Class A=8, Class B=16, or Class C=24 bits
• Led to lots of wasted address space!
– Class A 16,777,216 hosts per domain
– Class B 64k hosts
– Class C 256 hosts
CIDR
• Now we use Classless Interdomain
Routing (CIDR, RFC 4632) to avoid that
limitation
– Any subnet of the form a.b.c.d/x can be used
– The leading x bits are called the prefix or network prefix
– Outside of the network (subnet), only the
prefix is used for routing
• The rest of the address defines hosts within
the network
CIDR
• So if a prefix is of the form a.b.c.d/21,
– 21 bits of the address are the prefix
– The remaining 32-21= 11 bits are unique
to each device within that subnet
– Giving you room for 2^11 = 2048 hosts
• The a.b.c.d part of the CIDR address can
be anything that fits within the prefix length
in binary
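The /21 arithmetic can be checked with Python's standard ipaddress module. The block 200.23.16.0/21 is just an example prefix picked for this sketch.

```python
import ipaddress

# Example /21 block (an assumed prefix, chosen for illustration)
net = ipaddress.ip_network("200.23.16.0/21")

host_bits = 32 - net.prefixlen
print("prefix bits:", net.prefixlen)
print("host bits:", host_bits)
print("addresses in block:", net.num_addresses)

# Membership test: does an address share the 21-bit prefix?
inside = ipaddress.ip_address("200.23.23.255") in net
outside = ipaddress.ip_address("200.23.24.0") in net
print(inside, outside)
```

The count of 2^11 covers every bit pattern in the host part, including the all-zeros and all-ones patterns that are not normally assigned to hosts.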
Broadcast Address
• The IP broadcast address is a special IP
address 255.255.255.255 (or all ones,
11111111.11111111.11111111.11111111)
• When the destination address is that
value, the message goes to all hosts
within the subnet
– Routers usually won’t forward these
messages; but might
Obtaining IP Addresses
• Typically an ISP gets a block of IP addresses,
and assigns them to customers
– E.g. the ISP might get 200.23.16.0/20,
which it breaks down into smaller subnets
for each customer – 200.23.16.0/23 for one,
200.23.18.0/23 for another, etc.
– That way, routing knows anything starting with
200.23.16.0/20 goes to that ISP, and the ISP
routes it more specifically to each customer,
who then routes it to each specific host
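The ISP example can be reproduced with the standard ipaddress module: carving the /20 block into /23 customer blocks. The variable names are made up for this sketch.

```python
import ipaddress

# The ISP's block from the example above
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Split it into /23 blocks, one per customer
customer_blocks = list(isp_block.subnets(new_prefix=23))
for number, block in enumerate(customer_blocks):
    print(f"customer {number}: {block} ({block.num_addresses} addresses)")

# Every customer block still starts with the ISP's 20-bit prefix
all_inside = all(block.subnet_of(isp_block) for block in customer_blocks)
print(all_inside)
```

Going from /20 to /23 frees up 3 more prefix bits, so the ISP gets 2^3 = 8 customer blocks of 512 addresses each.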
Obtaining IP Addresses
[Figure: route aggregation]
Fly-By-Night-ISP tells the Internet: “Send me anything
with addresses beginning 200.23.16.0/20”
– Organization 0: 200.23.16.0/23
– Organization 1: 200.23.18.0/23
– Organization 2: 200.23.20.0/23
– ...
– Organization 7: 200.23.30.0/23
ISPs-R-Us tells the Internet: “Send me anything
with addresses beginning 199.31.0.0/16”
The use of a prefix for multiple subnets is called
address or route aggregation, or route summarization
Managing IP Addresses
• While ideally it would be nice to have a
unique subnet for everything, in reality it
gets messier – many ISPs might have
several subnet ranges assigned to them
• ICANN manages IP addresses, based
on RFC 2050, as well as managing
domain names
Getting a Host IP Address
• An organization assigns host addresses within
its subnet
– Routers have IP addresses manually assigned
• Hosts can be manually assigned, but usually use
Dynamic Host Configuration Protocol (DHCP)
– DHCP sets the host IP address, the subnet mask,
defines the first-hop router (default gateway), and
local DNS server
– DHCP is often known as a plug-and-play protocol,
because it makes network admin much easier!
DHCP
• For example, an ISP can use DHCP to
assign IP addresses to dialup customers
– Need fewer IP addresses than you have
customers, since all won’t be online at once
– Need to manage which IP addresses are in
use, and which are available to be assigned
• DHCP is also handy for mobile clients,
such as connecting to Dragonfly
DHCP
• Dynamic Host Configuration Protocol
(DHCP) makes our lives much easier
• DHCP is client/server based
– There must be at least one DHCP server to
tell everyone else what their IP addresses are
• A router can act as a DHCP relay agent,
so that multiple subnets can share one
DHCP server
DHCP
• A new host on a subnet follows a four-step
process to get an address
– DHCP server discovery (what servers exist?)
– DHCP server offer(s) (lease me an IP!)
– DHCP request (I accept your offer)
– DHCP ACK (server acknowledges)
• Once the client is connected with its assigned IP, the
lease can be renewed
• One minor drawback is that an IP address can’t be kept
between subnets, bad for mobile clients
Network Address Translation
• Network Address Translation (NAT) allows
local networks to define IP addresses that
are invisible to the outside world
– The NAT router looks like a device with one IP
address to the outside world, but usually uses
DHCP to assign IP addresses from private
networks to local devices
• It doesn’t have to use private networks, you could
use publicly visible IP addresses
Private networks
• NAT typically uses prefixes reserved for
private networks, per RFC 1918:
– “The Internet Assigned Numbers Authority
(IANA) has reserved the following three
blocks of the IP address space for private
internets:
• 10.0.0.0/8
• 172.16.0.0/12
• 192.168.0.0/16”
Network Address Translation
[Figure: NAT router between a local network and the rest of the Internet]
local network (e.g., home network) 10.0.0.0/24, with local
addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; NAT IP address 138.76.29.7
• All datagrams leaving local
network have same single source
NAT IP address: 138.76.29.7,
different source port numbers
• Datagrams with source or
destination in this network
have 10.0.0.0/24 address for
source, destination (as usual)
Network Address Translation
• The NAT router keeps a translation table
– WAN side: the NAT IP address and a new source port number
– LAN side: the local host IP AND port number
• Hence NAT has to change the addressing
of every datagram in & out of the network!
– Some purists object to this
• Need workarounds for P2P applications;
Universal Plug and Play (UPnP) does that
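A toy model of the translation table might look like the following. This is only a sketch: the class name NatRouter, its methods, and the starting port 5001 are all invented for illustration, and real NAT devices also track protocol, timeouts, and more.

```python
import itertools


class NatRouter:
    """Toy NAT: maps (local IP, local port) to a port on one public IP."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}   # (local_ip, local_port) -> public port
        self._in = {}    # public port -> (local_ip, local_port)

    def outbound(self, src_ip, src_port):
        """Rewrite the source of a datagram leaving the local network."""
        key = (src_ip, src_port)
        if key not in self._out:
            port = next(self._next_port)
            self._out[key] = port
            self._in[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, dst_port):
        """Find the local host for a reply arriving on a public port."""
        return self._in.get(dst_port)


nat = NatRouter("138.76.29.7")
print(nat.outbound("10.0.0.1", 3345))
print(nat.outbound("10.0.0.2", 3345))
print(nat.inbound(5001))
```

Two local hosts can use the same local port because the NAT router hands each flow its own public port number.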
ICMP
• ICMP is an old (1981) protocol (RFC 792)
to communicate error messages across
the network layer
– E.g. “Destination network unreachable”
– ICMP is a nudge above IP, since ICMP messages are carried
inside IP datagrams, just like a TCP or UDP
segment
• ICMP messages have a type and code
field (p. 354), plus the header and first 8 bytes of the
offending IP datagram
ICMP & Ping
• ICMP messages also convey other kinds of
information, such as congestion control,
bad IP header data, TTL expired, etc.
• Ping uses an ICMP message type 8, code
0, which is an “echo request”
– The reply should be type 0, code 0, “echo
reply”
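For the curious, an echo request is small enough to build by hand. The sketch below only constructs the bytes (type 8, code 0, checksum, identifier, sequence number, as laid out in RFC 792); actually sending it needs a raw socket and usually administrator rights, so that part is left out. The function names and the payload are assumptions for this example.

```python
import struct


def internet_checksum(data):
    """One's-complement sum of 16-bit words, then inverted."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload=b""):
    """Build an ICMP echo request: type 8, code 0."""
    # First pass with a zero checksum, then fill in the real value
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

An echo reply has the same layout with the type changed to 0, and normally echoes the payload back.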
Traceroute
• Traceroute sends UDP segments with bad
port numbers and successive TTL (1, then
2, then 3, etc.) and times each datagram
– When a datagram’s TTL expires at a router, that router
sends back an ICMP warning message (type 11, code 0,
“TTL expired”), which gives traceroute the round trip time (RTT) and
the router’s information
Traceroute
– When a datagram gets to the other host, the
UDP segment has a weird port number, which
prompts an ICMP message of type 3, code 3,
“destination port unreachable”
– That tells traceroute the other host has been
reached, so no more datagrams are needed
– Sneaky!
ICMP and Firewalls
• Firewalls typically inspect the headers of
packets to look for threatening contents
– Pings coming from outside your network can
map IP addresses, for example
– Port scans can look for open ports
• An Intrusion Detection System (IDS) goes
further by looking at packet contents
(data), and comparing them to known
attacks
IPv6
• The IETF realized that the Internet would run out
of IP address space, and CIDR, NAT, and
DHCP aren’t enough to save it
– By 1996, 100% of Class A addresses were used,
62% of Class B addresses, and 37% of Class C
• IPv6 was first called IPng (next generation)
– IPv6 is defined by RFC 2460
• What’s different from IPv4?
IPv6 Datagram
• The IP addresses went from 32 to 128 bits
– 2^128 ≈ 340,282,366,920,938,463,463,374,607,431,770,000,000
– Really, we won’t run out of IP addresses.
Ever.
• In contrast, the number of cells in 7 billion people
is about 7E9*1E12= 7E21, a factor of 49 million
billion under the 3.4E38 possible addresses
IPv6 Datagram
• Adds an anycast address type, which can
go to any in a group of hosts
• Header is fixed 40-bytes (2x4 B + 2x16 B)
• Adds flow labeling and priority, where a
flow is a group of packets requiring special
handling (real time service, or paid priority
enhancement)
IPv6 Datagram
• IPv6 addresses can be written in a 16-value dotted
decimal notation, e.g.
128.91.45.157.220.40.0.0.0.0.252.87.212.200.31.255, but are normally written as eight
colon-separated hex groups: 805B:2D9D:DC28:0000:0000:FC57:D4C8:1FFF
– There are lots of rules for abbreviating IPv6
addresses; most common is ‘::’ which hides a bunch
of zeroes
• Removes from IPv4
– Fragmentation, Header checksum, and Options
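Python's standard ipaddress module can convert the 16 decimal values from the example into the usual colon-separated hex form, and apply the '::' abbreviation. The variable names are just for this sketch.

```python
import ipaddress

# The 16 byte values from the example address above
octets = [128, 91, 45, 157, 220, 40, 0, 0, 0, 0, 252, 87, 212, 200, 31, 255]
addr = ipaddress.IPv6Address(bytes(octets))

print(addr.exploded)     # all eight groups written out in full
print(addr.compressed)   # the run of zero groups collapsed to '::'
```

Note that '::' may appear only once in an address; otherwise there would be no way to tell how many zero groups each one stands for.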
IPv6 Datagram
• Specifically, IPv6 headers have the following
fields:
– IP version, now obviously a ‘6’
– Traffic class, similar to the TOS field
– Flow label, an identifier for a given flow
– Payload length = number of bytes in the data
• Does not count the header, since that’s a fixed 40 B
– Next header is the protocol field from IPv4
– Hop limit acts like the time-to-live (TTL) field
– Source and destination addresses are 128 bits each
– Then the data
ICMPv6
• ICMP has been updated for new
messages under IPv6 in RFC 4443
• It also takes over the Internet Group
Management Protocol (IGMP) which we’ll
get to later – it involves joining and leaving
multicast groups
IPv6 Adoption
• The adoption of IPv6 has been slow, partly
because of CIDR, NAT, and DHCP
• However large scale technology changes
typically take a long time
– How many phone lines are optical yet?
– Network protocols are very slow to change,
whereas apps are easy to change
• IPv6 will probably be around a long time!
IP Security
• IPv4 was designed in the 1970’s, long
before anyone expected the Internet to be
a public medium – and hence it has no
security in it
• IPsec was created to work with IPv4 or
IPv6 and add security to the network layer
• It allows TCP and UDP traffic to take place
in a secure environment
IP Security
• IPsec
– Allows hosts to negotiate encryption
protocols
– Use that protocol to encrypt each datagram
– Verify that the header and data retain their
integrity
– Authenticate that each datagram comes from a trusted source
• This is covered more in chapter 8
Routing Algorithms
• Mostly have focused on forwarding – now
address routing
• Both datagram and VC networks need to
perform routing, i.e. find good paths
between sender and receiver
– A host is typically attached to its default router
(first hop), which we’ll call the source router;
similarly the destination has a destination
router
Routing Algorithms
• A “good” route typically minimizes cost,
but may also avoid other concerns (e.g.
ownership of networks, privacy of data,
etc.)
• Use a graph to show routing problems,
with N nodes (routers) and E edges (links)
– Assume the cost of each edge is a given:
c(x,y) = cost of edge between nodes x and y;
(x,y) is the edge between those nodes
Routing Algorithms
• The cost of an edge not available is infinite
• A path is defined by a sequence of nodes
(x1, x2, x3, …, xn)
– The cost of a path is the sum of the edge
costs along it; c(x1,x2)+c(x2,x3)+…+c(xn-1,xn)
• Some path between nodes x and y is the
least-cost path
– If all edges have the same cost, the shortest
path is also the least-cost path
Routing Algorithms
• Two key ways to classify routing are:
– A global routing algorithm uses knowledge of
the entire network to calculate the best path
• Also called link-state (LS) algorithms
– A decentralized routing algorithm finds the
least cost path in an iterative decentralized
manner – no node has complete knowledge
of the network
• Only the local costs are known
• The distance-vector (DV) algorithm is one example
Link-State Routing Algorithm
• The Link-State (LS) algorithm uses complete
knowledge of network topology and link costs
• The identity and cost of links for each router are
broadcast using a link-state broadcast, such as
the Internet’s OSPF protocol
• The actual routing is calculated using Dijkstra’s
algorithm (named for Edsger Dijkstra)
– Dijkstra’s algorithm is iterative
Dijkstra’s Algorithm
• It finds the lowest cost path from the
source to all other nodes
• Complexity of this algorithm is the need to
search n(n+1)/2 nodes, which is O(n^2) (the
order of n squared)
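Here is one way Dijkstra's algorithm could be sketched in Python. Note two assumptions: the six-node graph and its link costs are made up for illustration, and this version uses a priority queue (heapq) rather than the simple scan that gives the O(n^2) figure above.

```python
import heapq


def dijkstra(graph, source):
    """Return (least cost to each node, previous hop on that path)."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue                     # stale queue entry, already finalized
        done.add(node)
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev


# Hypothetical undirected network: (node, node, link cost)
EDGES = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]
graph = {}
for a, b, cost in EDGES:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost

dist, prev = dijkstra(graph, "u")
print(dist)
print(prev)
```

Following prev backward from any destination gives the least-cost path; the first hop on that path is what would go into the source router's forwarding table.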
Oscillations
• If the cost of a path depends on the
direction through that path, algorithms can
undergo oscillations where the best path
changes from clockwise to counterclockwise with each iteration
• To avoid this, don’t run the algorithm on
all nodes at the same time
– Or don’t use load-based link costs
Distance-Vector (DV) Routing
• The Distance-Vector Routing Algorithm is
iterative, asynchronous, and distributed
– Nodes get data from directly attached
neighbors, and distribute the results only to
their neighbors
• The DV algorithm follows the Bellman-Ford equation
– Each node asks its neighbors how much it
costs to get to the rest of the network
Distance-Vector Routing
– Then those nodes ask their neighbors, and so
on throughout the network
• As each node gets cost data from its
neighbors, the cost to get anywhere in the
network approaches the ideal (minimum)
value
• This is the ‘gossip’ approach to routing
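The 'gossip' idea can be sketched with the Bellman-Ford update, D_x(y) = min over neighbors v of c(x,v) + D_v(y). This simplified Python version runs in synchronized rounds, whereas the real algorithm is asynchronous; the three-node network and its costs are assumptions for illustration.

```python
INF = float("inf")


def distance_vector(costs):
    """Repeat Bellman-Ford updates until no node's distance vector changes."""
    nodes = sorted(costs)
    # Each node starts out knowing only the cost to its direct neighbors
    dist = {x: {y: (0 if x == y else costs[x].get(y, INF)) for y in nodes}
            for x in nodes}
    changed = True
    while changed:
        changed = False
        # Vectors as advertised by every node at the start of this round
        advertised = {x: dict(dist[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(costs[x][v] + advertised[v][y] for v in costs[x])
                if best < dist[x][y]:
                    dist[x][y] = best
                    changed = True
    return dist


# Hypothetical network: x-y costs 2, y-z costs 1, x-z costs 7
costs = {"x": {"y": 2, "z": 7},
         "y": {"x": 2, "z": 1},
         "z": {"x": 7, "y": 1}}
table = distance_vector(costs)
print(table)
```

Node x never sees the y-z link directly; it learns that going through y is cheaper than its own direct link to z only from y's advertised vector.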
Distance-Vector Routing
• This depends on asynchronous data
exchange among nodes
– And after all nodes have exchanged
information, the routing won’t change
(becomes quiescent) until there’s a change in
link cost or a dead link
• Many protocols use some variation on
this approach, including ARPAnet, the
Internet’s RIP and BGP protocols,
Novell IPX, ISO IDRP, etc.
DV Changes
• If the cost of a link decreases, updates to
its neighbors will generally occur
peacefully
• If a cost goes up, leftover incorrect
information can cause a routing loop
(bounce back and forth between nodes)
– Large cost increases can result in thousands
of bounces before the problem corrects itself,
hence known as the count-to-infinity problem
DV Changes
• Fix somewhat with the poisoned reverse
– Pretend the cost to go backward on a link is
infinite, so it won’t try to bounce back
– But if the loop involves more than two nodes,
this doesn’t help
• Other routing approaches have been used
– Network flow problems
– Circuit-switched routing algorithms
Hierarchical Routing
• LS and DV assume the network is a herd
of connected routers – all peers or equals
– Scaling for LS routing is daunting for huge
number of routers
– Most administrators want autonomy to decide
their structure
• What happens if there’s structure to
routers?
– Organize routers into autonomous systems
(AS)
Autonomous Systems (AS)
• Under AS, groups of routers
– Are under control of one administration authority
– Use one routing protocol (LS or DV) within that group,
their intra-autonomous system routing protocol
– Connect to other groups via gateway routers
• Routing information separates routing within the
AS from routing outside the AS
– Need to know which outside addresses are best
reached from which gateway routers
Autonomous Systems (AS)
[Figure: three interconnected autonomous systems]
AS1: routers 1a, 1b, 1c, 1d
AS2: routers 2a, 2b, 2c
AS3: routers 3a, 3b, 3c
Example of three AS’ and their interconnections.
1b, 1c, 2a, and 3a are all gateway routers.
Autonomous Systems (AS)
• In order for the AS’ to talk to each other,
they need to use the same inter-AS
routing protocol; called BGP4 for the
Internet
– BGP4 defines which subnets are reachable
from various gateway routers (assuming more
than one exists)
• One common strategy is hot-potato
routing, where you send a packet to the
gateway router that is cheapest to reach
Autonomous Systems (AS)
• AS’ communicate to each other about new
destinations nearby
• Large ISPs may set up dozens of AS’ just
for themselves; smaller ISPs might be
one AS
• Now look at two intra-AS routing protocols
(RIP and OSPF) and the inter-AS routing
protocol BGP
RIP
• The Routing Information Protocol (RIP) is
an older intra-AS routing protocol
– Based on work by Xerox and part of the BSD
Unix distribution in 1982
– RIP version 2 is defined by RFC 2453
• Works based on the DV model
– Cost is based on hop count; each link has cost=1
– Hop is the number of subnets crossed to get from
source to destination
RIP
• Max cost allowed in RIP is 15 hops
• Routing updates are ~ every 30 sec using RIP
response messages or advertisements
• Each RIP router maintains a routing table
– The routing table contains the destination subnet, the
next router to get there, and the number of hops to
that destination
– Exchanging routing tables allows routers to find the
cheapest routes
RIP
• If a neighboring router doesn’t provide an
update for three minutes, it’s assumed to
be dead (rest in peace?), and the routing
table is adjusted accordingly
• RIP messages go over UDP using port
520
• In Unix, the daemon ‘routed’ (route dee)
implements RIP
OSPF (think sunscreen?)
• OSPF* and its cousin, IS-IS are widely
used for intra-AS routing
– OSPF version 2 is defined by RFC 2328
– IS-IS is defined by RFC 1195
• OSPF uses LS routing, and creates a
complete topological map of the entire AS
• Then it follows Dijkstra’s algorithm to find
the shortest paths everywhere in the AS
* OSPF = Open Shortest Path First, IS = Intermediate System
OSPF
• Link cost can be 1 (just count hops) or
weighted inversely to the link’s capacity
(to put more traffic where it can be
handled well)
OSPF
• All routers in the AS broadcast state
information to all other routers
– 1) when there’s a change in link cost or
status, or
– 2) every 30 minutes to say they’re alive
• OSPF messages are carried straight
over IP
OSPF
• OSPF advantages include
– Security – exchanges between OSPF routers
can be authenticated, either by simple
password or by an MD5 keyed hash (MD5 is a
hash function, not encryption)
– Can use multiple paths that have the same cost
– Also handles multicast (MOSPF)
– Allows creation of hierarchy within the AS
• Defines Areas, which connect to the Boundary
Routers through Area Boundary Routers and
maybe Backbone Routers
OSPF Internal Hierarchy
BGP
• So, RIP or OSPF can be used for routing
within an AS
– But when the path from source to destination
crosses several ASes, we need BGP, the Border
Gateway Protocol (currently BGP4)
• BGP gives each AS the means to
– Get subnet info from neighboring ASes
– Propagate that info to routers within the AS
– Find good routes to subnets
BGP
• BGP is massively complex (RFC 4271)
• BGP uses semi-permanent TCP
connections (using port 179) between
routers that connect ASes, and between
routers within an AS
• Each AS is identified by an ASN
(AS number)
– ASNs are assigned by ICANN regional registries; see RFC 1930
Broadcast and Multicast
• So far everything has focused on one
source and one destination trying to
communicate (unicast)
• Broadcast routing sends a packet from a
source to all other nodes in the network
• Multicast routing sends from a source
node to a selected subset of the other nodes
Broadcast Routing
• A simple way to handle broadcasting is to make
N copies of a packet, and send one to each of
the N destination nodes (hosts)
– This is called N-way unicast; it really isn’t a
broadcast method at all
• Major disadvantages of this simple approach:
– It’s really inefficient, and overloads the first link
– It’s hard to know all target addresses, unless you add
on a broadcast membership protocol
Uncontrolled Flooding
• A possible approach is to send a packet to
its neighbors, who send it to their
neighbors, etc.
• Massive problems include
– The flooding never ends if there are loops
(cycles) in the network
– With multiple interconnections, a node can
receive several copies of the same packet and
rebroadcast each one to all its neighbors, who do
the same; the copies multiply into a broadcast storm
Controlled Flooding
• Try flooding, but with more logic to prevent
a broadcast storm
• Several possible approaches
– In sequence-number-controlled flooding, the source
node puts its address and a broadcast sequence
number in the packet
• Each node checks whether it has already received
this sequence number (e.g. broadcast #1254) from
that source; if not, it duplicates the packet and
sends it to its neighbors
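The sequence-number check can be sketched as a small simulation. The four-node graph and the function name flood are made up for illustration; each node remembers the (source, sequence number) pairs it has seen and silently drops repeats, so the flood terminates even though the graph has loops.

```python
from collections import deque

def flood(graph, source, seq):
    """Simulate sequence-number-controlled flooding.

    graph: {node: [neighbors]}. Returns (nodes reached, copies transmitted).
    """
    seen = {node: set() for node in graph}  # (source, seq) pairs seen per node
    seen[source].add((source, seq))
    queue = deque((source, nbr) for nbr in graph[source])
    transmissions = 0
    while queue:
        sender, node = queue.popleft()
        transmissions += 1
        if (source, seq) in seen[node]:
            continue  # duplicate: drop it instead of flooding it again
        seen[node].add((source, seq))
        for nbr in graph[node]:
            if nbr != sender:  # do not send back over the arrival link
                queue.append((node, nbr))
    reached = {n for n in graph if (source, seq) in seen[n]}
    return reached, transmissions

# Hypothetical topology with two loops (A-B-C and B-C-D).
graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
reached, transmissions = flood(graph, "A", 1254)
print(reached, transmissions)
```

All four nodes are reached, but seven copies cross the links to deliver three useful ones, which shows that duplicates are suppressed rather than avoided.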
Controlled Flooding
– Reverse path forwarding (RPF) or reverse
path broadcasting (RPB) is subtle
• When a packet is received, send it out on all other
links ONLY IF it arrived on the link that lies on the
router’s own shortest unicast path back to the source
• Otherwise, throw it out
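A rough simulation of the RPF rule, under some stated assumptions: every link has cost 1, links are symmetric, and ties between equal-length paths are broken by neighbor order. The function names and the topology are hypothetical.

```python
from collections import deque

def rpf_next_hops(graph, source):
    """BFS from the source: parent[n] is n's next hop on a shortest path back."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def rpf_broadcast(graph, source):
    """Return (accepted copies, discarded copies) for one broadcast."""
    parent = rpf_next_hops(graph, source)
    queue = deque((source, nbr) for nbr in graph[source])
    accepted = discarded = 0
    while queue:
        sender, node = queue.popleft()
        if node != source and parent[node] == sender:
            # Arrived over the reverse shortest path: accept and forward.
            accepted += 1
            queue.extend((node, nbr) for nbr in graph[node] if nbr != sender)
        else:
            discarded += 1  # arrived on any other link: throw it out
    return accepted, discarded

graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
result = rpf_broadcast(graph, "A")
print(result)
```

Each node accepts exactly one copy and needs no per-packet memory, only its ordinary unicast routing information; the discarded copies are the price paid for that simplicity.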
Spanning-Tree Broadcast
• While the controlled flooding approaches do
avoid a broadcast storm, they can still send
duplicate packets
• A spanning tree connects all the nodes in a
network and contains no loops (cycles)
– One that has minimum cost is a minimum
spanning tree
• Hence a possible broadcast approach is to
construct a minimum spanning tree and use it
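The slides do not name a construction algorithm; as one standard possibility, here is a sketch of Prim's algorithm building a minimum spanning tree. The graph and its link costs are invented for the example, and the graph is assumed to be undirected and connected.

```python
import heapq

def minimum_spanning_tree(graph, start):
    """Prim's algorithm. graph: {node: {neighbor: cost}}.

    Returns a list of (u, v, cost) tree edges in the order they were added.
    """
    in_tree = {start}
    heap = [(cost, start, v) for v, cost in graph[start].items()]
    heapq.heapify(heap)
    edges = []
    while heap and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # this link would close a loop
        in_tree.add(v)
        edges.append((u, v, cost))
        for w, c in graph[v].items():
            if w not in in_tree:
                heapq.heappush(heap, (c, v, w))
    return edges

graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}
edges = minimum_spanning_tree(graph, "A")
print(edges)
```

A tree over N nodes always has N-1 links, so the five-link example network shrinks to three links with total cost 6.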
Spanning-Tree Broadcast
• Once defined, the spanning tree can be
used to initiate a broadcast from any node
– Each node only knows which adjacent nodes
are part of the tree
• Many algorithms can be used to create
spanning trees, such as the center-based
approach
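Once a tree exists, broadcasting over it only requires each node to know its own tree neighbors. The sketch below assumes a hypothetical four-node tree stored as an adjacency list; a node forwards on every tree link except the one the packet arrived on.

```python
def tree_broadcast(tree_neighbors, origin):
    """Return (nodes delivered to, copies sent) for a broadcast from origin."""
    sent = 0
    delivered = []
    stack = [(None, origin)]  # (link the packet came from, current node)
    while stack:
        came_from, node = stack.pop()
        delivered.append(node)
        for nbr in tree_neighbors[node]:
            if nbr != came_from:
                sent += 1
                stack.append((node, nbr))
    return delivered, sent

# Hypothetical spanning tree A - B - C - D, known only as local neighbor lists.
tree = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
from_a = tree_broadcast(tree, "A")
from_c = tree_broadcast(tree, "C")
print(from_a, from_c)
```

Whichever node starts the broadcast, exactly N-1 copies are sent and no node receives a duplicate.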
Reality v Broadcast Algorithms
• Broadcast algorithms are used at the
application and network layers
– Gnutella uses app-layer broadcasting, with a
time-to-live hop number countdown to give
limited-scope flooding
– OSPF and IS-IS use sequence-controlled
flooding to broadcast link-state
advertisements (LSAs)
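The limited-scope idea can be sketched by counting how far a query spreads before its time-to-live runs out. This is a simplified model, not Gnutella's actual message handling: it only computes which nodes are within reach, and the chain topology and function name are assumptions.

```python
def limited_flood(graph, origin, ttl):
    """Return the set of nodes reached when every hop uses up one unit of TTL."""
    reached = {origin}
    frontier = [origin]
    while frontier and ttl > 0:
        next_frontier = []
        for node in frontier:
            for nbr in graph[node]:
                if nbr not in reached:
                    reached.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
        ttl -= 1  # the countdown that limits the scope of the flood
    return reached

# Hypothetical overlay: a simple chain of five peers.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
reached = limited_flood(graph, "A", 2)
print(reached)
```

With a TTL of 2 the query from A never gets past C, which keeps the flood local at the cost of possibly missing distant peers.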
Multicast
• Multicast sends a packet only to select
nodes in a network
– There also may be more than one sender
• Examples of uses include
– Bulk software upgrades
– Streaming media to a class or meeting
– Shared apps like teleconferencing
– Data feeds (stock prices)
– Interactive gaming
Multicast
• Key problems are
– How to identify the receivers of the message
– How to address those receivers
• In unicast, the IP address of the recipient
was enough; but now, does every packet
carry the list of all recipients?
– The addressing could be larger than the message
• Solve using address indirection
Multicast
• Address indirection uses a single identifier
(here, a class D multicast address) for the
group of receivers, and addresses the packet
only with that single identifier
– The receivers that share that identifier form a
multicast group
• So how do we manage this multicast
group? Create an RFC! (duh!)
– Internet Group Management Protocol
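Class D addresses are the IPv4 addresses whose first four bits are 1110, that is, 224.0.0.0 through 239.255.255.255. A small check, with the helper name is_class_d and the sample addresses chosen only for illustration:

```python
import ipaddress

def is_class_d(address):
    """True if the IPv4 address starts with the bits 1110 (224.0.0.0/4)."""
    ip = ipaddress.IPv4Address(address)
    return (int(ip) >> 28) == 0b1110

for addr in ("224.0.0.1", "239.255.255.250", "192.168.1.10"):
    # The standard library's is_multicast flag should agree with the bit test.
    print(addr, is_class_d(addr), ipaddress.IPv4Address(addr).is_multicast)
```

A sender puts one such address in the destination field, and the network is left to work out which hosts currently belong to the group.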
IGMP
• The Internet Group Management Protocol
(IGMP), version 3, RFC 3376, works
between a gateway router (first hop router)
and its hosts – only within its LAN
• IGMP allows a host to tell the router that a
hosted app wants to join a multicast group
– Then the router communicates to other
routers using a network-layer multicast routing
algorithm, e.g. PIM, DVMRP, or MOSPF
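Applications do not usually build IGMP messages themselves. With the sockets API, a program asks to join a group and the host's IP stack sends the IGMP membership report on its behalf. The sketch below shows that request; the helper names are hypothetical, and join_group is defined but not called here because it needs a network interface that supports multicast.

```python
import socket
import struct

def membership_request(group, interface="0.0.0.0"):
    """Pack the join request: group address followed by local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def join_group(group, port):
    """Open a UDP socket and ask the OS to join `group`.

    Setting IP_ADD_MEMBERSHIP is what prompts the host to report the
    membership to its first-hop router.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock

# 239.1.2.3 is an arbitrary example group address, not one from the slides.
request = membership_request("239.1.2.3")
print(request)
```

A call such as join_group("239.1.2.3", 5007) would return a socket ready to receive datagrams sent to that group; the port number is likewise only an example.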
Multicast Routing
• Multicast routing algorithms need to
ensure that all routers with hosts in the
group get the desired packets
– Other routers might have to get them too,
but avoid that where possible
• Two major approaches are used for
multicast routing
– Using a group-shared tree
– Using a source-based tree
Using a group-shared tree
• Like the spanning-tree algorithm, build a
tree that includes all edge routers with
hosts in the group
– Uses a single tree to allow sending from any
sender; kind of a global approach
• A central node is used to coordinate the
process, so new routers send messages
to it to get added to the tree
– Also called a center-based tree approach
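A sketch of how a center-based tree might grow, assuming unit-cost links and a connected network: each router with group members sends a join toward the center along its shortest unicast path, and the path is grafted onto the tree up to the first router that is already on it. The topology, router names, and function names are hypothetical.

```python
from collections import deque

def path_to_center(graph, router, center):
    """Shortest (hop-count) path from router to center, found with BFS."""
    parent = {router: None}
    queue = deque([router])
    while queue:
        u = queue.popleft()
        if u == center:
            break
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path, node = [], center
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def build_shared_tree(graph, center, member_routers):
    """Graft each member's join path onto the tree rooted at the center."""
    tree_nodes, tree_edges = {center}, set()
    for router in member_routers:
        path = path_to_center(graph, router, center)
        for u, v in zip(path, path[1:]):
            if u in tree_nodes:
                break  # the join reached a router that is already on the tree
            tree_nodes.add(u)
            tree_edges.add(frozenset((u, v)))
    return tree_nodes, tree_edges

graph = {
    "A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D", "E"],
    "D": ["B", "C", "F"], "E": ["C"], "F": ["D"],
}
nodes, edges = build_shared_tree(graph, "C", ["A", "F", "E"])
print(sorted(nodes), len(edges))
```

Router D ends up on the tree even though it has no members, because it lies on F's path to the center, while B stays off the tree entirely.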
Using a source-based tree
• Focuses on building a separate routing tree
for each specific source (sender)
– Uses the RPF (reverse path forwarding)
algorithm, tweaked for multicast
– Can result in thousands of unwanted packets
to routers with no group members
• A router that gets unwanted packets sends
a prune message to the router upstream
from it
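The effect of pruning can be sketched by working up from the leaves of an RPF tree: a router stays on the tree only if it has group members itself or at least one downstream router that stays. The tree, router names, and function name below are invented for the example, and real prune messages also time out, which this sketch ignores.

```python
def prune(children, source, member_routers):
    """Return the set of routers left on the tree after pruning.

    children: {router: [downstream routers]} for the tree rooted at source.
    """
    kept = set()

    def visit(router):
        keep = router in member_routers
        for child in children.get(router, []):
            if visit(child):  # every child is visited, kept or not
                keep = True
        if keep:
            kept.add(router)
        return keep

    visit(source)
    kept.add(source)  # the source always remains the root
    return kept

# Hypothetical RPF tree rooted at S; only router C has group members.
children = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E"], "E": ["F"]}
kept = prune(children, "S", {"C"})
print(sorted(kept))
```

Here D prunes itself from A, F from E, E from B, and finally B from S, leaving only the path S, A, C.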
Multicast in the Internet
• The first multicast routing protocol used in the
Internet was the Distance-Vector Multicast Routing
Protocol (DVMRP, RFC 1075)
– Uses source-based trees with RPF and
pruning
– Uses a DV algorithm to find the shortest path
to the source
– Also monitors downstream dependent routers
– Has graft messages to, yes, undo a pruning
Multicast in the Internet
• The Protocol-Independent Multicast (PIM;
dense mode in RFC 3973, sparse mode in RFC 4601)
routing protocol is widely used
– Uses dense or sparse modes, depending on
the density of routers with group member
hosts
– Dense mode uses flood-and-prune RPF
– Sparse mode uses center-based tree, like the
core-based tree (CBT) protocol
– Can switch from group-shared tree to source-based
tree after joining
Multicast in the Internet
• PIM sparse domains can be joined at
rendezvous points using the Multicast Source
Discovery Protocol (MSDP, RFC 4611)
• A third option for multicast is Source-Specific
Multicast (SSM, RFC 4607)
– Under SSM only one host can send traffic into
the multicast tree, which makes defining the
tree a lot easier
Multicast in the Internet
• BGP can also support multicast (RFC
4271)
• RFC 5110 is good for more discussion of
multicast routing
• Increasingly multicast is being handled at
the application layer, such as End System
Multicast (ESM)
Multicast Babel?
• So far we’ve assumed all routers use the same
multicast protocol
– Within an AS this should be true
– But different ASes could run different protocols
• RFC 2715 defines interoperability rules for
multicast routing protocols to play nicely
with each other
– DVMRP is the de facto standard, but PIM and
BGP are also viable
Are We Dead Yet?
• Diving into the network core, we’ve covered
– Service models for datagram and VC networks
– Router components and how they work
– IPv4 and IPv6 datagram formats
– Allocation of IP addresses
– NAT and ICMP
– Link-state and distance-vector routing algorithms
Are We Dead Yet?
– Routing within and among ASes
– Routing protocols RIP, OSPF, BGP
– Broadcast routing algorithms – uncontrolled &
controlled flooding, spanning-tree
– Multicast routing algorithms – IGMP, DVMRP,
and PIM and a few more…
• And you thought the network layer was
just IP