IP Address - Zoo - Yale University
Download
Report
Transcript IP Address - Zoo - Yale University
Routing and Forwarding
11/16/2009
1
Exam 1
Highest 96, lowest 79, medium 87
2
Recap
Internet routing is divided into intra-AS
(intradomain) and inter-AS (interdomain)
routing
BGP (Border Gateway Protocol), a path-
vector protocol, is the de facto standard
interdomain protocol
Recap: BGP Routing Decision Process
route
selection
policy:
rank paths
routing cache
select best
path
export path
to neighbors
export
policy:
which
paths
export to
which
neighbors
Recap: Understanding BGP Policies Using P-Graph
Nodes in P-graph are feasible
2
paths
A directed edge from path
N1P1 to P1
210
20
0
130
10
3
1
320
30
A directed edge from a lower
ranked path to a higher
ranked path
If the P-graph has no loop,
then BGP policy converges.
1 3 0
2 1 0
3 2 0
1 0
2 0
3 0
P-graph
Internet System and Protocol
Assume a BGP path SABCD to destination AS
D. Consider the business relationship
between each pair:
S
A B
C
D
Three types of business relationships:
PC (provider-customer)
CP (customer-provider)
PP (peer-peer)
6
Typical Export Policies Imply Patterns of Routes
Invariant 1 of valid BGP routes (with labels
representing business relationship)
P
C
P
C
P
C
?P
C
Dest
Reason: only route learned from customer is sent to provider; thus
after a PC, it is always PC to the destination
Typical Export Policies Imply Patterns of Routes
Invariant 2 of valid BGP routes (with labels
representing business relationship)
CP
······
?
CP
CP/PP
Dest
Reason: routes learned from peer or provider are sent to only customers;
thus all relationship before is CP
Stability of BGP Routing
Suppose
1.
2.
there is no loop formed by provider-customer
relationship in the Internet
each AS uses typical route selection policy:
C > E/P
3.
each AS uses the typical export policies
Then BGP policy routing always converges !
Case 1: A Link is PC
Proof by contradiction. Assume a loop in P-graph. Consider a fixed link.
in the loop
PC
PC
PC
Case 2: Link is CP/PP
CP/PP
CP/PP
CP
CP
CP/PP
AS D
Routing: Example
E
d1
d
d2
AS A
(OSPF)
a2
F
i
AS C
a1
How to specify?
AS B
(OSPF intra
routing)
b
AS I
12
Outline
Recap
BGP policy routing
Internet addressing
13
IP Addressing Scheme: Requirements
We need an address to uniquely identify
each destination
Routing scalability needs flexibility in
aggregation of destination addresses
we
should be able to aggregate a set of
destinations as a single routing unit
Preview: the unit of routing in the Internet
is a network---the destinations in the routing
protocols are networks
14
IP Address: An IP Address Identifies an Interface
IP address: 32-bit
identifier for an
interface
interface:
routers typically
have multiple
interfaces
host may have
multiple interfaces
223.1.1.1
223.1.2.1
223.1.1.2
223.1.1.4
223.1.1.3
223.1.2.9
223.1.3.27
223.1.2.2
223.1.3.2
223.1.3.1
%/sbin/ifconfig -a
223.1.3.2 = 11011111 00000001 00000011 00000010
223
1
3
2
15
IP Addressing
IP address:
network part
host part
What’s a network ?
(from IP address
perspective)
is a unit of routing:
can be routed
together (depend
on the routing
protocol)
223.1.1.2
223.1.1.1
223.1.1.4
223.1.1.3
223.1.9.2
223.1.7.0
223.1.9.1
223.1.7.1
223.1.8.1
223.1.2.6
223.1.2.1 223.1.2.2
223.1.8.0
223.1.3.27
223.1.3.1
223.1.3.2
16
IP Addressing: Class-ful Addressing
given notion of “network”, let’s re-examine IP addresses:
“class-ful” addressing in the original IP design:
class
A
0 network
B
10
C
110
D
1110
1.0.0.0 to
127.255.255.255
host
128.0.0.0 to
191.255.255.255
host
network
network
multicast address
host
192.0.0.0 to
223.255.255.255
224.0.0.0 to
239.255.255.255
32 bits
Problem of class-ful addressing?
17
IP Addressing: CIDR
(Static)classful addressing:
inefficient use of address space, address space exhaustion
not flexible for aggregation
• e.g., a class A net allocated enough addresses for 16 million
hosts; a class B address may also be too big
CIDR: Classless InterDomain Routing
network portion of address of arbitrary length
address format: a.b.c.d/x, where x is # bits in network
portion of address
network
part
host
part
11001000 00010111 00010000 00000000
200.23.16.0/23
Some systems use mask (1’s to indicate network bits), instead of the /x format
18
CIDR Address Aggregation
AS D
d
d1
AS A
(OSPF)
130.132.1/24
i
i->a1: I can reach
130.132/22; my
path: I
a2
a1
130.132.2/24
AS I
intradomain
routing uses /24
130.132.3/24
19
CIDR Address Aggregation
B
x00/24: B
A
x01/24: C
C
x10/24: E
G
E
x11/24: F
F
20
IP Addressing: How to Get One?
Q: How does an ISP get its block of
addresses?
A: ICANN: Internet Corporation for Assigned
Names and Numbers
allocates addresses
manages DNS
assigns domain names, resolves disputes
Use
%whois –h whois.arin.net “n <org>”
to check addresses allocated to <org>
21
Example: Yale
Yale address blocks:
%whois –h whois.arin.net “n yale university”
Yale BGP routing:
Using one of the looking glass servers:
http://www.bgp4.as/looking-glasses
22
Routing Table Size of BGP (number of globally
advertised networks)
Active BGP Entries (http://bgp.potaroo.net/as1221/bgp-active.html)
23
Routing Table Prefix Length Distr.
180000
160000
140000
120000
100000
80000
60000
40000
20000
0
8
13
18
23
28
24
IP addresses: How to Get One?
Q: How does a host get an IP address?
Static configured
wintel: control-panel->network->configuration>tcp/ip->properties
unix:
%/sbin/ifconfig eth0 inet 192.168.0.10 netmask
255.255.255.0
DHCP: Dynamic Host Configuration Protocol:
dynamically get address from as server
“plug-and-play”
25
DHCP: Dynamic Host Configuration Protocol
Goal: allow host to dynamically obtain its IP address
from network server when it joins network
can renew its lease on address in use
allows reuse of addresses (only hold address while
connected)
support for mobile users who want to join network
DHCP msgs:
host broadcasts “DHCP discover” msg
DHCP server responds with “DHCP offer” msg
host requests IP address: “DHCP request” msg
DHCP server sends address: “DHCP ack” msg
26
Network Address Translation: Motivation
A local network uses just one public IP address as far as outside
world is concerned
Each device on the local network is assigned a private IP address
rest of
Internet
local network
(e.g., home network)
192.168.1.0/24
192.168.1.1
192.168.1.2
192.168.1.3
138.76.29.7
192.168.1.4
All datagrams leaving local
network have same single source
NAT IP address: 138.76.29.7,
different source port numbers
Datagrams with source or
destination in this network
have 192.168.1/24 address for
source, destination (as usual)
27
NAT: Network Address Translation
Implementation: NAT router must:
outgoing datagrams: replace (source IP address, port
#) of every outgoing datagram to (NAT IP address,
new port #)
. . . remote clients/servers will respond using (NAT
IP address, new port #) as destination addr.
remember (in NAT translation table) every (source
IP address, port #) to (NAT IP address, new port #)
translation pair
incoming datagrams: replace (NAT IP address, new
port #) in dest fields of every incoming datagram
with corresponding (source IP address, port #)
stored in NAT table
28
NAT: Network Address Translation
NAT translation table
WAN side addr
LAN side addr
1: host 192.168.1.2
2: NAT router
sends datagram to
changes datagram
138.76.29.7, 5001 192.168.1.2, 3345
128.119.40.186, 80
source addr from
……
……
192.168.1.2, 3345 to
138.76.29.7, 5001,
S: 192.168.1.2, 3345
updates table
D: 128.119.40.186, 80
2
S: 138.76.29.7, 5001
D: 128.119.40.186, 80
138.76.29.7
S: 128.119.40.186, 80
D: 138.76.29.7, 5001
3: Reply arrives
dest. address:
138.76.29.7, 5001
3
192.168.1.2
1
192.168.1.1
192.168.1.3
S: 128.119.40.186, 80
D: 192.168.1.2, 3345
4
192.168.1.4
4: NAT router
changes datagram
dest addr from
138.76.29.7, 5001 to 192.168.1.2, 3345
29
Network Address Translation: Advantages
No need to be allocated range of addresses
from ISP: - just one public IP address is
used for all devices
16-bit port-number field allows 60,000
simultaneous connections with a single LAN-side
address !
can change ISP without changing addresses of
devices in local network
can change addresses of devices in local network
without notifying outside world
Devices inside local net not explicitly
addressable, visible by outside world (a
security plus)
30
Network Address Translation: Problems
If both hosts are behind NAT, they will have
difficulty establishing connection
NAT is controversial:
routers
should process up to only layer 3
violates end-to-end argument
• NAT possibility must be taken into account by app
designers, e.g., P2P applications
address
shortage should instead be solved by
having more addresses --- IPv6 !
31
Outline
Admin. and recap
BGP
IP addressing
IP forwarding
32
IP Datagram Format
IP protocol version
number
header length
(bytes)
“type” of data
max number
remaining hops
(decremented at
each router)
upper layer protocol
to deliver payload to
how much overhead
with TCP?
20 bytes of TCP
20 bytes of IP
= 40 bytes + app
layer overhead
32 bits
head. type of
length
ver
len service
fragment
16-bit identifier flgs
offset
time to upper
Internet
layer
live
checksum
total datagram
length (bytes)
for
fragmentation/
reassembly
32 bit source IP address
32 bit destination IP address
Options (if any)
data
(variable length,
typically a TCP
or UDP segment)
E.g. timestamp,
record route
taken, specify
list of routers
to visit.
33
Data Forwarding: Steps
Error checking, e.g., check header checksum;
if error, set up error flag
Decrement TTL; if TTL == 0, set error flag
If error, drop the packet, and generate
ICMP report
34
The Network Layer
Host, router network layer functions:
Transport layer: TCP, UDP
Network
layer
The IP protocol
•addressing
•datagram format
Routing protocols
•path selection
•RIP, OSPF, BGP
forwarding
ICMP protocol
•error reporting
•router “signaling”
Link layer
Physical layer
35
ICMP: Internet Control Message Protocol
communicate network-level
information
error reporting:
unreachable host, network,
port, protocol
echo request/reply (used by
ping)
network-layer “above” IP:
ICMP msgs carried in IP
datagrams
ICMP message: type, code plus
first 8 bytes of IP datagram
causing error
type
code
checksum
ICMP message body
Type
0
3
3
3
3
3
3
4
Code
0
0
1
2
3
6
7
0
8
9
10
11
12
0
0
0
0
0
description
echo reply (ping)
dest network unreachable
dest host unreachable
dest protocol unreachable
dest port unreachable
dest network unknown
dest host unknown
source quench (congestion
control - not used)
echo request (ping)
route advertisement
router discovery
TTL expired
bad IP header
traceroute is developed by a clever use
of ICMP
36
Data Forwarding: Steps
If no error, look up packet destination
address in forwarding table:
if datagram for a host on directly attached
network, it is the job of the link layer now
otherwise,
• lookup: find next-hop router, and its outgoing interface
• if needed, do fragmentation
• forward packet to outgoing interface (to the next hop
neighbor)
try %netstat –rn to see the forwarding table
37
Forwarding Look up
default: -
#
prefix
a)
b)
c)
d)
e)
f)
g)
h)
i)
j)
k)
00001
00010
00011
001
0101
011
10
100
1010
1011
1100
interface
0
1
a
g
f
d
e
abc
h
i
jk
The networks are represented by
a decision tree, e.g., a Patricia Trie
to look for the longest match of
the destination address
38
Example 1 (same network): A->B
src
forwarding table in A
dst
misc
data
fields 223.1.1.1 223.1.1.3
Look up dest address
find dest is on same net
link layer will send the
datagram directly inside a
link-layer frame
Dest. Net. next router
Nhops
223.1.1/24
1
223.1.2/24 223.1.1.4
2
223.1.3/24 223.1.1.4
2
0.0.0.0/0 223.1.1.4
To Internet
A 223.1.1.1
223.1.4.1
223.1.2.1
B
223.1.1.2
223.1.1.4
223.1.2.9
223.1.2.2
223.1.1.3
223.1.3.1
223.1.3.27
E
223.1.3.2
39
Example 2 (Different Networks): A-> E
forwarding table in A
misc
data
fields 223.1.1.1 223.1.2.3
look up dest address in
forwarding table
routing table: next hop
router to dest is 223.1.1.4
link layer sends datagram to
router 223.1.1.4 inside a linklayer frame
the dest. of the link layer
frame is 223.1.1.4
Dest. Net. next router
Nhops
223.1.1/24
1
223.1.2/24 223.1.1.4
2
223.1.3/24 223.1.1.4
2
0.0.0.0/0 223.1.1.4
To Internet
A 223.1.1.1
223.1.4.1
223.1.2.1
B
223.1.1.2
223.1.1.4
223.1.2.9
223.1.2.3
223.1.1.3
223.1.3.1
223.1.3.27
E
223.1.3.2
40
Example 2 (Different Networks): A-> E
misc
data
fields 223.1.1.1 223.1.2.3
Arriving at 223.1.1.4,
destined for 223.1.2.2
look up dest address in
router’s forwarding table
E on same network as router’s
interface 223.1.2.9
router, E directly attached
link layer sends datagram to
223.1.2.2 inside link-layer
frame via interface 223.1.2.9
datagram arrives at
223.1.2.2!! (hooray!)
forwarding table in router
Dest. Net router Nhops interface
223.1.1/24
223.1.2/24
223.1.3/24
0.0.0.0/0
A
B
-
223.1.1.1
1
223.1.1.4
1
223.1.2.9
1
223.1.3.27
223.1.4.1
To Internet
223.1.4.1
223.1.2.1
223.1.1.2
223.1.1.4
223.1.2.9
223.1.2.3
223.1.1.3
223.1.3.1
223.1.3.27
E
223.1.3.2
41
Backup Slides
42
Another Method for Checking
Convergence: Dispute Wheels
A dispute
wheel
prefers
R1Q2 over Q1
u2 prefers
R2 Q3 over
Q2
etc
u1
2
210
20
R1
Q2
Q1
u1
130
10
1
3
Q3
d
u3
Qn
Rn
un
0
R2
u2
R3
320
30
43
Patterns of Valid Routes
Consider how a path is
extended according to the
export policies
case 1 (format: Destination,
link type, …)
• Dest, …CP Dest, … CP CP
• Dest, …CP Dest, … CP PC
• Dest, …CP Dest, … CP PP
case 2
Valid consecutive link types
(starting from source to
destination, i.e., reserve)
PC PC
CP PC
PP PC
CP CP
CP PP
• Dest, …PC Dest, … PC PC
case 3
• Dest, …PP Dest, … PP PC
44
Case 1: The first link of Q1 is a PC link
Then the first link of R1Q2 must be a PC
link
because u1 chooses R1Q2 and C > E/P
Thus all remaining links along R1Q2 are PC
links
because only PC follows PC
u1
Thus the first link of Q2 is
a PC link
……
All links along R1, …, Rn are
PC links
The network has a PC loop.
contradiction !
PC
Rn
R1
PC
Q1
u2
Qn
un
d
Rn-1
Q2
R2
45
Case 2: The first link of Q1 is a CP/PP link
All links along Rn are CP links because all
links before CP/PP are CP
The first link of Qn must be a CP/PP link
because C > E/P
……
CP
Rn
u1
CP/PP
Q1
All links along R1, …, Rn are CP links
The network has a PC loop.
contradiction !
u2
Qn
un
CP/PP
Rn-1
d
Q2
R2
46
Backup: IP Multicast
47
IP Fragmentation & Reassembly
Network links have MTU
(max.transfer size) largest possible link-level
frame.
different link types,
different MTUs, e.g.
Ethernet MTU is 1500
bytes
Large IP datagram divided
(“fragmented”)
one datagram becomes
several datagrams
“reassembled” only at
final destination
IP header bits used to
identify, order related
fragments
fragmentation:
in: one large datagram
out: 3 smaller datagrams
reassembly
48
IP Fragmentation and Reassembly
Example
4000 byte
datagram
MTU = 1500 bytes
length ID fragflag offset
=4000 =x
=0
=0
One large datagram becomes
several smaller datagrams
length ID fragflag offset
=1500 =x
=1
=0
length ID fragflag offset
=1500 =x
=1
=1480
length ID fragflag offset
=1040 =x
=0
=2960
49
IP Multicast: Service Model
128.59.16.12
128.119.40.186
multicast
group
226.17.30.197
128.34.108.63
128.34.108.60
Multicast group concept: use of indirection
Open group model
A group is identified by a location-independent
logical address (class D IP address: prefix 1110)
Anyone can send packets to the “logical” group address
Anyone can join a group and receive packets
Normal, best-effort delivery semantics of IP
Needed: infrastructure to deliver mcast-addressed datagrams to
all hosts that have joined that multicast group
50
Multicast Across LANs
Goal: find a tree (or trees) connecting routers having
local mcast group members
source-based: different tree from sender to each receiver
– Distance-vector multicast routing protocol (DVMRP)
– Protocol-independent multicast-dense mode (PIM-DM)
shared-tree: same tree used by all group members
– Core-Based Tree (CBT)
– Protocol-independent multicast-sparse mode (PIM-SM)
shared tree
source-based trees
51
Source Tree:
Reverse Path Flooding (RPF)
A router x forwards a packet from source (S)
iff it arrives via neighbor y, and y is on the
shortest path from x back to S
A packet is replicated to all but the incoming
interface
1
S
1
y
x
1
1
z
t
1
a
52
Reverse Path Forwarding:
Improvement
Basic idea: forward a packet from S only on
child links for S
A child link of router x for source S
a link that has x as parent on
the shortest path from the
link to S
a child x notifies its parent y
(through the routing protocol)
that it has selected y as its
parent
S
y
x
z
t
a
53
Reverse Path Forwarding: Pruning
No
need to forward datagrams down subtree
with no mcast group members
“prune” msgs sent upstream by router with
no downstream group members
LEGEND
S: source
R1
router with attached
group member
R4
R2
P
R5
R3
R6
P
R7
P
router with no attached
group member
prune message
links with multicast
forwarding
54
Pruning
Prune (Source, Group) at a leaf router if no members
send No-Membership Report (NMR) up tree
If all children of router R prune (S,G)
propagate prune for (S,G) to its parent
What do you do when a member of a group (re)joins?
send a Graft message to upstream parent
How to deal with failures?
prune dropped
flow is reinstated
down stream routers re-prune
Note: again a soft-state approach
55
Implementation of
Source Trees in the Internet
Multicast OSFP (MOSFP)
Membership is part of the link state distribution; calculate
source specific, pre-pruned trees
Reverse Path Forwarding
Distance Vector Multicast Routing Protocol (DVMRP)
Protocol Independent Multicast – Dense Mode (PIM-DM)
• very similar to DVMRP
Difference: PIM uses any unicast routing algorithm to
determine the path from a router to the source; DVMRP
uses distance vector
Question: the state requirement of Reverse Path
Forwarding
56
Building a Shared Tree
Steiner Tree: minimum cost
tree connecting all routers
with attached group members
A Steiner tree is not a
spanning tree because you
do not need to connect all
nodes in the network
Problem is NP-hard
Excellent heuristics exists
Not used in practice:
computational complexity
information about entire network needed
monolithic: rerun whenever a router needs to join/leave
57
Center (Core) based Shared Tree
Single delivery tree shared by all
One router identified as “center” of tree
Tree construction is receiver-based
edge router sends unicast join-msg addressed to center
router
join-msg “processed” by intermediate routers and
forwarded towards center
join-msg either hits existing tree branch for this center,
or arrives at center
path taken by join-msg becomes new branch of tree for
this router
A sender unicasts a packet to center
The packet is distributed on the tree when it hits the tree
58
Example: M3 Joins
Group members: M1, M2
core
M1
M2
M3
shared tree
join message
S1
Discussion: what is property of the constructed tree?
59
Example: M1 Sends Data
Group members: M1, M2, M3
M1 sends data
core
M1
M2
control (join) messages
data
M3
S1
60
Shared Tree Protocols in the Internet
Core Based Tree
Protocol Independent Multicast (PIM)
Sparse mode
The catch: how do you know the center?
session announcement
61
Mbone: Tunneling
Q: How to connect “islands” of multicast
routers in a “sea” of unicast routers?
physical topology
logical topology
mcast datagram encapsulated inside “normal” (non-multicast-
addressed) datagram
normal IP datagram sent thru “tunnel” via regular IP unicast to
receiving mcast router
receiving mcast router unencapsulates to get mcast datagram
62