Broadcast Routing
Download
Report
Transcript Broadcast Routing
Where we are in the big picture
host
host
HTTP message
HTTP
TCP segment
TCP
router
IP
Ethernet
interface
5/23/05
HTTP
IP packet
Ethernet
interface
IP
TCP
router
IP packet
SONET
interface
SONET
interface
1
IP
IP packet
Ethernet
interface
IP
Ethernet
interface
CS118/Spring05
Broadcast Routing
Goal: Deliver packets from a source to all other nodes
Why letting source send to everyone not a good solution
the source may not know all the destination addresses
Redundant transmission over links near the source
R1
replicate
R1
replicate
R2
R2
R3
R4
R4
in-network
replication
source
replication
5/23/05
R3
2
CS118/Spring05
In-network replication
Flooding: when a node receives a broadcast packet,
sends a copy to all neighbors
Problems: packet looping
Controlled flooding: node broadcast a packet if it hasn’t
seen the same packet before
Node must keep track of packet ids already seen
Reverse Path Forwarding (RPF): a node N forwards
packet if it arrived on shortest path between N and
source
source
R1
R4
Make use of forwarding table
of unicast routing
R2
R3
5/23/05
3
R6
R5
R7
CS118/Spring05
Spanning Tree
First construct a spanning tree
Nodes forward copies only along spanning tree
No redundant packets received by any node
All sources send data along the same tree
A
A
B
c
F
E
c
D
F
G
E
D
G
(b) Broadcast initiated at D
(a) Broadcast initiated at A
5/23/05
B
4
CS118/Spring05
Creating a center-based Spanning Tree
First pick a center node (E in this example)
Each node sends unicast join message towards
the center node
Message
forwarded until it arrives at a node already
belonging to spanning tree
A
B
c
F
5/23/05
A
B
c
D
F
E
E
D
G
G
(a) Stepwise construction
of spanning tree
(b) Constructed spanning
tree
5
CS118/Spring05
Multicast Routing
Goal: build a tree to reach all members in a mcast group
shared-tree: same tree used by all group members
minimal spanning tree (Steiner)
center-based tree
source-based tree: one tree from each sender to receivers
Each tree made of shortest paths to all mcast members
2 ways to build: link-state (MOSPF), RPF (DVMRP)
5/23/05
Shared tree
6
Source-based trees
CS118/Spring05
Shared-Tree: Steiner Tree
Steiner Tree: minimum cost tree connecting all
routers with attached group members
problem is NP-complete
excellent heuristics exists
not used in practice:
computational
complexity
Infeasible to get/keep information about entire
network
monolithic: rerun whenever a router needs to
join/leave
5/23/05
7
CS118/Spring05
Center-based shared routing tree
one router identified as “center” of tree
router with mcast member attached sends unicast
join-msg addressed to the center router
join-msg
processed by intermediate routers before
being forwarded towards center
join-msg either hits existing tree branch for this
center, or arrives at center
path taken by join-msg becomes new branch of tree
R1
R4
R2
R3
5/23/05
8
R6
R5
R7
CS118/Spring05
Building Shortest Path Tree: MOSPF
OSPF: a link-state routing protocol
Each node maintains a complete network graph
Use Dijkstra algorithm to compute the shortest path
MOSPF (Multicast OSPF)
Carry multicast membership in link-state packet
Router computes shortest path tree for each source
S: source
R1
R4
R2
R5
R3
5/23/05
R6
LEGEND
router with attached
group member
router with no attached
group member
R7
9
CS118/Spring05
Reverse Path Forwarding: example
S: source
LEGEND
R1
R4
router with attached
group member
R2
router with no attached
group member
R5
R3
R6
datagram will be
forwarded
datagram will not be
forwarded
R7
• result is a source-specific reverse shortest path
broadcast tree
– may not be a good tree if link cost is asymmetric
– It's a broadcast tree, reaching every node
5/23/05
10
CS118/Spring05
Trim Broadcast Tree to Mcast Tree by Pruning
no need to forward packets down branches which
has no mcast group members
router with no downstream group members sends
“prune” message upstream
Routers
keep state regarding prune msgs
S: source
LEGEND
R1
router with attached
group member
R4
R2
P
R5
R3
5/23/05
R6
P
R7
11
P
router with no attached
group member
prune message
links with multicast
forwarding
CS118/Spring05
Distance-Vector Multicast Routing Protocol
(DVMRP, specified in RFC1075)
DVMRP routers run RIP to build routing table
Use reverse path forwarding to build source-based tree
initial datagram to mcast group floods everywhere via RPF
Edge routers with no members send prune msg upstream
soft state: router periodically delete prune state
Sender may have finished by then
R1
If not, downstream router prune again
routers can quickly graft to tree
By canceling the prune state
R2
R4
P
R5
R3
5/23/05
12
R6
P
R7
CS118/Spring05
PIM: Protocol Independent Multicast
independent from underlying unicast routing
algorithm
Either
get "next hop" information for each node from
the unicast forwarding table, or
Use unicast routing to forward mcast join message
two different multicast distribution scenarios:
Dense: group members densely packed, in “close”
proximity
Sparse: # networks with group members small wrt to
the total # of interconnected networks, group
members widely dispersed
5/23/05
13
CS118/Spring05
Consequences of Sparse-Dense Dichotomy:
Dense
group membership by
routers assumed until
routers explicitly prune
data-driven construction
on mcast tree (e.g., RPF)
Implementation:
flood-and-prune RPF
similar to DVMRP but
using info from underlying
unicast protocol for RPF
checking
5/23/05
Sparse
no membership until
routers explicitly join
receiver- driven
construction of mcast tree
(e.g., center-based)
14
CS118/Spring05
PIM - Sparse Mode
Build center-based shared tree
router sends join msg to
rendezvous point (RP)
Data sources:
unicast packets to RP, which
forwards down RP-rooted tree
RP can send stop msg to
source if no receivers joined
the group
5/23/05
R1
intermediate routers update state
and forward join msgs
R4
join
R2
R3
join
R5
join
R6
all data multicast
from rendezvous
point
R7
rendezvous
point
“no one is listening!”
15
CS118/Spring05
Components of the IP Multicast Architecture
hosts
host-to-router protocol
Internet Group Management Protocol
routers
multicast routing protocols
(MOSPF, DVMRP, PIM)
IGMP operates between Router and local Hosts on the
same network (e.g., Ethernet)
Router queries local Hosts for mcast group membership info
Hosts respond with membership reports
5/23/05
16
CS118/Spring05
IP Multicast Address
131.179.26.38
18.4.157.100
Class D IP addresses:
1 110
group ID
in “dotted decimal” notation: 224.0.0.0 — 239.255.255.255
Two administrative categories:
5/23/05
well-known multicast addresses, assigned by IANA
(the rest) transient addresses, assigned & reclaimed dynamically
17
CS118/Spring05
How IGMP Works (I)
One router is elected the “querier” on each
local/physical network
querier periodically sends Membership Query
message to “all-systems group” (224.0.0.1) with
TTL=1
routers: Q
hosts:
5/23/05
18
CS118/Spring05
How IGMP Works (II)
On receipt, a host starts a random timer [0–10 sec] for each
multicast group it wants to join
when a host’s timer for group G expires, it sends a Membership
Report to group G (TTL = 1)
other members of G hear the report, stop their timers
routers hear all reports
Normal case: only one report message per group is sent in response
to a query
when a host first joins a group, it can send unsolicited reports
immediately
Q
G
5/23/05
G
G
19
G
CS118/Spring05
How IGMP Works: Leaving a Multicast Group
host sends a Leave Group msg to group address
G if it was the most recent host to report
membership in that group
Upon receiving Leave Group msg: query router
sends a few queries to group G with a small maxresponse-time
if
no Membership Report heard, stop data forwarding
Q
G
5/23/05
20
G
G
G
CS118/Spring05
IGMP message types
IGMP Msg type
Sent by
Purpose
membership query: general
router
query for current active multicast
groups
membership query: specific
router
query for specific m-cast group
membership report
host
host wants to join group
leave group
host
host leaves the group
IP header
5/23/05
21
CS118/Spring05
Chapter 5: the Data Link Layer: overview
data delivery between two physically connected
devices
implementation of various link layer
technologies:
Ethernet,
hubs, bridges, IEEE 802.11 LANs, PPP
Address mapping between LAN and IP: ARP
M
Ht
M
Hn Ht
Hl Hn Ht
M
M
application
transport
network
link
physical
data link
protocol
phys. link
network
link
physical
Hl Hn Ht
M
frame
adapter card
5/23/05
22
CS118/Spring05
Link Layer Services
sending data over a physical link
1. bit encoding: transmitting sequence of 1’s and 0’s by signals
2. Framing: defining the beginning & end of a data chunk
3. bit error detection
4. reliable transmission
MAC (Medium Access Control) addresses to identify source,
destination nodes
Different address, not IP address!
Channel access if shared medium
e.g, Ethernet, wireless, etc.
pacing between sender and receivers
implemented in “adapter” (e.g., PCMCIA card, Ethernet card)
Flow Control:
Link type: Half-duplex vs. full-duplex
5/23/05
23
CS118/Spring05
Data Framing
Terminology: for a block of data
at link layer: normally called a data frame
at network layer: a packet
at transport level: TCP segment; UDP datagram
A frame/packet has a header field
optionally there may be a trailer field too
data
Byte-Oriented Framing Protocol: delineate frame with a
special bit sequence: 01111110
Q: What if the bit sequence 01111110 occurs in data stream?
5/23/05
24
CS118/Spring05
Byte stuffing
Sender: adds (“stuffs”) extra < 01111110> byte
after each apperance of < 01111110>
Receiver:
single 01111110: flag byte
two back-to-back 01111110 bytes: discard first
byte, continue data reception
stuffing changes the total length of data to be sent
5/23/05
25
CS118/Spring05
Error Detection
EDC= Error Detection and Correction bits (redundancy)
D = Data protected by error checking
• Error detection not 100% reliable!
• protocol may miss some errors, though rarely
• larger EDC field gives better detection and correction
5/23/05
26
CS118/Spring05
Parity Checking
Single Bit Parity:
Two Dimensional Bit Parity:
Detect single bit errors
Detect and correct single bit errors
consider a data frame as
a m n matrix
A parity bit for each row
n-bit checksum,
and
A parity bit for each column
m-bit checksum
5/23/05
27
CS118/Spring05
Cyclic Redundancy Check (CRC)
consider a data frame as a bit sequence M(x)
e.g. 10011010 M(x) = x7 + x4 + x3 + x1
k-term polynomial has a degree of (k-1)
• e.g. 10011010 represents a 8-term polynomial
use a (r+1)-bit generator polynomial G(x) to
compute the checksum for M(x)
sender
makes the transmitted bit sequence dividable by
G(x) by appending r-bit remainder to M(x)
receiver divides the received sequence by G(x)
M
M
5/23/05
28
CS118/Spring05
How to compute CRC
1. append r zero bits to M(x) to get xrM(x)
M(x) = 10011010 = x7 + x4 + x3 + x1, G(x) = 1101=x3 + x2 + 1, r = 3
then xrM(x) = x10 + x7 + x6 + x4 10011010000
2. divide xrM(x) by G(x)
10011010000
/ 1101 = 11111001, remainder 101
3. subtract the remainder from xrM(x) T(x), check-summed
frame to be transmitted
xrM(x) / G(x) = Q(x), remainder R(x), T(x) = xrM(x) - R(x)
In reality: just append R(x) to the end of data frame M(x) :
10011010 101
Because: xrM(x) = shifting M(x) to the left by r bits, and
for XOR: r bits of 0's – R(x) = R(x)
At receiving end: if the received frame P(x) divisible by G(x),
P(x)/G(x) = 0, P(X) considered error free
5/23/05
29
CS118/Spring05
An example CRC computation
11111001
1101 10011010000
1101
1001
1101
1000
1101
1011
1101
1100
1101
1000
1101
remainder 101
11111001
1101 10011010101
1101
1001
1101
1000
1101
1011
1101
1100
1101
1101
1101
0
computation by sender
5/23/05
computation by receiver
30
CS118/Spring05
Multiple Access protocols
The problem: single shared communication channel, only
one node can send successfully at a time
3 broad classes:
Channel Partitioning
• divide channel into smaller “pieces” (time slots, frequency band, code
modulation)
• allocate piece to node for exclusive use
“Taking turns”
• coordinate shared access to avoid collision
5/23/05
Random Access
• Detect and resolve collisions
31
CS118/Spring05
Channel Partitioning: TDMA and FDMA
TDMA: Time Division Multiple Access
channel divided into N fixed length time slots, one per station
each station takes turns to access channel
unused slots go idle
example: 6-station LAN, 1,3,4 send data, slots 2,5,6 idle
FDMA
channel spectrum divided into
frequency bands
each station assigned fixed
frequency band
unused frequency bands go idle
example: 6-station LAN, 1, 3, 4
send data, frequency bands 2,5,6 idle
5/23/05
frequency bands
32
CS118/Spring05
“Taking Turns” MAC protocols
channel partitioning: commonalities
Share
channel efficiently with constant, uniform load
But inefficient with random, non-uniform load
• delay in channel access
• 1/N bandwidth allocated even if only 1 active node!
“taking turns” protocols: on-demand channel allocation
Polling: master node asks slave nodes to transmit in turn
• Concerns: polling latency, single point of failure
Token
5/23/05
passing
33
CS118/Spring05
“Taking Turns” by Token Passing
One token message passed from one node to next
sequentially
whoever gets the token can send one data frame, then
passes token to next node
A master station generates the token and monitors its
circulation
concerns:
token overhead
latency
single point of failure (token)
5/23/05
34
CS118/Spring05
Random Access protocols
When node has packet to send
transmit at full channel data rate R.
no a priori coordination among nodes
If two or more nodes transmitting at the same time
“collision”
random access MAC protocol specifies:
how to detect collisions
how to recover from collisions
Examples of random access MAC protocols:
ALOHA
slotted ALOHA
CSMA and CSMA/CD
5/23/05
35
CS118/Spring05
ALOHA
If a station has data to send: just send
collision probability:
frame sent at t0 collide with other packets sent in [t0-1, t0+1]
P(success by given node) = P(node transmits) .
P(no other node transmits in [t0-1,t0] .
P(no other node transmits in [t0,t0+1]
= p . (1-p)N-1 . (1-p)N-1
P(success by any node) = N p . (1-p) 2(N-1) choosing optimum p as n ∞ ...
= 1/(2e) = 0.18
5/23/05
36
CS118/Spring05
Slotted Aloha
Assumptions:
All frames the same size; clocks in all nodes are
synchronized
Divide time into equal size slots (= pkt trans.
time)
If 2 or more nodes transmit in the same slot, all
nodes detect collision
5/23/05
37
CS118/Spring05
Slotted Aloha
Operations:
When a node gets data to send, it transmits at beginning
of next slot
If no collision, node can send new frame in next slot
If collision: node retransmits frame in each subsequent
slots with probability p, until successful.
Success (S), Collision (C), Empty (E) slots
5/23/05
38
CS118/Spring05
Slotted Aloha efficiency
Q: what is the max fraction of slots successful?
each node transmits in a slot with probability
prob. successful transmission S is
p
by a given node: S= p (1-p)(N-1)
by any of N nodes
S = Prob (only one transmits) = N p (1-p)(N-1)
… choosing optimum p as n ∞ ...
= 1/e = 0.37
0.4
0.3
Slotted Aloha
0.2
0.1
Pure Aloha
0.5
5/23/05
At best: channel
use for useful
transmissions 37%
of time!
1.0
1.5
G = offered load = N*p
2.0
39
CS118/Spring05
CSMA: Carrier Sense Multiple Access
listen before transmit
If channel sensed idle: transmit
If channel sensed busy, wait
p-persistent CSMA: when channel becomes idle, retry
immediately with probability p
Non-persistent CSMA: retry
after a random interval
collisions still possible:
propagation delay two or
more nodes may send
simultaneously
Chance of collision goes up
with distance between nodes
To cut the loss early: CSMA/CD
5/23/05
40
CS118/Spring05
CSMA/CD (Collision Detection)
Collision Detection: compare
transmitted with received signals
Abort collided transmissions
5/23/05
41
CS118/Spring05
Classification of
different multiaccess protocols
Multiple data sources sharing
the same communication link
(Ethernet, FDDI, Appletalk, etc)
controlled access
random access
ALOHA,
slotted
ALOHA
adaptive
to demand
CSMA/CD
token
passing
polling
Static
allocation
TDMA
FDMA
CDMA
5/23/05
42
CS118/Spring05
LAN Addresses and ARP
32-bit IP address: network-layer address
used to get IP packet to destination host
LAN (or MAC or physical) address:
used to get frame from one interface to another physically
connected interface (same network)
48 bit MAC address (for most LANs)
burned in the adapter ROM
Each adapter on LAN has
a unique LAN address
Broadcast address:
FF-FF-FF-FF-FF-FF
5/23/05
43
CS118/Spring05
LAN Address (more)
MAC address allocation administered by IEEE
manufacturer buys portion of MAC address space (to
assure uniqueness)
Analogy:
(a) MAC address: like Social Security Number
(b) IP address: like postal address
MAC flat address => portability
can move LAN card from one LAN to another
IP hierarchical address NOT portable
depends on network to which one attaches
5/23/05
44
CS118/Spring05
Recall earlier routing discussion
Node A sends IP packet to B:
• look up network address of B,
find B is on the same net as A
• link layer send datagram to B
inside link-layer frame
A
223.1.1.1
223.1.2.1
B
223.1.1.2
223.1.1.4 223.1.2.9
223.1.1.3
223.1.3.27
223.1.3.1
frame source,
dest address
B’s MAC A’s MAC
addr
addr
223.1.2.2
E
223.1.3.2
datagram source,
dest address
A’s IP
addr
B’s IP
addr
IP payload
datagram
frame
5/23/05
45
CS118/Spring05
ARP: Address Resolution Protocol
Question: how to determine
MAC address of B
given B’s IP address?
Each IP node (Host,
237.196.7.78
1A-2F-BB-76-09-AD
237.196.7.23
Router) on LAN has
ARP table
ARP Table: IP/MAC
address mappings for
some LAN nodes
237.196.7.14
LAN
71-65-F7-2B-08-53
237.196.7.88
5/23/05
< IP address; MAC address; TTL>
58-23-D7-FA-20-B0
0C-C4-11-6F-E3-98
46
TTL (Time To Live):
time after which address
mapping will be
forgotten (typically 20
min)
CS118/Spring05
ARP protocol
A knows B's IP address, wants to learn B's MAC address
A broadcasts ARP query pkt, containing B's IP address
Dest MAC address = FF-FF-FF-FF-FF-FF
all machines on LAN receive ARP query
B receives ARP packet, replies to A with its (B's) physical layer
address
reply sent to A’s MAC address (unicast)
A caches (saves) IP-to-physical address pairs until information
becomes old (times out)
soft state: information that times out (goes away) unless refreshed
ARP is “plug-and-play”:
nodes create their ARP tables without intervention from net
administrator
5/23/05
47
CS118/Spring05
Routing to another LAN
A creates IP packet with source A, destination B
A uses ARP to get R’s physical layer address for 111.111.111.110
A creates Ethernet frame with R's MAC address as destination
A’s data link layer sends the Ethernet frame
R’s data link layer receives Ethernet frame
R extracts IP datagram from Ethernet frame, sees it’s destined to B
R uses ARP to get B’s physical layer address
R creates frame containing A-to-B IP packet, sends to B
A
R
5/23/05
B
48
CS118/Spring05
Ethernet
first widely used LAN technology, dominant LAN today
Kept up with speed race: 10 Mbps – 10 Gbps
Bus topology popular through mid 90s
Now star topology prevails
Connection choices: hub or switch (more later)
hub or
switch
5/23/05
49
CS118/Spring05
Schedule for next week
Makeup lecture: Monday 6/6
8:00-9:50AM,
5422BH
6:00-7:50PM, 5419BH
Tuesday 6/7: regular class hours
Wednesday 6/8: office hour cancelled
Thursday 6/9: no class
Saturday 6/11:
Office
hour: 10:00AM - 1:00PM
Final exam: 3:00-6:00PM, 2444BH
5/23/05
50
CS118/Spring05
Ethernet Frame Structure
46 – 1500 bytes
Sending adapter encapsulates IP datagram (or other
network layer protocol packet) in Ethernet frame
Preamble:
7 bytes with pattern 10101010 followed by one byte with pattern
10101011
used to synchronize receiver, sender clock rates
Addresses: 6 bytes
Type: indicates the higher layer protocol
IEEE802.3 changed the “type” field to “length”, defined a
separate type field carried in the data part
CRC: checked at receiver, if error, drop the frame
5/23/05
51
CS118/Spring05
Ethernet Frame length limit
All packets/frames have an upper bound on length
Ethernet has a lower bound for reliable collision detection:
cable max length 2500m 2 Pdelay 51.2 sec
transceiver can send & listen at the same time
If the received data stream differs from the one transmitted
collision
to detect collision: the sender must still be transmitting when
garbled signal propagates back
minimum frame = 64bytes = 512bits, take 51 sec to transmit at
10Mbps
A
5/23/05
B
52
CS118/Spring05
Ethernet MAC protocol: CSMA/CD
A: sense channel, wait if necessary until it is idle
transmit and monitor the channel;
If detect another transmission
then {
abort and send jam signal;
update # collisions (n++);
delay for K x 512bits trans time
goto A
}
else {done with the frame; set #collisions to zero (n = 0)}
Jam Signal: make sure all other transmitters are aware of collision; 48
bits
Exponential Backoff algorithm:
first collision (n=1): choose K from {0, 1}
after second collision (n =2): choose K from {0, 1, 2, 3}…
after 10 collisions (n=10), choose K from {0,1,2,3,4,…,1023}
5/23/05
53
CS118/Spring05
CSMA/CD efficiency
Tprop = max. propagation delay between 2 nodes in LAN
Ttrans = time to transmit a max-size frame
efficiency
1
1 5t prop /t trans
Efficiency goes to 1
as Tprop goes to 0
as Ttrans goes to infinity
Much better than ALOHA, but still decentralized, simple,
and cheap
Best effort data delivery
5/23/05
54
CS118/Spring05
223.1.1.1
DHCP
server
223.1.2.5
223.1.1.2
223.1.1.4
223.1.2.1
DHCP client-server
arriving new host
scenario
223.1.2.9
223.1.2.2
223.1.1.3
223.1.3.1
223.1.3.27
DHCP discover
223.1.3.2
DHCP server:
223.1.2.5
arriving
client
src : 0.0.0.0, 68
dest.: 255.255.255.255,67
yiaddr: 0.0.0.0
transaction ID: 654
DHCP offer
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddrr: 223.1.2.4
transaction ID: 654
Lifetime: 3600 secs
DHCP request
src: 0.0.0.0, 68
dest:: 255.255.255.255, 67
yiaddrr: 223.1.2.4
transaction ID: 655
Lifetime: 3600 secs
time
DHCP ACK
5/23/05
55
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddrr: 223.1.2.4
transaction ID: 655
Lifetime: 3600 secs
CS118/Spring05
DHCP Relay & IP Tunneling
Either have a DHCP server on each network, or a
DHCP-relay to forward request to server
Unicast to server
host
Broadcast
DHCP
relay
A1
Other
networks
DHCP
server
A2
broadcast
Source=A1
Dest. =A2
broadcast
IP tunnel:
Put another IP header in
front of the original packets,
with Source=tunnel entry, Destination=tunnel exit
5/23/05
56
CS118/Spring05