ComputerNetworksOverview


In The Name of God, The Merciful, The Compassionate
Advanced Computer Networks
Department of Computer Engineering
Sharif University of Technology – Kish Campus
Fall 2007 – CE 693
Dr. Hamid R. Rabiee
Background Information
Overview of Computer Networks
1
Introduction
Basic concepts
 Terminology

2
Ubiquitous Computing
Computers everywhere.
 Also means ubiquitous communication

– Users connected anywhere/anytime.
– PC (laptop, palmtop) equivalent to cell phone.

Networking computers together is critical!
3
Computer Network
Provide access to local and remote resources.
 Collection of interconnected end systems:

– Computing devices (mainframes, workstations,
PCs, palm tops)
– Peripherals (printers, scanners, terminals).

Applications: location transparency.
4
Computer Networks (cont’d)

Components:
– End systems (or hosts),
– Routers/switches/bridges, and
– Links (twisted pair, coaxial cable, fiber, radio,
etc.).
5
Communication Model
[Figure: source and destination connected through a network]
6
Example
[Figure: source system (source + modem) -> PTN -> destination system (modem + destination)]
PTN: Public Telephone Network
7
Connecting End Systems
Dedicated link
Multiple access / shared medium
8
Connecting End Systems (cont’d)
Router
Switched network
Router: switching element; a.k.a., IMPs (Interface
Message Processors) in ARPAnet’s terminology.
9
Shared Communication
Infrastructure

Shared medium:
– Examples: ethernet, radio.
– How to acquire channel: medium access control
protocols.

Switched networks:
– Shared infrastructure consisting of point-to-point links.
– Circuit- versus packet-switching.
10
Circuit Switching




Establish dedicated path (circuit) between source and
destination.
Example: telephone network.
+’s: dedicated resources (stream-oriented).
-’s: lower resource utilization (e.g., with bursty traffic).
11
Packet Switching
[Figure: sources S1 and S2 sending packets through a shared network to destinations D1 and D2]
Data split into transmission units, or packets.
Routers store packets briefly and then forward
them: store-and-forward.
Efficient resource use: statistical multiplexing.
Ability to accommodate bursts.
12
(Switched) Network Topologies
Star
Ring
Tree
Irregular
13
Protocol

Set of rules that allow peering entities to
communicate.
– Example: 2 friends talking on the phone.
– Peering entities or peers: user application
programs, file transfer services, e-mail services,
etc.
14
Network Architecture
Protocol layers: reduce design complexity.
 Main idea: each layer uses the services from
lower layer and provide services to upper
layer.

– Higher layer shielded from the implementation
details of lower layers.
– Interface between layers must be clearly
defined: services provided to upper layer.
15
Example 1: ISO OSI Model
ISO: International Organization for Standardization
 OSI: Open Systems Interconnection.

Application
Presentation
Session
Transport
Network
Data link
Physical
16
OSI ISO 7-Layer Model
Physical layer: transmission of bits.
 Data link layer: reliable transmission over
physical medium; synchronization, error
control, flow control; media access in
shared medium.
 Network layer: routing and forwarding;
congestion control; internetworking.

17
OSI ISO 7-Layer Model (cont’d)
Transport layer: error, flow, and congestion
control end-to-end.
 Session layer: manages connections
(sessions) between end points.
 Presentation layer: data representation.
 Application layer: provides users with
access to the underlying communication
infrastructure.

18
Example 2: TCP/IP Model

Model employed by the Internet.
[Figure: TCP/IP layers shown side by side with the ISO OSI layers]
TCP/IP: Application, Transport, Internet, Network Access, Physical
ISO OSI: Application, Presentation, Session, Transport, Network, Data link, Physical
19
TCP/IP Protocol Suite:
Physical layer: same as OSI ISO model.
 Network access layer: medium access and
routing over single network.
 Internet layer: routing across multiple
networks, or, an internet.
 Transport layer: end-to-end error,
congestion, flow control functions.
 Application layer: same as OSI ISO model.

20
The Internet: Some History


Late 1970’s/ early 1980’s: the ARPANET (funded by
ARPA).
– Connecting university, research labs and some government
agencies.
– Main applications: e-mail and file transfer.
Features:
– Decentralized, non-regulated system.
– No centralized authority.
– No structure.
– Network of networks.
21
The Internet (cont’d)
Early 1990’s, the Web caused the Internet
revolution: the Internet’s killer app!
 Today:

– Almost 60 million hosts as of 01.99.
– Doubles every year.
22
Topics for Further Reading

Some Internet governing entities:
– IAB
– IETF
– IRTF
The Internet’s standardization process.
 Other network standardization bodies.
 Other networks (Bitnet, SNA, etc).

23
Physical Layer
Sending raw bits across “the wire”.
 Issues:

– What’s being transmitted.
– Transmission medium.
24
Basic Concepts
Signal: electro-magnetic wave carrying
information.
 Time domain: signal as a function of time.

– Analog signal: signal’s amplitude varies
continuously over time, i.e., no discontinuities.
– Digital signal: data represented by sequence of
0’s and 1’s (e.g., square wave).
25
Time Domain

Periodic signals:
– Same signal pattern repeats over time.
– Example: sine wave
» Amplitude (A)
» Period (T) or frequency (f), with T = 1/f
» Phase ($\phi$)
$s(t) = A \sin(2\pi f t + \phi)$
$s(t + T) = s(t)$
26
Frequency Domain
Signal consists of components of different
frequencies.
 Spectrum of signal: range of frequencies
signal contains.
 Absolute bandwidth: width of signal’s
spectrum.

27
Example:
$s(t) = \sin(2\pi f_1 t) + \frac{1}{3}\sin(2\pi (3 f_1) t)$
[Figure: spectrum S(f) plotted against f, with components at f1 and 3f1]
Spectrum S(f) extends from f1 to 3f1.
Bandwidth is 2f1.
28
Bandwidth and Data Rate

Data rate: rate at which data is transmitted;
unit is bits/sec or bps (applies to digital
signal).
– Example: 2Mbits/sec, or 2Mbps.
Digital signal has infinite frequency
components, thus infinite bandwidth.
 If data rate of signal is W bps, good
representation achieved with 2W Hz
bandwidth.

29
Baud versus Data Rate
Baud rate: number of times per second
signal changes its value (voltage).
 Each value might “carry” more than 1 bit.

– Example: 8 values of voltage (0..7); each value
conveys 3 bits, i.e., number of bits = $\log_2 V$, where V is the number of values.
Thus, bit rate = $\log_2 V$ * baud rate.
 For 2 levels, bit rate = baud rate.

30
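The relation between baud rate and bit rate on the slide above can be checked with a few lines of Python. This is only a sketch: the function name `bit_rate` and the 2400-baud figure are assumptions chosen for illustration and do not come from the slides.

```python
import math

def bit_rate(baud_rate: float, levels: int) -> float:
    """Bit rate in bps for a signal that can take `levels` distinct values."""
    if levels < 2:
        raise ValueError("need at least 2 signal levels")
    # Each signal change carries log2(levels) bits.
    return baud_rate * math.log2(levels)

# Hypothetical 2400-baud line.
print(bit_rate(2400, 8))  # 8 levels: 3 bits per signal change
print(bit_rate(2400, 2))  # 2 levels: bit rate equals baud rate
```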
Data Transmission 1

Analog and digital transmission.
– Example of analog data: voice and video.
– Example of digital data: character strings
» Use of codes to represent characters as sequence of bits
(e.g., ASCII).

Historically, communication infrastructure for
analog transmission.
– Digital data needed to be converted: modems
(modulator-demodulator).
31
Digital Transmission

Current trend: digital transmission.
– Cost efficient: advances in digital circuitry
(VLSI).

Advantages:
– Data integrity: better noise immunity.
– Security: easier to integrate encryption
algorithms.
– Channel utilization: higher degree of
multiplexing (time-division mux’ing).
32
Transmission Impairments

Cause received signal to differ from
original, transmitted signal.
– Analog data: quality degradation
– Digital data: bit errors.

Types of impairments:
– Attenuation.
– Delay distortion.
– Noise.
33
Attenuation 1
Weakening of the signal’s power as it
propagates through medium.
 Function of medium type

– Guided medium: logarithmic with distance.
– Unguided medium: more complex (function of
distance and atmospheric conditions).
34
Attenuation 2

Problems and solutions:
– Insufficient signal strength for receiver to
interpret it: use amplifiers/repeaters to
boost/regenerate signal.
– Error due to noise interference (level is not high
enough to be distinguished from noise): use
amplifiers/repeaters.
– Attenuation increases with frequency: special
amplifiers to amplify high-frequencies.
35
Delay Distortion

Speed of propagation in guided media
varies with frequency.
– Different frequency components arrive at
receiver at different times.

Solution: equalization techniques to
equalize distortion for different frequencies.
36
Noise
Noise: undesired signals inserted anywhere
in the source/destination path.
 Different categories: thermal (white),
crosstalk, impulse, etc.

37
Decibel and Signal-to-Noise
Ratio

Decibel (dB): measures relative strength of
2 signals.
– Example: S1 and S2 with powers P1 and P2.
NdB = 10 log10 (P1/P2)

Signal-to-noise ratio (S/N):
– Measures signal quality.
– S/NdB = 10 log10 (signal power/noise power)
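A small sketch of the decibel formulas above, assuming hypothetical helper names `to_db` and `from_db`. It converts a power ratio to dB and back, which is handy because signal-to-noise ratios are often quoted in dB while capacity formulas expect a plain ratio.

```python
import math

def to_db(p1: float, p2: float) -> float:
    """Relative strength of two powers in decibels: 10 log10(P1/P2)."""
    return 10 * math.log10(p1 / p2)

def from_db(db: float) -> float:
    """Inverse conversion: decibels back to a plain power ratio."""
    return 10 ** (db / 10)

print(to_db(1000, 1))   # a signal 1000 times stronger than the noise
print(from_db(30))      # the ratio that corresponds to 30 dB
```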
38
Channel Capacity 1
Rate at which data can be transmitted over
communication channel.
 Noise-free channel: Nyquist Theorem

– Limitation of data rate is signal’s bandwidth.
– Given channel bandwidth W, highest signal rate
(or baud rate) is 2W.
– From receiver’s point of view: sampling at rate
2W can reconstruct signal.
39
Channel Capacity 2

Using data rate,
– $C = 2W \log_2 V$, where V is the number of voltage levels.
Same bandwidth, increasing number of signal
levels, increases data rate, but more complex
signal recognition at receiver and more noise-prone.
 This is a theoretical upper bound, since
channels are noisy.
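The Nyquist bound for a noise-free channel can be evaluated directly. The sketch below assumes a hypothetical function name `nyquist_capacity` and a 3100 Hz channel (roughly the 300 to 3400 Hz voice band) purely as an example.

```python
import math

def nyquist_capacity(bandwidth_hz: float, levels: int) -> float:
    """Upper bound C = 2 W log2(V) in bps for a noise-free channel."""
    return 2 * bandwidth_hz * math.log2(levels)

# Same bandwidth, more signal levels: higher data rate.
for v in (2, 4, 8):
    print(v, nyquist_capacity(3100, v))
```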

40
Channel Capacity 3

Noisy channel: Shannon’s Theorem
– Given channel with W (Hz) bandwidth and S/N
signal-to-noise ratio (as a plain power ratio, not in dB), C (bps) is
» $C = W \log_2 (1 + S/N)$
– Theoretical upper bound since assumes only
thermal noise (no impulse noise, etc).
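Shannon's formula takes S/N as a plain power ratio, so a value quoted in dB has to be converted first. The sketch below does that conversion explicitly; the name `shannon_capacity` and the 3100 Hz / 30 dB figures are illustrative assumptions.

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """C = W log2(1 + S/N) in bps, with S/N supplied in dB."""
    snr_linear = 10 ** (snr_db / 10)   # dB -> plain power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

print(round(shannon_capacity(3100, 30)))  # voice-grade line at 30 dB
```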
41
Transmission Media
Physically connect transmitter and receiver
carrying signals in the form of electromagnetic
waves.
 Types of media:

– Guided: waves guided along solid medium such as
copper twisted pair, coaxial cable, optical fiber.
– Unguided: “wireless” transmission (atmosphere,
outer space).
42
Guided Media: Examples 1

Twisted Pair:
– 2 insulated copper wires arranged in regular spiral.
Typically, several of these pairs are bundled into a
cable.
– Cheapest and most widely used; limited in
distance, bandwidth, and data rate.
– Applications: telephone system (home-local
exchange connection).
– Unshielded and shielded twisted pair.
43
Examples 2

Coaxial Cable
– Hollow outer cylinder conductor surrounding
inner wire conductor; dielectric (non-conducting)
material in the middle.
– Applications: cable TV, long-distance telephone
system, LANs.
– +’s: Higher data rates and frequencies, better
interference and crosstalk immunity.
– -’s: Attenuation and thermal noise.
44
Examples 3

Optical Fiber
– Thin, flexible cable that conducts optical
waves.
– Applications: long-distance
telecommunications, LANs.
– +’s: greater capacity, smaller and lighter, lower
attenuation, better isolation,
45
Unguided, Wireless Media
Microwave: directional, LOS transmission.
 Satellite: directional, LOS, large delay, high
bandwidth.
 Radio: omnidirectional (broadcast), single hop
(cellular), multi-hop (ad hoc net’s).
 Infrared: directional, LOS transmission,
cannot penetrate obstacles and used outdoors.

46
Data Encoding
Transforming original signal just before
transmission.
 Both analog and digital data can be encoded
into either analog or digital signals.

47
Digital/Analog Encoding
Encoding: source g(t) -> (D/A) encoder -> digital medium -> decoder -> g(t) at destination.
Modulation: source g(t) -> (D/A) modulator -> analog medium -> demodulator -> g(t) at destination.
48
Encoding Considerations
Digital signaling can use modern digital
transmission infrastructure.
 Some media like fiber and unguided media
only carry analog signals.
 Analog-to-analog conversion used to shift
signal to use another portion of spectrum for
better channel utilization (frequency
division mux’ing).

49
Digital Transmission
Terminology
Data element: bit.
 Signaling element: encoding of data
element for transmission.
 Unipolar signaling: signaling elements have
same polarization (all + or all -).
 Polar signaling: different polarization for
different elements.

50
More Terminology
Data rate: rate in bps at which data is
transmitted; for data rate of R, bit duration
(time to emit 1 bit) is 1/R sec.
 Modulation rate = baud rate (rate at which
signal levels change).

51
Digital Transmission: Receiver-Side Issues

Clocking: determining the beginning and
end of each bit.
– Transmitting long sequences of 0’s or 1’s can
cause synchronization problems.

Signal level: determining whether the signal
represents the high (logic 1) or low (logic 0)
levels.
– S/N ratio is a factor.
52
Comparing Digital Encoding
Techniques
Signal spectrum: high frequency means
high bandwidth required for transmission.
 Clocking: transmitted signal should be self-clocking.
 Error detection: built in the encoding
scheme.
 Noise immunity: low bit error rate.

53
Digital-to-Digital Encoding
Techniques
Nonreturn to Zero (NRZ)
 Multilevel Binary
 Biphase
 Scrambling

54
NRZ Techniques
Use of 2 different voltage levels.
 NRZ-L: positive voltage represents one binary
value; negative voltage, the other.
 NRZI (Nonreturn to zero, invert on ones):
transition (low-to-high or high-to-low)
represents “1”; no transition, “0”.
 NRZI is an example of differential encoding:
decoding based on comparing polarity of
adjacent signal elements.
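The two NRZ schemes can be sketched as follows. Signal levels are modelled as +1 and -1; the NRZ-L mapping (1 to +1, 0 to -1) and the starting level for NRZI are assumptions made for the example, since the slides only say that the two binary values use opposite voltages.

```python
def nrz_l(bits):
    """NRZ-L: one fixed level per bit value (assumed: 1 -> +1, 0 -> -1)."""
    return [+1 if b else -1 for b in bits]

def nrzi(bits, start_level=-1):
    """NRZI: a 1 inverts the current level, a 0 leaves it unchanged."""
    level = start_level
    out = []
    for b in bits:
        if b:
            level = -level
        out.append(level)
    return out

def nrzi_decode(levels, start_level=-1):
    """Differential decoding: compare each level with the previous one."""
    prev = start_level
    bits = []
    for level in levels:
        bits.append(1 if level != prev else 0)
        prev = level
    return bits

data = [0, 1, 0, 0, 1, 1]
line = nrzi(data)
print(line)
print(nrzi_decode(line) == data)
```

Because NRZI decoding only compares adjacent signal elements, inverting the polarity of the whole line (and of the assumed starting level) yields the same bits, which is the practical benefit of differential encoding.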

55
Multilevel Binary
Use more than 2 signal levels.
 Bipolar-AMI: “0”: no signal; “1”: positive or
negative pulse; consecutive “1”s alternate in
polarity: avoid synchronization loss.
 Pseudoternary: opposite representation.
 Long sequence of 0’s or 1’s still a problem for
bipolar-AMI and pseudoternary respectively.

56
Biphase

Manchester: transition in the middle of bit period.
– Carries data and provides clocking.
– Low-to-high: “1”.
– High-to-low: “0”.

Differential Manchester:
– Mid-bit transition only provides clocking.
– “0”: transition in the beginning of bit interval.
– “1”: no transition.
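A minimal sketch of both biphase codes, following the conventions listed above. Each bit is represented by two half-bit levels (+1 or -1); the function names and the `start_level` parameter (the line level just before the first bit) are assumptions for illustration.

```python
def manchester(bits):
    """Manchester: 1 -> low-to-high, 0 -> high-to-low, in the middle of the bit."""
    out = []
    for b in bits:
        out.extend((-1, +1) if b else (+1, -1))
    return out

def differential_manchester(bits, start_level=-1):
    """Differential Manchester: always a mid-bit transition (clocking);
    a 0 adds a transition at the start of the bit, a 1 does not."""
    level = start_level
    out = []
    for b in bits:
        if b == 0:
            level = -level   # transition at the beginning of the interval
        out.append(level)
        level = -level       # mid-bit transition, present for every bit
        out.append(level)
    return out

print(manchester([1, 0]))
print(differential_manchester([0, 1]))
```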
57
Scrambling


Avoid long sequences of 0’s or 1’s.
Bipolar with 8-zeros substitution (B8ZS)
– Inserts transitions when transmitting 8 consecutive “0”s.

High-density bipolar-3 zeros (HDB3)
– Inserts pulses when transmitting 4 consecutive “0”s.

Receiver must recognize insertions and re-generate
original signal.
58
Digital-to-Analog Encoding
Transmission of digital data using analog
signaling.
 Example: data transmission of a PTN.
 PTN: voice signals ranging from 300Hz to
3400 Hz.
 Modems: convert digital data to analog
signals and back.
 Techniques: ASK, FSK, and PSK.

59
Amplitude-Shift Keying
2 binary values represented by 2
amplitudes.
 Typically, “0” represented by absence of
carrier and “1” by presence of carrier.
 Prone to errors caused by amplitude
changes.

60
Frequency-Shift Keying

2 binary values represented by 2
frequencies.
$s(t) = A \cos(2\pi f_1 t)$, "1"
$s(t) = A \cos(2\pi f_2 t)$, "0"
 Frequencies f1 and f2 are offset from carrier
frequency by same amount in opposite
directions.
 Less error prone than ASK.
61
Phase-Shift Keying
Phase of carrier is shifted to represent data.
 Example: 2-phase system.

$s(t) = A \cos(2\pi f_c t + \pi)$, "1"
$s(t) = A \cos(2\pi f_c t)$, "0"

Phase shifts in multiples of 90° can represent more bits per signal element (2 bits with 4 phases):
a.k.a. quadrature PSK.
62
Analog-to-Digital Encoding
Analog data transmitted as digital signal, or
digitization.
 Codec: device used to encode and decode
analog data into digital signal, and back.
 2 main techniques:

– Pulse code modulation (PCM).
– Delta modulation (DM).
63
Pulse Code Modulation 1
Based on Nyquist (or sampling) theorem: if
f(t) sampled at rate > 2*signal’s highest
frequency, then samples contain all the
original signal’s information.
 Example: if voice data is limited to 4000Hz,
8000 samples/sec are sufficient to
reconstruct original signal.

64
PCM 2

Analog signal -> PAM -> PCM.
– PAM: pulse amplitude modulation; samples of
original analog signal.
– PCM: quantization of PAM pulses; amplitude
of PAM pulses approximated by n-bit integer;
each pulse carries n bits.
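The PAM-to-PCM step can be sketched as uniform quantization of samples in the range [-1, 1]. This is an assumption made for simplicity (practical voice codecs use non-uniform quantization), and the names `pcm_encode` and `pcm_decode` are hypothetical.

```python
import math

def pcm_encode(pam_samples, n_bits):
    """Quantize PAM samples in [-1, 1] to n-bit integer codes."""
    levels = 2 ** n_bits
    codes = []
    for pam in pam_samples:
        code = int((pam + 1) / 2 * (levels - 1) + 0.5)  # round to nearest level
        codes.append(min(max(code, 0), levels - 1))     # clamp out-of-range input
    return codes

def pcm_decode(codes, n_bits):
    """Map n-bit codes back to amplitudes in [-1, 1]."""
    levels = 2 ** n_bits
    return [2 * c / (levels - 1) - 1 for c in codes]

# 1 kHz tone sampled at 8000 samples/sec, 8 bits per sample.
samples = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(8)]
codes = pcm_encode(samples, 8)
decoded = pcm_decode(codes, 8)
print(codes)
```

With 8000 samples/sec and 8 bits per sample, the resulting stream runs at 8000 * 8 = 64,000 bps.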
65
Delta Modulation (DM)
Analog signal approximated by staircase
function moving up or down by 1
quantization level every sampling interval.
 Bit stream produced based on derivative of
analog signal (and not its amplitude): “1” if
staircase goes up, “0” otherwise.
 Parameters: sampling rate and step size.

66
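Delta modulation as described on the slide above can be sketched in a few lines. The staircase starts at 0 and the function names are assumptions for the example; the two parameters that matter are the sampling rate (implicit in how `samples` was taken) and the step size.

```python
def delta_modulate(samples, step):
    """One bit per sample: 1 if the staircase steps up, 0 if it steps down."""
    approx = 0.0          # current staircase value (assumed to start at 0)
    bits = []
    for x in samples:
        if x > approx:
            bits.append(1)
            approx += step
        else:
            bits.append(0)
            approx -= step
    return bits

def delta_demodulate(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    approx = 0.0
    out = []
    for b in bits:
        approx += step if b else -step
        out.append(approx)
    return out

bits = delta_modulate([0.1, 0.3, 0.6, 0.4, 0.2], step=0.25)
print(bits)
print(delta_demodulate(bits, step=0.25))
```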
Analog-to-Analog Encoding
Combines input signal m(t) and carrier at fc
producing s(t) centered at fc.
 Why modulate analog data?

– Shift signal’s frequency for effective transmission.
– Allows channel multiplexing: frequency-division
multiplexing.

Modulation techniques: AM, FM, and PM.
67
Amplitude Modulation (AM)

Carrier serves as envelope to signal being
modulated.
$s_{AM}(t) = [1 + m(t)] \cos(2\pi f_c t)$
Signal m(t) is being modulated by carrier
$\cos(2\pi f_c t)$.
 Modulation index: ratio between amplitude
of input signal to carrier.

68
Angle Modulation
FM and PM are special cases of angle
modulation.
 FM: carrier’s amplitude kept constant while
its frequency is varied according to message
signal.
 PM: carrier’s phase varies linearly with
modulating signal m(t).

69
Spread Spectrum 1
Used to transmit analog or digital data using
analog signaling.
 Spread information signal over wider
spectrum to make jamming and
eavesdropping more difficult.
 Popular in wireless communications

70
Spread Spectrum 2

2 schemes:
– Frequency hopping: signal broadcast over
random sequence of frequencies, hopping from
one frequency to the next rapidly; receiver must
do the same.
– Direct Sequence: each bit in original signal
represented by series of bits in the transmitted
signal.
71
Transmission Modes
Assuming serial transmission, ie, one
signaling element sent at a time.
 Also assuming that 1 signaling element
represents 1 bit.
 Source and receiver must be in sync.
 2 schemes:

– asynchronous and
– synchronous transmission.
72
Asynchronous Xmission 1
Avoid synchronization problem by
including sync information explicitly.
 Character consists of a fixed number of bits,
depending on the code used.
 Synchronization happens for every
character: start (“0”) and stop (“1”) bits.
 Line is idle: transmits “1”.

73
Asynchronous Xmission 2

Example: sending “ABC” in ASCII
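0 10000010 1 0 01000010 1 0 11000011 1 1111…
(Each group: start bit “0”, 7 data bits sent least-significant bit first plus an even parity bit, stop bit “1”.)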
Timing requirements are not strict.
 But problems may occur.

– Significant clock drifts + high data rate =
reception errors.

Also, 2 or more bits for synchronization:
overhead!
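The framing in the "ABC" example can be reproduced with a short sketch. It assumes 7-bit ASCII sent least-significant bit first, an even parity bit, one start bit and one stop bit, which is the layout that matches the bit patterns shown for "A" and "B"; the function names are hypothetical.

```python
def frame_char(ch, data_bits=7):
    """Start bit 0, data bits LSB first, even parity bit, stop bit 1."""
    code = ord(ch)
    data = [(code >> i) & 1 for i in range(data_bits)]  # LSB first
    parity = sum(data) % 2                               # makes the count of 1s even
    return [0] + data + [parity] + [1]

def frame_text(text):
    """Frame every character and join the bit groups with spaces."""
    return " ".join("".join(str(b) for b in frame_char(c)) for c in text)

print(frame_text("ABC"))
```

In this layout only 7 of every 10 transmitted bits carry data, which illustrates the overhead noted above.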
74
Synchronous Xmission 1
No start or stop bits.
 Synchronization via:

– Separate clock signal provided by transmitter or
receiver; doesn’t work well over long distances.
– Embed clocking information in data signal
using appropriate encoding technique such as
Manchester or Differential Manchester.
75
Synchronous Xmission 2
Need to identify start/end of data block.
 Block starts with preamble (8-bit flag) and
may end with postamble.
 Other control information may be added for
data link layer.

[Frame layout: 8-bit flag | Control | Data | Control | 8-bit flag]
76
Data Link Layer
So far, sending signals over transmission
medium.
 Data link layer: responsible for error-free
(reliable) communication between adjacent
nodes.
 Functions: framing, error control, flow
control, addressing (in multipoint medium).

77
Flow Control

What is it?
– Ensures that transmitter does not overrun
receiver: limited receiver buffer space.
– Receiver buffers data to process before passing
it up.
– If no flow control, receiver buffers may fill up
and data may get dropped.
78
Stop-and-Wait

Simplest form of flow control.
– Transmitter sends frame and waits.
– Receiver receives frame and sends ACK.
– Transmitter gets ACK, sends other frame, and waits, until
no more frames to send.


Good when few frames.
Problem: inefficient link utilization.
– In the case of high data rates or long propagation delays.
79
Sliding Window 1
Allows multiple frames to be in transit at
the same time.
 Receiver allocates buffer space for n
frames.
 Transmitter is allowed to send n (window
size) frames without receiving ACK.
 Frame sequence number: labels frames.

80
Sliding Window 2
Receiver ack’s frame by including sequence
number of next expected frame.
 Cumulative ACK: ack’s multiple frames.
 Example: if receiver receives frames 2,3,
and 4, it sends an ACK with sequence
number 5, which ack’s receipt of 2, 3, and
4.

81
Sliding Window 3
Sender maintains sequence numbers it’s
allowed to send; receiver maintains
sequence number it can receive. These lists
are sender and receiver windows.
 Sequence numbers are bounded; if frame
reserves k-bit field for sequence numbers,
then they can range from 0 … $2^k - 1$ and are
modulo $2^k$.

82
Sliding Window 4

Transmission window shrinks each time
frame is sent, and grows each time an ACK
is received.
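The sender's side of the sliding window can be sketched as a small class. The class and method names are assumptions; an acknowledgment is modelled as the sequence number n of the next expected frame, so it cumulatively acknowledges everything up to n-1.

```python
class SenderWindow:
    """Sender window with k-bit sequence numbers (a sketch, not a full protocol)."""

    def __init__(self, seq_bits, window_size):
        self.modulus = 2 ** seq_bits
        if not 0 < window_size < self.modulus:
            raise ValueError("window size must be between 1 and 2**k - 1")
        self.size = window_size
        self.base = 0          # oldest unacknowledged sequence number
        self.next_seq = 0      # sequence number of the next frame to send
        self.outstanding = 0   # frames sent but not yet acknowledged

    def can_send(self):
        return self.outstanding < self.size

    def send(self):
        """Window shrinks by one each time a frame is sent."""
        if not self.can_send():
            raise RuntimeError("window closed")
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % self.modulus
        self.outstanding += 1
        return seq

    def receive_ack(self, n):
        """Window grows again: ACK n covers all frames up to n-1."""
        acked = (n - self.base) % self.modulus
        if acked > self.outstanding:
            raise ValueError("ACK for frames that were never sent")
        self.base = n
        self.outstanding -= acked

w = SenderWindow(seq_bits=3, window_size=7)
print([w.send() for _ in range(3)])   # frames 0, 1, 2
w.receive_ack(3)                      # acknowledges 0..2
print([w.send() for _ in range(4)])   # frames 3, 4, 5, 6
w.receive_ack(4)                      # acknowledges frame 3 only
```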
83
Example: 3-bit sequence number
and window size 7
[Figure: frames exchanged between stations A and B, with each side's window over sequence numbers 0 to 7 redrawn after every step. A sends frames 0, 1, 2; B replies RR3; A sends frames 3, 4, 5, 6; B replies RR4.]
84
Sliding Window (cont’d)
RR n acknowledges up to frame n-1.
 There is also RNR n, which ack’s up to
frame n-1 but no longer accepts more
frames.
 RNR shuts down the receive window and
consequently the transmission window.
 Need subsequent RR to re-open window.

85
Piggybacking
When both endpoints transmit, each keeps 2
windows, transmitter and receiver windows.
 Each send data and need to send ACKs.
 When sending data, transmitter can
“piggyback” the acknowledgment
information.
 When no data, send just the ACK.

86
Duplicate ACKs
When no data, must re-send last ACK.
 Duplicate ACKs: report potential errors.

87
Error Detection
Transmission impairments lead to
transmission errors: change of 1 or more
bits in transmitted frame.
 Transmission errors defined using
probabilities: transmission medium modeled
as a statistical system.

88
Error Probabilities 1

Definitions:
– Pb probability of single bit error (bit error rate);
constant and independent for each bit.
– P1 probability frame received with no errors.
– P2 probability frame received with 1 or more
undetected errors.
– P3 probability frame received with 1 or more
detected bit errors, but no undetected ones.
89
Error Probabilities 2
If no error detection mechanism, P3 = 0.
 $P_1 = (1 - P_b)^F$ and $P_2 = 1 - P_1$, where F is
size of frame in bits.
 P1 decreases as Pb increases.
 P1 decreases as F increases.

90
Example

64-kbps ISDN channel’s bit error rate is less
than $10^{-6}$. User requirement of at most 1 frame
with undetected bit error per day. Frame is
1000 bits.
– In a day, $5.529 \times 10^6$ frames transmitted.
– Required frame error rate of $1/(5.529 \times 10^6)$, or $P_2
= 0.18 \times 10^{-6}$.
– But $P_b = 10^{-6}$, so $P_1 = (1 - P_b)^F = 0.999$ and $P_2 = 1 - P_1 = 10^{-3}$, which is >>> required $P_2$.
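The arithmetic in this example can be reproduced with a short sketch. The variable and function names are assumptions; the numbers (64 kbps, 1000-bit frames, bit error rate of 10^-6) are the ones given above.

```python
def frame_probabilities(bit_error_rate, frame_bits):
    """Return (P1, P2) when no error detection is used:
    P1 = probability of an error-free frame, P2 = 1 - P1."""
    p1 = (1 - bit_error_rate) ** frame_bits
    return p1, 1 - p1

frames_per_day = 64_000 * 24 * 3600 / 1000   # 64 kbps, 1000-bit frames
required_p2 = 1 / frames_per_day              # at most 1 bad frame per day
p1, p2 = frame_probabilities(1e-6, 1000)

print(frames_per_day)
print(required_p2)
print(p1, p2)
```

The achieved P2 is several thousand times larger than the required value, which is why an error detection scheme is needed.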
91
Error Detection Schemes




Transmitter adds additional bits for error detection.
Transmitter computes error detection bits as function
of original data.
Receiver performs same calculation and compares
results. If mismatch, then error.
P3 probability error detection scheme detects error; P2
residual error rate or probability error goes
undetected.
92
Parity
Simplest error detection scheme.
 Append parity bit to data block.
 Example: ASCII transmission

– 1 parity bit appended to each 7-bit ASCII
character.
– Even parity: 8-bit code has even number of 1’s.
– Odd parity: 8-bit code has odd number of 1’s.
93
Parity Check

Example: transmitting ASCII “G” (1110001)
using odd parity.
– Code transmitted is 11100011.
– Receiver checks received code and if odd number
of 1’s, assumes no error.
– Suppose it receives 11000011, then detects error.
– NOTE: If an even number of bits (2, 4, ...) are in error, the error is not
detected.
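The parity example can be checked with the sketch below. Bit strings are handled as Python strings, and the helper names `add_parity` and `parity_ok` are assumptions for illustration.

```python
def add_parity(bits, odd=False):
    """Append a parity bit so the total number of 1s is even (or odd)."""
    parity = bits.count("1") % 2   # bit that makes the total even
    if odd:
        parity ^= 1                # flip it for odd parity
    return bits + str(parity)

def parity_ok(code, odd=False):
    """Receiver check: does the count of 1s have the expected parity?"""
    return code.count("1") % 2 == (1 if odd else 0)

sent = add_parity("1110001", odd=True)
print(sent)                               # code word for "G"
print(parity_ok(sent, odd=True))          # no error
print(parity_ok("11000011", odd=True))    # one bit flipped: detected
print(parity_ok("10000011", odd=True))    # two bits flipped: slips through
```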
94
Cyclic Redundancy Check
CRC is one of the most effective and common
error detecting schemes.
 Let M be m-bit message, G (r+1)-bit pattern.

– Transmitter appends r 0’s to M, giving $2^r M$.
– Divide $2^r M$ by G (modulo-2 division) and add remainder to $2^r M$
forming T (m+r bits), which is transmitted.
– Receiver computes T/G; if remainder is nonzero, then error.
95
CRC Example
Frame M 1010001101 = $x^9 + x^7 + x^3 + x^2 + 1$.
 Pattern G 110101 = $x^5 + x^4 + x^2 + 1$.
 Dividing (frame * $2^5$) by pattern results in remainder
01110.
 Thus T 101000110101110.
 Receiver can detect errors unless received
message Tr is divisible by G.
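The CRC example can be verified with a sketch of modulo-2 (XOR) long division on bit strings. The function names are assumptions; the message and pattern are the ones from the example.

```python
def mod2_remainder(dividend, pattern):
    """Remainder of modulo-2 long division of two bit strings."""
    bits = [int(b) for b in dividend]
    g = [int(b) for b in pattern]
    r = len(g) - 1
    for i in range(len(bits) - r):
        if bits[i]:                      # leading 1: XOR the pattern in at this position
            for j, gb in enumerate(g):
                bits[i + j] ^= gb
    return "".join(str(b) for b in bits[-r:])

def crc_encode(message, pattern):
    """Append r zeros, divide by the pattern, attach the remainder."""
    r = len(pattern) - 1
    return message + mod2_remainder(message + "0" * r, pattern)

def crc_check(received, pattern):
    """True if the received frame is divisible by the pattern."""
    return set(mod2_remainder(received, pattern)) == {"0"}

M, G = "1010001101", "110101"
T = crc_encode(M, G)
print(mod2_remainder(M + "00000", G))
print(T)
print(crc_check(T, G))
```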

96
CRC
Patterns are expressed as polynomials G(x).
 Example:

– CRC-16 = $x^{16} + x^{15} + x^2 + 1$
– CRC-CCITT = $x^{16} + x^{12} + x^5 + 1$
97
CRC-Based Detection

If suitably selected polynomials, CRC can
detect:
– All single-bit errors.
– All double-bit errors, as long as P(X) has at least
three 1’s.
– Any odd number of errors as long as P(X)
contains factor (X+1).
– Any burst error whose length is <= sizeof(FCS).
98
Error Control


Mechanisms to detect and correct transmission
errors.
Consider 2 types of errors:
– Lost frame: frame is sent but never arrives.
– Damaged frame: frame arrives but in error.


Error control: combination of error detection,
feedback (ACK or NACK) from receiver, and
retransmission by source.
Coupled with flow control feedback.
99
ARQ
ARQ: automatic repeat request.
 Works by creating a reliable data link from
an unreliable one.
 3 versions:

– Stop-and-wait ARQ.
– Go-back-N ARQ.
– Selective-reject ARQ.
100
Stop-and-Wait ARQ
Single outstanding frame at any time.
 Simple but inefficient.
 Use of timers to trigger retransmission of data
or ACKs.
 2 types of errors:

– Damaged or lost frame.
– Damaged or lost ACK.

Sequence numbers alternate between 0 and 1.
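The behaviour described above can be simulated with a short sketch. Losses are scripted rather than random: `lost_frames` and `lost_acks` name the transmission attempts (counted from 0) whose frame or ACK is lost, and every loss stands in for a sender timeout. All names are assumptions made for illustration.

```python
def stop_and_wait(frames, lost_frames=(), lost_acks=()):
    """Simulate Stop-and-Wait ARQ with alternating 0/1 sequence numbers.

    Returns (payloads delivered to the receiver's upper layer,
             total number of frame transmissions).
    """
    delivered = []
    expected = 0      # sequence number the receiver expects next
    seq = 0           # sequence number of the frame being sent
    attempt = 0
    for payload in frames:
        while True:
            this_attempt = attempt
            attempt += 1
            if this_attempt in lost_frames:
                continue              # frame lost: timeout, retransmit
            if seq == expected:       # new frame: accept and flip expectation
                delivered.append(payload)
                expected ^= 1
            # otherwise it is a duplicate: discard it, but still ACK
            if this_attempt in lost_acks:
                continue              # ACK lost: timeout, retransmit
            seq ^= 1                  # ACK received: move on to the next frame
            break
    return delivered, attempt

print(stop_and_wait(["a", "b", "c"], lost_frames={2}, lost_acks={3}))
```

In the scripted run, the third frame is lost once and its ACK is lost once, so it is sent three times, yet the receiver delivers it only once because the repeated sequence number marks the last copy as a duplicate.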
101
Stop-and-Wait ARQ: Example
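[Figure: timeline between sender and receiver. Frame 0 is answered with ACK 1 and frame 1 with ACK 0. The next frame 0 is lost, and the sender retransmits it after a timeout. The receiver's ACK 1 is then lost, so the sender times out and sends frame 0 again; the receiver (B) discards the duplicate and repeats ACK 1.]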
102
Go-Back-N ARQ
Variation of sliding window for error control.
 Allows a window’s worth of frames to be in
transit at any time.
 RR: ack’s receipt of frame.
 REJ: negative acknowledgment indicating the
frame in error.
 Destination discards frame in error plus
subsequent frames.

103
Go-Back-N ARQ Example
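[Figure: two Go-Back-N timelines between sender S and receiver R. First timeline: f0 to f4 are acknowledged with rr3 and rr4; f5 arrives in error, so R sends rej5 and discards f5, f6, f7, which S then retransmits. Second timeline: after a timeout, S sends rr with the P bit set to 1, R answers with rr2, and S continues with f2.]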
104
Go-Back-N ARQ Issues

For k-bit sequence number, maximum
window size is $2^k - 1$.
– If window size is too large, ACKs may be
ambiguous: not clear if ACK is a duplicate
ACK (errors occurred).
– Example: 3-bit sequence number and 8-frame
window.
» Source transmits f0, gets back rr1, then sends f1 through f7 followed by f0,
and gets back another rr1. Does it acknowledge all 8 frames, or is it a duplicate of the first rr1 because all 8 were lost?
105
Selective-Reject ARQ
Only frames transmitted are the ones that
are NACK’ed (SREJ) or that timeout.
 More efficient than Go-Back-N regarding
amount of reXmissions.
 But, receiver must buffer out-of-order
frames.
 More restriction on maximum window size;
for k-bit sequence #’s, maximum window size is $2^{k-1}$.

106
Example Data Link Layer
Protocol

High-Level Data Link Control (HDLC)
– Widely-used (ISO standard).
– Single frame format.
– Synchronous transmission.
107
HDLC: Frame Format
Field:       flag | address       | control | data     | FCS      | flag
Size (bits): 8    | 8, extendable | 8 or 16 | variable | 16 or 32 | 8
– Flag: frame delimiters (01111110).
– Address field for multipoint links.
– 16-bit or 32-bit CRC.
– Refer to book (pages 176-185) for more details.
108
Other DLL Protocols 1

LAPB: Link Access Procedure, Balanced.
– Part of the X.25 standard.
– Subset of HDLC.
– Link between user system and switch.
– Same frame format as HDLC.

LAPD: Link Access Procedure, D-Channel.
– Part of the ISDN standard.
109
Other DLL Protocols 2

LLC: Logical Link Control.
– Part of the 802 protocol family for LANs.
– Link control functions divided between the
MAC layer and the LLC layer.
– LLC layer operates on top of MAC layer.
[Frame layout: MAC control | Dst. MAC addr | Src. MAC addr | Dst. LLC addr | Src. LLC addr | LLC ctl. | Data | FCS]
110
Other DLL Protocols 3

SLIP: Serial Line IP
– Dial-up protocol.
– No error control.
– Not standardized.

PPP: Point-to-Point Protocol
– Internet standard for dial-up connections.
– Provides framing similar to HDLC.
111
Multiplexing
Sharing a link/channel among multiple
source-destination pairs.
 Example: high-capacity long-distance
trunks (fiber, microwave links) carry
multiple connections at the same time.

..
.
112
Multiplexing Techniques

3 basic types:
– Frequency-Division Multiplexing (FDM).
– Time-Division Multiplexing (TDM).
– Statistical Time-Division Multiplexing
(STDM).
113
FDM 1
High bandwidth medium when compared to
signals to be transmitted.
 Widely used (e.g., TV, radio).
 Various signals carried simultaneously
where each one modulated onto different
carrier frequency, or channel.
 Channels separated by guard bands
(unused) to prevent interference.

114
FDM 2
[Figure: channels 1, 2, ..., N stacked along the frequency axis, each occupying its own band for the whole time axis]
115
TDM 1
TDM or synchronous TDM.
 High data rate medium when compared to
signals to be transmitted.

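[Figure: channels 1, 2, ..., N placed one after another along the time axis, each using the full frequency range during its slot]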
116
TDM 2
Time divided into time slots.
 Frame consists of cycle of time slots.
 In each frame, 1 or more slots assigned to a
data source.

[Figure: along the time axis, each frame carries slots 1, 2, ..., N for sources U1, U2, ..., UN, and the cycle then repeats]
117
TDM 3
No control info at this level.
 Flow and error control?

– To be provided on a per-channel basis.
– Use DLL protocol such as HDLC.
Examples: SONET (Synchronous Optical
Network) for optical fiber.
 +’s: simple, fair.
 -’s: inefficient.

118
Statistical TDM 1
Or asynchronous TDM.
 Dynamically allocates time slots on demand.
 N input lines in statistical multiplexer, but
only k slots on TDM frame, where k < N.
 Multiplexer scans input lines collecting data
until frame is filled.
 Demultiplexer receives frame and distributes
data accordingly.

119
STDM 2
Data rate on mux’ed line < sum of data rates
from all input lines.
 Can support more devices than TDM using
same link.
 Problem: peak periods.

– Solution: multiplexers have some buffering
capacity to hold excess data.
– Tradeoff data rate and buffer size (response
time).
120
Local Area Networks 1

Interconnect devices over short distances.
– Within same floor,
– Building,
– Campus.

Characterized by low delays.
121
LANs 2

Typically use broadcast medium.
– Hosts share same communication medium.
– Also called multiple-access networks.

LANs are characterized by:
– Topology.
– Transmission medium.
– Medium access control mechanism.
122
LAN Protocol Architecture

LAN protocol standards collectively known
as IEEE 802 reference model.
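[Figure: OSI layers compared with the IEEE 802 layers]
OSI: Application, Presentation, Session, Transport, Network, Data link, Physical
IEEE 802: Upper layer protocols (OSI network layer and above), LLC and MAC (OSI data link), Physical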
123
IEEE 802 LAN Protocols
MAC sublayer: performs functions that
control access to shared medium.
 LLC: performs flow and error control and
provides services to upper layer.

124
802 standards 1
Text book page 367.
 LLC: IEEE 802.2

– connectionless and connection oriented
services.
– Reliable and unreliable.
125
802 standards 2

MAC + physical layers
– 802.3
» Bus/tree/star topologies.
» CSMA/CD.
– 802.4
» Bus/tree/star topologies.
» Token bus.
– 802.5
» Ring topology.
» Token ring.
– FDDI
» Dual ring (optical).
» Token ring.
– 802.11
» Wireless.
» CSMA.
126
Encapsulation
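[Figure: nested encapsulation of application data]
TCP segment = TCP header + application data
IP datagram = IP header + TCP segment
LLC PDU = LLC header + IP datagram
MAC frame = MAC header + LLC PDU + MAC trailer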
127
MAC Frame Format
[Frame layout: MAC control | Dst. MAC addr | Src. MAC addr | LLC PDU (begins with Dst. LLC addr and Src. LLC addr) | CRC]
MAC control: protocol information (protocol type, version #).
Destination MAC address: physical address of LAN destination.
Source MAC address: physical address of the LAN source.
128
LAN Topologies
Star
Ring
Tree
Central node
Bus
129
Bus Topology
Use of multipoint medium.
 Stations attach to bus through tap.

– Full-duplex communication allows data to be sent
to/received from bus.

Transmission from any station propagates in
both directions and is received by all.
– At each end, terminator absorbs and removes
signal from bus.
130
Tree Topology
Tree is generalization of bus.
 Headend: start of 1 or more cables
(branches).
 Transmission from one station propagates to
all others.

131
Issues

Inherently, broadcast.
– Frames to transmit data.
– Need for specifying the destination.
– Addresses.

Multi-access.
– Need for controlling access to medium.
» Avoid collisions.
» MAC protocol.
132
Ring Topology 1
Stations attach to repeaters.
 Repeaters are linked to each other by point-to-point links forming a closed loop.
 Links are unidirectional.
 Repeaters: receive data from one link and
repeat it on the other with no buffering.

133
Ring 2
Stations transmit/receive via repeater.
 Frames circulate past all stations;
destination copies frame as it goes by;
source removes frame.
 Ring shared by multiple stations.

– Need MAC protocol.
» Determine when each station may insert frame.
134
Star Topology
Each station directly connected to central node
via point-to-point link.
 Central node’s modes of operation:

– Broadcast mode: node broadcasts received frame
on all other links; logically works like bus.
– Switching mode: node sends frame out only on the
link to the destination.

Central node as single-point of failure.
135
Medium Access Control
Control access to shared medium.
 Where and how?
 Where: centralized versus decentralized.
 How: synchronous versus asynchronous.

136
Centralized versus Distributed
MAC

Centralized approaches:
– Controller grants access to medium.
– Simple, greater control: priorities, qos.
– But, single point of failure and performance
bottleneck.

Decentralized schemes:
– All stations collectively run MAC to decide
when to transmit.
137
Synchronous versus
Asynchronous

Synchronous approaches:
– Static channel allocation.
– Examples: FDM, TDM.
– Simple but inefficient.

Asynchronous or dynamic:
– Example: STDM.
– 3 categories: round-robin, reservation, and
contention.
138
Round-Robin MAC




Each station is allowed to transmit; station may
decline or transmit (bounded by some maximum
transmit time).
Centralized (e.g., polling) or distributed control of
who is next to transmit.
When done, station relinquishes its turn and the right to transmit
goes to the next station.
Efficient when many stations have data to transmit
over extended period (stream).
139
Reservation
Time divided into slots.
 Station reserves slots in the future.
 Multiple slots for extended transmissions.
 Suited to stream traffic.

140
Contention






No control.
Stations try to acquire the medium.
Distributed in nature.
Perform well for bursty traffic.
Can get very inefficient under heavy load.
NOTE: round-robin and contention are the most
common.
141
Standardized MACs
Technique      Bus topology                          Ring topology
Round robin    Token bus (802.4); Polling (802.11)   Token ring (802.5; FDDI)
Reservation    DQDB (802.6)                          -
Contention     CSMA/CD (802.3); CSMA (802.11)        -
142
LLC for LANs
Similar functions as general LLCs.
 But it has to interface with MAC sublayer.
 LLC functions:

– Addressing: source and destination.
» LLC address versus MAC address.
– Control data exchange between 2 users.
» User as higher-layer protocol in the station.
143
LLC Services

3 different services:
– Unacknowledged connectionless (type 1).
» No error or flow control.
» No delivery guarantees.
– Connection-mode (type 2).
» Logical connection established.
» Flow and error control provided.
– Acknowledged connectionless (type 3).
» No logical connection.
» Flow and error control.
144
LLC (802.2) Protocol
Similar to HDLC (ISO standard).
 LLC PDU:

DSAP (1 byte) | SSAP (1 byte) | LLC control (1 or 2 bytes) | Information (variable)
145
Wireless LANs

Use wireless transmission media.
– Infrared (IR): limited to indoors and single
room (IR light doesn’t penetrate walls).
– Radio
» Narrowband microwave.
» Spread Spectrum LANs.

For wireless LAN technology comparison,
see table on page 398.
146
Wireless LAN Applications
Nomadic access (e.g., users roaming around
campus).
 LAN interconnection (e.g., across
buildings).
 Ad Hoc Networks (e.g., disaster relief
crew).

147
MAC Protocols

Contention-based
– ALOHA and Slotted ALOHA.
– CSMA.
– CSMA/CD.

Round-robin : token-based protocols.
– Token bus.
– Token ring.
148
The ALOHA Protocol



Developed @ U of Hawaii in early 70’s.
Packet radio networks.
“Free for all”: whenever station has a frame to send,
it does so.
– Station listens for maximum RTT for an ACK.
– If no ACK, re-sends frame for a number of times and then
gives up.
– Receivers check FCS and destination address to ACK.
149
Collisions
Invalid frames may be caused by channel
noise or
 Because other station(s) transmitted at the
same time: collision.
 Collision happens even when the last bit of
a frame overlaps with the first bit of the
next frame.

150
ALOHA’s Performance 1
[Figure: timeline marked t0, t0+t, t0+2t, t0+3t, showing the vulnerable period of a frame.]
151
ALOHA’s Performance 2
S = G e^{-2G}, where S is the throughput (rate
of successful transmissions) and G is the
offered load.
 S_max = 1/(2e) ≈ 0.184 at G = 0.5.

152
Slotted Aloha
Doubles performance of ALOHA.
 Frames can only be transmitted at beginning
of slot: “discrete” ALOHA.
 Vulnerable period is halved.
 S = G e^{-G}.
 S_max = 1/e ≈ 0.368 at G = 1.
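The two throughput formulas can be checked numerically. A minimal sketch; the function names are illustrative and not from the slides.

```python
import math

def pure_aloha_throughput(g: float) -> float:
    """S = G * e^(-2G): vulnerable period is two frame times."""
    return g * math.exp(-2.0 * g)

def slotted_aloha_throughput(g: float) -> float:
    """S = G * e^(-G): vulnerable period is one slot."""
    return g * math.exp(-g)

if __name__ == "__main__":
    # Tabulate both curves for a few offered loads.
    for g in (0.25, 0.5, 1.0, 2.0):
        print(g, round(pure_aloha_throughput(g), 3),
              round(slotted_aloha_throughput(g), 3))
```

Evaluating at G = 0.5 and G = 1 reproduces the peaks quoted above (about 0.184 and 0.368).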

153
ALOHA Protocols
Poor utilization.
 Key property of LANs: propagation delay
between stations is small compared to frame
transmission time.
 Consequence: stations can sense the
medium before transmitting.

154
Carrier-Sense Multiple Access
(CSMA) 1
Station that wants to transmit first listens to
check if another transmission is in progress
(carrier sense).
 If medium is in use, station waits; else, it
transmits.
 Collisions can still occur.
 Transmitter waits for ACK; if no ACKs,
retransmits.

155
CSMA 2
Effective when average transmission time >>
propagation time.
 Collisions can occur only when 2 or more
stations begin transmitting within short time.
 If station transmits and no collisions during
the time leading edge of frame propagates to
farthest station, then NO collisions.

156
CSMA 3

Maximum utilization is function of frame
size and propagation time.
– Longer frames or shorter propagation time,
higher utilization.
157
CSMA Flavors

1-persistent CSMA (IEEE 802.3)
– If medium idle, transmit; if medium busy, wait
until idle; then transmit with p=1.
– If collision, waits random period to re-send.

Non-persistent CSMA: if medium is busy, station
waits a random time and then senses again,
instead of listening continuously.

P-persistent: when channel is detected idle,
transmits packet in the first slot with probability p;
otherwise defers to the next slot.
158
CSMA/CD 1
CSMA with collision detection.
 Problem: when frames collide, medium is
unusable for duration of both (damaged)
frames.
 For long frames (when compared to
propagation time), considerable waste.
 What if station listens while transmitting?

159
CSMA/CD Protocol
1. If medium idle, transmit; otherwise 2.
2. If medium busy, wait until idle, then
transmit with p=1.
3. If collision detected, transmit brief jamming
signal and abort transmission.
4. After aborting, wait random time, try again.
160
CSMA/CD Performance
Wasted capacity restricted to time to detect
collision.
 Time to detect collision < 2*maximum
propagation delay.
 Rule in CSMA/CD protocols: frames long
enough to allow collision detection prior to
end of transmission.
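The frame-length rule follows from the 2 x propagation delay bound. A rough sketch, assuming the worst case is exactly twice the maximum one-way delay; the function name and the sample numbers are illustrative only.

```python
def min_frame_bits(data_rate_bps: float, max_prop_delay_s: float) -> float:
    """Smallest frame (in bits) that is still being transmitted when a
    collision at the far end could be heard back at the sender."""
    round_trip_s = 2.0 * max_prop_delay_s
    return data_rate_bps * round_trip_s

if __name__ == "__main__":
    # Assumed example: 10 Mbps link, 25.6 microseconds one-way delay.
    print(min_frame_bits(10e6, 25.6e-6))
```

A shorter frame could finish before the collision signal returns, so the sender would not know the frame was damaged.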

161
IEEE 802.3 LAN Standards
802.3: 10 Mbps Ethernet.
 802.3u: 100Mbps (Fast) Ethernet.
 802.3z: 1Gbps (Gigabit) Ethernet.

162
Ethernet
Most popular CSMA/CD protocol.
 1-persistent.
 Developed at Xerox PARC (1976).
 Different implementations (10Mbps):

– Notation: <bps><signaling><max seg size
(100’s of meters)>
– Table page 409.
163
Ethernet Implementations
 10Base5 (thick net): up to 500m segments and 100 stations; coaxial
cable (10mm); baseband (Manchester); bus.
 10Base2 (thin net): up to 185m (nominally 200m) segments
and 30 stations; coaxial cable (5mm);
baseband (Manchester); bus.
 10BaseT: up to 100m segments;
unshielded TP; baseband (Manchester); star.
164
Baseband and Broadband
Signaling techniques.
 Baseband: signals transmitted without
modulation; digital signals represented by
different voltages (e.g., using Manchester
encoding).
 Broadband: analog signaling; if digital,
modulation required.

165
Ethernet (cont’d)

Multiple segments can be connected using
repeaters.
Repeater
166
Ethernet Frame Format
Field (bytes): Preamble (8) | DA (6) | SA (6) | Type (2) | Data | CRC (4) | Postamble (1)
Type: identifies upper layer protocol (for demux’ing)
Data: up to 1500 bytes; minimum is 46 bytes (shorter payloads are padded).
DA and SA: destination and source addresses.
Example: 6:2b:3e:0:0:1d
Broadcast: all 1’s.
Multicast: first bit is 1.
Promiscuous mode: stations accept all frames.
167
Ethernet Transmission

If channel idle:
– Send frame immediately (p=1).
– Waits 2t between back-to-back transmissions.

If channel busy:
– Wait till free, then transmit (p=1).

If collision:
– Jam for 512 bits (for both ends to detect collision).
– Waits for 0-2t (1st try), 0-4t (2nd try),...
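The doubling wait window after each collision is commonly called binary exponential backoff. A minimal sketch, assuming the usual truncated variant (window of 2^n slots after the n-th collision, capped at 2^10); the exact ranges and the cap are assumptions, not taken from the slide.

```python
import random

def backoff_slots(attempt: int, rng=random, max_exponent: int = 10) -> int:
    """Pick a random wait, in slot times, after the given collision count.

    attempt=1 draws from {0, 1}, attempt=2 from {0..3}, and so on;
    the window stops growing once attempt exceeds max_exponent.
    """
    exponent = min(attempt, max_exponent)
    return rng.randrange(2 ** exponent)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for attempt in range(1, 6):
        print(attempt, backoff_slots(attempt, rng))
```

Doubling the window spreads retransmissions out as load rises, which lowers the chance that the same stations collide again.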
168
Token Bus 1
IEEE 802.4 (1985).
 Token: special-purpose frame that
circulates when all stations are idle.
 Physically, token bus is linear or tree-shaped topology; logically, it operates as
ring.
[Figure: stations 1-6 attached to a bus, with the token passed among them in logical ring order.]
169
Token Bus 2
In CSMA/CD (802.3) starvation may occur,
i.e., stations can wait forever to transmit.
 In token bus, every station has a chance to
transmit (token).
 No collisions! i.e., contention-free.

170
Token Bus 3
Token passes around in pre-defined order.
 Once station acquires token, it can start
transmitting.
 When done, passes the token onto next
station.

171
Token Bus 4
Limited efficiency due to passing of the
token.
 Issues:

– Adding/removing stations.
– Lost token problem.
172
Token Ring 1
IEEE 802.5 and FDDI.
 Most commonly used MAC protocol for
ring topologies.
 Also uses special-purpose, circulating
frame, or token (3 bytes).
 Station that wants to transmit waits till
token passes by.

173
Token Ring 2

When station wants to transmit:
– Waits for token.
– Seizes it by changing 1 bit and token becomes
start-of-frame sequence.
– Station appends remainder of frame.

When station seizes token and begins
transmission, there’s no token on the ring;
so nobody else can transmit.
174
Token Ring 3

Transmitting station inserts new token when:
– Station completes frame transmission and
– Leading edge of frame returns to it after a roundtrip.
If ring length < frame length, 1st. condition
implies 2nd.
 2nd. condition ensures only 1 data frame at a
time on the ring.

175
Token Ring 4
Under light load, inefficiency due to waiting
for the token to transmit.
 Under heavy load, round-robin: fair and
efficient.
 Issues:

– Token maintenance.
» Token loss or duplication.
» Monitoring station can be responsible for ring
maintenance (removing duplicates, inserting token)
176
Token Ring Frame Format
Data frame (bytes): SD (1) | AC (1) | FC (1) | DA (2 or 6) | SA (2 or 6) | Data | FCS (4) | ED (1) | FS (1)
Token frame (3 bytes): SD | AC | ED
SD: starting delimiter; indicates starting of frame.
AC: access control; PPPTMRRR; PPP and RRR priority and
reservation; M monitor bit; T token or data frame.
FC: frame control; if LLC data or control.
DA and SA: destination and source addresses.
FCS: frame check sequence.
ED: ending delimiter; contains the error detection bit E; contains
frame continuation bit I (multiple frame transmissions).
FS: frame status.
177
Token Ring Revisited


Single priority: priority and reservation bits = 0.
Transmitter seizes token.
– Sets token bit to 1.
– Token’s SD and AC are first 2 fields.
– Station transmits 1 or more frames.
– Until done or token-holding timer expires.
– When AC of last frame returns, sets token bit to 0, appends
ED: new token.
178
Detecting Errors

Frame status (FS) field bits.
– A bit: address recognized.
– C bit: frame copied.
» A=0, C=0: destination non-existent or not active.
» A=1, C=0: destination exists but frame not copied.
» A=1, C=1: frame received.
179
Token Ring Priority
Optional priority mechanism in 802.5.
 3 priority bits: 8 priority levels.
 Service priority: priority of current token.

– Station can only transmit frame with priority >=
service priority.
– Reservation bits allow station to influence
priority levels trying to reserve next token.
180
Early Token Release
Typically, station waits for frame to come
back before issuing a new token.
 Problem: low ring utilization.
 ETR option:

– Station may release token as soon as it
completes transmission.
181
Ethernet versus Token Ring

Token ring:
– Efficient at heavy traffic.
– Guaranteed delay.
– Fair.
– Supports priorities.
– But, ring/token maintenance overhead.
» Centralized monitoring.

Ethernet is simple!
182
High-Speed LANs
FDDI
 100VG-AnyLAN
 Fast Ethernet
 Gigabit Ethernet

183
FDDI 1
Fiber Distributed Data Interface.
 Similar to 802.5 with some changes due to
higher data rates.
 100Mbps, token ring LAN.
 Also suitable for MANs.
 Fiber or TP as transmission medium.
 Up to 100 repeaters and up to 2 Km (fiber) or
100m (TP) between repeaters.

184
FDDI 2

2 counter-rotating fiber rings; only one used
for transmission; the other for reliability,
i.e., self-healing ring.
[Figure: dual ring in normal operation, and wrapped around a line failure.]
185
FDDI 3
[Figure: primary and secondary rings connecting DAS, SAS, and CON nodes.]
DAS: dual attachment
SAS: single attachment
CON: concentrator
186
FDDI 4

Basic differences to 802.5:
– Station waiting for token, seizes token by
failing to repeat it (completely removes it).
Original 802.5 technique impractical (high data
rate).
– Station inserts new frame.
– Early token release by default.
187
FDDI 5

FDDI can also be implemented using
twisted pair (copper): CDDI.
– Cheaper.
– 100m.
THT: token holding time.
 TRT: token rotation time.

188
100VG-ANYLAN 1





VG: voice grade; ANYLAN: support multiple frame
types.
802.12 (uses new MAC scheme and not CSMA/CD).
Intended to be 100Mbps extension to Ethernet like
100BASE-T.
MAC scheme: demand priority (determines order in
which nodes share network).
Supports both 802.3 and 802.5 frames.
189
100VG-ANYLAN 2

Topology: hierarchical star.
[Figure: a level 1 hub with level 2 hubs attached beneath it.]
190
MAC Protocol 1

Single-hub network
– Station issues request to central hub and waits
for permission to transmit.
– High- and low-priority requests.
– Hub scans its ports for requests in RR order,
e.g., port 1, 2,…, n; it keeps 2 separate pointers
for high- and low-priority traffic.
– Services high-priority requests in order; then
low-priority ones.
191
MAC Protocol 2

Hierarchical topology
[Figure: two-level hub hierarchy with ports labelled 1.1, 1.2, 1.3.1, 1.3.2, 1.3.3, 1.4, 1.5.1, 1.5.2, 1.5.3, 1.6, 1.7.]
192
Fast Ethernet
100 Mbps Ethernet.
 IEEE 802.3u, 1995.
 Medium alternatives: 100BASE-TX
(twisted pair) and 100BASE-FX (fiber).
 IEEE 802.3 MAC and frame format.
 10-fold increase in speed => 10-fold
reduction in diameter (200m).

193
Gigabit Ethernet
IEEE 802.3z (1996).
 Currently over fiber: 1000Base-F.
 Modified MAC layer due to high data rates.

194
Wireless LANs
IEEE 802.11.
 Distributed access control mechanism (DCF)
based on CSMA with optional centralized
control (PCF).

[Figure: MAC layer above the physical layer; DCF provides contention service (CSMA), and PCF on top of DCF provides contention-free service (polling).]
195
MAC in Wireless LANs
Distributed coordination function (DCF) uses
CSMA-based protocol (e.g., ad hoc networks).
 CD does not make sense in wireless.

– Hard for transmitter to distinguish its own
transmission from incoming weak signals and
noise.

Point coordination function (PCF) uses polling
to grant stations their turn to transmit (e.g.,
cellular networks).
196
Switched Ethernet
Point-to-point connections to multi-port hub
acting like switch; no collisions.
 More efficient under high traffic load: break
large shared Ethernet into smaller segments.

Switch
Hub
197
LAN Interconnection
Extend LAN coverage.
 Interconnect different types of LAN.
 Connect to an internetwork.
 Reliability and security.

198
Interconnection Schemes

Hubs or repeaters: physical-level
interconnection.
– Devices repeat/amplify signal.
– No buffering/routing capability.

Bridges: link-layer interconnection.
– Store-and-forward frames to destination LAN.
– Need to speak the protocols of the LANs they interconnect.

Routers: network-layer interconnection.
– Interconnect different types of networks.
199
Bridges 1

Operate at the MAC layer.
– Interconnect LANs of the same type, or
– LANs that speak different MAC protocols.
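[Figure: bridge B connects LAN A (stations 1 to 4) and LAN B (stations 5 to 8); it relays frames for stations 5 to 8 onto LAN B and frames for stations 1 to 4 onto LAN A.]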
200
Bridges 2

Function:
– Listens to all frames on LAN A and accepts
those addressed to stations on LAN B.
– Using B’s MAC protocol retransmits the
frames onto B.
– Does the same for B-to-A traffic.
201
Bridges 3
Behave like a station; have multiple
interfaces, 1 per LAN.
 Use destination address to forward unicast
frames; if destination is on the same LAN,
drops frame; otherwise forwards it.
 Forward all broadcast frames.
 Have storage and routing capability.

202
Bridges 4
No additional encapsulation.
 But they may have to do header conversion
if interconnecting different LANs (e.g.,
802.3 to 802.4 frame).
 May interconnect more than 2 LANs.
 LANs may be interconnected by more than
1 bridge.

203
Bridge Protocol Architecture

IEEE 802.1D specification for MAC
bridges.
[Figure: two stations, each with LLC, MAC, and PHY layers, attached to separate LANs; the bridge between the LANs has one MAC layer and two PHY layers.]
204
Routing with Bridges
Bridge decides to relay frame based on
destination MAC address.
 If only 2 LANs, decision is simple.
 If more complex topologies, routing is
needed, i.e., frame may traverse more than 1
bridge.

205
Routing
Determining where to send frame so that it
reaches the destination.
 Routing by learning: adaptive or backward
learning.

206
Note on Terminology: Repeaters
and Bridges

Repeaters:
– Extend scope of LANs.
– Serve as amplifiers.
– No storage/routing capabilities.

Bridges:
– Also extend scope of LANs.
– Routing/storage capabilities.
207
Bridges

Operate at the data link layer.
– Only examine DLL header information.
– Do not look at the network layer header.
208
Routing with Bridges

3 algorithms:
– Fixed routing.
– Spanning tree.
– Source routing.
209
Fixed Routing
Fixed route for every source-destination
pair of LANs.
 Does not automatically respond to changes
in load/topology.
 Statically configured routing matrix (preloaded into bridge).
 If alternate routes, pick “shortest” one.
 Rij: first bridge on the route from i to j.

210
Fixed Routing: Example
[Figure: LANs A to G interconnected by bridges 101 to 107, with a central routing matrix giving, for each source LAN and destination LAN, the first bridge on the route.]
Ex: E-> F: 107; 102; 105.
211
Fixed Routing



Each bridge keeps a table for each LAN
it attaches to.
Table “From X” is derived from column
“X” of the routing matrix.
Every entry in that column that has the number of the
bridge results in a table entry.
[Table: bridge 101's “From A” and “From B” tables, each listing destination LAN and next LAN.]
212
Fixed Routing
Simple and minimal processing.
 Too limited for internets with dynamically
changing topology.

213
Spanning Tree Routing
Aka transparent bridges.
 Bridge routing table is automatically
maintained (set up and updated as topology
changes).
 3 mechanisms:

– Address learning.
– Frame forwarding.
– Loop resolution.
214
Address Learning 1
Problem: determine where destinations are.
 Bridges operate in promiscuous mode, i.e.,
accept all frames.
 Basic idea: look at source address of received
frame to learn where that station is (which
direction frame came from).
 Build routing table so that if frame comes
from A on interface N, save [A, N].

215
Address Learning 2
When bridges first start, all tables are
empty.
 So they flood: every frame for unknown
destination, is forwarded on all interfaces
except the one it came from.
 With time, bridges learn where destinations
are, and no longer need to flood for known
destinations.

216
Backward Learning

Bridges look at frame’s (MAC) source
address to find which machine is accessible
on which LAN.
[Figure: bridges B1 and B2 interconnecting LAN 1 to LAN 4, with stations A, B, and C.]
If B1 sees a frame from C on LAN 2, it adds RT entry (C, LAN 2).
Any frame to C on LAN 1 will be forwarded.
But a frame to C on LAN 2 will not be forwarded.
217
Address Learning 3
RT entries have a time-to-live (TTL).
 RT entries refreshed when frames from source
already in the table arrive.
 Periodically, process running on bridge scans
RT and purges stale entries, i.e., entries older
than TTL.
 Forwarding to unknown destinations reverts to
flooding.

218
Frame Forwarding

Depends on source and destination LANs.
– If destination LAN (where frame is going to) =
source LAN (where frame is coming from),
discard frame.
– If destination LAN != source LAN, forward
frame.
– If destination LAN unknown, flood frame.

Special purpose hardware used to perform RT
lookup and update in few microseconds.
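Address learning, entry aging, and the three forwarding cases can be combined in a few lines. A simplified sketch; the class name, the port numbering, and the 300-second TTL are assumptions for illustration, and broadcast frames are not handled.

```python
class LearningBridge:
    """Toy transparent bridge: learns [address, port] pairs from source addresses."""

    def __init__(self, ports, ttl=300.0):
        self.ports = set(ports)
        self.ttl = ttl          # seconds before an entry is considered stale
        self.table = {}         # address -> (port, time last seen)

    def handle(self, src, dst, in_port, now):
        """Return the list of ports the frame is sent out on."""
        # Backward learning: the source is reachable via the arrival port.
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        if entry is None or now - entry[1] > self.ttl:
            # Unknown or stale destination: flood on all other ports.
            self.table.pop(dst, None)
            return sorted(self.ports - {in_port})
        out_port = entry[0]
        if out_port == in_port:
            return []           # destination is on the source LAN: discard
        return [out_port]       # known destination on another LAN: forward
```

For example, with ports 1, 2, 3, the first frame from A (port 1) to an unknown B is flooded to ports 2 and 3; once B has replied from port 2, later frames to B go to port 2 only.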
219
Loops
Alternate routes: loops.
 Example:

– LAN A, bridge 101, LAN B, bridge 104, LAN E, bridge 107, LAN A.
[Figure: portion of the internet showing LANs A, B, and E and bridges 101, 103, 104, and 107.]
220
Loop: Problems
[Figure: station B on LAN 1 and station A on LAN 2; bridges B1 and B2 both connect LAN 1 and LAN 2.]
1. Station A sends frame to B; bridges B1 and B2 don’t know B.
2. B1 copies frame onto LAN1; B2 does the same.
3. B2 sees B1’s frame to unknown destination and copies it onto LAN 2.
4. B1 sees B2’s frame and does the same.
5. This can go on forever.
221
Loop Resolution
Goal: remove “extra” paths by removing
“extra” bridges.
 Spanning tree:

– Given graph G(V,E), there exists a tree that
spans all nodes where there is only one path
between any pair of nodes, i.e., NO loops.
– LANs are represented by nodes and bridges by
edges.
222
Definitions 1
Bridge ID: unique number (e.g., MAC
address + integer) assigned to each bridge.
 Root: bridge with smallest ID.
 Cost: associated with each interface;
specifies cost of transmitting frame through
that interface.
 Root port: interface to minimum-cost path
to root.

223
Definitions 2
Root path cost: cost of path to root bridge.
 Designated bridge: on any LAN, bridge
closest to root, i.e., the one with minimum
root path cost.

224
Spanning Tree Algorithm 1
1. Determine root bridge.
 2. Determine root port on all bridges.
 3. Determine designated bridges.

225
Spanning Tree Algorithm 2
Initially all bridges assume they are the root
and broadcast a message with their ID and root path
cost.
 Eventually, lowest-ID bridge will be known to
everyone and will become root.
 Root bridge periodically broadcasts that it is the
root.

226
Spanning Tree Algorithm 3
Directly connected bridges update their cost
to root and broadcast message on other
LANs they are attached to.
 This is propagated throughout network.
 On any (non-directly connected) LAN,
bridge closest to root becomes designated
bridge.
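The outcome of the three steps can be computed centrally for a small topology. This is only a sketch of the result, not the distributed message exchange: it assumes each bridge is given as a dict of {LAN: port cost}, that root path cost is the sum of the port costs used toward the root, and that ties go to the lowest bridge ID. All names are illustrative.

```python
import heapq

def spanning_tree(bridges):
    """bridges: {bridge_id: {lan: port_cost}} -> (root, root_port, designated)."""
    root = min(bridges)                       # step 1: smallest ID is root
    cost = {b: float("inf") for b in bridges}
    root_port = {}                            # bridge -> LAN of its root port
    cost[root] = 0
    heap = [(0, root)]
    while heap:                               # step 2: least-cost path to root
        c, b = heapq.heappop(heap)
        if c > cost[b]:
            continue
        for lan in bridges[b]:
            for other, ports in bridges.items():
                if other == b or lan not in ports:
                    continue
                new_cost = c + ports[lan]
                if new_cost < cost[other]:
                    cost[other] = new_cost
                    root_port[other] = lan
                    heapq.heappush(heap, (new_cost, other))
    lans = {lan for ports in bridges.values() for lan in ports}
    designated = {                            # step 3: closest bridge per LAN
        lan: min((cost[b], b) for b in bridges if lan in bridges[b])[1]
        for lan in lans
    }
    return root, root_port, designated
```

A bridge that is not designated on any LAN other than via its root port carries no traffic, which is how the "extra" paths are removed.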

227
Spanning Tree: Example
[Figure: example internet with bridges B1-B5 and LANs 1-5, port costs of 5 and 10, shown before and after the spanning tree is computed.]
228
Spanning Tree: Example
Only designated bridges on each LAN are allowed to forward frames.
Bridges continue exchanging info to react to topology changes.
[Figure: resulting spanning tree over bridges B1-B5 and LANs 1-5.]
229
Source Routing 1
Route determined a priori by sender.
 Route included in the frame header as
sequence of LAN and bridge identifiers.
 When bridge receives frame:

– Forward frame if bridge is on the route.
– Discard frame otherwise.
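The bridge's forward/discard decision can be sketched as a scan of the route carried in the frame. The list encoding of the route and the function name are assumptions for illustration.

```python
def next_lan(route, bridge_id, in_lan):
    """Return the LAN to forward onto, or None if the frame is discarded.

    route alternates LAN and bridge identifiers, e.g.
    ["L1", "B1", "L3", "B3", "L2"].
    """
    for i in range(len(route) - 2):
        # Forward only if this bridge follows the arrival LAN in the route.
        if route[i] == in_lan and route[i + 1] == bridge_id:
            return route[i + 2]
    return None

if __name__ == "__main__":
    route = ["L1", "B1", "L3", "B3", "L2"]
    print(next_lan(route, "B1", "L1"))
    print(next_lan(route, "B2", "L1"))
```

The bridge keeps no table; everything it needs is in the frame header.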
230
Source Routing 2

Route: sequence of bridges and LANs.
X->Z: L1,B1,L3,B3,L2.
X->Z: L1,B2,L4,B4,L2.
[Figure: station X on LAN 1 and station Z on LAN 2, connected via B1, LAN 3, B3 or via B2, LAN 4, B4.]
231
Source Routing 3

No need to maintain routing table.
– Frame has all needed routing information.

However, stations need to find route to
destination.
232
Route Discovery 1

Finding all routes.
– If destination is unknown, source sends
broadcast route discovery frame.
– Frame reaches every LAN.
– When reply comes back, intermediate bridges
record their id.
– Source gets complete route information.

Problem: frame explosion.
233
Route Discovery 2

Alternative: single route request frame
forwarded according to spanning tree.
[Figure: single-route broadcast from X on LAN 1 to Z on LAN 2, following the spanning tree through B1, LAN 3, and B3; LAN 4 and B4 are also shown.]
234
Route Discovery 3
[Figure: replies from Z on LAN 2 reach X on LAN 1 along two routes, recorded as L2, B3, L3, B1, L1 and L2, B4, L4, B2, L1.]
235
Route Selection
Select minimum-cost route, e.g., minimum-hop route.
 If tie, choose the one that arrived first.
 Routes are cached with a TTL; when TTL
expires, re-discover route.

236
Routers
Operate at the network layer, i.e., inspect
the network-layer header.
 Usually main router functionality
implemented in software.
 Store-and-forward.
 Ability to interconnect heterogeneous
networks: address translation, link speed
and packet size mismatch.

237
The Network Layer
238
Goals

Get data from source to destination.
– May require traversing many hops and
involving intermediate routers.
In contrast with data link layer: frames from
one end of a wire to the other.
 Network layer as lowest end-to-end
transmission layer: multiple hops.

239
Routing and Internetworking

Based on knowledge of network topology,
choose appropriate paths from source to
destination.
– Load balancing across routers and links.
– Avoid congestion.

Network interconnection: internetworking.
– Source and destination in different networks.
240
Design Issues
Services provided to transport layer.
 Design/implementation of the subnet.

Router
End system
Router
Router
Router
Subnet
241
[Circuit- versus Packet-Switching]

Circuit Switching
– Physical circuit (physical connection) is
established between source and destination
throughout the network (involving switches and
links).
– This happens before any data can be sent.
242
Circuit Switching
243
Packet Switching
Special case of message switching.
 No physical path establishment ahead of time.
 As data moves from source to destination,
route is formed one hop at a time: store-and-forward.
 On-demand resource acquisition as opposed to
circuit switching where resources reserved
statically beforehand.

244
Context

We are talking about packet switching
networks!
245
Services Provided to Transport
Layer




Network/transport layer interface: typically interface
between carrier (network service provider) and end
user.
NSP has control over protocols up to network layer.
Network/transport interface needs to be very well
defined.
Types of service: connection-less versus connection-oriented
246
Connection-less service
Internet.
 E2E argument.

– Push functionality closer to users.
Error and flow control at higher layers.
 No delivery or ordering guarantees.
 Every packet must carry full destination
address (each packet independent of the
other).

247
Connection-oriented


Telephone and ATM networks.
Network-layer connection:
– Logical connection between network-layer processes at
sender and receiver.
– Connection ID used to identify PDUs.
– Connection set up (QoS, cost negotiation) and tear down.
– Full duplex communication.
– Reliable and ordered delivery.
248
Internet over ATM
Source first establishes ATM network-layer
connection to destination; then send IP
packets over it.
 Inefficient: duplicate functionality.

– Example: ordered delivery guarantees at the
ATM network layer and TCP packet reordering mechanism.
249
Network Layer Design
Connection-oriented versus connection-less
infrastructure.
 Connection-oriented: virtual circuit
 Connection-less: datagrams.

250
Virtual Circuit
Analogy to physical circuits used by
telephone networks.
 At connection establishment time, path
from source to destination is selected and
used throughout connection lifetime.
 When connection is over, virtual circuit
terminated.

251
Datagram
No logical connection.
 Each packet (datagram) routed
independently; successive packets may
follow different routes.
 More work at intermediate routers, but more
robust and adaptive to failures and
congestion.

252
Routers

For VCs, routers keep a table with (VC
number, outgoing interface) entries.
– Packets only need to carry VC number.

For datagrams, routing table.
– (destination, outgoing interface) entries.
– Each packet must carry destination address.
253
Combinations of Service and
Subnet Structure
Service               Datagram subnet   Virtual-circuit subnet
Connectionless        UDP over IP       UDP over IP over ATM
Connection-oriented   TCP over IP       ATM over ATM
254
Routing Algorithms 1
Routing is main function of network layer.
 Routing algorithm: decides which route a
packet should take from source to
destination.

– For router: which interface a packet should be
forwarded.
255
Routing Algorithms 2
If datagram network, decision is made for
every packet.
 If VC, decision is made only once when VC
is setup.

256
Routing Metrics

Routing algorithms can use different
metrics when building/selecting routes.
– Example:
» Number of hops.
» Delay.
» Bandwidth.
257
Adaptive and Non-adaptive
Routing

Non-adaptive routing:
– Fixed routing, static routing.
– Do not take current state of the network (e.g., load,
topology) into account.
– Routes are computed in advance, off-line, and downloaded
to routers when booted.

Adaptive routing:
– Routes change dynamically as function of current state of
network.
– Algorithms vary on how they get routing information,
metrics used, and when they change routes.
258
Optimality Principle


General statement about optimal routes (topology,
routing algorithm independent).
If router J is on optimal path between I and K, then
the optimal path from J to K also falls along the same
route.
– Proof by contradiction.

Corollary:
– Set of optimal routes from all sources to destination form a
tree rooted at destination.
– Sink tree.
259
Static Algorithms
 Shortest-path routing.
 Flooding.
262
Shortest Path Routing 1
Dijkstra (1959).
 Network represented by graph G(V, E),
where V is set of nodes and E is set of links
connecting nodes.
 What is “shortest”?

– Different metrics.
– Example: number of hops (static), geographic
distance (static), delay, bandwidth (raw versus
available), combination of a subset of these.
263
Dijkstra’s Shortest Path
Nodes labeled with distance to source
through best known path.
 At start, no known paths so all nodes
labeled with infinity.
 As algorithm progresses, nodes are labeled;
“tentative” labels may change, while
“permanent” labels don’t change.
 Label made permanent when it’s known to
be in the shortest path to source.
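The labeling procedure can be sketched with a priority queue. The eight-node graph and its edge weights below are assumed for illustration; a node's label becomes permanent when it is popped from the queue.

```python
import heapq

# Assumed undirected example graph: (node, node, weight).
EDGES = [("A", "B", 2), ("A", "G", 6), ("B", "C", 7), ("B", "E", 2),
         ("C", "D", 3), ("C", "F", 3), ("D", "H", 2), ("E", "F", 2),
         ("E", "G", 1), ("F", "H", 2), ("G", "H", 4)]
GRAPH = {}
for u, v, w in EDGES:
    GRAPH.setdefault(u, {})[v] = w
    GRAPH.setdefault(v, {})[u] = w

def dijkstra(graph, source):
    """Return (dist, prev): best distance and predecessor label per node."""
    dist = {n: float("inf") for n in graph}   # all labels start at infinity
    prev = {}
    dist[source] = 0
    permanent = set()
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in permanent:
            continue                           # stale tentative label
        permanent.add(node)                    # label can no longer improve
        for nbr, w in graph[node].items():
            if d + w < dist[nbr]:              # better tentative label found
                dist[nbr] = d + w
                prev[nbr] = node
                heapq.heappush(heap, (d + w, nbr))
    return dist, prev
```

Calling dijkstra(GRAPH, "A") gives each node a (distance, predecessor) pair, the same kind of label the slides attach to nodes.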

264
Dijkstra’s Algorithm: Example
[Figure: successive snapshots of Dijkstra's algorithm on an eight-node graph (A to H) with source A; node labels give (distance, predecessor), e.g., B (2,A), E (4,B), G (6,A) improving to (5,E), F (6,E), C (9,B), and H (9,G) improving to (8,F).]
265
Flooding
Every incoming packet forwarded on every
outgoing link except the one it arrived on.
 Problem: duplicates.
 Constraining the flood:

– Hop count.
– Keep track of packets that have been flooded.

Robust, shortest delay (picks shortest path as
one of the paths).
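Both constraints (hop count and remembering what has already been flooded) can be seen in a small simulation. A sketch only; the adjacency-list graph and the function name are assumptions for illustration.

```python
from collections import deque

def flood(graph, source, max_hops):
    """Flood one packet; return (nodes reached, link transmissions used)."""
    seen = {source}                      # nodes that already flooded the packet
    queue = deque([(source, None, max_hops)])
    transmissions = 0
    while queue:
        node, came_from, hops = queue.popleft()
        if hops == 0:
            continue                     # hop count exhausted: drop
        for nbr in graph[node]:
            if nbr == came_from:
                continue                 # never send back on the arrival link
            transmissions += 1
            if nbr not in seen:          # duplicates are received but not re-flooded
                seen.add(nbr)
                queue.append((nbr, node, hops - 1))
    return seen, transmissions
```

Even with duplicate suppression, the number of transmissions exceeds the number of nodes reached, which is the cost paid for robustness.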
266
Dynamic Routing Algorithms
Distance vector routing.
 Link state routing.

267
Distance Vector Routing 1




Each router keeps routing table (or routing vector)
giving best known distance to each destination and
the corresponding outgoing interface.
Routing tables are updated by exchanging routing
information with neighbors.
Aka, Bellman-Ford, Ford-Fulkerson.
Original ARPANET routing; also used by Internet’s
RIP.
268
Distance Vector 2

Routing table at each router:
– One entry per participating router.
– Each entry contains outgoing interface and
distance to corresponding destination.
– Metric: number of hops, delay, queue length.
– Each router knows distance to its neighbors.

Old ARPANET algorithm: DV where cost
metric is outgoing link queue length.
269
Routing Updates




Every T interval, routers exchange routing updates.
Routing update from router X consists of a vector
with all destinations and the corresponding distance
from X to them.
When router Y receives an update from X, it can
estimate its distance to router Z through X as D_yz =
D_yx + D_xz.
Router Y receives update from all its neighbors;
discards its RT and builds a new one.
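The update rule Dyz = Dyx + Dxz can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the example costs are assumptions, and the table is rebuilt from scratch on each round as the slide describes.

```python
INFINITY = float("inf")

def build_routing_table(me, link_cost, neighbor_vectors):
    """Rebuild a routing table from the vectors received from all neighbors.

    link_cost:        {neighbor: cost of the direct link to that neighbor}
    neighbor_vectors: {neighbor: {destination: neighbor's distance to it}}
    Returns {destination: (distance, next hop)}.
    """
    table = {me: (0, None)}
    for neighbor, vector in neighbor_vectors.items():
        for dest, dist in vector.items():
            if dest == me:
                continue
            candidate = link_cost[neighbor] + dist      # Dyz = Dyx + Dxz
            if candidate < table.get(dest, (INFINITY, None))[0]:
                table[dest] = (candidate, neighbor)
    return table

# Hypothetical example: router Y with neighbors X (cost 2) and W (cost 5).
table = build_routing_table(
    "Y",
    {"X": 2, "W": 5},
    {
        "X": {"X": 0, "W": 1, "Z": 4},
        "W": {"W": 0, "X": 1, "Z": 3},
    },
)
```

Here Y reaches W more cheaply through X (2 + 1 = 3) than over its direct link (5), so the next hop for W is X.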
270
Distance Vector: Example
[Figure: six-node example network with link costs, together with routing tables (columns: Node, Distance, Next) and the distance vectors exchanged at T=T0, T=T1, and T=T2.]
271
Problems
Routing loops.
 Slow convergence.
 Counting to infinity.

272
Count-to-Infinity 1

Good news propagates faster.
Distance to A as recorded by each router:

                     B         C         D         E
Initially, A down:   infinity  infinity  infinity  infinity
A comes up:          1         infinity  infinity  infinity   (after 1 exchange)
                     1         2         infinity  infinity   (after 2 exchanges)
                     1         2         3         infinity   (after 3 exchanges)
                     1         2         3         4          (after 4 exchanges)
273
Count-to-Infinity 2

But bad news propagates more slowly!
Distance to A as recorded by each router:

                     B         C         D         E
Initially, all up:   1         2         3         4
A goes down:         3         2         3         4          (after 1 exchange)
                     3         4         3         4          (after 2 exchanges)
                     5         4         5         4          (after 3 exchanges)
                     5         6         5         6          (after 4 exchanges)
                     7         6         7         6          (after 5 exchanges)
                     7         8         7         8          (after 6 exchanges)
                     ….
                     infinity  infinity  infinity  infinity
274
Count-to-Infinity 3
Gradually routers work their way up to
infinity.
 Number of exchanges depends on how large
infinity is.
 To reduce number of exchanges, if metric is
number of hops, infinity = maximum
path length + 1.
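A small simulation makes the dependence on the value of infinity concrete. It is a sketch under stated assumptions: routers B, C, D, E sit on a line (with A, now down, formerly attached to B), every link costs 1 hop, all routers exchange vectors synchronously, and `INFINITY = 16` is an arbitrary value chosen for the illustration.

```python
INFINITY = 16  # assumed "infinity" for a hop-count metric

def exchange(dist):
    """One synchronous exchange on the line B - C - D - E after A has gone down.

    dist holds each router's current distance to A, in the order [B, C, D, E].
    Each router takes the smallest distance advertised by a neighbor, plus 1.
    """
    new = []
    for i in range(len(dist)):
        neighbors = [dist[j] for j in (i - 1, i + 1) if 0 <= j < len(dist)]
        new.append(min(min(neighbors) + 1, INFINITY))
    return new

def count_to_infinity(dist):
    """Repeat exchanges until every router has reached INFINITY."""
    history = [dist]
    while any(d < INFINITY for d in history[-1]):
        history.append(exchange(history[-1]))
    return history

history = count_to_infinity([1, 2, 3, 4])
for step, row in enumerate(history):
    print(step, row)
```

The first rows printed are the same values as in the Count-to-Infinity 2 example; a smaller `INFINITY` ends the loop after fewer exchanges.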

275
Solution

Routing loops:
– Path vector: record actual path used in the DV.
– Previous hop tracing: records preceding router.

Count-to-infinity:
– Split horizon: router reports to neighbor cost
“infinity” for destination if route to that
destination is through that neighbor.
276
Split Horizon
Tries to make bad news spread faster.
 A node reports infinity as distance to node X
on the link over which packets to X are sent.
 Example, in the first exchange, C tells D its
distance to A but tells B its distance to A is
infinity.

– So B discovers its link to A is down and C’s
distance to A is infinity; so it sets its distance to A
to infinity.
277
Link State Routing 1
DV routing used in the ARPANET until 1979,
when it was replaced by link state routing.
 Used by the Internet’s OSPF.

278
Link State Routing 2

Link state routing is based on:
– Discover your neighbors and measure the
communication cost to them.
– Send updates about your neighbors to all other
routers.
– Compute shortest path to every other router.
279
Finding Neighbors
When router is booted, its first task is to
find who its neighbors are.
 Special single-hop “hello” packets.
 Cost metric:

– Number of hops: in this case, always 1.
– Delay: “echo” packets and measure RTT/2.
– Load?
280
Generating Link State Updates

Link state packets (LSP).
– Sender identity.
– Sequence number.
– TTL.
– List of (neighbor, cost).

When to send updates?
– Proactive: periodic updates; how often?
– Reactive: whenever some significant event is detected,
e.g., link goes down.

Where to send them? Everywhere: flood.
281
Processing Updates

When LSP is received:
– Check sequence number.
– If higher than current sequence number, keep it
and flood it; otherwise, discard it.
– Periodically decrement TTL.
» When TTL=0, purge LSP.
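The sequence-number check and TTL aging can be sketched as below. The dictionary representation of an LSP and the function names are assumptions made for illustration; a real implementation would also handle sequence-number wraparound and acknowledgements, which the sketch ignores.

```python
def process_lsp(database, lsp):
    """Store a received LSP if it is newer than the stored one.

    Returns True if the LSP was kept (and should be flooded onward),
    False if it was discarded as old or duplicate.
    """
    current = database.get(lsp["sender"])
    if current is not None and lsp["seq"] <= current["seq"]:
        return False                       # not higher than current: discard
    database[lsp["sender"]] = dict(lsp)    # keep it; the caller floods it
    return True

def age_database(database):
    """Periodic aging: decrement each TTL and purge LSPs that reach 0."""
    for sender in list(database):
        database[sender]["ttl"] -= 1
        if database[sender]["ttl"] <= 0:
            del database[sender]

# Hypothetical LSP from router A listing one neighbor.
db = {}
lsp = {"sender": "A", "seq": 1, "ttl": 2, "neighbors": {"B": 3}}
first = process_lsp(db, lsp)     # new sender: kept
repeat = process_lsp(db, lsp)    # same sequence number: discarded
```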
282
Computing Routes

Routers have global view of network.
– They receive updates from all other routers
with their cost to their neighbors.
– Build network graph.

Use Dijkstra’s shortest-path algorithm to
compute shortest paths to all other nodes.
283
DV versus LS

DV:
– Node tells its neighbors what it knows about everybody.
– Based on other’s knowledge, node chooses best route.
– Distributed computation.

LS:
– Node tells everyone what it knows about its neighbors.
– Every node has global view.
– Compute their own routes.
284
Hierarchical Routing

For scalability:
– As network grows, so does RT size, routing update
generation, processing, and propagation overhead, and
route computation time and resources.

Divide network into routing regions.
– Routers within region know how to route packets to all
destinations within region.
– But don’t know how to route within other regions.
– “Border” routers: route between regions.
285
Hierarchical Routing Example
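[Figure: 17-router network divided into five regions: region 1 (1A, 1B, 1C), region 2 (2A, 2B, 2C, 2D), region 3 (3A, 3B), region 4 (4A, 4B, 4C), and region 5 (5A, 5B, 5C, 5D, 5E).]

Routing table for 1A:

Dest.  Next  Hops
1A     -     -
1B     1B    1
1C     1C    1
2A     1B    2
2B     1B    3
2C     1B    3
2D     1B    4
3A     1C    3
3B     1C    2
4A     1C    3
4B     1C    4
4C     1C    4
5A     1C    4
5B     1C    5
5C     1B    5
5D     1C    6
5E     1C    5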
286
Hierarchical Routing Example
[Figure: the same 17-router, five-region network as in the previous example.]

Routing table for 1A with whole regions as destinations:

Dest.  Next  Hops
1A     -     -
1B     1B    1
1C     1C    1
2      1B    2
3      1C    2
4      1C    3
5      1C    4
287
Hierarchical Routing

Optimal paths are not guaranteed.
– Example: 1A->5C should be via 2 and not 3.

How many hierarchical levels?
– Example: 720 routers.
» 1 level: each router needs 720 RT entries.
» 2 levels: 24 regions of 30 routers: each router’s RT
has 30+23 entries.
» 3 levels: 8 clusters of 9 regions with 10 routers: each
router’s RT has 10+8+7 entries.
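The entry counts above follow one pattern: a router needs an entry for every router in its own lowest-level group, plus one entry for each other group at every higher level. A minimal sketch of that arithmetic follows; the function name and the list encoding of the hierarchy are assumptions for illustration.

```python
def table_entries(level_sizes):
    """Routing table entries per router in a hierarchy.

    level_sizes lists the group counts from the top level down, ending with
    the number of routers in the lowest-level group. For example, [8, 9, 10]
    means 8 clusters, each with 9 regions, each with 10 routers.
    """
    *upper, local = level_sizes
    # One entry per local router, plus one per *other* group at each upper level.
    return local + sum(n - 1 for n in upper)

for sizes in ([720], [24, 30], [8, 9, 10]):
    print(sizes, table_entries(sizes))
```

For the 720-router example this gives 720, 53 (30+23), and 25 (10+8+7) entries.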
288
Many-to-Many Routing
Support many-to-many communication.
 Example applications: multi-point data
distribution, multi-party teleconferencing.

289
Broadcasting

Simplistic approach: send separate packet to
each destination.
– Simple but expensive.
– Source needs to know about all destinations.

Flooding:
– May generate too many duplicates (depending
on node connectivity).
290
Multidestination Routing
Packet contains list of destinations.
 Router checks destinations and determines
on which interfaces it will forward packet.

– Router generates new copy of packet for each
output line and includes in packet only the
appropriate set of destinations.
– Eventually, packets will only carry 1
destination.
291
Spanning Tree Routing
Use spanning tree (sink tree) rooted at
broadcast initiator.
 No need for destination list.
 Each router on the spanning tree forwards packets on all
lines on the spanning tree (except the one the
packet arrived on).
 Efficient but needs to generate the spanning
tree and routers must have that information.

292
Reverse Path Forwarding
Routers don’t have to know spanning tree.
 Router checks whether broadcast packet
arrived on interface used to send packets to
source of broadcast.

– If so, it’s likely that it followed best route and
thus not a duplicate; router forwards packet on
all lines.
– If not, packet discarded as likely duplicate.
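The reverse path check can be sketched as a single function. The names and the table layout are assumptions; the sketch also assumes, as with flooding, that the packet is not sent back out the interface it arrived on.

```python
def reverse_path_forward(unicast_next_hop, interfaces, source, arrival_interface):
    """Return the interfaces on which to forward a broadcast packet.

    unicast_next_hop maps a destination to the interface this router uses
    to send ordinary packets toward it.
    """
    if unicast_next_hop.get(source) != arrival_interface:
        return []   # arrived off the preferred route: likely a duplicate, discard
    # Arrived on the interface used to reach the source: forward on all other lines.
    return [i for i in interfaces if i != arrival_interface]

# Hypothetical router with three interfaces; it reaches source S via if0.
routes = {"S": "if0"}
interfaces = ["if0", "if1", "if2"]
accepted = reverse_path_forward(routes, interfaces, "S", "if0")
rejected = reverse_path_forward(routes, interfaces, "S", "if1")
```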
293
Multicasting

Special form of broadcasting:
– Instead of sending messages to all nodes, send
messages to a group of nodes.

Multicast group management:
– Creating, deleting, joining, leaving group.
– Group management protocols communicate
group membership to appropriate routers.
294
Multicast Routing

Each router computes spanning tree covering
all other participating routers.
– Tree is pruned by removing branches that do not contain
any group members.
[Figure: example network whose nodes are labeled with their multicast group membership (1, 2, or 1,2), illustrating spanning trees pruned for each group.]
295
Shared Tree Multicasting

Source-rooted tree approaches don’t scale
well!
– 1 tree per source, per group!
– Routers must keep state for m*n trees, where m is number
of sources in a group and n is number of groups.

Core-based trees: single tree per group.
– Host unicasts message to core, where message is multicast
along shared tree.
– Routes may not be optimal for all sources.
– State/storage savings in routers.
296
Congestion Control

Ideal network behavior:
[Figure: packets delivered plotted against packets sent, with the maximum capacity marked.]
297
Network Congestion

What is network congestion?
– Too many packets in the network.
– Router queues are always full.
» Routers start dropping packets.
– Congestion can fuel itself.
» Packet drops lead to retransmissions.
» More traffic!
– May result in congestion collapse!
» Close to 0 throughput!
298
Infinite-Buffer Routers

Intuition says add more memory to routers
and that’ll avoid congestion.
– Nagle (1987) showed that infinite buffers
actually make congestion worse.
– More packets enqueued for long time; they time
out and are retransmitted; but still transmitted
by router.
– Therefore, more traffic.
299
Causes of Congestion

Mismatch in capacity among different parts of
the system.
– Mismatch in link speeds.
– Mismatch in router processing capability.
» Table lookup and update.
» Queue management.

Congestion in one point of network tends to
propagate backwards toward sender.
300
Congestion versus Flow Control

Congestion control tries to ensure the
network is able to carry offered traffic.
– Involves hosts and intermediate routers.

Flow control ensures that the
communication end-points are able to keep
up with one another.
– Involves only the end-points.
301
Congestion and Flow Control

Often confused because they tend to use the same
feedback mechanisms.
– Example: “slow down” message received at
host may be caused by receiver not being able
to keep up with sender host or by network not
being able to handle additional traffic.
302
Congestion Control Principles

From control theory point of view:
– Open and closed loop solutions.

Open loop solutions:
– Avoidance approach.
» Tries to make sure problem doesn’t happen.
» Doesn’t take current network state into account.

Closed loop solutions:
– Feedback loop.
303
Closed Loop Solutions

3 components:
– Monitoring.
– Feedback generation.
– Operation adjustment.

Monitoring metrics:
– Packet loss.
– Average queue length.
– Number of retransmitted packets.
– Average packet delay.
304
Feedback

Send information about the problem once it’s
detected.
– Router that detects problem sends packet to traffic
source(s).
– Special-purpose bit in every packet that router sets
when it detects congestion above certain level to
warn neighbors.
– Special probe messages to detect congested areas
so they can be avoided.

Stability: avoid oscillations.
305
Congestion Control Taxonomy

Open loop algorithms:
– Act at source.
– Act at destination.

Closed loop algorithms:
– Explicit feedback.
– Implicit feedback.
306
Open Loop Approaches

Traffic Shaping
– Avoid traffic burstiness by forcing packets to be
transmitted at more predictable rate.
– Used in ATM networks.
– Regulates average transmission rate.
– In contrast to sliding window protocols which
regulate amount of data in transit.
– Service agreement between user and carrier.
» Important to real-time traffic such as audio, video.
307
Leaky Bucket 1
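[Figure: a host sends an unregulated flow into a leaky bucket at the network interface; a regulated flow leaves the bucket toward the network.]
1. No matter the rate at which water enters the
bucket, the outflow is constant.
2. Once the bucket is full, water spills and is
lost.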
308
Leaky Bucket 2
Equivalent to a single-server queuing
system with constant service time.
 Same size packets (e.g., ATM cells): use
packets as unit.
 Variable-sized packets: use number of bytes
per clock tick.

309
Token Bucket
More flexible.
 Allows packets to go out as fast as they come
in provided there are enough tokens.
 Leaky bucket holds tokens generated every T
sec.
 Allows hosts to save up for later.

– Hosts can accumulate up to n tokens, where n is
bucket size.
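A token bucket can be sketched with two operations: a clock tick that adds a token, and a send attempt that spends one. The class and method names are assumptions for illustration, and the sketch charges one token per packet (the equal-sized-packet case).

```python
class TokenBucket:
    """Token bucket holding at most `capacity` tokens (n in the slide)."""

    def __init__(self, capacity, tokens=0):
        self.capacity = capacity
        self.tokens = tokens

    def tick(self):
        """Called once every T seconds: add a token; extra tokens are thrown away."""
        self.tokens = min(self.capacity, self.tokens + 1)

    def try_send(self):
        """Send one packet if a token is available; otherwise the packet must wait."""
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True

bucket = TokenBucket(capacity=3)
for _ in range(5):          # host stays idle for 5 ticks; only 3 tokens are saved
    bucket.tick()
burst = [bucket.try_send() for _ in range(4)]   # a burst of 4 packets arrives
```

The idle host saves up three tokens, so the first three packets of the burst go out immediately and the fourth waits for the next tick.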
310
Leaky and Token Bucket
Token bucket throws away tokens but never
packets.
 Can be used between host and network and
between routers.
 Token bucket can still produce bursts.

– Insert leaky bucket after token bucket.
311
Flow Specifications

Way for user/application to specify traffic patterns
and desired quality of service.
– Before connection established or data is sent, source
provides flow spec to network.
– Network can accept, reject, or counter-offer.

Example: flow spec language by Partridge (1992).
– Traffic spec: maximum packet size, maximum
transmission rate.
– Service desired: maximum acceptable loss rate, maximum
delay and delay variation.
312
Closed Loop Approaches

Virtual circuit networks:
– Admission control:
» Once congestion is detected, no more virtual circuits
are set up until problem is gone.
– Avoid congested areas.
– Resource reservation based on service
agreement.
» Resources include space (table, buffer) in routers,
link bandwidth.
313
Choke Packets 1



Closed loop approach.
Can be used in both VC and DG networks.
Main idea:
– Routers detect congestion.
» Example: each router measures utilization of its output lines; if it goes
above threshold, the line enters a congestion warning state.
» New packet using line in warning state will be forwarded normally
(tagged so it generates no more choke packets), but a choke packet naming
the destination is sent back to the source.
314
Choke Packets 2

Hosts receiving choke packets:
– Decrease their traffic to the problematic
destination.
– Ignore other choke packets for the same
destination for some period of time.
– After that period, if more choke packets for same
destination, reduce traffic even more, etc.

Reducing traffic:
– Adjust window size, leaky bucket rate, etc.
315
Hop-by-Hop Choke Packets
Goal is to provide quick relief at congestion
point.
 Choke packet takes effect at every hop it
passes through.
 Intermediate nodes reduce traffic on
corresponding output line.

– More buffers since input traffic stays the same
until choke packet reaches previous hop.
316
Fair Queuing

Problem with choke packets:
– Router sends signal, but it’s up to host to react.
– Well-behaved hosts lose!

Fair queuing makes compliance attractive.
– Routers have multiple queues per output line.
– One queue per source.
– Router scans queues in round robin, transmitting
first packet on next queue.
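The round-robin scan can be sketched as follows. The function name and the dictionary of per-source queues are assumptions; the sketch serves one packet per queue per pass and ignores packet sizes.

```python
from collections import deque

def round_robin(queues):
    """Drain per-source queues in round-robin order.

    queues maps each source to a deque of its waiting packets.
    Returns the packets in the order they would be transmitted.
    """
    order = []
    while any(queues.values()):            # stop when every queue is empty
        for source, queue in queues.items():
            if queue:
                order.append(queue.popleft())   # first packet on the next queue
    return order

# Hypothetical output line shared by a heavy source A and a light source B.
queues = {
    "A": deque(["a1", "a2", "a3"]),
    "B": deque(["b1"]),
}
order = round_robin(queues)
```

Source B's single packet goes out second even though A queued three packets, so flooding the router does not buy A more than its share.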
317
Weighted Fair Queuing
Enable different priorities.
 Different queues may have different
priorities.
 Handle various types of traffic differently.

318
Load Shedding 1


If everything else fails, routers simply drop packets.
Choosing packets to drop:
– Randomly.
– Some packets are worth more than others.
» Application dependent


Data distribution: old packets more important than new.
Real-time applications: new more important than old.
– Applications need to mark packets with their priority
319
Load Shedding 2
Marking packets requires special bits in
packet header.
 ATM cells have 1 bit in the header reserved
for this purpose.
 When routers sense some congestion build
up, better to start dropping packets early
rather than waiting until they become
completely swamped.

320
Internetworking

Interconnection of 2 or more networks
forming an internetwork, or internet.
– LANs, MANs, and WANs.

Different networks mean different protocols.
– TCP/IP, IBM’s SNA, DEC’s DECnet, ATM,
Novell and AppleTalk (for LANs).
– Also, satellite and cellular networks.
321
Example Internet
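[Figure: example internet in which 802.3, 802.4, and 802.5 LANs, an X.25 WAN, and an SNA WAN are interconnected by devices labeled R and B, showing LAN-LAN, LAN-WAN, and LAN-WAN-LAN connections.]
Gateway: device connecting 2 or
more different networks.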
322
Gateways





Repeaters: operate at physical layer (bits);
amplify/regenerate signal.
Bridges: store-and-forward frames; data link layer
devices.
Routers: operate at network layer.
Transport gateways: connect networks at the
transport layer.
Application gateways: connect 2 parts of an
application at application layer.
323
Half-Gateways
Gateway is split in two: each half owned
and operated by one of the network
providers.
 Common protocol between the 2 halves.

[Figure: networks N1 and N2 joined by two half-gateways.]
324
How do networks differ?










Service offered: connection-oriented versus connection-less.
Protocols: IP, IPX, AppleTalk, DECnet.
Addressing: flat (802) versus hierarchical (IP).
Maximum packet size.
Quality of service.
Error control: reliable, ordered, unordered delivery.
Flow control: sliding window versus rate-based.
Congestion control: leaky bucket, choke packets.
Security: privacy rules, encryption.
Parameters: different timeouts.
325
Types of Internetworks

Connection-oriented concatenation of VC
subnets.
– VC between source and router closest to destination
network.
– Router builds VC to gateway to other subnet.
– Gateway keeps state about that VC.
– Builds VC to router in the next subnet, etc.

Every packet traverses same path.
– Ordered delivery.
– Routers convert between packet formats.
326
Connectionless Internetworking

Datagram model.
– Different packets may take different routes.
– Separate routing decision for each packet.
– No ordered delivery guarantees.
328
Datagram versus VC Internets

VC:
– Plus’s: resources reserved in advance, ordered
delivery, short headers.
– Minus’s: vulnerability to failures, less adaptive,
hard if involving datagram subnet.

Datagram:
– Plus’s: more robust and adaptive, can be used over
datagram subnets (many LANs, mobile networks).
– Minus’s: Longer headers, unordered delivery.
329
Tunneling

Interconnecting through a “foreign” subnet.
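[Figure: a tunnel across a WAN between two gateways (G) that connect Ethernet 1 and Ethernet 2. On each Ethernet the IP packet travels in an Ethernet frame; across the WAN the IP packet is carried inside the payload field of a WAN packet.]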
330
Internetwork Routing 1

2-level hierarchy:
– Routing within each network: interior gateway protocol.
– Routing between networks: exterior gateway protocol.


Within each network, different routing algorithms
can be used.
Each network is autonomously managed and
independent of others: autonomous system (AS).
331
Internetwork Routing 2
Typically, packet starts in its LAN.
Gateway receives it (broadcast on LAN to
“unknown” destination).
 Gateway sends packet to gateway on the
destination network using its routing table.
If it can use the packet’s native protocol,
sends packet directly. Otherwise, tunnels it.

332
Fragmentation 1

Network-specific maximum packet size.
– Width of TDM slot.
– OS buffer limitations.
– Protocol (number of bits in packet length field).

Maximum payloads range from 48 bytes
(ATM cells) to 64Kbytes (IP packets).
333
Fragmentation 2




What happens when large packet wants to travel
through network with smaller maximum packet size?
Fragmentation.
Gateways break packets into fragments; each sent as
separate packet.
Gateway on the other side has to reassemble
fragments into original packet.
2 kinds of fragmentation: transparent and nontransparent.
334
Transparent Fragmentation


Small-packet network transparent to other subsequent
networks.
Fragments of a packet addressed to the same exit
gateway, where packet is reassembled.
– OK for concatenated VC internetworking.


Subsequent networks are not aware fragmentation
occurred.
ATM networks (through special hardware) provide
transparent fragmentation: segmentation.
335
Problems with Transparent
Fragmentation

Exit gateway must know when it received all
the pieces.
– Fragment counter or “end of packet” bit.
Some performance penalty from requiring all
fragments to go through same gateway.
 May have to repeatedly fragment and
reassemble through series of small-packet
networks.

336
Non-Transparent Fragmentation

Only reassemble at destination host.
– Each fragment becomes a separate packet.
– Thus routed independently.

Problems:
– Hosts must reassemble.
– Every fragment must carry header until it
reaches destination host.
337
Keeping Track of Fragments 1
Fragments must be numbered so that original
data stream can be reconstructed.
 Tree-structured numbering scheme:

– Packet 0 generates fragments 0.0, 0.1, 0.2, …
– If these fragments need to be fragmented later on, then
0.0.0, 0.0.1, …, 0.1.0, 0.1.1, …
– But, too much overhead in terms of number of fields
needed.
– Also, if fragments are lost, retransmissions can take
alternate routes and get fragmented differently.
338
Keeping Track of Fragments 2
Another way is to define elementary fragment
size that can pass through every network.
 When packet fragmented, all pieces equal to
elementary fragment size, except last one
(may be smaller).
 Packet may contain several fragments.

339
Keeping Track of Fragments 3

Header contains packet number, number of first
fragment in the packet, and last-fragment bit.
Header fields, left to right: packet number, number of first fragment, last-fragment bit. Each data cell is 1 byte.

(a) Original packet with 10 data bytes:
| 27 | 0 | 1 | A B C D E F G H I J |

(b) Fragments after passing through network with maximum packet size = 8 bytes:
| 27 | 0 | 0 | A B C D E F G H |
| 27 | 8 | 1 | I J |
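The numbering scheme in this example can be sketched in code. The function name and tuple layout are assumptions, the elementary fragment is taken to be 1 byte as in the figure, and `max_payload` is assumed to count data bytes only (the header is not included in the limit).

```python
def fragment(packet_number, first_fragment, last_bit, data, max_payload):
    """Split one packet into pieces carrying at most max_payload data bytes.

    Each piece is (packet number, number of first fragment, last-fragment bit, data).
    Because offsets are counted in elementary fragments, a piece can be
    fragmented again later without renumbering.
    """
    pieces = []
    for offset in range(0, len(data), max_payload):
        chunk = data[offset:offset + max_payload]
        # Only the final piece of a packet that itself ended the original keeps the bit.
        is_last = bool(last_bit) and offset + max_payload >= len(data)
        pieces.append((packet_number, first_fragment + offset, int(is_last), chunk))
    return pieces

pieces = fragment(27, 0, 1, "ABCDEFGHIJ", 8)
for piece in pieces:
    print(piece)
```

Applied to packet 27 with an 8-byte limit, this yields the two fragments shown in part (b).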
340
Firewalls 1

Analogy: ditch around medieval castles.
– To enter or exit castle, must pass over single bridge.


Firewalls force traffic to and from company through
single point.
Firewalls typically consist of:
– Packet filters (one for incoming, other for outgoing
packets).
– Application gateway.
341
Firewalls 2

Packet filter: router
equipped with capability of
inspecting packets.
– Packets that meet criteria are
forwarded; others discarded.

Application gateways
operate at application level;
e.g., mail gateway.
[Figure: firewall with an application gateway placed between the corporate network and the outside world.]
342
The Internet Network Layer
The Internet as a collection of networks or
autonomous systems (ASs).
 Hierarchical structure.

[Figure: regional and national networks, a US backbone, a European backbone, and transcontinental links.]
343
IP (Internet Protocol)
Glues Internet together.
 Common network-layer protocol spoken by all
Internet participating networks.
 Best effort datagram service:

– No reliability guarantees.
– No ordering guarantees.
344
IP
Transport layer breaks data streams into
datagrams; datagrams are transmitted over the
Internet, possibly being fragmented along the way.
 When all packet fragments arrive at
destination, reassembled by network layer
and delivered to transport layer at
destination host.

345
IP Versions

IPv4: IP version 4.
– Current, predominant version.
– 32-bit long addresses.

IPv6: IP version 6 (aka, IPng).
– Evolution of IPv4.
– Longer addresses (16-byte long).
346
IP Datagram Format
IP datagram consists of header and data (or
payload).
 Header:

– 20-byte fixed (mandatory) part.
– Variable length optional part.
347
IP Header
Each row is 32 bits:
| Version | Header length | Type of service | Total length            |
| Identification                            | U D M | Fragment offset |
| TTL               | Protocol              | Header checksum         |
| Source address                                                      |
| Destination address                                                 |
| Options                                                             |
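The fixed 20-byte part of this layout can be unpacked with Python's standard `struct` module. This is a sketch for illustration: the function name, the returned dictionary keys, and the sample header values are assumptions, and the checksum is left at 0 rather than computed.

```python
import struct

def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header into a dictionary."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": ver_ihl & 0x0F,          # in 32-bit words
        "type_of_service": tos,
        "total_length": total_length,             # header + data, in bytes
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),    # D bit
        "more_fragments": bool(flags_frag & 0x2000),   # M bit
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        "source": ".".join(str(b) for b in src),
        "destination": ".".join(str(b) for b in dst),
    }

# Hypothetical header: version 4, 5-word header, D bit set, TTL 64, protocol 6.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1234, 0x4000, 64, 6, 0,
                     bytes([128, 6, 240, 3]), bytes([130, 50, 15, 6]))
fields = parse_ipv4_header(sample)
```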
348
IP Header Fields 1




Version: which IP version datagram uses.
Header length: how long (in 32-bit words) is header;
minimum=5; maximum=15 (options=40 bytes).
Type of service: precedence (priority), 3 flags (delay,
throughput, reliability). In practice, routers ignore
type of service.
Total length: length of total datagram, i.e., header +
data (max = 64Kbytes).
349
IP Header Fields 2
Identification: which datagram fragment
belongs to.
 U: unused bit.
 D: don’t fragment.
 M: more fragments.
 Fragment offset: position of fragment in
datagram.
 TTL: datagram lifetime.

350
IP Header Fields 3
Protocol: number of the transport protocol
that generated the datagram.
 Header checksum: verifies header integrity;
computed at each hop.
 Source and destination address: IP
addresses of source and destination.
 Options: way of extending the protocol.

351
Addressing

Required for packet delivery.
– Each network may use different addressing
scheme.
– Addresses must be unique.
Flat addresses: physical addresses (e.g.,
Ethernet address).
 Hierarchical addresses: use hierarchy
scheme like postal addresses (e.g., IP).

352
Address Types
Unicast: uniquely distinguishes a single
node.
 Multicast: shared by a group of nodes.
 Broadcast: shared by all nodes.

353
IP Addresses
Every host and router on the Internet must
have an IP address.
 2-level hierarchy:

– Network number.
– Host number.

Notations:
– Binary: 10000000 00000110 11110000 00000011
– Dotted decimal: 128.6.240.3
354
IP Address Formats 1

4 different classes:
Class  Network bits                         Nets  Hosts/net
A      0XXXXXXX                             128   16M
B      10XXXXXX XXXXXXXX                    16K   64K
C      110XXXXX XXXXXXXX XXXXXXXX           2M    256
D      1110XXXX XXXXXXXX XXXXXXXX XXXXXXXX  (multicast)
In classes A to C, the remaining bits of the 32-bit address form the host number.
355
IP Address Formats 2
Class A: 1~127.
 Class B: 128~191.
 Class C: 192~223.
 Class D: 224~239.

356
Multi-addresses

A router usually has more than one IP
address.
[Figure: a router attached to networks 236.240.128.0, 129.98.0.0, and 80.0.0.0, with interface addresses 236.240.128.3, 129.98.95.1, and 80.0.0.8.]

Multi-homed host: host with multiple
network interfaces each of which has
different IP address.
357
Management and Scalability 1
Network numbers assigned by single
authority: NIC (network information
center).
 All hosts in a network must have same
network number.
 What if networks grow?

358
Management and Scalability 2

Example: company starts with 1 class C
LAN, thus can connect up to 256 hosts.
– It might grow to more than 256 hosts.
– It might get more LANs.
– For every new LAN, need new network number
from NIC.
– Moving machines between LANs needs address
change.
359
Subnetting 1

Split address space into several “internal”
subnets.
– Still act like single network to outside world.
360
Subnetting 2

Routing: hierarchical.
– (network, -) entries: hosts on distant networks.
– (this network, host) entries: local hosts.
– Routers only need to keep track of other networks and
local hosts.

With subnetting:
– (network, -) entries: hosts on distant networks.
– (this network, subnet, -).
– (this network, this subnet, host).
– Adds extra hierarchical level.
361
Subnet Mask

Used to compute the subnet number; i.e., gets
rid of the host number.
– Facilitates routing table look-up.
– IP address AND subnet mask = subnet #

Example:
Address:     10XXXXXX XXXXXXXX SSSSSSHH HHHHHHHH
Subnet mask: 11111111 11111111 11111100 00000000
Ex: 130.50.15.6 AND subnet mask = 130.50.12.0
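The AND operation in this example can be checked with Python's standard `ipaddress` module. The function name is an assumption; the mask 255.255.252.0 is the dotted-decimal form of the 22-bit mask shown above.

```python
import ipaddress

def subnet_number(address, mask):
    """Return (IP address AND subnet mask) in dotted-decimal notation."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr_bits & mask_bits))

subnet = subnet_number("130.50.15.6", "255.255.252.0")
print(subnet)
```

The third byte is 15 (00001111); ANDing it with 11111100 leaves 00001100, or 12, which gives 130.50.12.0.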
362
Internet Control Protocols
IP carries data.
 There are other network layer protocols that
carry control information.
 Example: ICMP

363
ICMP
Internet Control Message Protocol.
 Report specific events.

– Generated by routers.
– Encapsulated in IP packets.
364
ICMP Messages
Message type              Description
Destination unreachable   Packet couldn’t be delivered
Time exceeded             TTL field hit 0
Parameter problem         Invalid header field
Source quench             Choke packets
Redirect                  Route problem
Echo request              Check if destination is up
Echo reply                Destination responds
Timestamp request         Same as echo request + TS
Timestamp reply           Same as echo reply + TS
365
Mapping IP to DLL Address
Internet applications refer to hosts by their IP
addresses; once packet gets to destination
LAN, node needs to figure out the destination’s
data link layer (DLL) address.
 One solution is to have configuration file.

– Hard to maintain/update.

Address Resolution Protocol (ARP):
– Run by every node to map IP to DLL address
(RFC 826).
366
ARP

Advantage:
– Easy to administer, less human intervention.
– Example: 2 hosts on the same Ethernet want to
communicate.
» Host 1 must figure out host 2’s Ethernet address.
» Host 1 broadcasts ARP packet on Ethernet asking for
the Ethernet address of host 2.
» Host 2 receives the ARP request, and replies with its
Ethernet address.
367
ARP Optimizations

Caching of ARP replies.
– Entries may have large TTLs.
When sending ARP request, piggyback its
own IP-DLL address mapping.
 Every machine broadcasts its mapping at
boot time.

– No response is expected.
– Other machines cache that information.
368
Proxy ARP

What if host 1 wants to send data to host 3
on a different LAN?
– Router connecting the 2 LANs can be
configured to respond to ARP requests for the
networks it interconnects: proxy arp.
– Another solution is for host 1 to recognize host
3 is on remote network and use default LAN
address that handles all remote traffic; that
could be the router’s Ethernet address.
369
RARP
Reverse Address Resolution Protocol.
 Given LAN address, what’s the IP address?
 Usually for booting diskless workstation.

– Gets the OS image from remote file server.
– Same image for all machines.
– Machine broadcasts its LAN address.
– Remote RARP server responds with machine’s IP
address.
370
BOOTP
RARP broadcasts are not forwarded by
routers.
 Need RARP server on every network.
 BOOTP uses UDP messages that are
forwarded by routers.

– Also provides additional information such as IP
address of file server holding OS image, subnet
mask, etc.
371
Internet Routing

IGPs and EGPs
– IGPs: routing within ASs.
– EGPs: routing between ASs.
372
IGPs

Original Internet IGP was RIP.
– Distance vector.
– OK for small ASs but not efficient as ASs got larger.

New IGP: OSPF.
– Open Shortest Path First.
– Became standard in 1990.
– Link state algorithm.
– RIP is still running but OSPF is taking over.
373
OSPF 1

Design requirements:
– Open implementation.
– Support for various distance metrics: delay, hops, etc.
– Dynamic: automatically adapt to topology changes.
– QoS Routing: real-time versus other traffic using IP’s type
of service field.
– Load balancing across multiple lines.
– Security and tunneling.
374
OSPF 2
Abstracts collection of networks, routers and
lines into a directed graph where edges are
assigned a cost proportional to the routing
metric.
 It then computes shortest path.
 Hierarchical routing within ASs.

– Areas: collection of contiguous networks.
– Area 0: AS backbone; all areas connected to it.
375
OSPF 3

Type of service routing:
– Uses different graphs labeled with different
metrics.

Routing updates:
– Adjacent routers exchange routing information.
– Adjacent routers are on different LANs.
– Reliable link state updates with sequence #’s.
376
EGPs
Routing protocol between ASs.
 Take policy into account.

– An AS may not be willing to carry traffic
originating and destined to foreign ASs.
– Example: phone companies are willing to carry
traffic for their customers but not for others.
377
Routing Policy Examples
No transit traffic through certain ASs.
 Traffic source restricts ASs through which
its traffic crosses.
 Same for destination.

378
BGP 1
Border Gateway Protocol.
 Policies are manually configured into BGP
routers.
 BGP abstracts networks as a collection of
BGP routers and their links.
 2 BGP routers are connected if they share a
common network.
 BGP routers communicate reliably using TCP.

379
BGP 2

3 types of networks:
– Stub networks: have a single connection in the
BGP graph; cannot carry transit traffic.
– Multi-connected networks: have multiple
connections but refuse to carry transit traffic.
– Transit networks: agree to carry transit (3rd.
party) traffic possibly with some restriction;
e.g., backbones.
380
BGP 3
BGP is a distance vector protocol.
 Routing table entries keep whole path to
destination + distance.
 BGP routers can discard the paths containing
itself: avoiding loops and counting to infinity.
 Routers compute distance associated to a route
taking policy into account.

– If policy is violated, distance = infinity.
381
Internet Multicasting

IP supports multicasting using class D
addresses.
– Each class D address identifies a group of
hosts.
– 28 bits define over 250 million groups.

Best-effort delivery.
382
Group Membership
Hosts (single or multiple processes) may join
and leave group.
 Special, multicast routers perform multicast
routing and packet forwarding.

– Hosts belonging to multicast groups periodically
send messages to the closest multicast router.
– Multicast routers and hosts use IGMP (Internet
Group Management Protocol) to exchange
membership information.
383
IP Multicast Routing
Use spanning trees.
 Modified distance vector protocol using
unicast routing information.

– Build one spanning tree per source, per group.
– Or, one shared spanning tree per group.
– Use pruning to remove parts of the tree that don’t
have any multicast group members.
– Use tunneling to cross regions that are not
multicast capable.
384
Mobile IP 1

Support for mobile users.
– “Last hop” mobility.

Problem: IP addressing scheme.
– Class+network number+host number.
– If host moves and attaches itself to foreign
network, packets destined to it will still go to its
home network.
– Assigning hosts new IP address?
» Too much hassle.
385
Mobile IP 2

Solution:
– Home agent: runs at the home network.
– Foreign agent: runs at foreign network.
– When mobile host connects itself to foreign
network, registers with foreign network’s
foreign agent.
– Foreign agent assigns host care-of address, and
informs home agent.
386
Mobile IP 3
Sending packets: mobile host uses its care-of
address.
 Receiving packets:

– When packet arrives at home network, router that gets it
sends ARP request for that IP address.
– Home agent replies with its own Ethernet address. It gets
the packet, and tunnels it to foreign agent. Foreign agent
delivers packet to mobile host.
– Home agent sends care-of address to sender, so future
packets are sent directly to foreign network.
387
Mobile IP 4

Locating foreign agents:
– Foreign agents periodically broadcast their address and
service provided (e.g., home, foreign, or both).
– Mobile host can announce its presence and wait for
response from foreign agent.

Unregistration:
– If host leaves without unregistering, its registration expires
after some time.

Security:
– Authentication issues.
388
Scaling IP Addresses 1

Exponential growth of the Internet!
– 32-bit address fields are getting too small.
– Early predictions: it’d take decades to achieve
100,000 network mark.
– 100,000th. network was connected in 1996!
– Internet is rapidly running out of IP addresses!
– Waste due to hierarchical address.
389
IP Address Formats

4 different classes (network part shown; the remaining bits are the host part):
Class A: 0XXXXXXX; 128 nets, 16M hosts/net.
Class B: 10XXXXXX XXXXXXXX; 16K nets, 64K hosts/net.
Class C: 110XXXXX XXXXXXXX XXXXXXXX; 2M nets, 256 hosts/net.
Class D: 1110XXXX XXXXXXXX XXXXXXXX XXXXXXXX; multicast.
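As a rough illustration of the class prefixes, the sketch below reads the leading bits of the first octet. The function name address_class and the "reserved" label for the 1111 prefix are assumptions made for this example, not part of the slides.

```python
def address_class(dotted: str) -> str:
    """Return the classful category from the leading bits of the first octet."""
    first_octet = int(dotted.split(".")[0])
    if first_octet & 0b10000000 == 0:            # 0xxxxxxx
        return "A"
    if first_octet & 0b11000000 == 0b10000000:   # 10xxxxxx
        return "B"
    if first_octet & 0b11100000 == 0b11000000:   # 110xxxxx
        return "C"
    if first_octet & 0b11110000 == 0b11100000:   # 1110xxxx (multicast)
        return "D"
    return "reserved"                            # 1111xxxx, not covered on the slide
```

For instance, address_class("194.24.17.4") falls in class C because 194 starts with the bits 110.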
390
Scaling IP Addresses 2
Class A addresses: 16M hosts is usually too
much.
 Class C addresses: 254 hosts is usually too
small.
 Class B addresses provide room for 64K hosts.

– Organizations usually request class B addresses
but more than 50% of them only have up to 50
hosts!
391
Scaling IP Addresses 3


Class C addresses should have 10-bit host
numbers instead of only 8-bit numbers.
– Would allow for 1022 hosts instead of just 254.
– More Class C networks: network number can
grow up to 0.5M.
But, could result in routing table explosion.
– Routers will have to know about many more
networks.
392
CIDR 1
Classless Interdomain Routing: RFC 1519.
 No longer uses classes A, B, and C addresses.
 Allocate remaining Class C addresses in
variable-sized blocks.

– Example: if an organization needs 2000 addresses,
it’s given a block of 2048 addresses, or 8
contiguous class C networks and not a full class B
address.
393
CIDR 2


New allocation rules for class C addresses.
World partitioned into 4 zones and each one was
given portion of class C address space (192~223).
– 194.0.0.0~195.255.255.255: Europe.
– 198.0.0.0~199.255.255.255: North America.
– 200.0.0.0~201.255.255.255: Central and South America.
– 202.0.0.0~203.255.255.255: Asia and Pacific.
394
CIDR 3
Each region is allocated ~ 32M class C
addresses.
 Addresses 204.0.0.0~223.255.255.255
reserved for future use.
 Advantages:

– Less waste.
– Routers can keep only one RT entry per region,
i.e., 32M addresses compressed into one.
395
CIDR 4
Once packet gets to its destination region,
need more detailed routing information.
 One possibility is to keep 131,072 (32M/2^8)
entries for all “local” networks.

– Explosion problem.

Instead, use of 32-bit masks: only need to
keep start address of block.
396
CIDR - Example 1



Cambridge University has 2048 addresses from
194.24.0.0~194.24.7.255 and mask 255.255.248.0.
Oxford University: 4096 addresses
194.24.16.0~194.24.31.255 with mask
255.255.240.0.
U of Edinburgh: 1024 addresses
194.24.8.0~194.24.11.255 and mask 255.255.252.0.
397
CIDR - Example 2

Routing tables in Europe contain base address and mask:
Address                               Mask
11000010 00011000 00000000 00000000   11111111 11111111 11111000 00000000   (Cambridge)
11000010 00011000 00010000 00000000   11111111 11111111 11110000 00000000   (Oxford)
11000010 00011000 00001000 00000000   11111111 11111111 11111100 00000000   (Edinburgh)
When packet to 194.24.17.4 (11000010 00011000 00010001 00000100)
arrives, it’s ANDed with Cambridge U’s mask yielding 11000010
00011000 00010000 00000000 which does not match Cambridge U’s base.
When it’s ANDed with Oxford’s mask, it matches Oxford’s base, so
packet sent to Oxford’s router.
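The AND-and-compare step can be sketched in a few lines of Python. The table entries come from the example; the names ROUTING_TABLE, to_int and lookup are illustrative assumptions. Returning the first match is enough here only because the three blocks do not overlap. Routers generally have to prefer the longest matching prefix.

```python
import ipaddress

# (name, base address, mask) entries from the three-university example.
ROUTING_TABLE = [
    ("Cambridge", "194.24.0.0", "255.255.248.0"),
    ("Oxford", "194.24.16.0", "255.255.240.0"),
    ("Edinburgh", "194.24.8.0", "255.255.252.0"),
]

def to_int(dotted):
    """Convert a dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))

def lookup(destination):
    """AND the destination with each mask and compare with the base address."""
    dest = to_int(destination)
    for name, base, mask in ROUTING_TABLE:
        if dest & to_int(mask) == to_int(base):
            return name
    return None  # no entry matches
```

With this table, lookup("194.24.17.4") fails against Cambridge's entry and matches Oxford's, as in the walk-through above.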
398
IP Evolution
CIDR bought IPv4 a few more years.
 Because of its addressing limitations and to
accommodate next-generation Internet
applications, IP must evolve.
 In 1990, IETF started work on IP next
generation, or IPng.

– Several proposals were considered.
– SIPP (Simple Internet Protocol Plus) was selected
and became IPv6.
399
IPv6 1
RFCs 1883~1887.
 Features:

– Longer addresses (16 bytes versus only 4 in IPv4).
– Header simplification (only 7 fields versus 13
fields in IPv4): faster processing by routers.
– Better option support since fields that were
previously required are now optional.
– Improved security and QoS support.
400
IPv6 Header
Each row is 32 bits wide:
Version | Priority | Flow label
Payload length | Next header | Hop limit
Source address (16 bytes)
Destination address (16 bytes)
401
IPv6 Header Fields 1

Version = 6.
– During transition period, routers will examine this field to
decide what kind of packet it is.

Priority: handling different kinds of traffic.
– 0~7: data that can be flow controlled, e.g., data distribution
services.
– 8~15: real-time traffic (e.g., audio, video)
– Within each group, lower values have lower priority than
higher values (e.g., 1 for news, 4 for ftp and 6 for telnet)
402
IPv6 Header Fields 2

Flow label (experimental): allows source and
destination to set up pseudo-connection.
– Try to have some kind of service guarantees.
– Example: assign flow number to a stream of
packets that need reserved bandwidth.
– Flow number: src+dst+flow #.

Payload length: length of data.
– Different from IPv4 which specified total length
of datagram.
403
IPv6 Header Fields 3
Next header: specifies what is present in the
options field (extension headers).
 Hop limit: equivalent to IPv4’s TTL.
 Source and destination addresses:

– 16-byte addresses (fixed length).
– Address space is divided by using prefixes.
404
IPv6 versus IPv4



No more IHL (header length); why?
No more protocol field: next header field.
No more fragmentation-related fields.
– All IPv6 hosts and routers must support 576-byte packets.
– Fragmentation is less likely to occur.
– Router sends error messages back to source when packet is
too big so source breaks it down.

No more checksum: rely on more reliable networks
and DLL and transport checksums.
405
IPv6 Addressing 1

Separate prefixes for provider-based and geographic-based addresses.
– Ability to accommodate 2 ways of address assignment:
» Addresses allocated to ISP companies.
» Prefix 010.
» Each ISP assigned portion of address space.
» First 5 bits following prefix define registry where provider is registered.
» Remaining 15 bytes are allocated by each provider.
» Example: 3-byte provider number.
406
IPv6 Addressing 2

Geographic-based addresses:
– Prefix 100.
– Same model as current Internet.

Multicast addresses:
– Prefix 11111111.
– 4-bit flag + 4-bit scope fields + 112-bit group id.
– Flags: 1 bit defines whether group is permanent or
not.
– Scope: limit reach of multicast packet.
407
IPv6 Address Notation

8 groups of 4 hexadecimal digits separated
by colons.
– Example:
8000:0000:0000:0000:0123:4567:89AB:CDEF
– Optimizations:
» Leading zeros within group can be omitted.
» Groups of zeros can be replaced by pair of colons.

8000::123:4567:89AB:CDEF.
» IPv4 addresses: ::192.31.20.46.
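The two shortening rules can be checked with Python's standard ipaddress module. This is only a sketch, and the variable names are chosen for illustration. Note that the module prints hexadecimal digits in lowercase.

```python
import ipaddress

full = "8000:0000:0000:0000:0123:4567:89AB:CDEF"
addr = ipaddress.IPv6Address(full)

# Leading zeros dropped and the run of zero groups replaced by "::".
short = addr.compressed

# The fully written-out form, all 8 groups of 4 digits.
long_again = addr.exploded

# An IPv4 address written in the "::a.b.c.d" form occupies the low 32 bits.
embedded = ipaddress.IPv6Address("::192.31.20.46")
low_32_bits = int(embedded) & 0xFFFFFFFF
```

Here short holds the same address as the shortened example on the slide, and low_32_bits equals the integer value of 192.31.20.46.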
408
Extension Headers 1
Equivalent to IPv4 options.
 6 types of extension headers:

Hop-by-hop options: misc. info for routers
Routing: full or partial route included
Fragmentation: management of fragments
Authentication: verification of source’s id
Encrypted payload: information about encryption
Destination options: information for destination
409
Extension Headers 2


Fixed format and variable-sized headers.
Variable-sized headers:
– (type, length, value).
– Type: 1 byte specifying which option this is.
» First 2 bits tell routers that do not recognize the option what to do: skip option,
discard packet, discard packet and send ICMP message, or discard packet
without sending ICMP message for multicast addresses.
– Length: how long value field (0~255 bytes).
– Value: information.
410
Hop-by-Hop Header

Convey information all routers along path
must examine.
– Jumbograms: datagrams > 64KBytes.
Next Header
0
194
0
Jumbogram payload length
– Next header: what option this is.
– Length of hop-by-hop header excluding the first 8
(mandatory) bytes.
– Defines option, in this case datagram size.
411
Routing Header

Lists one or more routers that must be
visited on the way to the destination.
– Strict source routing: full path is supplied.
– Loose source routing: only selected routers are
listed.
412
Fragment Header

Allows source to fragment datagram.
– In IPv6, routers are not allowed to fragment.
– If a router receives a packet that is too big, it
discards it and sends back an ICMP message to
source.
– Source uses this option to fragment packet, and
resend it.
– Contains datagram id, fragment number, and
“last fragment” bit.
413
Authentication Header
Supports verification of sender’s identity.
 Contains key number and
cryptographic checksum of the whole
datagram.
 Receiver uses key number to find secret
key. Computes checksum using secret key
and checks whether it matches with
received datagram.

414
Destination Options

Supports options that need only be
interpreted by destination host.
415
Network Layer in ATM
Networks

ATM layer: connection oriented.
– Provides connection-oriented service.
– Uses virtual circuits, or virtual channels.
– No ACKs.
» Intended for fiber networks.
» Intended for real-time traffic.
– Ordering guarantees.
416
ATM Networks

Virtual path: group of virtual circuits.
– When re-routed, all VCs are re-routed together.
417
ATM Cells
53 bytes!
 2 different formats:

– UNI: user-network interface.
» Between host and ATM network (carrier).
– NNI: network-network interface.
» Between 2 ATM switches (the ATM term for routers).
418
Cell Formats
UNI header (5 bytes):
GFC (4 bits) | VPI (8 bits) | VCI (16 bits) | PTI (3 bits) | CLP (1 bit) | HEC (8 bits)
NNI header (5 bytes):
VPI (12 bits) | VCI (16 bits) | PTI (3 bits) | CLP (1 bit) | HEC (8 bits)
GFC: General flow control
VPI: Virtual path id
VCI: Virtual channel id
PTI: Payload type
CLP: Cell loss priority
HEC: Header error control
419
Cell Fields 1

GFC: only in UNI cells.
– No e2e significance.
– First switch overwrites it.
– Not currently used.
VPI: specifies virtual path (up to 256 VPs).
 VCI: specifies virtual circuit (up to 64K
VCs).

420
Cell Fields 2

PTI: type of payload.
– Cell type defined by user, congestion info by
network.
Payload Type   Meaning
000            User data, no congestion, cell type 0
001            User data, no congestion, cell type 1
010            User data, congestion, cell type 0
011            User data, congestion, cell type 1
100            Control info between adjacent switches
101            Control info between src and dst
110            Resource management (ABR CC)
111            Reserved
421
Cell Field 3
CLP bit may be set by host to differentiate
high- from low-priority traffic when
choosing cell to discard if congestion.
 HEC: header checksum.


Payload: 48 bytes.
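To make the field boundaries concrete, here is a minimal sketch that packs and unpacks a 5-byte UNI header using the widths GFC 4, VPI 8, VCI 16, PTI 3, CLP 1 and HEC 8 bits. The function names are assumptions for this example, and the HEC byte is carried as-is rather than computed.

```python
def build_uni_header(gfc, vpi, vci, pti, clp, hec=0):
    """Pack the UNI fields into 5 bytes (HEC is passed in, not computed)."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pti << 1) | clp
    return word.to_bytes(4, "big") + bytes([hec])

def parse_uni_header(header: bytes) -> dict:
    """Split a 5-byte UNI header into its fields."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is exactly 5 bytes")
    word = int.from_bytes(header[:4], "big")  # first 32 bits
    return {
        "gfc": (word >> 28) & 0xF,     # 4 bits
        "vpi": (word >> 20) & 0xFF,    # 8 bits
        "vci": (word >> 4) & 0xFFFF,   # 16 bits
        "pti": (word >> 1) & 0x7,      # 3 bits
        "clp": word & 0x1,             # 1 bit
        "hec": header[4],              # 8 bits
    }
```

An NNI header would differ only in having no GFC, with the VPI widened to 12 bits.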
422
Connection Setup

Permanent and switched VCs.
– Permanent: always present (like leased lines).
– Switched: need to be established (like phone
calls).

How are switched VCs established?
– Separate protocol called Q.2931.
423
VC Setup
Source
Switch 1
Switch 2
Destination
Setup
Call processing
Setup
Call processing
Setup
Connect
Connect
Connect
Connect ack
Connect ack
Connect ack
424
VC Tear-down
Release
Release
Release
Release complete
Release complete
Release complete
425
Routing and Switching

Routing using VPs and VCs.
– Route on VPIs except at the final hop.
– Advantages:
» Once VP established, all VCs between src-dst can
follow the same path: no new routing decisions.
» Cell switching only needs to look at the VP (12 bits)
instead of VP (12 bits) + VC (16 bits).
» Easier to re-route whole group of VCs.
» Easier for carriers to offer private networks.
426
Network Layer in ATM
Networks
[Continuation]
427
Service Categories 1

Types of traffic carried by ATM networks
and types of services required by users.
– Constant-bit rate (CBR):
» No error or flow control.
» Constant-rate, synchronous bit transmission.
» Accommodate traffic carried by current telephone
system: T1 lines, voice-grade lines.
428
Service Categories 2

Variable bit rate (VBR):
– RT-VBR: variable bit rates and real-time
requirements.
» Example: interactive compressed video
(videoconferencing applications).
» Compression schemes: base frame+differences between
current and base frames: transmission rate varies over
time.
» Cell delay and cell delay variation must be controlled:
image quality.
» But occasional loss is tolerable.
429
Service Categories 3

Variable bit rate (VBR):
– NRT-VBR: services with variable bit rates and
non real-time requirements.
» Example: multimedia e-mail (stored in disk; eliminates
delay variation).
430
Service Categories 4

Available bit rate (ABR):
– Targets bursty traffic.
– Guarantees average demand and will try to
provide peak demand.
– Network provides feedback to sender: request
sender to slow down if congestion.
– If senders are well-behaved, low loss rate.
431
Service Categories 5

Unspecified bit rate (UBR):
– No guarantees: best effort.
– Suited to IP traffic.
– Potential applications: file transfer, e-mail,
news.
432
Quality of Service




Service offered by the network (carrier) to customer
(end user): service agreement.
Service agreement: offered traffic, offered service,
compliance requirements.
If customer and carrier don’t agree: VC will not be
set up.
Different requirements for each direction.
– E.g., VOD application: required bandwidth user->server
<> server->user.
433
Quality of Service Parameters 1
Parameter                        Acronym   Meaning
Peak cell rate                   PCR       Max. cell transmission rate
Sustained cell rate              SCR       Average cell rate
Minimum cell rate                MCR       Min. acceptable cell rate
Cell delay variation tolerance   CDVT      Max. acceptable cell jitter
Cell loss ratio                  CLR       Fraction of lost cells
Cell transfer delay              CTD       Time to deliver
Cell delay variation             CDV       Delivery delay variation
Cell error rate                  CER       Fraction of cells delivered in error
434
QoS Parameters 2
PCR, SCR, MCR, and CDVT: specified by
sender.
 CLR, CTD, and CDV describe network
conditions and are measured at receiver.

435
Traffic Policing
Checking whether each cell conforms to
service agreement parameters.
 2 parameters:

– Maximum allowed arrival rate (PCR).
» Or minimum inter-arrival time.
– Amount of acceptable variation (CDVT).

Enforcing service agreement:
– Non-conforming cells are dropped.
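One common way to check conformance against these two parameters is the generic cell rate algorithm in its virtual-scheduling form. The slides do not name an algorithm, so treat the sketch below as an assumption. The function name police and its arguments are illustrative: min_spacing is the minimum inter-arrival time (1/PCR) and cdvt is the tolerated earliness.

```python
def police(arrival_times, min_spacing, cdvt):
    """Return one True/False verdict per cell: True means conforming."""
    expected = None          # theoretical arrival time of the next cell
    verdicts = []
    for t in arrival_times:
        if expected is None:
            expected = t     # the first cell defines the schedule
        if t < expected - cdvt:
            # Cell arrived too early even allowing for the tolerance.
            verdicts.append(False)
        else:
            verdicts.append(True)
            # Schedule the next cell one spacing after this one.
            expected = max(t, expected) + min_spacing
    return verdicts
```

A cell that arrives slightly early (within cdvt) is still accepted, but the schedule is not moved forward for it, so a sender cannot gain bandwidth by being consistently early.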
436
Congestion Control

Admission control:
– Congestion avoidance strategy.
– New flow specifies offered traffic and expected
service.
– Before setting up VC, network checks whether
requested resources are available without affecting
other flows.
– If no routes satisfy request, call is rejected.
– Prevent starvation by dividing users into classes.
437
Resource Reservation
Resources can be reserved at call setup
time.
 Reserve peak bandwidth along each hop.
 Reserving peak versus average bandwidth.

438
Rate-Based Congestion Control 1
CBR and VBR: sender cannot slow down
due to real-time nature of traffic.
 UBR: extra cells are simply dropped.
 ABR: network can signal congestion asking
sender(s) to slow down.
 ACR: actual cell rate.

– For each sender.
– MCR < ACR < PCR
439
Rate-Based Congestion Control 2

Resource management (RM) cell:
– Transmitted after a certain number of data cells
traveling along same path.
– Carry the explicit rate (ER), which is rate at
which sender would currently like to transmit.
– Congested switches may reduce ER.
– When RM cell comes back, sender knows
acceptable rate and adjusts ACR accordingly.
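The RM-cell round trip can be pictured with a small sketch. Everything here is a simplification under stated assumptions: adjust_acr is a hypothetical name, each switch is modeled only by the highest rate it can currently support, and the result is clamped to the closed range between MCR and PCR.

```python
def adjust_acr(requested_rate, switch_limits, mcr, pcr):
    """Return the sender's new ACR after an RM cell traverses the path."""
    explicit_rate = requested_rate           # ER as written by the sender
    for limit in switch_limits:
        # A congested switch may lower ER, never raise it.
        explicit_rate = min(explicit_rate, limit)
    # The sender keeps its rate between the negotiated MCR and PCR.
    return max(mcr, min(explicit_rate, pcr))
```

For example, a sender asking for 100 cells/s across switches that can support 120, 60 and 80 ends up with an ACR of 60.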
440
The Transport Layer
441
The Transport Layer

End-to-end.
– Communication from source to destination
host.
– Only hosts run transport-level protocols.
– Under user’s control as opposed to network
layer which is controlled/owned by carrier.
442
The Transport Service
Service provided to application layer.
 Transport entity: process that implements
the transport protocol running on a host.

– At OS kernel, user-level process, or network
card.
443
The Transport Layer
Source host
Destination host
Application
Layer
Transport
address
Transport
Entity
Network
Layer
Network
Address
Application
Layer
Application/
transport
interface
TPDU
Transport/
network
interface
Transport
Entity
Network
Layer
444
Types of Transport Services
Connection-less versus connection-oriented.
 Connection-less service: no logical
connections, no flow or error control.
 Connection-oriented:

– Based on logical connections: connection setup,
data transfer, connection teardown.
– Flow and error control.
445
Transport versus Network
Layer

Transport layer is “controlled” by user.
– Ability to enhance network layer quality of
service.
– Example: transport service can be more reliable
than underlying network service.
– Transport layer makes standard set of
primitives available to users which are
independent from the network service
primitives, which may vary considerably.
446
Quality of Service

User may specify QoS parameters at the
transport layer.
– At connection setup time, user may define
preferred, acceptable, and minimum values for
various service parameters.
– Transport layer determines whether it’s
possible to provide required service based on
available network service(s).
447
Transport-Layer QoS Parameters
1
Connection establishment delay: time to
establish connection.
 Connection establishment failure
probability: probability connection is not
established within maximum establishment
time.
 Throughput: bytes transferred per second
measured over a time interval.

448
Transport-Layer QoS Parameters
2
Transit delay: time between sending a
message and receiving it on the other side
(measured by the transport entities).
 Residual error ratio: ratio of messages in error
to total messages sent.
 Priority: way for user to indicate that some
connections are more important.
 Resilience: probability connection is
terminated due to congestion, etc.

449
Transport Layer QoS
Only few transport protocols provide QoS
parameters.
 Most just try to minimize residual error rate.
 QoS parameters specified by transport user
when connection is set up.

– Desired and minimum acceptable values can be
specified.
– Service negotiation.
450
Transport Service Primitives
Allow transport users (e.g., application
programs) to access transport service.
 Example: connection-oriented transport
service primitives.

PRIMITIVE     TPDU Sent          Meaning
LISTEN        (none)             listen for connection
CONNECT       Connection Req.    try to establish connection
SEND          DATA               send data
RECEIVE       (none)             wait for data
DISCONNECT    Disc. Req.         try to release connection
451
TPDU
Transport protocol data unit.
 Messages sent between transport entities.
 TPDUs contained in network-layer packets,
which in turn are contained in DLL frames.

Frame
header
Packet
header
TPDU
header
TPDU payload
452
Connection Management State
Machine
[Figure: connection management state machine, with the client taking the active path and the server the passive path.
States: Idle, Active establishment pending, Passive establishment pending, Established, Active disconnect pending, Passive disconnect pending.
Transition labels: Connect executed, Connection accept, Connection req. received, Disconnect executed, Disc. req. received, Disc. accept. received.]
453
Berkeley Sockets 1


Set of transport-level primitives made available by
Berkeley UNIX.
Server side:
» SOCKET: create new communication end point.
» BIND: attach local address to socket (once server binds address,
clients can connect to it).
» LISTEN: listen for connection.
» ACCEPT: accept new connection.
» SEND, RECEIVE: send and receive data.
» CLOSE: release connection.
454
Berkeley Sockets 2

Client side:
» SOCKET: create socket.
» CONNECT: try to establish connection.
» SEND, RECEIVE: send and receive data.
» CLOSE: release connection.
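The server-side and client-side primitives map closely onto Python's standard socket module. The sketch below runs both sides in one process over the loopback interface. The names serve_once and run_demo, the uppercase-echo behavior, and the use of port 0 (letting the OS pick a free port) are choices made for this example only.

```python
import socket
import threading

def serve_once(server_sock):
    """Server side after LISTEN: handle a single connection."""
    conn, _addr = server_sock.accept()         # ACCEPT
    with conn:                                 # CLOSE when the block exits
        data = conn.recv(1024)                 # RECEIVE
        conn.sendall(data.upper())             # SEND

def run_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # SOCKET
    server.bind(("127.0.0.1", 0))                                # BIND
    server.listen(1)                                             # LISTEN
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # SOCKET
    client.connect(("127.0.0.1", port))                          # CONNECT
    client.sendall(b"hello")                                     # SEND
    reply = client.recv(1024)                                    # RECEIVE
    client.close()                                               # CLOSE

    worker.join()
    server.close()
    return reply
```

Note that the client never calls BIND: the system assigns it a local address when CONNECT is issued.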
455
Transport Protocol Issues:
Addressing
Address of the transport-level entity.
 TSAP: transport service access point
(analogous to NSAP).

–
–
–
–
Internet TSAP: (IP address, local port).
Internet NSAP: IP address.
There may be multiple TSAPs on one host.
Typically, only one NSAP.
456
Example 1

Finding the time of day from a time-of-day
server.
– Time-of-day server process on host 2 attaches
itself to TSAP 122 and waits for requests (e.g.,
through LISTEN).
– Application process (TSAP 6) on host 1 wants
to find out the time-of-day; issues CONNECT
specifying TSAP 6 as source and TSAP 122 as
destination.
457
Example 2
– Transport entity on host 1 tries to establish
transport connection between its TSAP 6 and
the TSAP 122 on host 2.
– Transport entity on host 2 contacts process on
TSAP 122; if it agrees, transport connection
established.
458
Finding Services 1

Well-known TSAP.
– Time-of-day server has been using TSAP 122 forever so
every user knows it.

Initial connection protocol: special process
server that proxies for less well-known
services.
– Process server listens to set of ports at the same time.
– Users CONNECT to a TSAP; if no server is waiting there,
the process server is likely to be listening. It then spawns
the requested server.
459
Finding Services 2

Name or directory service.
– Name server listens to well-known TSAP.
– User sends service name and name server
responds with service’s TSAP.
– New services need to register with name server.

Finding the server’s network address.
– Hierarchical addresses solve this problem, i.e., the
NSAP is part of the TSAP.
460
Connection Establishment


CONNECTION REQUEST and CONNECTION
ACCEPTED TPDUs.
Problem: delayed duplicates.
– Duplicates can re-appear and be taken as the real
messages.

Solution: messages age and are discarded after some
time; their ACKs must be discarded as well.
– Maximum hop count.
– Timestamp.
461
Avoiding Duplicates 1
2 identically numbered TPDUs are never
outstanding at the same time.
 Bounded packet lifetime.
 Each host has its clock.

– Clock as a counter that increments itself.
– #bits(counter)>= #bits(sequence number).
– Clocks don’t “crash”.
462
Avoiding Duplicates 2
When a connection is set up, the low-order k bits of
the clock are used as the initial sequence number.
 Each connection starts numbering its
TPDUs with different sequence number.
 Sequence number space need to be such that
by the time sequence numbers wrap around,
old TPDUs with same sequence numbers
have aged.

463
Sequence Numbers versus Time
1
Seq.
#’s
. Linear relation between time
and initial sequence number.
Time
464
Sequence Numbers versus Time
2
Seq.
#’s
T
Forbidden
region
Time
. Host crash: when it comes
up, it doesn’t know where it
was in the sequence # space.
. Example: T=60 sec and
clock ticks once per second.
. At t=30s, TPDU on connection
5 gets seq.# 80.
. Host crashes and comes up.
. At t=60s, reopens connections 0~4.
. At t=70s, reopens connection 5 and at t=80s, sends TPDU 80.
. Old TPDU 80 is still valid, so the new one would look like a duplicate.
. To prevent this, check if it’s in the “forbidden region” and delay
sequence number.
465
Three-Way Handshake

Solves the problem of getting 2 sides to
agree on initial sequence number.
1
2
CR (seq=x)
CR: connection
request.
ACK(seq=y,ACK=x)
DATA(seq=x, ACK=y)
466
3-Way Handshake: Duplicates 1
2
1
*
CR(seq=x)
ACK(seq=y, ACK=x)
REJECT(ACK=y)
. Old duplicate CR.
. The ACK from host 2 tries
to verify if host 1 was trying to
open a new connection with
seq=x.
. Host 1 rejects host 2’s attempt
to establish.
Host 2 realizes it was a duplicate
CR and aborts connection.
467
3-Way Handshake: Duplicates 2
2
1
*
CR(seq=x)
. Old duplicate CR and ACK
to connection accepted.
ACK(seq=y, ACK=x)
DATA(seq=x,
ACK=z)
REJECT(ACK=y)
468
Connection Release

Asymmetric release: telephone system.
– When one party hangs up, connection breaks.
– May cause data loss.

Symmetric release:
– Treats connection as 2 separate unidirectional
connections.
– Requires each to be released separately.
469
Symmetric Release
How to determine when all data has been
sent and connection could be released?
 2-army problem:

Blue army 1
Blue army 2
. White army is larger
than either blue army.
White army
. Blue armies together are
larger.
. If either blue army attacks alone, it’ll be defeated. They win if they attack together.
470
2-Army Problem 1


To synchronize attack, they must use messengers that
need to cross valley: unreliable.
Is there a protocol that allows blue army to win? No.
– Blue army 1 sends message to blue army 2.
– Blue army 2 sends ACK back.
– Blue army 2 is not sure whether ACK was received.
471
2-Army Problem 2

Use 3-way handshake.
– Blue army 1 ACKs back but it’ll never know if
the ACK was received.

Applying to connection release:
– Neither side is prepared to disconnect until
convinced the other side is prepared to disconnect too.
– In practice, hosts are willing to take risks.
472
Connection Release Protocol
Send DR+
start timer
DR
DR
Release
connection
Send
ACK
DR: disconnection
request.
Send DR+
start timer
ACK
Release
connection
473
Connection Release Scenarios 1
Send DR+
start timer
DR
DR
Release
connection
Send
ACK
DR: disconnection
request.
Send DR+
start timer
ACK
Timeout:
Release
connection
474
Connection Release Scenarios 2
Send DR+
start timer
DR: disconnection
request.
DR
DR
Timeout:
send DR+
start timer
Send DR+
start timer
DR
Send DR+
start timer
DR
ACK
Release
connection
475
The Internet Transport Protocols:
TCP and UDP

UDP: user datagram protocol (RFC 768).
– Connection-less protocol.

TCP: transmission control protocol (RFCs
793, 1122, 1323).
– Connection-oriented protocol.
476
UDP

Provides connection-less, unreliable service.
– No delivery guarantees.
– No ordering guarantees.
– No duplicate detection.

Low overhead.
– No connection establishment/teardown.

Suitable for short-lived connections.
– Example: client-server applications.
477
UDP Segment Format
Bits 0-15          Bits 16-31
Source port        Destination port
Length             Checksum
Data
Source and destination ports: identify the end points.
Length: 8-byte header+ data.
Checksum: optional; if not used, set to zero.
478
UDP Checksum
Computed over a pseudo-header+ UDP
header+data+padding (to even number of
bytes if needed).
 Pseudo-header:

Each row is 32 bits wide (bits 0-31):
Source IP address
Destination IP address
00000000 (8 bits) | Protocol (8 bits) | Segment length (16 bits)
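As a sketch of how the pieces combine, the code below computes the 16-bit one's-complement checksum over pseudo-header, UDP header and data, padding with a zero byte when the length is odd. The function names are assumptions for this example. Protocol number 17 is UDP's assigned value, and a computed result of zero is sent as 0xFFFF because zero means "checksum not used".

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                      # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip, dst_ip, segment_length):
    """Source IP, destination IP, zero byte, protocol (17), segment length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, segment_length))

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    length = 8 + len(payload)                # 8-byte header + data
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF

def udp_checksum_ok(src_ip, dst_ip, segment):
    """A received segment verifies if everything sums to all ones."""
    return ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0xFFFF
```

The receiver repeats the same sum with the checksum field included. Including the IP addresses in the sum lets it detect a segment that was delivered to the wrong host.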
479
TCP
Reliable end-to-end communication.
 TCP transport entity:

– Runs on machine that supports TCP.
– Interfaces to the IP layer.
– Manages TCP streams.
» Accepts user data, breaks it down and sends it as
separate IP datagrams.
» At receiver, reconstructs original byte stream from
IP datagrams.
480
TCP Reliability

Reliable delivery.
– ACKs.
– Timeouts and retransmissions.

Ordered delivery.
481
TCP Service Model 1

Obtained by creating TCP end points.
– Example: UNIX sockets.
– TSAP address: IP address + 16-bit port
number.
– Multiple connections can share same port pair.
– Port numbers below 1024: well-known ports
reserved for standard services.
» List of well-known ports in RFC 1700.
482
TCP Service Model 2
TCP connections are full-duplex and point-to-point.
 Byte stream (not message stream).

– Message boundaries are not preserved e2e.
A
B
C
D
4 512-byte segments sent as
separate IP datagrams
ABCD
2048 bytes of data delivered
to application in single READ
483
TCP Byte Stream
When application passes data to TCP, it
may send it immediately or buffer it.
 Sometimes application wants to send data
immediately.

– Example: interactive applications.
– Use PUSH flag to force transmission.

URGENT flag.
– Also forces TCP to transmit at once.
484
TCP Protocol Overview 1

TCP’s TPDU: segment.
– 20-byte header + options.
– Data.
– TCP entity decides the size of segment.
» 2 limits: 64KByte IP payload and MTU.
» Segments that are too large are fragmented.

More overhead by addition of IP header.
485
TCP Protocol Overview 2

Sequence numbers.
– Reliability, ordering, and flow control.
– Assigned to every byte.
– 32-bit sequence numbers.
486
TCP Segment Header
Each row is 32 bits wide:
Source port | Destination port
Sequence number
Acknowledgment number
Header length | (unused) | U A P R S F | Window size
Checksum | Urgent pointer
Options (0 or more 32-bit words)
Data
487
TCP Header Fields 1
Source and destination ports identify
connection end points.
 Sequence number.
 Acknowledgment number specifies next byte
expected.
 TCP header length: how many 32-bit words
are contained in header.
 6-bit unused field.

488
TCP Header Fields 2

6 1-bit flags:
– URG: indicate urgent data present; urgent
pointer gives byte offset from current sequence
number where urgent data is.
– ACK: indicates whether segment contains
acknowledgment; if 0, acknowledgement
number field ignored.
– PUSH: indicates PUSHed data so receiver
delivers it to application immediately.
489
TCP Header Fields 3

Flags (cont’d):
– RST: used to reset connection, reject invalid
segment, or refuse to open connection.
– SYN: used to establish connection; connection
request, SYN=1, ACK=0.
– FIN: used to release connection.

Window size: how many bytes can be sent
starting at acknowledgment number.
490
TCP Header Fields 4
Checksum: checksums the
header+data+pseudo-header.
 Options: provide way to add extra
information.

– Examples:
» Maximum payload host is willing to accept; can be
advertised during connection setup.
» Window scale factor that allows sender and receiver
to negotiate larger window sizes.
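The fixed 20-byte part of the header can be unpacked with Python's struct module. This is a minimal sketch: parse_tcp_header and the dictionary keys are illustrative names, and options are returned as raw bytes without being interpreted.

```python
import struct

# Flag bits from most to least significant within the 6-bit flag field.
FLAG_NAMES = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")

def parse_tcp_header(segment: bytes) -> dict:
    """Split a TCP segment into header fields, options and data."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4      # header length is in 32-bit words
    flag_bits = offset_flags & 0x3F            # low 6 bits hold the flags
    flags = {name for i, name in enumerate(FLAG_NAMES) if flag_bits & (0x20 >> i)}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }
```

A connection request, for example, parses with flags equal to {"SYN"}, since it carries SYN=1 and ACK=0.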
491
TCP Connection Setup

3-way handshake.
Host 1
SYN (SEQ=x)
Host 2
SYN(SEQ=y,ACK=x+1)
(SEQ=x+1, ACK=y+1)
492
TCP Connection Release 1

Abrupt release:
– Send RESET.
– May cause data loss.
493
TCP Connection Release 2

Graceful release:
– Each side of the connection released
independently.
» Either side send TCP segment with FIN=1.
» When FIN acknowledged, that direction is shut down for data.
» Connection released when both sides shut down.
– 4 segments: 1 FIN and 1 ACK for each direction;
1st. ACK+2nd. FIN combined.

494
TCP Connection Release 3

Timers to avoid 2-army problem.
– If response to FIN not received within 2*MSL,
FIN sender releases connection.

After connection released, TCP waits for
2*MSL (e.g., 120 sec) to ensure all old
segments have aged.
495
TCP Transmission 1
Sender process initiates connection.
 Once connection established, TCP can start
sending data.
 Sender writes bytes to TCP stream.
 TCP sender breaks byte stream into
segments.

– Each byte assigned sequence number.
– Segment sent and timer started.
496
TCP Transmission 2

If timer expires, retransmit segment.
– After retransmitting segment for maximum
number of times, assumes connection is dead and
closes it.
If user aborts connection, sending TCP flushes
its buffers and sends RESET segment.
 Receiving TCP decides when to pass received
data to upper layer.

497
TCP Flow Control

Sliding window.
– Receiver’s advertised window.
» Size of advertised window related to receiver’s
buffer space.
» Sender can send data up to receiver’s advertised
window.
498
TCP Flow Control: Example
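Receiver’s buffer: 4K, initially empty.
App. writes 2K of data: sender transmits 2K; SEQ=0.
Receiver (2K buffered) replies ACK=2048; WIN=2048.
App. does 3K write: sender transmits 2K; SEQ=2048.
Receiver (buffer full) replies ACK=4096; WIN=0. Sender blocked.
App. reads 2K of data: receiver sends ACK=4096; WIN=2048.
Sender may send up to 2K: transmits the remaining 1K; SEQ=4096.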
499
TCP Flow Control: Observations

TCP sender not required to transmit data as
soon as it comes in from the application.
– Example: when first 2KB of data comes in,
could wait for more data since window is 4KB.

Receiver not required to send ACKs as
soon as possible.
– Wait for data so ACK is piggybacked.
500
Delayed ACKs



Tries to optimize ACK transmission.
Delay ACKs and window update (500msec)
hoping to piggyback on data segment.
Example: telnet to interactive editor:
– Send 1 character at a time: 20-byte TCP header + 1 byte of data + 20-byte IP header.
– Receiver ACKs immediately: 40-byte ACK.
– When editor reads character, window update: 40-byte
datagram.
– Then echoes character back: 41-byte datagram.
501
Nagle’s Algorithm
Tries to optimize sending of small data
chunks.
 Example: telnet to interactive editor.

– Send first byte and buffer the rest until
outstanding byte is ACKed; then send all buffered
data in one segment; buffer until next ACK.

Disabled in some cases (e.g., window
application: mouse movements).
502
Silly Window Syndrome

Caused by receiver sending window updates of very
small values.
– Example:
» Receiver application reads 1 byte at a time and receiver TCP sends
1-byte window update.
» Sender TCP has large blocks to send but can only send 1 byte at a
time.

Solution: [Clark] prevent receiver from generating
small window advertisements; also, sender can wait.
503
Congestion Control

Why do it at the transport layer?
– Real fix to congestion is to slow down sender.

Use law of “conservation of packets”.
– Keep number of packets in the network
constant.
– Don’t inject new packet until old one leaves.

Congestion indicator: packet loss.
504
TCP Congestion Control 1

Like flow control, also window based.
– Sender keeps congestion window (cwin).
– Each sender keeps 2 windows: receiver’s
advertised window and congestion window.
– Number of bytes that may be sent is
min(advertised window, cwin).
505
TCP Congestion Control 2

Slow start [Jacobson 1988]:
– Connection’s congestion window starts at 1
segment.
– Each segment ACKed before timeout adds 1
segment to cwin: cwin=cwin+1.
– Since every segment in the window is ACKed,
cwin doubles each round-trip time.
– Exponential increase.
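Why a per-ACK increment of 1 gives exponential growth can be seen by counting ACKs per round trip. A minimal sketch, assuming no loss and one ACK per segment; the function name slow_start is hypothetical.

```python
def slow_start(rounds, cwin=1):
    """Return cwin (in segments) at the start of each round trip."""
    history = [cwin]
    for _ in range(rounds):
        acks = cwin            # one ACK for each segment sent this round
        for _ in range(acks):
            cwin += 1          # each ACK grows the window by one segment
        history.append(cwin)
    return history


print(slow_start(4))  # [1, 2, 4, 8, 16]
```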
506
TCP Congestion Control 3

Congestion Avoidance:
– Third parameter: threshold.
– Initially set to 64KB.
– If timeout, threshold=cwin/2 and cwin=1.
– Re-enters slow-start until cwin=threshold.
– Then, cwin grows linearly until it reaches
receiver’s advertised window.
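The interaction of slow start, the threshold, and a timeout can be traced round by round. This is a sketch under stated assumptions: windows are counted in segments rather than bytes, growth is applied once per round trip, and the function name simulate and its parameters are hypothetical.

```python
def simulate(rounds, timeouts, threshold=64, advertised=64):
    """Trace (send window, threshold) per round trip.

    timeouts is the set of round numbers in which a timeout occurs.
    """
    cwin = 1
    trace = []
    for rtt in range(rounds):
        # Sender may use min(advertised window, cwin).
        trace.append((min(cwin, advertised), threshold))
        if rtt in timeouts:
            threshold = max(cwin // 2, 1)      # threshold = cwin/2
            cwin = 1                           # back to slow start
        elif cwin < threshold:
            cwin = min(cwin * 2, threshold)    # slow start: exponential
        else:
            cwin = min(cwin + 1, advertised)   # avoidance: linear
    return trace


windows = [w for w, _ in simulate(10, timeouts={5}, threshold=16)]
print(windows)  # [1, 2, 4, 8, 16, 17, 1, 2, 4, 8]
```

After the timeout in round 5 the threshold drops to 8, so the second slow-start phase stops doubling at 8 instead of 16.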
507
TCP Congestion Control:
Example
508
TCP Retransmission Timer

When segment sent, retransmission timer
starts.
– If segment ACKed, timer stops.
– If time out, segment retransmitted and timer
starts again.
509
How to set timer?
Based on round-trip time: time between when a
segment is sent and when its ACK comes back.
If timer is too short, unnecessary
retransmissions.
If timer is too long, long retransmission
delay.

510
Jacobson’s Algorithm 1

Determining the round-trip time:
– TCP keeps RTT variable.
– When segment sent, TCP measures how long it
takes to get ACK back (M).
– RTT = alpha*RTT + (1-alpha)M.
– alpha: smoothing factor; determines weight
given to previous estimate.
– Typically, alpha=7/8.
511
Jacobson’s Algorithm 2

Determining timeout value:
– Measure RTT variation, or |RTT-M|.
– Keeps smoothed value of cumulative variation
D=alpha*D+(1-alpha)|RTT-M|.
– Alpha may or may not be the same as value
used to smooth RTT.
– Timeout = RTT+4*D.
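The two smoothing formulas and the timeout rule fit in a few lines. A minimal sketch, assuming the same alpha for both estimates, an initial deviation of 0, and that |RTT-M| uses the RTT estimate from before the update; the RttEstimator class is hypothetical.

```python
class RttEstimator:
    """Smoothed RTT and deviation, as in the formulas above."""

    def __init__(self, first_sample, alpha=7 / 8):
        self.alpha = alpha
        self.rtt = first_sample  # smoothed round-trip time
        self.dev = 0.0           # smoothed deviation D (assumed to start at 0)

    def update(self, m):
        """Fold in a new measurement M and return the new timeout."""
        a = self.alpha
        # Deviation uses the old RTT estimate (an assumption of this sketch).
        self.dev = a * self.dev + (1 - a) * abs(self.rtt - m)
        self.rtt = a * self.rtt + (1 - a) * m
        return self.timeout()

    def timeout(self):
        return self.rtt + 4 * self.dev


est = RttEstimator(first_sample=100.0)  # milliseconds
print(est.update(200.0))                # RTT=112.5, D=12.5, timeout=162.5
```

A single slow sample moves the RTT estimate only one eighth of the way, but the 4*D term widens the timeout so the next slow ACK does not trigger a retransmission.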
512
Karn’s Algorithm

How to measure RTT for retransmitted
segments?
– Does the ACK count for the first or the second transmission?
– Karn proposed not to update RTT on any
retransmitted segment.
– Instead the timeout is doubled on each failure until
segments get through.
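The two rules can be sketched separately from the RTT estimator. This is an illustrative sketch only; the KarnTimer class and its method names are hypothetical, and accepted samples are simply collected rather than smoothed.

```python
class KarnTimer:
    """Toy retransmission timer following the two rules above."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.samples = []  # RTT measurements accepted for estimation

    def on_timeout(self):
        """Retransmission needed: back off by doubling the timeout."""
        self.timeout *= 2

    def on_ack(self, measured_rtt, was_retransmitted):
        """Use the sample only if the segment was sent exactly once."""
        if not was_retransmitted:
            self.samples.append(measured_rtt)


timer = KarnTimer(timeout=1.0)
timer.on_timeout()                           # first failure: 2.0
timer.on_timeout()                           # second failure: 4.0
timer.on_ack(5.0, was_retransmitted=True)    # ambiguous sample, ignored
timer.on_ack(0.3, was_retransmitted=False)   # clean sample, kept
print(timer.timeout, timer.samples)
```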
513
Persistence Timer
Prevents deadlock if a window update
packet is lost and advertised window = 0.
When persistence timer goes off, sender
probes receiver; receiver replies with its
current advertised window.
If 0, persistence timer is set again.

514
Keepalive Timer
Goes off when a connection is idle for a
long time.
Causes one side to check whether the other
side is still alive.
If no answer, connection terminated.

515
TIME_WAIT
Wait 2*MSL (maximum segment lifetime) before releasing the connection state.
Makes sure all segments die after
connection is closed.

516
Wireless TCP 1
According to layered system design
principles, transport protocol should be
independent of underlying technology.
However, wireless networks invalidate this
principle.

– Ignoring properties of wireless medium can
lead to poor TCP performance.
– Problem: TCP’s congestion control.
517
Wireless TCP 2

Problem: packet loss as congestion
indicator.
– When retransmission timer times out, sender
slows down.

Wireless links are lossy!
– The right response to losses here is to resend lost segments as soon as possible, not to slow down.
518
Indirect TCP (I-TCP)
[Bakre and Badrinath, 1995].

Split TCP connection in 2: one from sender to base
station and the other from base station to receiver.
– Base station serves as “repeater”: copies segments
between connections in both directions.
– Each connection is homogeneous; timeouts on the first
connection slow down the sender.
– Problem: violates TCP’s end-to-end semantics.

Example: ACKs to sender mean base station received segments, not
necessarily receiver.
519
Snoop TCP
[Balakrishnan et al., 1995].


Does not break connection.
Modifications to base station’s network layer code.
– Snooping agent on base station observes and caches TCP
segments sent to mobile host and ACKs coming back.
– If it doesn’t see an ACK for a segment or sees duplicate
ACKs, it times out and retransmits.
– But source may time out anyway.
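The caching behaviour of the snooping agent can be sketched as follows. This is a simplified illustration, assuming cumulative byte-numbered ACKs, reacting to duplicate ACKs only (no local timer), and not forwarding duplicate ACKs to the sender, which the slide does not state. The SnoopAgent class and its method names are hypothetical.

```python
class SnoopAgent:
    """Toy snooping agent at the base station."""

    def __init__(self):
        self.cache = {}          # seq -> payload not yet ACKed by mobile host
        self.last_ack = -1       # highest cumulative ACK seen so far
        self.retransmitted = []  # seq numbers resent locally

    def from_sender(self, seq, payload):
        """Cache a segment on its way to the mobile host."""
        self.cache[seq] = payload

    def from_mobile(self, ack):
        """Handle an ACK; return True if it is passed on to the sender."""
        if ack > self.last_ack:
            # New ACK: drop everything it covers from the cache.
            for seq in [s for s in self.cache if s < ack]:
                del self.cache[seq]
            self.last_ack = ack
            return True
        # Duplicate ACK: the segment starting at `ack` was probably lost.
        if ack in self.cache:
            self.retransmitted.append(ack)  # resend from the local cache
        return False


agent = SnoopAgent()
agent.from_sender(0, b"a" * 100)
agent.from_sender(100, b"b" * 100)
print(agent.from_mobile(100))  # True: first segment ACKed
print(agent.from_mobile(100))  # False: duplicate, second segment resent locally
print(agent.retransmitted)     # [100]
```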
520
End-To-End Argument
Design principle to help guide placement of
functionality in distributed systems.
Rationale for moving functions upward,
closer to application.

521
Where to place distributed
systems functions?

Layered system design:
– Different levels of abstraction for simplicity.
– Lower layer provides service to upper layer.
– Very well defined interfaces.

Some functions can be implemented at
different layers or even at multiple layers.
522
E2E Argument Statement
“The function in question can completely and
correctly be implemented only with the
knowledge and help of the application at the
endpoints. Therefore providing that function
in the communication system itself is not
possible. Sometimes an incomplete version
of the function provided by the
communication system may be useful as
performance enhancement.”
523
Functions Closer to Application


E2E argument paper argues that functions should be
moved closer to the application that uses them.
Rationale:
– Some functions can only be completely and correctly
implemented with app’s knowledge.
» Example: file transfer.
» If error occurs in the network, network-level reliability can fix it.
» If it occurs elsewhere (e.g., on disk or in host memory), only the application can.
524
Another perspective: Cost

Why pay for something you don’t need?
» Example 1: the Internet.
» Example 2: trend in kernel design - take away from
kernel as much functionality as possible.

Applications that don’t need certain
functions should not have to pay for them.
525
E2E Counter Argument

Performance!
– Example: File transfer
» Reliability checks at lower layers detect problems
earlier.
» Abort transfer and re-try without having to wait till
whole file is transmitted.

“Spread out” functionality across layers.
526
Domain Name System (DNS)
Basic function: translation of names (ASCII
strings) to network (IP) addresses and vice versa.
Example:

– zephyr.isi.edu <-> 128.9.160.160
527
History

Original approach (ARPANET, 1970’s):
– File hosts.txt listed all hosts and their IP addresses.
– Every night every host fetches file from central
repository.
– OK for a few hundred hosts.
– Scalability?
» File size.
» Centrally managed.
528
DNS
Hierarchical name space.
Distributed database.
RFCs 1034 and 1035.

529
How is it used?

Client-server model.
– Client DNS (running on client hosts), or
resolver.
– Application calls resolver with name.
– Resolver contacts local DNS server (using
UDP) passing the name.
– Server returns corresponding IP address.
530
DNS Name Space

Tree-based hierarchy.
[Figure: DNS name tree. Top-level domains shown: int, com, edu, gov, mil, org, net, us, ca, …; labels shown at lower levels: ibm, usc, eng, sales, cs, ee.]
531
Name Space Structure

Top-level domains:
– Generic.
– Countries.
Leaf domains: no sub-domains.
In practice nearly all US organizations are under a
generic domain, while nearly everything outside
the US is under the corresponding country
domain.

532
DNS Names

Domain names:
– Concatenation of all labels starting from
its own all the way to the root, separated by “.”.
– Refers to a tree node and all names under it.
– Case insensitive.
– Components up to 63 characters.
– Full name up to 255 characters.
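These naming rules translate into a short validity check. A minimal sketch, assuming lengths are counted in characters of the dotted text form and that the overall limit is 255; the function names are hypothetical, and the check does not cover which characters a label may contain.

```python
MAX_LABEL = 63   # longest single component
MAX_NAME = 255   # longest full name (assumed limit for this sketch)


def is_valid_domain_name(name):
    """Check component and full-name length limits."""
    if name.endswith("."):
        name = name[:-1]  # drop the optional trailing root dot
    if not name or len(name) > MAX_NAME:
        return False
    return all(1 <= len(label) <= MAX_LABEL for label in name.split("."))


def same_name(a, b):
    """Domain names compare without regard to case."""
    return a.rstrip(".").lower() == b.rstrip(".").lower()


print(is_valid_domain_name("zephyr.isi.edu"))       # True
print(is_valid_domain_name("a" * 64 + ".edu"))      # False: label too long
print(same_name("Zephyr.ISI.edu", "zephyr.isi.edu"))  # True
```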
533
Name Space Management

Domains are autonomous.
– Organizational boundaries.
– Each domain manages its own name space
independently of other domains.

Delegation:
– When creating new domain: register with parent
domain.
» For name uniqueness.
» For name resolution.
534
Resource Records





Entry in the DNS database.
Several types of entries or RRs.
Example: RR “A” contains IP address.
Name <-> several resource records.
RR format: five-tuple.
– Name.
– TTL (in seconds).
– Class (usually “IN” for Internet info).
– Type: type of RR.
– Value.
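The five-tuple maps naturally onto a small record type. A minimal sketch: the A record reuses the zephyr.isi.edu example from earlier in these notes, while the TTL values and the TXT record are made up for illustration; ResourceRecord and lookup are hypothetical names.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str      # domain name the record belongs to
    ttl: int       # time to live, in seconds
    rr_class: str  # usually "IN"
    rr_type: str   # e.g. "A", "MX", "NS"
    value: str     # type-dependent data


# One name may own several records (TTLs and TXT text are illustrative).
records = [
    ResourceRecord("zephyr.isi.edu", 86400, "IN", "A", "128.9.160.160"),
    ResourceRecord("zephyr.isi.edu", 86400, "IN", "TXT", "illustrative text"),
]


def lookup(name, rr_type):
    """Return the values of all records matching name and type."""
    return [rr.value for rr in records
            if rr.name == name.lower() and rr.rr_type == rr_type]


print(lookup("Zephyr.ISI.edu", "A"))  # ['128.9.160.160']
```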
535
RR Types 1

SOA: start of authority.
– Marks beginning of zone’s database.
– Provides general info about the zone: e-mail
address of admin, default TTL, etc.

A: address.
– Contains 32-bit IP address.
– Single name <-> several A RRs.

MX: mail exchange.
– Name of mail server for this domain.
536
RR Types 2

NS: name server.
– Name of name server for this domain.

CNAME: canonical name.
– Alias.

HINFO: host description.
– Provides information about host, e.g., CPU type, OS,
etc.

TXT: arbitrary string of characters.
– Generic description of the domain, where it is located,
etc.
537
Name Servers

Entire database in a single name server?
– Practical?
– Why?
DNS database is partitioned into zones.
Each zone contains part of the DNS tree.
Zone <-> name server.

– Each zone may be served by more than 1 server.
– A server may serve multiple zones.

Primary and secondary name servers.
538
Name Resolution 1


Application wants to resolve name.
Resolver sends query to local name server.
– Resolver configured with list of local name servers.
– Select servers in round-robin fashion.

If name is local, local name server returns matching
authoritative RRs.
– Authoritative RR comes from authority managing the RR
and is always correct.
– Cached RRs may be out of date.
539
Name Resolution 2

If information not available locally (not
even cached), local NS will have to ask
someone else.
– It asks the server of the top-level domain of the
name requested.
540
Recursive Resolution

Recursive query:
– Each server that doesn’t have info forwards it to
someone else.
– Response finds its way back.

Alternative:
– Name server not able to resolve query, sends back
the name of the next server to try.
– Some servers use this method.
– More control for clients.
541
Example

Suppose resolver on flits.cs.vu.nl wants to resolve
linda.cs.yale.edu.
– Local NS, cs.vu.nl, gets queried but cannot resolve it.
– It then contacts .edu server.
– .edu server forwards query to yale.edu server.
– yale.edu contacts cs.yale.edu, which has the authoritative
RR.
– Response finds its way back to originator.
– cs.vu.nl caches this info.
» Not authoritative (since may be out-of-date).
» RR TTL determines how long RR should be cached.
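The walk through the hierarchy and the TTL-bounded cache can be sketched with an in-memory server table. Everything here is illustrative: the SERVERS table, the address 192.0.2.10, the 3600-second TTL, and the LocalNameServer class are hypothetical, and the loop models only the order in which servers are consulted, not the actual message exchange.

```python
# Hypothetical table: each server either answers or names the next server.
SERVERS = {
    "edu": {"next": "yale.edu"},
    "yale.edu": {"next": "cs.yale.edu"},
    "cs.yale.edu": {"answer": ("192.0.2.10", 3600)},  # made-up address, TTL
}


class LocalNameServer:
    """Toy local name server (cs.vu.nl in the example) with a cache."""

    def __init__(self):
        self.cache = {}  # name -> (address, expiry time)

    def resolve(self, name, now):
        """Return (address, authoritative?, servers consulted)."""
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0], False, []      # cached: not authoritative
        path = []
        server = name.rsplit(".", 1)[-1]     # start at the top-level domain
        while True:
            path.append(server)
            entry = SERVERS[server]
            if "answer" in entry:
                address, ttl = entry["answer"]
                break
            server = entry["next"]
        self.cache[name] = (address, now + ttl)  # cache until TTL expires
        return address, True, path


ns = LocalNameServer()
print(ns.resolve("linda.cs.yale.edu", now=0))   # walks edu, yale.edu, cs.yale.edu
print(ns.resolve("linda.cs.yale.edu", now=10))  # served from cache
```

Once the TTL has passed (for example at now=4000 with the values above), the cached entry is ignored and the hierarchy is walked again.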
542
Review 1

Network-layer congestion control.
– What is it?
– CC versus FC.
– Taxonomy: closed versus open loop.
– Open loop:
» Token and leaky bucket.
– Closed loop:
» Choke packets.
» Fair and weighted fair queuing.
» Load shedding.
543
Review 2

Internetworking.
– Gateways.
– Connectionless versus connection-oriented.
– Tunneling.
– Fragmentation.
» Transparent.
» Non-transparent.
544
Review 3

IP.
– IP header.
– Addressing.
– Address formats.
– Subnetting.

Companion protocols.
– ICMP, ARP, RARP, BOOTP.
545
Review 4

Internet Routing.
– IGPs versus EGPs.
– RIP, OSPF, BGP.
– Internet multicast.
– Mobile IP.
CIDR.
IPv6.

546
Review 5


ATM network layer.
Transport layer.
– Types of transport services.
– Transport service primitives.
– Berkeley sockets.
– TPDUs.
– Connection management.
» Setting up and releasing.
» Avoiding duplicates.
» 3-way handshake.
» 2-army problem.
547
Review 6

UDP.
– Type of service.
– Header.

TCP.
– Type of service.
– Header.
– Connection setup and release.
– Flow control.
548
Review 7

TCP (cont’d).
– Delayed ACKs.
– Nagle’s algorithm.
– Silly window syndrome.
– Congestion control.
Wireless TCP.
E2E argument.
The Web and HTTP.

549
Review 8
Network security.
Reliable multicast.
DNS.

550