Telcordia-NSIS - Columbia University


Internet data-plane signaling - revisiting RSVP
Henning Schulzrinne
Dept. of Computer Science
Columbia University
[email protected]
(with Robert Hancock, Hannes Tschofenig, S. van den
Bosch, G. Karagiannis, A. McDonald, X. Fu and others)
Telcordia - June 21, 2004
Overview
- Signaling: application vs. data plane
- Resource control
  - DiffServ vs. IntServ
- What's wrong with RSVP?
- Components of a general solution
- NSIS = NTLP (GIMPS) + {NSLP}+
- Route change detection

Signaling – the big picture
[Figure labels: session signaling (SIP proxy server); off-path signaling (off-path NE); on-path signaling; datapath signaling; data; AS#1; AS#2]

Need for data plane state establishment
- Differentiated treatment of packets
  - QoS
  - firewall (loss = 100% vs. loss = 0%)
- Mapping state
  - network address translation (NAT)
- Counting packets
  - accounting
- Other state establishment
  - setting up active network capsules
  - MPLS paths
  - pseudo-wire emulation (PWE) – T1 over IP
- Related: visit subset of data path nodes, but don't leave state behind
  - diagnostics → better traceroute
  - link speeds, load, loss, packet treatment, …

On-path vs. off-path signaling


- On-path (path-coupled): visit subset of routers on data path
- Off-path (path-decoupled): anything else, but presumably roughly along data path
  - one proposal: one “touch point” for each AS
  - bandwidth broker
  - difficult part is resource tracking, not signaling
- No fundamental differences in protocol → separate out next-hop discovery to allow re-use

Differentiated packet handling

- Not just QoS, but also
  - firewall
  - network address translation
  - accounting and measurement
[Figure labels: IntServ; filter management; DiffServ; traffic filtering; traffic shaping, handling & measurement]

DiffServ vs. IntServ
- Filter always uses packet characteristic
  - 5-tuple (protocol, source/destination address + port) + global label (TOS)
  - multiple “flows” can be mapped to one treatment mechanism

                  DiffServ         IntServ     in-band
  identification  TOS, 5-tuple?    5-tuple     5-tuple
  mapping         fixed            signaled    TCP SYN

The scaling bogeyman

“It doesn't scale!”
- Networks routinely handle large-scale per-flow state
  - firewalls
  - NATs
- scaling = cost per flow is constant (or decreasing)
- flow numbers are modest:
  - OC-48 can handle 31,875 DS-0 voice calls
  - Mean call duration = 9 min → 60 requests/second
  - probably about 3 MB of data
- partially explained by poor initial RSVP explanations
  - where flow search time ~ O(N) rather than O(1)
- likely limitations are in AAA, not router signaling

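The flow numbers on the scaling slide can be checked with a short calculation. This is a sketch, assuming a fully loaded link in steady state (so Little's law applies) and reading "about 3 MB" as 3,000,000 bytes of total per-call state; the variable names are illustrative.

```python
# Figures taken from the slide
simultaneous_calls = 31_875        # DS-0 voice calls on one OC-48
mean_call_duration_s = 9 * 60      # 9 minutes
total_state_bytes = 3_000_000      # "about 3 MB" (assumed to mean 3e6 bytes)

# Little's law: calls in progress = setup rate * mean duration
setup_rate_per_s = simultaneous_calls / mean_call_duration_s

# Implied state per call if 3 MB covers all calls (derived, not from the slide)
bytes_per_call = total_state_bytes / simultaneous_calls

print(f"{setup_rate_per_s:.1f} call setups per second")
print(f"{bytes_per_call:.0f} bytes of state per call")
```

The setup rate comes out just under the slide's rounded 60 requests/second, and the implied per-call state is on the order of 100 bytes.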
RSVP characteristics
- soft-state = state vanishes if not refreshed
- two-pass signaling = path discovery + reservation
- receiver-based resource reservation
- separation of QoS signaling from routing
  - with some router feedback

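The soft-state idea (state vanishes if not refreshed) can be illustrated with a small table whose entries carry an expiry time. This is a minimal sketch, not RSVP's actual data structure; the class name `SoftStateTable`, its methods and the injectable `clock` parameter are all hypothetical.

```python
import time


class SoftStateTable:
    """Per-flow state that disappears unless it is refreshed in time."""

    def __init__(self, lifetime_s, clock=time.monotonic):
        self._lifetime_s = lifetime_s
        self._clock = clock          # injectable so the example can be tested
        self._entries = {}           # key -> (value, expiry time)

    def refresh(self, key, value):
        """Install state or extend the lifetime of existing state."""
        self._entries[key] = (value, self._clock() + self._lifetime_s)

    def get(self, key):
        """Return the state for key, or None if never set or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # timed out: the state simply vanishes
            return None
        return value

    def remove(self, key):
        """Explicit teardown, without waiting for the timeout."""
        self._entries.pop(key, None)
```

A node would call `refresh` whenever a refresh message arrives; reading the entry does not extend its lifetime.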
The problem with RSVP


- Designed for QoS establishment, used mostly for other things (RSVP-TE)
- Designed for large-scale IP multicast → customer never materialized
  - adds significant complexity:
    - receiver-based → PATH + RESV
    - designed for ASM (any-source) rather than SSM (source-specific)
    - receiver-based motivated by receiver diversity – not very useful in practice
- Designed in simpler days (1997):
  - does not work well with mobile nodes (IP mobility or changing IP addresses)
  - no support for NATs
  - security mostly bolted on – non-standard mechanisms
  - single-purpose, with no clear extensibility model
  - very primitive transport mechanism
    - either refresh or exponential decay (refresh reduction, RFC 2961)

The cost of multicast for RSVP

- reservation styles
  - multiple senders in same group: shared vs. distinct
  - sender selection: explicit vs. wildcard
- receiver-oriented
  - motivated by heterogeneous
  - can do leaf-initiated join rather than root-initiated
  - but still need periodic PATH to visit new sub-tree
- three different flow specs
  - Sender_TSpec, ADSpec, (TSpec, RSpec)
  - fairly tightly woven into core protocol
- state merging and management
  - killer reservation (KR-II)
  - generally, error handling problematic
- draft-fu-rsvp-multicast-analysis
[Figure: multicast tree with per-branch reservation values (60, 40, 20, 10, …) and a “ResvErr!” message]

IETF NSIS working group




- chartered in Dec. 2001, after BOF in March 2001
- Motivated by Braden's two-layer model (draft-lindell-waypoint, draft-braden-2level-signal-arch)
- Active participation from Roke Manor, Siemens, NEC Europe, Nokia, Samsung, Columbia
- Based partially on CASP protocol designed by Columbia/Siemens group and prototyped at UKy

NSIS protocol structure
[Figure: protocol stack, top to bottom: NSLP (QoS, NAT/FW, …); NTLP (GIMPS); transport layer; IP router alert]
- client layer does the real work:
  - reserve resources
  - open firewall ports
  - …
- messaging layer:
  - establishes and tears down state
  - negotiates features and capabilities
- transport layer:
  - UDP, TCP, SCTP
  - reliable transport

NSIS properties

- Network friendly
  - congestion-controlled
  - re-use of state across applications
- application-neutral
  - add more applications later
- soft state
  - per-node time-out
  - explicit removal of state
- extensible
  - data format
  - negotiation
- transport neutral
  - any reliable protocol
  - initially, TCP and SCTP
  - also, UDP for initial probing
- policy neutral
  - no particular AAA policy or protocol
  - interaction with COPS, DIAMETER needs work

NSIS properties, cont'd.

- Topology hiding
  - not recommended, but possible
- Light weight
  - implementation complexity
  - security associations (re-use)
  - may not need kernel implementation

What is GIMPS?

- Generic signaling transport service
  - establishes state along path of data
  - one sender, typically one receiver
    - can be multiple receivers → multicast (not in initial version)
  - can be used for QoS per-flow or per-class reservation
  - but not restricted to that
- avoid restricting users of protocol (and religious arguments):
  - sender vs. receiver orientation
  - more or less closely tied to data path
    - initially, router-by-router (path-coupled)
    - later, network (AS) path (path-decoupled)

NSIS network model – path-coupled
[Figure labels: NTLP chain; selective node; omnivorous node; client protocols QoS, QoS, QoS, midcom]
- NTLP nodes form NTLP chain
- not every node processes all client protocols:
  - non-NTLP node: regular router
  - omnivorous: processes all NTLP messages
  - selective: bypassed by NTLP messages with unknown client protocols

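The three node types on the path-coupled slide can be sketched as a filter over the data path: regular routers never process a signaling message, omnivorous nodes always do, and selective nodes do so only for client protocols they know. The `PathNode` class, the `kind` strings and the function name are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PathNode:
    name: str
    kind: str                                   # "router", "omnivorous" or "selective"
    clients: set = field(default_factory=set)   # client protocols a selective node knows


def nodes_processing(path, client_protocol):
    """Return the names of nodes that process a message for client_protocol."""
    visited = []
    for node in path:
        if node.kind == "router":
            continue                            # non-NTLP node: just forwards packets
        if node.kind == "omnivorous" or client_protocol in node.clients:
            visited.append(node.name)           # node is part of this NTLP chain
        # otherwise: selective node bypassed by an unknown client protocol
    return visited
```

For a path with a QoS-only selective node, a plain router, an omnivorous node and a midcom-only selective node, a QoS message is processed by the first and third nodes only.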
Network model – path-decoupled
[Figure labels: Bandwidth broker; NAC; NTLP; AS15465; AS1249; AS17; data]
- Also route network-by-network
- can combine router-by-router with out-of-path messaging

GIMPS messages
- Regular NTLP messages
  - establish or tear down state
  - carry client protocol
  - datagram (“D”) or connection (“C”) mode
- Hop-by-hop reliability
- Generated by any node along the chain

NSIS transport protocol usage



- Most signaling messages are small and infrequent
- but:
  - not all applications → e.g., mobile code for active networks
  - digital signatures
  - re-"dialing" when resources are busy
- Need:
  - reliability → to avoid long setup delays
  - flow control → avoid overloading signaling server
  - congestion control → avoid overloading network
  - fragmentation of long signaling messages
  - in-sequence delivery → avoid race conditions
  - transport-layer security → integrity, privacy
- This defines standard reliable transport protocols:
  - TCP
  - SCTP
- Avoid re-inventing wheel → see SIP experience

GIMPS transport protocol usage



- One transport connection → many NSLP sessions
- may use multiple TCP/SCTP ports
- can use TLS for transport-layer security
  - compared to IPsec, well-exercised key establishment
  - not quite clear what the principal is
- re-use of transport →
  - no overhead of TCP and SCTP session establishment
  - avoid TLS session setup
  - better timer estimates
  - SCTP avoids HOL blocking

Message forwarding

- Route stateless or stateful:
  - stateless: record route and retrace
  - stateful: based on next-hop information in ‘C’ node
- Destination:
  - address → look at destination address
  - address + record → record route
  - route → based on recorded route
  - state forward → based on next-hop state
  - state backward → based on previous-hop state
- State:
  - no-op → leave state as is
  - ADD → add message (and maybe client) state
  - DEL → delete message state

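The destination and state options listed on the message-forwarding slide can be read as a per-node dispatch: choose the next hop from the destination option, then apply the state option. The sketch below is one possible reading, not the protocol's specified behavior; every class, field and method name (`Dest`, `StateOp`, `Message`, `Node.handle`, `ip_next_hop`) is hypothetical, and the dictionary `ip_next_hop` merely stands in for ordinary IP routing.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dest(Enum):
    ADDRESS = 1          # look at destination address
    ADDRESS_RECORD = 2   # same, and record the route
    ROUTE = 3            # follow a previously recorded route
    STATE_FORWARD = 4    # use stored next-hop state
    STATE_BACKWARD = 5   # use stored previous-hop state


class StateOp(Enum):
    NOOP = 1
    ADD = 2
    DEL = 3


@dataclass
class Message:
    session: str
    dest: Dest
    state_op: StateOp
    dest_addr: str = ""
    route: list = field(default_factory=list)      # hops still to visit
    recorded: list = field(default_factory=list)   # hops visited so far


@dataclass
class Node:
    addr: str
    ip_next_hop: dict = field(default_factory=dict)  # dest_addr -> next hop (stand-in)
    sessions: dict = field(default_factory=dict)     # session -> {"prev", "next"}

    def handle(self, msg, came_from=None):
        """Pick the next hop for msg, then apply its state option."""
        state = self.sessions.get(msg.session, {})
        if msg.dest in (Dest.ADDRESS, Dest.ADDRESS_RECORD):
            hop = self.ip_next_hop.get(msg.dest_addr)
            if msg.dest is Dest.ADDRESS_RECORD:
                msg.recorded.append(self.addr)
        elif msg.dest is Dest.ROUTE:
            hop = msg.route.pop(0) if msg.route else None
        elif msg.dest is Dest.STATE_FORWARD:
            hop = state.get("next")
        else:
            hop = state.get("prev")

        if msg.state_op is StateOp.ADD:
            self.sessions[msg.session] = {"prev": came_from, "next": hop}
        elif msg.state_op is StateOp.DEL:
            self.sessions.pop(msg.session, None)
        return hop
```

The next hop is looked up before the state option is applied so that a DEL message can still be forwarded along the state it removes.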
Message format
[Message parts: common header; extensions; client protocol data]
- No GIMPS distinction between requests and responses
  - just routed in different directions
  - client protocol may define requests and responses
- Common header defines:
  - destination flag
  - state flag
  - session identifier
  - traffic selector: identify traffic "covered" by this session
  - message sequence number
  - response sequence number
  - message cookie → avoid IP address impersonation
  - origin address → may not be data source or sink
  - destination address or scope

Message format, cont'd



- Limit session lifetime
- Avoid loops → hop counter
- Mobility:
  - dead branch removal flag
  - branch identifier
- Record route: gathers up addresses of NSIS nodes visited
- Route: addresses that NSIS message should visit

Capability negotiation

- NSIS has named capabilities
  - including client protocols
- Three mechanisms:
  - discovery: count capabilities along a path
    - "10 out of 15 can do QoS"
  - record: record capabilities for each node
  - require: for scout message, only stop once node supports all capabilities (or-of-and)
- avoid protocol versioning

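The three capability mechanisms can be sketched over a path modeled as a list of `(node name, capability set)` pairs. This is an illustration under assumptions: the function names, the capability strings, and the reading of "or-of-and" as "any one of several alternative sets, each of which must be fully supported" are not taken from a specification.

```python
from collections import Counter


def discover(path):
    """Discovery: count how many nodes on the path support each capability."""
    counts = Counter()
    for _name, caps in path:
        counts.update(caps)
    return counts, len(path)


def record(path):
    """Record: list the capabilities of each node, in path order."""
    return [(name, sorted(caps)) for name, caps in path]


def require(path, alternatives):
    """Require: first node supporting all capabilities of at least one
    alternative set (or-of-and); None if no node on the path qualifies."""
    for name, caps in path:
        if any(required <= caps for required in alternatives):
            return name
    return None
```

With `discover`, a result such as "2 out of 3 can do QoS" falls out of the counter and the path length.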
Next-hop discovery

- Next-in-path service
  - enhanced routing protocols → distribute information about node capabilities in OSPF
  - routing protocol with probing
  - service discovery, e.g., SLP
  - first hop, e.g., router advertisements
  - DHCP
  - scout protocol
- Next AS service (not in current version):
  - touch down once per autonomous system (AS)
  - new DNS name space: ASN.as.arpa, e.g., 17.as.arpa
  - use new DNS NAPTR and SRV for lookup
  - similar to SIP approach

Next-hop discovery

[Flowchart: next IP hop NSIS-aware? N → use D mode to find next NSIS hop. Y → existing transport connection? Y → done; N → establish transport connection]
- scout messages are special NSIS messages
- limited < MTU size
- addressed to session destination
- UDP with router alert option → get looked at by each router
- reflected when matching NSIS node found

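The next-hop discovery flowchart amounts to two yes/no questions. The function below is a sketch of that decision, assuming the branch order reconstructed from the slide's labels (NSIS-aware check first, then the transport-connection check); the function and parameter names are hypothetical.

```python
def next_hop_action(next_ip_hop_is_nsis_aware, has_transport_connection):
    """Return the action for the next hop, following the slide's flowchart."""
    if not next_ip_hop_is_nsis_aware:
        # Send a scout message in datagram mode towards the session destination
        return "use D mode to find next NSIS hop"
    if has_transport_connection:
        return "done"
    return "establish transport connection"
```

Once the scout message has been reflected by a matching NSIS node, the same check can be repeated against that node.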
Mobility and route changes

[Figure labels: discovers new route on refresh; B=1; ADD B=2; DEL (B=2)]
- avoids session identification by end point addresses
- avoid use of traffic selector as session identifier
- remove dead branch

QoS-NSLP: resource reservation






- NSLP for signaling QoS reservations in the Internet
- both sender- and receiver-initiated reservations
- soft-state
- peer-to-peer signaling and refresh (rather than end-to-end)
- bundled sessions (e.g., video + audio)
- agnostic about QoS models (IntServ, DiffServ, RMD, …)

QoS-NSLP node architecture
[Figure labels: resource management; QoS-NSLP; policy control; API; GIMPS; forwarding table manipulation; select GIMPS packets; input packet processing; outgoing interface selection (forwarding); traffic control: packet classifier, packet scheduler]

QoS-NSLP actors
[Figure labels: flow sender (IP address = flow source address); data; flow receiver (IP address = flow destination); QNI; QNE (…); QNR; GIMPS + QoS-NSLP; e.g., access router; QoS-unaware NSLP nodes not shown]

QoS-NSLP: sender-initiated reservation
[Message sequence chart between QNI, QNE, QNE and QNR]
- three RESERVE messages, one per hop, each with its own RSN (#4, #17, #3)
- three RESPONSE messages, one per hop

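A sender-initiated reservation can be sketched as a walk along the signaling chain in which every sending node bumps its own hop-local sequence number and admits the request against its free capacity. This is an illustrative model only: the `Hop` class, the capacity units, the admission test and the rollback on failure are all assumptions, as is the mapping of the slide's RSN values to hops in the order they appear.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str
    free: float    # unreserved capacity on the outgoing link (arbitrary units)
    rsn: int = 0   # hop-local sequence number for RESERVE messages


def sender_initiated_reserve(hops, demand):
    """Try to reserve `demand` on every hop; return (success, message log)."""
    log = []
    admitted = []
    for hop in hops:
        hop.rsn += 1
        log.append(f"{hop.name}: RESERVE (RSN #{hop.rsn})")
        if hop.free < demand:
            for earlier in admitted:
                earlier.free += demand          # simplification: undo partial state
            log.append(f"{hop.name}: RESPONSE (failure)")
            return False, log
        hop.free -= demand
        admitted.append(hop)
    for hop in reversed(hops):
        log.append(f"{hop.name}: RESPONSE (success)")
    return True, log
```

Because each node numbers its own RESERVE messages, the sequence numbers differ from hop to hop, as on the slide.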
QoS-NSLP: receiver-initiated reservation
[Message sequence chart between QNI, QNE, QNE and QNR]
- three QUERY messages, one per hop
- three RESERVE messages, one per hop
- three RESPONSE messages, one per hop

QoS flow aggregation
[Figure labels: aggregate; QoS-NSLP style (RFC 3175); traffic sink (LAN); sink-tree style (BGRP)]

The weight of NSIS
- NSIS state = transport state + GIMPS state + NSLP state
- GIMPS state = two sockets
- transport state = O(100) bytes → 10,000 users consume 1 MB

Route change detection


- Don't want to wait for periodic rediscovery – delay of 30s+
- Not all route changes matter
  - e.g., only changes between NSIS routers
- Data plane detection
  - TTL change of arriving data packets
  - propagation delay change for data packets
  - monitoring propagation delay (~ min(e2e delay))
  - increases in packet loss or jitter

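Two of the data-plane signals listed above, a TTL change and a shift in propagation delay estimated as the minimum end-to-end delay, can be combined in a small detector. This is a sketch under assumptions: the class name, the window size and the 5 ms threshold are arbitrary illustration values, not measured or recommended settings.

```python
from collections import deque


class RouteChangeDetector:
    """Flags a likely route change from arriving-packet TTL and delay."""

    def __init__(self, window=5, delay_threshold_ms=5.0):
        self.window = window
        self.threshold = delay_threshold_ms
        self.ttl = None                       # TTL of the previous packet
        self.baseline_ms = None               # current propagation-delay estimate
        self.recent = deque(maxlen=window)    # most recent delay samples

    def observe(self, ttl, delay_ms):
        """Feed one packet; return True if a route change is suspected."""
        changed = self.ttl is not None and ttl != self.ttl
        self.ttl = ttl

        self.recent.append(delay_ms)
        if len(self.recent) == self.window:
            floor = min(self.recent)          # ~ propagation delay over the window
            if self.baseline_ms is None:
                self.baseline_ms = floor
            elif abs(floor - self.baseline_ms) > self.threshold:
                changed = True
                self.baseline_ms = floor
        return changed
```

Using the window minimum rather than individual samples keeps queueing-delay spikes from triggering the detector; only a shift of the delay floor does.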
Route change measurements







- 12 measurement sites (looking glass)
- one traceroute every 15 min → 2.75 hours per pair
- availability: 99.8%
- 0.1% repeated IP addresses
- 4.4% single hop with multiple IP addresses
- 422 route changes observed after data cleanup (13,074 records)
- 67 out of 422 also showed AS changes
  - often, indicates multi-homing

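The measurement figures can be cross-checked with a few lines of arithmetic. The sketch assumes each site cycles through the other 11 sites in turn, which is one reading that reproduces the 2.75 hours per pair; the ratio of route changes to records is a derived figure, meaningful only if each record is one traceroute.

```python
sites = 12
minutes_between_traceroutes = 15

# Assumption: a site probes its 11 peers round-robin, one every 15 minutes
hours_per_pair = (sites - 1) * minutes_between_traceroutes / 60

route_changes = 422
as_changes = 67
records = 13_074

as_change_share = as_changes / route_changes      # changes that also altered the AS path
changes_per_record = route_changes / records      # derived, see assumption above

print(f"{hours_per_pair} hours between probes of the same pair")
print(f"{as_change_share:.1%} of route changes also changed the AS path")
print(f"{changes_per_record:.1%} of records showed a route change")
```

About one in six observed route changes also involved an AS-level change.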
Route changes
[Figure: two histograms of observed route changes: frequency vs. AS count change (x-axis from -2 to 2) and frequency vs. TTL change (x-axis from -8 to 6)]

On-going and planned work
- Finish NTLP (GIMPS) and NSIS clients (NAT-FW and QoS)
- Longer term: off-path signaling (new WG?)
- New applications: diagnostics
- Mobility support

Conclusion
- NSIS = unified infrastructure for data-affiliated sessions
  - avoid making assumptions except that sessions want to "visit" data nodes or networks
  - not just mobility, but also mobility
- route change detection challenging
- protocol framework in place
  - but need to work out packet formats