QoSIP - Columbia University

QoS and security - using traditional
services for new ends
Henning Schulzrinne
Dept. of Computer Science
Columbia University
Feb. 4, 2005
QoSIP - Catania
1
Overview
- Some impolite remarks about network research and QoS
- QoS challenges in real networks:
  - NATs and firewalls
  - DOS
  - reliability
- Permission-based networking
- GIMPS: next steps in signaling

Impolite remarks on QoS
and network research
Lifecycle of technologies
traditional technology propagation: military → corporate → consumer
- military: opex/capex doesn’t matter; expert support. “Can it be done?”
- corporate: capex/opex sensitive, but amortized; expert support. “Can I afford it?”
- consumer: capex sensitive; amateur. “Can my mother use it?”

Networking research is fashion-driven
[Diagram: how a research fashion propagates]
- Funding: workshop, white paper, DARPA, NSF, EU Nth framework → $$
- Venues: Sigcomm, Infocom, Mobicom, ICNP; secondary conferences; “First (European) workshop on X”, “YAP on X”; networking courses; trailing-edge research
- Topics: ATM, DQDB, QoS, active networks, mobile networks, wireless, ad-hoc, sensor

Impact of network research
- What’s promising/interesting – two different axes:
  - Intellectual merit → interesting analysis, broadly applicable, …
  - Satisfies practical needs → may not be a scientific breakthrough
- Field has few grand challenges and metrics
  - cf., speech understanding or face recognition
- Depends largely on external technology inputs
  - faster CPUs, better optical gear, compression
  - typical performance improvements in queueing: 20-50%
- Networking research impact
  - on deployed systems and protocols?
  - on understanding network behavior?
  - on other papers?
- Which of the 10,000 QoS papers had real impact?
- What papers were responsible for most important networking advances?
  - TCP, web?, email?

Maturing network research
- Old questions:
  - Can we make X work over packet networks?
    - All major dedicated network applications (flight reservations, embedded systems, radio, TV, telephone, fax, messaging, …) are now available on IP
  - Can we get M/G/T bits to the end user?
    - Raw bits everywhere: “any media, anytime, anywhere”
- New questions:
  - Dependency on communications → Can we make the network reliable?
  - Can non-technical users use networks without becoming amateur sys-admins? → auto/zeroconfiguration, autonomous computing, self-healing networks, …
  - Can we prevent social and financial damage inflicted through networks (viruses, spam, DOS, identity theft, privacy violations, …)?

Observations on network research
- Frustration with inability to change network infrastructure in less than 10-20 year horizons:
  - IPv6
  - Layer-3 multicast
  - QoS
  - Security
- Network research community has dismal track record for new applications
  - web, IM, P2P (Gnutella, BitTorrent), … vs. video-on-demand
- Niche applications get disproportionate attention
  - active networks, ad-hoc networks, (structured) P2P
  - successful applications don’t care if they don’t scale
    - centralized IM & search, unstructured P2P, …
- Disconnect from standardization
  - Few attempts to bring research work into standards bodies
  - Standards bodies slow to catch up (e.g., P2P)

Why do good ideas fail?
- Research: O(.), CPU overhead
  - “per-flow reservation (RSVP) doesn’t scale” → not the problem
    - at least now, routers routinely handle O(50,000) routing states
- Reality:
  - deployment costs of any new L3 technology are probably billions of $
- Cost of failure:
  - conservative estimate (1 grad student year = 2 papers)
  - 10,000 QoS papers @ $20,000/paper → $200 million

Cause of death for the next big thing
[Table: technologies (QoS, multicast, mobile IP, active networks, IPsec, IPv6) against the causes of death below; the per-cell marks are lost, one cell is annotated “(NAT)”]
- not manageable across competing domains
- not configurable by normal users (or apps writers)
- no business model for ISPs
- no initial gain
- 80% solution in existing system
- increase system vulnerability

QoS
- QoS is meaningless to users
  - care about service availability → reliability
  - as more and more value depends on network services, can't afford random downtimes
- common QoS models now:
  - scavenger service (worse-than-best-effort) → self-protection
  - DiffServ on access routers and NAT boxes
- difficult to engineer service that is consistently poor, but usable
  - but most commercial service is good enough for VoIP/video/… most of the time
- charging model problem → users will arbitrage and buy basic quality except during congestion periods
  - see multi-homing vs. high-end providers

Why did QoS (mostly) fail?
- hypothesis: “The success of a technology is inversely proportional to the number of papers published before its popularity.”
  - ACM: 10,158 papers with QoS or “quality of service” in abstract
  - IEEE: 7,297 papers
  - real-time streaming video-on-demand → DVD via Netflix or TCP onto 200 GB hard disk
- bandwidth “too cheap to meter”
- undemocratic – some traffic is more equal than others
- reminds you of your mom: no, you can’t have that 10 Mb/s now
- socialist: administer scarcity → we like SUVs (or to drive 100 mph)!
- “risky scheme”: security
- only displacement applications (such as telephony) need QoS
- requires cooperation: edge-ISP, transit ISPs, end systems
- snake oil: add QoS, lose half your bandwidth

Why did QoS fail? (cont’d)
- dishonesty: we only talk about the beneficiaries
- network has become harder to evolve:
  - network address translation
  - firewalls
  - high packetization overhead (VPNs, IPv6)
- to be useful, has to be nearly universally supported (“no, you can’t make calls to AS 123”)
- network QoS vs. business class model: “coach is empty, please refund fare”
- currently, the ISP interface is IP and BGP – adding a third one is a big deal
- new Internet service model: TCP client (inside) – server (outside)
  - exception: peer-to-peer on college campuses
- network to host: you first, no, you first
- failure of IP QoS → success of MPLS
  - more TE than QoS

Where did QoS technology succeed?
- Edge network:
  - VLAN prioritization
  - 802.11e MAC layer priority
  - IP TOS byte (not quite DiffServ) – known since 1980s…
  - Docsis/PacketCable → application-initiated
- Mostly deals with self-interference
- No admission control
- No authorization (except Docsis)

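As an illustration of the "IP TOS byte" item, the sketch below marks outgoing UDP packets from an end host. It is a minimal example, not something taken from the talk: the helper names are hypothetical, the value is expressed as a DiffServ code point for convenience, and `socket.IP_TOS` is assumed to be available (it is on Linux and most Unix-like systems). Whether any router honors the marking is entirely up to the network.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point, commonly used for voice


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the 8-bit TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Return a UDP socket whose outgoing packets carry the given DSCP.

    Assumes the platform exposes socket.IP_TOS; no admission control or
    authorization is involved, matching the edge-only model on the slide.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A caller would use `open_marked_udp_socket(DSCP_EF)` in place of a plain `socket.socket(...)` call; nothing else in the application changes.
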
Network reliability
- we don’t know precisely why network applications fail
  - components and backbones appear to be pretty reliable
  - but we measured at 99.5% of usable time → far below 99.999% in telecom networks
  - lots of possible culprits, including DNS and carrier interconnects
- temporary overloads
- reduce operator errors
  - e.g., XCONF effort in IETF
  - inherently safe or fail-safe protocols?
- faster convergence in routing protocols
  - BGP → up to 20-30 minutes!

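To make the gap between 99.5% and 99.999% concrete, the sketch below converts an availability figure into expected downtime per year. The function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of unavailability per year for a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


measured = downtime_minutes_per_year(0.995)    # measured Internet figure
telecom = downtime_minutes_per_year(0.99999)   # "five nines" target
print(f"99.5%:   {measured / 60:.1f} hours/year")
print(f"99.999%: {telecom:.1f} minutes/year")
```

Under these assumptions the measured figure corresponds to roughly 44 hours of unusable time per year, against about 5 minutes for the telecom target.
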
New applications – need for QoS?
- New bandwidth-intensive applications
  - Reality-based networking
  - (security) cameras
- Distributed games often require only low-bandwidth control information
  - current game traffic ~ VoIP
- Computation vs. storage vs. communications
  - communications cost has decreased less rapidly than storage costs
- Emphasis on user control of communications
  - from anywhere, anytime, any media to where appropriate, my time, my media
- Guess: #1 user-selected research problem: fix spam
- #2: keep cell phone from ringing in the movie theater

New network
architectures for security
Security challenges
- DOS, security attacks → permissions-based communications
  - only allow modest rates without asking
  - effectively, back to circuit-switched
- Higher-level security services → more application-layer access via gateways, proxies, …
- User identity
  - problem is not availability, but rather overabundance

Trustability: Internet decay
- Decay of inner cities: small number of bad elements + lack of social controls and law enforcement
- Small number of miscreants
  - “The bulk of U.S. spam is coming from a very limited set of IPs with high-bandwidth connections,” said Alperovitch, who estimated that the high-volume spamming addresses number fewer than 10,000 and the number of spammers at less than 200. (Informationweek, Aug. 2004)
- Naïve users
  - with increasing firepower

Trustability problems
- Traditional security didn’t solve user interface problem
  - is citi-bank.com my bank or phishing?
- traditional firewall (crunchy outside, squishy inside)
  - fails with any content – even JPEGs aren’t safe
- email usability rapidly decreasing
  - most spam proposals unlikely to work
- notion of “global village” is an oxymoron
  - in a village, you know your neighbors
- on-going approaches useful, but limited
  - conversion of protocols to secured versions (e.g., via TLS)
  - prevent source address spoofing
  - OS and application robustness against buffer overflow attacks
  - IETF MARID (SenderID, SPF, …) for email sender identification
  - DOS traceback
- thus, may need to rethink network architecture

Trustability: A more polite Internet
- introduce yourself first
  - “shoot first, ask later” (Bush)
  - “ask first, shoot later” (Kerry)
- exchange: “may I send?” → “yes, up to 10 kb/s”
- limits large-scale DDOS
- more circuit-oriented
- may get permission slip for future use

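The "may I send?" exchange can be sketched as a rate gate that admits only a modest rate until the receiver grants more. This is an illustrative token bucket under stated assumptions, not a protocol from the talk: the class name, the 1 kb/s unsolicited default and the one-second burst cap are all hypothetical.

```python
class PermissionedSender:
    """Token-bucket gate: modest rate without asking, more after a grant."""

    def __init__(self, unsolicited_bps: float = 1_000.0) -> None:
        self.rate_bps = unsolicited_bps  # rate allowed "without asking"
        self.tokens = 0.0                # bits currently available
        self.last_time = 0.0             # time of the previous call, seconds

    def grant(self, allowed_bps: float) -> None:
        """Receiver answered "yes, up to allowed_bps"."""
        self.rate_bps = allowed_bps

    def try_send(self, now: float, nbits: int) -> bool:
        """Return True if nbits may be sent at time `now`."""
        elapsed = now - self.last_time
        self.last_time = now
        # Accumulate tokens, capped at one second's worth of the current rate.
        self.tokens = min(self.rate_bps, self.tokens + elapsed * self.rate_bps)
        if nbits <= self.tokens:
            self.tokens -= nbits
            return True
        return False


sender = PermissionedSender()
print(sender.try_send(1.0, 8_000))   # too much for the unsolicited rate
sender.grant(10_000.0)               # "yes, up to 10 kb/s"
print(sender.try_send(2.0, 8_000))   # fits within the granted rate
```

Because an attacker without a grant is held to the unsolicited rate at every receiver, a large-scale DDOS would need correspondingly more sources.
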
Restoring the village part of the global village
- It’s not what you know, it’s who you know
- Authentication works only if addresses can be recognized by policy or human
- Doesn’t work well for first-time contacts → much of communications
  - won’t be fixed by SPF and SenderID
- Need to leverage indirect knowledge
  - our approach: social networks for recognizing users in SIP systems
  - leverage knowledge across media: visiting web page enables receipt of email from related address → make phishing more difficult

GIMPS – a modular data
plane signaling protocol
(with Robert Hancock, Hannes Tschofenig, S. van den Bosch,
G. Karagiannis, A. McDonald, X. Fu and others)
Overview
- Signaling: application vs. data plane
- Resource control
- DiffServ vs. IntServ
- What’s wrong with RSVP?
- Components of a general solution
- NSIS = NTLP (GIMPS) + {NSLP}+
- Route change detection

Signaling – the big picture
[Figure: session signaling via SIP proxy servers; off-path signaling via off-path NEs; on-path (data path) signaling alongside the data crossing AS#1 and AS#2]

Need for data plane state establishment
- Differentiated treatment of packets
  - QoS
  - firewall (loss = 100% vs. loss = 0%)
- Mapping state
  - network address translation (NAT)
- Counting packets
  - accounting
- Other state establishment
  - setting up active network capsules
  - MPLS paths
  - pseudo-wire emulation (PWE) – T1 over IP
- Related: visit subset of data path nodes, but don’t leave state behind
  - diagnostics → better traceroute
  - link speeds, load, loss, packet treatment, …

On-path vs. off-path signaling
- On-path (path-coupled): visit subset of routers on data path
- Off-path (path-decoupled): anything else, but presumably roughly along data path
  - one proposal: one “touch point” for each AS
  - bandwidth broker
  - difficult part is resource tracking, not signaling
- No fundamental differences in protocol → separate out next-hop discovery to allow reuse

Differentiated packet handling
- Not just QOS, but also
  - firewall
  - network address translation
  - accounting and measurement
[Figure: IntServ and DiffServ packet path: filter management, traffic filtering, traffic shaping, handling & measurement]

DiffServ vs. IntServ
- Filter always uses packet characteristic
  - 5-tuple (protocol, source/destination address + port) + global label (TOS)
  - multiple “flows” can be mapped to one treatment mechanism
[Table: DiffServ and IntServ compared by in-band identification (TOS, 5-tuple?, 5-tuple, 5-tuple) and by mapping (fixed, signaled, TCP SYN)]

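The distinction between a fixed label-to-treatment mapping and a signaled per-flow mapping can be sketched as follows. This is a toy classifier written for illustration; the treatment names, the TOS values in the table and the class layout are assumptions, not part of either architecture's specification.

```python
from typing import Dict, NamedTuple


class FiveTuple(NamedTuple):
    protocol: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


# DiffServ-like: fixed mapping from the in-band label (TOS byte) to a treatment.
TOS_TO_TREATMENT: Dict[int, str] = {0xB8: "low-latency", 0x00: "best-effort"}


class Classifier:
    """Maps packets to treatments; many flows may share one treatment."""

    def __init__(self) -> None:
        # IntServ-like: per-flow entries installed by signaling.
        self.flow_table: Dict[FiveTuple, str] = {}

    def signal(self, flow: FiveTuple, treatment: str) -> None:
        """Install a signaled mapping for one flow."""
        self.flow_table[flow] = treatment

    def classify(self, flow: FiveTuple, tos: int) -> str:
        """Prefer signaled per-flow state, then fall back to the fixed label map."""
        if flow in self.flow_table:
            return self.flow_table[flow]
        return TOS_TO_TREATMENT.get(tos, "best-effort")


classifier = Classifier()
voice = FiveTuple("udp", "10.0.0.1", 5004, "10.0.0.2", 5004)
classifier.signal(voice, "reserved")
print(classifier.classify(voice, 0x00))
```

In both cases the lookup key is a packet characteristic; what differs is whether the mapping is configured once or installed by a signaling message.
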
The scaling bogeyman
“It doesn’t scale!”
- Networks routinely handle large-scale per-flow state
  - firewalls
  - NATs
- scaling = cost per flow is constant (or decreasing)
  - where flow search time ~ O(N) rather than O(1)
- flow numbers are modest:
  - OC-48 can handle 31,875 DS-0 voice calls
  - Mean call duration = 9 min → 60 requests/second
  - probably about 3 MB of data
- partially explained by poor initial RSVP explanations
- likely limitations are in AAA, not router signaling

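The "60 requests/second" figure follows from dividing the number of concurrent calls by the mean call duration (Little's law, which is this example's framing rather than the slide's). The sketch below reproduces that arithmetic and the implied state per call; it assumes 3 MB means 3,000,000 bytes, and the function names are hypothetical.

```python
def setup_rate_per_second(concurrent_calls: int, mean_duration_s: float) -> float:
    """Steady-state call arrival rate needed to sustain `concurrent_calls`."""
    return concurrent_calls / mean_duration_s


def bytes_per_flow(total_state_bytes: float, concurrent_calls: int) -> float:
    """Average per-flow state if the total is spread evenly across calls."""
    return total_state_bytes / concurrent_calls


calls = 31_875            # DS-0 voice calls on an OC-48, from the slide
duration_s = 9 * 60       # mean call duration of 9 minutes
rate = setup_rate_per_second(calls, duration_s)
state = bytes_per_flow(3_000_000, calls)   # assumes 3 MB = 3e6 bytes
print(f"{rate:.1f} setups/s, about {state:.0f} bytes of state per call")
```

Both numbers are small by router standards, which is the point of the slide: the signaling load itself is not where the scaling limit lies.
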
RSVP characteristics
- soft-state = state vanishes if not refreshed
- two-pass signaling = path discovery + reservation
- receiver-based resource reservation
- separation of QoS signaling from routing
  - with some router feedback

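Soft state can be illustrated with a small table whose entries disappear unless refreshed. This is a generic sketch rather than RSVP's actual timer rules: the class name, the single fixed lifetime and the explicit `remove` method are assumptions made for the example.

```python
from typing import Dict, Hashable, List


class SoftStateTable:
    """State that vanishes unless it is refreshed within `lifetime_s`."""

    def __init__(self, lifetime_s: float) -> None:
        self.lifetime_s = lifetime_s
        self._expiry: Dict[Hashable, float] = {}

    def refresh(self, key: Hashable, now: float) -> None:
        """Install the entry, or push its expiry time further out."""
        self._expiry[key] = now + self.lifetime_s

    def remove(self, key: Hashable) -> None:
        """Explicit teardown, so peers need not wait for the time-out."""
        self._expiry.pop(key, None)

    def expire(self, now: float) -> List[Hashable]:
        """Drop every entry whose lifetime has run out and return their keys."""
        dead = [key for key, deadline in self._expiry.items() if deadline <= now]
        for key in dead:
            del self._expiry[key]
        return dead

    def __contains__(self, key: Hashable) -> bool:
        return key in self._expiry


table = SoftStateTable(lifetime_s=30.0)
table.refresh("flow-a", now=0.0)
table.refresh("flow-b", now=0.0)
table.refresh("flow-a", now=20.0)   # only flow-a is refreshed
print(table.expire(now=35.0))       # flow-b has timed out
```

The appeal is that a crashed or unreachable sender cleans up after itself; the cost is the periodic refresh traffic.
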
The problem with RSVP
- Designed for QoS establishment, used mostly for other things (RSVP-TE)
- Designed for large-scale IP multicast → customer never materialized
  - adds significant complexity:
    - receiver-based → PATH + RESV
    - designed for ASM (any-source) rather than SSM (source-specific)
    - receiver-based motivated by receiver diversity – not very useful in practice
- Designed in simpler days (1997):
  - does not work well with mobile nodes (IP mobility or changing IP addresses)
  - no support for NATs
  - security mostly bolted on – non-standard mechanisms
  - single-purpose, with no clear extensibility model
  - very primitive transport mechanism
    - either refresh or exponential decay (refresh reduction, RFC 2961)

The cost of multicast for RSVP
- reservation styles
  - multiple senders in same group: shared vs. distinct
  - sender selection: explicit vs. wildcard
- receiver-oriented
  - motivated by heterogeneous receivers
  - can do leaf-initiated join rather than root-initiated
  - but still need periodic PATH to visit new subtree
- three different flow specs
  - Sender_TSpec, ADSpec, (TSpec, RSpec)
  - fairly tightly woven into core protocol
- state merging and management
  - killer reservation (KR-II)
  - generally, error handling problematic
- draft-fu-rsvp-multicast-analysis
[Figure: multicast tree annotated with per-link reservation values (10, 20, 30, 40, 60) and a “ResvErr!” callout]

IETF NSIS working group
- chartered in Dec. 2001, after BOF in March 2001
- Motivated by Braden’s two-layer model (draft-lindell-waypoint, draft-braden-2level-signal-arch)
- Active participation from Roke Manor, Siemens, NEC Europe, Nokia, Samsung, Columbia
- Based partially on CASP protocol designed by Columbia/Siemens group and prototyped at UKy

NSIS protocol structure
[Figure: protocol stack: NSLP (C) over NTLP (GIMPS) over the transport layer (UDP, TCP, SCTP) over IP router alert]
- client layer does the real work:
  - QoS, NAT/FW, …
  - reserve resources
  - open firewall ports
  - …
- messaging layer:
  - establishes and tears down state
  - negotiates features and capabilities
- transport layer:
  - reliable transport

NSIS properties
- Network friendly
  - congestion-controlled
  - re-use of state across applications
- soft state
  - per-node time-out
  - explicit removal of state
- extensible
  - data format
  - negotiation
- application-neutral
  - add more applications later
- transport neutral
  - any reliable protocol
  - initially, TCP and SCTP
  - also, UDP for initial probing
- policy neutral
  - no particular AAA policy or protocol
  - interaction with COPS, DIAMETER needs work

NSIS properties, cont'd.
- Topology hiding
  - not recommended, but possible
- Light weight
  - implementation complexity
  - security associations (re-use)
  - may not need kernel implementation

What is GIMPS?
- Generic signaling transport service
  - establishes state along path of data
  - one sender, typically one receiver
    - can be multiple receivers → multicast (not in initial version)
  - can be used for QoS per-flow or per-class reservation
    - but not restricted to that
- avoid restricting users of protocol (and religious arguments):
  - sender vs. receiver orientation
  - more or less closely tied to data path
    - initially, router-by-router (path-coupled)
    - later, network (AS) path (path-decoupled)

NSIS network model – path-coupled
[Figure: NTLP chain of selective and omnivorous nodes, each running QoS or midcom client protocols]
- NTLP nodes form NTLP chain
- not every node processes all client protocols:
  - non-NTLP node: regular router
  - omnivorous: processes all NTLP messages
  - selective: bypassed by NTLP messages with unknown client protocols

Network model – path-decoupled
[Figure: bandwidth broker and NAC exchanging NTLP messages across AS15465, AS 1249 and AS17, alongside the data path]
- Also route network-by-network
- → can combine router-by-router with out-of-path messaging

GIMPS messages
- Regular NTLP messages
  - establish or tear down state
  - carry client protocol
  - datagram (“D”) or connection (“C”) mode
- Hop-by-hop reliability
- Generated by any node along the chain

NSIS transport protocol usage
- Most signaling messages are small and infrequent
- but:
  - not all applications → e.g., mobile code for active networks
  - digital signatures
  - re-"dialing" when resources are busy
- Need:
  - reliability → to avoid long setup delays
  - flow control → avoid overloading signaling server
  - congestion control → avoid overloading network
  - fragmentation of long signaling messages
  - in-sequence delivery → avoid race conditions
  - transport-layer security → integrity, privacy
- This defines standard reliable transport protocols:
  - TCP
  - SCTP
- Avoid re-inventing wheel → see SIP experience

GIMPS transport protocol usage
- One transport connection → many NSLP sessions
- may use multiple TCP/SCTP ports
- can use TLS for transport-layer security
  - compared to IPsec, well-exercised key establishment
  - not quite clear what the principal is
- re-use of transport →
  - no overhead of TCP and SCTP session establishment
  - avoid TLS session setup
  - better timer estimates
  - SCTP avoids HOL blocking

Message forwarding
- Route stateless or state-full:
  - stateless: record route and retrace
  - state-full: based on next-hop information in ‘C’ node
- Destination:
  - address → look at destination address
  - address + record → record route
  - route → based on recorded route
  - state forward → based on next-hop state
  - state backward → based on previous-hop state
- State:
  - no-op → leave state as is
  - ADD → add message (and maybe client) state
  - DEL → delete message state

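The destination flag values listed above can be read as a per-node forwarding decision. The sketch below is one possible interpretation, written for illustration only: the type names, field names and address strings are hypothetical and do not reflect the GIMPS wire format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Destination(Enum):
    ADDRESS = "address"
    ADDRESS_RECORD = "address + record"
    ROUTE = "route"
    STATE_FORWARD = "state forward"
    STATE_BACKWARD = "state backward"


@dataclass
class Message:
    destination: Destination
    dest_addr: str
    recorded_route: List[str] = field(default_factory=list)
    route: List[str] = field(default_factory=list)


@dataclass
class SessionState:
    next_hop: Optional[str] = None
    previous_hop: Optional[str] = None


def next_target(msg: Message, state: SessionState, my_addr: str) -> Optional[str]:
    """Pick where this node sends the message next, based on the destination flag."""
    if msg.destination is Destination.ADDRESS:
        return msg.dest_addr                      # plain IP forwarding
    if msg.destination is Destination.ADDRESS_RECORD:
        msg.recorded_route.append(my_addr)        # stateless: record the route
        return msg.dest_addr
    if msg.destination is Destination.ROUTE:
        return msg.route.pop(0) if msg.route else None  # follow the listed hops
    if msg.destination is Destination.STATE_FORWARD:
        return state.next_hop                     # state-full: stored next hop
    return state.previous_hop                     # state backward
```

The stateless variants carry the path in the message itself, while the state-full variants rely on next-hop and previous-hop entries left behind by an earlier ADD.
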
Message format
[Layout: common header | extensions | client protocol data]
- No GIMPS distinction between requests and responses
  - just routed in different directions
  - client protocol may define requests and responses
- Common header defines:
  - destination flag
  - state flag
  - session identifier
  - traffic selector: identify traffic "covered" by this session
  - message sequence number
  - response sequence number
  - message cookie → avoid IP address impersonation
  - origin address → may not be data source or sink
  - destination address or scope

Message format, cont'd
- Limit session lifetime
- Avoid loops → hop counter
- Mobility:
  - dead branch removal flag
  - branch identifier
- Record route: gathers up addresses of NSIS nodes visited
- Route: addresses that NSIS message should visit

Capability negotiation
- NSIS has named capabilities
  - including client protocols
- Three mechanisms:
  - discovery: count capabilities along a path
    - "10 out of 15 can do QoS"
  - record: record capabilities for each node
  - require: for scout message, only stop once node supports all capabilities (or-of-and)
- avoid protocol versioning

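The three mechanisms can be sketched over a path modeled as a list of per-node capability sets. This is a toy model under stated assumptions: the function names and capability strings are hypothetical, and "or-of-and" is read here as "the node satisfies at least one of several alternative capability sets in full".

```python
from collections import Counter
from typing import List, Optional, Sequence, Set


def discover(path: Sequence[Set[str]]) -> Counter:
    """Discovery: count how many nodes along the path offer each capability."""
    counts: Counter = Counter()
    for capabilities in path:
        counts.update(capabilities)
    return counts


def record(path: Sequence[Set[str]]) -> List[List[str]]:
    """Record: list the capabilities of every node, in path order."""
    return [sorted(capabilities) for capabilities in path]


def require(path: Sequence[Set[str]],
            alternatives: Sequence[Set[str]]) -> Optional[int]:
    """Require: index of the first node that fully supports any alternative."""
    for index, capabilities in enumerate(path):
        if any(alternative <= capabilities for alternative in alternatives):
            return index
    return None


path = [{"qos"}, set(), {"qos", "natfw"}, {"natfw"}]
print(f'{discover(path)["qos"]} out of {len(path)} can do QoS')
print(require(path, [{"qos", "natfw"}]))
```

Because capabilities are named individually, a node can gain a new one without the protocol needing a new version number.
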
Next-hop discovery
[Flowchart: decisions “next IP hop NSIS-aware?” and “existing transport connection?” (each Y/N); actions “use D mode to find next NSIS hop”, “establish transport connection”, “done”]
- scout messages are special NSIS messages
  - limited < MTU size
  - addressed to session destination
  - UDP with router alert option → get looked at by each router
  - reflected when matching NSIS node found

Mobility and route changes
[Figure labels: “discovers new route on refresh”; old branch B=1; ADD on branch B=2; DEL (B=2)]
- avoids session identification by end point addresses
- avoid use of traffic selector as session identifier
- remove dead branch

QoS-NSLP: resource reservation
- NSLP for signaling QoS reservations in the Internet
- both sender- and receiver-initiated reservations
- soft-state
- peer-to-peer signaling and refresh (rather than end-to-end)
- bundled sessions (e.g., video + audio)
- agnostic about QoS models (IntServ, DiffServ, RMD, …)

QoS-NSLP: sender-initiated reservation
[Message sequence across QNI, QNE, QNE, QNR: RESERVE (RSN #4), RESERVE (RSN #17), RESERVE (RSN #3), followed by three RESPONSE messages]

QoS-NSLP: receiver-initiated reservation
[Message sequence across QNI, QNE, QNE, QNR: three QUERY messages, then three RESERVE messages, then three RESPONSE messages]

QoS flow aggregation
[Figure: aggregate QoS-NSLP style (RFC 3175) compared with sink-tree style (BGRP); traffic sink (LAN)]

Route change detection
- Don’t want to wait for periodic rediscovery – delay of 30s+
- Not all route changes matter
  - e.g., only changes between NSIS routers
- Data plane detection
  - TTL change of arriving data packets
  - propagation delay change for data packets
  - monitoring propagation delay (~ min(e2e delay))
  - increases in packet loss or jitter

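Two of the data plane signals, TTL change and a shift in the minimum end-to-end delay, can be combined in a small detector. This is a sketch under stated assumptions rather than a method from the talk: the class name, the window of recent packets and the 5 ms tolerance are hypothetical parameters.

```python
from collections import deque
from typing import Deque, Optional


class RouteChangeDetector:
    """Flags a possible route change from arriving data packets."""

    def __init__(self, window: int = 5, tolerance_s: float = 0.005) -> None:
        self.window = window
        self.tolerance_s = tolerance_s
        self.last_ttl: Optional[int] = None
        self.baseline_s: Optional[float] = None      # min delay on current route
        self.recent: Deque[float] = deque(maxlen=window)

    def observe(self, ttl: int, delay_s: float) -> bool:
        """Return True if this packet suggests that the route has changed."""
        ttl_changed = self.last_ttl is not None and ttl != self.last_ttl
        self.last_ttl = ttl

        # Propagation delay is approximated by the minimum over recent packets.
        self.recent.append(delay_s)
        delay_changed = False
        if len(self.recent) == self.window:
            current = min(self.recent)
            if self.baseline_s is None:
                self.baseline_s = current
            elif abs(current - self.baseline_s) > self.tolerance_s:
                delay_changed = True
                self.baseline_s = current
        return ttl_changed or delay_changed
```

A positive result would only be a hint to trigger rediscovery early; as the slide notes, not every route change matters to the signaling state.
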
Route change measurements
- 12 measurement sites (looking glass)
- one traceroute every 15 min → 2.75 hours per pair
- availability: 99.8%
- 0.1% repeated IP addresses
- 4.4% single hop with multiple IP addresses
- 422 route changes observed after data cleanup (13,074 records)
- 67 out of 422 also showed AS changes
  - often, indicates multi-homing

Route changes
[Figure: histogram of route changes; x-axis: TTL change (-8 to 6), y-axis: frequency (0 to 250)]

On-going and planned work
- Finish NTLP (GIMPS) and NSIS clients (NAT-FW and QoS)
- Longer term: off-path signaling (new WG?)
- New applications: diagnostics
- Mobility support

Conclusion
- QoS deployment: 25 year old technology at edge only
  - can do 95% with 5% complexity
- Security concerns trump utilization optimization
  - prioritize user traffic → deny resources to attacker
- GIMPS: a re-engineered generic signaling mechanism