QoSIP - Columbia University
QoS and security - using traditional
services for new ends
Henning Schulzrinne
Dept. of Computer Science
Columbia University
Feb. 4, 2005
QoSIP - Catania
1
Overview
Some impolite remarks about network
research and QoS
QoS challenges in real networks:
NATs and firewalls
DOS
reliability
Permission-based networking
GIMPS: next steps in signaling
Impolite remarks on QoS
and network research
Lifecycle of technologies
traditional technology propagation:
military: opex/capex doesn’t matter; expert support; Can it be done?
corporate: capex/opex sensitive, but amortized; expert support; Can I afford it?
consumer: capex sensitive; amateur; Can my mother use it?
Networking research is fashion-driven
[Figure: life cycle of a research fashion, with stages: workshop; white paper; funding ($$) from DARPA, NSF, and EU Nth framework; Sigcomm, Infocom, Mobicom, ICNP; secondary conferences; "First (European) workshop on X", "YAP on X"; networking courses; trailing-edge research]
Example topics: ATM, DQDB, QoS, active networks, mobile networks, wireless, ad-hoc, sensor
Impact of network research
What’s promising/interesting – two different axes:
Intellectual merit: interesting analysis, broadly applicable, …
Satisfies practical needs: may not be a scientific breakthrough
Field has few grand challenges and metrics
cf., speech understanding or face recognition
Depends largely on external technology inputs:
faster CPUs, better optical gear, compression
typical performance improvements in queueing: 20-50%
Networking research impact:
on deployed systems and protocols?
on understanding network behavior?
on other papers?
Which of the 10,000 QoS papers had real impact?
Which papers were responsible for the most important networking advances?
TCP, web?, email?
Maturing network research
Old questions:
Can we make X work over packet networks?
All major dedicated network applications (flight reservations,
embedded systems, radio, TV, telephone, fax, messaging, …)
are now available on IP
Can we get M/G/T bits to the end user?
Raw bits everywhere: “any media, anytime, anywhere”
New questions:
Dependency on communications → Can we make the network reliable?
Can non-technical users use networks without becoming amateur sys-admins? → auto/zeroconfiguration, autonomous computing, self-healing networks, …
Can we prevent social and financial damage inflicted through
networks (viruses, spam, DOS, identity theft, privacy
violations, …)?
Observations on network
research
Frustration with inability to change network infrastructure in less than 10 -- 20 year horizons:
IPv6
Layer-3 multicast
QoS
Security
Network research community has dismal track record for new applications:
web, IM, P2P (Gnutella, BitTorrent), … vs. video-on-demand
Niche applications get disproportionate attention
active networks, ad-hoc networks, (structured) P2P
successful applications don’t care if they don’t scale
centralized IM & search, unstructured P2P, …
Disconnect from standardization
Few attempts to bring research work into standards bodies
Standards bodies slow to catch up (e.g., P2P)
Why do good ideas fail?
Research: O(.), CPU overhead
“per-flow reservation (RSVP) doesn’t scale” → not the problem
at least now -- routinely handle O(50,000) routing states
Reality:
deployment cost of any new L3 technology is probably billions of $
Cost of failure:
conservative estimate (1 grad student year = 2 papers)
10,000 QoS papers @ $20,000/paper → $200 million
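The cost-of-failure estimate can be checked with a few lines of arithmetic. This is only a sketch of the slide's own numbers; the implied cost per grad student year is derived here and is not stated on the slide.

```python
# Back-of-the-envelope check of the slide's cost-of-failure estimate.
PAPERS = 10_000                # QoS papers (from the slide)
COST_PER_PAPER_USD = 20_000    # from the slide
PAPERS_PER_STUDENT_YEAR = 2    # "1 grad student year = 2 papers" (from the slide)

total_cost_usd = PAPERS * COST_PER_PAPER_USD
student_years = PAPERS // PAPERS_PER_STUDENT_YEAR
# Derived, not stated on the slide: what one grad student year would cost.
implied_cost_per_student_year = COST_PER_PAPER_USD * PAPERS_PER_STUDENT_YEAR

print(f"total cost:            ${total_cost_usd:,}")
print(f"grad student years:    {student_years:,}")
print(f"implied cost per year: ${implied_cost_per_student_year:,}")
```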
Cause of death for the next big thing
[Table: candidate technologies (QoS, multicast, mobile IP, active networks, IPsec, IPv6) scored against the causes of death below]
not manageable across competing domains
not configurable by normal users (or apps writers)
no business model for ISPs
no initial gain
80% solution in existing system (NAT)
increase system vulnerability
QoS
QoS is meaningless to users
care about service availability → reliability
as more and more value depends on network services, can't afford random downtimes
common QoS models now:
scavenger service (worse-than-best-effort) → self-protection
DiffServ on access routers and NAT boxes
difficult to engineer service that is consistently poor, but usable
but most commercial service is good enough for VoIP/video/… most of the time
charging model problem → users will arbitrage and buy basic quality except during congestion periods
see multi-homing vs. high-end providers
Why did QoS (mostly) fail?
hypothesis: “The success of a technology is inversely proportional to the number of papers published before its popularity.”
ACM: 10,158 papers with QoS or “quality of service” in abstract
IEEE: 7,297 papers
real-time streaming video-on-demand → DVD via Netflix or TCP onto 200 GB hard disk
bandwidth “too cheap to meter”
undemocratic – some traffic is more equal than others
reminds you of your mom: no, you can’t have that 10 Mb/s now
socialist: administer scarcity → we like SUVs (or to drive 100 mph)!
“risky scheme”: security
only displacement applications (such as telephony) need QoS
requires cooperation: edge-ISP, transit ISPs, end systems
snake oil: add QoS, lose half your bandwidth
Why did QoS fail? (cont'd)
dishonesty: we only talk about the beneficiaries
network has become harder to evolve:
network address translation
firewalls
high packetization overhead (VPNs, IPv6)
to be useful, has to be nearly universally supported (“no, you can’t make calls to AS 123”)
network QoS vs. business class model: “coach is empty, please refund fare”
currently, the ISP interface is IP and BGP – adding a third one is a big deal
new Internet service model: TCP client (inside) – server (outside)
exception: peer-to-peer on college campuses
network to host: you first, no, you first
failure of IP QoS → success of MPLS
more TE than QoS
Where did QoS technology
succeed?
Edge network:
VLAN prioritization
802.11e MAC layer priority
IP TOS byte (not quite DiffServ) – known
since 1980s…
Docsis/PacketCable application-initiated
Mostly deals with self-interference
No admission control
No authorization (except Docsis)
Network reliability
we don’t know precisely why network applications fail
temporary overloads
reduce operator errors
components and backbones appear to be pretty reliable
but we measured 99.5% usable time → far below the 99.999% of telecom networks
lots of possible culprits, including DNS and carrier
interconnects
e.g., XCONF effort in IETF
inherently safe or fail-safe protocols?
faster convergence in routing protocols
BGP up to 20-30 minutes!
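To put the two availability figures side by side, the sketch below converts each percentage into expected downtime per year. It assumes a 365-day year and treats availability as a simple fraction of wall-clock time, which the slide does not spell out.

```python
# Convert availability percentages into expected downtime per year.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)

def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year during which the service is unusable."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR

measured_hours = downtime_hours_per_year(99.5)     # measured Internet figure
telecom_hours = downtime_hours_per_year(99.999)    # "five nines" telecom target

print(f"99.5%   -> {measured_hours:.1f} hours/year")
print(f"99.999% -> {telecom_hours * 60:.2f} minutes/year")
print(f"ratio   -> about {measured_hours / telecom_hours:.0f}x more downtime")
```

Under these assumptions the measured figure corresponds to roughly 44 hours of downtime a year, against about five minutes for the telecom target.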
New applications – need for QoS?
New bandwidth-intensive applications
Reality-based networking
(security) cameras
Distributed games often require only low-bandwidth control information
current game traffic ~ VoIP
Computation vs. storage vs. communications
communications cost has decreased less rapidly than storage costs
Emphasis on user control of communications
from “anywhere, anytime, any media” to “where appropriate, my time, my media”
Guess: #1 user-selected research problem: fix spam
#2: keep cell phone from ringing in the movie theater
New network
architectures for security
Security challenges
DOS, security attacks → permissions-based communications
only allow modest rates without asking
effectively, back to circuit-switched
Higher-level security services → more application-layer access via gateways, proxies, …
User identity
problem is not availability, but rather overabundance
Trustability: Internet decay
Decay of inner cities: small number of bad elements
+ lack of social controls and law enforcement
Small number of miscreants
“The bulk of U.S. spam is coming from a very limited set of IPs with high-bandwidth connections,” said Alperovitch, who estimated that the high-volume spamming addresses number fewer than 10,000 and the number of spammers at less than 200. (Informationweek, Aug. 2004)
Naïve users
with increasing firepower
Trustability problems
Traditional security didn’t solve user interface problem
traditional firewall (crunchy outside, squishy inside)
fails with any content – even JPEGs aren’t safe
in a village, you know your neighbors
notion of “global village” is an oxymoron
email usability rapidly decreasing
is citi-bank.com my bank or phishing?
on-going approaches useful, but limited:
conversion of protocols to secured versions (e.g., via TLS)
prevent source address spoofing
OS and application robustness against buffer overflow attacks
IETF MARID (SenderID, SPF, …) for email sender identification
DOS traceback
most spam proposals unlikely to work
thus, may need to rethink network architecture
Trustability: A more polite
Internet
introduce yourself first
“shoot first, ask later” (Bush)
“ask first, shoot later” (Kerry)
may I send?
yes, up to 10 kb/s
limits large-scale DDOS
more circuit-oriented
may get permission slip for future use
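The "may I send? / yes, up to 10 kb/s" exchange can be sketched as a receiver that hands out rate caps. This is a toy model, not a protocol from the talk: the class name, the method names, and the 1 kb/s default for senders that have not asked are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

DEFAULT_RATE_KBPS = 1.0   # hypothetical "modest rate without asking"
MAX_GRANT_KBPS = 10.0     # mirrors the slide's "yes, up to 10 kb/s"

@dataclass
class Receiver:
    """Toy receiver that grants senders a rate cap on request."""
    grants: dict = field(default_factory=dict)  # sender id -> granted kb/s

    def may_i_send(self, sender: str, requested_kbps: float) -> float:
        """Answer a request; the returned cap acts as a permission slip."""
        granted = min(requested_kbps, MAX_GRANT_KBPS)
        self.grants[sender] = granted  # remembered for future use
        return granted

    def allowed_rate(self, sender: str) -> float:
        """Rate a sender may use: its grant, or the modest default."""
        return self.grants.get(sender, DEFAULT_RATE_KBPS)

receiver = Receiver()
rate_before = receiver.allowed_rate("alice")      # no permission yet
granted = receiver.may_i_send("alice", 64.0)      # asks for 64 kb/s
rate_after = receiver.allowed_rate("alice")       # capped grant applies
print(rate_before, granted, rate_after)
```

Because every sender is capped until it asks, a large-scale attack has to obtain many grants first, which is the effect the slide attributes to a more circuit-oriented model.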
Restoring the village part of the
global village
It’s not what you know, it’s who you know
Authentication works only if addresses can be
recognized by policy or human
Doesn’t work well for first-time contacts
much of communications
won’t be fixed by SPF and SenderID
Need to leverage indirect knowledge
our approach: social networks for recognizing
users in SIP systems
leverage knowledge across media: visiting web
page enables receipt of email from related address
make phishing more difficult
GIMPS – a modular data
plane signaling protocol
(with Robert Hancock, Hannes Tschofenig, S. van den Bosch,
G. Karagiannis, A. McDonald, X. Fu and others)
Overview
Signaling: application vs. data plane
Resource control
DiffServ vs. IntServ
What’s wrong with RSVP?
Components of a general solution
NSIS = NTLP (GIMPS) + {NSLP}+
Route change detection
Signaling – the big picture
[Figure: session signaling through a SIP proxy server; off-path signaling through an off-path NE; on-path (datapath) signaling along the data path across AS#1 and AS#2]
Need for data plane state
establishment
Differentiated treatment of packets:
QoS
firewall (loss = 100% vs. loss = 0%)
Mapping state:
network address translation (NAT)
Counting packets:
accounting
Other state establishment:
setting up active network capsules
MPLS paths
pseudo-wire emulation (PWE) – T1 over IP
Related: visit subset of data path nodes, but don’t leave state behind
diagnostics → better traceroute
link speeds, load, loss, packet treatment, …
On-path vs. off-path signaling
On-path (path-coupled): visit subset of
routers on data path
Off-path (path-decoupled): anything else, but
presumably roughly along data path
one proposal: one “touch point” for each AS
bandwidth broker
difficult part is resource tracking, not signaling
No fundamental differences in protocol
separate out next-hop discovery to allow reuse
Differentiated packet handling
Not just QOS, but also
firewall
network address translation
accounting and measurement
[Figure: traffic filtering followed by traffic shaping, handling & measurement; filter management by IntServ or DiffServ]
DiffServ vs. IntServ
Filter always uses packet characteristic
5-tuple (protocol, source/destination address + port) + global label (TOS)
multiple “flows” can be mapped to one treatment mechanism

                 DiffServ         IntServ     in-band
identification   TOS, 5-tuple?    5-tuple     5-tuple
mapping          fixed            signaled    TCP SYN
The scaling bogeyman
“It doesn’t scale!”
Networks routinely handle large-scale per-flow state:
firewalls
NATs
scaling = cost per flow is constant (or decreasing)
flow numbers are modest:
OC-48 can handle 31,875 DS-0 voice calls
Mean call duration = 9 min → 60 requests/second
probably about 3 MB of data
partially explained by poor initial RSVP explanations
where flow search time ~ O(N) rather than O(1)
likely limitations are in AAA, not router signaling
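The OC-48 figures can be reproduced with a short steady-state calculation: with N simultaneous calls and a mean holding time T, new calls arrive at about N/T per second. The sketch below uses the slide's numbers; reading "about 3 MB" as 3,000,000 bytes is an assumption.

```python
# Steady-state signaling load for a fully loaded OC-48 carrying voice calls.
OC48_VOICE_CALLS = 31_875        # DS-0 calls (from the slide)
MEAN_CALL_DURATION_S = 9 * 60    # 9 minutes (from the slide)
STATE_BYTES_TOTAL = 3_000_000    # "about 3 MB", read as 3 * 10**6 bytes (assumption)

# Arrival rate needed to keep N calls active with mean duration T is N / T.
setup_rate_per_s = OC48_VOICE_CALLS / MEAN_CALL_DURATION_S
bytes_per_flow = STATE_BYTES_TOTAL / OC48_VOICE_CALLS

print(f"call setups per second: {setup_rate_per_s:.1f}")
print(f"state per flow:         {bytes_per_flow:.0f} bytes")
```

The result is roughly 59 setups per second, consistent with the slide's rounded 60 requests/second, and under 100 bytes of state per flow.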
RSVP characteristics
soft-state = state vanishes if not
refreshed
two-pass signaling = path discovery +
reservation
receiver-based resource reservation
separation of QoS signaling from
routing
with some router feedback
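Soft state, as defined above, can be illustrated with a small table whose entries disappear unless refreshed. This is a generic sketch, not RSVP's actual data structures; the class name, the 90-second lifetime, and the explicit `now` argument (used so the example is deterministic) are assumptions.

```python
class SoftStateTable:
    """Per-flow state that vanishes unless refreshed within a lifetime."""

    def __init__(self, lifetime_s: float):
        self.lifetime_s = lifetime_s
        self._expires_at = {}  # flow id -> absolute expiry time

    def refresh(self, flow_id: str, now: float) -> None:
        """Install state, or extend the lifetime of existing state."""
        self._expires_at[flow_id] = now + self.lifetime_s

    def expire(self, now: float) -> list:
        """Remove state that was not refreshed in time; return its flow ids."""
        dead = [f for f, t in self._expires_at.items() if t <= now]
        for flow_id in dead:
            del self._expires_at[flow_id]
        return dead

    def __contains__(self, flow_id: str) -> bool:
        return flow_id in self._expires_at

table = SoftStateTable(lifetime_s=90)   # hypothetical lifetime
table.refresh("flow-a", now=0)
table.refresh("flow-b", now=0)
table.refresh("flow-a", now=60)         # only flow-a is refreshed
removed = table.expire(now=100)         # flow-b timed out at t=90
print(removed, "flow-a" in table)
```

No explicit teardown message is needed for flow-b: a sender that crashes or moves away simply stops refreshing.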
The problem with RSVP
Designed for QoS establishment, used mostly for other things (RSVP-TE)
Designed for large-scale IP multicast → customer never materialized
adds significant complexity:
receiver-based PATH + RESV
designed for ASM (any-source) rather than SSM (source-specific)
receiver-based motivated by receiver diversity – not very useful in practice
Designed in simpler days (1997):
does not work well with mobile nodes (IP mobility or changing IP
addresses)
no support for NATs
security mostly bolted on – non-standard mechanisms
single-purpose, with no clear extensibility model
very primitive transport mechanism
either refresh or exponential decay (refresh reduction, RFC 2961)
The cost of multicast for RSVP
reservation styles
multiple senders in same group: shared vs. distinct
sender selection: explicit vs. wildcard
receiver-oriented
motivated by heterogeneous receivers
can do leaf-initiated join rather than root-initiated
but still need periodic PATH to visit new subtree
three different flow specs
Sender_TSpec, ADSpec, (TSpec, RSpec)
fairly tightly woven into core protocol
state merging and management
killer reservation (KR-II)
generally, error handling problematic
[Figure: multicast tree with reservation values (10, 20, 40, 60) on its branches and a ResvErr! label]
draft-fu-rsvp-multicast-analysis
IETF NSIS working group
chartered in Dec. 2001, after BOF in March
2001
Motivated by Braden’s two-layer model (draft-lindell-waypoint, draft-braden-2level-signal-arch)
Active participation from Roke Manor,
Siemens, NEC Europe, Nokia, Samsung,
Columbia
Based partially on CASP protocol designed by
Columbia/Siemens group and prototyped at
UKy
NSIS protocol structure
[Figure: protocol stack: NSLP (C) above NTLP (GIMPS); NTLP comprises GIMPS above a transport layer (UDP, TCP, SCTP) and IP router alert]
client layer does the real work:
QoS, NAT/FW, …
reserve resources
open firewall ports
…
transport layer:
reliable transport
messaging layer:
establishes and tears down state
negotiates features and capabilities
NSIS properties
Network friendly:
congestion-controlled
re-use of state across applications
soft state:
per-node time-out
explicit removal of state
extensible:
data format
negotiation
transport neutral:
any reliable protocol
initially, TCP and SCTP
also, UDP for initial probing
application-neutral:
add more applications later
policy neutral:
no particular AAA policy or protocol
interaction with COPS, DIAMETER needs work
NSIS properties, cont'd.
Topology hiding
not recommended, but
possible
Light weight
implementation
complexity
security associations
(re-use)
may not need kernel
implementation
What is GIMPS?
Generic signaling transport service
establishes state along path of data
one sender, typically one receiver
can be multiple receivers → multicast (not in initial version)
can be used for QoS per-flow or per-class reservation
but not restricted to that
avoid restricting users of protocol (and religious arguments):
sender vs. receiver orientation
more or less closely tied to data path
initially, router-by-router (path-coupled)
later, network (AS) path (path-decoupled)
NSIS network model – path-coupled
[Figure: NTLP chain of omnivorous and selective nodes; client protocols shown on the nodes: QoS, QoS, QoS, midcom]
NTLP nodes form NTLP chain
not every node processes all client protocols:
non-NTLP node: regular router
omnivorous: processes all NTLP messages
selective: bypassed by NTLP messages with unknown client protocols
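The three node types above can be sketched as a filter over the data path: for a given client protocol, only some nodes take part in the chain. The `Node` class, the node names, and the example path are hypothetical; only the three behaviors come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    kind: str                         # "regular", "omnivorous", or "selective"
    clients: frozenset = frozenset()  # client protocols a selective node knows

def processes(node: Node, client_protocol: str) -> bool:
    """Does this node process an NTLP message carrying client_protocol?"""
    if node.kind == "omnivorous":
        return True                   # processes all NTLP messages
    if node.kind == "selective":
        return client_protocol in node.clients  # bypassed if unknown
    return False                      # regular router: just forwards packets

def ntlp_chain(path: list, client_protocol: str) -> list:
    """Names of the nodes on the path that form the chain for this protocol."""
    return [n.name for n in path if processes(n, client_protocol)]

# Hypothetical data path for illustration.
path = [
    Node("r1", "regular"),
    Node("n1", "omnivorous"),
    Node("n2", "selective", frozenset({"QoS"})),
    Node("n3", "selective", frozenset({"midcom"})),
]
qos_chain = ntlp_chain(path, "QoS")
midcom_chain = ntlp_chain(path, "midcom")
print(qos_chain, midcom_chain)
```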
Network model – path-decoupled
[Figure: data path crossing AS15465, AS 1249, and AS17; NTLP signaling via a bandwidth broker and NAC]
Also route network-by-network
can combine router-by-router with out-of-path messaging
GIMPS messages
Regular NTLP messages
establish or tear down state
carry client protocol
datagram (“D”) or connection (“C”) mode
Hop-by-hop reliability
Generated by any node along the chain
NSIS transport protocol usage
Most signaling messages are small and
infrequent
but:
not all applications, e.g., mobile code for active networks
digital signatures
re-"dialing" when resources are busy
Need:
reliability → avoid long setup delays
flow control → avoid overloading signaling server
congestion control → avoid overloading network
fragmentation of long signaling messages
in-sequence delivery → avoid race conditions
transport-layer security → integrity, privacy
This defines standard reliable transport protocols:
TCP
SCTP
Avoid re-inventing wheel → see SIP experience
GIMPS transport protocol
usage
One transport connection for many NSLP sessions
may use multiple TCP/SCTP ports
can use TLS for transport-layer security
compared to IPsec, well-exercised key establishment
not quite clear what the principal is
re-use of transport
no overhead of TCP and SCTP session establishment
avoid TLS session setup
better timer estimates
SCTP avoids HOL blocking
Message forwarding
Route stateless or state-full:
stateless: record route and retrace
state-full: based on next-hop information in ‘C’ node
Destination:
address: look at destination address
address + record: record route
route: based on recorded route
state forward: based on next-hop state
state backward: based on previous-hop state
State:
no-op: leave state as is
ADD: add message (and maybe client) state
DEL: delete message state
Message format
[Figure: message layout: common header, extensions, client protocol data]
No GIMPS distinction between requests and responses
just routed in different directions
client protocol may define requests and responses
Common header defines:
destination flag
state flag
session identifier
traffic selector: identify traffic "covered" by this session
message sequence number
response sequence number
message cookie → avoid IP address impersonation
origin address → may not be data source or sink
destination address or scope
Message format, cont'd
Limit session lifetime
Avoid loops → hop counter
Mobility:
dead branch removal flag
branch identifier
Record route: gathers up addresses of NSIS nodes
visited
Route: addresses that NSIS message should visit
Capability negotiation
NSIS has named capabilities
including client protocols
Three mechanisms:
discovery: count capabilities along a path
"10 out of 15 can do QoS"
record: record capabilities for each node
require: for scout message, only stop once node supports all
capabilities (or-of-and)
avoid protocol versioning
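The three negotiation mechanisms can be sketched over a path of nodes, each advertising a set of named capabilities. The function names, the path, and the capability names below are hypothetical; "or-of-and" is read here as a list of alternative sets, where a node matches if it supports every capability in at least one set.

```python
def discover(path, capability):
    """discovery: count the nodes along the path that have a capability."""
    return sum(capability in caps for _, caps in path)

def record(path):
    """record: list the capabilities of each node, in path order."""
    return [(name, sorted(caps)) for name, caps in path]

def require(path, alternatives):
    """require: return the first node that supports all capabilities of at
    least one alternative set (or-of-and), or None if no node qualifies."""
    for name, caps in path:
        if any(alt <= caps for alt in alternatives):
            return name
    return None

# Hypothetical path: (node name, set of named capabilities).
path = [
    ("a", {"QoS"}),
    ("b", {"NAT-FW"}),
    ("c", {"QoS", "NAT-FW"}),
]
qos_count = discover(path, "QoS")
stop_both = require(path, [{"QoS", "NAT-FW"}])
stop_either = require(path, [{"NAT-FW"}, {"QoS", "midcom"}])
print(f"{qos_count} out of {len(path)} can do QoS;", stop_both, stop_either)
```

Naming capabilities individually is what lets a node ask for exactly what it needs instead of comparing protocol version numbers.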
Next-hop discovery
[Figure: flowchart. Is the next IP hop NSIS-aware? If no (N), use D mode to find next NSIS hop. If yes (Y), is there an existing transport connection? If yes (Y), done; if no (N), establish transport connection.]
scout messages are special NSIS messages
limited to less than MTU size
addressed to session destination
UDP with router alert option
get looked at by each router
reflected when matching NSIS node found
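One reading of the next-hop discovery slide can be sketched as two small functions: the decision between the three outcomes, and a scout message that travels toward the session destination until a matching NSIS node reflects it. The function names and the example path are hypothetical, and "matching" is assumed here to mean "NSIS-aware and supporting the requested client protocol".

```python
def next_hop_action(next_ip_hop_is_nsis_aware: bool,
                    has_transport_connection: bool) -> str:
    """Pick one of the three outcomes of next-hop discovery."""
    if not next_ip_hop_is_nsis_aware:
        return "use D mode to find next NSIS hop"
    if has_transport_connection:
        return "done"
    return "establish transport connection"

def scout(path, client_protocol):
    """Walk the routers toward the destination; every router looks at the
    scout message, and the first matching NSIS node reflects it.

    path: list of (name, clients); clients is None for a non-NSIS router.
    Returns (node name, hop count) or None if no node matches.
    """
    for hop, (name, clients) in enumerate(path, start=1):
        if clients is not None and client_protocol in clients:
            return name, hop
    return None

# Hypothetical path toward the session destination.
path = [("r1", None), ("r2", None), ("n1", {"NAT-FW"}), ("n2", {"QoS"})]
found = scout(path, "QoS")
print(next_hop_action(False, False), "->", found)
```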
Mobility and route changes
[Figure: old branch (B=1) and new branch (B=2); ADD with B=2 discovers new route on refresh; DEL (B=2) follows the old branch]
avoids session identification by end point addresses
avoid use of traffic selector as session identifier
remove dead branch
QoS-NSLP: resource reservation
NSLP for signaling QoS reservations in the
Internet
both sender- and receiver-initiated
reservations
soft-state
peer-to-peer signaling and refresh (rather
than end-to-end)
bundled sessions (e.g., video + audio)
agnostic about QoS models (IntServ, DiffServ,
RMD, …)
QoS-NSLP: sender-initiated
reservation
[Figure: message sequence among QNI, QNE, QNE, QNR: RESERVE on each hop (RSN #4, RSN #17, RSN #3), then RESPONSE on each hop]
QoS-NSLP: receiver-initiated
reservation
[Figure: message sequence among QNI, QNE, QNE, QNR: QUERY on each hop, then RESERVE on each hop, then RESPONSE on each hop]
QoS flow aggregation
[Figure: two aggregation styles: aggregate QoS-NSLP style (RFC 3175); sink-tree style (BGRP) toward a traffic sink (LAN)]
Route change detection
Don’t want to wait for periodic rediscovery –
delay of 30s+
Not all route changes matter
e.g., only changes between NSIS routers
Data plane detection
TTL change of arriving data packets
propagation delay change for data packets
monitoring propagation delay (~ min(e2e delay))
increases in packet loss or jitter
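The first data-plane heuristic, a TTL change in arriving packets, can be sketched as a scan over the TTL values a receiver sees for one flow. The function name and sample values are made up for illustration; a real detector would also have to tolerate reordering and senders that vary their initial TTL.

```python
def ttl_change_events(ttls):
    """Return (packet index, TTL delta) wherever the arriving TTL differs
    from the previous packet's TTL, hinting at a changed hop count."""
    events = []
    previous = None
    for index, ttl in enumerate(ttls):
        if previous is not None and ttl != previous:
            events.append((index, ttl - previous))
        previous = ttl
    return events

# Hypothetical TTLs of packets arriving for a single flow.
observed_ttls = [54, 54, 54, 52, 52, 55]
events = ttl_change_events(observed_ttls)
print(events)
```

A negative delta means the new path is longer by that many hops, a positive one that it is shorter. A route change that keeps the hop count is invisible to this check, which is why the slide lists delay, loss, and jitter as further signals.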
Route change measurements
12 measurement sites (looking glass)
one traceroute every 15 min → 2.75 hours per pair
availability: 99.8%
0.1% repeated IP addresses
4.4% single hop with multiple IP addresses
422 route changes observed after data
cleanup (13,074 records)
67 out of 422 also showed AS changes
often, indicates multi-homing
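The 2.75-hour figure is consistent with each site cycling through the other 11 sites, one traceroute at a time. That probing schedule is an assumption, as is reading the 422 changes as a share of the 13,074 records; the slide gives only the raw numbers.

```python
# Sanity check of the measurement numbers on the slide.
SITES = 12
PROBE_INTERVAL_MIN = 15
RECORDS = 13_074
ROUTE_CHANGES = 422
AS_CHANGES = 67

# Assumption: each site probes the 11 other sites in turn.
destinations_per_site = SITES - 1
revisit_hours = destinations_per_site * PROBE_INTERVAL_MIN / 60

change_share = ROUTE_CHANGES / RECORDS    # assumed denominator
as_share = AS_CHANGES / ROUTE_CHANGES

print(f"time between probes of the same pair: {revisit_hours} hours")
print(f"records showing a route change:       {change_share:.1%}")
print(f"route changes that also changed AS:   {as_share:.1%}")
```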
Route changes
[Figure: histogram of route changes; x-axis: TTL change (-8 to 6); y-axis: frequency (0 to 250)]
On-going and planned work
Finish NTLP (GIMPS) and NSIS clients
(NAT-FW and QoS)
Longer term: off-path signaling (new
WG?)
New applications: diagnostics
Mobility support
Conclusion
QoS deployment: 25 year old technology at edge only
can do 95% with 5% complexity
Security concerns trump utilization optimization
prioritize user traffic → deny resources to attacker
GIMPS: a re-engineered generic
signaling mechanism