ICNP - Columbia University


Protocol and Network Design Challenges for a Consumer-Grade Internet
Henning Schulzrinne
(with members of the IRT group)
Dept. of Computer Science
Columbia University
ICNP (Berlin)
October 6, 2004
Overview
Life cycle of network technology
Challenges for network research
  no longer a young discipline
  from performance to usability
Permission-based networking
  “Anywhere, any time, any media” → “leave me alone”
  spam, DDOS and phishing
Lessons from designing IETF protocols
  protocol design engineering trade-offs
  Skype, Asterisk and SIP
Network evolution
Lifecycle of technologies
traditional technology propagation: military → corporate → consumer
  military: opex/capex doesn’t matter; expert support; Can it be done?
  corporate: capex/opex sensitive, but amortized; expert support; Can I afford it?
  consumer: capex sensitive; amateur; Can my mother use it?
Lifecycle of technologies
Examples:
  content creation:
    video cameras, non-linear editors, …
    35mm blue slides → PowerPoint in elementary school
  LANs: corporate → home Ethernet & wireless
But: now often faster technology evolution for consumers
  initial expense discourages replacement
  viz. airline 3270 terminals
  military GPS units
Internet and networks timeline
[Figure: timeline by decade, 1960 to 2010]
  phases: university prototypes; production use in research; commercial, early residential; broadband home
  port speeds: 100 kb/s; 1 Mb/s; 10 Mb/s; 100 Mb/s; 1 Gb/s
  Internet protocols: email, ftp; DNS, RIP, UDP, TCP, SMTP, SNMP, finger; ATM, BGP, OSPF, Mbone, IPsec, HTTP, HTML, RTP; XML, OWL, SIP, Jabber
  theory: queuing, architecture; routing, cong. control; DQDB, ATM, QoS, VoD; p2p, ad-hoc, sensor
Networking is getting into middle years
            idea           current
  IP        1969, 1980?    1981
  TCP       1974           1981
  telnet    1969           1983
  ftp       1980           1985
Technologies at ~30 years
Other technologies at similar maturity level:
  airplanes: 1903 – 1938 (Stratoliner)
  cars: 1876 – 1908 (Model T)
  analog telephones: 1876 – 1915 (transcontinental telephone)
  railroad: 1800s -- ?
(Communications) technologies rarely disappear (as long as operational cost is low):
  exceptions:
    telex, telegram, semaphores → fax, email
    X.25 + OSI, X.400 → IP, SMTP
    analog cell phones
  → thus, NGN (post-IP) discussions likely academic
History of networking
History of networking = non-network applications migrate
  postal & intracompany mail, fax → email, IM
  broadcast: TV, radio
  interactive voice/video communication → VoIP
  information access → web, P2P
  disk access → iSCSI, Fibre Channel-over-IP
Initially, mainstream applications (email)
  followed later (now) by niches:
    high reliability requirements & specialized features
    air traffic control
    EDI → XML
    911 system: CAMA → IP
Network evolution
Only three L2/L3 modes, now thoroughly explored:
  packet/cell-based
  message-based (application data units)
  session-based (circuits)
Applications:
  pull: web, ftp, RPC
  push asynchronous: email
  push synchronous: continuous media, IM
Replace specialized networks
  left to do: embedded systems
    need cost(CPU + network) < $10
    cars
    industrial (manufacturing) control
    commercial buildings (lighting, HVAC, security; now LONworks)
    remote controls, light switches
    keys replaced by biometrics
Transit cost (OC-3, NY – London)
Commercial access cost (T1)
[Figure: T1 commercial access cost in $/month (axis $0 to $700) by year, 1996 to 2003]
Disk storage cost (IDE)
[Figure: disk storage cost in $/GB (logarithmic axis, $1.00 to $100,000.00) by date, May-79 to Jan-04]
Network Research
Networking research is fashion-driven
[Figure: cycle of research fashion. Labels: workshop; white paper; DARPA, NSF → $$; EU Nth framework; Sigcomm, Infocom, Mobicom, ICNP; secondary conferences; First (European) workshop on X; YAP on X; networking courses; trailing-edge research. Topics shown: ATM, DQDB, QoS, active networks, mobile networks, wireless, ad-hoc, sensor]
Impact of network research
What’s promising/interesting – two different axes:
  Intellectual merit → interesting analysis, broadly applicable, …
  Satisfies practical needs → may not be a scientific breakthrough
Field has few grand challenges and metrics
  cf., speech understanding or face recognition
Depends largely on external technology inputs
  faster CPUs, better optical gear, compression
  typical performance improvements in queueing: 20-50%
Networking research impact
  on deployed systems and protocols?
  on understanding network behavior?
  on other papers?
Which of the 10,000 QoS papers had real impact?
What papers were responsible for most important networking advances?
  TCP, web?, email?
What’s fashionable (and not)
Judging from Infocom submissions and NSF panels:
  Security of any sort
  Peer-to-peer networks
  Sensor networks
  Overlay networks
  Network measurements
What’s not:
  QoS: scheduling, admission control, …
  Active networks
  Multicast
Maturing network research
Old questions:
  Can we make X work over packet networks?
    All major dedicated network applications (flight reservations, embedded systems, radio, TV, telephone, fax, messaging, …) are now available on IP
  Can we get M/G/T bits to the end user?
    Raw bits everywhere: “any media, anytime, anywhere”
New questions:
  Dependency on communications → Can we make the network reliable?
  Can non-technical users use networks without becoming amateur sys-admins? → auto/zeroconfiguration, autonomous computing, self-healing networks, …
  Can we prevent social and financial damage inflicted through networks (viruses, spam, DOS, identity theft, privacy violations, …)?
Observations on network research
Frustration with inability to change network infrastructure in less than 10 -- 20 year horizons:
  IPv6
  Layer-3 multicast
  QoS
  Security
Network research community has dismal track record for new applications
  web, IM, P2P (Gnutella, BitTorrent), … vs. video-on-demand
Niche applications get disproportionate attention
  active networks, ad-hoc networks, (structured) P2P
  successful applications don’t care if they don’t scale
    centralized IM & search, unstructured P2P, …
Disconnect from standardization
  Few attempts to bring research work into standards bodies
  Standards bodies slow to catch up (e.g., P2P)
Why do good ideas fail?
Research: O(.), CPU overhead
  “per-flow reservation (RSVP) doesn’t scale” → not the problem
    at least now -- routinely handle O(50,000) routing states
Reality:
  deployment costs of any new L3 technology are probably billions of $
Cost of failure:
  conservative estimate (1 grad student year = 2 papers)
  10,000 QoS papers @ $20,000/paper → $200 million
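The cost-of-failure estimate above is simple arithmetic, and a short sketch makes the implied figures explicit. The paper count, per-paper cost and papers-per-year ratio are taken from the slide; the variable names are illustrative, and the derived numbers are only as good as those inputs.

```python
# Back-of-the-envelope check of the slide's estimate. The three inputs come
# from the slide; everything else is derived arithmetic.
PAPERS = 10_000
COST_PER_PAPER_USD = 20_000
PAPERS_PER_STUDENT_YEAR = 2

total_cost_usd = PAPERS * COST_PER_PAPER_USD
student_years = PAPERS // PAPERS_PER_STUDENT_YEAR
cost_per_student_year_usd = COST_PER_PAPER_USD * PAPERS_PER_STUDENT_YEAR

print(f"total: ${total_cost_usd:,}")
print(f"grad-student years: {student_years:,}")
print(f"implied cost per student year: ${cost_per_student_year_usd:,}")
```

Under these inputs the total is $200 million, spread over 5,000 grad-student years at an implied $40,000 per year.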
Cause of death for the next big thing
[Table: causes of death (rows) against technologies (columns): QoS, multicast, mobile IP, active networks, IPsec, IPv6. Cell marks not recoverable; one cell is annotated “(NAT)”]
Causes:
  not manageable across competing domains
  not configurable by normal users (or apps writers)
  no business model for ISPs
  no initial gain
  80% solution in existing system
  increase system vulnerability
QoS
QoS is meaningless to users
  care about service availability → reliability
  as more and more value depends on network services, can't afford random downtimes
common QoS models now:
  scavenger service (worse-than-best-effort) → self-protection
  DiffServ on access routers and NAT boxes
difficult to engineer service that is consistently poor, but usable
  but most commercial service is good enough for VoIP/video/… most of the time
charging model problem → users will arbitrage and buy basic quality except during congestion periods
  see multi-homing vs. high-end providers
Why did QoS fail?
hypothesis: “The success of a technology is inversely proportional to the number of papers published before its popularity.”
  ACM: 10,158 papers with QoS or “quality of service” in abstract
  IEEE: 7,297 papers
real-time streaming video-on-demand → DVD via Netflix or TCP onto 200 GB hard disk
bandwidth “too cheap to meter”
undemocratic – some traffic is more equal than others
reminds you of your mom: no, you can’t have that 10 Mb/s now
socialist: administer scarcity → we like SUVs (or to drive 100 mph)!
“risky scheme”: security
only displacement applications (such as telephony) need QoS
requires cooperation: edge-ISP, transit ISPs, end systems
snake oil: add QoS, lose half your bandwidth
Why did QoS fail? (cont’d)
dishonesty: we only talk about the beneficiaries
network has become harder to evolve:
  network address translation
  firewalls
  high packetization overhead (VPNs, IPv6)
to be useful, has to be nearly universally supported (“no, you can’t make calls to AS 123”)
network QoS vs. business class model: “coach is empty, please refund fare”
currently, the ISP interface is IP and BGP – adding a third one is a big deal
new Internet service model: TCP client (inside) – server (outside)
  exception: peer-to-peer on college campuses
network to host: you first, no, you first
failure of IP QoS → success of MPLS
  more TE than QoS
Standardization
Oscillate: convergence ↔ divergence
  continued convergence clearly at physical layer
  connectivity trumps functionality
  niches larger → support separate networks
Really two facets of standardization:
  1. public, interoperable description of protocol, but possibly many (Tanenbaum)
  2. reduction to 1-3 common technologies
     LAN: Arcnet, tokenring, ATM, FDDI, DQDB, … → Ethernet
     WAN: IP, X.25, OSI → IP
     OS: dozens → Windows & Linux
Standardization
Have reached phase 2 in most cases, with RPC (SOAP) and presentation layer (XML) most recent ‘conversions’
Often, non-standardized technologies can be deployed faster
  single (dominant) vendor
  Skype vs. SIP and H.323
  AOL IM and Jabber vs. SIMPLE
  SMB vs. NFS
  → Standardization after success
IETF one-protocol-for-application vs. everything-is-RPC
  not enough network experts → standardization scales better
Infrastructure research questions: Scaling, Maintainability, Security, …
Scaling
  no major changes for 20+ years (link-state, DV, etc.)
  two-layer (intra/inter) → other routing paradigms
Maintainability
  protocols and systems are not designed with fault diagnosis capabilities
  e.g., “transparent” proxies, routing, DNS, hacked traceroute
Security
  secure routing protocols
  DOS prevention (pushback, source discovery)
… and Reliability
we don’t know precisely why network applications fail
  components and backbones appear to be pretty reliable
  but we measured at 99.5% of usable time → far below 99.999% in telecom networks
  lots of possible culprits, including DNS and carrier interconnects
temporary overloads
reduce operator errors
  e.g., XCONF effort in IETF
  inherently safe or fail-safe protocols?
faster convergence in routing protocols
  BGP → up to 20-30 minutes!
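To put the two availability figures on this slide side by side, here is a minimal sketch that converts an availability percentage into expected downtime per year. The function name and the 365-day year are assumptions made for illustration; they are not from the talk.

```python
# Illustrative only: convert availability into expected downtime per year.
# Assumes a 365-day year; the function name is hypothetical.

HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return expected unusable hours per year for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR


if __name__ == "__main__":
    for label, pct in [("measured usable time", 99.5),
                       ("telecom networks", 99.999)]:
        hours = downtime_hours_per_year(pct)
        print(f"{label}: {pct}% -> {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Under these assumptions, 99.5% corresponds to roughly 44 hours of unusable time per year, against a little over 5 minutes for 99.999%.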
Applications
Transition of custom protocols to XML, SOAP?
  but this is not the first try (DCE, SunRPC, COM, Java RMI, Corba, …)
Scalable event systems (research)
Presence (SIP/SIMPLE, Jabber, …)
P2P systems
Application-layer routing and overlays, multicast, discovery, …
The categorical imperative for network design: “act only in accordance with that maxim through which you can at the same time will that it become a universal law.”
  satisfied by overlay networks for packet routing? flooding P2P systems?
New applications
New bandwidth-intensive applications
  Reality-based networking
  (security) cameras
Distributed games often require only low-bandwidth control information
  current game traffic ~ VoIP
Computation vs. storage vs. communications
  communications cost has decreased less rapidly than storage costs
Emphasis on user control of communications
  from anywhere, anytime, any media to where appropriate, my time, my media
  Guess: #1 user-selected research problem: fix spam
  #2: keep cell phone from ringing in the movie theater
New network architectures for security
Security challenges
DOS, security attacks → permissions-based communications
  only allow modest rates without asking
  effectively, back to circuit-switched
Higher-level security services → more application-layer access via gateways, proxies, …
User identity
  problem is not availability, but rather overabundance
Trustability: Internet decay
Decay of inner cities: small number of bad elements + lack of social controls and law enforcement
Small number of miscreants
  “‘The bulk of U.S. spam is coming from a very limited set of IPs with high-bandwidth connections,’ said Alperovitch, who estimated that the high-volume spamming addresses number fewer than 10,000 and the number of spammers at less than 200.” (InformationWeek, Aug. 2004)
Naïve users
  with increasing firepower
Trustability problems
Traditional security didn’t solve user interface problem
  is citi-bank.com my bank or phishing?
traditional firewall (crunchy outside, squishy inside)
  fails with any content – even JPEGs aren’t safe
email usability rapidly decreasing
  most spam proposals unlikely to work
notion of “global village” is an oxymoron
  in a village, you know your neighbors
on-going approaches useful, but limited
  conversion of protocols to secured versions (e.g., via TLS)
  prevent source address spoofing
  OS and application robustness against buffer overflow attacks
  IETF MARID (SenderID, SPF, …) for email sender identification
  DOS traceback
thus, may need to rethink network architecture
Trustability: A more polite Internet
introduce yourself first
  “shoot first, ask later” (Bush)
  “ask first, shoot later” (Kerry)
may I send?
yes, up to 10 kb/s
limits large-scale DDOS
more circuit-oriented
may get permission slip for future use
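The "may I send?" / "yes, up to 10 kb/s" exchange on this slide can be sketched as a receiver that issues a permission slip and a token bucket that enforces the granted rate. This is a hypothetical illustration only: the class names, the flat grant policy and the bucket size are assumptions, and the talk does not specify any particular enforcement mechanism.

```python
# Hypothetical sketch of permission-based sending: a receiver grants a rate,
# and a token bucket enforces it. Names and numbers are illustrative.
from dataclasses import dataclass


@dataclass
class Permission:
    """A 'permission slip': the rate a receiver is willing to accept."""
    sender: str
    rate_bps: int  # granted rate in bits per second


class Receiver:
    def __init__(self, default_rate_bps: int = 10_000):
        # 10 kb/s mirrors the "yes, up to 10 kb/s" reply on the slide.
        self.default_rate_bps = default_rate_bps

    def may_i_send(self, sender: str) -> Permission:
        # A real policy would depend on who is asking; this one grants
        # every sender the same modest rate.
        return Permission(sender=sender, rate_bps=self.default_rate_bps)


class TokenBucket:
    """Allows bursts up to `capacity_bits`, refilled at the granted rate."""

    def __init__(self, permission: Permission, capacity_bits: int):
        self.rate_bps = permission.rate_bps
        self.capacity_bits = capacity_bits
        self.tokens = float(capacity_bits)
        self.last_time = 0.0

    def allow(self, packet_bits: int, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last_time)
        self.tokens = min(self.capacity_bits,
                          self.tokens + elapsed * self.rate_bps)
        self.last_time = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False
```

Time is passed in explicitly (`now`, in seconds) so the behavior is deterministic; a sender that exhausts its slip has to wait for the bucket to refill, which is what limits large-scale floods in this model.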
Restoring the village part of the global village
It’s not what you know, it’s who you know
Authentication works only if addresses can be recognized by policy or human
Doesn’t work well for first-time contacts → much of communications
  won’t be fixed by SPF and SenderID
Need to leverage indirect knowledge
  our approach: social networks for recognizing users in SIP systems
  leverage knowledge across media: visiting web page enables receipt of email from related address
  → make phishing more difficult
Reflections on protocol design
Internet services – the missing entry
Service/delivery   synchronous                           asynchronous
push               instant messaging, presence,          messaging
                   event notification, session setup,
                   media-on-demand
pull               data retrieval, file download,        peer-to-peer file sharing
                   remote procedure call
Filling in the protocol gap
Service/delivery   synchronous                           asynchronous
push               SIP; RTSP, RTP                        SMTP
pull               HTTP; ftp; SunRPC, Corba, SOAP        (not yet standardized)
SIP trapezoid
[Figure: SIP trapezoid. Labels: outbound proxy; destination proxy (identified by SIP URI domain); registrar; [email protected]: 128.59.16.1; 1st request; 2nd, 3rd, … request; voice traffic (RTP)]
Reflections on protocol development: SIP
Decisions that worked well:
  generality over efficiency
  proxies as first-class protocol entities
  text-based protocol (+ compression)
    XML would be fashionable now, but would not allow additional capabilities
  separation of header and session description/body
    but would remove distinction between proxies and B2BUAs
    allowed expanding scope to presence + IM
    allowed S/MIME
  allow simple implementation
    but many simple implementations are lazy
… and those I’d like to re-think
too many options in text encoding
  lower/upper case, spaces, line folding, escaping, …
support of unreliable, non-secure transport
  but UDP ensured early implementations and lower latency
special-casing INVITE transaction
  optimized in messages, but complicates implementation
  should have had generic state transition indication
should have allowed for more regular insertion mechanism for protocol elements
  but would probably be fairly expensive
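The "too many options in text encoding" point can be made concrete with a small sketch of what even a minimal header reader has to normalize: header-name case, optional spaces around the colon, and folded continuation lines. This is an illustration under simplifying assumptions, not a complete SIP parser; the function name is hypothetical, and escaping, compact header forms and comma-separated values are deliberately ignored.

```python
# Illustrative sketch, not a complete SIP parser: it only handles header-name
# case, optional whitespace around the colon, and folded continuation lines.
from typing import Dict, List


def parse_headers(raw: str) -> Dict[str, List[str]]:
    """Parse a header block into {lowercased-name: [values...]}."""
    unfolded: List[str] = []
    for line in raw.replace("\r\n", "\n").split("\n"):
        if not line:
            continue
        if line[0] in " \t" and unfolded:
            # Folded line: continuation of the previous header value.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)

    headers: Dict[str, List[str]] = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names compare case-insensitively; surrounding spaces are
        # not significant.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers
```

Every one of these branches is a place where a "simple" implementation can quietly diverge from its peers, which is the cost the slide attributes to a permissive text encoding.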
Protocols are now ecosystems
[Figure: counts per year (axis 0 to 70), 1999 to 2003, for SIP, SIPPING and SIMPLE; includes draft-ietf-*-00 and draft-personal-*-00]
Combining client-server and P2P: P2P-SIP
Unlike server-based SIP architecture
Unlike proprietary Skype architecture
  Robust and efficient lookup using DHT
  Interoperability
    DHT algorithm uses SIP messages
Hybrid architecture
  Lookup in SIP+P2P, end-to-end communications
Unlike file-sharing applications
  Data storage, caching, delay, reliability
Disadvantages
  Lookup delay and security
Architecture
[Figure: P2P-SIP node architecture. Components: User interface (buddy list, etc.); User location; DHT (Chord); SIP; ICE; RTP/RTCP; Codecs; Audio devices. Events and messages: On startup; On reset; Discover; Detect NAT; Multicast REG; Peer found; Join; Leave; Signup, Find buddies; Signout, transfer; IM, call; Find; REG; REG, INVITE, MESSAGE]
DHT communication using SIP REGISTER



Known node: sip:[email protected]
Unknown node: sip:[email protected]
User: sip:[email protected]
Dialing Out
Call, instant message, etc.
  INVITE sip:[email protected]
  MESSAGE sip:[email protected]
If existing buddy, use cache first
If not found
  SIP-based lookup (DNS NAPTR, SRV,…)
  P2P lookup
    Use DHT to locate: proxy or redirect to next hop
    Send to super-nodes: proxy
[Figure: DHT ring with node 42. Labels: INVITE; INVITE key=42; 302; Last seen]
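The lookup order on this slide (cached buddy location first, then the DHT) can be sketched as follows. This is a hypothetical, simplified illustration: the ring uses a plain Chord-style successor search without finger tables, the SIP-based DNS lookup step is omitted, and the 6-bit key space, node IDs and all names are invented for the example.

```python
# Hypothetical sketch of "cache first, then DHT" user lookup. The ring is a
# simplified Chord-style successor search without finger tables; node IDs,
# the 6-bit key space, and all names are made up for illustration.
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional

KEY_BITS = 6  # tiny identifier space (0..63) so the example is easy to follow


def key_for(user: str) -> int:
    """Hash a user identifier onto the ring."""
    digest = hashlib.sha1(user.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** KEY_BITS)


def successor(node_ids: List[int], key: int) -> int:
    """Return the first node at or clockwise after `key` (wrapping around)."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]


class Locator:
    def __init__(self, node_ids: List[int]):
        self.node_ids = node_ids
        self.buddy_cache: Dict[str, int] = {}

    def locate(self, user: str, key: Optional[int] = None) -> int:
        # 1. Existing buddy: use the cached location first.
        if user in self.buddy_cache:
            return self.buddy_cache[user]
        # 2. Otherwise ask the DHT which node is responsible for the key.
        lookup_key = key_for(user) if key is None else key
        node = successor(self.node_ids, lookup_key)
        self.buddy_cache[user] = node
        return node
```

In this sketch a request for key 42 on a ring containing node 42 resolves to that node, and a repeat lookup for the same user is answered from the cache without touching the ring.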
Conclusion
Is networking research becoming like civil engineering: large, important infrastructure, but resistant to fundamental change?
Challenges are in reliability and maintainability, rather than performance or packet-loss & jitter QoS
As a community, need to learn more from our collective and individual mistakes…
  Need series “The design mistakes in [popular system or protocol]”