Carrier-grade vs. Internet VoIP
Henning Schulzrinne
(with Wenyu Jiang)
Columbia University
FCC Technical Advisory Council III
Washington, DC – October 20, 2003
Overview
Previous talk: interactive communication services
signaling & media
Now focus on overall architecture:
network & service availability
signaling services: SIP, H.323
supporting services: DNS, DHCP, LDAP, …
network transport
network quality-of-service
packet loss, delay, jitter
Overview
(on-going work, preliminary results, still looking for measurement sites, …)
Service availability
Measurement setup
Measurement results
call success probability
overall network loss
network outages
outage induced call abortion probability
Service availability
Users do not care about QoS
at least not about packet loss, jitter, delay
rather, it’s service availability: how likely is it that I can place a call and not get interrupted?
availability = MTBF / (MTBF + MTTR)
MTBF = mean time between failures
MTTR = mean time to repair
availability = successful calls / first call attempts
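The two availability definitions above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the year length of 365.25 days is an assumption made here, not something stated on the slides.

```python
# Minimal sketch of the two availability definitions on this slide.
# Function names and the 365.25-day year are illustrative assumptions.

MINUTES_PER_YEAR = 365.25 * 24 * 60


def availability(mtbf: float, mttr: float) -> float:
    """availability = MTBF / (MTBF + MTTR); both arguments in the same time unit."""
    return mtbf / (mtbf + mttr)


def call_availability(successful_calls: int, first_call_attempts: int) -> float:
    """availability = successful calls / first call attempts."""
    return successful_calls / first_call_attempts


def downtime_minutes_per_year(avail: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # "5 nines" corresponds to roughly 5 minutes of downtime per year.
    print(round(downtime_minutes_per_year(0.99999), 2))
    # A 99.5% SLA allows on the order of 44 hours per year.
    print(round(downtime_minutes_per_year(0.995) / 60, 1))
```

Running it shows why "5 nines" is usually quoted as about 5 minutes per year, and how much larger the allowance becomes at 99.5%.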
equipment availability: 99.999% (“5 nines”), about 5 minutes/year of downtime
AT&T (2003):
Long-distance voice   99.978%
ATM data              99.999%
Frame relay data      99.998%
IP                    99.991%
Sprint IP frame relay SLA: 99.5%
Availability – PSTN metrics
PSTN metrics (Worldbank study):
fault rate: “should be less than 0.2 per main line”
fault clearance (~ MTTR): “next business day”
call completion rate during network busy hour: “varies from about 60% - 75%”
dial tone delay
Example PSTN statistics
Source: Worldbank
Measurement setup
Node name       Location                            Connectivity  Network
columbia        Columbia University, NY             >= OC3        I2
wustl           Washington U., St. Louis                          I2
unm             Univ. of New Mexico                               I2
epfl            EPFL, Lausanne, CH                                I2+
hut             Helsinki University of Technology                 I2+
rr              NYC                                 cable modem   ISP
rrqueens        Queens, NY                          cable modem   ISP
njcable         New Jersey                          cable modem   ISP
newport         New Jersey                          ADSL          ISP
sanjose         San Jose, California                cable modem   ISP
suna            Kitakyushu, Japan                   3 Mb/s        ISP
sh              Shanghai, China                     cable modem   ISP
Shanghaihome    Shanghai, China                     cable modem   ISP
Shanghaioffice  Shanghai, China                     ADSL          ISP
Measurement setup
Active measurements
call duration 3 or 7 minutes
UDP packets:
36 bytes alternating with 72 bytes (FEC)
40 ms spacing
September 10 to December 6, 2002
13,500 call hours
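As a rough sanity check on the probe traffic described above, the following sketch works out packets per call and the mean payload bit rate. It assumes the 36- and 72-byte figures are payload sizes that strictly alternate and that IP/UDP header overhead is excluded; the slides do not say either way, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the probe stream on this slide.
# Assumptions (not stated in the source): 36/72 bytes are payload sizes,
# they strictly alternate, and header overhead is ignored.

PACKET_SPACING_S = 0.040          # 40 ms between packets
PAYLOAD_BYTES = (36, 72)          # alternating sizes (the larger one carries FEC)


def packets_per_second() -> float:
    return 1.0 / PACKET_SPACING_S


def packets_per_call(call_minutes: float) -> int:
    """Number of probe packets sent in one direction during a call."""
    return round(call_minutes * 60.0 / PACKET_SPACING_S)


def mean_payload_bitrate_bps() -> float:
    """Average payload bit rate of the alternating stream."""
    mean_bytes = sum(PAYLOAD_BYTES) / len(PAYLOAD_BYTES)
    return mean_bytes * 8.0 / PACKET_SPACING_S


if __name__ == "__main__":
    print(packets_per_second())
    print(packets_per_call(3), packets_per_call(7))
    print(mean_payload_bitrate_bps())
```

Under these assumptions a 3-minute call sends 4,500 packets and a 7-minute call 10,500, at a payload rate of about 10.8 kb/s.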
Call success probability
62,027 calls succeeded, 292 failed: 99.53% availability
roughly constant across I2, I2+, commercial ISPs

Paths                     Call success
All                       99.53%
Internet2                 99.52%
Internet2+                99.56%
Commercial                99.51%
Domestic (US)             99.45%
International             99.58%
Domestic commercial       99.39%
International commercial  99.59%
Overall network loss
PSTN: once connected, call usually of good quality
exception: mobile phones
compute periods of time below loss threshold
5% causes degradation for many codecs
others acceptable till 20%

Percentage of time below each loss threshold:

Paths      0%     5%      10%     20%
All        82.3   97.48   99.16   99.75
ISP        78.6   96.72   99.04   99.74
I2         97.7   99.67   99.77   99.79
I2+        86.8   98.41   99.32   99.76
US         83.6   96.95   99.27   99.79
Int.       81.7   97.73   99.11   99.73
US ISP     73.6   95.03   98.92   99.79
Int. ISP   81.2   97.60   99.10   99.71
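A table of this kind could be produced along the following lines. This is a sketch only: it assumes the trace has already been cut into fixed-length intervals with one loss rate per interval (the interval length is not given on the slides), it treats "below threshold" as "at or below" so that the 0% column is meaningful, and the sample data is made up for illustration.

```python
# Sketch: fraction of measurement intervals whose loss rate stays at or
# below a threshold. Interval length, the "<=" comparison, and the sample
# data are assumptions for illustration, not taken from the measurements.
from typing import Sequence


def fraction_within_threshold(loss_rates: Sequence[float], threshold: float) -> float:
    """Fraction of intervals with loss rate <= threshold (rates in 0..1)."""
    if not loss_rates:
        raise ValueError("need at least one interval")
    within = sum(1 for rate in loss_rates if rate <= threshold)
    return within / len(loss_rates)


if __name__ == "__main__":
    # Hypothetical per-interval loss rates for one path.
    sample = [0.0, 0.0, 0.02, 0.0, 0.08, 0.0, 0.25, 0.0, 0.0, 0.12]
    for threshold in (0.0, 0.05, 0.10, 0.20):
        share = fraction_within_threshold(sample, threshold)
        print(f"loss <= {threshold:.0%}: {share:.0%} of intervals")
```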
Network outages
sustained packet losses
arbitrarily defined at 8 packets
far beyond any recoverable loss (FEC, interpolation)
23% outages
make up significant part of 0.25% unavailability
symmetric: A→B and B→A
spatially correlated: A→B and A→X
not correlated across networks (e.g., I2 and commercial)
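The outage definition above can be illustrated by scanning a per-packet receive trace for long loss runs. The sketch assumes "defined at 8 packets" means a run of 8 or more consecutive lost packets and uses the 40 ms probe spacing to convert run length to seconds; the trace format and function names are hypothetical.

```python
# Sketch: detect outages as runs of consecutive lost packets.
# Assumptions: an outage is >= 8 consecutive losses; packets are 40 ms apart;
# the trace is a list of booleans (True = packet received).
from typing import Iterable, List, Tuple

OUTAGE_MIN_PACKETS = 8
PACKET_SPACING_S = 0.040


def find_outages(received: Iterable[bool],
                 min_run: int = OUTAGE_MIN_PACKETS) -> List[Tuple[int, int]]:
    """Return (start_index, length_in_packets) for each qualifying loss run."""
    outages: List[Tuple[int, int]] = []
    run_start = 0
    run_len = 0
    for index, ok in enumerate(received):
        if not ok:
            if run_len == 0:
                run_start = index
            run_len += 1
        else:
            if run_len >= min_run:
                outages.append((run_start, run_len))
            run_len = 0
    if run_len >= min_run:            # trace ended during an outage
        outages.append((run_start, run_len))
    return outages


def outage_seconds(length_in_packets: int) -> float:
    return length_in_packets * PACKET_SPACING_S


if __name__ == "__main__":
    trace = [True] * 5 + [False] * 3 + [True] * 2 + [False] * 10 + [True] * 4 + [False] * 8
    for start, length in find_outages(trace):
        print(start, length, round(outage_seconds(length), 2))
```

In the example trace the 3-packet loss burst is ignored, since FEC or interpolation might still cover it, while the 10- and 8-packet runs are reported as outages.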
Network outages
[Figure: complementary CDF of outage duration (sec, 0 to 400), logarithmic vertical axis; one panel compares US domestic paths with international paths, the other compares all paths with Internet2]
Network outages
Paths  no. of outages  % symmetric  duration (mean)  duration (median)  total (all, h:m)  outages > 1000 packets (h:m)
all    10,753          30%          145              25                 17:20             10:58
I2     819             14.5%        360              25                 3:17              2:33
I2+    2,708           10%          259              26                 7:47              5:37
ISP    8,045           37%          107              24                 9:33              4:58
US     1,777           18%          269              20                 5:18              3:53
Int.   8,976           33%          121              26                 12:02             6:42
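The table does not state the unit of the mean duration. One plausible reading, offered here as an assumption rather than something the slides confirm, is that durations are counted in packets at the 40 ms probe spacing. The sketch below checks that reading by multiplying the outage count by the mean duration and comparing the result with the "total (all, h:m)" column.

```python
# Consistency check under an assumed unit: mean outage duration in packets,
# with 40 ms per packet. Values are copied from the table above.

PACKET_SPACING_S = 0.040

# paths: (number of outages, mean duration, reported total as "h:m")
TABLE = {
    "all":  (10753, 145, "17:20"),
    "I2":   (819,   360, "3:17"),
    "I2+":  (2708,  259, "7:47"),
    "ISP":  (8045,  107, "9:33"),
    "US":   (1777,  269, "5:18"),
    "Int.": (8976,  121, "12:02"),
}


def implied_total_minutes(count: int, mean_packets: float) -> float:
    """Total outage time in minutes if the mean duration is in packets."""
    return count * mean_packets * PACKET_SPACING_S / 60.0


def hm_to_minutes(hm: str) -> int:
    hours, minutes = hm.split(":")
    return int(hours) * 60 + int(minutes)


if __name__ == "__main__":
    for paths, (count, mean_packets, reported) in TABLE.items():
        implied = implied_total_minutes(count, mean_packets)
        print(f"{paths:5s} implied {implied:7.1f} min, reported {hm_to_minutes(reported)} min")
```

Under this reading every row agrees with the reported total to within a few minutes, which is about what rounding the mean to a whole number would allow.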
Outage-induced call abortion probability
Long interruption → user likely to abandon call
from E.855 survey: P[holding] = e^(-t/17.26) (t in seconds)
half the users will abandon call after 12s
2,566 have at least one outage
946 of 2,566 expected to be dropped → 1.53% of all calls

Paths     Calls dropped
all       1.53%
I2        1.16%
I2+       1.15%
ISP       1.82%
US        0.99%
Int.      1.78%
US ISP    0.86%
Int. ISP  2.30%
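The holding-time model above lends itself to a short worked example. The sketch evaluates the exponential model, recovers the 12-second median, and reproduces the 1.53% figure from the quoted counts. How each call's outages were combined into a single abandonment probability is not described on the slides, so `expected_dropped_calls` simply assumes one interruption length per affected call, and using the 62,027 successful calls as the denominator is likewise an assumption.

```python
# Sketch of the abandonment model: P[holding] = exp(-t / 17.26), t in seconds.
# The per-call aggregation and the choice of denominator are assumptions.
import math
from typing import Iterable

TAU_S = 17.26


def p_holding(t_seconds: float) -> float:
    """Probability that a user is still holding after an interruption of t seconds."""
    return math.exp(-t_seconds / TAU_S)


def p_abandon(t_seconds: float) -> float:
    return 1.0 - p_holding(t_seconds)


def median_abandon_time() -> float:
    """Interruption length at which half the users have given up."""
    return TAU_S * math.log(2.0)


def expected_dropped_calls(interruption_per_call_s: Iterable[float]) -> float:
    """Expected number of abandoned calls, one interruption length per affected call."""
    return sum(p_abandon(t) for t in interruption_per_call_s)


if __name__ == "__main__":
    print(round(median_abandon_time(), 2))        # close to the 12 s quoted above
    print(round(100.0 * 946 / 62027, 2))          # share of calls expected to drop
    print(round(expected_dropped_calls([0.32, 1.0, 12.0, 60.0]), 2))
```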
Conclusions from measurement
Availability in space is (mostly) solved
availability in time restricts usability for new applications
initial investigation into service availability for VoIP
need to define metrics for, say, web access
unify packet loss and “no Internet dial tone”
far less than “5 nines”
working on identifying fault sources and locations
looking for additional measurement sites
What’s next?
Existing SLAs are mostly useless
Existing measurements similarly dubious
Limited ability to learn from mistakes
what are the primary causes of service unavailability?
what can I do to protect myself – multi-homing via same fiber?
diverse access mechanisms?
Consumers of services have no good ways to compare service availability
too many exceptions
wrong time scales: month vs. minutes
no guarantees for interconnects
only some very large customers may get access to carrier-internal data
Thus, market failure
Need published metrics
similar to switch availability reporting