Measuring Availability in
Telecommunications Networks
Mattias Thulin, November 2004
Disposition
1. Introduction
2. Method
3. Network Availability
4. SDH Network description
5. ITU-T standard G.826
6. Analysis
7. Implementation
8. Result
9. Conclusions
Song Networks
Nordic network provider
Optical fiber network covering Northern Europe
Products:
IP-VPN
Internet connections
Telephone services
Hosting
Carrier services
Introduction
Market demand for network quality
Important to measure network availability
Maintain service-level agreements
Attract new customers
Indicator of network quality for internal maintenance
Methods of measuring and defining Network Availability vary between operators
Purpose
How is Network Availability defined?
How can it be measured?
Why should it be measured?
What standards exist?
Are there any recommended values for availability parameters?
How can availability measurements be applied to Song Networks' SDH transmission network?
Delimitations
1) General study on Network Availability
2) Develop a method for availability measurement
Nortel SDH equipment
Four rings
44 links
Oriented towards network-operation
Method
Literature study
Network study
Monitoring system Preside
Interviews
Standards
Design model for availability measurement and presentation
Network Availability - definition
The ability of a functional unit to be in a state to
perform a required function under given conditions at a
given instant of time or over a given time interval,
assuming that the required external resources are
provided.
ISO 2382-14, 1997
Network Availability – The “five-nines”
Percentage value of uptime for a given time period
“Five-nines” (99.999%) is viewed as the desired uptime at network core level

Availability    Downtime per year
99.9999%        32s
99.999%         5min 15s
99.99%          52min 36s
99.9%           8h 46min
99%             3 days 15h 40min
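The downtime figures follow directly from the availability percentage. A minimal Python sketch, assuming an average year of 365.25 days (an assumption that matches the values in the table; the function name is illustrative only):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average year length


def downtime_per_year(availability_percent: float) -> float:
    """Return the downtime in seconds per year implied by an availability in %."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


# Print the downtime for each availability level on the slide.
for pct in (99.9999, 99.999, 99.99, 99.9, 99.0):
    seconds = downtime_per_year(pct)
    print(f"{pct}% -> {seconds:.0f} s ({seconds / 60:.1f} min)")
```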
Theoretic Availability
Summing availability
Serial units (A followed by B)
Total availability = A * B
Parallel units (A alongside B)
Total availability = A + B - A * B
Example: four serial units at 99.99% each
Total availability = 0.9999^4 ≈ 0.9996 (99.96%)
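The serial and parallel rules can be sketched in a few lines of Python. This assumes that unit failures are independent, which the formulas above imply; the function names are illustrative.

```python
def serial(*availabilities: float) -> float:
    """All units must work: multiply the individual availabilities."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total


def parallel(*availabilities: float) -> float:
    """At least one unit must work: 1 minus the chance that all are down.

    For two units this equals A + B - A * B.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= 1 - a
    return 1 - all_down


print(round(serial(0.9999, 0.9999, 0.9999, 0.9999), 4))
print(parallel(0.9999, 0.9999))
```

Serial chains always lower the total, while a parallel (redundant) pair raises it above either unit alone.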
Reactive Availability
Data from trouble-tickets
Good for measuring customer-experienced availability
Easy to identify what equipment failed and what solved the error
Can lack information on short interruptions and on interruptions outside office hours
Customer- vs. Network-management oriented
Important to know for whom, and for what purpose, we are measuring
Customer oriented
Includes all layers
Count downtime whenever the customer connection is not working
Network-management oriented
Which links have lower availability?
A failure is considered downtime even when the traffic is rerouted
SDH Network Description
SDH – Synchronous Digital Hierarchy
Based on the American standard SONET
Normally built in a ring structure
Error correction and retransmission are handled by overlying protocols
Song Networks’ SDH network
Sweden ring: 9 Network Elements
Nordic ring: 7 Network Elements
Europe ring: 7 Network Elements
Baltic ring: 3 Network Elements
[Figure: network diagram labelled with Network Elements 6050, 6052, 6054 and 6056 and links G11, G12, G17 and G18]
Surveillance and statistics
[Figure: Network Elements (NE), the OPC and Preside]
Preside
Global performance
Alarm lists
Query performance statistics
Preside log files
Comma-delimited text files (CSV)
One file per Network Element
96 15-min counts (past 24 hours)
8 24-hour counts (past week)
1200,Ottawa,OC48,Term,DS3,G7,2,Line,Rx,Ne,SES,03/07/99,03/07/99,16:00,
0,0,0,2,3,8,12,6,
0,0,1,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
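A record like the one above could be split into its descriptive fields and its counters roughly as follows. This is only a sketch: the layout (14 leading descriptive fields, then the 8 24-hour counts, then the 15-minute counts) is inferred from the example and is an assumption, as are all names in the code. The example on the slide shows only 32 of the 96 15-minute counts.

```python
from dataclasses import dataclass

HEADER_FIELDS = 14  # assumption: number of descriptive fields before the counters
DAILY_COUNTS = 8    # "8 24-hour counts (past week)"

EXAMPLE = """1200,Ottawa,OC48,Term,DS3,G7,2,Line,Rx,Ne,SES,03/07/99,03/07/99,16:00,
0,0,0,2,3,8,12,6,
0,0,1,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0"""


@dataclass
class PerformanceRecord:
    header: list[str]              # descriptive fields, left uninterpreted
    parameter: str                 # e.g. "SES"; assumed to be the 11th field
    daily_counts: list[int]        # 24-hour counters
    quarter_hour_counts: list[int] # 15-minute counters


def parse_record(text: str) -> PerformanceRecord:
    # Re-join wrapped lines, drop a trailing comma, then split on commas.
    flat = text.replace("\n", "").strip().rstrip(",")
    fields = [field.strip() for field in flat.split(",")]
    header = fields[:HEADER_FIELDS]
    counts = [int(field) for field in fields[HEADER_FIELDS:]]
    return PerformanceRecord(
        header=header,
        parameter=header[10],
        daily_counts=counts[:DAILY_COUNTS],
        quarter_hour_counts=counts[DAILY_COUNTS:],
    )


record = parse_record(EXAMPLE)
print(record.parameter, record.daily_counts)
```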
ITU-T Standard G.826
Parameters:
Bit error
Errored Block (EB): a block with one or more bit errors
Errored Second (ES): 1 sec with one or more EB
Severely Errored Second (SES): 1 sec with 30% or more EB
Unavailable Second (UAS)
[Figure: timeline of one-second intervals classified as Severely Errored Second, Errored Second (non-SES) or Error-free Second, with runs of 10 s and <10 s marked, showing "Unavailability detected", "Unavailable period", "Availability detected" and "Available period"]
A period of unavailable time begins at the onset of ten
consecutive SES events. These ten seconds are considered to
be part of unavailable time. A new period of available time
begins at the onset of ten consecutive non-SES events. These
ten seconds are considered to be part of available time.
(ITU-T G.826, 2002)
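The entry and exit rule quoted from G.826 can be expressed as a small state machine. The sketch below is a simplified illustration, not the algorithm used in the network equipment: it takes one SES flag per second (a hypothetical input format) and counts the unavailable seconds.

```python
def count_unavailable_seconds(ses_flags, window=10):
    """Count UAS from a per-second sequence of SES flags (True = SES)."""
    unavailable = False
    run = 0   # length of the current run that could flip the state
    uas = 0
    for is_ses in ses_flags:
        if not unavailable:
            if is_ses:
                run += 1
                if run == window:
                    # Ten consecutive SES: they are part of unavailable time.
                    unavailable = True
                    uas += window
                    run = 0
            else:
                run = 0
        else:
            uas += 1  # provisionally unavailable
            if not is_ses:
                run += 1
                if run == window:
                    # Ten consecutive non-SES: they are part of available time.
                    unavailable = False
                    uas -= window
                    run = 0
            else:
                run = 0
    return uas


# 15 SES followed by 10 clean seconds: only the 15 SES are unavailable.
print(count_unavailable_seconds([True] * 15 + [False] * 10))
```

A trace that ends during an unfinished run of non-SES seconds keeps those seconds counted as unavailable, since the exit criterion has not yet been met.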
Analysis
Define availability
Develop model for calculating average availability
Define database structure for saving availability statistics
Specify format for availability reports
Analysis
Follow ITU-T Standard G.826
Apply to all active links in the network
Calculate average availability per link, per ring and total network
Present first five significant figures
First calculate average UAS; convert to percentage in the last step to avoid rounding error
Availability = (MeasuredTime - UAS) / MeasuredTime * 100
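As a worked illustration of the formula, the following Python sketch computes the availability of one link and formats it with five significant figures. The function names and the sample numbers are assumptions made for the example, not values from the measurements.

```python
def availability_percent(measured_seconds: float, uas: float) -> float:
    """Availability = (MeasuredTime - UAS) / MeasuredTime * 100."""
    if measured_seconds <= 0:
        raise ValueError("measured time must be positive")
    return (measured_seconds - uas) / measured_seconds * 100


def format_availability(value: float) -> str:
    """Present the first five significant figures (trailing zeros are dropped)."""
    return f"{value:.5g}"


# Illustrative numbers: 110 unavailable seconds in 1,000,000 measured seconds.
print(format_availability(availability_percent(1_000_000, 110)))
```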
Analysis
Example with three links: average UAS = (UAS1 + UAS2 + UAS3) / 3
Availability for a ring is calculated from the average UAS of all the links in the ring
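A ring-level figure can be sketched by averaging the UAS counts first and converting to a percentage only at the end. The link names and counts below are made up for illustration, and the sketch assumes every link was measured over the same period.

```python
def ring_availability_percent(link_uas: dict[str, int], measured_seconds: float) -> float:
    """Average the UAS over all links in the ring, then convert to a percentage."""
    if not link_uas:
        raise ValueError("a ring needs at least one link")
    average_uas = sum(link_uas.values()) / len(link_uas)
    return (measured_seconds - average_uas) / measured_seconds * 100


# Hypothetical ring with three links measured for 1,000,000 seconds.
example_ring = {"link-1": 0, "link-2": 600, "link-3": 300}
print(ring_availability_percent(example_ring, 1_000_000))
```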
Implementation
Log Files
Parser
Database
Analyze
Report
Implementation - parser program
Programmed in Java for platform independence
Parse all log-files in directory for:
NE
Link
Day
UAS count
Insert into MySQL database table
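The storage step might look roughly like the sketch below. The original parser was written in Java against MySQL; here Python with the standard-library sqlite3 module is swapped in purely to keep the example self-contained. The table name, column names and sample rows are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS uas_daily (
    ne   TEXT    NOT NULL,  -- Network Element
    link TEXT    NOT NULL,
    day  TEXT    NOT NULL,  -- ISO date, e.g. 2004-07-12
    uas  INTEGER NOT NULL,  -- unavailable seconds for that day
    PRIMARY KEY (ne, link, day)
)
"""


def store_counts(conn: sqlite3.Connection, rows) -> None:
    """Insert (ne, link, day, uas) rows; re-parsed days overwrite earlier values."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO uas_daily (ne, link, day, uas) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_counts(conn, [("NE1", "L1", "2004-07-12", 0), ("NE1", "L1", "2004-07-13", 19)])
# Parsing an overlapping log file again does not create duplicate rows.
store_counts(conn, [("NE1", "L1", "2004-07-13", 19)])
print(conn.execute("SELECT COUNT(*) FROM uas_daily").fetchone()[0])
```

Keying the table on NE, link and day matters because each log file covers the past week, so consecutive files overlap.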
Implementation – report generating
Web interface for easy access
Input parameters: start and end date
A PHP script queries the database for UAS values and calculates average availability
Per link
Per ring
Total network
Report can be saved in PDF format (PHP script)
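The per-link part of such a report could be sketched as follows. The original used a PHP script and MySQL; this Python and sqlite3 version is only an illustration, with a hypothetical table layout and made-up sample data. It assumes every day in the chosen range was measured in full.

```python
import sqlite3
from datetime import date


def link_report(conn: sqlite3.Connection, start: date, end: date) -> dict:
    """Return {(ne, link): availability in %} for the inclusive date range."""
    measured_seconds = ((end - start).days + 1) * 86_400
    rows = conn.execute(
        "SELECT ne, link, SUM(uas) FROM uas_daily "
        "WHERE day BETWEEN ? AND ? GROUP BY ne, link ORDER BY ne, link",
        (start.isoformat(), end.isoformat()),
    )
    return {
        (ne, link): (measured_seconds - uas) / measured_seconds * 100
        for ne, link, uas in rows
    }


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uas_daily (ne TEXT, link TEXT, day TEXT, uas INTEGER)")
conn.executemany(
    "INSERT INTO uas_daily VALUES (?, ?, ?, ?)",
    [
        ("NE1", "L1", "2004-07-12", 0),
        ("NE1", "L1", "2004-07-13", 19),
        ("NE1", "L2", "2004-07-12", 864),
    ],
)
report = link_report(conn, date(2004, 7, 12), date(2004, 7, 13))
for (ne, link), availability in report.items():
    print(ne, link, f"{availability:.5g}")
```

Links with no rows in the range are absent from the result, so a real report would need to decide how to present them.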
Implementation – Graphic reports
Crystal Reports
Start date and end date are entered; the program queries the database and produces graphic reports
Can be exported to a PDF file
Result
Between 2004-07-12 and 2004-09-19

SDH Ring     Availability in %
Sweden DX    99.983
Nordic DX    99.997
Europe DX    99.462
Baltic DX    99.814
TOTAL        99.81

NE      Link    Availability in %
6052    G11     99.989
6052    G12     99.989
6052    G17     99.944
6052    G18     99.958
6053    G17     99.856
6053    G18     99.959
6054    G17     100
6054    G18     99.95
6056    G11     99.989
6056    G12     100
6056    G17     99.951
6056    G18     100
6060    G11     100
6060    G12     100
6060    G17     100
6060    G18     99.993
6058    G17     99.993
6058    G18     100
6044    G11     100
6044    G12     99.989
6044    G17     100
6044    G18     100
6045    G17     100
6045    G18     100
6046    G17     100
6046    G18     100
Conclusions
Background study can be used for planning future measurements
Positive feedback from network operations management for the weekly reports
More statistics are needed in the database to observe general trends
By studying trends, Song Networks can cut maintenance spending and better forecast future costs by directing resources to maintain high network quality
Future work:
Measure backbone availability from a customer point of view using relational databases
How do errors in the backbone affect the distribution layer?
Questions?