Validating the Resilience Mechanisms for the
Packet Switched Domain in 3G Networks
Jari Hietanen
Nokia Networks
Helsinki
Supervisor: Professor Raimo Kantola
Instructor: Juhani Helske (M.Sc.)
© Jari Hietanen
Thesis Seminar, 7.6.2005.
Agenda
• Background for the Thesis
• The Objective and the Scope
• Packet Switched Domain in 3G Networks
• Resilience Mechanisms in Gn Interface
• Validation of the Existing Resilience Mechanism in Gn Interface
• Validation of New Resilience Mechanisms
• Validation Results and Comparison of Resilience Mechanisms
• Conclusions
Background for the Thesis
• Mobile networks are evolving from 2G towards 3G.
• So far, most traffic has been voice traffic.
• The amount of data traffic is now growing faster than traditional voice traffic.
• New packet-based services:
  • Multimedia messaging
  • Wireless Internet browsing
  • Advertising
  • Entertainment services
Voice and data traffic volume evolution in
Western Europe
New Challenges for Telecom Vendors and
Operators:
• The amount of IP-based data traffic is increasing in 3G networks.
• People are used to high availability and service quality in circuit
switched PSTN and GSM networks.
• They now expect the same level of availability and quality in packet
switched 3G networks.
• This poses a huge challenge for vendors and operators: to develop and build
fault-tolerant, resilient networks that guarantee high availability.
• 3G networks are complex:
  • Several transmission technologies and protocols (e.g. IP, ATM, SS7, GTP)
  • Many protocols and network layers
  • Multi-vendor HW
Terms
• What is the difference between the terms “resilience” and “redundancy”?
RESILIENCE:
“Ability of the system to function seamlessly in the event of the failure of any
single item of hardware or failure of the software package”.
REDUNDANCY:
According to the Collins English Dictionary, the term redundancy means
“duplication of components in electronic or mechanical equipment so that
operations can continue following failure of a part or repetition of
information or inclusion of additional information to reduce errors in
telecommunication transmissions and computer processing”.
The Objective and the Scope of the Thesis
The objective:
• The main objective was to study and validate existing and new resilience
mechanisms for the Gn interface in 3G networks.
The scope:
• The thesis concentrated only on resilience mechanisms at the data link
layer (L2) and the network layer (L3). Resilience solutions at higher and
lower layers were out of scope.
• The implementation of the Gn interface architecture and its resilience
mechanisms is not standardized. The thesis concentrated on validating Nokia’s
implementation of a resilient Gn interface between the 3G SGSN and GGSN
network elements.
The UMTS Network Architecture
The 3G SGSN
Main functions:
• Subscriber authentication & authorization.
• User data tunneling and routing (acts as a gateway for user data tunneling
between RNC and GGSN, separate tunnels towards RNC and GGSN for
each connection).
• Mobility management (controls the location, state and security of UE).
• Session management (managed through resource monitoring, admission
control and PDP context creation, modification and deletion).
• Traffic management (performs packet classifying, policing, buffering,
shaping, marking and scheduling to ensure that all connections receive
appropriate Quality of Service).
• Short message delivery.
• Collection of charging data and traffic statistics.
The 3G GGSN
Main functions:
• Signalling towards access networks. This signalling is required for
creating, modifying and deleting PDP contexts. The request for PDP context
creation always comes from an external network element.
• Signalling towards data networks. Some signalling is required for
configuring the PDP context. Most of it takes place when the PDP context is
created, e.g. to allocate an IP address for the User Equipment (UE).
• Charging. The GGSN analyses user plane traffic and reports the metering
results to the charging system via signalling interfaces.
• Subscription management, authentication and session control. The GGSN may
need to authenticate mobile subscribers before a PDP context can be
created. In addition, the GGSN may need to know which services the mobile
subscriber is allowed to use.
• Lawful interception. In many countries, local authorities require the
possibility to monitor the traffic of certain mobile subscribers.
The GPRS Tunnelling Protocol (GTP)
• The GPRS Tunnelling Protocol (GTP) is used on the Gn interface between GPRS
Support Nodes (GSNs) in UMTS and in GPRS backbone networks.
• GTP allows multi-protocol packets to be tunnelled through the UMTS or
GPRS backbone between GSNs, and between the SGSN and the UMTS Terrestrial
Radio Access Network (UTRAN).
• GTP is divided into GTP Control Plane (GTP-C) and GTP User Plane (GTP-U)
procedures.
GTP Path Management Messages and Timers
• Echo Request Interval
• Echo Response Interval
• The timer T3-RESPONSE holds the maximum time to wait for a response to a
request message.
• The counter N3-REQUESTS holds the maximum number of attempts made
by GTP to send a request message. The recommended value is 5.
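The interplay of T3-RESPONSE and N3-REQUESTS can be sketched as a simple retry loop. This is only an illustration, not Nokia's implementation: the names `path_is_alive` and `send_echo` are hypothetical, and the callback stands in for sending one GTP Echo Request and waiting up to T3-RESPONSE for the Echo Response.

```python
from typing import Callable


def path_is_alive(send_echo: Callable[[float], bool],
                  t3_response_s: float = 10.0,
                  n3_requests: int = 5) -> bool:
    """Return True if any of up to n3_requests echo attempts is answered.

    send_echo(timeout_s) is a hypothetical callback: it sends one Echo
    Request and returns True if a response arrives within timeout_s.
    """
    for _attempt in range(n3_requests):
        if send_echo(t3_response_s):
            return True   # peer answered, the path is up
    return False          # every attempt timed out, the path is declared down


# Example with a fake peer that answers only the third request.
answers = iter([False, False, True])
print(path_is_alive(lambda timeout_s: next(answers)))
```

Passing the transport in as a callback keeps the sketch independent of any real GTP stack, so the retry logic can be exercised without a network.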
Default parameter values used in Nokia 3G SGSN and GGSN:
Parameter                          Min value   Max value   Default value   Unit
GTP Echo Request Interval          10          360         60              seconds (s)
Echo Reply Waiting Time (T3)       1           60          10              seconds (s)
Echo Request Retransmission (N3)   1           10          5               times
This means that GTP declares a path between GSNs to be down after 1 to 600
seconds (T3 x N3), depending on the configuration of the GTP parameters.
For example, if the Echo reply waiting time (T3) is set to 5 seconds and the
Echo request retransmission count (N3) is set to 3, the time after which a GTP
path is declared down is T3 x N3 = 5 s x 3 = 15 seconds.
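The T3 x N3 arithmetic can be checked with a few lines of Python. The function name `gtp_path_down_time_s` is made up for this sketch; the parameter ranges are the Nokia values listed on this slide.

```python
def gtp_path_down_time_s(t3_s: int, n3: int) -> int:
    """Time until GTP declares the path down: T3 x N3, as on the slide."""
    return t3_s * n3


# (min, max, default) values from the parameter table on this slide.
T3_RANGE = (1, 60, 10)   # Echo Reply Waiting Time, seconds
N3_RANGE = (1, 10, 5)    # Echo Request Retransmission, times

print(gtp_path_down_time_s(5, 3))                       # slide example: 15
print(gtp_path_down_time_s(T3_RANGE[0], N3_RANGE[0]))   # minimum: 1
print(gtp_path_down_time_s(T3_RANGE[1], N3_RANGE[1]))   # maximum: 600
print(gtp_path_down_time_s(T3_RANGE[2], N3_RANGE[2]))   # defaults: 50
```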
Existing Resilience Mechanisms in the Gn
Interface
• The existing solution is to use the dynamic OSPF routing protocol in 3G
networks.
• The 3G SGSN is built on an IPSO router platform.
Gn Interface Resilience in 2G networks
• Gn interface topology for host based elements in 2G networks.
• The 2G SGSN does not have any routing functionality.
New Resilience Mechanisms for the Gn
Interface
• The problem with the existing OSPF-based resilience mechanism for the Gn
backbone is that its convergence time is not necessarily fast enough.
• In the worst case, convergence can take as long as 40 seconds if the OSPF
Hello protocol is the only mechanism for detecting a network failure.
• In addition, some operators are not willing to run a dynamic routing
protocol in their Gn backbone network.
• A solution would be to find a suitable data link layer (L2) protocol to be
used as the Gn interface resilience mechanism.
• Advantages of L2 mechanisms:
  • Simple and flexible network architecture.
  • Similarity with the 2G solution.
  • No need for two separate Gn interface Virtual LANs (VLANs).
  • Faster convergence from error situations than with OSPF.
Link Layer Resilience Mechanisms Validated:
• Link aggregation (IEEE 802.3ad)
• Proxy ARP
• Virtual Router Redundancy Protocol (VRRP)
• Hot Standby Router Protocol (HSRP)
• Virtual MAC address based method (Nokia’s own solution)
• Bidirectional Forwarding Detection (BFD)
Validating Existing Resilience Mechanism
• First, the existing OSPF-based (L3) resilience mechanism was validated by
building a Gn test network that included 3G SGSN and GGSN nodes.
• Conclusions from the OSPF validation test results:
  • With default parameter values, OSPF network convergence is too slow.
  Convergence can take as long as 40 seconds if the network includes hubs
  or switches.
  • OSPF convergence can be improved by using shorter Hello and Router Dead
  intervals.
  • The minimum value is 1 second for the Hello interval and 4 seconds for
  the Router Dead interval.
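The effect of the two timers on failure detection can be sketched as follows. This is a simplified model, not a measurement: it assumes that Hello packets are the only detection mechanism and ignores SPF calculation and route installation time. The function name is hypothetical, and the 10 s / 40 s pair is the commonly used OSPF default on broadcast networks, which matches the 40-second worst case quoted above.

```python
from typing import Tuple


def ospf_detection_window_s(hello_s: float, dead_s: float) -> Tuple[float, float]:
    """Return (best, worst) neighbour failure detection time in seconds.

    The dead timer restarts on every received Hello. If the neighbour fails
    right after a Hello, the full Router Dead interval elapses (worst case).
    If it fails just before the next Hello was due, one Hello interval of
    the dead timer has already passed (best case).
    """
    if dead_s <= hello_s:
        raise ValueError("Router Dead interval must exceed the Hello interval")
    return (dead_s - hello_s, dead_s)


print(ospf_detection_window_s(10, 40))   # common defaults: (30, 40)
print(ospf_detection_window_s(1, 4))     # minimum values above: (3, 4)
```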
Validating New Resilience Mechanisms
• Some of the link layer techniques were validated by building a test network
and measuring convergence times after link failures.
• Other techniques were analyzed by other means:
  • literature sources
  • interviews
Link Aggregation Test Results
• Test topology:
• Failover time after a link failure: about 500 ms.
VRRP Test Results
• Test topology:
• Failover time after a link failure: about 2.8 seconds.
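For orientation, the VRRP specification (RFC 3768) defines how long a backup router waits before taking over: Master_Down_Interval = 3 x Advertisement_Interval + Skew_Time, with Skew_Time = (256 - Priority) / 256. The sketch below evaluates that formula with the protocol defaults (1 s advertisements, priority 100). The timer values used in the test network are not given on the slide, so these defaults are an assumption, and the function name is made up for this example.

```python
def vrrp_master_down_interval_s(advert_interval_s: float = 1.0,
                                priority: int = 100) -> float:
    """Master_Down_Interval per RFC 3768, in seconds."""
    if not 1 <= priority <= 254:
        raise ValueError("backup router priority must be in 1..254")
    skew_time_s = (256 - priority) / 256
    return 3 * advert_interval_s + skew_time_s


print(vrrp_master_down_interval_s())   # 3.609375
```

Because the timer restarts with each advertisement received, the takeover delay seen after a failure lies up to one advertisement interval below this value, depending on when the failure occurs.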
Test Results of the Virtual MAC Based Mechanism
• Test topology:
• Failover time after a link failure: about 600 ms.
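The reported failover times can be set side by side and compared with the GTP path timeout. The sketch below uses only figures from the slides (500 ms, 600 ms, 2.8 s, and the 40 s OSPF worst case) together with the example GTP values T3 = 5 s and N3 = 3. The helper name is hypothetical, and the comparison assumes, as a simplification, that GTP echo retries start at the moment of the failure.

```python
# Failover times in seconds, as reported on the slides.
FAILOVER_S = {
    "Link aggregation (IEEE 802.3ad)": 0.5,
    "Virtual MAC address based method": 0.6,
    "VRRP": 2.8,
    "OSPF, default timers (worst case)": 40.0,
}


def recovers_before_gtp_timeout(failover_s: float, t3_s: float, n3: int) -> bool:
    """True if the link recovers before GTP would declare the path down."""
    return failover_s < t3_s * n3


# T3 = 5 s and N3 = 3 are the example values from the GTP timer slide.
for name, seconds in sorted(FAILOVER_S.items(), key=lambda item: item[1]):
    ok = recovers_before_gtp_timeout(seconds, t3_s=5, n3=3)
    print(f"{name:<36} {seconds:>5.1f} s  within GTP timeout: {ok}")
```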
Comparison of Resilience Mechanisms
Existing OSPF based solution:
+ A dynamic routing protocol is easy for the network operator to configure
and administer.
+ OSPF is a common protocol that is also supported by other vendors'
products.
- OSPF is not suitable for all network topologies, e.g. if the network
includes switches.
- Convergence from error situations is rather slow.
- Hello protocol based failure detection is not fast enough if the default
parameter values are used.
Link aggregation:
+ Does not require new hardware.
+ Does not waste extra interface or line card capacity in the router.
+ Fast recovery from failure situations (about 500 ms).
+ Standardized solution.
+ Economical and flexible method to increase network capacity.
Proxy ARP:
+ It can be added to a single router on a network without disturbing the routing
tables of the other routers on the network.
+ IP hosts can be used without configuring a default gateway.
+ The IP network does not need to have any routing intelligence.
+ Fast recovery time from failure situations.
+ Standardized solution.
+ It is easy to implement (only the gateway router has to be updated to
support Proxy ARP).
- Hosts need larger ARP tables to handle IP-to-MAC mappings.
- The amount of ARP traffic increases.
- This does not work with all network topologies (e.g. more than one router
connecting two physical networks).
VRRP:
+ It offers higher availability of the default path without requiring
configuration of a dynamic routing protocol or router discovery protocols on
every end-host.
+ Fast recovery from failure situations (about 2.8 s on average).
+ Simple and flexible protocol.
+ Standardized solution.
- Not feasible to use with all network node architectures.
HSRP:
+ The protocol offers functionality similar to VRRP.
- It is a patented solution that cannot be used free of charge.
- It does not offer anything superior to VRRP.
Bidirectional Forwarding Detection:
+ OSPF alone offers a minimum convergence time of 1-2 seconds. With BFD,
OSPF can achieve sub-second failure detection. According to measurements
made by Marko Luoma (Lic.Tech.) in the HUT Networking Laboratory, the
convergence time can be as low as 75 ms.
+ Because BFD is not tied to any particular routing protocol, it can be used as
a generic and consistent failure detection mechanism for e.g. OSPF, IS-IS,
EIGRP, and BGP routing protocols.
+ CPU usage on the route processor is minimal.
- BFD can potentially generate false alarms, signalling a link failure when
one does not exist. Because the BFD timers are so tight, a brief interval of
data corruption or queue congestion could cause BFD to miss enough control
packets for the detection timer to expire.
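The trade-off between fast detection and false alarms follows from how BFD derives its detection time. In asynchronous mode, as later standardized in RFC 5880, the local detection time is the remote detect multiplier times the agreed interval, where the agreed interval is the larger of the local Required Min RX Interval and the remote Desired Min TX Interval. The sketch below evaluates this rule. The function name and the 50 ms / multiplier 3 values are illustrative assumptions, not the settings behind the 75 ms measurement quoted above.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Detection time in asynchronous mode, following the RFC 5880 rule."""
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Hypothetical settings: 50 ms intervals on both sides, multiplier 3.
print(bfd_detection_time_ms(3, 50, 50))   # 150
```

With a multiplier of 3, three consecutive lost control packets are enough to declare the session down, which is why short intervals make BFD sensitive to brief congestion.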
Conclusions
• The OSPF-based layer 3 resilience mechanism alone is suitable for the Gn
network today, but in the future it will be too slow at detecting network
failures.
• For a greenfield operator who does not have an existing 2G network, it might
be reasonable to implement resilience using OSPF. A dynamic routing
protocol based solution offers advantages over link layer solutions; for
example, network configuration is simpler.
• However, the performance of the Gn network can be improved by using an L2
mechanism under OSPF.
• Bidirectional Forwarding Detection seems to be the most suitable L2
resilience mechanism to be used with OSPF.
Questions?