IP-MPLS - E-Photon One +

This teaching material is part of the e-Photon/ONe Master study in Optical Communications and Networks.
Course: Optical Network Resilience
Module: Multi-Layer Resilience
Authors:
Mario Pickavet, IBBT
Luca Valcarenghi, Scuola Superiore Sant'Anna, Pisa, Italy
This tutorial is licensed under the Creative Commons BY-NC-SA 3.0 license:
http://creativecommons.org/licenses/by-nc-sa/3.0/
http://www.e-photon-one.org
Outline
Multi Layer Resilience
Single Layer Recovery
Protection in the Optical layer
Recovery in the IP-MPLS layer
Recovery Interworking techniques
Multilayer survivability strategies
Case study
Authors: Mario Pickavet, Luca Valcarenghi
Course: Optical Network Resilience
Module: Multi-Layer Resilience
Revision: 1/2007
Protection in the Optical Layer
[Figure: IP-MPLS network (nodes a–e) layered over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; optical protection path in OTN network.]
Restoration in the IP-MPLS layer
[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN carrying IP spare capacity.]
Secondary failures
[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; lightpath through OTN carrying IP spare capacity.]
Single Layer Recovery
• Optical protection
  - Large granularity → few recovery actions
  - Close to the root failure:
    • No delay due to failure propagation
    • No need to deal with complex secondary failures
  - Known to be fast (at least protection)
  - BUT: cannot recover from all failures
• IP-MPLS recovery
  - For sure, better failure coverage
  - MPLS protection (making use of pre-established backup LSPs) can also be fast
  - BUT:
    • Can be confronted with complex secondary failure scenarios
    • Fine granularity → many recovery actions
    • During recovery, increased usage of capacity → decreased QoS
• Conclusion: combine recovery at both layers
Without coordination
Without coordination, both layers react to the same fiber cut: the OTN layer switches to its optical protection path while routers a and b, alerted by AIS signals, simultaneously flood link-state advertisements (LSA a {a-b, a-e}, LSA b {b-a, b-c}) and trigger IP-MPLS rerouting, even though the optical layer is already recovering the lightpath.

[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; optical protection path in OTN network; lightpath through OTN carrying IP spare capacity.]
Example: SONET-SDH / OTN
Dual protection and extra loss!

Source: Nico Wauters, Gzim Ocakoglu, Kris Struyve, Pedro Falcao Fonseca, "Survivability in a New Pan-European Carriers' Carrier Network Based on WDM and SDH Technology: Current Implementation and Future Requirements", IEEE Communications Magazine, no. 8, August 1999, pp. 63-69.
Hold-off timer
Optical protection succeeds: routers a and b receive AIS signals and wait for their hold-off timers to expire. The optical layer restores the lightpath over its protection path before the timers run out, so the routers cancel the timers and forget the failure; no IP-MPLS recovery action is taken.

[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; optical protection path in OTN network; lightpath through OTN carrying IP spare capacity.]
Hold-off timer
Optical protection fails: routers a, c and d receive AIS signals and wait for their hold-off timers to expire. Since the optical protection fails, the timers expire and the routers start IP-MPLS recovery, flooding link-state advertisements (LSA a {a-e}, LSA c {c-d}) and rerouting traffic over the lightpath carrying IP spare capacity.

[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; optical protection path in OTN network (fails); lightpath through OTN carrying IP spare capacity.]
Recovery Interworking
• Hold-off timer
  - adds a delay
  - but avoids unnecessary switch-over or route flaps in IP-MPLS
  - when needed? In case failure detection is based on:
    • AIS signals (e.g., PoS interfaces)
    • HELLO protocol (e.g., GbE interfaces): when the HELLO interval is too small
      - some examples:
        • RFC 3209: default 3.5 × 5 ms < 50 ms for optical protection → problem
        • In test-bed experiments: 5 × 80 ms (default: 4 × 100 ms)
• Alternative: Recovery Token
  - WHAT? Optical protection fails → send a recovery token signal to the IP-MPLS layer
  - PRO? Avoids unnecessary delay
  - CONTRA? Requires standardisation of the recovery token signal
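The hold-off timer behaviour above can be sketched as follows. This is a minimal illustrative sketch (class and callback names are hypothetical, and the timings are shortened for the example), assuming a thread-based timer: an AIS arms the timer, optical recovery in time cancels it, and expiry starts IP-MPLS recovery.

```python
import threading
import time

class HoldOffTimer:
    """Arm on AIS; cancel if the optical layer recovers in time."""

    def __init__(self, hold_off_s, start_recovery):
        self.hold_off_s = hold_off_s          # hold-off delay in seconds
        self.start_recovery = start_recovery  # IP-MPLS recovery action
        self._timer = None

    def on_ais(self):
        """Failure indication from the optical layer: start the timer."""
        self._timer = threading.Timer(self.hold_off_s, self.start_recovery)
        self._timer.start()

    def on_optical_recovery(self):
        """Lightpath restored in time: cancel the timer, forget the failure."""
        if self._timer is not None:
            self._timer.cancel()

events = []

# Optical protection succeeds before the timer expires: no IP-MPLS action.
t1 = HoldOffTimer(0.05, lambda: events.append("IP-MPLS recovery"))
t1.on_ais()
t1.on_optical_recovery()

# Optical protection fails: the timer expires and IP-MPLS recovery starts.
t2 = HoldOffTimer(0.05, lambda: events.append("IP-MPLS recovery"))
t2.on_ais()
time.sleep(0.2)
print(events)  # ['IP-MPLS recovery']
```

The recovery-token alternative would replace the timer expiry with an explicit signal from the optical layer, removing the fixed delay.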
Static recovery techniques
[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; optical protection path in OTN network; lightpath through OTN carrying IP spare capacity; optical protection path for lightpath carrying IP spare capacity.]

• Double protection: IP spare capacity is protected again optically
• IP spare capacity optically unprotected
• Common pool strategy: IP spare capacity is pre-emptible by the optical recovery → provide capacity only once
Dynamic recovery techniques
[Figure: IP-MPLS network (nodes a–e) over an OTN network (nodes A–E). Legend: link in IP-MPLS network; link in OTN network (= optical fiber); lightpath through OTN network; lightpath established at the time of the failure; lightpath through OTN carrying IP spare capacity.]
Summary
• Static recovery techniques
  - WHAT? Provide in advance sufficient spare capacity in the IP-MPLS layer
  - Strategies:
    • Double protection
    • IP spare capacity optically unprotected
    • Common pool strategy
  - ATTENTION! Potential problems (physical disjointness of IP-MPLS spare capacity and pre-emption), except in case of double protection
• Dynamic recovery techniques
  - WHAT? In case of a failure, automatically reconfigure the logical IP-MPLS network by means of the Intelligent Optical Networking (ION) functionality
  - Strategies:
    • ION global reconfigurations: in all conditions, optimise the logical IP-MPLS network
    • ION local reconfigurations: in case of a failure, only up- or downgrade the capacity of existing logical IP-MPLS links
Cost model and escalation strategy
Escalation strategy: first try to recover as much traffic as possible by dedicated optical channel protection; secondly, recover the remaining traffic by means of IP-MPLS rerouting.

Cost model:
• Tributary interface cost: OXC ports, router line-cards, etc. (in $/)
• Optical node cost: OXC ports, WDM muxes, etc. (in $/)
• Optical line cost: fibers, optical amplifiers (in $/km/)
Network scenario
[Figure, left: physical topology of the Italian reference network, with nodes Trento, Milano, Torino, Venezia, Genova, Bologna, Firenze, Pescara, Roma, Bari, Napoli, Cagliari, ReggioC and Palermo. Right: optimal logical topology in fault-free conditions (nominal topology) over the same nodes.]
Cost comparison of the multilayer survivability strategies

[Chart not included in the transcript.]

Comparison number of reconfigurations: ION global versus local reconfigurations

[Charts not included in the transcript.]
Conclusions
• Cost comparison
  - Static multilayer survivability strategies: double protection > IP spare unprotected > common pool
  - Dynamic multilayer survivability strategies:
    • ION local reconfigurations < common pool
    • ION global: (almost) the most expensive strategy
• ION local reconfigurations strategy very promising
  - Cheapest strategy
  - Easy to implement and to operate
  - Requires fewer reconfigurations in a failure situation → improved QoS during reconfigurations
Application and Network Layer Fault Tolerance
Grid Computing
Resilience Requirements in Grid Computing
• Resilience in grid
  - Ability of the grid environment to recover from both software and hardware failures
  - Types of resilience:
    • Application-level resilience: for both software and hardware failures
    • Middleware resilience (collective, resource, and connectivity): for both software and hardware failures
    • Fabric resilience: mostly for hardware failures
• Requirements
  - Successfully complete the computational task fulfilling requirements such as:
    • Bandwidth
    • Latency
Approaches for Grid Computing Resilience
Layered grid architecture and the failover schemes mapped onto it:
• Application (end-user applications) — application-specific fault-tolerant schemes based on middleware fault detection; tasks and data replicas; Condor-G (checkpointing, migration, DAGMan); GT2/GT3/GT4 (GridFTP, Reliable File Transfer (RFT), Replica Location Service (RLS))
• Middleware
  - Collective (collective resource control)
  - Resource (resource management)
  - Connectivity (inter-process communication, protection)
• Fabric (basic hardware and software) — delegated to HW, SW, and farm failover schemes

Mapping onto the TCP/IP stack:
• Transport — Fault Tolerant TCP (FT-TCP)
• Internet/Network and Link — delegated to WAN, MAN, and LAN resilience schemes
Application Specific Resilience
• Problem-specific fault tolerance [I00AProb]
• Problem: Parallel Branch and Bound (B&B)
• Fault-tolerance algorithm:
  - Tree-based encoding of the B&B sub-problems
  - Each sub-problem is represented by a sequence of pairs <xi, value>, where xi is a condition variable and value is 0 or 1
  - A sub-problem is solved after the branching operation has been performed on it
  - A sub-problem is completed if it is solved and either it is a leaf or both its children are completed
  - Every process maintains a list of new locally completed problems and a table of the completed problems it knows about
  - The list is sent to m other nodes after a certain threshold on the number of completed problems in the list or on the update time
  - Fault recovery: when a member runs out of work and cannot get "new" work, it chooses an uncompleted problem (by completing the code of a solved problem whose sibling is not solved)
  - Termination is detected when successive code compressions of local lists and tables lead to the code of the root problem
• Pros: simpler and more efficient algorithms
• Cons: less general algorithm; redundant work
Resilient Parallel B&B
[Figure: binary B&B tree on condition variables X1–X7. Example codes: (<X1,0>), (<X1,1>), (<X1,0>, <X2,0>), (<X1,0>, <X2,1>), (<X1,0>, <X2,1>, <X5,0>), (<X1,0>, <X2,1>, <X5,1>). Legend: completed; solved but uncompleted; unsolved.]
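The tree-based encoding and termination test above can be sketched in a few lines. A minimal sketch, assuming sub-problem codes are tuples of (variable, value) pairs; the function names are illustrative, not from [I00AProb].

```python
def children(code, var):
    """Branching a sub-problem on condition variable `var` yields the two
    child codes, extending the sequence with value 0 or value 1."""
    return code + ((var, 0),), code + ((var, 1),)

def compress(completed):
    """Repeatedly replace two completed sibling codes by their parent's
    code; termination is detected when the root code () is reached."""
    done = set(completed)
    changed = True
    while changed:
        changed = False
        for code in list(done):
            if not code or code not in done:
                continue
            var, val = code[-1]
            sibling = code[:-1] + ((var, 1 - val),)
            if sibling in done:
                done -= {code, sibling}
                done.add(code[:-1])   # parent is now completed
                changed = True
    return done

# Completed leaves (<X1,0>,<X2,0>) and (<X1,0>,<X2,1>) compress to (<X1,0>);
# together with the completed subtree (<X1,1>) this compresses to the root.
left0, left1 = children((("X1", 0),), "X2")
completed = {left0, left1, (("X1", 1),)}
print(compress(completed))  # {()}
```

Reaching the empty code () means every sub-problem in the tree is completed, which is exactly the termination condition of the slide.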
Middleware Fault Detection and
Application Fault Handling
• Fault detection service as a basic grid service [S98AFaultDet]
• Globus Heartbeat Monitor (GHM)
  - Local monitor
    • Responsible for observing the state of both the computer on which it is located and any monitored processes on that computer
    • Generates periodic "I-am-alive" messages, or heartbeats
    • Heartbeats are transmitted through unreliable protocols
  - Data collector receives heartbeat messages and identifies failed components based on missing heartbeats
• Failure events correspond to a threshold of not-received heartbeats
• Upon failure notification, the application might decide to:
  - Terminate
  - Ignore the failure and continue execution
  - Allocate new resources and restart the failed application component
  - Use replication and reliable group communication primitives to continue execution
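The data-collector side of the heartbeat scheme can be sketched as follows — a minimal sketch with hypothetical names (this is not the Globus API), declaring a component failed after a threshold of missed heartbeat intervals.

```python
import time

class DataCollector:
    """Identify failed components from missing 'I-am-alive' messages."""

    def __init__(self, interval_s=1.0, threshold=3):
        self.interval_s = interval_s   # expected heartbeat period
        self.threshold = threshold     # missed beats before a failure event
        self.last_seen = {}            # component -> time of last heartbeat

    def heartbeat(self, component, now=None):
        """Record a periodic heartbeat (sent over an unreliable protocol,
        so a single lost message must not trigger a failure event)."""
        self.last_seen[component] = time.time() if now is None else now

    def failed_components(self, now=None):
        """Components whose silence exceeds threshold * interval."""
        now = time.time() if now is None else now
        return [c for c, t in self.last_seen.items()
                if now - t > self.threshold * self.interval_s]

dc = DataCollector(interval_s=1.0, threshold=3)
dc.heartbeat("worker-1", now=0.0)
dc.heartbeat("worker-2", now=2.5)
print(dc.failed_components(now=4.0))  # ['worker-1']
```

Upon such a failure event the application chooses one of the reactions listed above (terminate, ignore, restart on new resources, or fall back on replicas).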
Checkpointing [W00FTWAC]
• Each task saves its portion of the data domain on disk at a set of predetermined iterations
  - Two checkpointing configurations:
    • Single NFS-mounted disk from a file server
    • Parallel array of locally attached disks
  - Network checkpoint model (checkpoints stored on a common server)
• The failed processor (local checkpoint) or another processor (network checkpoint) continues the elaboration from the last checkpoint
• The dominant overhead for checkpointing is the cost of stopping the application and writing the checkpoints to disk
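The iteration-based scheme can be sketched as follows — a minimal single-task sketch (file layout and names are illustrative) that saves state at predetermined iterations and resumes from the last checkpoint.

```python
import os
import pickle
import tempfile

CKPT_EVERY = 100   # predetermined checkpoint iterations

def save_checkpoint(path, iteration, state):
    """Write atomically so a crash mid-write cannot corrupt the checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump((iteration, state), f)
    os.replace(tmp, path)

def load_checkpoint(path):
    if not os.path.exists(path):
        return 0, {"sum": 0}           # fresh start
    with open(path, "rb") as f:
        return pickle.load(f)          # resume from last checkpoint

def run(path, total_iters=1000):
    it, state = load_checkpoint(path)
    while it < total_iters:
        state["sum"] += it             # one step of the computation
        it += 1
        if it % CKPT_EVERY == 0:       # dominant overhead: stop and write
            save_checkpoint(path, it, state)
    return state

path = os.path.join(tempfile.mkdtemp(), "task0.ckpt")
print(run(path)["sum"])  # 499500 (= sum of 0..999)
```

After a crash, calling run(path) again on the same or another processor continues from the last saved iteration instead of from zero.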
Migration [M00PM]
• Process migration is the act of transferring a process between two machines during its execution
• It is possible to combine it with checkpointing to resume the process execution in essentially the same state on another machine (e.g., Condor)
• Process migration consists of extracting the state of the process on the source node, transferring it to the destination node where a new instance of the process is created, and updating the connections with other processes on communicating nodes

[Figure: the migrating process (source instance) on the source node is transferred to the destination node (destination instance), while connections to a communicating process on a communicating node are updated.]
Replication [L01DR]
• A set of application replicas runs in multiple sites
• The set of static replicas is selected based on the user requirements
• If sites fail and the number of active replicas falls below a certain threshold, new replicas are started (dynamic replicas)
• The dominant overheads for replicas are:
  - Replica scheduling
  - Replica synchronization
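The dynamic-replica rule can be sketched as follows — a minimal sketch with hypothetical site names: when failures push the number of active replicas below the threshold, dynamic replicas are started on the remaining candidate sites.

```python
def maintain_replicas(active, candidates, threshold):
    """Top up the set of active replica sites to `threshold` members."""
    active = set(active)
    spare = [s for s in candidates if s not in active]
    while len(active) < threshold and spare:
        active.add(spare.pop(0))   # start a dynamic replica
    return active

# Static replicas chosen from user requirements; two sites then fail.
active = {"site-A", "site-B", "site-C"} - {"site-B", "site-C"}
active = maintain_replicas(active, ["site-D", "site-E", "site-F"], threshold=3)
print(sorted(active))  # ['site-A', 'site-D', 'site-E']
```

The replica scheduling and synchronization costs mentioned above are exactly what this top-up step incurs each time it fires.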
Fault-tolerant TCP
• A failed server application can be recovered through checkpointing and restart on the same or on a different processor
• The TCP session needs to be recovered as well
• Different approaches for recovering TCP sessions:
  - Insert a layer of software between the TCP layer and the application layer of both the client and the server
  - Redesign the TCP layer on the server by adding support for checkpointing and restarting to the TCP implementation
  - Redirect all the TCP traffic between client and server to a proxy; if the server crashes, the proxy switches the connection to an alternate server
  - Fault-Tolerant TCP (FT-TCP) utilizes wrapping, in which a layer of software surrounds the TCP layer and intercepts all communication with it [A01WS]
  - Connection migration in [S01FG]: for any connection in progress, determine when and if to move to another server, select a set of new server candidates, and move the connection, seamlessly resuming the data transfer from the new server
Condor-G
• Condor-G: Condor + Globus [V04Pres]
  - Condor:
    • Job scheduling across multiple resources
    • Fault tolerance based on checkpointing and migration
    • Layered over grid middleware as a "personal batch system" for a grid

[Figure: submission flow. The end user submits a job to the Schedd (1), which authenticates via GSI to the Gatekeeper (2); the Gridmgr submits the job via GRAM (3); the job details are stored (4); data is transferred from the home disk via GASS (5); the Jobmgr submits the job to the remote batch queue (6); the user's job runs using the temp disk (7).]
Condor-G Architecture
1. The user submits a job to the scheduler, which creates a gridmanager for the job
2. The gridmanager authenticates the user to the gatekeeper, which creates a jobmanager
3. The gridmanager transmits the job details to the jobmanager
4. The jobmanager stores the details
5. The jobmanager transfers the executables and input data from the home site
6. The jobmanager submits the job to the remote batch queue
7. The job executes
Condor-G Fault Tolerance [P96MC],
[Z99PH]
• Submit-side failure
  - All relevant state for each submitted job is stored persistently in the Condor-G job queue
  - If the Condor-G GridManager crashes, upon restart it reads the state information and reconnects to the JobManagers running at the time of the crash
Condor-G Fault Tolerance – Lost contact
with Remote Jobmanager
Can we contact the gatekeeper?
• No → retry until we can talk to the gatekeeper again
• Yes → the jobmanager crashed. Can we reconnect to the jobmanager?
  - Yes → the network was down
  - No → the machine crashed or the job completed. Restart the jobmanager. Has the job completed?
    • Yes → update the queue
    • No → is the job still running?

(from [C04GGFSS])
Advantages and Drawbacks of Current
Failover Schemes
• Advantages
  - Network layer independent
  - Flexible (e.g., degree of failover dependent on application)
• Drawbacks
  - Application dependent
  - User driven
  - Need for TCP synchronization
  - Slow reaction to failures
  - Not scalable (e.g., CPU and storage)
  - No communication QoS guarantees
QoS-Aware Fault Tolerance
• Def. QoS-aware fault tolerance: capability of overcoming both hardware and software failures while maintaining communication QoS guarantees
• Def. QoS-capable layer: grid layer capable of guaranteeing communication QoS
• Example: upon failure of the main data center, bandwidth-guaranteed connectivity must be provided to the data replica center
QoS Awareness in Grid Fault Tolerance
[Figure: grid layers (Application; Middleware: Collective, Resource, Connectivity; Fabric) mapped onto transport technologies. QoS-unaware: TCP/IP. QoS-capable: MPLS, SONET/SDH, Optical Transport Network (OTN).]
QoS Capable TCP/IP
• Dynamic rerouting with DiffServ
• Advantages: pervasiveness
• Drawbacks: no traffic engineering
• Example: after rerouting, packets with the same Type of Service (ToS) compete for the same insufficient resources along the shortest path
QoS-Aware Fault Tolerance below Layer 3
• QoS-aware fault tolerance through connection-oriented communication in a QoS-capable layer:
  - Multi-Protocol Label Switching (MPLS): Label Switched Paths (LSPs)
  - SONET/SDH: SONET/SDH path
  - OTN: lightpath (i.e., wavelength channel)
Implementing Multi-layer QoS-Aware
Fault Tolerance
• Assumption
  - Grid computing services requiring communication QoS guarantees (e.g., collaborative visualization)
  - QoS parameter: minimum bandwidth
• Objective
  - Maximize recovered connections and minimize required network resources upon network link failure
• Proposed approach
  - Integrating QoS-unaware layer and QoS-capable layer fault tolerance → QoS-aware integrated fault tolerance
    • QoS-capable layer fault tolerance: (G)MPLS path restoration
    • Software layer fault tolerance: service replication (server migration)
Network Layer Fault Tolerance (Path Restoration)
Issue: Blocking
[Figure: after failure of the primary LSP from the client to the primary video server, the backup LSP is blocked by insufficient capacity; the backup video server is unused.]
Software Layer Fault Tolerance Issue:
Wasted Network Capacity
[Figure: a backup LSP to the backup video server is provisioned alongside the primary LSP to the primary video server, implying 1:1 path protection and hence wasted network capacity.]
Non Integrated Fault Tolerance Issue:
Unnecessary Server Migration when Backup LSP is available
[Figure: in the absence of inter-layer coordination, the server is migrated to the backup video server (over an LSP to it) even though a backup LSP toward the primary video server is available.]
Integrated Fault Tolerance Advantages:
Path Restoration + Service Replication
[Figure: with integrated fault tolerance, the backup LSP toward the primary video server is used when available, and an LSP to the backup video server (service replication) is set up only when path restoration fails.]
Issue 1: what to utilize
• Integrating network layer connection rerouting with task/data replication/migration
• Objective: maximizing the number of connections restored after failure

[Figure: example graphs on nodes A, B, D, G, H with link weights, showing the original task/data location versus task/data replica locations.]
Investigated Integrated Restoration
Scheme
• One emitter/collector and one VP/DVSM pair
• s = emitter/collector
• d = pre-failure VP/DVSM
• li, lj = VP/DVSM post-failure replica locations
• MILP model
Simulation Scenario
• Physical network
  - 100 randomly generated connection matrices
  - Bidirectional connection generation
  - Bidirectional connection rerouting
• Metrics
  - Expected network blocking probability: average ratio between the number of unrecovered connections and failed connections
  - Expected replica utilization ratio: average ratio between the number of locations utilized for service replication and the number of locations allowed for service replication
  - Expected path restoration utilization: average number of times the original server location node is utilized as replica location, normalized to the number of replica locations utilized (per location; per failed connection between (s,d) pair)
  - Expected connection path length: average length of connection paths to reach the replica location
• Evaluation scenarios
  - Limited number of replicas
  - Limited distance (hops) of allowed replica locations
  - Minimum required replication flow
Integrated Restoration Performance
[V04GN]
• Integrated restoration outperforms OSPF dynamic rerouting resilience with QoS constraints
• Integrated restoration performs as well as service migration resilience but, by utilizing path restoration, decreases the need for service synchronization and restart
Issue 2: where to place replicas
• Objective
  - Utilize service replication for guaranteeing a high percentage of recovered inter-service connections
  - Limit the number of allowed replica locations to optimize utilized computational resources
  - Utilize simple and efficient heuristics for replica placement
• Proposed approach
  - Place replicas in nodes adjacent to the original service location
  - Form clusters of nodes with the same service replicas (i.e., service islands)
• Expected results
  - Improved service inter-connectivity recovery
  - High replica utilization ratio
HOP Heuristic
• Place a replica in all the nodes reachable in H hops from the original service location

[Figure: client and server with H = 1; the server node and a neighbouring node both have nodal degree 4; service replicas are placed on the neighbours.]
Super-Node Degree Heuristic
• Place a replica in all the nodes reachable in H hops from the original service location that increment the super-node nodal degree of the previous step

[Figure: client and server with H = 1; a candidate that leaves the nodal degree at 4 is rejected (NO), while candidates that raise the super-node degree to 5 receive service replicas.]
HOP and Super-node Degree
• H: number of hops from the destination within which candidate service island nodes are considered
• Replica-update capacity fraction: 0 → no capacity needed for replica update (static replica placement); 1 → full (same as client-server) capacity for replica update (dynamic replica placement)
• Unbounded H with fraction 0 gives the expected network blocking probability lower bound
• For fractions 0 and 1 alike:
  - Super-node degree and HOP heuristics show similar expected restoration blocking probability
  - Super-node degree shows a better expected replica utilization ratio
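Both heuristics can be sketched on an adjacency list — a minimal sketch on an illustrative five-node graph (function names are not from the slides): HOP takes every node within H hops of the server, while the super-node degree variant keeps only candidates that raise the nodal degree of the growing server-plus-replicas cluster.

```python
from collections import deque

def hop_placement(adj, server, H):
    """All nodes reachable within H hops of the server (BFS)."""
    dist = {server: 0}
    q = deque([server])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist and dist[u] + 1 <= H:
                dist[v] = dist[u] + 1
                q.append(v)
    return {v for v, d in dist.items() if d > 0}

def supernode_degree(adj, cluster):
    """Number of links leaving the server + replica cluster."""
    return sum(1 for u in cluster for v in adj[u] if v not in cluster)

def supernode_placement(adj, server, H):
    """Keep only H-hop candidates that increment the super-node degree."""
    cluster = {server}
    for v in sorted(hop_placement(adj, server, H)):
        trial = cluster | {v}
        if supernode_degree(adj, trial) > supernode_degree(adj, cluster):
            cluster = trial
    return cluster - {server}

adj = {"s": ["a", "b"], "a": ["s", "b", "c"],
       "b": ["s", "a"], "c": ["a", "d"], "d": ["c"]}
print(sorted(hop_placement(adj, "s", 1)))        # ['a', 'b']
print(sorted(supernode_placement(adj, "s", 1)))  # ['a']
```

In the example, node b is rejected by the degree test (adding it lowers the cluster's outgoing degree), mirroring the "NO" case on the super-node degree slide.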
Issue 3: how to implement it
Resilient schemes: high-quality multimedia applications such as Video on Demand (VoD) must provide seamless service with QoS guarantees to end users.

Network layer (failures: link disruption, connection preemption):
• MPLS protection
• MPLS restoration

Application layer (failures: server down, software crash):
• Server migration

Issues:
• Sub-optimal global resource utilization → 1:1 LSPs for both primaries and backups (1:1 protection)
• No bandwidth-guaranteed streaming after migration → network layer restoration unsuccessful + migration without an LSP
• Redundant recovery actions (e.g., 2 LSPs activated) → migration + slow restoration

Integrating the network and application recovery mechanisms is demonstrated on a grid application case study.
Integrated …
• Coordinates recovery schemes at different layers
  - LSP path restoration is attempted first
  - If unsuccessful, application layer server migration is triggered by the RB
• Coordination is based on a service broker
  - Avoids double actions
  - Protects against slow network layer recovery
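The broker's coordination rule can be sketched as follows — a minimal sketch in which the two callables stand in for the RB's real actions (RSVP-TE backup-LSP setup and server migration).

```python
def recover(setup_backup_lsp, migrate_server):
    """Attempt network-layer restoration first; escalate to application
    layer server migration only if the LSP setup fails. Exactly one
    recovery action is taken, avoiding the double actions of the
    uncoordinated case."""
    if setup_backup_lsp():
        return "restored on backup LSP"
    return migrate_server()

# Restoration feasible: no unnecessary server migration.
print(recover(lambda: True, lambda: "migrated to backup server"))
# Restoration blocked (insufficient capacity): the RB migrates the server.
print(recover(lambda: False, lambda: "migrated to backup server"))
```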
A new main building block: the Resource
Broker
Resource Broker (RB):
• Path computation algorithm: CSPF
• Signaling: RSVP-TE
• Routing: OSPF-TE
• Single or multiple servers implementation
• Responsible for provisioning, monitoring and restoration (recovery) in a COORDINATED way

[Figure: the RB spans service management and LSP management in the control plane; in the data plane a guaranteed-bandwidth LSP (streaming) connects server to client across the IP-MPLS network.]
Resource Broker–based Provisioning

• The client sends a service request to the RB
• The RB keeps a list of available servers; the best server is chosen based on content and server load
• The RB requests LSP setup on the edge router; if the LSP status response is negative, another server is chosen
• The RB delivers the server URL to the client, which issues an HTTP-GET message; bandwidth-guaranteed video streaming flows over a TE-LSP between the client-side and server-side edge LSRs
Resource Broker Based Recovery
Recovery is divided in three phases:
1. Failure detection
2. Failure notification
3. Failure recovery

The RB is involved in all the phases. The actions are different depending on the kind of failure:
• Network failure (node down, fiber cut)
• LSP preemption
• Server failure
Integrated Failure Detection and Notification
• LSR-based
  - Physical failures or LSP preemption in the network
  - MPLS dynamic path restoration is triggered transparently to the RB
  - Very fast and independent detection (no notification to the RB)
• Client-based
  - Based on TCP buffer monitoring and a socket reading timeout mechanism
  - Timeout set to a few seconds, avoiding false alarms
  - The client interacts with the RB to activate recovery
• RB-based
  - Periodic LSP monitoring (LSP status request)
  - In case of a PATH-TEAR message, the RB triggers server migration recovery

[Figure: a link failure, or preemption by higher-priority traffic at an LSR, empties the client's TCP buffer; the client notifies the RB, which queries the LSP status ("LSP status?" answered "UP"/"DOWN") and triggers server migration.]
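The client-based detection can be sketched with a socket reading timeout — a minimal sketch in which a local socketpair stands in for the streaming connection, and the timeout is shortened from the few seconds used on the testbed.

```python
import socket

client, server = socket.socketpair()
client.settimeout(0.2)        # testbed uses a few seconds to avoid false alarms

server.sendall(b"frame-1")    # normal streaming: data arrives in the buffer
print(client.recv(16))        # b'frame-1'

# The server fails (or the LSP is torn down): no more data is received,
# so the read times out instead of blocking forever.
try:
    client.recv(16)
    detected = False
except socket.timeout:
    detected = True           # here the client would notify the RB
print(detected)               # True
```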
Failure recovery scheme
1. MPLS restoration, if feasible.
2. RB-driven recovery:
   - Available backup server choice
   - Old LSP tear down (LSP TEAR DOWN message toward the primary server)
   - New LSP setup attempt (LSP SETUP message; NEW LSP UP)
   - Backup server URL delivered to the client
   - The client issues an HTTP GET to the backup server and streaming resumes over the new LSP
Experimental setup: MetroCore/VESPER MAN in Pisa
• Agilent Router Tester®
• Resource Broker: inside the client; polling interval: 1 s
• Client: modified Linux VLC; socket reading timeout: 5 s
• Video servers: Linux web servers
• Playout buffer length: 15 s
• Juniper routers; QoS activated: EF for MPLS, BE for IP traffic
• Video: MPEG-2 file, 10 Mbit/s average
• Single OSPF-TE domain; RSVP-TE, MPLS on
Test 1: MPLS dynamic restoration
A high-priority 95 Mbit/s disturb LSP, injected by the traffic generator, preempts the 10 Mbit/s stream LSP (the link carries 100 Mbit/s only). RSVP preemption triggers CSPF computation and dynamic restoration; no failure is perceived at the client.

[Figure: testbed topology — Client + RB, Edge1-Edge3, Core1-Core2, primary and backup video servers, traffic generator and traffic analyzer]
Test 1: Results
[Figure: traffic profile detected at the client side]
• A streaming interruption occurs at the client
• The interruption duration is uniformly distributed in (100 ms, 1000 ms)
• The interruption is transparent to the application
• No streaming frame loss is experienced
Test 2: Server Migration
Two high-priority disturb LSPs (Disturb2, 95 Mbit/s and Disturb3, 85 Mbit/s) preempt the 10 Mbit/s stream LSP. No available path remains, so RSVP restoration after preemption is unfeasible. The RB detects "LSP DOWN" through polling and the tear-down message, and triggers server migration: a new LSP setup message is attempted and, after an HTTP GET to the backup video server, the streaming resumes as best-effort IP traffic during the transient.

[Figure: testbed topology — Client + RB, Edge1-Edge3, Core1-Core2, primary and backup video servers, traffic generator and traffic analyzer]
Test 2: Results
[Figure: traffic profile detected at the client side]

Events timeline:

No | T (s)      | Source          | Dest.                      | Prot.    | Comment
1  | 0          | RB              | Ingress Primary            | TCP      | LSP Stream status request
2  | 0          | Ingress Primary | RB                         | TCP      | LSP Stream status = Down
3  | 0          | RB              | Ingress Primary            | TCP      | Req LSP Stream tear-down
4  | 0          | RB              | Ingress Backup             | TCP      | Req LSP Stream setup
5  | 0          | RB              | Ingress Backup             | TCP      | LSP Stream status request
6  | (2.2, 2.6) | Ingress Backup  | RB                         | TCP      | LSP Stream Down
7  | (2.0, 2.5) | Ingress Primary | Egress (Edge 3)            | RSVP     | TEAR — resources released
8  | (3.0, 3.5) | Ingress Backup  | Egress                     | RSVP     | PATH — backup LSP setup
9  | (3.0, 3.5) | Egress          | Ingress Backup (via Core2) | RSVP     | RESV — backup LSP setup
10 | (3.4, 4.6) | RB              | Ingress Backup             | TCP      | LSP Stream status request
11 | (3.4, 4.6) | Ingress Backup  | RB                         | TCP      | LSP Stream Up
12 | (3.4, 4.6) | Client          | Backup Server              | HTTP GET | Content request
13 | (3.4, 4.6) | Backup Server   | Client                     | HTTP     | Streaming

• Migration procedure time: 3.4-4.6 seconds
• Polling interval: up to 1 second
• Recovery time: up to 5.6 seconds
• Best-effort traffic present in the transient
• No recovery inconsistencies, no frame loss
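The reported recovery bound is consistent with the worst-case migration time plus one polling interval, as a quick check shows:

```python
# Upper bound on recovery time: worst-case migration procedure time
# plus one RB polling interval (values from the Test 2 results).
migration_max_s = 4.6     # migration procedure time: 3.4-4.6 s
polling_interval_s = 1.0  # polling interval: up to 1 s
recovery_max_s = migration_max_s + polling_interval_s
print(recovery_max_s)     # -> 5.6, matching "recovery time: up to 5.6 seconds"
```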
Conclusions
• Overview of Grid network infrastructure monitoring initiatives
• Monitoring the Grid network infrastructure helps improve task elaboration performance by assigning tasks in a network-aware manner
• Overview of Grid computing fault tolerance
• Integrated QoS-aware fault tolerance allows recovering QoS-guaranteed connectivity without user perception of the failure
References (1)
• [Gridbook] The Grid: Blueprint for a New
Computing Infrastructure, I. Foster and C.
Kesselman Eds., Morgan Kaufmann Publishers,
1999
• [Grid2book] The Grid 2 : Blueprint for a New
Computing Infrastructure, I. Foster and C.
Kesselman Eds., Morgan Kaufmann Publishers,
2004
References (2)
• [F04GSC] D. Fergusson, "Introduction to Web Services", 2nd Grid Summer School, http://www.dma.unina.it/~murli/GridSummerSchool2004/session-12.htm, Vico Equense, Naples, 2004
• [C04netserv0] "draft-ggf-ghpn-netservices-1.0", http://forge.gridforum.org/projects/ghpn-rg, February 2004
• [K03ROS] P. Kunszt et al., "Advanced Replica Management with Reptor", in Proceedings of the 5th International Conference on Parallel Processing and Applied Mathematics, Czestochowa, Poland, September 7-10, 2003
• [V04netissues-3] V. Sander (Ed.) et al., "draft-ggf-ghpn-netissues-3", http://forge.gridforum.org/projects/ghpn-rg, May 2004
• [S00FTE] S. Song, J. Huang, P. Kappler, R. Freimark, and T. Kozlik, "Fault-tolerant Ethernet Middleware for IP-Based Process Control Networks", in Proceedings of the 25th Annual IEEE Conference on Local Computer Networks, Nov. 2000
• [V99Eth] S. Varadarajan and T. Chiueh, "Automatic Fault Detection and Recovery in Real Time Switched Ethernet Networks", Infocom '99
• [I00AProb] A. Iamnitchi and I. Foster, "A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems", in Proceedings of ICPP 2000
• [S98AFaultDet] P. Stelling et al., "A Fault Detection Service for Wide Area Distributed Computations", in Proceedings of IEEE HPDC, 1998
• [C04GGFSS] M. Livny, "Condor-G, Stork and DAGMan: An Introduction", 2nd Grid Computing Summer School, Vico Equense, Naples, 2004
References (3)
• [W00FTWAC] J. B. Weissman, "Fault Tolerant Wide-Area Parallel Computing", in Proceedings of the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, part of IPDPS, 2000
• [P96MC] J. Pruyne and M. Livny, "Managing Checkpoints for Parallel Programs", in Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing, IPPS '96
• [Z99PH] V. C. Zandy, B. P. Miller, and M. Livny, "Process Hijacking", in Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing, 1999
• [M00PM] D. S. Milojicic et al., "Process Migration", ACM Computing Surveys, Sep. 2000
• [L01DR] B. Lee and J. B. Weissman, "An Adaptive Service Grid Architecture Using Dynamic Replication", IEEE 2nd International Workshop on Grid Computing, November 2001
• [S01FG] A. C. Snoeren, D. G. Andersen, and H. Balakrishnan, "Fine-Grained Failover Using Connection Migration", in Proceedings of the Third Annual Symposium on Internet Technologies and Systems (USITS), March 2001
• [A01WS] L. Alvisi et al., "Wrapping Server-Side TCP to Mask Connection Failures", IEEE Infocom 2001
• [GrADS] www.hipersoft.rice.edu/grads
References (4)
• [V04GN] L. Valcarenghi and P. Castoldi, "On the Advantages of Integrating Service Migration and GMPLS Path Restoration for Recovering Grid Service Connectivity", submitted to GridNets 2004
• [G04ICC] D. Adami et al., "An Experimental Study on the EF-PHB Service in a DiffServ High Speed Network", in Proceedings of ICC 2004
• [C04G] F. Cugini, L. Valcarenghi, P. Castoldi, and G. Ippoliti, "Experimental Demonstration of Low-Cost 1:1 Optical Span Protection", in Proceedings of Globecom 2003, Workshop #4 (Protection and Restoration: from SONET/SDH to Next-Generation Networks), San Francisco, USA, 1 December 2003
• [A04N] Andriolli, T. Jakab, L. Valcarenghi, and P. Castoldi, "Separate Wavelength Pools for Multiple-Class Optical Channel Provisioning", to appear in Proceedings of Networks 2004, Vienna, Austria, 13-16 June 2004
• [G04O] A. Giorgetti, L. Valcarenghi, and P. Castoldi, "Combining Multi-Granularity Dynamic Provisioning and Restoration Schemes", in Proceedings of ONDM 2004, Gent, Belgium, 1-4 February 2004
References (5)
• [NGNgroup] http://ees2cy.engr.ccny.cuny.edu/wwwa/web/ngng
• [VRONEXT05] L. Valcarenghi and P. Castoldi, "QoS-Aware Connection Resilience for Network-Aware Grid Computing Fault Tolerance", RONEXT 2005, Barcelona, July 3-7, 2005
• [VEuroNGI2005] L. Valcarenghi, L. Rossi, F. Paolucci, P. Castoldi, and F. Cugini, "Multi-layer bandwidth recovery for multimedia communications: an experimental evaluation", Next Generation Internet Networks, 18-20 April 2005, pp. 310-317
• [CommMag06] L. Valcarenghi, L. Foschini, F. Paolucci, F. Cugini, and P. Castoldi, "Topology Discovery Services for Monitoring the Global Grid", IEEE Communications Magazine, March 2006
• [draft-ggf-ghpn-netserv-2] draft-ggf-ghpn-netserv, http://forge.gridforum.org/projects/ghpn-rg