slides - DataTAG
Service Differentiation and Grids
Pascale Vicat-Blanc Primet
Benjamin Gaidioz, Pierre Billiau, François Echantillac,
Mathieu Goutelle, Fabien Chanussot
INRIA - Reso
LIP Laboratory
Ecole Normale Supérieure de Lyon
France
[email protected]
Outline
Requirements for E2E Service Differentiation in Grids
The EDS approach (DataTAG project)
The QoSinus approach (e-Toile/VTHD project)
Conclusion
P. Vicat-Blanc Primet
Typology of Grid flows
Application flows:
Input and output data
Inter-process communication messages (MPI, DSM, synchronization…)
Code coupling
Interactions
Visualizations
Voice/video in collaborative environments
Control flows:
Grid environment deployment
Application deployment
Control and management of the Grid (middleware)
Monitoring, scheduling, loading, reporting, alarms…
All these flows share the same "network resource" and the same bottlenecks
Example: e-Toile infrastructure
[Figure: map of the e-Toile infrastructure, showing an experimental testbed and a production testbed interconnected by the VTHD backbone (2.5 to 10 Gb/s), with 1 Gb/s to 2 Gb/s access links and labels for Europe and the US.]
Sites shown: ID-IMAG (Grenoble), IRISA (Rennes), CEA (Saclay), PRiSM (Versailles), SUN (Grenoble), CERN, ENS Lyon, EDF (Clamart), IBCP.
Resources shown: a cluster of 250 PCs; groups of 6, 12, and 16 dual-processor PCs; 1.9 GHz PC clusters; an 8-processor server; 8 x 2 PCs linked by SCI; 16 PowerPCs linked by Myrinet; 16 Sun Cobalt; a Myrinet cluster of 10 PCs; a cluster of 8 PCs; dual-processor servers (labels: MP760, 1.2 GHz, 3 x); an SMP machine; an active router; an IBP data depot service.
Grid flow characteristics
Mice, elephants, hares and tortoises…
Throughput:
Rates span more than 9 orders of magnitude
From a few bytes for interactive or control traffic
To petabytes for bulk data transfer
Delay:
Very heterogeneous needs
Some applications are very sensitive to latency (MPI, visualization)
Bulk data transfer delays have to be controlled
Reliability:
Generally reliable (=> TCP), but some applications are loss tolerant (Astro)
Communication models:
Point to point, point to multipoint, multipoint to point, multipoint to multipoint
Collective operations, synchronization barriers…
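To give a feel for the range of flow sizes on this slide, the sketch below computes a lower bound on transfer time as size divided by rate. The three flow sizes and the 1 Gb/s rate are illustrative assumptions, and the calculation deliberately ignores protocol overhead, latency, and congestion.

```python
def ideal_transfer_time(size_bytes: float, rate_bits_per_s: float) -> float:
    """Lower bound on transfer time in seconds: size / rate.

    Ignores protocol overhead, round-trip latency, and congestion
    (a deliberate simplification for illustration).
    """
    if rate_bits_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / rate_bits_per_s


GBPS = 1e9  # 1 Gb/s in bits per second (decimal units assumed)

# Hypothetical flow sizes, chosen only to span the range named on the slide.
flows = {
    "control message (100 B)": 100,
    "large data item (2 GB)": 2e9,
    "bulk data set (1 PB)": 1e15,
}

for name, size in flows.items():
    seconds = ideal_transfer_time(size, GBPS)
    print(f"{name}: {seconds:.3g} s at 1 Gb/s")
```

Even under these ideal assumptions, the petabyte transfer occupies the link for about three months, which is why bulk transfer delays need explicit control.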
Medical image processing: pipeline
Input: tagged MRI sequences, from 20 MB to 2 GB per frame
1. Automatic extraction of tags and myocardium
2. Motion estimation
3. Quantification
[Figure: the three processing stages laid out along a time axis.]
How to control performance?
Packet level (network QoS)
~1 to 100 ms
Mechanisms: classifiers, markers, and conditioners (routers)
Models: IntServ, DiffServ, core-stateless, proportional, EDS…
Round-trip-time level (E2E QoS)
~1 to 100 ms
Congestion control and flow control (TCP, TFRC)
Session level
Seconds, minutes, or hours
Admission control, resource reservation (RSVP), routing
Load sharing, MPLS-TE, BoD
Long term
Days, months…
Provisioning, planning, loD
Explored approaches (INRIA RESO)
Grids really need end-to-end QoS (from bulk transfer to MPI and visualization)
Packet differentiation is already available in IP equipment
PQ, WFQ, CBQ, WRR, RED, WRED…
Many issues with IntServ and DiffServ
Service differentiation at transport level
Two approaches have been explored at INRIA:
End to end: DataTAG (assumption: the bottleneck is in the access network and LAN)
Relative differentiated forwarding of IP packets
Each connection manages its individual QoS
The end-host protocol has to be adapted (slow start or AIMD)
Edge to edge: e-Toile (assumption: the bottleneck is in the WAN)
An independent API, defined and integrated in the middleware, to specify session QoS goals
QoSINUS as a Grid network service
Interacts with the Grid measurement infrastructure
EDS approach
Equivalent Differentiated Service Model
Goal: share the network resources (bottleneck) and control E2E performance according to application-specific requirements
=> delay sensitive / loss sensitive / rate sensitive…
Constraints: a new PHB at IP level
Differentiated forwarding services without pricing
No admission control required
PHB definition restricted to local parameters (no layer violation)
The transport layer has to integrate adaptation mechanisms to contribute to end-to-end performance control.
Equivalent Differentiated Services
Proportionality
Asymmetry (cf ABE)
Equivalent Differentiated Services
The EDS model defines an arbitrary number N of classes.
Differentiation applies to delay and loss rate for each class.
A class i gets a delay coefficient d_i and a loss rate coefficient l_i.
These coefficients are constants.
Let i and j be two classes: the router schedules and drops their packets so that there is a ratio d_i/d_j between their local queuing delays and l_i/l_j between their local loss rates.
To avoid having privileged classes, the coefficients are set so that:
if d_i < d_j then l_i > l_j
or
if d_i > d_j then l_i < l_j
for all i in [1,N] and j in [1,N], i ≠ j
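The coefficient rule can be checked mechanically. The sketch below is an illustration, not the authors' implementation: `is_equivalent` tests that no class is privileged, and `proportional_targets` derives the per-class values implied by the ratio rule from one measured value. Both function names are hypothetical, and rejecting ties between coefficients is an assumption, since the slide only covers strict inequalities.

```python
from itertools import combinations


def is_equivalent(delay_coefs, loss_coefs):
    """Return True if no class is privileged.

    For every pair of distinct classes, a smaller delay coefficient must
    come with a larger loss coefficient. Ties are rejected (assumption).
    """
    if len(delay_coefs) != len(loss_coefs):
        raise ValueError("one delay and one loss coefficient per class")
    for i, j in combinations(range(len(delay_coefs)), 2):
        d_order = delay_coefs[i] - delay_coefs[j]
        l_order = loss_coefs[i] - loss_coefs[j]
        # The two orders must be strictly opposite in sign.
        if d_order * l_order >= 0:
            return False
    return True


def proportional_targets(coefs, reference_value, reference_class=0):
    """Values each class should see so that value_i / value_j == c_i / c_j,
    given the value observed for one reference class."""
    ref = coefs[reference_class]
    return [reference_value * c / ref for c in coefs]


# Hypothetical three-class configuration.
delay = [1, 2, 4]
loss = [4, 2, 1]
print(is_equivalent(delay, loss))         # True
print(is_equivalent(delay, [1, 2, 4]))    # False: class 0 wins on both axes
print(proportional_targets(delay, 5.0))   # [5.0, 10.0, 20.0]
```

With these coefficients, a class that queues for 5 ms corresponds to 10 ms and 20 ms in the other two classes, while the loss rates are ordered the opposite way.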
Adaptive Packet Marking: simple algorithm
[Figure: measured loss and delay plotted over time t against the loss constraint and the delay constraint, together with the class selected at each instant.]
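The slide does not spell out the marking rule, so the following is only one plausible reading, written as a minimal sketch: the sender compares measured delay and loss with its two constraints and moves one class at a time. The function name `select_class`, the class ordering, and the one-step policy are all assumptions.

```python
def select_class(current, num_classes, delay, loss, max_delay, max_loss):
    """Pick the next EDS class index for a flow.

    Assumption: classes are ordered so that index 0 has the lowest delay
    coefficient (and the highest loss coefficient), and the last index has
    the highest delay coefficient (and the lowest loss coefficient).
    """
    delay_violated = delay > max_delay
    loss_violated = loss > max_loss
    if delay_violated and not loss_violated:
        return max(current - 1, 0)                 # accept more loss for less delay
    if loss_violated and not delay_violated:
        return min(current + 1, num_classes - 1)   # accept more delay for less loss
    # Both constraints met, or both violated: changing class cannot help.
    return current


# Hypothetical measurements: (delay in seconds, loss ratio) per interval.
samples = [(0.20, 0.000), (0.15, 0.001), (0.05, 0.030), (0.05, 0.002)]
cls = 2
for delay, loss in samples:
    cls = select_class(cls, 4, delay, loss, max_delay=0.10, max_loss=0.01)
    print(cls)
```

The design point this illustrates is that each connection manages its own trade-off using only end-to-end measurements, with no admission control in the network.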
AIMD EDS packet marking principle
Validation
EDS layer 3 has been implemented in NS and in the Linux QoS kernel.
EDS layer 4 has been implemented in SCTP, via an adaptation of the AIMD algorithm, in NS and in the Linux kernel, and tested on a local emulated platform (NistNet) and on the DataTAG link.
Results for a mix of traffic (NS simulations)
[Figure: EDS3/4 results for real-time traffic, interactive traffic, and bulk transfer. Annotations on the plots: "Latency constraint respect 2x >", "Transfer delay <60%", "#timeout".]
QoSINUS approach
e-Toile GRID project goals
Develop a Grid testbed:
On the Very High Bandwidth experimental network (VTHD)
"Active Grid Technology" (dynamicity of the grid)
Develop a middleware prototype:
Programmable network and communication libraries
NFSp & GXFER, MPI Madeleine, MOME (DSM)
Active network services (QoS, multicast)
[Figure: middleware stack. Functions: management, monitoring, security, user interface (IHM). Globus 2.2: DUROC, GRAM, MDS, GRIS/GIIS, GSI, RSL. e-Toile: Allocator, Loader, SIC - SPAM, GSI - authoriz., LDT - GUIDE.]
Perform tests with high-end applications:
Computing intensive, data intensive, network intensive
Validation of a "high performance grid" model targeting large-scale numerical simulations
Programmable network
(INRIA RESO/LIP)
Active nodes TAMANOIR and IBP depot (Loci/UTK) deployed at the edge
of VTHD
Gigabit supported with a TAN cluster (~1.3Gbits/s):
TAN cluster = a front-end with back-ends for load balancing
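The front-end/back-end split can be illustrated with a small dispatcher. The slides do not say which balancing policy the TAN front-end uses, so the hash-based policy below is an assumption, and the `FrontEnd` class and the flow identifier format are hypothetical.

```python
import zlib


class FrontEnd:
    """Toy front-end that spreads flows over a fixed set of back-ends."""

    def __init__(self, back_ends):
        if not back_ends:
            raise ValueError("at least one back-end is required")
        self.back_ends = list(back_ends)

    def pick_back_end(self, flow_id: str) -> str:
        # Hashing the flow identifier keeps every packet of a flow on the
        # same back-end, so per-flow state stays in one place.
        index = zlib.crc32(flow_id.encode("utf-8")) % len(self.back_ends)
        return self.back_ends[index]


front = FrontEnd(["back-end-1", "back-end-2", "back-end-3"])
# Hypothetical flow identifiers (source, destination, port).
for flow in ["10.0.0.1>10.0.1.7:5001", "10.0.0.2>10.0.1.7:5001"]:
    print(flow, "->", front.pick_back_end(flow))
```

Aggregate throughput then scales with the number of back-ends as long as the front-end only classifies and forwards.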
[Figure: active flows crossing VTHD between TAN nodes. Labels: TAN CEA, TAN CERN, ENS Lyon, Paris, Geneva, VTHD, front-end, back-ends 1 to N, receivers, active flow.]
QoSINUS: E2E performance controllability
QoSINUS: Quality of Service Negotiate, Invoke, Use
Goals:
E2E QoS: an interface "application" <-> "network"
Application QoS objective: e.g. E2E transfer delay
Use network QoS: DiffServ (packet prioritization)
A programmable service (adaptable API and algorithm)
QoSINUS principles
Specification and negotiation of an SLS for a microflow by the Grid scheduler or the application
Programmable mapping of the QoS objective to a packet DSCP in the first active node (using EF, AF, BE, LBE…)
Dynamic adaptation of packet marking based on measurement results (network and flow)
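The mapping and adaptation principles can be sketched as follows. This is not the QoSINUS API: the objective names, the class ladder, and the `headroom` threshold are hypothetical. The code points are the standard values for EF (46) and AF11 (10); using CS1 (8) for LBE is one common convention and is an assumption here.

```python
# DSCP code points. EF and AF11 are the standard values; LBE as CS1 is a
# common convention, assumed here.
DSCP = {"EF": 46, "AF": 10, "BE": 0, "LBE": 8}

# Hypothetical application objectives mapped to an initial class.
OBJECTIVE_TO_CLASS = {
    "low_latency": "EF",
    "bounded_transfer_delay": "AF",
    "background": "LBE",
}

# Classes ordered from least to most favoured (assumption).
LADDER = ["LBE", "BE", "AF", "EF"]


def initial_class(objective: str) -> str:
    """Map a QoS objective to a class, defaulting to best effort."""
    return OBJECTIVE_TO_CLASS.get(objective, "BE")


def adapt_class(current: str, required_rate: float, measured_rate: float,
                headroom: float = 1.5) -> str:
    """Move one step up the ladder when the flow is behind its objective,
    one step down when it has ample headroom, otherwise stay put."""
    idx = LADDER.index(current)
    if measured_rate < required_rate and idx < len(LADDER) - 1:
        idx += 1
    elif measured_rate > headroom * required_rate and idx > 0:
        idx -= 1
    return LADDER[idx]


def tos_byte(cls: str) -> int:
    """The DSCP occupies the upper six bits of the IP TOS/DS byte."""
    return DSCP[cls] << 2


cls = initial_class("bounded_transfer_delay")
print(cls, tos_byte(cls))
cls = adapt_class(cls, required_rate=100e6, measured_rate=60e6)
print(cls, tos_byte(cls))
```

Keeping the mapping in the first active node means the application states its objective once and never handles DSCP values itself.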
QoS objectives programming
VTHD++ platform
[Figure: map of the VTHD++ network.]
Sites shown: ENST Br (Brest), FTRD Caen, FTRD Lannion, CHU Rouen, INRIA Rennes, FTRD Rennes, Paris (STL, MSO, AUB), INRIA Nancy, INT, INRIA, ENST, HEGP, EDF, FTRD Issy, PRISM, CEA, Opentransit, CERN, INRIA Lyon, INRIA Grenoble, FTRD Grenoble, IMAG Grenoble, Sun, Eurecom (Nice), FTRD Sophia, INRIA Sophia.
Legend, IPv6 connectivity: IPv6 over MPLS, IPv6-in-IPv4, IPv6 over tunnel, IPv6/IPv4, IPv4 only.
Legend, links: 2.5 Gbps, 1 Gbps, STM1/4.
Legend, routers: Juniper M20/M40/T640, Cisco GSR 12000, Avici TSR, VTHD++/e-Toile site routers.
The VTHD backbone
"Really Very High Bandwidth":
Provides 1 Gb/s to 2 Gb/s direct access links
Up to 4 x 2.5 Gb/s in the core
Experimental network
High availability
Advanced services (multicast, DiffServ, IPv6, MPLS, GMPLS/UNI…)
Connected to other research networks in the EU through the DataTAG link (CERN in Geneva)
The VTHD network is deployed by France Telecom
RNRT projects VTHD and VTHD++
DiffServ in VTHD
[Figure: bandwidth shares per class at the VTHD edge.]
EF: 10%
Application (AF-TCP): 30%
Streaming (AF-UDP): 30%
Best-effort: 30%
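As a worked example of what these percentages mean on a given link, the sketch below converts class shares into rates. Pairing "Application" with AF-TCP and "Streaming" with AF-UDP is a reading of the figure labels, and the 1 Gb/s link rate is an illustrative assumption.

```python
# Shares in percent, as read from the slide. The pairing of labels with
# classes is an assumption about the original figure.
SHARES_PERCENT = {"EF": 10, "AF-TCP": 30, "AF-UDP": 30, "best-effort": 30}


def class_rates(link_rate_bps: float, shares=SHARES_PERCENT) -> dict:
    """Split a link rate across classes according to percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100%")
    return {name: link_rate_bps * pct / 100 for name, pct in shares.items()}


# Hypothetical 1 Gb/s access link.
for name, rate in class_rates(1e9).items():
    print(f"{name}: {rate / 1e6:.0f} Mb/s")
```

On such a link the EF class would be limited to 100 Mb/s, which bounds how much latency-critical traffic a site can inject.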
Experimental results in e-Toile/ VTHD
Conclusion
The DiffServ philosophy provides the means to extend the IP forwarding model with scalable and easy-to-deploy service differentiation mechanisms.
It is difficult to avoid if we want to control performance in Grids!
Standard PHBs (Premium, LBE) are deployed in EU NRNs
EDS or propDS provide simple and autonomous solutions to add differentiated services to an IP network.
An incremental solution (for access links and LANs)
Adaptive end-to-end transport protocols (packet marking in AIMD…)
QoSINUS exploits and controls the DiffServ ingress point transparently.
Provides a simple and extensible API to applications (XML)
Provides a multi-domain and transparent solution
Future Work: Grid5000
Measure the gain obtained with challenging grid applications and grid infrastructures.
Interaction with novel transport protocols for bulk transfers
Explore the multi-domain, multi-service problem in depth
Explore the scalability of the EDS and QoSINUS approaches.
GRID5000 project: a large-scale cluster interconnection in France
With about 5000 processors aggregated
With high-performance DiffServ network links (RENATER)
+ high-performance latency emulation tools
http://www.grid5000.org
Interconnected with GN2
More info
RESO project at INRIA:
http://www.ens-lyon.fr/LIP/RESO
e-Toile: http://www.urec.cnrs.fr/etoile
VTHD: http://www.vthd.org
GRID5000: http://www.grid5000.org
[email protected]