NS-06-122v3 - perfSONAR for APMs

Connect. Communicate. Collaborate
GN2 Multidomain Monitoring
Service: Serving IP NOCs
Nicolas Simar, DANTE
APM Meeting, Utrecht
24th of November 2006
Agenda
• Provide the general concepts of the Multi Domain
Monitoring service.
– Set the scene.
– You’ll use it soon!
– The Support that will be offered to you.
• Demo the visualisations
– Provide feedback
• Explain the next steps and what your role will be:
– Taking part in the Pilot and Prototype.
– Using the tools.
– Which metrics and services will be available – validate the first
choice.
What is JRA1?
• The main objective of JRA1 (Performance Measurement and Management) is
to build a multi-domain monitoring framework that is interoperable across
multiple networks, which is the basis for offering a Multi-Domain
Monitoring (MDM) Service.
• Consists of the following main parts:
• Design and develop the framework (perfSONAR).
• Integrate measurement tools and databases within the perfSONAR
framework.
• Build user visualisation tools using the perfSONAR framework.
• There are about 25 participants (12.5 FTE), from 17 organisations.
– Main partners are CARNet, CESNET, Cynet, DANTE, DFN, NORDUnet,
PSNC.
perfSONAR philosophy
What is perfSONAR?
• perfSONAR is a consortium of organisations that seek
to build network performance middleware that is interoperable across multiple networks.
• perfSONAR is a protocol.
– SOAP XML messages following the schemas of the Open Grid Forum (OGF)
Network Measurement Working Group (NM-WG).
• perfSONAR is an example set of code (an implementation of
web-services using the perfSONAR protocol).
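As a rough illustration of what "SOAP XML messages" means in practice, the sketch below wraps a request in a SOAP 1.1 envelope using Python's standard library. The NM-WG namespace URI, the element layout (message, metadata, subject, data) and the interface name are simplifying assumptions made for this example and are not taken from the slides; the real schemas are richer.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
NMWG_NS = "http://ggf.org/ns/nmwg/base/2.0/"           # assumed NM-WG base namespace


def build_request(message_type: str, interface: str) -> bytes:
    """Wrap a simplified, hypothetical NM-WG style request in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    message = ET.SubElement(body, f"{{{NMWG_NS}}}message", {"type": message_type})
    # The metadata block says what is being asked about (here: one interface).
    metadata = ET.SubElement(message, f"{{{NMWG_NS}}}metadata", {"id": "meta1"})
    subject = ET.SubElement(metadata, f"{{{NMWG_NS}}}subject")
    subject.text = interface
    # The data block points back at the metadata it belongs to.
    ET.SubElement(message, f"{{{NMWG_NS}}}data",
                  {"id": "data1", "metadataIdRef": "meta1"})
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical interface name, used only to show the resulting XML.
request = build_request("SetupDataRequest", "hypothetical-router:ge-0/0/0")
print(request.decode("utf-8"))
```

A client would send such a document over HTTP to a web-service and parse a similarly structured response; that transport step is left out here.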
PerfSONAR Web-Services
• The framework takes care of the data movement.
• It covers the following perfSONAR web-services
– Auth Service (JRA5)
– Autz Service
– Lookup Service (LS)
– Measurement Archives services (MA)
• RRD MA, SQL MA, Hades MA
– Measurement Point services (MP)
• BWCTL MP, SSH/Telnet MP, CLI MP (I2), L2 status MP (JRA4)
– Topology Service (TopS, cNIS – SA3).
• Allows diversity on the measurement layer and on the visualization
layer.
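To make the role of the Lookup Service more concrete, here is a minimal in-memory sketch of a registry in which services announce themselves and clients search by service family. The class names, fields and endpoint URLs are hypothetical and only illustrate the idea of service discovery; they do not reflect the actual LS interface.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServiceRecord:
    domain: str    # administrative domain offering the service
    kind: str      # service family from the list above: "MA", "MP", ...
    name: str      # concrete flavour, e.g. "RRD MA" or "BWCTL MP"
    endpoint: str  # hypothetical URL of the web-service


class LookupRegistry:
    """Toy in-memory stand-in for a Lookup Service (LS)."""

    def __init__(self) -> None:
        self._by_kind: Dict[str, List[ServiceRecord]] = defaultdict(list)

    def register(self, record: ServiceRecord) -> None:
        """Store a service announcement under its family."""
        self._by_kind[record.kind].append(record)

    def find(self, kind: str, domain: Optional[str] = None) -> List[ServiceRecord]:
        """Return registered services of one kind, optionally limited to a domain."""
        records = self._by_kind.get(kind, [])
        return [r for r in records if domain is None or r.domain == domain]


registry = LookupRegistry()
registry.register(ServiceRecord("Domain A", "MA", "RRD MA", "http://ma.domain-a.example/rrd"))
registry.register(ServiceRecord("Domain A", "MP", "BWCTL MP", "http://mp.domain-a.example/bwctl"))
registry.register(ServiceRecord("Domain B", "MA", "SQL MA", "http://ma.domain-b.example/sql"))
registry.register(ServiceRecord("Domain B", "MP", "BWCTL MP", "http://mp.domain-b.example/bwctl"))

print([r.endpoint for r in registry.find("MP")])
```

Because visualisation tools only depend on the service family and its protocol, a domain could swap an RRD MA for an SQL MA without clients noticing, which is the diversity the last bullet refers to.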
perfSONAR philosophy
Multi-Domain Monitoring
Service (MDM)
• User: a role – a group of people making use of an MDM
Service.
– There may be several categories of users with different needs.
• An MDM service is access to a set of metrics or
functionalities offered to a group of users by several
networks using the perfSONAR protocol.
• An MDM service is offered by deploying a set of
perfSONAR web-services and/or visualisations.
• E2E really means Edge to Edge, not End to End (unless
end institutions buy into it).
Multi-Domain Monitoring Service
[Figure: users reach the service either through their own visualisation or through the GN2 visualisation. Both use perfSONAR SOAP XML + JRA5 AA to talk to Domain A, Domain B and Domain C, each of which runs a BWCTL MP, an OWD MA and a Lookup service.]
Multi-Domain Monitoring
Service
• Multi-Domain Monitoring Service
– Access to a set of monitoring functionalities (e.g. accessing metrics
or performing tests) offered to a group of users, accessible directly
through an XML SOAP interface (perfSONAR protocol) or through a
visualisation tool.
– Based on an underlying set of perfSONAR web-services.
• perfSONAR web-service
– Web service (providing data or allowing an action to be performed) using
the NM-WG XML schemas. The perfSONAR web-services are the basic
building blocks of an MDM service.
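The idea of an MDM service built from per-domain web-services can be sketched as a client that asks each domain along a path for its part of a metric and then combines the answers. In the sketch below each "fetcher" is a hypothetical stand-in for a call to a domain's Measurement Archive, the delay values are made up for illustration, and summing per-domain one-way delays (ignoring the inter-domain links) is a deliberate simplification.

```python
from typing import Callable, Dict, List, Optional

# A fetcher stands in for one web-service call to a domain's Measurement Archive.
Fetcher = Callable[[], float]


def collect_per_domain(path: List[str],
                       fetchers: Dict[str, Fetcher]) -> Dict[str, Optional[float]]:
    """Ask every domain on the path for its value; None marks a domain with no answer."""
    results: Dict[str, Optional[float]] = {}
    for domain in path:
        fetch = fetchers.get(domain)
        try:
            results[domain] = fetch() if fetch is not None else None
        except OSError:  # network-level failure of the (simulated) call
            results[domain] = None
    return results


def edge_to_edge_delay(results: Dict[str, Optional[float]]) -> Optional[float]:
    """Sum the per-domain delays, or return None if any domain is missing."""
    values = list(results.values())
    if any(v is None for v in values):
        return None
    return sum(values)


def unreachable() -> float:
    raise ConnectionError("simulated outage of the archive")


path = ["Domain A", "Domain B", "Domain C"]
# Illustrative one-way delays in milliseconds (made-up values).
healthy = {"Domain A": lambda: 4.0, "Domain B": lambda: 11.5, "Domain C": lambda: 6.25}
degraded = dict(healthy, **{"Domain B": unreachable})

print(edge_to_edge_delay(collect_per_domain(path, healthy)))
print(collect_per_domain(path, degraded))
```

Returning None for a silent domain, rather than a partial sum, keeps an incomplete edge-to-edge figure from being mistaken for a complete one.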
Demos
• http://wiki.geant2.net/bin/view/JRA1/Jra1WorkingArea
– perfsonarUI
• http://wiki.perfsonar.net/jra1-wiki/index.php/PerfsonarUI
– CNM
• http://wiki.perfsonar.net/jra1-wiki/index.php/CNM
– BWCTL MP
• Any interest for a deployment?
– Looking-glass
• http://wiki.perfsonar.net/jra1-wiki/index.php/Looking_Glass
– ABW
• https://perfmon.cesnet.cz/abw-intro.html
– Questions
• [email protected]
Demos - thanks
– perfsonarUI
• Vedrin Jeliazkov, Nina Jeliazkova (ISTF/ACAD)
– CNM
• Andreas Hanneman, David Schmitz (DFN)
– BWCTL MP
• Verena Venus, Stephan Kraft, Roland Karch (DFN)
– Looking-glass
• Stijn Verstichel (IBBT)
– ABW
• Sven Ubik (CESNET)
– And to all those providing the data…
Users Segmentation
User groups and their monitoring data usage.
User groups: PERT, NOC, Layer2 Project, Layer3 Project, PIP Project, NREN non-technical staff, End-User, Network Researcher, Security.
Usages: Advanced troubleshooting, Troubleshooting, Service health check, Project troubleshooting, Project health check, SLA verification, Added value service, Tailored functional.
[Table: the per-group "Yes" and "[optional]" marks are not recoverable from this transcript.]
MDM Service Benefits
• For the NOCs
– NRENs, EU RENs, GÉANT2 (Abilene(?), ESnet(?), RNP(?), etc).
– In DJ1.1.1
• For NOCs, 5-10% of the problems encountered involve coordination between
multiple domains.
– E2E services/IP packets don’t stop at the boundaries of a domain.
– To have an E2E view.
• In particular when offering added value E2E services.
• Link capacity, link utilisation, packet drops, topology.
– To have stand-by tools in multiple domains to perform basic tests.
• TCP throughput, link utilisation, delay, looking glass.
– To have the capability of finding out where the tools are located.
– To answer the question “End system vs network based problem?”
– Send test results easily.
– Save time.
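One way an E2E view of link capacity and link utilisation helps a NOC is in spotting which link along a multi-domain path has the least headroom. The sketch below is an illustration under stated assumptions: the link names and figures are invented, and "available capacity" is approximated as capacity × (1 − utilisation).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    domain: str           # domain that operates the link
    name: str             # hypothetical link label
    capacity_mbps: float  # nominal link capacity
    utilisation: float    # fraction of capacity in use, between 0 and 1

    def __post_init__(self) -> None:
        if not 0.0 <= self.utilisation <= 1.0:
            raise ValueError("utilisation must be a fraction between 0 and 1")

    @property
    def available_mbps(self) -> float:
        """Rough headroom left on the link."""
        return self.capacity_mbps * (1.0 - self.utilisation)


def bottleneck(path: List[Link]) -> Link:
    """Return the link with the least available capacity along the path."""
    if not path:
        raise ValueError("path must contain at least one link")
    return min(path, key=lambda link: link.available_mbps)


# Made-up figures for a path crossing three domains.
path = [
    Link("Domain A", "access-1", 1000.0, 0.25),     # 750 Mbit/s free
    Link("Domain B", "backbone-7", 10000.0, 0.5),   # 5000 Mbit/s free
    Link("Domain C", "access-4", 2500.0, 0.75),     # 625 Mbit/s free
]

worst = bottleneck(path)
print(worst.domain, worst.name, worst.available_mbps)
```

Without data from all three domains, the NOC of Domain A would see only its own link, which is not the limiting one in this made-up case.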
MDM Service Benefits
• PERT
– Similar to the NOCs.
• L2 project users (LHC OPN, DEISA, eVLBI).
– Can see the health of their service.
– Verify SLA.
– Integrate the data within their own tools.
• L3 project users (EGEE, eVLBI).
– Can see the health of their service.
– Verify SLA.
– Integrate the data within their own tools.
– We can provide them with added value services (traffic matrix between project
sites).
• End-users, once appropriate tools are made available.
– Empowering the network users: indication about the network.
– Work not started.
Going Operational
• Pre-roll-out – now to March 07 – define and set up the support structure.
• Pilot – April 07 to August 07 – 5 RENs + GÉANT2
– For NOC and PERT (no AA).
– Understand the issues of going operational.
– Validate the support structure, get feedback for the next phase.
– Release in January, deployment training in February.
• Prototype – October 07 to February 08 – 11 RENs + GÉANT2
– For NOC, PERT and a limited number of projects.
– Verify the MDM SLA.
– Dedicated support team.
– Verify how to provide the service to external parties.
– Test the turn key solution.
• Operation – April 08
– More RENs, closer to end-institution.
– More projects supported.
MDM Service Pilot portfolio

Service             Metric                        Data type
RRD MA or SQL MA    L3 link utilisation           Historical
                    L3 link capacity              Historical
L2 status MP (*)    L2 circuit status             Latest
SQL MA (*)          L2 circuit status             Historical
Hades MA            OWD, IPDV, OWPL               Historical
                    traceroute                    Historical
Telnet/SSH MP       Delay RTT                     On-demand
                    show command                  On-demand
                    Traceroute                    On-demand
BWCTL MP            Achievable throughput (TCP)   On-demand
                    UDP throughput                On-demand
Lookup Service      Service discovery

(*) L2 status MP or SQL MA
Access columns: perfsonarUI, CNM, NEMO, VisualperfSONAR, JRA4 L2 visualisation, XML access.
[Table: the per-metric "Yes" marks under the access columns are not recoverable from this transcript.]
MDM Service Prototype portfolio

Service             Metric                        Data type
RRD MA or SQL MA    L3 link utilisation           Historical
                    L3 link capacity              Historical
                    L3 Interface Output drops     Historical
                    L3 Interface Input drops      Historical
L2 status MP        L2 circuit status             Latest
SQL MA              L2 circuit status             Historical
Hades MA            OWD, IPDV, OWPL               Historical
                    Achievable throughput (TCP)   Historical
                    UDP throughput                Historical
                    traceroute                    Historical
Telnet/SSH MP       Delay RTT                     On-demand
                    show command                  On-demand
                    Traceroute                    On-demand
BWCTL MP            Achievable throughput (TCP)   On-demand
                    UDP throughput                On-demand
Hades MP            OWD, IPDV, OWPL               On-demand
Lookup Service      Service discovery
Topology Service    Topology information
Auth or GiDP        Authentication Service
Autz                Authorisation Service

Access columns: perfsonarUI, CNM, NEMO, VisualperfSONAR, JRA4 L2 visualisation, XML access.
[Table: the per-metric "Yes" marks under the access columns are not recoverable from this transcript.]
In orange: the additional functionality foreseen for the Prototype over the Pilot.
Taking part in the Pilot
• Deploy the web-services and provide the appropriate data.
• Set up an MDM Level2 support and provide an operational
service.
– Ensure availability of the web-services (monitor the web-services) and report problems following the MDM service
procedures.
• Have the NOC and PERT use the infrastructure, solve
issues with it and provide feedback.
– Train the NOC and PERT.
• Validate the Service at the end of the phase.
– Tools, metrics, services.
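Monitoring the availability of deployed web-services can start with something as simple as a periodic TCP reachability probe. The following is a minimal sketch, assuming that "available" is approximated by "a TCP connection can be opened". The service name is hypothetical, and the demo probes a throwaway local listener so that it runs without any real deployment.

```python
import socket
from typing import Dict, Tuple


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable host, ...
        return False


def check_services(endpoints: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Probe every named endpoint once and report which ones answered."""
    return {name: is_reachable(host, port) for name, (host, port) in endpoints.items()}


# Demo against a local listener standing in for a deployed web-service.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

status_up = check_services({"hypothetical RRD MA": ("127.0.0.1", port)})
listener.close()                 # simulate the service going down
status_down = check_services({"hypothetical RRD MA": ("127.0.0.1", port)})

print(status_up, status_down)
```

A successful connection only shows that the port is open; a fuller check would also send a request and validate the response from the service.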
MDM Web-services (Pilot phase)

Service                                Metric                        Data type
RRD MA or SQL MA                       L3 link utilisation           Historical
                                       L3 link capacity              Historical
L2 status MP (*)                       L2 circuit status             Latest
SQL MA (*)                             L2 circuit status             Historical
Hades MA (3 tool deployment per REN)   OWD, IPDV, OWPL               Historical
                                       traceroute                    Historical
Telnet/SSH MP                          Delay RTT                     On-demand
                                       show command                  On-demand
                                       Traceroute                    On-demand
BWCTL MP (3 instances per REN)         Achievable throughput (TCP)   On-demand
                                       UDP throughput                On-demand
Lookup Service                         Service discovery
(*) To offer L2 status information, you can choose either the L2 status MP or the SQL MA.
An NREN will only provide L2 status information when offering L2 circuits to LHC and DEISA.
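The one-way metrics listed for Hades (OWD, IPDV, OWPL) can be illustrated with a few lines of arithmetic over probe timestamps. This is only a sketch: it assumes synchronised sender and receiver clocks, takes IPDV as the difference between the one-way delays of consecutive packets (the usual definition, as in RFC 3393), and uses invented timestamps in microseconds. It is not a description of how Hades itself computes these values.

```python
from typing import Dict, List, Tuple


def one_way_metrics(sent: Dict[int, int],
                    received: Dict[int, int]) -> Tuple[Dict[int, int], List[int], float]:
    """Compute OWD per packet, IPDV per consecutive pair, and the loss ratio.

    sent / received map a sequence number to a timestamp in microseconds.
    """
    if not sent:
        raise ValueError("at least one packet must have been sent")
    # One-way delay: receive time minus send time, for packets that arrived.
    owd = {seq: received[seq] - sent[seq] for seq in sorted(sent) if seq in received}
    # IPDV: delay difference between packet n+1 and packet n, when both arrived.
    ipdv = [owd[seq + 1] - owd[seq] for seq in owd if seq + 1 in owd]
    # One-way packet loss: share of sent packets that never arrived.
    owpl = 1 - len(owd) / len(sent)
    return owd, ipdv, owpl


# Four probes sent 1 ms apart; packet 3 is lost (made-up values).
sent = {1: 0, 2: 1000, 3: 2000, 4: 3000}
received = {1: 5000, 2: 6200, 4: 7900}

owd, ipdv, owpl = one_way_metrics(sent, received)
print(owd, ipdv, owpl)
```

Pairs that straddle a lost packet are skipped rather than bridged, so a loss does not show up as artificial delay variation.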
MDM Service Support
• An infrastructure to support the perfSONAR web-services and
the visualisation tools used by the MDM service will be set up.
– For the deployers: installation, configuration, incident, monitoring.
– For users: installation, utilisation.
[Diagram: Deployers (RENs) – SLA – Deployer Service Desk; Users (NOC, PERT, Projects) – SLA – User Service Desk; ISS.]
MDM Service Support
• Level1 – Service Desk (ISS)
– Help to install and configure the tools, run reachability tests, help on usability,
track the RFEs, forward problems to the proper person, log the requests, update
the documentation, track bugs. This is a central function (rotating member
or group of people - ownership).
• Level2 – Administrator (RENs)
– Administrator of the machines where the services are installed. The
function lies within the providers. They are in charge of taking care of the
security of the services, of their availability (up) and reachability (no
firewall, etc). The service should be available 24/7.
• Level3 – Developers (3 years subcontract).
– The JRA1 developers who have built the services. They are in charge of
implementing new features, fixing bugs and answering the queries
forwarded by Level1.
• The three levels of support will be available to both the users and the
deployers.
MDM Service Support
• A turn key solution service could be provided for the web-services of an MDM service or part of it.
– HW bought.
– Web-services installed, monitored and managed on the REN's
behalf.
– REN would still have to do a little bit.
• More information about the MDM service in January
– Transition to Service session on Tuesday afternoon.
– Which questions would you like answered during that session?
Visualisations
User groups and their monitoring data usage (first table): PERT, NOC, Layer2 Project, Layer3 Project, PIP Project.
Visualisation tools and their usages (second table): perfsonarUI, CNM, NEMO, VisualperfSONAR, JRA4 E2E L2 visualisation.
Usage columns in both tables: Advanced troubleshooting, Troubleshooting, Service health check, Project troubleshooting, Project health check, SLA verification, Added value service, Tailored functional.
[Tables: the per-cell marks ("Yes", "v", "[o]+[a]") are not recoverable from this transcript.]
In Red: Targeted for the Pilot.
In Orange: Probably targeted for the Prototype (in addition to the Pilot ones).
To find out which visualisation tool a user group will use, choose one type of usage
and look in the same column of the second table for the tools available for this usage.