EGI-InSPIRE
Network Troubleshooting and
PerfSONAR-Lite_TSS
Mario Reale
GARR
EGI-InSPIRE RI-261323
www.egi.eu
A little bit of history
• During EGEE III a task of the SA2 activity
was dedicated to the provisioning of a
network monitoring solution for EGEE
• The emphasis shifted
– from “monitoring” to “troubleshooting”,
– from “scheduled measurements” to “on-demand”,
– and to light deployment
• DFN (RRZN Erlangen) designed a tool
based on the widely known/used
Same concept
• Launch tests on demand from one site under the control of the central
server: ping, traceroute, DNS lookup, nmap and bandwidth
measurements
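As an illustration of what "launching a test on demand" could look like on the probe side, here is a minimal Python sketch that maps the test names above to standard command-line tools. The function name `build_command`, the test keys and the hostname check are assumptions made for this example; the real probe is driven through OPPD and BWCTLD, not through this code. Bandwidth measurements are left out because they need a cooperating daemon at both ends.

```python
import re

# Hypothetical helper: conservative hostname/IPv4 check so that a target
# can never be mistaken for a command-line option or shell fragment.
HOSTNAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?$")


def build_command(test, target, port=None):
    """Return the argv list for one on-demand test against `target`."""
    if not HOSTNAME_RE.match(target):
        raise ValueError(f"invalid target: {target!r}")
    if test == "ping":
        return ["ping", "-c", "4", target]        # send 4 echo requests
    if test == "traceroute":
        return ["traceroute", target]
    if test == "dns":
        return ["host", target]                   # simple DNS lookup
    if test == "port":
        if port is None or not 1 <= int(port) <= 65535:
            raise ValueError("port test needs a port between 1 and 65535")
        return ["nmap", "-p", str(int(port)), target]
    raise ValueError(f"unsupported test: {test!r}")
```

The argv list would typically be handed to `subprocess.run` without a shell; building the list separately keeps the validation easy to test.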
[Figure: on-demand troubleshooting workflow, steps 1 to 7. Labels: authentication and authorization process (login/passwd or certificate); users: TOC, NOC / ROCs members, site administrator / troubleshooter; central web monitoring server; mutual authentication; local site light probe at Grid site A and Grid site B; security based on IP address; BWCTL port open on demand.]
Network monitoring tools
• Network monitoring tools for efficient remote troubleshooting
– PerfSONAR-Lite TroubleShooting Service
– Launch tests on demand from a Grid site under central server control:
– Bandwidth measurements, DNS lookup, Traceroute, Port testing, Ping
• PerfSONAR-Lite TSS
– is easy to use for the Grid administrators
– can be used quickly by a site admin without the obligation to make contact with the remote site involved in the problem
– fills the gap left by the lack of a network diagnostic tool
[Figure: on-demand test workflow, steps 1 to 7. Labels: authentication and authorization process; users: ENOC supervisor, ROCs members, site administrator; ENOC; central ENOC monitoring server; local site light PerfSONAR probe at Grid site A and Grid site B.]
Networking Support – Xavier Jeannin - EGEE-III First Review 23-24
June 2010
Network monitoring tools
• First version was released and installed on 6 sites
• Installation guide and procedure
– http://www.dfn.de/en/enhome/x-win/download-ofperfsonar-lite-tss/
– FAQ, tutorial, new features (users, sites, ROC management)
– Software authorization scheme was adapted to fit the hierarchical EGI/NGI model
• Difficult to deploy the software during the transition phase toward EGI
New hierarchy
• TOC = Top Level Operation Center (in the EGI context, this should be the EGI NOC at GARR)
• OC = Operation Center (a NOC; in the EGI context, these should be the NGIs)
• S = Site
• This organization is more flexible
– a site can be included in several operation centers
– operation centers can be created easily
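The hierarchy above can be pictured as a small data model. The following Python sketch is only an illustration under assumed names (`Hierarchy`, `OperationCenter`, `attach`, `centers_of` and the sample site and NGI names are all hypothetical); it is not the tool's actual database schema. It shows the two properties listed on the slide: a site may belong to several operation centers, and an operation center is created simply by naming it.

```python
from dataclasses import dataclass, field


@dataclass
class OperationCenter:
    """An OC (for example an NGI) and the sites attached to it."""
    name: str
    sites: set = field(default_factory=set)


@dataclass
class Hierarchy:
    """One TOC at the top, any number of OCs below it."""
    toc: str
    centers: dict = field(default_factory=dict)

    def add_center(self, name):
        # Creating an operation center is just registering its name.
        if name not in self.centers:
            self.centers[name] = OperationCenter(name)
        return self.centers[name]

    def attach(self, site, center_name):
        # A site can be attached to several operation centers.
        self.add_center(center_name).sites.add(site)

    def centers_of(self, site):
        return sorted(n for n, oc in self.centers.items() if site in oc.sites)
```

For example, attaching a site to two operation centers and then calling `centers_of` returns both names, which is the many-to-many relation the flat TOC/OC/S description implies.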
Improvement
• Become general-purpose software rather than dedicated software
– Usable by any kind of operation center
• Improve security
– The probe should be accessible for bandwidth control tests only by the source site
• Accept both login/password and certificate authentication
• Improve the general design of the web forms, which were not user friendly
• Continuously maintain a list of active probes for the end user
• Further automate the probe installation
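The security item above (a probe reachable for bandwidth tests only from the source site) amounts to an address check before the test is accepted. Here is a minimal Python sketch of that idea, assuming the source site is described by a list of network prefixes; the function name `may_run_bandwidth_test` and the example addresses (taken from the reserved documentation ranges) are hypothetical and do not describe how the tool actually enforces the rule.

```python
import ipaddress


def may_run_bandwidth_test(requester_ip, source_site_networks):
    """Return True only if the requester belongs to the source site.

    `source_site_networks` is an assumed list of CIDR prefixes, e.g.
    ["192.0.2.0/24"], registered for the site that launched the test.
    """
    addr = ipaddress.ip_address(requester_ip)
    return any(addr in ipaddress.ip_network(net) for net in source_site_networks)
```

A probe could apply such a check before opening the bandwidth test port for the duration of one measurement and close it again afterwards.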
Development choice
• Because the database schema has to change substantially, we decided to rewrite the web server part completely
– But the probe software part should not be modified (or only very slightly): OPPD, BWCTLD
• Use ZK, a Java framework technology: http://www.zkoss.org/
– More efficient development
• Work has been started by 2 trainees (Alexandre AL ABAYAJI and Youssef Diouane)
Software architecture
Users:
• ENOC
• NOC / ROCs members
• site administrator / troubleshooter
[Figure: software architecture. Central server: web server, AA, local DB. Two site probes, each labelled ping, traceroute, DNS lookup, port scan, OPPD and BWCTLD. Other labels: Soap message, HINTS.]
The state of the development
• Learning ZK and the Eclipse environment
• New design of the database and
implementation
• All the technical problems for the
development with ZK have been solved:
– Troubleshooting is working
– Multi-authentication and SSL authentication on
web server (login/passwd or certificate)
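To make "multi-authentication" concrete, the following Python sketch accepts either a login/password pair or a certificate subject. It is an illustration only: the record layout, the function names and the choice of PBKDF2 for password hashing are assumptions for this example, and it presumes that the SSL layer of the web server has already validated the client certificate before its subject DN is passed in. It does not describe the ZK-based implementation.

```python
import hashlib
import hmac


def hash_password(password, salt):
    """Derive a password hash (PBKDF2-HMAC-SHA256, assumed parameters)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def authenticate(users, login=None, password=None, cert_dn=None):
    """Return the matching user name, or None if authentication fails.

    `users` maps a login to a record with optional keys
    "salt", "pw_hash" and "cert_dn" (hypothetical layout).
    """
    if cert_dn is not None:
        # Certificate path: look the verified subject DN up in the user table.
        for name, rec in users.items():
            if rec.get("cert_dn") == cert_dn:
                return name
        return None
    rec = users.get(login)
    if rec is None or password is None or "pw_hash" not in rec:
        return None
    candidate = hash_password(password, rec["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    return login if hmac.compare_digest(candidate, rec["pw_hash"]) else None
```

Keeping both paths behind one function means the rest of the web application only ever sees an authenticated user name, whichever method was used.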
Pending work and time schedule
• Improvement of the web interface ergonomics
• Many features are still not available
– Manage sites
– Manage operating centers
– Update the list of active probes…
• GOC-DB data importation and synchronization
• SSL between web server and the probe
• Automate probe installation
• We plan to have a prototype version by the end of November