LCG-EDT Monitoring Service
DataTAG WP4 Monitoring Group
DataTAG WP4 meeting
Bologna – 2003.02.18
Gennaro Tortone, Sergio Fantinel – Bologna, 2003-02-18
Summary
- monitoring of Grid elements
- GLUE schema
- EDG-WP4 monitoring framework
- Discovery process
- tasks
Monitoring of grid elements (1/2)
[Diagram: grid elements to monitor: Computing Element (with Worker Nodes), Storage Element, Resource Broker, Replica Catalog, Replica Manager, Information Index, …]
LOW LEVEL measurements
- CPU load
- memory usage
- disk usage (per partition)
- network activity
- number of processes
- number of users (UI)
- …
SERVICE checks
- gatekeeper
- gsiftp
- gris
- gdmp
- RB/LB
- …
“GRID” measurements
- number of total CPUs
- number of free CPUs
- number of running jobs
- number of waiting jobs
- SE free disk space
- …
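A SERVICE check of the kind listed above can be as simple as testing whether a service port accepts TCP connections. The following is a minimal sketch, not the sensors actually used by the monitoring server; the function names are hypothetical, and the port numbers are the commonly used Globus defaults, assumed here and possibly different at a given site.

```python
import socket
from typing import Dict

# Assumed default ports for three of the services named on the slide;
# a real site may run them on other ports.
DEFAULT_PORTS = {"gatekeeper": 2119, "gsiftp": 2811, "gris": 2135}


def service_is_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_host(host: str, ports: Dict[str, int] = DEFAULT_PORTS) -> Dict[str, bool]:
    """Run the TCP check for every named service on one host."""
    return {name: service_is_up(host, port) for name, port in ports.items()}
```

A successful connection only shows that something is listening; a fuller check would also speak the service protocol (for example, an LDAP search against the GRIS).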
Monitoring of grid elements (2/2)
sources of information
- LOW LEVEL measurements -> plugins/sensors installed on each machine (published through GRIS)
- SERVICE checks -> sensors installed on monitoring server
- GRID measurements -> sensors installed on monitoring server
- ALL => DataBase
aggregate information (monitoring server side)
- per VO
- per site
- …
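The server-side aggregation per VO or per site amounts to a group-and-sum over the collected samples. A minimal sketch follows; the record layout (`site`, `vo`, `running_jobs`, `waiting_jobs`) is an assumption made for illustration, not the actual database structure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def aggregate(samples: Iterable[dict], key: str,
              fields: List[str]) -> Dict[str, Dict[str, int]]:
    """Sum the given numeric fields over all samples sharing the same key."""
    totals = defaultdict(lambda: dict.fromkeys(fields, 0))
    for sample in samples:
        for name in fields:
            totals[sample[key]][name] += sample[name]
    return dict(totals)


# Made-up samples, one per (site, VO) pair.
SAMPLES = [
    {"site": "site-a", "vo": "vo1", "running_jobs": 4, "waiting_jobs": 1},
    {"site": "site-a", "vo": "vo2", "running_jobs": 2, "waiting_jobs": 0},
    {"site": "site-b", "vo": "vo1", "running_jobs": 3, "waiting_jobs": 5},
]

per_site = aggregate(SAMPLES, "site", ["running_jobs", "waiting_jobs"])
per_vo = aggregate(SAMPLES, "vo", ["running_jobs", "waiting_jobs"])
```

The same function serves both views; only the grouping key changes.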
GLUE schema
Conceptual model of grid resources to be used as a base schema of the GIS (Grid Information Service) for discovery and monitoring purposes
- model of computing resources (CE)
- model of storage resources (SE)
- model of relationships among them (close CE/SE)
Implementation status (v. 1.0) (for Globus MDS)
- LDAP schema (DataTAG WP4.1)
- information providers (CE/SE)
GLUE schema extension to include all monitoring metrics
- done – “host level” added to GLUE schema
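To make the three parts of the model concrete (CE, SE, and the "close" relationship between them, plus the added host level), here is a rough sketch as plain data classes. All class and field names are hypothetical and chosen for readability; they are not the attribute names of the GLUE LDAP schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Host-level entry (the level added by the schema extension)."""
    name: str
    cpu_load: float


@dataclass
class StorageElement:
    se_id: str
    free_space_gb: float


@dataclass
class ComputingElement:
    ce_id: str
    total_cpus: int
    free_cpus: int
    hosts: List[Host] = field(default_factory=list)
    # Identifiers of the SEs considered "close" to this CE.
    close_se_ids: List[str] = field(default_factory=list)


def close_storage(ce: ComputingElement,
                  ses: List[StorageElement]) -> List[StorageElement]:
    """Resolve the close CE/SE relationship to the matching SE objects."""
    return [se for se in ses if se.se_id in ce.close_se_ids]
```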
EDG-WP4 monitoring framework
It provides a client (Monitoring Sensor Agent, MSA) that runs sensors (Monitoring Sensors, MS) on each monitored node, and a central server (Fabric Monitoring Server, fmonServer) that collects the data.
The server receives samples from the MSA as they are measured and stores them in a flat file or an Oracle database.
The client ships with a sensor (sensorLinuxProc) that uses the /proc file system to measure various basic quantities on Linux (CPU load, network, etc.).
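As an illustration of what a /proc-based sensor measures, the sketch below parses `/proc/loadavg` and `/proc/meminfo`. It is not the sensorLinuxProc code; the function names are hypothetical, and the parsers take the file content as text so they can be tried on any machine.

```python
def parse_loadavg(text: str) -> float:
    """Return the 1-minute load average from /proc/loadavg content."""
    return float(text.split()[0])


def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo key to its integer value (kB for most entries)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_used_fraction(meminfo: dict) -> float:
    """Fraction of physical memory not reported as free."""
    return 1.0 - meminfo["MemFree"] / meminfo["MemTotal"]


def read_sample() -> dict:
    """Take one sample from the live /proc file system (Linux only)."""
    with open("/proc/loadavg") as f:
        load = parse_loadavg(f.read())
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    return {"cpu_load": load, "memory_used": memory_used_fraction(mem)}
```

In the framework described above, a sample like the one returned by `read_sample()` would be handed to the agent, which forwards it to the server.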
EDG-WP4 monitoring framework
[Diagram: monitoring architecture. Worker node: the WP4 monitoring agent runs the WP4 sensor, which reads the /proc filesystem and returns metric output. Computing element (local farm element): the WP4 fmonserver writes the farm monitoring archive; the GRIS (GLUE schema) runs the information providers, which read the archive and return LDIF output. Monitoring server: the discovery service sends LDAP queries to the GIIS (GLUE schema, information index) and the monitoring service sends LDAP queries to the GRIS; the Monitoring DB and the web interface sit on the monitoring server side.]
Discovery process
- Through the GIIS, via LDAP, we can obtain the CE/SE available at a specific time.
- Using a DB, we can compare the information from the GIIS against the past availability history of the resources (an object can be new, disappeared, or re-available).
- Through the GRIS of each CE/SE we can obtain SITE/HOST information (we can repeat the discovery process at site level to get site resources and information, e.g. CEIDs, WNs, Network Adapters, supported transfer protocols, …).
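The comparison against past history reduces to a few set operations. The sketch below assumes the DB can supply two sets: the resources that were available at the previous snapshot and those seen earlier but missing at the previous snapshot. The function and argument names are hypothetical.

```python
def classify(current: set, known_up: set, known_down: set) -> dict:
    """Compare the current GIIS listing with what the DB already knows.

    current    -- resources returned by the GIIS right now
    known_up   -- resources available at the previous snapshot
    known_down -- resources seen in the past but absent at the previous snapshot
    """
    return {
        "new": current - known_up - known_down,
        "re-available": current & known_down,
        "disappeared": known_up - current,
        "unchanged": current & known_up,
    }
```

For example, with `current = {"ce1", "ce3", "se1"}`, `known_up = {"ce1", "ce2"}` and `known_down = {"se1"}`, `ce3` is new, `se1` is re-available, `ce2` has disappeared and `ce1` is unchanged.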
Discovery process: base schema
[Diagram: the Monitoring Server (with the Monitoring DB, accessed via SQL) talks LDAP to the GIIS Server and to the GRIS of each Computing Element/Storage Element.]
1: LDAP Query (Monitoring Server to GIIS)
2: available CE/SE
3: LDAP Query (Monitoring Server to GRIS)
4: CEIDs, WNs, …
Steps 3,4 repeated for every CE/SE
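The four steps can be sketched as a loop over the elements returned by the GIIS. To keep the example runnable without an LDAP server, the two queries are passed in as functions; in a real deployment they would wrap LDAP searches. The names `discover`, `fake_giis` and `fake_gris`, and the host names and result keys, are illustrative assumptions.

```python
from typing import Callable, Dict, List


def discover(query_giis: Callable[[], List[str]],
             query_gris: Callable[[str], Dict[str, list]]) -> Dict[str, Dict[str, list]]:
    """Build an inventory: one GIIS query, then one GRIS query per CE/SE."""
    inventory = {}
    for element in query_giis():                   # steps 1 and 2
        inventory[element] = query_gris(element)   # steps 3 and 4
    return inventory


# In-memory stand-ins for the two LDAP queries (made-up data).
def fake_giis() -> List[str]:
    return ["ce1.example.org", "se1.example.org"]


def fake_gris(host: str) -> Dict[str, list]:
    answers = {
        "ce1.example.org": {"ceids": ["ce1.example.org/queue-a"], "wns": ["wn1", "wn2"]},
        "se1.example.org": {"protocols": ["gsiftp"]},
    }
    return answers[host]


inventory = discover(fake_giis, fake_gris)
```

The resulting inventory is what the monitoring server would then write to the Monitoring DB.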
Tasks (1/3)
- identify the requirements for Grid monitoring
  done – Grid monitoring analysis draft [with some LCG inputs] (available on http://gridmon.na.infn.it/lcg-edt)
- evaluation of existing monitoring tools (sensors) to use as “first monitoring layer” on each grid-element
  done – tools evaluated: EDG-WP4 fabric-monitoring tool (fmon)
  - client-server model
  - very easy to use
  - very easy to install (one RPM, without dependencies)
  - highly customizable (time interval for each metric, …)
  - very easy to add a new metric
  - historical archive
  - database in Oracle/plain-text format
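The idea of a per-metric time interval and of easily added metrics can be illustrated with a small registry. This is a conceptual sketch only; it does not reproduce fmon's configuration format or API, and every name in it is hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class MetricRegistry:
    """Keep a set of metrics, each sampled at its own interval (in seconds)."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Tuple[float, Callable[[], float]]] = {}
        self._last_run: Dict[str, float] = {}

    def add(self, name: str, interval: float, sensor: Callable[[], float]) -> None:
        """Register a new metric: adding one is a single call."""
        self._metrics[name] = (interval, sensor)

    def due(self, now: float) -> List[str]:
        """Names of metrics never sampled, or whose interval has elapsed."""
        return [name for name, (interval, _) in self._metrics.items()
                if name not in self._last_run
                or now - self._last_run[name] >= interval]

    def sample(self, now: float) -> Dict[str, float]:
        """Run every due sensor and remember when it ran."""
        results = {}
        for name in self.due(now):
            results[name] = self._metrics[name][1]()
            self._last_run[name] = now
        return results
```

A caller would invoke `sample(time.time())` in a loop; passing the clock in explicitly keeps the class easy to test.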
Tasks (2/3)
- extension of the WP4 fabric-monitoring tool (fmon) to include other monitoring metrics
  done – (all metrics added are available on http://gridmon.na.infn.it/lcg-edt)
- GLUE schema extension to include all monitoring metrics
  done – “host level” added to GLUE schema
- development of information-providers “to fill” the GLUE host level extension – done
- definition of database structure to store snapshot/historical monitoring data – done
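One possible shape for a snapshot/historical store is a pair of tables: an append-only history and a snapshot holding only the latest value per host and metric. The sketch below uses SQLite as a stand-in; the table and column names are assumptions and do not describe the structure actually defined by the group.

```python
import sqlite3

SCHEMA = """
CREATE TABLE history (
    host   TEXT    NOT NULL,
    metric TEXT    NOT NULL,
    ts     INTEGER NOT NULL,
    value  REAL    NOT NULL
);
CREATE TABLE snapshot (
    host   TEXT    NOT NULL,
    metric TEXT    NOT NULL,
    ts     INTEGER NOT NULL,
    value  REAL    NOT NULL,
    PRIMARY KEY (host, metric)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create both tables."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def store(db: sqlite3.Connection, host: str, metric: str,
          ts: int, value: float) -> None:
    """Append to the history and overwrite the snapshot row.

    Assumes samples arrive in time order, so the last one stored is the latest.
    """
    row = (host, metric, ts, value)
    db.execute("INSERT INTO history VALUES (?, ?, ?, ?)", row)
    db.execute("INSERT OR REPLACE INTO snapshot VALUES (?, ?, ?, ?)", row)
    db.commit()
```

Current-state views would read the `snapshot` table, while trend plots would read `history`.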
Tasks (3/3)
- automatic resource discovery using MDS infrastructure and GLUE schema – in progress
- development of a web interface to display various “grid-views” (per VO, per site, etc.) – in progress