Transcript Werner Nutt
WP3
R-GMA – DataGrid’s Monitoring System
1/7/2003
Werner Nutt (Heriot-Watt University)
R-GMA = Relational Grid Monitoring Architecture
• Grid Monitoring and Information System developed within DataGrid (Work Package 3)
• Based on the “Grid Monitoring Architecture” of the Global Grid Forum
• Code is open source and freely available
Homepage: type “wp3” into Google
R-GMA -DataGrid's Monitoring System
Werner Nutt - 1/7/2003
Contributors
• Heriot-Watt, Edinburgh
– Andrew Cooke, Alasdair Gray, Lisha Ma, Werner Nutt
• IBM-UK
– James Magowan, Manfred Oevers, Paul Taylor
• Queen Mary, University of London
– Roney Cordenonsi
• CCLRC/PPARC
– Rob Byrom, Laurence Field, Steve Hicks, Manish Soni,
Antony Wilson, Jason Leake
– Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
• SZTAKI, Hungary
– Peter Kacsuk, Norbert Podhorszki
• Trinity College Dublin
– Brian Coghlan, Stuart Kenny, David O’Callaghan
Overview
• Grid monitoring: Requirements
• The R-GMA approach: a virtual monitoring database
• Components of R-GMA:
– Schema
– Producers and Consumers
– Registry
– Republishers
• Query Planning
Major Components of DataGrid
[Figure: major components of DataGrid. Labels in the diagram: Job Submission, User Interface, Resource Broker, Status, Information, Monitoring System, Logging and Bookkeeping, Replica Catalogue, Computing Element, Storage Element, Computer (several), Data Transfer.]
WP7: R-GMA Collects
Network Monitoring Data
The Grid Monitoring Problem
In a Grid we have
– Computers
– Storage elements
– Network nodes and connections
– Application programmes, …
Monitoring:
– What is the current state of the system?
– How did the system behave in the past?
Monitoring Data
Come in two Kinds
A Grid monitoring system makes available
two kinds of data
• static data “pools”, e.g., databases on
– network topology, nodes connected
– applications available (versions, licences, ...)
• “streams” of data, e.g.,
– sensor data (cpu load, network traffic, ...)
Data streams may give rise to data pools if they are archived
Today: R-GMA is tailored towards streams,
but not pools
Examples of
Monitoring Queries
• “Show me the (average) cpu-load
of computers at Heriot-Watt!”
• “Between which nodes was yesterday's average transportation time for 1 MB packets
higher than 0.… seconds?”
• “For every computing element CE,
how many computers of CE currently have
a cpu-load of no more than 30%?”
Grid Monitoring Requirements
• Support for publishing data “pools” and “streams”
• Support for locating data sources
(automatic, if possible)
• Queries with different temporal interpretations
(continuous, latest state, history)
• Scalability
(there may be thousands of data sources)
• Resilience to failure
(data sources may become unavailable)
• Flexibility
(we don’t know which queries will be posed)
Architecture Approach 1:
A Monitoring Data Warehouse
Idea:
– store all data about the Grid status in one huge database
– and query it
Not realistic:
• Loading takes time
• Data occupy space
• Connections to the warehouse may fail
• Often monitoring data flow as data streams, and
queries ask for data streams as output
Approach 2: Monitoring with a
“Multi-agent System”
The Grid Monitoring Architecture (GMA) of the Global Grid Forum
distinguishes between:
• Consumers of information
• Producers of information
• Directory Service
– Producers register their supply
– Consumers register their demand
[Figure: GMA components. Labels in the diagram: Monitoring Application, Consumer, find/register, Directory Service, Producer, Sensor, Data Base.]
Directory Service mediates between producers and consumers
Questions about GMA:
• Which kinds of producers and consumers are there?
• In which language do producers register their supply
and consumers their demand?
• What is the meaning of a registration?
• How does a consumer find suitable producers?
And how does a producer find suitable consumers?
• Producers have different capabilities to answer queries
(e.g. selections, joins, …).
Which of them should they register?
R-GMA: A Virtual Monitoring
Data Warehouse
• Language of producers and consumers: relational queries (SQL)
• Vocabulary: Relations in a global schema
• Consumer: poses queries over the global schema
• Producer:
– has a type (stream producer, database producer)
– publishes relations R1, …, Rk
– for every R, registers a simple view V on the global schema
[Figure: a Consumer poses a Query over the Global Schema S; a DB-Producer (with DB) and a Stream Producer (with Sensor) register views V1, V2, …, Vn on S with the Registry.]
Schema & Contributions
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002
CPULoad (Stream Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
CPULoad (Stream Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Stream Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
Contributions are Views
CPULoad (Producer 1)
UK  RAL  CDF    0.3  19055711022002
UK  RAL  ATLAS  1.6  19055611022002
SELECT * FROM cpuLoad
WHERE country = 'UK' AND site = 'RAL'
CPULoad (Producer 2)
UK  GLA  CDF    0.4  19055811022002
UK  GLA  ALICE  0.5  19055611022002
SELECT * FROM cpuLoad
WHERE country = 'UK' AND site = 'GLA'
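The idea that each producer contributes the part of the global relation selected by its view can be sketched in a few lines of Python. This is only an illustration, not R-GMA code: the rows and the two selection conditions come from the slide, while `CpuLoad`, `Producer` and `global_relation` are hypothetical names introduced here.

```python
# Illustrative sketch only: Producer and global_relation are hypothetical
# names, not part of the R-GMA API.
from typing import Callable, NamedTuple

class CpuLoad(NamedTuple):
    country: str
    site: str
    facility: str
    load: float
    timestamp: str

class Producer(NamedTuple):
    name: str
    view: Callable[[CpuLoad], bool]  # registered selection on cpuLoad
    rows: list                       # tuples the producer publishes

producer1 = Producer(
    "Producer 1",
    lambda t: t.country == "UK" and t.site == "RAL",
    [CpuLoad("UK", "RAL", "CDF", 0.3, "19055711022002"),
     CpuLoad("UK", "RAL", "ATLAS", 1.6, "19055611022002")],
)
producer2 = Producer(
    "Producer 2",
    lambda t: t.country == "UK" and t.site == "GLA",
    [CpuLoad("UK", "GLA", "CDF", 0.4, "19055811022002"),
     CpuLoad("UK", "GLA", "ALICE", 0.5, "19055611022002")],
)

def global_relation(producers):
    """Union of all contributions; each row must satisfy its producer's view."""
    result = set()
    for p in producers:
        for row in p.rows:
            if not p.view(row):
                raise ValueError(f"{p.name} published a row outside its view")
            result.add(row)
    return result

cpu_load = global_relation([producer1, producer2])
sites = sorted({t.site for t in cpu_load})
print(sites)
```

The check inside `global_relation` corresponds to reading a registered view as a promise that the producer publishes nothing outside it.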
Keys in the Global Schema
Network throughput:
tp(src, dest, method, pcktSize, timestamp, time)
Intuitively, tp has the primary key
(src, dest, method, pcktSize, timestamp).
We need to know the primary keys
• to understand the global schema
• to answer latest snapshot queries
Primary keys are declared, but not enforced!
However, they sometimes hold globally if they hold locally!
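One situation in which local keys carry over to the global relation can be sketched as follows. The assumption made here (not stated on the slide) is that the producers' views select disjoint values of a key attribute such as src. The sample tuples are made up for illustration, and `key_of` and `key_holds` are hypothetical helpers.

```python
# Illustrative sketch; helper names and sample values are hypothetical.
from collections import Counter

# Attributes of tp and the attributes that form the declared primary key.
ATTRS = ("src", "dest", "method", "pcktSize", "timestamp", "time")
KEY = ("src", "dest", "method", "pcktSize", "timestamp")

def key_of(row):
    """Project a tp tuple onto its primary-key attributes."""
    return tuple(row[ATTRS.index(a)] for a in KEY)

def key_holds(rows):
    """True if no two rows share the same key value."""
    counts = Counter(key_of(r) for r in rows)
    return all(n == 1 for n in counts.values())

# Two producers whose views select disjoint values of the key attribute src.
p1 = [("hw", "ral", "ping", 10, 1, 0.20), ("hw", "ral", "ping", 10, 2, 0.25)]
p2 = [("ral", "hw", "ping", 10, 1, 0.30)]

# A third producer that overlaps with p1 and reports a conflicting value.
p3 = [("hw", "ral", "ping", 10, 1, 0.99)]

local_ok = key_holds(p1) and key_holds(p2) and key_holds(p3)
disjoint_union_ok = key_holds(p1 + p2)      # key still holds globally
overlapping_union_ok = key_holds(p1 + p3)   # key is violated globally
print(local_ok, disjoint_union_ok, overlapping_union_ok)
```

Because nothing enforces the key across producers, the third case is possible in principle even though every producer is locally consistent.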
Metaphor: Roles and Agents
R-GMA Clients: Grid components or Grid applications
• Clients can play the roles of producers or consumers
A client would need special capabilities for a role:
• Clients are supported in their roles by agents
Implementation:
• APIs for client roles:
“new StreamProducer(…)”
• Agents are objects on a Web server
Primary Producers
Database producer
• supports queries over fixed set of tuples (static queries)
• can be used to publish a database
Stream producer
• supports queries over changing set of tuples
(continuous queries)
• supports “latest snapshot queries”
– offers up-to-date values for each primary key in a db
Today: DatabaseProducers and StreamProducers
in R-GMA are different from the above!
Communication Modes of
Stream Producers
Stream Producers may offer two communication modes
for continuous queries:
– lossless (… but tuples could become stale)
– lossy (… but tuples are fresh)
[Figure: Producer, Producer Servlet with Queue, Consumer Servlet with Queue, Consumer]
Today: R-GMA’s StreamProducers are resilient and
support lossless communication
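The trade-off between the two modes can be illustrated with a small buffer between a fast producer and a slow consumer. This is a sketch under the assumption that "lossy" means a bounded queue that drops the oldest tuples; `TupleQueue` is a hypothetical name and does not describe how the R-GMA servlets are implemented.

```python
# Illustrative sketch; TupleQueue is hypothetical, not an R-GMA class.
from collections import deque

class TupleQueue:
    """Buffer between a producer and a slower consumer."""

    def __init__(self, capacity=None):
        # capacity=None: lossless, every tuple is kept until consumed.
        # capacity=k: lossy, only the k newest tuples are kept.
        self._items = deque(maxlen=capacity)

    def put(self, item):
        self._items.append(item)

    def drain(self):
        """Hand all buffered tuples to the consumer and empty the buffer."""
        items = list(self._items)
        self._items.clear()
        return items

lossless = TupleQueue()
lossy = TupleQueue(capacity=2)

# The producer publishes five readings before the consumer gets to read.
for reading in [1, 2, 3, 4, 5]:
    lossless.put(reading)
    lossy.put(reading)

lossless_out = lossless.drain()  # all readings, the first ones may be stale
lossy_out = lossy.drain()        # only the freshest readings survive
print(lossless_out, lossy_out)
```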
Republishers Publish
Query Answers
                  into database      into stream
Static Query      Materialised View  --
Continuous Query  Archiver           Stream Republisher
Archiver: shows the history of a stream.
Stream Republisher: enables
– merging,
– thinning,
– summarising of streams …
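What merging, thinning and summarising might look like on streams of (timestamp, value) pairs is sketched below. The slide does not define these operations, so the readings chosen here are assumptions: merge by timestamp, keep every n-th tuple, and average over fixed-size blocks. All function names and sample values are hypothetical.

```python
# Illustrative sketch; the three operations are one possible reading only.
import heapq
from itertools import islice

def merge(*streams):
    """Merge streams of (timestamp, value) pairs, each sorted by timestamp."""
    return heapq.merge(*streams)

def thin(stream, n):
    """Keep every n-th tuple of a stream, starting with the first."""
    return islice(stream, 0, None, n)

def summarise(stream, window):
    """Yield (last timestamp, average value) for each block of `window` tuples."""
    block = []
    for ts, value in stream:
        block.append((ts, value))
        if len(block) == window:
            yield block[-1][0], sum(v for _, v in block) / window
            block = []

ral = [(1, 0.25), (3, 0.5)]
hw = [(2, 0.75), (4, 1.0)]

merged = list(merge(ral, hw))
thinned = list(thin(merged, 2))
summary = list(summarise(merged, 2))
print(merged, thinned, summary)
```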
Republishers in R-GMA Today
Republishers are called “archivers”
(although some of them don't archive anything)
An archiver (= republisher)
• is defined by a query
• consumes only from “stream producers”
• publishes the query result according to its type, using
– a “stream producer”, or
– a “latest snapshot producer”, or
– a “database producer”
(which keeps an archive)
Republishers are used to answer complex queries!
The Next Step: Hierarchies of
Stream Republishers
[Figure: a National Republisher (country = 'uk') above Local/site Republishers (site = 'ral', site = 'hw'), which sit above the Stream Producers at ral and hw.]
Republisher Hierarchies:
The Issues
• Republishers are defined by queries:
hierarchies have to be maintained automatically
• new stream producers must be added only
to republishers at the “lowest level”
• hierarchy has to be replanned if a republisher fails
• difficult: transition from one plan to the other
without loss of tuples
• How well can we describe the content of a stream?
Possibly a need for descriptions that join
– stream relations, e.g. CPULoad(machineID, load, timestamp)
– static relations, e.g. locatedAt(machineID, site)
What is the Meaning of
a Query in R-GMA?
Assumption: the views of (primary) producers are
selections on a single relation, i.e., queries of the form
SELECT *
FROM cpu_load
WHERE machine_id = 'AB123' AND loc = 'hw'
(each producer contributes its parts of a relation)
• The virtual database contains
the union of the data of all the primary producers
• Conceptually, a query is evaluated
over the entire virtual db
Stream Queries can have
Various Temporal Interpretations
Consider a query over the relation “Transport Time”
tt(src, dest, pcktSize, method, timestamp, time)
SELECT * FROM tt
WHERE src = ral AND dest = bologna
What is meant? Measurements
– from now? (Continuous Query)
– up until now? (History Query)
– right now? (Latest Snapshot Query)
Today: Queries can be “flagged” with their type
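The three interpretations of the same selection on tt can be contrasted in a small sketch. `StreamStore` and its methods are hypothetical and only illustrate the semantics; they are not the R-GMA interface. The latest snapshot is taken per primary key without the timestamp, and the sample values are made up.

```python
# Illustrative sketch; StreamStore is hypothetical, not part of R-GMA.
class StreamStore:
    """Holds a stream of tt tuples and answers the three kinds of query."""

    KEY = ("src", "dest", "pcktSize", "method")  # primary key without timestamp

    def __init__(self):
        self.history = []       # every tuple seen so far
        self.subscribers = []   # (predicate, callback) of continuous queries

    def insert(self, row):
        self.history.append(row)
        for predicate, callback in self.subscribers:
            if predicate(row):
                callback(row)

    def continuous_query(self, predicate, callback):
        """Measurements from now: deliver matching tuples as they arrive."""
        self.subscribers.append((predicate, callback))

    def history_query(self, predicate):
        """Measurements up until now."""
        return [r for r in self.history if predicate(r)]

    def latest_snapshot_query(self, predicate):
        """Measurements right now: the newest tuple for each key value."""
        latest = {}
        for r in self.history:
            k = tuple(r[a] for a in self.KEY)
            if k not in latest or r["timestamp"] > latest[k]["timestamp"]:
                latest[k] = r
        return [r for r in latest.values() if predicate(r)]

def ral_to_bologna(r):
    return r["src"] == "ral" and r["dest"] == "bologna"

def measurement(timestamp, time):
    return {"src": "ral", "dest": "bologna", "pcktSize": 1,
            "method": "ping", "timestamp": timestamp, "time": time}

store = StreamStore()
store.insert(measurement(1, 0.5))
received = []
store.continuous_query(ral_to_bologna, received.append)
store.insert(measurement(2, 0.7))
store.insert(measurement(3, 0.6))

history_ts = [r["timestamp"] for r in store.history_query(ral_to_bologna)]
latest_ts = [r["timestamp"] for r in store.latest_snapshot_query(ral_to_bologna)]
continuous_ts = [r["timestamp"] for r in received]
print(history_ts, latest_ts, continuous_ts)
```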
Advanced Queries:
Mixing Temporal Query Types
• “Which connections have currently
a transportation time
that is higher than last week's average?”
(latest snapshot and history)
• “Show me the cpu load of those machines
where it is lower than yesterday's load average!”
(continuous and history)
We do not intend to support such queries in R-GMA!
In R-GMA Query Answering
Needs Mediation
Suppose P1, P2 publish for tp (throughput)
P1: … WHERE src = hw
P2: … WHERE src = ral AND pcktSize > 20
A global consumer poses its query over global relations
SELECT * FROM tp
WHERE pcktSize > 10
A mediator translates this into queries over local relations
SELECT * FROM P1.tp
WHERE pcktSize > 10
UNION
SELECT * FROM P2.tp
Today: R-GMA’s mediator handles simple queries like the one above
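The mediation step can be sketched as a function that answers the global query by asking each producer and taking the union. This is an illustration only: `make_producer`, `local_query` and `mediate` are hypothetical names, the sample measurements are made up, and only the src and pcktSize attributes of tp are modelled.

```python
# Illustrative sketch of mediation; none of these names come from R-GMA.
def make_producer(view, rows):
    """A producer publishes only the rows that satisfy its registered view."""
    return {"view": view, "rows": [r for r in rows if view(r)]}

def local_query(producer, condition):
    """Corresponds to SELECT * FROM Pi.tp WHERE condition."""
    return [r for r in producer["rows"] if condition(r)]

def mediate(producers, condition):
    """Answer SELECT * FROM tp WHERE condition as a union of local queries."""
    answer = []
    for p in producers:
        for r in local_query(p, condition):
            if r not in answer:   # UNION removes duplicates
                answer.append(r)
    return answer

measurements = [
    {"src": "hw", "pcktSize": 5},
    {"src": "hw", "pcktSize": 15},
    {"src": "ral", "pcktSize": 15},
    {"src": "ral", "pcktSize": 30},
]

p1 = make_producer(lambda r: r["src"] == "hw", measurements)
p2 = make_producer(lambda r: r["src"] == "ral" and r["pcktSize"] > 20,
                   measurements)

answer = mediate([p1, p2], lambda r: r["pcktSize"] > 10)
print(answer)
```

In this sketch the condition pcktSize > 10 is sent to both producers; for P2 it is redundant, because P2's view already guarantees pcktSize > 20.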
Global and Local Consumers
• Global consumers pose queries over global relations
SELECT * FROM tp
WHERE pcktSize > 10,
which are translated into queries over local relations
SELECT * FROM P1.tp
WHERE pcktSize > 10
UNION
SELECT * FROM P2.tp
• Local consumers pose queries over local relations directly
SELECT * FROM P1.tp WHERE method = ping
Today: a consumer can be global or local,
but local relations cannot be referred to explicitly
How does the Mediator Find
Suitable Publishers?
P1, P2, P3 publish for tt (Transport Time)
P1: … src = hw
P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping
Q: SELECT * FROM tt WHERE src = ral AND method = ping
We see: P1 is not suitable for Q, but P2 and P3 are.
Why?
src = hw AND src = ral AND method = ping is never true
src = ral AND pcktSize > 20 AND … is sometimes true
Satisfiability Test!
Today: implemented
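A satisfiability test of this kind can be sketched for conjunctions of simple conditions. The sketch assumes only conditions of the form attribute = constant and attribute > constant, with numeric constants wherever > is used; the representation as (attribute, operator, value) triples and the function name `satisfiable` are hypothetical and say nothing about how R-GMA implements the test.

```python
# Illustrative sketch; the atom representation is an assumption.
def satisfiable(atoms):
    """Check whether a conjunction of atoms (attr, op, value) can be true.

    Supported operators: "=" and ">" (a strict lower bound on a number).
    """
    equal = {}  # attr -> value required by an equality
    lower = {}  # attr -> largest strict lower bound seen
    for attr, op, value in atoms:
        if op == "=":
            if attr in equal and equal[attr] != value:
                return False  # e.g. src = hw AND src = ral
            equal[attr] = value
        elif op == ">":
            lower[attr] = max(value, lower.get(attr, value))
        else:
            raise ValueError(f"unsupported operator: {op}")
    # An equality must respect every lower bound on the same attribute.
    return all(equal[a] > lower[a] for a in equal if a in lower)

Q = [("src", "=", "ral"), ("method", "=", "ping")]
views = {
    "P1": [("src", "=", "hw")],
    "P2": [("src", "=", "ral"), ("pcktSize", ">", 20)],
    "P3": [("src", "=", "ral"), ("method", "=", "ping")],
}

# A producer is suitable if its view and the query can hold together.
suitable = [name for name, view in views.items() if satisfiable(view + Q)]
print(suitable)
```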
… So Which Publishers Should
the Mediator Ask?
P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping
Q: SELECT * FROM tt WHERE src = ral AND method = ping
All answers to Q returned by P2 are also returned by P3:
whenever
src = ral AND pcktSize > 20 AND src = ral AND method = ping
is true, then
src = ral AND method = ping AND src = ral AND method = ping
is true.
Hence, R-GMA only needs to ask P3
Entailment Test!
Needed for Republisher Hierarchies! (not yet implemented)
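The entailment argument above can be sketched for the same restricted class of conditions (attribute = constant and attribute > constant). The check below is a sufficient test for satisfiable premises, not a complete decision procedure, and the names `implies_atom` and `entails` are hypothetical.

```python
# Illustrative sketch; a sufficient check only, under the stated assumptions.
def implies_atom(atoms, target):
    """True if the conjunction `atoms` guarantees the single atom `target`."""
    t_attr, t_op, t_value = target
    for attr, op, value in atoms:
        if attr != t_attr:
            continue
        if t_op == "=" and op == "=" and value == t_value:
            return True
        if t_op == ">" and op == "=" and value > t_value:
            return True
        if t_op == ">" and op == ">" and value >= t_value:
            return True
    return False

def entails(premise, conclusion):
    """True if every atom of `conclusion` is guaranteed by `premise`."""
    return all(implies_atom(premise, atom) for atom in conclusion)

Q = [("src", "=", "ral"), ("method", "=", "ping")]
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]

# Whatever P2 can contribute to Q is also covered by P3, but not vice versa.
p2_covered_by_p3 = entails(P2 + Q, P3 + Q)
p3_covered_by_p2 = entails(P3 + Q, P2 + Q)
print(p2_covered_by_p3, p3_covered_by_p2)
```

Skipping P2 on the basis of this test is only justified if P3 really publishes everything its view describes.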
… But What Did the
Producers Promise?
P registers view V
Does P promise
– some of V? (sound description)
– all of V? (sound and complete description)
• The Entailment Test only makes sense when the
registered views are sound and complete descriptions
• Producers should register completeness flags
… Why May a Producer
not be Complete?
• The language of views is more restricted than the
language of queries
Hence: republishers may be unable to say exactly
what they publish
• Archivers may archive in lossy mode
• Producers may lose tuples
• A producer may not know everything
about the real world
Open to debate
Summary (1)
Monitoring data come in Pools and Streams
Global Schema
• primary keys
Types of Stream Queries
• continuous vs. history vs. latest snapshot
Producers
• DB producers: publish database
• stream producers: lossless vs. lossy communication modes