
An Integrated
Instrumentation Architecture
for NGI Applications
Ian Foster,
Darcy Quesnel, Steven Tuecke
Argonne National Laboratory
The University of Chicago
DOE NGI Instrumentation Project
“A Uniform Instrumentation, Event, and
Adaptation Framework for Network-Aware
Middleware and Advanced Network
Applications”
– With UIUC (Dan Reed, Ruth Aydt)
– “Produce uniform notification and
adaptation mechanisms, with the goal of
catalyzing the development of both
network-aware middleware and
sophisticated network-aware applications”
Motivation

Environment incorporates multiple sensors
– Sources of events relating to behavior of
resources, middleware, and applications

Significant advantages to having uniform
mechanisms for publishing/discovering
sensors and for accessing sensor data
– E.g., find all sensors for path A->B
– Including historical data

Enables end-to-end, top-to-bottom, past-to-present analysis
Examples of Sensors

Network devices
– E.g., routers

End system devices
– E.g., computers, storage systems

Grid services
– E.g., Globus HBM, Network Weather Service

Libraries
– E.g., CAVERNsoft, MPI

Applications
For Example ...
[Figure: sensors (S) attached at every layer of the stack (App, Libs, Sys, H/W). Labeled components: MPICH, globus-io, GRAM, CAVERNsoft, HBM, NWS, DPSS. Labeled sensor sources: netstat, SNMP. Nodes labeled H and R.]
Three Project Components
1. Mechanisms for creating, publishing,
discovering, and accessing sensors
2. Synthesis and analysis techniques for
identifying qualitative behavior and trends
in sensor data
3. Adaptation techniques that exploit sensor
data to adjust middleware and application
configurations to improve performance
Argonne focus: (1) and (3); UIUC: (2), (3)
Current Approach

Use a directory service (LDAP) to register
and publish event sources
– Publish: source, type, contact [online, archive]
– Discover: “find all event sources of type X”




Use NetLogger format for data

Develop sensor manager to handle publish,
subscribe, archiving

Use SQL database as archive

Initial sensor set based on Globus libraries,
applications, NetLogger-accessible devices
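
A minimal sketch of reading one event in a NetLogger-style format, assuming each event is a single line of whitespace-separated KEY=value fields. The sample line and its field names (DATE, HOST, PROG, NL.EVNT, SEND.SZ) are illustrative assumptions, not output from the project's sensors.

```python
def parse_event(line):
    """Split a NetLogger-style line of KEY=value fields into a dict."""
    event = {}
    for field in line.split():
        key, sep, value = field.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {field!r}")
        event[key] = value
    return event

# Hypothetical sample event; the field names are assumptions for illustration.
sample = "DATE=20000330112320.957943 HOST=hostA PROG=netstat NL.EVNT=SEND_DATA SEND.SZ=49332"
event = parse_event(sample)
print(event["HOST"], event["NL.EVNT"])
```

All values stay as strings here; interpreting them is left to the consumer.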
Initial Instrumentation Architecture
[Figure: architecture diagram. Components: Sensor, Application, Sensor Manager, Archive (File, SQL; Netarchive, MySQL), LDAP. Arrows: Events in NetLogger format; Publish (“netstat, host A, time T, contact X”); Discover; Subscribe (“what event sources for route A to B?”).]
Sensor Manager

We are building a program which:
– Archives sensor event streams
– Redirects sensor event streams to clients
using a publish/subscribe interface
– Generates sensor event streams from
archive, based on query language
– Publishes interfaces and index to LDAP

Relation to other work
– Superset of Netlogd (simple archiver)
– Might exploit Netarchiver (MySQL indexing)
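
The manager's three stream-handling roles (archive, redirect, regenerate from archive) can be sketched as a toy in-memory class. This is only an illustration under stated assumptions: the class and method names (SensorManager, handle_event, replay) and the source names are hypothetical, a Python list stands in for the archive, and LDAP publication is omitted.

```python
from collections import defaultdict

class SensorManager:
    """Toy in-memory manager: archives events and forwards them to subscribers."""

    def __init__(self):
        self.archive = []                      # stand-in for the file/SQL archive
        self.subscribers = defaultdict(list)   # source name -> list of callbacks

    def subscribe(self, source, callback):
        self.subscribers[source].append(callback)

    def handle_event(self, source, event):
        # Archive first, then redirect the event to every subscribed client.
        self.archive.append((source, event))
        for callback in self.subscribers[source]:
            callback(event)

    def replay(self, source):
        # Regenerate an event stream from the archive for one source.
        return [event for src, event in self.archive if src == source]

manager = SensorManager()
received = []
manager.subscribe("netstat@hostA", received.append)
manager.handle_event("netstat@hostA", {"NL.EVNT": "SEND_DATA"})
manager.handle_event("snmp@routerR", {"NL.EVNT": "IF_STATS"})
print(received)
print(manager.replay("snmp@routerR"))
```

Only the netstat event reaches the subscriber, while both events remain available from the archive.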
Archiving Events

How to archive sensor event streams?
– SQL: Save each event as a record in an SQL
database
> Advantage: Rich query support
– Netarchive: Save each event into file. Use
SQL database to build index of file contents
> Advantage: Performance and scale?

We will explore the use of SQL databases
– Premise: Most sensors will not produce high
volume event streams; hence optimize for
simplicity and rich query support
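
A minimal sketch of the "one record per event" option. It uses Python's built-in sqlite3 module in place of MySQL so that it runs without a server; the table layout, column names, and sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per sensor event.
conn.execute(
    "CREATE TABLE events (ts REAL, host TEXT, prog TEXT, event TEXT, value REAL)"
)
rows = [
    (100.0, "hostA", "netstat", "BYTES_SENT", 4096.0),
    (101.0, "hostA", "netstat", "BYTES_SENT", 8192.0),
    (101.5, "hostB", "netstat", "BYTES_SENT", 1024.0),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)

def events_for_host(conn, host, since):
    """Return (ts, event, value) rows for one host at or after a timestamp."""
    cur = conn.execute(
        "SELECT ts, event, value FROM events "
        "WHERE host = ? AND ts >= ? ORDER BY ts",
        (host, since),
    )
    return cur.fetchall()

print(events_for_host(conn, "hostA", 100.5))
```

Filtering by host and time range is the kind of query that is easy with per-event records and harder with flat files plus an index.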
Applying Info Infrastructure to Instrumentation
[Figure: plot panels titled NCSA Origin Nodes; Bandwidth/Latency ANL-NASA Ames; ANL CPU Load; Bandwidth/Latency ANL-Indiana]
Publishing & Discovering Sensors

Globus LDAP-based Metacomputing
Directory Service (MDS) provides scalable,
global infrastructure for publishing and
discovering sensor managers
– Sensors stream events to a sensor manager
– Sensor manager publishes availability of
streams into LDAP
– Clients discover sensor managers from
LDAP, and can subscribe to either current or
archived sensor event streams directly from
sensor managers
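
The publish and discover steps can be sketched with an in-memory stand-in for the directory. This is not an LDAP or MDS client: the Directory class, its methods, and the contact strings are hypothetical, and the record fields follow the "source, type, contact [online, archive]" description used earlier in these slides.

```python
class Directory:
    """In-memory stand-in for the LDAP directory (not an LDAP client)."""

    def __init__(self):
        self.entries = []

    def publish(self, source, type_, contact, mode="online"):
        # Record fields: source, type, contact, and online/archive mode.
        self.entries.append(
            {"source": source, "type": type_, "contact": contact, "mode": mode}
        )

    def discover(self, **criteria):
        # Return entries whose attributes match every given criterion.
        return [
            e for e in self.entries
            if all(e.get(k) == v for k, v in criteria.items())
        ]

directory = Directory()
# Hypothetical sensor managers and contact addresses.
directory.publish("host A", "netstat", "managerX.example.org:9000")
directory.publish("host B", "netstat", "managerY.example.org:9000", mode="archive")
directory.publish("router R", "SNMP", "managerX.example.org:9000")

for entry in directory.discover(type="netstat"):
    print(entry["source"], entry["contact"], entry["mode"])
```

A client would then contact the listed manager to subscribe to the current or archived stream.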
Initial Applications

Replica creation in “Data Grid” applications
– Online and historical instrumentation for
large data transfers (app, lib, network)
– Involves DPSS, globus-io
– Also application-level selection of replicas,
based on sensor information

MPI-based video streaming (Karonis,
Papka)
Security

Grid Security Infrastructure (GSI) will be
used throughout, hence possible to say e.g.
– “Manager M accepts only streams from
sensors of user U”
– “Manager N only publishes streams to clients
of users A, B, C”

As a first step, we have augmented the
NetLogger C client with GSI
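
The two example policies above can be sketched as simple authorization checks. This assumes the user identity has already been authenticated (in the real system, by GSI); the ManagerPolicy class and its method names are hypothetical and no GSI API is used.

```python
class ManagerPolicy:
    """Authorization rules for one sensor manager, keyed by user identity."""

    def __init__(self, sensor_users=None, client_users=None):
        # None means "no restriction"; a set restricts to the listed users.
        self.sensor_users = sensor_users
        self.client_users = client_users

    def accepts_stream_from(self, user):
        return self.sensor_users is None or user in self.sensor_users

    def publishes_to(self, user):
        return self.client_users is None or user in self.client_users

# "Manager M accepts only streams from sensors of user U"
manager_m = ManagerPolicy(sensor_users={"U"})
# "Manager N only publishes streams to clients of users A, B, C"
manager_n = ManagerPolicy(client_users={"A", "B", "C"})

print(manager_m.accepts_stream_from("U"), manager_m.accepts_stream_from("V"))
print(manager_n.publishes_to("A"), manager_n.publishes_to("D"))
```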
Instrumentation Architecture Showing Actuators
[Figure: architecture diagram. Components: Monitor, Sensor, Actuator, Sensor Manager, archive (File, SQL; Netarchive, MySQL), LDAP. Arrows: Events, Publish, Subscribe, Discover.]
Future Directions

XML
– NetLogger is an ASCII-based format
– If you are using ASCII, why not use XML?
– XML database could be used for archive

Events
– Performance related events should be just
one part of a larger, integrated event system

Typing
– NetLogger is weakly typed
– Various advantages to strongly typed events
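
One way to picture the difference: a weakly typed event is a bag of strings, while a strongly typed event has declared fields that are checked when the event is built. The sketch below is an assumption for illustration only; the ThroughputEvent class and the field names DATE, HOST, and BW are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThroughputEvent:
    """Hypothetical typed event with declared field types."""
    timestamp: float
    host: str
    mbits_per_sec: float

    @classmethod
    def from_fields(cls, fields):
        # Convert untyped string fields; float() raises ValueError on bad input,
        # and a missing field raises KeyError.
        return cls(
            timestamp=float(fields["DATE"]),
            host=fields["HOST"],
            mbits_per_sec=float(fields["BW"]),
        )

raw = {"DATE": "954415400.5", "HOST": "hostA", "BW": "1.903"}
event = ThroughputEvent.from_fields(raw)
print(event.host, event.mbits_per_sec)
```

Malformed events are rejected at construction time instead of surfacing later in analysis code.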
Future Directions (2):
Publish/Subscribe for Sensors

In first version:
– NetLogger-based sensors stream events to
manager
– Manager publishes sensor availability to
LDAP
– Clients subscribe to sensor manager for
events

In later version:
– Sensor can publish existence to LDAP
– Client can subscribe directly to sensor for
events
Network Weather Service
(R. Wolski et al., U. Tenn)

Scalable, fault-tolerant system for
– Real-time performance measurements
– Predictions of future state

When installed on N hosts, delivers:
– Network performance (<= N^2 via netperf)
– Host cpu-load measurements (N)
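
A small worked count for the bounds above, assuming one network measurement per ordered (source, destination) pair of distinct hosts. That assumption gives N(N-1) paths, which stays within the N^2 bound; the host names are hypothetical.

```python
from itertools import permutations

def measurement_counts(hosts):
    """Return (number of directed host pairs, number of per-host CPU series)."""
    # Ordered (source, destination) pairs with source != destination.
    paths = list(permutations(hosts, 2))
    return len(paths), len(hosts)

hosts = ["hostA", "hostB", "hostC", "hostD"]
n_paths, n_cpu = measurement_counts(hosts)
print(n_paths, n_cpu)  # 12 4
```

With N = 4 this yields 12 network paths (at most 16) and 4 CPU-load series.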

We (USC/ISI crew) are working to integrate
this into MDS; we hope it will eventually be
consistent with the approach described here (to
be discussed)
Structure of NWS data in MDS (old)
c=US
o=Globus
o=ISI
nn= the Internet
hs=source.isi.edu to destination.anl.gov
source: hn=source.isi.edu, o=ISI, c=US
destination: hn=destination.anl.gov, o=ANL, c=US
serviceProvider: NWS
throughput: 1.903
throughput_prediction: 1.709
throughput_MSE: 0.95
latency: 5.3
latency_throughput: 6.1
latency_MSE: 0.04
N^2 network performance entries for N hosts
...
hn=source.isi.edu
current_cpu: 0.802
current_cpu_prediction: 0.802
current_cpu_MSE: 0.000
weighted_cpu: 0.414
weighted_cpu_prediction: 0.414
weighted_cpu_MSE: 0.000
N sets of cpu info for N hosts