Monitoring of Peer-to-Peer Systems
Distributed Monitoring of Peer-to-Peer Systems
By Serge Abiteboul, Bogdan Marinoiu
Docflow meeting, Bordeaux
Outline
The Monitoring Problem & Approach
A language for specifying monitoring tasks: P2PML
P2PMonitor System
ActiveXML Stream Algebra
Architecture of P2PMonitor
Monitoring Plan Generation & Query Rewriting
Focus on Filtering
Reusing running tasks
Work in progress
The Monitoring Problem
P2P systems are:
a popular support for content sharing communities and distributed applications
highly dynamic (intense communications, content changing rapidly, peers come and leave)
difficult to observe
Observation is important!
error management & diagnosis
statistics gathering & optimization issues: the "busiest" peer in a network
business applications: billing & quality of service
Web surveillance
Is it possible to observe & analyse a P2P system?
Difficult (if not impossible) in a centralized way
Yes, in a distributed manner
Approach
Detect events at the (monitored) peer level
Data changes, Web service calls -> alerters
Each event is represented as an XML document
XML stream
(Distributed) XML stream processing system
XML streams are published
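As an illustration of "each event is represented as an XML document", here is a minimal Python sketch of an alerter that turns raw Web service call records into a stream of XML events. The element and attribute names (outCOM, call, response, time, method, site) are borrowed from the P2PML example in these slides, but the exact event layout and the helper names make_call_event and alerter are assumptions made for this sketch, not the P2PMonitor format.

```python
import xml.etree.ElementTree as ET


def make_call_event(method, site, call_time, response_time):
    """Build one XML event describing an outgoing Web service call.

    Assumed layout: key properties as root attributes, details as children.
    """
    event = ET.Element("outCOM", {"method": method, "site": site})
    call = ET.SubElement(event, "call")
    ET.SubElement(call, "time").text = str(call_time)
    response = ET.SubElement(event, "response")
    ET.SubElement(response, "time").text = str(response_time)
    return event


def alerter(raw_calls):
    """Turn an iterable of raw call records into a stream of XML events."""
    for record in raw_calls:
        yield make_call_event(**record)


# Hypothetical raw observation made at the monitored peer.
raw = [{"method": "GetTemp", "site": "http://meteofrance.fr",
        "call_time": 100, "response_time": 112}]
first = next(alerter(raw))
print(ET.tostring(first, encoding="unicode"))
```

Using a generator keeps the alerter lazy: events are produced one at a time, which matches the stream view taken in the rest of the talk.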
P2PML statement structure
XQuery FLWR flavour
For – maps streams to XML variables
Let – assigns new XML variables
Where – imposes conditions on events (filtering and join criteria)
Return – generates reports / restructures XML
By – specifies publication means:
channels for publication inside the system
e-mails, Web pages, RSS feeds for publication outside the system
P2PML statement example
for $c on local: outCOM
let $timeCall := $c.call.time
and $duration := $c.response.time - $timeCall
where $c.call.method = "GetTemp" and $duration > 10
  and $c.call.site = "http://meteofrance.fr"
return <longGetTemp>
         <callTime>{$timeCall}</callTime>
         <duration>{$duration}</duration>
       </longGetTemp>
by channel "QoS:Alerts"
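To make the semantics of this statement concrete, the following Python sketch mirrors its clauses one by one over an in-memory list of events. It is only an analogy: events are modelled as nested dictionaries, the channel is a plain list, and the function name long_get_temp is hypothetical. None of this is how P2PMonitor evaluates P2PML.

```python
def long_get_temp(events, threshold=10):
    """Mirror the for / let / where / return clauses of the statement."""
    for c in events:                                   # for $c on outCOM
        time_call = c["call"]["time"]                  # let $timeCall
        duration = c["response"]["time"] - time_call   # let $duration
        if (c["call"]["method"] == "GetTemp"           # where ...
                and duration > threshold
                and c["call"]["site"] == "http://meteofrance.fr"):
            # return: restructure the event into a report
            yield ("<longGetTemp><callTime>%s</callTime>"
                   "<duration>%s</duration></longGetTemp>"
                   % (time_call, duration))


# Hypothetical events: one slow call, one fast call, one call to another site.
events = [
    {"call": {"method": "GetTemp", "site": "http://meteofrance.fr", "time": 100},
     "response": {"time": 115}},
    {"call": {"method": "GetTemp", "site": "http://meteofrance.fr", "time": 200},
     "response": {"time": 203}},
    {"call": {"method": "GetTemp", "site": "http://example.org", "time": 300},
     "response": {"time": 350}},
]

# by channel "QoS:Alerts": here the channel is simply a list of reports.
qos_alerts = list(long_get_temp(events))
```

Only the first event satisfies all three conditions, so the channel receives a single report.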
ActiveXML Stream Algebra
is the support for monitoring plan representation and the basis for its optimization:
Distribute the work among the peers
Try to place computation close to data if possible
Try to reduce redundancy
Scenario
(figure: P2PML queries, XML streams)
Monitoring Plans
Architecture of P2PMonitor (1)
Subscription Manager
Alerters (WS Alerter, Database Alerter, RSS Alerter)
Stream Processors
Without "storage": Filter, Restructure, Union
With "storage": Join, Group-By, Duplicate Removal
Publishers
E-mail, WebPage, RSS
Channel Publisher: a user or another peer may subscribe to it
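The distinction between stream processors without and with "storage" can be sketched in Python as follows. The filter handles each event on its own, while the join has to remember past events from both inputs. The join below is a textbook symmetric hash join chosen for illustration; the function names and the tagged-stream input format are assumptions, and nothing here is taken from the P2PMonitor implementation.

```python
from collections import defaultdict


def filter_op(stream, predicate):
    """Without storage: each event is examined alone, nothing is kept."""
    for event in stream:
        if predicate(event):
            yield event


def join_op(tagged_stream, key):
    """With storage: a symmetric hash join over two interleaved inputs.

    tagged_stream yields (side, event) pairs, side being "left" or "right".
    Every event is stored so that later events from the other side can match.
    """
    seen = {"left": defaultdict(list), "right": defaultdict(list)}
    for side, event in tagged_stream:
        other = "right" if side == "left" else "left"
        k = key(event)
        seen[side][k].append(event)
        for match in seen[other].get(k, ()):
            # Always emit pairs as (left event, right event).
            yield (event, match) if side == "left" else (match, event)


# Hypothetical interleaving of two event streams sharing an "id" field.
tagged = [
    ("left", {"id": 1, "a": "x"}),
    ("right", {"id": 1, "b": "y"}),
    ("right", {"id": 2, "b": "z"}),
    ("left", {"id": 2, "a": "w"}),
]
pairs = list(join_op(iter(tagged), key=lambda e: e["id"]))
evens = list(filter_op([1, 2, 3, 4], lambda n: n % 2 == 0))
```

The stored state of the join grows with the streams, which is why the talk later raises "time" as a way to bound the needed storage.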
Architecture of P2PMonitor (2)
Focus on Filtering
Filtering is a crucial operator in stream processing!
E.g., many users might be interested in events coming from the same source / alerter: bottleneck hazard
Our approach to the problem: two-stage filtering
Reasons:
Attributes of the XML document's root reflect the most important properties of an event
The event's details can be given intensionally (ActiveXML style)
Two-stage filtering
A subscription is viewed as a conjunction of simple conditions (e.g., "attribute" = "value") and "more difficult" XPath queries
1st data structure groups the (ordered) simple conditions of all the subscriptions by commonalities (Atomic Event Set structure, AES)
2nd data structure groups the XPath queries of all the subscriptions (path-based indexing, YFilter style, using an NFA)
On an XML document:
1st stage: read the root, evaluate the AES, detect the "difficult" XPath queries that remain to be evaluated
2nd stage (if needed): adapt the second structure and evaluate the set of XPath queries on the body of the XML document (if necessary, execute Web service calls)
The output is the set of subscriptions "hit" by the XML document
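The two stages can be sketched in Python as below. This is a deliberately simplified stand-in: a per-document cache of shared root conditions plays the role of the Atomic Event Set structure, and a cache of path results plays the role of the shared NFA, so neither the AES nor the YFilter automaton is actually implemented. The subscription format, the function name two_stage_filter and the sample event are assumptions, and paths are limited to what Python's xml.etree.ElementTree supports.

```python
import xml.etree.ElementTree as ET


def two_stage_filter(doc, subscriptions):
    """Return the names of the subscriptions hit by one XML event."""
    # Stage 1: look only at the root attributes. Each distinct simple
    # condition is evaluated once, however many subscriptions share it.
    cond_cache = {}

    def holds(attr, value):
        if (attr, value) not in cond_cache:
            cond_cache[(attr, value)] = doc.attrib.get(attr) == value
        return cond_cache[(attr, value)]

    survivors = [name for name, sub in subscriptions.items()
                 if all(holds(a, v) for a, v in sub["simple"].items())]

    # Stage 2: only for surviving subscriptions, evaluate the path queries
    # on the body. Each distinct path is evaluated once.
    path_cache = {}

    def matches(path):
        if path not in path_cache:
            path_cache[path] = doc.find(path) is not None
        return path_cache[path]

    return {name for name in survivors
            if all(matches(p) for p in subscriptions[name]["xpaths"])}


# Hypothetical event and subscriptions.
doc = ET.fromstring(
    '<event method="GetTemp" site="http://meteofrance.fr">'
    '<call><time>100</time></call><response><time>115</time></response>'
    '</event>')
subscriptions = {
    "s1": {"simple": {"method": "GetTemp"}, "xpaths": []},
    "s2": {"simple": {"method": "GetTemp"}, "xpaths": ["./response/time"]},
    "s3": {"simple": {"method": "GetWind"}, "xpaths": ["./call"]},
    "s4": {"simple": {"method": "GetTemp"}, "xpaths": ["./fault"]},
}
hit = two_stage_filter(doc, subscriptions)
```

Subscription s3 is discarded in the first stage, so its path query is never evaluated; this is the saving that two-stage filtering aims for when many subscriptions watch the same alerter.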
Reusing running tasks
Optimization by trying to avoid redundancy
Before building new operators (and streams), try to discover useful ones
Stream representation in XML:
Stream Definition Database – description of available streams
Distributed, not centralized (avoid bottlenecks)
Implemented using KadoP – index and repository system over a DHT
Stream replication and equivalence (1)
Streams can be replicated between peers
With two similar operators on two replicas of the same stream, we obtain two equivalent streams
Replication can be represented in the Stream Definition Database
Stream Equivalence is difficult to detect
Algorithm for discovering useful streams
It uses XPath queries on the Stream Definition Database:
E.g., for identifying the output stream of alerter inCOM:
/Stream[@PeerId=$p1][Operator/inCOM] -> (S1, P1)
It goes bottom-up on the query tree
E.g., JoinP(σF(inCOM@P1), outCOM@P2) -> (S5, P1)
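The leaf-level lookup (finding the output stream of an alerter on a given peer) can be sketched in Python against a small in-memory stand-in for the Stream Definition Database. The document shape, the StreamId attribute and the function name find_alerter_streams are assumptions made for this sketch; the real database is distributed over KadoP, and the bottom-up traversal of the whole query tree is not shown. Since xml.etree.ElementTree does not accept a nested path such as Operator/inCOM inside a predicate, the second condition is checked in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical stream definitions; only PeerId and Operator come from the
# XPath query shown on the slide.
STREAM_DB = ET.fromstring("""
<Streams>
  <Stream StreamId="S1" PeerId="P1"><Operator><inCOM/></Operator></Stream>
  <Stream StreamId="S2" PeerId="P2"><Operator><outCOM/></Operator></Stream>
  <Stream StreamId="S3" PeerId="P1"><Operator><Filter/></Operator></Stream>
</Streams>
""")


def find_alerter_streams(db, peer_id, alerter):
    """Emulate /Stream[@PeerId=$p1][Operator/<alerter>] on the stand-in DB.

    Returns (stream id, peer id) pairs for every matching definition.
    """
    found = []
    for stream in db.findall("Stream"):
        if stream.get("PeerId") != peer_id:
            continue
        if stream.find("Operator/" + alerter) is not None:
            found.append((stream.get("StreamId"), stream.get("PeerId")))
    return found


in_com_at_p1 = find_alerter_streams(STREAM_DB, "P1", "inCOM")
out_com_at_p1 = find_alerter_streams(STREAM_DB, "P1", "outCOM")
```

A non-empty result means an existing stream can be reused for that leaf of the query tree; an empty one means a new alerter would have to be deployed.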
Work in progress (1)
Link with Incremental View Maintenance
Defining a monitoring task by a tree-pattern query on an active document with streams: a powerful way of expressing complex monitoring tasks (difficult to express directly in P2PML)
Work in progress (2)
Introducing "time" explicitly in P2PML: possible impact on P2PMonitor performance (reactivity) and resource consumption (needed storage)
E.g.
for $e1 on P1:inCOM, $e2 on P2:outCOM
where $e1.timeEvent > $e2.timeEvent + 25
…
Queries on traces obtained by P2PMonitor: diagnosis, detecting patterns of evolution for the monitored system
E.g., Trace = I1, I2, …, In (instances of a document)
For each new order detected in instance Ik, there is a payment present in one of the following instances
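The order/payment property on a trace can be checked with a few lines of Python. This sketch assumes each instance is summarised as two sets of order identifiers ("orders" and "payments"), that an order is "new" in Ik when it is absent from the previous instance, and that "following instances" means strictly later ones. These modelling choices and the function name unpaid_new_orders are assumptions; the slides do not fix a trace format.

```python
def unpaid_new_orders(trace):
    """Return (k, order) pairs that violate the property.

    A violation is an order first seen in instance k for which no later
    instance contains a matching payment.
    """
    violations = []
    previous_orders = set()
    for k, instance in enumerate(trace):
        new_orders = instance["orders"] - previous_orders
        for order in sorted(new_orders):
            paid_later = any(order in later["payments"]
                             for later in trace[k + 1:])
            if not paid_later:
                violations.append((k, order))
        previous_orders = instance["orders"]
    return violations


# Hypothetical trace of three instances (index 0 is I1).
trace = [
    {"orders": {"o1"}, "payments": set()},
    {"orders": {"o1", "o2"}, "payments": {"o1"}},
    {"orders": {"o1", "o2"}, "payments": {"o1"}},
]
violations = unpaid_new_orders(trace)
```

Here o1 is paid in the second instance, while o2 appears in the second instance and is never paid, so it is the only violation reported.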
Thank you very much!