Transcript Document
MIDDLEWARE SYSTEMS
RESEARCH GROUP
Modelling Performance
Optimizations for
Content-based
Publish/Subscribe
Alex Wun and Hans-Arno Jacobsen
Department of Electrical and Computer Engineering
Department of Computer Science
University of Toronto
Matching Performance
Optimizations
Often based on exploiting similarities between
subscriptions
Avoid
unnecessary subscription and predicate
evaluations
Can we abstract these optimizations?
Formalize
content-based Matching Plans (order of
predicate evaluations)
Theoretically quantify performance of matching plans
Compare heuristic techniques with optimal matching
plans
Commonality Model
For a subscription set
Disjunctive
Commonality
Expression
{S1 Sm }
S1 Sm C
or
Conjunctive
Commonality
Expression
C S1 Sm
• Per-Link Matching
• DNF Subscriptions
• Shared predicates
• Clustering on
subscription classes or
attributes
• “Pruning” strategies
(e.g., number of
attributes)
A set of commonality expressions is a subscription topology.
Link-Group Topology
S1 Sm L
LP S1 P S m P
11 P 1n1 P
m1 P mn m P
S1 Sm C
Depth First Algorithm to
determine probabilistically
optimal matching plan
[Greiner2006] in ON lnN
Link-Group Topology
Low Selectivity
X
X
High Selectivity
o
o
Link-Cluster Topology
...
...
Dynamic Programming
(not very efficient)
...
...
Multi-Link Topology
...
Multi-Cluster-Link Topology
...
...
Arbitrary Topologies
...
Cluster Topology
Cluster Topology
o
• Dramatic scalability effects of clustering in CPS
• Observed trend depends on proportion of
commonalities not number of predicates
X
...
Applications – DoS Resilience
Normal
Subscription
Migration
Applications – DoS Resilience
High
Commonality
Low
Commonality
High
Commonality
Related Work
Carzaniga et al. [Carzaniga2001]
Formal
Mühl [Mühl2002]
Formal
notation for covering
syntax for CPS routing
Li et al. [Li2005] and Campailla et al.
[Campailla2001]
BDD
based CPS matching algorithms
Conclusion
Probabilistically optimal matching plans are
known for some subscription topologies
Scalable CPS matching depends heavily on
commonalities
Focus
on abstracting commonalities
Future work
Express
covering, correlation, …
Arbitrary subscription topologies
Metrics for expressing compression due to existence
of commonalities
References
[Greiner2006]
[Carzaniga2001]
Large-Scale Content-Based Publish/Subscribe Systems, PhD Thesis
[Li2005]
Design and Evaluation of a Wide-Area Event Notification Service, ACM
Transactions on Computer Systems
[Mühl2002]
Finding optimal satisficing strategies for And-Or trees, Artificial
Intelligence
A Unified Approach to Routing, Covering and Merging in
Publish/Subscribe Systems based on Modified Binary Decision
Diagrams, ICDCS
[Campailla2001]
Efficient filtering in Publish-Subscribe Systems using Binary Decision,
International Conference on Software Engineering
MIDDLEWARE SYSTEMS
RESEARCH GROUP
Extra Slides
Table-based versus Tree-based
Naive
C N NS
Table-based
Tree-based
C n S nS S
pk 1
p N 1
k
C
p S
p 1
p 1
S
N n
N n
n
n
N
N
Rc
n
N
Rc 1
k 1
1
N S
Disjunctive Commonalities
S1 Sm C
Given some publication P
Si P C P
Computed by matching algorithm
“Shortcut” unnecessary subscription/predicate
evaluations
Examples:
Per-Link
Matching [Banavar1999,Carzaniga2003]
DNF Subscriptions
Conjunctive Commonalities
C S1 Sm
Given some publication P
CP Si P
Computed by matching algorithm
“Shortcut” unnecessary subscription/predicate
evaluations
Examples:
Shared
predicates
Clustering on subscription classes or attributes
“Pruning” strategies (e.g., number of attributes)