Scalable Publish/Subscribe Architectures & Algorithms — Part I: Introduction
Pascal Felber, University of Neuchatel, [email protected]
Based on work with many others: C.-Y. Chan, W. Fan, M. Garofalakis, R. Rastogi, R. Chand, S. Bianchi, M. Gradinariu
Part III: Publish/subscribe overlays
From broker overlays to P2P architectures
Semantic communities for publish/subscribe
Scalable Publish/Subscribe Architectures & Algorithms — P. Felber 3
The publish/subscribe problem
Publishers: producers of information (e.g., stock quotes, news feeds…)
Subscribers: consumers of information
Filters: identify events that match consumer interests
Publish/subscribe middleware
Centralized vs. distributed, persistent (DB) vs. transient, topic- vs. content-based, reliable vs. best-effort, etc.
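The middleware variants above all share the same core contract: subscribers register filters, publishers inject events, and the middleware delivers each event to the subscribers whose filters match. A minimal centralized, content-based, best-effort sketch (the class and attribute names are illustrative, not from the slides):

```python
# Minimal centralized content-based publish/subscribe broker (sketch).
# Subscriptions are predicates over attribute/value events; publish()
# delivers each event to every subscriber whose filter matches it.

class Broker:
    def __init__(self):
        self.subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, event):
        # Content-based matching: test the event against every filter
        for predicate, callback in self.subscriptions:
            if predicate(event):
                callback(event)

broker = Broker()
received = []
# Content-based subscription: stock quotes for LU priced at 10 or more
broker.subscribe(lambda e: e.get("Symbol") == "LU" and e.get("Price", 0) >= 10,
                 received.append)

broker.publish({"Symbol": "LU", "Price": 10, "Volume": 101_000})   # matches
broker.publish({"City": "Nice", "Weather": "Sunny", "Temp": 24})   # no match
```

Topic-based systems are the degenerate case where the predicate only tests one attribute (the topic name).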
I. Flooding: broadcast, filter by consumer
Pros: simple protocol/routers
Cons: network-inefficient
[Figure: brokers across the Internet computing destination lists]
II. Match-first: precompute destination list
Pros: bandwidth-efficient, simple
Cons: time- and space-inefficient
III. Distributed routing: brokers have a partial view of subscriptions and determine whom to forward events to
[Figure: broker p decides whether to forward event e toward subscriber q]
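Strategy II (match-first) can be sketched as follows: the publisher-side broker matches each event against all known subscriptions and precomputes the destination list, so only interested consumers receive the event. Bandwidth-efficient, but the full matching cost is paid at one place (the consumer names c1–c3 and their filters are hypothetical):

```python
# Match-first sketch: precompute the destination list for an event by
# matching it against every known subscription at the publishing broker.

subscriptions = {
    "c1": lambda e: e.get("Symbol") == "LU" and e.get("Price", 0) >= 10,
    "c2": lambda e: e.get("Volume", 0) > 100_000,
    "c3": lambda e: e.get("City") == "Nice",
}

def destination_list(event):
    """Consumers whose subscription matches the event, in sorted order."""
    return sorted(c for c, f in subscriptions.items() if f(event))

event = {"Symbol": "LU", "Price": 10, "Volume": 101_000}
dests = destination_list(event)   # only c1 and c2 receive this event
```

Flooding (strategy I) would instead send the event to every consumer and let each one evaluate its own filter; strategy III distributes this matching step across brokers.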
Our focus
Content-based filtering and routing
Decentralized architecture (broker overlay & P2P)
Distributed routing protocol
Scalable to millions of subscriptions
Efficient (near real-time) processing
(Semi-)structured data based on standards: XML data (mainly), XPath subscriptions (mainly)
Part III: Publish/subscribe overlays
From broker overlays to P2P architectures
Semantic communities for publish/subscribe
Publish/subscribe model
Consumers register subscriptions
Producers publish events
Messages are routed to interested consumers
Interested: the message matches a subscription
Matching based on the content of messages
P/S broker overlay: large numbers of consumers (100s of 1,000s), large amounts of data
[Figure: P/S broker overlay spanning the Internet]
Events: (Symbol: LU, Price: $10, Volume: 101,000), (City: Nice, Weather: Sunny, Temp: 24ºC)
Subscriptions: Stock Quotes (Symbol = LU and Price ≥ 10), Weather Forecast (City = Nice), Stock Quotes (Volume > 100,000)
Distributed content routing
We have a network of brokers that collectively route events based on their content
Given an event, a broker must determine which other brokers and consumers to forward it to (like IP routing)
Goal: design a distributed routing protocol such that
Routing is “perfect”: messages are received by all, and only those, consumers that have a matching subscription
Space-, time-, and bandwidth-efficient
[Figure: broker p consults its routing table (RT) to compute the next hops for event e toward subscriber q]
Aggregation
Observation: if one is interested in messages that match filters p and q, and p ⊒ q, then it is sufficient to test messages against p
Aggregation: combine a set of filters S into an aggregate filter pa s.t. ∀q ∈ S, pa ⊒ q
E.g., IP prefix aggregation in BGP tables
Smaller routing tables, more efficient filtering
[Figure: p ⊒ q — forward iff the message matches p]
Aggregation (cont’d)
Perfect aggregation: any message that matches pa matches some q ∈ S (pa ≡ ∪q∈S q)
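The aggregation idea can be sketched on one-dimensional range filters (a deliberate simplification — the slides aggregate tree patterns): the aggregate pa is the smallest interval covering every filter in S, so pa ⊒ q holds for each q. Aggregation is perfect only when the union of the intervals is itself an interval; otherwise pa admits false positives:

```python
# Aggregation sketch for 1-D range filters (lo, hi): compute a single
# covering filter pa such that pa contains every filter q in S.

def aggregate(filters):
    """Smallest interval containing every (lo, hi) filter in S."""
    lo = min(f[0] for f in filters)
    hi = max(f[1] for f in filters)
    return (lo, hi)

def matches(interval, value):
    lo, hi = interval
    return lo <= value <= hi

S = [(10, 20), (30, 40)]
pa = aggregate(S)              # (10, 40): covers both filters
assert matches(pa, 15)         # true positive: 15 also matches (10, 20)
assert matches(pa, 25)         # false positive: 25 matches pa but no q in S
```

The false positive at 25 is exactly the “precision loss” the later slides quantify.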
Part III: Publish/subscribe overlays
From broker overlays to P2P architectures
Semantic communities for publish/subscribe
Simple language: navigate/select parts of an XML tree
XPath expression: sequence of node tests, child (/), descendant (//), wildcard (*), qualifiers ([...])
Constraints on structure and content of messages
Using qualifiers, define a tree pattern: specifies an existential condition for paths, with conjunctions at branching nodes
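These constructs can be exercised with the limited XPath subset supported by Python's standard xml.etree.ElementTree — child steps, descendant (`.//`), wildcards, and existential `[child]` qualifiers — which is enough to mimic a tree-pattern subscription (the document and expressions below are illustrative):

```python
# Matching XPath-style subscriptions against an XML document, using the
# XPath subset of xml.etree.ElementTree from the Python standard library.
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<quotes><stock><symbol>LU</symbol><price>10</price></stock></quotes>")

def subscription_matches(root, xpath):
    """A tree-pattern subscription matches iff some node satisfies it."""
    return root.find(xpath) is not None

assert subscription_matches(doc, ".//stock[symbol]")   # descendant + qualifier
assert subscription_matches(doc, "./stock/price")      # child steps
assert not subscription_matches(doc, ".//bond")        # no such element
```

A production filter such as XTrie avoids evaluating each expression separately; it indexes the common substrings of all subscriptions so one document pass matches them all.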
XTrie performance
Varying # of unique XPEs P with T≈100, L=20, pw=pd=0.1, pb=0, =0
Varying document length T with P=100k, L=20, pw=pd=0.1, pb=0, =0
Scalability vs. # XPEs; scalability vs. # tags
Setup: 10 DTDs (up to 2727 elements, 8512 attributes); Intel P4 (1.5 GHz) with 512 MB memory, Linux, GNU C++
Agenda
Part I: Introduction
Part II: Routing and Filtering
Part III: Publish/subscribe overlays
From broker overlays to P2P architectures
Semantic communities for publish/subscribe
Tree pattern aggregation
Problem: content routers need to store and match content against huge numbers of subscriptions
Need techniques to aggregate user subscriptions into a smaller set of aggregated content specifications
Networking analog: heavy aggregation of IP addresses in the routing tables of routers on the Internet backbone
However, subscription aggregation also implies a “precision loss”
False positives: messages matching the aggregated content specifications without matching the original subscriptions
Goal: aggregate subscriptions to a small collection while minimizing the “precision loss”
Aggregation: problem statement
Given a set of tree patterns S and a space bound k, compute a new set S’ of aggregate patterns such that:
1) S’ ⊒ S (i.e., S’ “generalizes” S — for each p ∈ S there exists q ∈ S’ s.t. q ⊒ p)
2) Σp∈S’ |p| ≤ k (i.e., S’ is concise — |p| = number of tree nodes in p)
3) S’ is as precise as possible (i.e., any other set of patterns satisfying (1) and (2) is at least as general as S’)
Minimize extra coverage (false positives) for the aggregated set S’
Least-upper-bound: given tree patterns p and q, find the most precise/specific tree pattern containing both p and q
LUB(p, q) = tightest generalization of p, q
Shown that LUB(p, q) exists and is unique (up to pattern equivalence)
Straightforward generalization to any set of tree patterns
Algorithm LUB[p, q]: computes the LUB of p and q
Uses pattern containment and minimization algorithms
Similar dynamic-programming flavor as the CONTAINS[ ] algorithm, but somewhat more complicated
Details in [VLDB 02]
Quantifying precision loss
Consider an aggregated pattern pa that generalizes a set of patterns S (i.e., pa ⊒ q for each q ∈ S)
Want to quantify the “loss in precision” when using pa instead of S, i.e., the fraction of “false positives”
Selectivity(pa) = fraction of documents matching pa
Selectivity(S) = fraction of documents matching any q ∈ S
Clearly, Selectivity(pa) ≥ Selectivity(S)
Precision loss = Selectivity(pa) − Selectivity(S)
Idea: use document distribution statistics to estimate selectivity and quantify precision loss
Cannot keep the entire document distribution!
Use coarse statistics (a “document tree” synopsis) computed on-the-fly over the streaming documents
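The definitions above can be sketched over a sample of documents (here documents are stand-in numbers and the filters are 1-D ranges, purely for illustration — the real system estimates selectivity from a synopsis, not from stored documents):

```python
# Precision-loss sketch: selectivity = fraction of documents a filter
# matches; precision loss = extra fraction matched by the aggregate pa
# but by none of the original filters in S.

documents = [5, 12, 18, 25, 33, 47]                   # stand-in documents
S = [lambda d: 10 <= d <= 20, lambda d: 30 <= d <= 40]
pa = lambda d: 10 <= d <= 40                          # aggregate covering S

def selectivity(matchers, docs):
    if callable(matchers):
        matchers = [matchers]
    # fraction of documents matched by at least one filter
    return sum(1 for d in docs if any(m(d) for m in matchers)) / len(docs)

sel_pa = selectivity(pa, documents)   # 4/6: matches 12, 18, 25, 33
sel_S = selectivity(S, documents)     # 3/6: matches 12, 18, 33
precision_loss = sel_pa - sel_S       # 1/6: document 25 is a false positive
```

Note that Selectivity(pa) ≥ Selectivity(S) holds by construction, since pa covers every filter in S.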
The document-tree synopsis
Document-tree synopsis: tree with paths labeled by frequency counts (# documents containing path)
Summary of path-distribution characteristics of documents
Construction:
Identify distinct document paths
Install all skeleton-tree paths in the synopsis
Trace each path from the root, increasing frequency counts and adding new nodes where necessary
Coalesce same-tag siblings
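The construction steps above can be sketched as follows (the SynopsisNode class is a hypothetical simplification: children are keyed by tag, which coalesces same-tag siblings automatically, and each node counts the documents containing its path):

```python
# Document-tree synopsis sketch: trace each distinct document path into a
# shared tree whose nodes count how many documents contain that path.

class SynopsisNode:
    def __init__(self):
        self.count = 0
        self.children = {}   # tag -> SynopsisNode (same-tag siblings coalesce)

def add_document(root, doc_paths):
    """doc_paths: distinct tag paths of one document, e.g. [('x','a','b')]."""
    # Count each node at most once per document: take all path prefixes.
    prefixes = {p[:i] for p in doc_paths for i in range(1, len(p) + 1)}
    for prefix in prefixes:
        node = root
        for tag in prefix:
            node = node.children.setdefault(tag, SynopsisNode())
        node.count += 1

def count(root, path):
    """Frequency count of a path, i.e. # documents containing it."""
    node = root
    for tag in path:
        node = node.children.get(tag)
        if node is None:
            return 0
    return node.count

root = SynopsisNode()
add_document(root, [("x", "a", "b"), ("x", "b")])   # first document
add_document(root, [("x", "a", "c")])               # second document
```

After these two insertions, the path x/a occurs in both documents while x/a/b occurs only in the first.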
[Figure: an example XML document (tags x, a, b, c, d) and its skeleton tree]
Part III: Publish/subscribe overlays
From broker overlays to P2P architectures
Semantic communities for publish/subscribe
Broker-based approach
Fixed infrastructure of reliable brokers
(Subset of) subscriptions stored at brokers in routing tables
Typically takes advantage of the “containment” relationship
Filtering engine matches messages against subscriptions to determine next hop(s)
Cons: dedicated infrastructure, large routing tables, complex filtering algorithms
P2P approach
Producers and consumers also act as routers
Directly communicate with each other
Filter and forward events to interested consumers
Key idea: place consumers with similar interests close to each other
Trivial routing: forward to neighbors iff the event matches our interests (disseminate messages in the “semantic community” & stop when reaching its boundaries)
Pros: broker-less, space-efficient, low filtering cost
Cons: hard to maintain, less reliable, some FPs (& FNs)
Key problem: build the overlay according to interests
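The trivial routing rule can be sketched as a flood that dies out at the community boundary: a peer re-forwards an event only while the event keeps matching the local interest (peer names, interests, and the overlay links below are hypothetical):

```python
# Semantic-community dissemination sketch: flood from the source; a peer
# re-forwards an event to its neighbors only if the event matches its own
# interest, so dissemination stops at the community boundary.
import collections

interests = {
    "p1": lambda e: e["topic"] == "stocks",
    "p2": lambda e: e["topic"] == "stocks",
    "p3": lambda e: e["topic"] == "stocks",
    "p4": lambda e: e["topic"] == "weather",   # outside the stocks community
}
neighbors = {"p1": ["p2"], "p2": ["p1", "p3", "p4"], "p3": ["p2"], "p4": ["p2"]}

def disseminate(source, event):
    delivered, queue, seen = [], collections.deque([source]), {source}
    while queue:
        peer = queue.popleft()
        if interests[peer](event):         # deliver and keep flooding
            delivered.append(peer)
            for n in neighbors[peer]:
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        # else: boundary peer drops the event (a false positive on the wire)
    return sorted(delivered)

disseminate("p1", {"topic": "stocks"})   # reaches p1, p2, p3; p4 drops it
```

Peer p4 still receives one copy from p2 and drops it: that wasted message is the false-positive cost at the boundary, and a mis-built overlay (a matching peer unreachable through matching neighbors) would cause false negatives.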
P2P approach (cont’d)
Problem: build the overlay according to interests
I. Use a “rigid” structure
Based on containment trees, spatial filters, DHTs, etc.
New consumers inserted at a specific position in the overlay
Overlay designed to avoid false negatives, limit false positives
II. Use a “loose” structure
Gather consumers in semantic communities built using a proximity metric
New consumers connect to peers with “close” interests
More flexible architecture, but can have false negatives
Building interest-based overlays
Exploit the containment relationship and organize consumers in a containment tree
Assumption: 1 subscription = 1 node
Sa is S’s parent if Sa is the most specialized subscription (deepest in tree) such that Sa ⊒ S
Virtual root node(s); equivalence trees for identical subscriptions
[Figure: containment tree with root pr and peers p1–p10 holding subscriptions such as Price>10; Name=A ∧ 10<Price<30; Name=A ∧ Price=20; Name=A; Name=A ∧ Price=30; Volume=100; Name=A ∧ Volume=150; Name=A ∧ Volume<200; Name=A ∧ Volume>500 — the peers with the identical subscription Name=A ∧ Volume>500 form an equivalence tree]
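The insertion rule can be sketched with interval subscriptions (a simplification: real subscriptions are multi-attribute predicates, and this sketch ignores equivalence trees and the re-parenting of existing children under a newly inserted, more general node):

```python
# Containment-tree insertion sketch: the parent of a new subscription S is
# the most specialized stored subscription Sa with Sa ⊒ S, found by
# descending from the virtual root as deep as containment allows.

class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi      # subscription as a 1-D range filter
        self.children = []

    def contains(self, other):         # self ⊒ other
        return self.lo <= other.lo and other.hi <= self.hi

def insert(parent, node):
    """Descend into any child that contains the node; else attach here."""
    for child in parent.children:
        if child.contains(node):
            return insert(child, node)
    parent.children.append(node)
    return parent

root = Node(float("-inf"), float("inf"))   # virtual root contains everything
a = Node(10, 100)
insert(root, a)                            # a attaches under the root
b = Node(20, 30)
insert(root, b)                            # b is nested under a, not the root
```

Because every node's subscription is contained by its parent's, an event matching a node necessarily matches all of its ancestors, which is what makes the upward/downward routing rule of the next slide sound.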
Routing events
Events are forwarded downward and upward
If received from downward, also propagate upward (even if the event does not match the local subscription)
No false negatives, some false positives
[Figure: the same containment tree routing event e: Name=A, Price=30, Volume=100]
Problems
• Tree is often unbalanced
• Root node(s) heavily loaded
• Non-trivial reorganization upon arrival, departure
Low FP ratio, decreases exponentially with # peers
Reorganizations help: a new peer may be a better parent for an existing one
Broadcast would give 75% false positives Details in [EP 05]
Spatial filters
Often, events are simple attribute-value pairs and subscriptions are predicates over these values
Each attribute represents one dimension
Events are points in an N-dimensional space
Predicates are ranges, i.e., poly-space rectangles in the N-dimensional space
[Figure: spatial representation (N=2) and associated containment graph]
R-tree spatial filters
Height-balanced tree data structure for indexing multi-dimensional data
Leaves: subscriptions
Inner nodes: bounding rectangles
[Figure: an R-tree and the spatial representation of its rectangles]
Distributed R-trees
Idea: organize consumers in an R-tree structure
Peers at leaves and inner nodes; an inner node is its own child
Promote the more general (i.e., larger) subscription as parent
Events routed as for the containment tree
Use classical rules for constructing R-trees (or R+, R*)
No false negatives
[Figure: a distributed R-tree and its associated communication graph]
Details in [ICDCS 07] [TPDS 09]
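The pruning intuition behind R-tree routing can be sketched as follows (a simplified 2-D illustration, not the actual distributed R-tree protocol of [ICDCS 07]/[TPDS 09]): an inner peer advertises the minimum bounding rectangle (MBR) of its subtree's subscriptions, and an event — a point — is forwarded into a branch only if it lies inside that branch's MBR, so whole subtrees are pruned without false negatives:

```python
# R-tree routing intuition: events (points) descend only into branches
# whose minimum bounding rectangle (MBR) contains them.

def mbr(rects):
    """Minimum bounding rectangle of (xlo, ylo, xhi, yhi) subscriptions."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def inside(rect, point):
    x, y = point
    return rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]

subs = [(0, 0, 10, 10), (20, 20, 30, 30)]   # two leaf subscriptions
branch = mbr(subs)                          # MBR advertised by their parent
event = (25, 25)
assert inside(branch, event)                # descend into this branch
matching = [s for s in subs if inside(s, event)]   # only the second leaf
```

An event inside the MBR but outside every leaf rectangle (e.g. (15, 15) here) travels into the branch and is then dropped — the usual dead-space cost of bounding rectangles — but an event outside the MBR can safely skip the whole subtree.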
Document-tree synopsis: tree with paths labeled with matching sets (documents containing path)
Summary of path-distribution characteristics of documents
Adding a document to the synopsis: trace each path from the root of the synopsis, updating the matching sets and adding new nodes where necessary
Synopses with 3 variants for matching sets
Different space budgets (sizes of matching sets, compression degrees for pruning)
Compare results of proximity metrics with the exact value computed from the sets of matching documents
Evaluation: error metrics
Let:
P(p): exact selectivity of p
P’(p): our estimate of the selectivity of p
Mi(p,q): exact proximity of p and q using Mi
M’i(p,q): our estimate of the proximity of p and q using Mi
Positive error:
Negative error:
Metrics error:
Positive error vs. hash size
Hashes outperform the other approaches in terms of accuracy
Less than 5% with 1,000 entries
Negative error vs. hash size
Hashes also outperform the other approaches (no error with xCBL for Hashes & Sets)
Positive error vs. synopsis size
For a given space budget, Hashes is the most accurate (after some threshold)
Hashes becomes more accurate than Counters
Error of proximity metrics
Hashes produces the best estimates
Error vs. compression ratio
Error remains small even for relatively high compression degrees
Less than 15% error with 1:5 compression
Conclusion
Decentralized (P2P) architectures for P/S
Key idea: create a P2P overlay with consumers sharing similar interests close to each other
I. “Rigid” structure (trees, R-trees, etc.)
Organize peers according to the containment relationship
Trivial routing protocol, more complex maintenance
Problem: build a robust structure, balance load
II. “Loose” structure
Create semantic communities for publish/subscribe
Easier maintenance, may have false negatives
Problem: estimate the similarity of (seemingly unrelated) subscriptions
Extra slides
References
[ICDE 02] C.Y. Chan, P. Felber, M.N. Garofalakis, R. Rastogi. Efficient Filtering
of XML Documents with XPath Expressions. In Proceedings of the 18th International Conference on Data Engineering (ICDE'02), San Jose, CA, February-March 2002.
[VLDBJ 02] Extended version of [ICDE 02] in VLDB Journal, Special Issue on XML, Volume 11, Issue 4, pp. 354-379, 2002.
[VLDB 02] C.Y. Chan, W. Fan, P. Felber, M.N. Garofalakis, and R. Rastogi. Tree Pattern Aggregation for Scalable XML Data Dissemination. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, August 2002.
[IC 03] P. Felber, C.Y. Chan, M.N. Garofalakis, R. Rastogi. Scalable Filtering of XML Data for Web Services. In IEEE Internet Computing, Volume 7, Issue 1, pp 49-57, 2003.
[NCA 03] R. Chand and P. Felber. A Scalable Protocol for Content-Based Routing in Overlay Networks. In Proceedings of the IEEE International Symposium on Network Computing and Applications (NCA'03), Cambridge, MA, April 2003.
[CS 03] P. Eugster, P. Felber, R. Guerraoui, and A.-M. Kermarrec. The Many Faces of Publish/Subscribe. In ACM Computing Surveys, Volume 35, Issue 2, pp. 114-131, June 2003.
[DEBS 04] R. Chand and P. Felber. Efficient Subscription Management in Content-based Networks. In Proceedings of the International Workshop on Distributed Event-Based Systems (DEBS'04), Edinburgh, Scotland, May 2004.
References (cont’d)
[SRDS 04] R. Chand and P. Felber. XNet: A Reliable Content Routing
Network. In Proceedings of the 23rd IEEE Symposium on Reliable Distributed Systems (SRDS'04), pp. 264-273, Florianopolis, Brazil, October 2004.
[EP 05] R. Chand and P. Felber. Semantic Peer-to-Peer Overlays for Publish/Subscribe Networks. In Proceedings of the International Conference on Parallel and Distributed Computing (Euro-Par'05), Lisboa, Portugal, August 2005.
[ICDE 07] R. Chand, P. Felber, and M. Garofalakis. Tree-Pattern Similarity Estimation for Scalable Content-based Routing. In Proceedings of the 23rd International Conference on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.
[ICDCS 07] S. Bianchi, A.K. Datta, P. Felber, and M. Gradinariu. Stabilizing Peer-to-Peer Spatial Filters. In Proceedings of the 27th International Conference on Distributed Computing Systems (ICDCS'07), Toronto, Canada, June 2007.
[EP 07] S. Bianchi, P. Felber, and M. Gradinariu. Content-based Publish/Subscribe using Distributed R-trees. In Proceedings of the International Conference on Parallel and Distributed Computing (Euro-Par'07), Rennes, France, August 2007.
[TPDS 08] R. Chand and P. Felber. Scalable distribution of XML content with XNet. In IEEE Transactions on Parallel and Distributed Systems, Volume 19, Issue 4, pp. 447-461, April 2008.
[TPDS 09] S. Bianchi, P. Felber, and M. Gradinariu. Stabilizing Distributed R-trees for Peer-to-Peer Content Routing. In IEEE Transactions on Parallel and Distributed Systems, 2009.
What is an overlay network?
[Figure: a physical network and an overlay network with nodes A, B, C]
Focus on the application layer
Treat multiple hops through the IP network as one hop in an overlay network