A Detailed Investigation of Memory Requirements for Publish/Subscribe Filtering Algorithms
Sven Bittner and Annika Hinze, 2 November 2005
Talk at the 13th International Conference on Cooperative Information Systems (CoopIS 2005)
• A subscriber is interested in French books whose title contains the phrase “Harry Potter”.
• Depending on the condition of the copy of the book, she wants to pay at most NZ$15.0 (new) or NZ$10.0 (used).
• To avoid unnecessary notifications, the subscriber will be notified not earlier than one day before the auction ends.
title like “Harry Potter” AND endingWithin < 1 day AND language = FRENCH AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
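As a rough illustration of what filtering such a Boolean subscription directly (without conversion) involves, the sketch below evaluates the subscription tree against an event. The class names `Pred` and `Op`, the function `matches`, and the event field names are hypothetical and chosen only for this example; the price limits are paired with the conditions as in the converted conjunctions shown later in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Leaf of the subscription tree: a single predicate on an event."""
    name: str
    test: Callable[[dict], bool]


@dataclass(frozen=True)
class Op:
    """Inner node: a Boolean operator ("AND" or "OR") over child nodes."""
    kind: str
    children: Tuple[Union["Pred", "Op"], ...]


def matches(node, event):
    """Recursively evaluate the subscription tree for one event."""
    if isinstance(node, Pred):
        return node.test(event)
    results = (matches(child, event) for child in node.children)
    return all(results) if node.kind == "AND" else any(results)


# The example subscription: 7 predicates, 4 Boolean operators.
subscription = Op("AND", (
    Pred("title", lambda e: "Harry Potter" in e["title"]),
    Pred("endingWithin", lambda e: e["ending_within_days"] < 1),
    Pred("language", lambda e: e["language"] == "FRENCH"),
    Op("OR", (
        Op("AND", (
            Pred("condition_new", lambda e: e["condition"] == "NEW"),
            Pred("price_new", lambda e: e["price"] < 15.0),
        )),
        Op("AND", (
            Pred("condition_used", lambda e: e["condition"] == "USED"),
            Pred("price_used", lambda e: e["price"] < 10.0),
        )),
    )),
))

# Hypothetical auction events.
base = {"title": "Harry Potter à l'école des sorciers",
        "ending_within_days": 0.5, "language": "FRENCH"}
used_cheap = {**base, "condition": "USED", "price": 9.0}
used_pricey = {**base, "condition": "USED", "price": 12.0}
new_mid = {**base, "condition": "NEW", "price": 12.0}

print(matches(subscription, used_cheap))   # used copy below NZ$10.0
print(matches(subscription, used_pricey))  # used copy above NZ$10.0
print(matches(subscription, new_mid))      # new copy below NZ$15.0
```

The tree is stored once, exactly as the subscriber specified it, which is the property the memory comparison in this talk revolves around.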
Annika Hinze – Expressive Event Filtering in Distributed Systems
Motivation: Research Question
• Current approaches only support conjunctions
Effective in DBMS, but also in pub/sub?
Subscription as specified:
title like “Harry Potter” AND endingWithin < 1 day AND language = FRENCH AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
After conversion to DNF, two conjunctive subscriptions:
title like “Harry Potter” AND endingWithin < 1 day AND language = FRENCH AND condition = NEW AND price < 15.0
title like “Harry Potter” AND endingWithin < 1 day AND language = FRENCH AND condition = USED AND price < 10.0
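The conversion shown above can be sketched in a few lines. This is a minimal illustration, assuming a subscription is represented as nested `(operator, children)` tuples with predicates as plain strings; the representation and the function name `to_dnf` are hypothetical, not taken from the talk.

```python
from itertools import product


def to_dnf(node):
    """Return the DNF of a subscription tree as a list of conjunctions.

    Each conjunction is a list of predicate strings.
    """
    if isinstance(node, str):          # a single predicate
        return [[node]]
    op, children = node
    child_dnfs = [to_dnf(child) for child in children]
    if op == "OR":
        # A disjunction simply collects the conjunctions of all children.
        return [conj for dnf in child_dnfs for conj in dnf]
    if op == "AND":
        # A conjunction combines one conjunction from every child.
        return [[pred for conj in combo for pred in conj]
                for combo in product(*child_dnfs)]
    raise ValueError(f"unknown operator: {op}")


subscription = ("AND", [
    "title like 'Harry Potter'",
    "endingWithin < 1 day",
    "language = FRENCH",
    ("OR", [
        ("AND", ["condition = NEW", "price < 15.0"]),
        ("AND", ["condition = USED", "price < 10.0"]),
    ]),
])

dnf = to_dnf(subscription)
for conjunction in dnf:
    print(" AND ".join(conjunction))
```

The three shared predicates are repeated in every resulting conjunction, so the number of stored predicate references grows from 7 to 10 for this single subscription.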
– Define scheme to characterise subscriptions
  • Describe structure of subscriptions
  • Abstraction from specific application scenario
Derive memory requirements of algorithms
• Theoretical Analysis and Comparison
• Practical Analysis
• Summary and Outlook
Characterisation Scheme (1)
• Fourteen parameters in four classes
  – Subscription-related (S)
    • Characteristics of subscriptions
Theoretical Analysis and Comparison
• All formulae
  – grow linearly with |s|
  – intersect the ordinate at zero
Comparison of first derivatives in |s| sufficient
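The reasoning behind this step can be written out briefly. The notation is assumed for illustration and does not appear on the slides: M_X(|s|) is the memory requirement of algorithm X for |s| subscriptions, and c_X is its constant slope, which depends on the remaining parameters.

```latex
% Assumed notation (not from the slides): M_X(|s|) is the memory of
% algorithm X for |s| subscriptions, c_X its constant slope.
\begin{align*}
M_{\mathrm{NCA}}(|s|) &= c_{\mathrm{NCA}}\,|s|, &
M_{\mathrm{can}}(|s|) &= c_{\mathrm{can}}\,|s|, \\
\frac{\mathrm{d}M_{\mathrm{NCA}}}{\mathrm{d}|s|} &= c_{\mathrm{NCA}}, &
\frac{\mathrm{d}M_{\mathrm{can}}}{\mathrm{d}|s|} &= c_{\mathrm{can}},
\end{align*}
\[
M_{\mathrm{NCA}}(|s|) < M_{\mathrm{can}}(|s|)
\iff c_{\mathrm{NCA}} < c_{\mathrm{can}}
\quad \text{for every } |s| > 0 .
\]
```

Because both lines pass through the origin, the algorithm with the smaller slope needs less memory for any number of subscriptions, so |s| drops out of the comparison.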
• Assumptions (fewer parameters)
  – Reasonable values for algorithm-related parameters (A)
  – Usage of relative parameters
• Determine turning point at which the non-canonical approach (NCA) requires less memory than canonical solutions
• Description of turning point by number of disjunctive elements in DNF (S_s)
  – Beneficial behaviour of NCA
  – Boolean subscriptions worthwhile
Counting requires less memory than cluster algorithm
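For readers unfamiliar with the canonical side of the comparison, the following is a simplified sketch of the general counting idea for conjunctive subscriptions: each distinct predicate is stored once, and a subscription matches when the number of its fulfilled predicates equals its total number of predicates. It is not the exact variant analysed in the talk; the class `CountingFilter`, its methods, and the event fields are hypothetical, and real implementations use predicate indexes instead of testing every predicate.

```python
from collections import defaultdict


class CountingFilter:
    """Simplified counting-style matcher for conjunctive subscriptions."""

    def __init__(self):
        self.predicates = {}                    # predicate id -> test function
        self.pred_to_subs = defaultdict(list)   # predicate id -> subscription ids
        self.required = {}                      # subscription id -> predicate count

    def subscribe(self, sub_id, preds):
        """Register a conjunction given as {predicate id: test function}."""
        self.required[sub_id] = len(preds)
        for pred_id, test in preds.items():
            self.predicates.setdefault(pred_id, test)  # shared predicates stored once
            self.pred_to_subs[pred_id].append(sub_id)

    def match(self, event):
        """Return ids of subscriptions whose predicates are all fulfilled."""
        hits = defaultdict(int)
        for pred_id, test in self.predicates.items():
            if test(event):
                for sub_id in self.pred_to_subs[pred_id]:
                    hits[sub_id] += 1
        return sorted(s for s, n in hits.items() if n == self.required[s])


shared = {
    "title": lambda e: "Harry Potter" in e["title"],
    "endingWithin": lambda e: e["ending_within_days"] < 1,
    "language": lambda e: e["language"] == "FRENCH",
}

counting = CountingFilter()
# The two conjunctions obtained from the example subscription.
counting.subscribe("hp-new", {**shared,
                              "cond=NEW": lambda e: e["condition"] == "NEW",
                              "price<15": lambda e: e["price"] < 15.0})
counting.subscribe("hp-used", {**shared,
                               "cond=USED": lambda e: e["condition"] == "USED",
                               "price<10": lambda e: e["price"] < 10.0})

event = {"title": "Harry Potter", "ending_within_days": 0.5,
         "language": "FRENCH", "condition": "USED", "price": 9.0}
print(counting.match(event))
```

Even with shared predicates stored once, the per-subscription bookkeeping (the predicate-to-subscription lists and the required counts) is duplicated for every conjunction produced by the conversion.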
|p| = 7 (number of predicates)
op_r = 4/7 (relative number of Boolean operators)
s_r = 5/7 (relative conjunctive elements per predicate)
Turning points in S_s for the two canonical algorithms:
89/49 ≈ 1.82
89/56 ≈ 1.59
Practice: S_s = 2
NCA uses less memory (turning point at less than one disjunction)
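The numbers on this slide can be checked against the example subscription with a short script. It is only a sketch: the tuple representation of the tree and the helper `count_nodes` are hypothetical, the turning-point values are copied from the slide rather than derived (the underlying formulae are not reproduced here), and the comparison reflects one reading of the slide, namely that NCA needs less memory once S_s exceeds the turning point.

```python
from fractions import Fraction

# Example subscription as (operator, children); predicates are strings.
tree = ("AND", [
    "title", "endingWithin", "language",
    ("OR", [
        ("AND", ["condition=NEW", "price<15"]),
        ("AND", ["condition=USED", "price<10"]),
    ]),
])


def count_nodes(node):
    """Return (number of predicates, number of Boolean operators)."""
    if isinstance(node, str):
        return 1, 0
    _, children = node
    preds, ops = 0, 1          # count this operator itself
    for child in children:
        child_preds, child_ops = count_nodes(child)
        preds += child_preds
        ops += child_ops
    return preds, ops


num_preds, num_ops = count_nodes(tree)
op_r = Fraction(num_ops, num_preds)       # relative number of operators

# Turning points as stated on the slide (not derived here).
turning_points = [Fraction(89, 49), Fraction(89, 56)]
S_s = 2                                   # disjunctive elements in the DNF

nca_needs_less = all(S_s > tp for tp in turning_points)
print(num_preds, op_r, [round(float(tp), 2) for tp in turning_points],
      nca_needs_less)
```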
– half as many operators as predicates (op_r)
– conjunctions per predicate vary (s_r)
Even a single disjunction per subscription results in lower memory requirements for the non-canonical approach.
[Figure: two panels, “Counting vs. non-canonical” and “Cluster vs. non-canonical”]
Practical Analysis
• Verification of theoretical results
• More memory required for management of data structures, e.g.,
  – Lists
  – Dynamic arrays
  – Hash tables
Overhead for different algorithms similar?
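The point that container management costs memory beyond the payload can be made concrete with a small measurement. This is only an illustration in CPython, where `sys.getsizeof` reports the size of the container itself and not of the stored elements; it says nothing about the implementation or the measurements used in the talk.

```python
import sys

n = 1000
payload = list(range(n))

# Fixed-size array: exactly n element slots.
as_tuple = tuple(payload)

# Dynamic array grown by appending: over-allocates spare slots.
as_list = []
for item in payload:
    as_list.append(item)

# Hash table: keeps a sparse index plus stored hashes next to the entries.
as_dict = {item: None for item in payload}

sizes = {
    "tuple": sys.getsizeof(as_tuple),
    "list": sys.getsizeof(as_list),
    "dict": sys.getsizeof(as_dict),
}
for name, size in sizes.items():
    print(f"{name:5s} {size:6d} bytes, {size / n:.1f} bytes per element")
```

The same 1000 elements cost noticeably different amounts of memory depending on the container, which is why a purely theoretical count of stored items needs to be verified in practice.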
• Nearly similar efficiency properties
Overhead of converted (=more) subscriptions outweighs more efficient filtering (time and space)
Summary and Future Work
– Describe subscriptions
– Calculate memory requirements of filter algorithms
• Theoretical analysis and comparison
  – Three algorithms
  – Determination of point when NCA requires less memory
Even one disjunction might favour NCA
– Optimise event and subscription routing
– Problem: Current routing optimisations only work for conjunctive subscriptions (covering, merging)
Design novel routing optimisations
• Support arbitrary subscriptions
• Subscription tree pruning
• Predicate replacement
Thank you for your attention!
Contact:
Sven Bittner, Annika Hinze
{s.bittner, a.hinze}@cs.waikato.ac.nz