Exploratory Mining and Pruning Optimization of Constrained Associations Rules

Feb 25, 2016

angus

Page 1: Exploratory Mining and Pruning Optimization of Constrained Associations Rules

August 7, 1998 Data Engineering Lab 성 유진 1

Exploratory Mining and Pruning Optimization of Constrained Associations Rules


Abstract

• From the standpoint of supporting human-centered knowledge discovery, current association mining has three problems
  – lack of user exploration and control
  – lack of focus
  – rigid notion of relationship

• Constrained association queries
  – pruning using anti-monotonicity and succinctness


Introduction

• Problem 1 (Lack of User Exploration and Control)
  – mining process => black box
  – the user can't preempt it and needs to wait for hours
  – establish clear breakpoints to allow user feedback

• Problem 2 (Lack of Focus)
  – the user may have phenomena in mind on which to focus the mining, e.g. to find associations between sets of items whose types do not overlap


  – or associations from item sets whose total price is at least $1,000
  – provide a rich interface for the user to express focus (CAQ)

• Problem 3 (Rigid Notion of Relationship)
  – significance metrics
  – separate criteria for selecting candidates for the antecedent and the consequent, e.g. an association from items to sets of types: pepsi => snacks


Architecture

• Phase 1
  – the user initially specifies a CAQ
    • it includes a set of constraints C
    • C is applicable to the antecedent and the consequent
  – output:
    • pairs of candidates (Sa, Sc)
    • Sa and Sc have support above the thresholds
  – the user can add, delete, or modify the constraints as many times as desired


• Phase 2
  – the user specifies:
    • a significance metric
    • a threshold for the metric
    • whatever further conditions are to be imposed on the antecedent and the consequent
  – classical association mining corresponds to:
    • confidence as the significance metric
    • a confidence threshold
    • the requirement that (Sa ∪ Sc) be frequent
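As a rough illustration of Phase 2 in the classical setting, the sketch below filters candidate pairs (Sa, Sc) by confidence and by the frequency of Sa ∪ Sc. The transactions, the function names (support, confidence, phase2) and the thresholds are made up for illustration and do not come from the slides.

```python
# Hypothetical toy transactions; each one is a set of items.
TRANSACTIONS = [
    {"pepsi", "chips"},
    {"pepsi", "chips", "salsa"},
    {"pepsi"},
    {"chips", "salsa"},
]


def support(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def confidence(sa, sc):
    """Confidence of Sa => Sc: support(Sa u Sc) / support(Sa)."""
    base = support(sa)
    return support(sa | sc) / base if base else 0.0


def phase2(pairs, min_conf, min_support):
    """Keep the pairs whose union is frequent and whose confidence
    reaches the threshold (confidence plays the significance metric)."""
    return [
        (sa, sc)
        for sa, sc in pairs
        if support(sa | sc) >= min_support and confidence(sa, sc) >= min_conf
    ]


candidates = [
    (frozenset({"pepsi"}), frozenset({"chips"})),
    (frozenset({"salsa"}), frozenset({"chips"})),
]
rules = phase2(candidates, min_conf=0.8, min_support=2)
```

In this toy data only salsa => chips survives: pepsi => chips has a frequent union but a confidence of 2/3. A different significance metric could be swapped in by replacing the confidence function.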


Constrained Association Queries

• CAQ
  – S ⊆ Item: S is a set variable on the Item domain
  – {(S1, S2) | C}, where C is a set of constraints on S1, S2
  – frequency constraints freq(Si)
  – trans(TID, Itemset), iteminfo(Item, Type, Price)
  – S.Price ≤ 100: all items in S have a price less than or equal to $100
  – {snacks, sodas} ⊆ S.Type
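To make the notation concrete, here is a minimal sketch of how the constraints above could be evaluated on one set variable S. The two dictionaries stand in for the trans and iteminfo relations; their contents, the helper names and the default support threshold are assumptions made for this example.

```python
# Hypothetical stand-ins for trans(TID, Itemset) and iteminfo(Item, Type, Price).
TRANS = {
    1: {"chips", "cola", "beer"},
    2: {"chips", "cola"},
    3: {"chips", "beer"},
    4: {"cola", "wine"},
}
ITEM_INFO = {
    "chips": ("snacks", 3),
    "cola": ("sodas", 2),
    "beer": ("beers", 8),
    "wine": ("wines", 120),
}


def support(s):
    """Number of transactions containing every item of s."""
    return sum(1 for items in TRANS.values() if s <= items)


def freq(s, min_support=2):
    """Frequency constraint freq(S); the threshold is an assumed value."""
    return support(s) >= min_support


def types(s):
    """S.Type: the set of types of the items in S."""
    return {ITEM_INFO[i][0] for i in s}


def price_at_most(s, v):
    """S.Price <= v: every item in S costs at most v."""
    return all(ITEM_INFO[i][1] <= v for i in s)


def covers_types(s, required):
    """required is a subset of S.Type, e.g. {snacks, sodas}."""
    return required <= types(s)


s = frozenset({"chips", "cola"})
ok = freq(s) and price_at_most(s, 100) and covers_types(s, {"snacks", "sodas"})
```

Here the set {chips, cola} passes all three constraints, while any set containing wine fails S.Price ≤ 100.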


• CAQ Examples
  – {(S1, S2) | S1 ⊆ Item & S2 ⊆ Item & count(S1) = 1 & count(S2) = 1 & freq(S1) & freq(S2)}
    • further constraints can relate S1.Type to S2.Type, and max(S1.Price) to avg(S2.Price)
  – {(S1, S2) | agg1(S1.Price) ≤ 100 & agg2(S2.Price) ≥ 1000}
  – {(S1, S2) | S1.Type ⊆ {Snacks} & S2.Type ⊆ {Beers} & max(S1.Price) ≤ min(S2.Price)}

• Sound/Complete
  – an algorithm is sound if it only finds frequent sets that satisfy the given constraints
  – an algorithm is complete if all frequent sets satisfying the given constraints are found


• Goal
  – push the constraints as deeply as possible inside the computation of frequent sets
  – running the classical algorithm and then testing its results for constraint satisfaction is too inefficient
  – pruning must keep the algorithm sound and complete: anti-monotonicity, succinctness


Anti-Monotone Constraints

• Find the constraints that are anti-monotone
  – they prune away a significant number of candidates

• Definition
  – a 1-var constraint C is anti-monotone iff for all sets S, S':
    • S ⊇ S' & S satisfies C ⇒ S' satisfies C

• Identify which constraints are anti-monotone
  – see Fig. 3
  – min(S) ≥ v (anti-monotone), min(S) ≤ v (not anti-monotone)
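The definition can be checked by brute force on a small universe. The sketch below is only an illustration: the price values, the bound v and the helper names are assumptions. A True result on one finite universe is evidence rather than a proof, but a False result is a genuine counterexample.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_anti_monotone_on(universe, constraint):
    """Check: whenever S satisfies C, every non-empty subset S' of S does too."""
    for s in nonempty_subsets(universe):
        if constraint(s) and not all(constraint(sub) for sub in nonempty_subsets(s)):
            return False
    return True


PRICES = {1, 5, 10, 20}  # hypothetical item prices
V = 10                   # hypothetical bound v


def min_at_least_v(s):
    return min(s) >= V


def min_at_most_v(s):
    return min(s) <= V


holds_for_ge = is_anti_monotone_on(PRICES, min_at_least_v)
holds_for_le = is_anti_monotone_on(PRICES, min_at_most_v)
```

For min(S) ≤ v the check fails because, for instance, {1, 20} satisfies the constraint while its subset {20} does not.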


Succinct Constraints

• pruning is done once and for all (before any iteration takes place)
  – not the generate-and-test paradigm
  – how to do this:
    • succinctness
    • member generating functions
  – definition
    • SAT_C(Item): the set of item sets satisfying C, i.e. the pruned space
      – C1 ≡ S.Price ≥ 100: the pruned space for C1 contains only item sets such that each item in the set has a price of at least $100
    • selection predicate p


Example

C1 ≡ S.Price ≥ 100; let Item1 = σ_{Price ≥ 100}(Item)

C1 is succinct because its pruned space SAT_C1(Item) is simply 2^Item1

C2 ≡ {snacks, sodas} ⊆ S.Type; let Item2, Item3, Item4 be the sets σ_{Type = 'snacks'}(Item), σ_{Type = 'sodas'}(Item), σ_{Type ≠ 'snacks' ∧ Type ≠ 'sodas'}(Item)

C2 is succinct because SAT_C2(Item) can be expressed as 2^Item − 2^Item2 − 2^Item3 − 2^Item4 − 2^(Item2 ∪ Item4) − 2^(Item3 ∪ Item4)
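The powerset expression for C2 can be sanity-checked on a tiny catalogue. In the sketch below the four items and their types are invented for the example; it builds the pruned space once from the formula (reading each minus as set difference) and once directly from the meaning of the constraint, then compares the two.

```python
from itertools import combinations


def powerset(items):
    """All subsets of items, including the empty set, as frozensets."""
    items = sorted(items)
    return {
        frozenset(combo)
        for k in range(len(items) + 1)
        for combo in combinations(items, k)
    }


# Hypothetical catalogue: item -> type.
ITEM_TYPE = {"chips": "snacks", "nuts": "snacks", "cola": "sodas", "beer": "beers"}
ITEM = set(ITEM_TYPE)

item2 = {i for i, t in ITEM_TYPE.items() if t == "snacks"}
item3 = {i for i, t in ITEM_TYPE.items() if t == "sodas"}
item4 = ITEM - item2 - item3  # neither snacks nor sodas

# Pruned space written as differences of powersets.
sat_by_formula = (
    powerset(ITEM)
    - powerset(item2)
    - powerset(item3)
    - powerset(item4)
    - powerset(item2 | item4)
    - powerset(item3 | item4)
)

# Pruned space from the meaning of C2: the types of S include snacks and sodas.
sat_by_definition = {
    s for s in powerset(ITEM) if {"snacks", "sodas"} <= {ITEM_TYPE[i] for i in s}
}
```

On this catalogue both constructions give the same six item sets, each containing cola and at least one snack.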


Example

C1 ≡ S.Price ≥ 100: MGF = {X | X ⊆ Item1 & X ≠ ∅}

C2 ≡ {snacks, sodas} ⊆ S.Type: MGF = {X1 ∪ X2 ∪ X3 | X1 ⊆ Item2 & X1 ≠ ∅ & X2 ⊆ Item3 & X2 ≠ ∅ & X3 ⊆ Item4}
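A member generating function enumerates exactly the sets that satisfy the constraint, so nothing has to be generated and then rejected. The following sketch shows one way to write the two MGFs as plain enumerations; the function names and the sample item sets are assumptions for illustration.

```python
from itertools import combinations, product


def subsets(items, allow_empty):
    """Yield subsets of items as frozensets, optionally including the empty set."""
    items = sorted(items)
    start = 0 if allow_empty else 1
    for k in range(start, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def mgf_c1(item1):
    """{X | X is a non-empty subset of Item1}."""
    return set(subsets(item1, allow_empty=False))


def mgf_c2(item2, item3, item4):
    """{X1 u X2 u X3 | X1, X2 non-empty subsets of Item2, Item3; X3 any subset of Item4}."""
    return {
        x1 | x2 | x3
        for x1, x2, x3 in product(
            subsets(item2, False), subsets(item3, False), subsets(item4, True)
        )
    }


# Hypothetical inputs: expensive items for C1; snacks, sodas and the rest for C2.
expensive_sets = mgf_c1({"wine", "tv"})
snack_soda_sets = mgf_c2({"chips", "nuts"}, {"cola"}, {"beer"})
```

With these inputs mgf_c1 yields three sets and mgf_c2 yields six, every one of which contains cola and at least one snack.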


Algorithms

• Algorithm Apriori+
  – computes the frequent sets; among the frequent sets, those that satisfy the constraints become the answer set

• Algorithm Hybrid(m)
  – when (C − Cfreq) is more selective, Apriori+ is inefficient
  – first checks only Cfreq for m iterations
  – then, to reduce the remaining I/O cost, switches to checking (C − Cfreq)
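The Apriori+ baseline can be sketched as a plain level-wise frequent-set computation followed by a filter. This is a simplified in-memory illustration, not the slides' implementation: the transactions, the price table and the function names are assumptions, and no attention is paid to I/O cost.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise computation of all frequent item sets (classical Apriori)."""

    def support(s):
        return sum(1 for t in transactions if s <= t)

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = set(level)
    k = 2
    while level:
        # Join step: unions of two frequent (k-1)-sets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c
            for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
        frequent |= level
        k += 1
    return frequent


def apriori_plus(transactions, min_support, constraint):
    """Apriori+: compute all frequent sets first, then keep those satisfying C."""
    return {s for s in apriori(transactions, min_support) if constraint(s)}


# Hypothetical data: four transactions and a price per item.
TRANSACTIONS = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
PRICE = {"a": 5, "b": 50, "c": 200}


def price_at_most_100(s):
    return all(PRICE[i] <= 100 for i in s)


all_frequent = apriori(TRANSACTIONS, min_support=2)
answer = apriori_plus(TRANSACTIONS, min_support=2, constraint=price_at_most_100)
```

Six sets are frequent here but only three satisfy the price constraint, which shows the wasted counting that motivates pushing constraints into the computation.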


CAP Algorithm

• 4 cases
  – succinct and anti-monotone
    • replace the size-1 candidate set C1 in the Apriori algorithm by C1^c, its restriction to the items that satisfy the constraint
  – succinct but not anti-monotone


  – anti-monotone but not succinct
    • define Ck as in the Apriori algorithm, but drop a candidate S if S fails C
    • constraint satisfaction is tested before counting is done
  – neither
    • induce any weaker constraint C' from C; depending on whether C' is anti-monotone and/or succinct, use the above strategies
    • once all frequent sets are generated, test them for satisfaction of C
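The anti-monotone strategy can be illustrated by moving the constraint test in front of the support counting. The sketch below is a simplified in-memory version under assumed data and names (cap_anti_monotone, the transactions and the prices are not from the slides); it also counts how many candidates have their support computed.

```python
from itertools import combinations


def cap_anti_monotone(transactions, min_support, constraint):
    """Level-wise mining where candidates failing an anti-monotone C are
    dropped before their support is counted. Returns (answers, counted)."""
    counted = 0

    def support(s):
        nonlocal counted
        counted += 1
        return sum(1 for t in transactions if s <= t)

    items = {i for t in transactions for i in t}
    candidates = {frozenset([i]) for i in items}
    answers = set()
    k = 1
    while candidates:
        # Constraint satisfaction is tested before any counting is done.
        candidates = {c for c in candidates if constraint(c)}
        level = {c for c in candidates if support(c) >= min_support}
        answers |= level
        k += 1
        # Build size-k candidates from the surviving (k-1)-sets, as in Apriori.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c
            for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return answers, counted


# Hypothetical data: four transactions and a price per item.
TRANSACTIONS = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
PRICE = {"a": 5, "b": 50, "c": 200}


def price_at_most_100(s):
    return all(PRICE[i] <= 100 for i in s)


answers, counted = cap_anti_monotone(TRANSACTIONS, 2, price_at_most_100)
```

Because the constraint is anti-monotone, dropping a failing candidate can never remove a subset of a valid answer, so the result stays sound and complete. On this data only three candidates are counted, whereas computing all frequent sets first would count seven.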
