Extracting Decisional Correlation Rules Alain Casali Christian Ernst

Feb 23, 2016

Page 1: Extracting Decisional Correlation Rules

Extracting Decisional Correlation Rules

Alain Casali

Christian Ernst

Page 2: Extracting Decisional Correlation Rules

Dexa'09 - Extracting Decision Correlation Rules

Industrial Problem

Given a supply chain (in microelectronics), we want to find links between the values of some parameters and the values of a specific attribute of the supply chain (the yield).

Positive (and/or negative) association rules are not suitable in our context.

We use correlation tests because:
- the measure is statistically more significant;
- the measure takes into account not only the presence but also the absence of the items;
- the measure is non-directional, and can thus highlight more complex links than a "simple" implication.

Page 3: Extracting Decisional Correlation Rules

Outline
- Preliminaries
- Decision Correlation Rules
- Contingency Vectors
- LHS-χ2 algorithm
- Experimental Analysis
- Conclusion

Page 4: Extracting Decisional Correlation Rules

Literal Set

A literal set X¬Y is composed of:
- a positive part (X);
- a negative part (Y).

The variation of a pattern encompasses all the combinations of positive and negative parts that can be obtained from its items.
Ex: Var(AB) = {AB, A¬B, ¬AB, ¬A¬B}

The support of a literal set is the number of transactions that contain its positive part and none of the items of its negative part.
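The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `support` and `variation` are hypothetical, and the data is the 10-transaction example relation shown on Page 5 of the slides (target values left out).

```python
from itertools import combinations

# Example relation from the slides (Tid -> items); target values are omitted here.
RELATION = {
    1: {"B", "C", "F"}, 2: {"B", "C", "F"}, 3: {"B", "C", "E"}, 4: {"F"},
    5: {"B", "D", "F"}, 6: {"B", "F"}, 7: {"B", "C", "F"}, 8: {"A", "E"},
    9: {"B", "C", "F"}, 10: {"B", "F"},
}

def support(positive, negative, relation=RELATION):
    """Count transactions holding every positive item and no negative item."""
    pos, neg = set(positive), set(negative)
    return sum(1 for items in relation.values()
               if pos <= items and not (neg & items))

def variation(pattern):
    """Yield every (positive part, negative part) split of the pattern's items."""
    items = sorted(pattern)
    for k in range(len(items), -1, -1):
        for pos in combinations(items, k):
            neg = tuple(i for i in items if i not in pos)
            yield pos, neg

# Print the support of each literal set of Var(BF).
for pos, neg in variation("BF"):
    print(pos, neg, support(pos, neg))
```

The four supports printed for Var(BF) are the four cells of the contingency table of BF.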

Page 5: Extracting Decisional Correlation Rules

Correlation rule and χ2 (1)

Tid  Item   Target
1    B C F  T1
2    B C F  T1
3    B C E  T1
4    F      T1
5    B D F  T2
6    B F
7    B C F
8    A E
9    B C F
10   B F

Contingency table: each cell of the contingency table (CT) of a pattern X contains the support of one of the literal sets Y¬Z of its variation:

CT(BF)     B   ¬B   ∑ line
F          7   1    8
¬F         1   1    2
∑ column   8   2    10

Expected value: the theoretical support of each cell.

Page 6: Extracting Decisional Correlation Rules

Correlation rule and χ2 (2)

Computation of χ2 (Brin'97):

$$\chi^2(X) = \sum_{Y\neg Z \,\in\, CT(X)} \frac{\left(Supp(Y\neg Z) - E(Y\neg Z)\right)^2}{E(Y\neg Z)}$$

It links the real support to the theoretical support (expected value).

Correlation rate: use of a table giving the centile values with a single degree of freedom (existence of a bijection).

Ex: χ2(BF) ≈ 1.67, Correlation(BF) ≈ 85%
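As a rough illustration of the χ2 computation, the sketch below assumes the usual independence hypothesis for the expected value of a cell: |r| multiplied, for each item, by supp/|r| if the item is present and by (|r| - supp)/|r| if it is absent. That formula is not spelled out in this transcript, and the name `chi_squared` is hypothetical.

```python
from itertools import product

def chi_squared(n, supports, observed):
    """Chi-squared of a pattern.

    n: number of transactions.
    supports: support of each item of the pattern, in order.
    observed: dict mapping a tuple of 0/1 flags (1 = item present)
              to the support of the corresponding cell.
    """
    total = 0.0
    for flags in product((1, 0), repeat=len(supports)):
        # Expected value of the cell under independence (assumption).
        expected = n
        for supp, present in zip(supports, flags):
            expected *= (supp if present else n - supp) / n
        total += (observed.get(flags, 0) - expected) ** 2 / expected
    return total

# CT(BF) from the example relation: supp(B) = 8, supp(F) = 8, |r| = 10.
ct_bf = {(1, 1): 7, (1, 0): 1, (0, 1): 1, (0, 0): 1}
print(round(chi_squared(10, [8, 8], ct_bf), 3))
```

With the counts of CT(BF) as they appear in this transcript, the sketch evaluates to about 1.41 rather than the 1.67 quoted on the slide; the quoted figure may rest on different counts.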

Page 7: Extracting Decisional Correlation Rules

Related Constraints

Anti-monotone constraint (Cochran criteria):
- no cell of the CT may have a null value;
- at least p% of the CT's cells must have a support greater than or equal to MinSup.

Monotone constraint:
- X is a valid correlation rule: χ2(X) ≥ MinCor.
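A minimal sketch of these two tests, assuming that "null value" refers to the support stored in a cell and that p is passed as a fraction in [0, 1] rather than a percentage. Both function names are hypothetical.

```python
def satisfies_cochran(cells, min_sup, p):
    """Anti-monotone test on the supports of all cells of a contingency table."""
    if any(c == 0 for c in cells):
        return False  # a null cell rules the pattern out
    large_enough = sum(1 for c in cells if c >= min_sup)
    return large_enough >= p * len(cells)

def is_valid_rule(chi2_value, min_cor):
    """Monotone test: the pattern is a valid correlation rule."""
    return chi2_value >= min_cor

# Cells of CT(BF) from the example relation.
print(satisfies_cochran([7, 1, 1, 1], min_sup=2, p=0.25))
```

Because the first test is anti-monotone, a pattern that fails it can be pruned together with all its supersets.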

Page 8: Extracting Decisional Correlation Rules

Browsing the search space

Levelwise algorithms are used to browse the search space.

Levelwise algorithms are well suited when:
- the relation is on disk;
- we have anti-monotone constraints.

Problem: memory requirement for the contingency tables:

$$O\left(\sum_{i=1}^{n} \binom{|I|}{i} \cdot 2^{i}\right)$$

Example with |I| = 1000:

Level  Memory requirement
2      4 MB
3      2.5 GB
4      1.3 TB
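The figures of this example can be approximated with a short computation. The sketch assumes that a level-i pattern needs a contingency table of 2^i cells and that each cell takes 2 bytes; the cell size is not stated on the slide and is chosen only because it reproduces the quoted orders of magnitude. The function name is hypothetical.

```python
from math import comb

BYTES_PER_CELL = 2  # assumption, not given on the slide

def levelwise_table_bytes(n_items, level, bytes_per_cell=BYTES_PER_CELL):
    """Bytes needed to hold all contingency tables of one level."""
    return comb(n_items, level) * 2 ** level * bytes_per_cell

for level in (2, 3, 4):
    print(level, levelwise_table_bytes(1000, level))
```

For |I| = 1000 this gives roughly 4 * 10^6, 2.7 * 10^9 and 1.3 * 10^12 bytes for levels 2, 3 and 4.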

Page 9: Extracting Decisional Correlation Rules

DEXA - Sept. 2006

Lectic Order & Lectic Search (LS)

Goal: enumerate the combinations (powerset lattice) with a balanced tree.

Start point: 2 vectors; the 1st one is empty, the 2nd one contains the list of the items.

Create 2 branches:
- left: prune the last element of the 2nd vector (recursive call);
- right: add the last element of the 2nd vector to the 1st one (recursive call).

Stop: when the 2nd vector is empty, output the 1st vector.

(,ABC)
├─ (,AB)
│  ├─ (,A)
│  │  ├─ (,)
│  │  └─ (A,)
│  └─ (B,A)
│     ├─ (B,)
│     └─ (AB,)
└─ (C,AB)
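The recursion described on this slide can be sketched as follows. The function name `lectic_search` and the use of Python lists for the two vectors are illustrative choices, not taken from the slides.

```python
def lectic_search(first, second, out):
    """Enumerate the powerset of the items with the two-vector recursion."""
    if not second:
        # Stop: the 2nd vector is empty, output the 1st vector.
        out.append("".join(first))
        return
    *rest, last = second
    # Left branch: prune the last element of the 2nd vector.
    lectic_search(first, rest, out)
    # Right branch: move the last element of the 2nd vector to the 1st one.
    lectic_search([last] + first, rest, out)

patterns = []
lectic_search([], list("ABC"), patterns)
print(patterns)  # ['', 'A', 'B', 'AB', 'C', 'AC', 'BC', 'ABC']
```

Each of the 2^n subsets is output exactly once, and the recursion depth never exceeds the number of items.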

Page 11: Extracting Decisional Correlation Rules

Decision Correlation Rules

We are interested in rules satisfying both constraints:
- χ2(X) ≥ MinCor;
- X contains 1 value of the target attribute.

Problem: there is no function f such that

χ2(X ∪ A) = f(χ2(X), supp(A))

Page 13: Extracting Decisional Correlation Rules

Contingency Vector (1)

Equivalence class associated with a literal set:

[Y¬Z] = {i ∈ Tid(r) / Y ⊆ Tid(i) and Z ∩ Tid(i) = ∅}

Ex: [B¬F] = {3}

Contingency vector of a pattern X: set of the equivalence classes of the variation of X.

Ex: CV(BF) = {[¬B¬F], [¬BF], [B¬F], [BF]} = {{8}, {4}, {3}, {1,2,5,6,7,9,10}}

(Examples computed on the relation of Page 5.)
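A small sketch of these definitions on the example relation (target values left out). The function names are hypothetical, and `contingency_vector` only returns the non-empty classes, each keyed by the positive part of its literal set.

```python
RELATION = {
    1: {"B", "C", "F"}, 2: {"B", "C", "F"}, 3: {"B", "C", "E"}, 4: {"F"},
    5: {"B", "D", "F"}, 6: {"B", "F"}, 7: {"B", "C", "F"}, 8: {"A", "E"},
    9: {"B", "C", "F"}, 10: {"B", "F"},
}

def equivalence_class(positive, negative, relation=RELATION):
    """Tids whose transaction holds all positive items and no negative item."""
    pos, neg = set(positive), set(negative)
    return {tid for tid, items in relation.items()
            if pos <= items and not (neg & items)}

def contingency_vector(pattern, relation=RELATION):
    """Group the Tids by the subset of the pattern's items they contain."""
    items = sorted(pattern)
    classes = {}
    for tid, transaction in relation.items():
        key = frozenset(i for i in items if i in transaction)
        classes.setdefault(key, set()).add(tid)
    return classes

print(equivalence_class("B", "F"))
print(contingency_vector("BF"))
```

Every Tid lands in exactly one class, which is why the contingency vector is a partition of the Tids.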

Page 14: Extracting Decisional Correlation Rules

Contingency Vector (2)

The contingency vector is a partition of the Tids.

Recurrence relation:

CV(X ∪ A) = (CV(X) ∩ [A]) ∪ (CV(X) ∩ [¬A])

In practice:

Tid                      1   2   3   4   5   6   7   8   9   10
CV(B)                    1   1   1   0   1   1   1   0   1   1
CV(F)                    1   1   0   1   1   1   1   0   1   1
CV(B) + CV(F) = CV(BF)   11  11  10  01  11  11  11  00  11  11

Additions in binary logic.
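One way to read the "addition in binary logic" is as a shift-and-or on a small integer code kept per transaction. The sketch below follows that reading, which is an assumption about the implementation; `item_vector` and `extend` are hypothetical names.

```python
RELATION = {
    1: {"B", "C", "F"}, 2: {"B", "C", "F"}, 3: {"B", "C", "E"}, 4: {"F"},
    5: {"B", "D", "F"}, 6: {"B", "F"}, 7: {"B", "C", "F"}, 8: {"A", "E"},
    9: {"B", "C", "F"}, 10: {"B", "F"},
}
TIDS = sorted(RELATION)

def item_vector(item):
    """Contingency vector of a 1-item: one presence bit per transaction."""
    return [1 if item in RELATION[tid] else 0 for tid in TIDS]

def extend(cv, item_bits):
    """Append the new item's presence bit to every transaction's code."""
    return [(code << 1) | bit for code, bit in zip(cv, item_bits)]

cv_b = item_vector("B")
cv_bf = extend(cv_b, item_vector("F"))
print([format(code, "02b") for code in cv_bf])
```

Adding an item costs one pass over the transactions, whatever the length of the current pattern.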

Page 15: Extracting Decisional Correlation Rules

Contingency Vector (3)

Computation of the contingency table

Tid                      1   2   3   4   5   6   7   8   9   10
CV(B) + CV(F) = CV(BF)   11  11  10  01  11  11  11  00  11  11

«Distribution»   ¬B¬F  ¬BF  B¬F  BF
CT[BF]           1     1    1    7
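Counting how often each code occurs in the contingency vector yields the contingency table directly. A minimal sketch, with the hypothetical helper `contingency_table` and the codes of CV(BF) copied from the table above:

```python
from collections import Counter

# CV(BF) for Tids 1..10, one 2-bit code per transaction (B is the high bit).
cv_bf = [0b11, 0b11, 0b10, 0b01, 0b11, 0b11, 0b11, 0b00, 0b11, 0b11]

def contingency_table(cv, length):
    """Support of each cell, in the order 00, 01, 10, 11 for a 2-item pattern."""
    counts = Counter(cv)
    return [counts.get(code, 0) for code in range(2 ** length)]

print(contingency_table(cv_bf, 2))  # [1, 1, 1, 7]
```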

Page 17: Extracting Decisional Correlation Rules

LHS-χ2 Algorithm

Modification of LS in order to include the contingency vectors.

At each node:
- call to the left branch: nothing to do;
- before calling the right branch:
  - computation of the new contingency vector;
  - test of the anti-monotone constraints;
  - [add the current pattern to the positive border];
  - test of the monotone constraints;
  - computation of the χ2.

If all tests are OK, then output the pattern and its χ2.
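The steps above can be sketched as a recursive function. This is a simplified reading, not the authors' implementation: target values are treated as ordinary items, the expected value assumes independence, the positive border is not maintained, p is a fraction, and all names (`lhs_chi2`, `cells`, `chi2`) are hypothetical.

```python
RELATION = {
    1: {"B", "C", "F", "T1"}, 2: {"B", "C", "F", "T1"}, 3: {"B", "C", "E", "T1"},
    4: {"F", "T1"}, 5: {"B", "D", "F", "T2"}, 6: {"B", "F"}, 7: {"B", "C", "F"},
    8: {"A", "E"}, 9: {"B", "C", "F"}, 10: {"B", "F"},
}
TARGETS = {"T1", "T2"}
TIDS = sorted(RELATION)
N = len(TIDS)
# Contingency vectors of the 1-items: one presence bit per transaction.
BITS = {item: [int(item in RELATION[t]) for t in TIDS]
        for item in set().union(*RELATION.values())}

def cells(cv, k):
    """Contingency table of a k-item pattern, indexed by code."""
    counts = [0] * (2 ** k)
    for code in cv:
        counts[code] += 1
    return counts

def chi2(pattern, table):
    """Chi-squared of the pattern; the first item is the highest bit of a code."""
    k = len(pattern)
    total = 0.0
    for code, observed in enumerate(table):
        expected = N
        for j, item in enumerate(pattern):
            supp = sum(BITS[item])
            present = (code >> (k - 1 - j)) & 1
            expected *= (supp if present else N - supp) / N
        total += (observed - expected) ** 2 / expected
    return total

def lhs_chi2(pattern, cv, remaining, min_sup, p, min_cor, rules):
    if not remaining:
        return
    *rest, last = remaining
    # Left branch: the last item is dropped, nothing to compute.
    lhs_chi2(pattern, cv, rest, min_sup, p, min_cor, rules)
    # Right branch: extend the pattern and its contingency vector.
    new_pattern = pattern + [last]
    new_cv = [(code << 1) | bit for code, bit in zip(cv, BITS[last])]
    table = cells(new_cv, len(new_pattern))
    # Anti-monotone tests (Cochran criteria): prune the branch on failure.
    if 0 in table or sum(c >= min_sup for c in table) < p * len(table):
        return
    # Monotone tests: exactly one target value, then the chi-squared threshold.
    if len(new_pattern) > 1 and sum(i in TARGETS for i in new_pattern) == 1:
        value = chi2(new_pattern, table)
        if value >= min_cor:
            rules[frozenset(new_pattern)] = value
    lhs_chi2(new_pattern, new_cv, rest, min_sup, p, min_cor, rules)

rules = {}
lhs_chi2([], [0] * N, sorted(BITS), min_sup=1, p=1.0, min_cor=1.5, rules=rules)
for rule, value in rules.items():
    print(sorted(rule), round(value, 2))
```

Only the contingency vectors along the current recursion path are alive at any time, which is where the memory saving over a levelwise traversal comes from.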

Page 18: Extracting Decisional Correlation Rules

Memory Requirements

What is the storage requirement?

Contingency vectors of the 1-items: |I|*|r| bits

Current contingency vectors (including the previous ones, due to the recursive calls):
- |I|*|I|*|r| bits in theory;
- |I|*|r| bytes in practice, since patterns never exceed a length of 8.

Finally we need: |r|*(|I|+|I|/8) bytes

This result has to be compared with

$$O\left(\sum_{i=1}^{n} \binom{|I|}{i} \cdot 2^{i}\right)$$
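To get a feeling for the orders of magnitude, the final formula can be evaluated on the two datasets reported in the experimental analysis (492 transactions and 3384 items, 426 transactions and 1136 items). The function name is hypothetical, and the bit count is rounded up to whole bytes.

```python
def cv_memory_bytes(n_transactions, n_items):
    """|r| * (|I| + |I|/8) bytes, split into its two terms."""
    # Contingency vectors of the 1-items: |I|*|r| bits, rounded up to bytes.
    one_item_vectors = (n_items * n_transactions + 7) // 8
    # Current contingency vectors: |I|*|r| bytes in practice.
    current_vectors = n_items * n_transactions
    return one_item_vectors + current_vectors

print(cv_memory_bytes(492, 3384))  # 1873044
print(cv_memory_bytes(426, 1136))  # 544428
```

Both datasets stay below 2 MB under this estimate.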

Page 20: Extracting Decisional Correlation Rules

Experimental Analysis (1)

Experiments are run on a PC with a 1.8 GHz processor and 2 GB of RAM.

Files are provided by 2 manufacturers (STMicroelectronics and ATMEL):

                  STMicroelectronics   ATMEL
# transactions    492                  426
# items           3384                 1136

Page 21: Extracting Decisional Correlation Rules

Experimental Analysis (2)

Page 22: Extracting Decisional Correlation Rules

Experimental Analysis (2)

Page 24: Extracting Decisional Correlation Rules

Conclusion

- We have discovered new parameters having an influence on the yield (more than 25% were not known before).
- Response times are 30 to 70% better with LHS-χ2 than with a levelwise algorithm.

Perspectives:
- use of a "divide and conquer" strategy for better performance;
- "cleaning" / transformation of the original data;
- generalization of the rules by integrating literal sets.