Chase Methods based on Knowledge Discovery
Agnieszka Dardzinska & Zbigniew W. Ras

Jan 21, 2016

Page 1: Chase Methods  based on Knowledge Discovery

Chase Methods based on Knowledge Discovery

Agnieszka Dardzinska & Zbigniew W. Ras

Page 2: Chase Methods  based on Knowledge Discovery

Algorithm Chase

GIVEN: Incomplete Information System (IIS) and constraints (functional dependencies, ...) which IIS satisfies

X    Faculty-Name   Dept.   Chair
x1   Bob            -       -
x2   John           -       Jones
x3   Mike           -       -
x4   -              EE      -
x5   Tom            EE      -

[ Dept → Chair, Chair * Dept → Faculty-Name, …, Dept(x1) = Dept(x2) ]

Page 3: Chase Methods  based on Knowledge Discovery

Tableau System for IIS - information system with null values replaced by variables

X    Faculty Name   Department   Chair
x1   Bob            vd           n1
x2   John           vd           Jones
x3   Mike           n2           n3
x4   vE             EE           n4
x5   Tom            EE           n5

Page 4: Chase Methods  based on Knowledge Discovery

Variables in Tableaux System

- distinguished variables, one for each attribute (if b is an attribute of interest, then vb is the corresponding distinguished variable)
- nondistinguished variables (there are countably many of them: n1, n2, n3, ….)

Page 5: Chase Methods  based on Knowledge Discovery

Functional Dependencies:

[Department → Chair]
[Department * Chair → Faculty Name]

Tableau S:

X    Faculty Name   Department   Chair
x1   Bob            vd           n1
x2   John           vd           Jones
x3   Mike           n2           n3
x4   vE             EE           n4
x5   Tom            EE           n5

CHASE_F(S):

X    Faculty Name   Department   Chair
x1   Bob            vd           Jones
x2   John           vd           Jones
x3   Mike           n2           n3
x4   Tom            EE           n4
x5   Tom            EE           n4

Page 6: Chase Methods  based on Knowledge Discovery

Algorithm Chase

Input: tableaux system S and set of functional dependencies F
Output: tableaux system CHASE_F(S)

Begin
  S1 := S;
  while there are t1, t2 ∈ S1 and (B → b) ∈ F
        such that t1[B] = t2[B] and t1[b] < t2[b]
    do change all the occurrences of the value t2[b] in S1 to t1[b];
  CHASE_F(S) := S1
End

Page 7: Chase Methods  based on Knowledge Discovery

Algorithm Chase

The algorithm always terminates if applied to a finite tableaux system. If one execution of the algorithm generates a tableaux system satisfying F, then every execution of the algorithm generates the same tableaux system.
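The loop of Algorithm Chase can be sketched in Python on the Faculty Name / Department / Chair tableau. This is only an illustration under stated assumptions: the function name `chase`, the dictionary layout and the ordering "constants < distinguished variables < nondistinguished variables" are choices made here, and a pair of different constants (Bob, John) is simply left untouched because the slides do not say how such a conflict is handled.

```python
def chase(rows, fds, variables):
    """Chase a tableau. rows: list of dicts; fds: list of (lhs_attrs, rhs_attr)."""
    rows = [dict(r) for r in rows]

    def rank(value):
        # Assumed ordering: constants < distinguished (v...) < nondistinguished (n1, n2, ...)
        if value not in variables:
            return (0, 0)
        return (1, 0) if value.startswith("v") else (2, int(value[1:]))

    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for t1 in rows:
                for t2 in rows:
                    if t1[rhs] == t2[rhs] or any(t1[a] != t2[a] for a in lhs):
                        continue
                    low, high = sorted((t1[rhs], t2[rhs]), key=rank)
                    if high not in variables:
                        continue  # two different constants: left as a conflict (assumption)
                    for row in rows:  # change all occurrences of the larger value
                        for attr in row:
                            if row[attr] == high:
                                row[attr] = low
                    changed = True
    return rows


tableau = [
    {"name": "Bob", "dept": "vd", "chair": "n1"},
    {"name": "John", "dept": "vd", "chair": "Jones"},
    {"name": "Mike", "dept": "n2", "chair": "n3"},
    {"name": "vE", "dept": "EE", "chair": "n4"},
    {"name": "Tom", "dept": "EE", "chair": "n5"},
]
fds = [(["dept"], "chair"), (["dept", "chair"], "name")]
variables = {"vd", "vE", "n1", "n2", "n3", "n4", "n5"}
result = chase(tableau, fds, variables)
```

Each replacement removes one variable from the tableau, which is why the loop stops on a finite tableau.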

Page 8: Chase Methods  based on Knowledge Discovery

Algorithm Chase 1

Page 9: Chase Methods  based on Knowledge Discovery

Chase supported by rules extracted from IIS (Chase 1)

1. Chase 1 identifies all incomplete attributes (their values are called concepts) in IS.
2. Main Algorithm
   - Extraction of rules from IS describing these concepts,
   - Null values in IS are replaced by values suggested by these rules.
3. These two steps are repeated till fixpoint is reached.

Page 10: Chase Methods  based on Knowledge Discovery

Example (Chase1)

S = (X, A, V)
X = {x1, x2, x3, x4, x5, x6, x7, x8, x9, x10}
A = {b, c, d, e, f, g}

X     b    c    d    e    f    g
x1    b1   c1   -    e2   f1   -
x2    b2   c2   d2   e1   f2   g3
x3    b1   c1   d3   e1   f1   g1
x4    b3   c3   d3   e3   f1   g3
x5    b2   c2   -    e3   f1   g2
x6    -    c1   d2   -    f2   g1
x7    b1   -    d2   e2   f4   g1
x8    -    -    d2   e2   f2   g3
x9    b3   c1   d1   -    f2   -
x10   b2   c1   -    e3   f4   g2

Attribute b

e2 → b1 (support 2), c1*f1 → b1 (support 2), g2 → b2 (support 2), c3 → b3 (support 1), c2 → b2 (support 2), g3*d2 → b2 (support 1), e3*d3 → b3 (support 1), f2*d2 → b2 (support 1).

Page 11: Chase Methods  based on Knowledge Discovery

Example (Chase1)

Attribute b

Two null values in S: b(x6), b(x8)

b(x6): e2 → b1 (support 2), c1*f1 → b1 (support 2), g2 → b2 (support 2), c3 → b3 (support 1), c2 → b2 (support 2), g3*d2 → b2 (support 1), e3*d3 → b3 (support 1), f2*d2 → b2 (support 1).


Page 13: Chase Methods  based on Knowledge Discovery

Example (Chase1)

Attribute b

Two null values in S: b(x6), b(x8)

b(x6) = b2

b(x8): e2 → b1 (support 2), c1*f1 → b1 (support 2), g2 → b2 (support 2), c3 → b3 (support 1), c2 → b2 (support 2), g3*d2 → b2 (support 1), e3*d3 → b3 (support 1), f2*d2 → b2 (support 1).

Page 14: Chase Methods  based on Knowledge Discovery

Example (Chase1)

b(x6) = b2

Two null values in S: c(x7), c(x8).

c(x7): b1 → c1 (support 2), e2 → c1 (support 1), f4 → c1 (support 1), g1 → c1 (support 2), b2*d2 → c2 (support 1), b2*e1 → c2 (support 1), b2*f2 → c2 (support 1), b2*g3 → c2 (support 1), d2*e1 → c2 (support 1), d2*g3 → c2 (support 1).


Page 16: Chase Methods  based on Knowledge Discovery

Example (Chase1)

b(x6) = b2, c(x7) = c1

Two null values in S: c(x7), c(x8).

c(x8): b1 → c1 (support 2), e2 → c1 (support 1), f4 → c1 (support 1), g1 → c1 (support 2), b2*d2 → c2 (support 1), b2*e1 → c2 (support 1), b2*f2 → c2 (support 1), b2*g3 → c2 (support 1), d2*e1 → c2 (support 1), d2*g3 → c2 (support 1).

Page 17: Chase Methods  based on Knowledge Discovery

Algorithm Chase1(S, In(A), L(D))

Input: System S = (X, A, V)
       Set of incomplete attributes In(A) = {a1, a2, …, ak}
       Set of rules L(D)
Output: System Chase1(S)

begin
  j := 1;
  while j ≤ k do
  begin
    Sj := S;
    for all v ∈ V_aj do
      while there is x ∈ X and rule (t → v) ∈ L(D) such that x ∈ N_Sj(t) and card(aj(x)) ≠ 1
      begin
        aj(x) := v;
      end
    j := j + 1
  end
  S := {Sj : 1 ≤ j ≤ k},
  Chase1(S, In(A), L(D))
end
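The rule-application step of Chase 1 can be tried on the ten-object example with a few lines of Python. The names `fill_nulls`, `RULES_B` and `RULES_C` are hypothetical, and the conflict policy (add up the supports of all matching rules and assign a value only when one value clearly wins) is an assumption of this sketch; the pseudocode above does not spell out how competing rules are resolved.

```python
N = None  # null value
S = {
    "x1": {"b": "b1", "c": "c1", "d": N, "e": "e2", "f": "f1", "g": N},
    "x2": {"b": "b2", "c": "c2", "d": "d2", "e": "e1", "f": "f2", "g": "g3"},
    "x3": {"b": "b1", "c": "c1", "d": "d3", "e": "e1", "f": "f1", "g": "g1"},
    "x4": {"b": "b3", "c": "c3", "d": "d3", "e": "e3", "f": "f1", "g": "g3"},
    "x5": {"b": "b2", "c": "c2", "d": N, "e": "e3", "f": "f1", "g": "g2"},
    "x6": {"b": N, "c": "c1", "d": "d2", "e": N, "f": "f2", "g": "g1"},
    "x7": {"b": "b1", "c": N, "d": "d2", "e": "e2", "f": "f4", "g": "g1"},
    "x8": {"b": N, "c": N, "d": "d2", "e": "e2", "f": "f2", "g": "g3"},
    "x9": {"b": "b3", "c": "c1", "d": "d1", "e": N, "f": "f2", "g": N},
    "x10": {"b": "b2", "c": "c1", "d": N, "e": "e3", "f": "f4", "g": "g2"},
}
# (premise, concluded value, support), copied from the rule lists of the example
RULES_B = [
    ({"e": "e2"}, "b1", 2), ({"c": "c1", "f": "f1"}, "b1", 2), ({"g": "g2"}, "b2", 2),
    ({"c": "c3"}, "b3", 1), ({"c": "c2"}, "b2", 2), ({"g": "g3", "d": "d2"}, "b2", 1),
    ({"e": "e3", "d": "d3"}, "b3", 1), ({"f": "f2", "d": "d2"}, "b2", 1),
]
RULES_C = [
    ({"b": "b1"}, "c1", 2), ({"e": "e2"}, "c1", 1), ({"f": "f4"}, "c1", 1),
    ({"g": "g1"}, "c1", 2), ({"b": "b2", "d": "d2"}, "c2", 1),
    ({"b": "b2", "e": "e1"}, "c2", 1), ({"b": "b2", "f": "f2"}, "c2", 1),
    ({"b": "b2", "g": "g3"}, "c2", 1), ({"d": "d2", "e": "e1"}, "c2", 1),
    ({"d": "d2", "g": "g3"}, "c2", 1),
]


def fill_nulls(table, attr, rules):
    """Replace null values of attr by the value the matching rules agree on."""
    for row in table.values():
        if row[attr] is not None:
            continue
        votes = {}
        for premise, value, support in rules:
            if all(row[a] == v for a, v in premise.items()):
                votes[value] = votes.get(value, 0) + support
        winners = [v for v, s in votes.items() if s == max(votes.values())]
        if len(winners) == 1:  # ties are left null (assumption of this sketch)
            row[attr] = winners[0]


fill_nulls(S, "b", RULES_B)
fill_nulls(S, "c", RULES_C)
```

With this policy the sketch reproduces b(x6) = b2 and c(x7) = c1 from the example. The rules matching x8 point to different values with equal total support, so under the tie rule assumed here b(x8) and c(x8) stay null.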

Page 18: Chase Methods  based on Knowledge Discovery

ZOO Database

A1 - "the no. of different attributes used in a query"
A2 - "the percent of null values in a queried IS"
A3 - "the no. of objects returned by QAS when IS-complete"
A4 - "the no. of objects returned by QAS when IS-incomplete" (optimistic interpretation)
A5 - "the no. of objects returned by QAS based on rule-based chase algorithm" (pessimistic interpretation)
A6 - "the no. of bad objects retrieved"
A7 - "the no. of passes of rule-based chase algorithm"

[Table: results for queries q1, q2, q3 with columns A1 to A7; the cell values cannot be recovered from this transcript.]

Page 19: Chase Methods  based on Knowledge Discovery

Rules Discovery from partially Incomplete Information Systems

Page 20: Chase Methods  based on Knowledge Discovery

Data (Incomplete)

Information System S = (X, A, V)

X - finite set of objects,
A - finite set of attributes,
V = ∪{V_a : a ∈ A} - set of their values.

Assumption

1. For any a ∈ A, x ∈ X:
   a(x) = {(a_i, p_i) : i ∈ J_a(x)}, where a_i ∈ V_a and Σ{p_i : i ∈ J_a(x)} = 1
2. For any a ≠ b: V_a ∩ V_b = ∅

Page 21: Chase Methods  based on Knowledge Discovery

Example

X    a                   b                   c                   d    e
x1   (a1,1/3),(a2,2/3)   (b1,2/3),(b2,1/3)   c1                  d1   (e1,1/2),(e2,1/2)
x2   (a2,1/4),(a3,3/4)   (b1,1/3),(b2,2/3)   -                   d2   e1
x3   a1                  b2                  (c1,1/2),(c3,1/2)   d2   e3
x4   a3                  -                   c2                  d1   (e1,2/3),(e2,1/3)
x5   (a1,2/3),(a2,1/3)   b1                  c2                  -    e1
x6   a2                  b2                  c3                  d2   (e2,1/3),(e3,2/3)
x7   a2                  (b1,1/4),(b2,3/4)   (c1,1/3),(c2,2/3)   d2   e2
x8   a3                  b2                  c1                  d1   e3

Page 22: Chase Methods  based on Knowledge Discovery

Algorithm ERID for Extracting Rules from partially Incomplete Information System

Goal: Describe e in terms of {a,b,c,d}

a1* = {(x1,1/3), (x3,1), (x5,2/3)}
a2* = {(x1,2/3), (x2,1/4), (x5,1/3), (x6,1), (x7,1)}
a3* = {(x2,3/4), (x4,1), (x8,1)}
b1* = {(x1,2/3), (x2,1/3), (x4,1/2), (x5,1), (x7,1/4)}
b2* = {(x1,1/3), (x2,2/3), (x3,1), (x4,1/2), (x6,1), (x7,3/4), (x8,1)}

Page 23: Chase Methods  based on Knowledge Discovery

Algorithm ERID for Extracting Rules from partially Incomplete Information System

Goal: Describe e in terms of {a,b,c,d}

c1* = {(x1,1), (x2,1/3), (x3,1/2), (x7,1/3), (x8,1)}
c2* = {(x2,1/3), (x4,1), (x5,1), (x7,2/3)}
c3* = {(x2,1/3), (x3,1/2), (x6,1)}
d1* = {(x1,1), (x4,1), (x5,1/2), (x8,1)}
d2* = {(x2,1), (x3,1), (x5,1/2), (x6,1), (x7,1)}

Page 24: Chase Methods  based on Knowledge Discovery

Algorithm ERID

Goal: Describe e in terms of {a,b,c,d}

For the values of the decision attribute we have:

e1* = {(x1,1/2), (x2,1), (x4,2/3), (x5,1)}
e2* = {(x1,1/2), (x4,1/3), (x6,1/3), (x7,1)}
e3* = {(x3,1), (x6,2/3), (x8,1)}

Page 25: Chase Methods  based on Knowledge Discovery

Algorithm ERID

Goal: Describe e in terms of {a,b,c,d}.

2. Check the relationship between values of classification attributes {a,b,c,d} and values of decision attribute e.

Page 26: Chase Methods  based on Knowledge Discovery

Algorithm ERID

Goal: Describe e in terms of {a,b,c,d}

Let c_i* = {(x_i, p_i)}_{i∈N} and e_j* = {(y_j, q_j)}_{j∈N}.

We say that the relation between c_i* and e_j* holds iff support and confidence of the rule c_i → e_j are above some threshold values.

Page 27: Chase Methods  based on Knowledge Discovery

Algorithm ERID

How to define support and confidence of a rule c_i → e_j?

Page 28: Chase Methods  based on Knowledge Discovery

Definition of Support and Confidence (by example)

To define support and confidence of the rule a1 → e3 we compute:

a1* = {(x1,1/3), (x3,1), (x5,2/3)}
e3* = {(x3,1), (x6,2/3), (x8,1)}

Support of the rule: sup(a1 → e3) = 1/3·0 + 1·1 + 2/3·0 = 1

Support of the term a1: sup(a1) = 1/3 + 1 + 2/3 = 2

Confidence of the rule: conf(a1 → e3) = sup(a1 → e3) / sup(a1)
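The definition by example translates directly into a few lines of Python. The function names `sup_term`, `sup_rule` and `conf_rule` are introduced here only for illustration; the weighted sets a1* and e3* are the ones used in the example, stored as dictionaries from objects to weights.

```python
from fractions import Fraction as F

# Weighted sets from the example: object -> weight
a1_star = {"x1": F(1, 3), "x3": F(1), "x5": F(2, 3)}
e3_star = {"x3": F(1), "x6": F(2, 3), "x8": F(1)}


def sup_term(term):
    """Support of a term: sum of the weights of its objects."""
    return sum(term.values())


def sup_rule(term, decision):
    """Support of term -> decision: sum of products of weights over shared objects."""
    return sum(p * decision.get(x, 0) for x, p in term.items())


def conf_rule(term, decision):
    """Confidence of term -> decision: rule support divided by term support."""
    return sup_rule(term, decision) / sup_term(term)


rule_support = sup_rule(a1_star, e3_star)      # 1/3*0 + 1*1 + 2/3*0
term_support = sup_term(a1_star)               # 1/3 + 1 + 2/3
rule_confidence = conf_rule(a1_star, e3_star)
```

Exact fractions are used instead of floats so that comparisons with thresholds such as 1 and 1/2 are not affected by rounding.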

Page 29: Chase Methods  based on Knowledge Discovery

Extracting Rules from partially Incomplete Information System (Algorithm ERID(λ1, λ2))

Goal: Describe e in terms of {a,b,c,d}

Thresholds (provided by user):
Minimal support (λ1 = 1)
Minimal confidence (λ2 = ½)

a1 → e1 (sup = 5/6 < 1) - marked negative
a1 → e2 (sup = 1/6 < 1) - marked negative
a1 → e3 (sup = 1 ≥ 1, conf = 0.5) - marked positive

Page 30: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

a3 → e3: sup = 1 ≥ 1 but conf = 0.36
b2 → e1: sup = 7/6 ≥ 1 but conf = 0.22
b2 → e2: sup = 17/12 ≥ 1 but conf = 0.27
c1 → e3: sup = 3/2 ≥ 1 but conf = 0.47
c2 → e2: sup = 1 ≥ 1 but conf = 0.33
d1 → e1: sup = 5/3 ≥ 1 but conf = 0.48
d1 → e3: sup = 1 ≥ 1 but conf = 0.28
d2 → e1: sup = 3/2 ≥ 1 but conf = 0.33
d2 → e3: sup = 5/3 ≥ 1 but conf = 0.37

Page 31: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

a3 → e3, b2 → e1, b2 → e2, c1 → e3, c2 → e2, d1 → e1, d1 → e3, d2 → e1, d2 → e3: the support reaches λ1 but the confidence stays below λ2.

They all are not marked.

Page 32: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

a3*c1 → e3: sup = 1 ≥ 1 and conf = 0.8
a3*d1 → e3: sup = 1 ≥ 1 and conf = 0.5
a3*d2 → e3: sup = 0 < 1
b2*d2 → e1: sup = 2/3 < 1
b2*c2 → e2: sup = 1/2 < 1

Page 33: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

a3*c1 → e3 (sup = 1 ≥ 1 and conf = 0.8), a3*d1 → e3 (sup = 1 ≥ 1 and conf = 0.5):

They all are marked positive.

Page 34: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

a3*d2 → e3 (sup = 0 < 1), b2*d2 → e1 (sup = 2/3 < 1), b2*c2 → e2 (sup = 1/2 < 1):

They all are marked negative.

Page 35: Chase Methods  based on Knowledge Discovery

Algorithm ERID(λ1, λ2)

The algorithm continues for terms of length 3, 4, … till all of them have either positive or negative marks.

Rules are automatically constructed from relations marked positive.
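The marking step for terms of length 1 can be sketched in Python on the eight-object example. The cell encoding, the helper names `cell`, `star` and `mark`, and the treatment of a null cell as a uniform distribution over the attribute's values are assumptions of this sketch (the uniform reading matches entries such as (x4,1/2) in b1*), not a statement of how ERID is implemented.

```python
from fractions import Fraction as F

DOMAIN = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1", "c2", "c3"],
          "d": ["d1", "d2"], "e": ["e1", "e2", "e3"]}
ROWS = {  # columns a, b, c, d, e; "" is a null cell
    "x1": ("a1:1/3 a2:2/3", "b1:2/3 b2:1/3", "c1", "d1", "e1:1/2 e2:1/2"),
    "x2": ("a2:1/4 a3:3/4", "b1:1/3 b2:2/3", "", "d2", "e1"),
    "x3": ("a1", "b2", "c1:1/2 c3:1/2", "d2", "e3"),
    "x4": ("a3", "", "c2", "d1", "e1:2/3 e2:1/3"),
    "x5": ("a1:2/3 a2:1/3", "b1", "c2", "", "e1"),
    "x6": ("a2", "b2", "c3", "d2", "e2:1/3 e3:2/3"),
    "x7": ("a2", "b1:1/4 b2:3/4", "c1:1/3 c2:2/3", "d2", "e2"),
    "x8": ("a3", "b2", "c1", "d1", "e3"),
}


def cell(text, attr):
    """Parse a cell into {value: weight}; a null cell is spread uniformly (assumption)."""
    if not text:
        return {v: F(1, len(DOMAIN[attr])) for v in DOMAIN[attr]}
    parsed = {}
    for part in text.split():
        value, _, weight = part.partition(":")
        parsed[value] = F(weight) if weight else F(1)
    return parsed


S = {x: {attr: cell(t, attr) for attr, t in zip("abcde", row)} for x, row in ROWS.items()}


def star(value):
    """Weighted set of objects having the given attribute value, e.g. star('a1')."""
    attr = value[0]
    return {x: row[attr][value] for x, row in S.items() if value in row[attr]}


def mark(term, decision, min_sup=F(1), min_conf=F(1, 2)):
    """Return 'negative', 'positive' or 'unmarked' for the rule term -> decision."""
    t, d = star(term), star(decision)
    sup = sum(p * d.get(x, 0) for x, p in t.items())
    if sup < min_sup:
        return "negative"
    return "positive" if sup / sum(t.values()) >= min_conf else "unmarked"


marks = {(t, e): mark(t, e) for t in ("a1", "a3", "b2") for e in ("e1", "e2", "e3")}
```

Unmarked terms are the ones that would be extended to length 2 in the next pass; negative ones are dropped and positive ones become rules.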

Page 36: Chase Methods  based on Knowledge Discovery

Algorithm Chase 2 (for Partially IIS)

Page 37: Chase Methods  based on Knowledge Discovery

Algorithm Chase 2

S = (X, A, V) - partially incomplete information system of type λ, if S is incomplete and the following three conditions hold:

- a_S(x) is defined for any x ∈ X, a ∈ A
- (∀x ∈ X)(∀a ∈ A)[(a_S(x) = {(a_i, p_i) : 1 ≤ i ≤ m}) → Σ{p_i : 1 ≤ i ≤ m} = 1]
- (∀x ∈ X)(∀a ∈ A)[(a_S(x) = {(a_i, p_i) : 1 ≤ i ≤ m}) → (∀i)(p_i ≥ λ)]

Page 38: Chase Methods  based on Knowledge Discovery

Algorithm Chase 2

S1, S2 - partially incomplete, both of type λ and both classifying the same sets of objects (from X) using the same sets of attributes (A)

Let a_S1(x) = {(a_1i, p_1i) : 1 ≤ i ≤ m1} and a_S2(x) = {(a_2i, p_2i) : 1 ≤ i ≤ m2}.

The pair (S1, S2) satisfies containment relation Ψ (or Ψ(S1) = S2) if:

- (∀x ∈ X)(∀a ∈ A)[card(a_S1(x)) ≥ card(a_S2(x))]
- (∀x ∈ X)(∀a ∈ A)[[card(a_S1(x)) = card(a_S2(x))] → [Σ{|p_2i − p_2j| : i ≠ j} > Σ{|p_1i − p_1j| : i ≠ j}]]

We also denote that fact by (∀x ∈ X)(∀a ∈ A)[Ψ(a_S1(x)) = a_S2(x)]

Page 39: Chase Methods  based on Knowledge Discovery

System S1

X    a                   b                   c                   d    e
x1   (a1,1/3),(a2,2/3)   (b1,2/3),(b2,1/3)   c1                  d1   (e1,1/2),(e2,1/2)
x2   (a2,1/4),(a3,3/4)   (b1,1/3),(b2,2/3)   -                   d2   e1
x3   a1                  b2                  (c1,1/2),(c3,1/2)   d2   e3
x4   a3                  -                   c2                  d1   (e1,2/3),(e2,1/3)
x5   (a1,2/3),(a2,1/3)   b1                  c2                  -    e1
x6   a2                  b2                  c3                  d2   (e2,1/3),(e3,2/3)
x7   a2                  (b1,1/4),(b2,3/4)   (c1,1/3),(c2,2/3)   d2   e2
x8   a3                  b2                  c1                  d1   e3

System S2

x8

c2x7

x6

x5

x4

x3

x2

x1

edcbaX

),( 32

2a

1e

1c1d

),( 21

1e),( 2

12e

),( 31

1a ),( 32

1b),( 3

12b

),( 41

2a),( 4

33a 2d

1a 2b ),( 31

1c),( 3

23c 2d

3e

3a 2c1d

),( 32

1a),( 3

12a 1b 2c

1e

2a 2b 3c 2d),( 3

12e

),( 32

3e

2a ),( 41

1b),( 4

32b 2d 2e

2b 1c1d 3e),( 3

21a

),( 31

2a

2b),( 3

11c

),( 32

2c

2e

Page 40: Chase Methods  based on Knowledge Discovery

Assumptions:

S = (X, A, V) - information system of type λ
L(D) = {(t → v_c) ∈ D : c ∈ In(A)} - set of all pairwise independent rules extracted by ERID from S
In(A) = {a1, … , ak} - incomplete attributes in S

N_S(t) - standard interpretation of term t in S, meaning that:

N_S(v) = {(x, p) : (v, p) ∈ a(x)}, for any v ∈ V_a
N_S(t1 + t2) = N_S(t1) ⊕ N_S(t2)
N_S(t1 * t2) = N_S(t1) ⊗ N_S(t2)

where for any N_S(t1) = {(x_i, p_i)}_{i∈I}, N_S(t2) = {(x_j, q_j)}_{j∈J} we have:

N_S(t1) ⊕ N_S(t2) = {(x_j, q_j)}_{j∈J\I} ∪ {(x_i, p_i)}_{i∈I\J} ∪ {(x_i, max(p_i, q_i))}_{i∈I∩J}
N_S(t1) ⊗ N_S(t2) = {(x_i, p_i · q_i)}_{i∈I∩J}
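The two operations on interpretations can be written as small dictionary functions. The names `n_or` and `n_and` are hypothetical, and the sample sets are c1* and d1* from the ERID example, so the product corresponds to the term c1*d1.

```python
from fractions import Fraction as F


def n_or(n1, n2):
    """Interpretation of t1 + t2: union of objects, max of weights on shared objects."""
    return {x: max(n1.get(x, 0), n2.get(x, 0)) for x in n1.keys() | n2.keys()}


def n_and(n1, n2):
    """Interpretation of t1 * t2: shared objects only, weights multiplied."""
    return {x: n1[x] * n2[x] for x in n1.keys() & n2.keys()}


c1 = {"x1": F(1), "x2": F(1, 3), "x3": F(1, 2), "x7": F(1, 3), "x8": F(1)}
d1 = {"x1": F(1), "x4": F(1), "x5": F(1, 2), "x8": F(1)}

both = n_and(c1, d1)    # objects described by c1*d1
either = n_or(c1, d1)   # objects described by c1 + d1
```

Here `both` contains only x1 and x8, each with weight 1, which is consistent with a support of 2 for the term c1*d1.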

Page 41: Chase Methods  based on Knowledge Discovery

Algorithm Chase 2 (S, In(A), L(D))

..................
b_j(x) := ∅; n_j := 0;
for all v ∈ V_aj do
begin
  if card(a_j(x)) ≠ 1 and {(t_i → v) : i ∈ I} is a maximal subset of rules from L(D)
     such that (x, p_i) ∈ N_Sj(t_i)
  then if Σ{p_i · conf(t_i → v) · sup(t_i → v) : i ∈ I} ≥ λ then
  begin
    b_j(x) := b_j(x) ∪ {(v, Σ{p_i · conf(t_i → v) · sup(t_i → v) : i ∈ I})};
    n_j := n_j + Σ{p_i · conf(t_i → v) · sup(t_i → v) : i ∈ I}
  end
end
p_j := p_j + n_j;
end
if Ψ(a_j(x)) = [b_j(x)/p_j] /containment relation holds between a_j(x), [b_j(x)/p_j]/
then a_j(x) := [b_j(x)/p_j]
..................

Page 42: Chase Methods  based on Knowledge Discovery

Incomplete Information System S of type λ = 0.3

ERID(λ1, λ2) with λ1 = 1, λ2 = 0.5

r1 = [a1 → e3], sup(r1) = 1, conf(r1) = 0.5
r2 = [a2 → e2], sup(r2) = 5/3, conf(r2) = 0.51
r4 = [b1 → e1], sup(r4) = 2, conf(r4) = 0.72
r5 = [b2 → e3], sup(r5) = 8/3, conf(r5) = 0.51
r10 = [c1*d1 → e3], sup(r10) = 1, conf(r10) = 0.5

Page 43: Chase Methods  based on Knowledge Discovery

Incomplete Information System S of type λ = 0.3

ERID(λ1, λ2) with λ1 = 1, λ2 = 0.5

Algorithm Chase 2 will try to replace

e(x1) = {(e1, 1/2), (e2, 1/2)}

by enew(x1) = {(e1, ), (e2, ), (e3, )}.

We will show that Ψ(e(x1)) = enew(x1) (the value e(x1) will be changed by Chase 2).

Page 44: Chase Methods  based on Knowledge Discovery


For x1:

)24.0,(

))3/14/()111(,(

3

21

10051

38

31

21

31

3

e

e

)97.0,())3/5/()(,( 2100

5135

32

2 ee

)48.0,()2/)2(,( 110072

32

1 ee

we have: e(x1) = {(e1, 0.48), (e2, 0.97), (e3, 0.24)}

Because the confidence assigned to e3 is below the threshold λ, only two values remain: (e1, 0.48), (e2, 0.97).

The value of attribute e assigned to x1 is: {(e1, 0.33), (e2, 0.67)}.

Incomplete Information System S of type λ = 0.3
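The last step of the example (drop the values whose weight is below λ, then rescale the rest so they sum to 1) can be checked with a short Python sketch. The function name `prune_and_normalize` and the rounding to two decimals are assumptions made for illustration; the input weights are the ones stated on the slide.

```python
def prune_and_normalize(weights, lam):
    """Keep values with weight >= lam and rescale the remaining weights to sum to 1."""
    kept = {value: w for value, w in weights.items() if w >= lam}
    total = sum(kept.values())
    return {value: round(w / total, 2) for value, w in kept.items()}


# Weights computed for x1 in the example, with the type of the system lambda = 0.3
e_x1 = prune_and_normalize({"e1": 0.48, "e2": 0.97, "e3": 0.24}, lam=0.3)
```

The sketch assumes that at least one value survives the threshold; otherwise `total` would be zero and the original value of the attribute would have to be kept.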

Page 45: Chase Methods  based on Knowledge Discovery

Distributed Chase Algorithm (Chase 3)

Page 46: Chase Methods  based on Knowledge Discovery

Let:

DS = ({S_i}_{i∈I}, L) - distributed autonomous information systems
S_i = (X_i, A_i, V_i) - information system for any i ∈ I (I - set of sites)
D_i = ∪{D_{k,i} : k ∈ I} - knowledge-base at site i ∈ I
D_{k,i} - set of (k, i)-rules (constructed at site k and sent to site i)

Page 47: Chase Methods  based on Knowledge Discovery

Strategy for constructing knowledge-base D_i = ∪{D_{k,i} : k ∈ I} and Algorithm Chase 3

Notation:

q_S = [a, b, c : d, e] - request by S for definitions of a, b, c with additional information that d, e are complete attributes in S.

Page 48: Chase Methods  based on Knowledge Discovery

Global Ontology

g a b c

g1 b2

g1 a2 b1 c2

g1 a2 c1

g1 a1 b1 c1

S2 b a d e

a1 d2

b2 a2 d2 e2

b1 a2 d1 e1

d1

S1

a b c d

a1 b2

b1 c2

a2 b2 d2

a2 b1 c1

rule support systemS KB

KBS

r1r2

qS=[a, c, d : b]qS

Page 49: Chase Methods  based on Knowledge Discovery

Global Ontology

g a b c

g1 b2

g1 a2 b1 c2

g1 a2 c1

g1 a1 b1 c1

S2 b a d e

a1 d2

b2 a2 d2 e2

b1 a2 d1 e1

d1

S1

a b c d

a1 b2

b1 c2

a2 b2 d2

a2 b1 c1

rule           support   system
b1 → a2        1         S
b2*d2 → a2     1         S
b2 → a2        1         S1
c1*b1 → a1     1         S2

S

r1

r2

KBS

r1r2

qS=[a, c, d : b]qS

Page 50: Chase Methods  based on Knowledge Discovery

Assumption:

S_i = (X_i, A_i, V_i), D_i - the granularity level of values of attributes used in rules from Di may differ from the granularity level of values of attributes used in descriptions of objects in Si.

To be applicable to Si, the Chase 3 algorithm has to be based on rules from Di satisfying the following two conditions:

- the attribute value used in the decision part of a rule has a granularity level either equal to or finer than the granularity level of the corresponding attribute in Si,
- the granularity level of any attribute used in the classification part of a rule is either equal to or softer (coarser) than the granularity level of the corresponding attribute in Si.

Page 51: Chase Methods  based on Knowledge Discovery

Example

Hierarchical attributes: age, salary
Rule in Di: (age, young) → (salary, 40k)

age
  young: 18 … 29
  middle-aged: 30 … 60
  old: 61 … 80

salary
  low: 10k … 40k
  medium: 50k, 60k, 70k
  high: 80k … 100k

Page 52: Chase Methods  based on Knowledge Discovery

Algorithm Chase 3 (Construction of new Di followed by Chase 2)

Assumption: tuple t in Si supports a rule s → d from Di.

Two cases:

1. An overlapping attribute between the rule and the tuple is the decision attribute in s → d. If the two attributes involved in that match have different granularities, then the decision value d has to be replaced by a softer value whose granularity matches the granularity of the corresponding attribute in Si.
2. An overlapping attribute between the rule and the tuple is a classification attribute in s → d. If the two attributes involved in that match have different granularities, then the value of attribute a has to be replaced by a finer value whose granularity matches the granularity of a in Si.
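The first case (replacing a decision value by a softer one) can be illustrated in Python with the age and salary hierarchies of the example. The dictionaries `AGE` and `SALARY_K`, the function `soften` and the assignment of the listed values to the labels low, medium and high are assumptions read off the example slide, not a definition taken from Chase 3 itself.

```python
# Coarse label -> fine values it covers (salary in thousands); mapping assumed from the example
AGE = {"young": range(18, 30), "middle-aged": range(30, 61), "old": range(61, 81)}
SALARY_K = {"low": range(10, 41), "medium": (50, 60, 70), "high": range(80, 101)}


def soften(hierarchy, fine_value):
    """Return the coarser label that covers fine_value, or None if no label does."""
    for label, members in hierarchy.items():
        if fine_value in members:
            return label
    return None


# Rule from the example: (age, young) -> (salary, 40k)
rule = (("age", "young"), ("salary", 40))

# If the local system stores salary only as low / medium / high,
# the decision value 40k is replaced by the softer value that covers it.
decision_attr, decision_value = rule[1]
softened_rule = (rule[0], (decision_attr, soften(SALARY_K, decision_value)))
```

The second case goes the other way (a coarse classification value is replaced by finer ones), which in this layout would mean expanding a label such as young into the values it covers.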

Page 53: Chase Methods  based on Knowledge Discovery

Chase 4 (All Information Systems are equally involved in chase)

Page 54: Chase Methods  based on Knowledge Discovery

cbag edabS3 S2

KB KB

qS2

qS3

dcbaS1

KB

qS2

qS1qS1

qS3

qS1=[a, c, d : b]

qS2=[b, a, e : d]

qS3=[a, b, c : g]

Page 55: Chase Methods  based on Knowledge Discovery

cbag edabS3 S2

r1, r2

r5, r6

KB

r5, r6

r3, r4

KB

qS2

qS3

r3, r4 – extracted from S3 r1, r2 – extracted from S2

dcbaS1

r1, r2

r3, r4

KB

r5, r6 – extracted from S1

qS2

qS1qS1

qS3

qS1=[a, c, d : b]

qS2=[b, a, e : d]

qS3=[a, b, c : g]
