Learning Markov Logic Networks Using Structural Motifs
Post on 23-Feb-2016
1
Learning Markov Logic Networks Using Structural Motifs
Stanley Kok
Dept. of Computer Science and Eng.
University of Washington
Seattle, USA
Joint work with Pedro Domingos
2
Outline
Background
Learning Using Structural Motifs
Experiments
Future Work
3
Markov Logic Networks [Richardson & Domingos, MLJ’06]
A logical KB is a set of hard constraints on the set of possible worlds
Let’s make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
Give each formula a weight (higher weight ⇒ stronger constraint)
2.7  Teaches(p,c) ⇒ Professor(p)
4
Markov Logic
A Markov logic network (MLN) is a set of pairs (F, w)
  F is a formula in first-order logic
  w is a real number
$$P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big)$$
  x: vector of truth assignments to ground atoms
  Z: partition function
  w_i: weight of ith formula
  n_i(x): #true groundings of ith formula
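As a rough illustration of the weighted-formula distribution above, the sketch below enumerates every possible world of a tiny, hypothetical domain (one person P, one course C) and scores it with the single formula Teaches(p,c) ⇒ Professor(p) at weight 2.7. The domain, the atom names, and the helper names are assumptions made for this example only.

```python
import itertools
import math

# Hypothetical toy domain with a single person P and a single course C,
# giving two ground atoms.
ATOMS = ["Teaches(P,C)", "Professor(P)"]


def n_true_groundings(world):
    """Number of true groundings of Teaches(p,c) => Professor(p) in `world`."""
    # Only one grounding exists here, so the count is 0 or 1.
    holds = (not world["Teaches(P,C)"]) or world["Professor(P)"]
    return 1 if holds else 0


def mln_distribution(weight):
    """Return [(world, probability)] for the one-formula MLN."""
    worlds = [
        dict(zip(ATOMS, values))
        for values in itertools.product([False, True], repeat=len(ATOMS))
    ]
    # Unnormalized score exp(w * n(x)) for each world x.
    scores = [math.exp(weight * n_true_groundings(w)) for w in worlds]
    z = sum(scores)  # partition function Z
    return [(w, s / z) for w, s in zip(worlds, scores)]


for world, prob in mln_distribution(2.7):
    print(world, round(prob, 4))
```

The world that violates the formula (Teaches true, Professor false) still gets nonzero probability, only a smaller one than the other three worlds, which is the "soft constraint" behavior described earlier.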
5
MLN Structure Learning
Input: Relational Data
Advises
Pete Sam
Pete Saul
Paul Sara
… …
TAs
Sam CS1
Sam CS2
Sara CS1
… …
Teaches
Pete CS1
Pete CS2
Paul CS2
… …
MLN Structure Learner
Output: MLN
2.7  Teaches(p, c) ∧ TAs(s, c) ⇒ Advises(p, s)
1.4  Advises(p, s) ∧ Teaches(p, c) ⇒ TAs(s, c)
1.1  ¬TAs(s, c) ∨ ¬Advises(s, p)
…
6
Previous Systems
Generate-and-test or greedy
  MSL [Kok & Domingos, ICML’05]
  BUSL [Mihalkova & Mooney, ICML’07]
Computationally expensive; large search space
Susceptible to local maxima
7
LHL System [Kok & Domingos, ICML’09]
[Figure: the relational tables (Advises, TAs, Teaches) drawn as a graph over the constants Pete, Paul, Pat, Phil; Sam, Sara, Saul, Sue; and CS1 to CS8, then ‘lifted’ into a graph over the clusters Professor, Student, and Course with edges Teaches, TAs, and Advises]
‘Lifts’
Trace paths & convert paths to first-order clauses
Learning Using Structural Motifs (LSM)
First MLN structure learner that can learn long clauses
  Capture more complex dependencies
  Explore a larger space of clauses
[Figure: graph of students, professors, and courses (Student1 to Student13, Prof1 to Prof8, Course1 to Course13)]
10
LHL Recap
[Figure: graph of students (S1 to S41), professors (P1 to P11), and courses (C1 to C16) connected by Teaches, Advises, and TAs edges]
11
Repeated Patterns
[Figure: graph of students, professors, and courses (Student1 to Student13, Prof1 to Prof8, Course1 to Course13) in which the same pattern recurs]
12
Repeated Patterns
[Figure: the repeated pattern abstracted to the nodes Course, Student, and Prof, linked by Teaches, Advises, and TAs]
{ Teaches(p,c), TAs(s,c), Advises(p,s) }
Finds literals that are densely connected
  Random walks & hitting times
Groups literals into structural motifs
Cluster nodes into high-level concepts
  Symmetrical paths & nodes
13
Learning Using Structural Motifs (LSM)
Structural motif = a set of literals → a set of clauses = a subspace of clauses
TAs(s,c)
¬Teaches(p,c) ∨ TAs(s,c) ∨ Advises(p,s)
Teaches(p,c) ∨ ¬TAs(s,c)
…
14
LSM’s Three Components
Input: Relational DB
Advises
Pete Sam
Pete Saul
Paul Sara
… …
TAs
Sam CS1
Sam CS2
Sara CS1
… …
Teaches
Pete CS1
Pete CS2
Paul CS2
… …
LSM: IdentifyMotifs → FindPaths → CreateMLN
Output: MLN
2.7  Teaches(p, c) ∧ TAs(s, c) ⇒ Advises(p, s)
1.4  Advises(p, s) ∧ Teaches(p, c) ⇒ TAs(s, c)
-1.1  TAs(s, c) ⇒ Advises(s, p)
…
15
Random Walk
  Begin at node A
  Randomly pick neighbor n
  Move to node n
  Repeat
[Figure: graph over nodes A to F; the walker starts at A and moves to a randomly chosen neighbor at each step]
18
Hitting Time from node i to j
Expected number of steps starting from node i before node j is visited for first time
  Smaller hitting time → closer to start node i
Truncated Hitting Time [Sarkar & Moore, UAI’07]
  Random walks are limited to T steps
  Computed efficiently & with high probability by sampling random walks [Sarkar, Moore & Prakash, ICML’08]
19
Finding Truncated Hitting Time By Sampling
[Figure: graph over nodes A to F; a random walk starting at A is sampled one step per slide]
T=5
Sampled walk: A D E D F E
hAA=0  hAD=1  hAE=2  hAF=4  hAB=5  hAC=5
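The sampling procedure on these slides can be sketched as follows. A node's hitting time in one walk is the step at which it is first visited, and a node never reached within T steps is assigned T; averaging over many walks gives the estimate. This is a minimal sketch, not the authors' implementation: the adjacency list is hypothetical (only the edges A-D, D-E, D-F, and E-F are implied by the slide's walk), and the function names are made up for this example.

```python
import random

# Hypothetical undirected graph. Only the edges used by the slide's walk
# (A-D, D-E, D-F, E-F) are implied by the source; the rest are assumed.
GRAPH = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}


def sample_walk(graph, start, T, rng):
    """Sample a random walk of exactly T steps starting at `start`."""
    walk = [start]
    for _ in range(T):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk


def first_hit_times(walk, nodes, T):
    """Step of first visit for each node; nodes not visited get T."""
    first = {}
    for step, node in enumerate(walk[: T + 1]):
        first.setdefault(node, step)  # keeps only the first visit
    return {n: first.get(n, T) for n in nodes}


def estimate_hitting_times(graph, start, T, num_walks, seed=0):
    """Average the per-walk first-hit times over `num_walks` samples."""
    rng = random.Random(seed)
    totals = {n: 0 for n in graph}
    for _ in range(num_walks):
        hits = first_hit_times(sample_walk(graph, start, T, rng), graph, T)
        for n, h in hits.items():
            totals[n] += h
    return {n: totals[n] / num_walks for n in graph}


# The single walk from the slide reproduces the values shown there.
print(first_hit_times(list("ADEDFE"), "ABCDEF", 5))
```

Applied to the slide's walk A D E D F E with T=5, `first_hit_times` gives 0 for A, 1 for D, 2 for E, 4 for F, and 5 for the unvisited nodes B and C.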
26
Symmetrical Paths
[Figure: two departments, Physics (P1, S1 to S8, C1, C2) and History (P2, S9 to S13, C3, C4), with Teaches, Advises, and TAs edges]
P1→S2:  P1, Advises, S2  →  0, Advises, 1
P1→S3:  P1, Advises, S3  →  0, Advises, 1
Symmetrical
27
Symmetrical Paths
[Figure: Physics and History departments graph]
P1→S2:
  P1, Advises, S2  →  0, Advises, 1
  P1, Advises, S1, TAs, C1, TAs, S2  →  0, Advises, 1, TAs, 2, TAs, 3
P1→S3:
  P1, Advises, S3  →  0, Advises, 1
  P1, Advises, S4, TAs, C1, TAs, S3  →  0, Advises, 1, TAs, 2, TAs, 3
28
Symmetrical Nodes
[Figure: Physics and History departments graph]
P1→S2:
  P1, Advises, S2  →  0, Advises, 1
  P1, Advises, S1, TAs, C1, TAs, S2  →  0, Advises, 1, TAs, 2, TAs, 3
  …
P1→S3:
  P1, Advises, S3  →  0, Advises, 1
  P1, Advises, S4, TAs, C1, TAs, S3  →  0, Advises, 1, TAs, 2, TAs, 3
  …
Symmetrical
Sym. nodes have identical truncated hitting times
Sym. nodes have identical path distributions in a sample of random walks
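The rewriting of concrete paths such as "P1, Advises, S1, TAs, C1, TAs, S2" into "0, Advises, 1, TAs, 2, TAs, 3" can be sketched as below: each node is replaced by the order in which it first appears on the path, so two paths are symmetrical when their rewritten forms are equal. The function name and the flat list representation (alternating node, edge label, node, ...) are assumptions for this example.

```python
def canonical_path(path):
    """Replace each node on a path by the index of its first appearance.

    `path` alternates node, edge label, node, ...; edge labels are kept.
    """
    index = {}
    out = []
    for pos, item in enumerate(path):
        if pos % 2 == 0:
            # Node position: assign the next unused index on first sight.
            out.append(index.setdefault(item, len(index)))
        else:
            # Edge-label position: keep the relation name unchanged.
            out.append(item)
    return tuple(out)


to_s2 = canonical_path(["P1", "Advises", "S1", "TAs", "C1", "TAs", "S2"])
to_s3 = canonical_path(["P1", "Advises", "S4", "TAs", "C1", "TAs", "S3"])
print(to_s2)
print(to_s2 == to_s3)
```

Both of the slide's long paths map to the same tuple, which is what makes the P1→S2 and P1→S3 paths symmetrical.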
29
Learning Using Structural Motifs
Input: Relational DB → LSM (IdentifyMotifs → FindPaths → CreateMLN) → Output: MLN
30
Sample Random Walks
[Figure: Physics and History graph (P1, P2, S1 to S13, C1 to C4; edges Teaches, Advises, TAs); sampled random walks are recorded as paths]
0, Advises, 1, TAs, 2
…
31
Estimate Truncated Hitting Times
[Figure: Physics and History graph with each node annotated by its estimated truncated hitting time; values shown: 0 (start node), 3.2, 3.21, 3.52, 3.55, 3.93, 3.99, 4]
32
Prune ‘Faraway’ Nodes
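[Figure: the graph annotated with hitting times; nodes with the largest hitting times (3.99 and 4, listed with the History department: S9 to S13, P2, C3, C4) are pruned, and nodes with hitting times from 0 to 3.93 (Physics: S1 to S8, P1, C1, C2) remain]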
33
Group Nodes with Similar Hitting Times
[Figure: remaining nodes S1 to S8, P1, C1, C2 with hitting times 0, 3.2, 3.21, 3.52 (four nodes), and 3.55 (four nodes)]
Candidate symmetrical nodes: nodes that share a hitting time
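Grouping nodes by similar hitting time can be sketched as a sort followed by a sweep that starts a new group whenever the gap to the previous value exceeds a tolerance. This is an illustrative sketch only: the tolerance value and the assignment of the slide's hitting times to particular nodes are hypothetical, since the extracted slide does not preserve which node carries which value.

```python
def group_by_hitting_time(times, tol):
    """Group nodes whose sorted hitting times differ by at most `tol`.

    Each value is compared with the previous value in sorted order, so a
    group can chain through several close values (single-link grouping).
    """
    groups = []
    previous = None
    for node, t in sorted(times.items(), key=lambda kv: kv[1]):
        if previous is not None and t - previous <= tol:
            groups[-1].append(node)
        else:
            groups.append([node])
        previous = t
    return groups


# Hypothetical node-to-value assignment using the values on the slide.
TIMES = {
    "P1": 0.0, "C1": 3.2, "C2": 3.21,
    "S1": 3.52, "S2": 3.52, "S3": 3.52, "S4": 3.52,
    "S5": 3.55, "S6": 3.55, "S7": 3.55, "S8": 3.55,
}

for group in group_by_hitting_time(TIMES, tol=0.005):
    print(group)
```

Each resulting group is only a set of candidate symmetrical nodes; the path-distribution comparison on the next slide is what decides which candidates are actually clustered together.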
34
Cluster Nodes
[Figure: nodes S1 to S8, P1, C1, C2]
Cluster nodes with similar path distributions
Path                 Probability
0,Advises,1          0.5
…                    …
0,Advises,2,…,1      0.1
35
Create ‘Lifted’ Graph
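[Figure: ‘lifted’ graph in which the constants S1 to S8, C1, C2, and P1 are clustered into the nodes Professor, Course, and Student, connected by Teaches, Advises, and TAs edges]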
36
Extract Motif with DFS
[Figure: the ‘lifted’ graph over Professor, Course, and Student (edges Teaches, Advises, TAs), from which a motif is extracted by depth-first search]
37
Create Motif
[Figure: subgraph over S1, C1, and P1 with edges Teaches, Advises, and TAs]
True grounding of motif: { Teaches(P1,C1), TAs(S1,C1), Advises(P1,S1) }
Motif: { Teaches(p,c), TAs(s,c), Advises(p,s) }
38
Restart from Next Node
[Figure: Physics and History graph with node C2 highlighted]
39
Restart from Next Node
[Figure: Physics graph (S1 to S8, P1, C1, C2) with C2 highlighted, lifted to the nodes Student1, Professor, Student, Course1, and Course2]
Different motif over same set of constants
40
Select Motifs
Choose motifs with large #true groundings
Motif                                         Est. #True Gndings
{ Teaches(p,c), TAs(s,c), Advises(p,s) }      100
{ Teaches(p,c), … }                           20
…                                             …
Pass selected motifs to FindPaths & CreateMLN
41
LSM
Input: Relational DB → LSM (IdentifyMotifs → FindPaths → CreateMLN) → Output: MLN
42
FindPaths
{ Teaches(p,c), TAs(s,c), Advises(p,s) }
[Figure: motif graph over the variables p, s, and c with edges Teaches, TAs, and Advises]
Paths Found
  Advises(p,s)
  Advises(p,s), Teaches(p,c)
  Advises(p,s), Teaches(p,c), TAs(s,c)
43
Clause Creation
Advises(p, s) ∧ Teaches(p, c) ∧ TAs(s, c)
Advises(p, s) ∨ ¬Teaches(p, c) ∨ ¬TAs(s, c)
Advises(p, s) ∨ Teaches(p, c) ∨ ¬TAs(s, c)
…
¬Advises(p, s) ∨ ¬Teaches(p, c) ∨ ¬TAs(s, c)
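The step from a path of literals to candidate clauses can be sketched by enumerating sign assignments over the literals on the path. The slide shows only a few example clauses, so treating every sign combination as a candidate is an assumption here, and the function name is hypothetical.

```python
import itertools


def clauses_from_path(literals):
    """Build one disjunctive clause per sign assignment over `literals`."""
    clauses = []
    for signs in itertools.product([True, False], repeat=len(literals)):
        # Keep a literal as-is when its sign is positive, negate otherwise.
        parts = [
            lit if positive else "¬" + lit
            for lit, positive in zip(literals, signs)
        ]
        clauses.append(" ∨ ".join(parts))
    return clauses


path = ["Advises(p, s)", "Teaches(p, c)", "TAs(s, c)"]
for clause in clauses_from_path(path):
    print(clause)
```

For the three-literal path this yields 2^3 = 8 candidate clauses, including the ones listed on the slide.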
44
Clause Pruning
[Figure: clauses arranged by length above their sub-clauses, each with a score]
¬Advises(p, s) ∨ ¬Teaches(p, c) ∨ TAs(s, c)    Advises(p, s) ∨ ¬Teaches(p, c) ∨ TAs(s, c)    …
¬Advises(p, s) ∨ ¬Teaches(p, c)    ¬Advises(p, s) ∨ TAs(s, c)    ¬Teaches(p, c) ∨ TAs(s, c)    …
¬Advises(p, s)    ¬Teaches(p, c)    TAs(s, c)
Score
-1.15  -1.17
-2.21  -2.23  -2.03
-3.13  -2.93  -3.93
…
Compare each clause against its sub-clauses (taken individually)
46
MLN Creation
Add all clauses to empty MLN
Train weights of clauses
Remove clauses with absolute weights below threshold
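The two filtering steps named on these slides can be sketched as below. The sketch assumes that "compare each clause against its sub-clauses" means a clause is kept only if it scores higher than every scored sub-clause, which the slides do not state outright. Clause scores and weights in the example are hypothetical, weight training itself is not shown, and all function names are made up for illustration.

```python
import itertools


def sub_clauses(clause):
    """All non-empty proper subsets of a clause (a frozenset of literals)."""
    lits = sorted(clause)
    return [
        frozenset(combo)
        for size in range(1, len(lits))
        for combo in itertools.combinations(lits, size)
    ]


def prune_by_sub_clauses(scores):
    """Keep a clause only if it beats each of its scored sub-clauses."""
    kept = {}
    for clause, score in scores.items():
        rivals = [scores[s] for s in sub_clauses(clause) if s in scores]
        if all(score > rival for rival in rivals):
            kept[clause] = score
    return kept


def drop_low_weight(weighted_clauses, threshold):
    """Remove clauses whose absolute weight is below `threshold`."""
    return {c: w for c, w in weighted_clauses.items() if abs(w) >= threshold}


# Hypothetical scores, chosen only to exercise both outcomes.
A, T, S = "¬Advises(p,s)", "¬Teaches(p,c)", "TAs(s,c)"
SCORES = {
    frozenset([S]): -3.0,
    frozenset([T]): -2.5,
    frozenset([T, S]): -2.0,
    frozenset([A, S]): -3.5,   # worse than its sub-clause {S}: pruned
    frozenset([A, T, S]): -1.5,
}
kept = prune_by_sub_clauses(SCORES)
print(len(kept), "clauses kept")
print(drop_low_weight({"c1": 2.7, "c2": -1.1, "c3": 0.05}, threshold=0.1))
```

The weight filter uses the absolute value, so a clause with a strongly negative weight such as -1.1 is retained.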
48
Datasets
IMDB
  Created from IMDB.com DB
  Movies, actors, etc., and relationships
  17,793 ground atoms; 1,224 true ones
UW-CSE
  Describes academic department
  Students, faculty, etc., and relationships
  260,254 ground atoms; 2,112 true ones
49
Datasets
Cora
  Citations to computer science papers
  Papers, authors, titles, etc., and their relationships
  687,422 ground atoms; 42,558 true ones
50
Methodology
Five-fold cross validation
Inferred prob. true for groundings of each pred.
  Groundings of all other predicates as evidence
For Cora, inferred four predicates jointly too
  SameCitation, SameTitle, SameAut, SameVenue
MCMC to eval test atoms: 10^6 samples or 24 hrs
Evaluation measures: CLL (conditional log-likelihood), AUC (area under the precision-recall curve)
51
Methodology
LSM, LHL, BUSL, MSL
Two lengths per system
  Short length of 4
  Long length of 10
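For reference, the CLL measure named on the Methodology slide can be computed from inferred probabilities roughly as follows. This is a minimal sketch: averaging over test atoms and clipping probabilities away from 0 and 1 (the `eps` parameter) are assumptions of this example, not details given on the slides.

```python
import math


def average_cll(probs, truths, eps=1e-6):
    """Mean log-probability assigned to the true value of each test atom.

    probs[i] is the inferred probability that atom i is true;
    truths[i] is its actual truth value.
    """
    total = 0.0
    for p, is_true in zip(probs, truths):
        # Clip so that log(0) never occurs (an assumption of this sketch).
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p if is_true else 1.0 - p)
    return total / len(probs)


print(average_cll([0.9, 0.2, 0.7], [True, False, True]))
```

CLL is at most 0, and values closer to 0 mean the inferred probabilities agree better with the test data.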
52
AUC & CLL
[Figure: bar charts of AUC (axis 0 to 1) and CLL (axis -1 to 0) for LSM, LHL, BUSL, and MSL, with one panel for short clauses and one for long clauses; annotations: ↑5%, ↑45%]
53
Runtimes
[Figure: bar charts of runtime (hr) for LSM, LHL, BUSL, and MSL, with one panel for short clauses and one for long clauses; bar labels: 1.33, 20.6, 1.92, 230,000, 1.83, 9.96, 1.83, 9.81; annotations: 10,000X, same local maxima]
54
Long Rules Learned on Cora
VenueOfCit(v,c) ∧ VenueOfCit(v,c') ∧ AuthorOfCit(a,c) ∧ AuthorOfCit(a',c') ∧ SameAuthor(a,a') ∧ TitleOfCit(t,c) ∧ TitleOfCit(t',c') ⇒ SameTitle(t,t')
SameCitation(c,c') ∧ TitleOfCit(t,c) ∧ TitleOfCit(t',c') ∧ HasWordTitle(t,w) ∧ HasWordTitle(t',w) ∧ AuthorOfCit(a,c) ∧ AuthorOfCit(a',c') ⇒ SameAuthor(a,a')
56
Future Work
Discover motifs at multiple granularities
Combine LSM with generate-and-test approaches
Apply LSM to larger, richer domains, e.g., Web
57
Conclusion
LSM finds structural motifs in data
  Random walks & hitting times
Accurately and efficiently learns long clauses by searching within motifs
Outperforms state-of-the-art structure learners
58
59
Long Rules Learned (Cora)
AuthorOfCit(a,c) ∧ AuthorOfCit(a',c') ∧ SameAuthor(a,a') ∧ TitleOfCit(t,c) ∧ TitleOfCit(t',c') ∧ SameTitle(t,t') ⇒ SameCitation(c,c')
AuthorHasWord(a,w) ∧ AuthorHasWord(a',w') ∧ AuthorHasWord(a'',w) ∧ AuthorHasWord(a'',w') ⇒ SameAuthor(a,a')