Markov Logic: A Simple and Powerful Unification of Logic and Probability

Pedro Domingos
Dept. of Computer Science & Eng., University of Washington

Joint work with Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang
Inferring the Most Probable Explanation

Problem: Find most likely state of world given evidence:

$\arg\max_{y} P(y \mid x)$        (y: query, x: evidence)

$= \arg\max_{y} \frac{1}{Z_x} \exp\Big( \sum_i w_i \, n_i(x, y) \Big)$

$= \arg\max_{y} \sum_i w_i \, n_i(x, y)$

This is just the weighted MaxSAT problem
Use weighted SAT solver (e.g., MaxWalkSAT [Kautz et al., 1997])
Potentially faster than logical inference (!)
The WalkSAT Algorithm
for i ← 1 to max-tries do
    solution ← random truth assignment
    for j ← 1 to max-flips do
        if all clauses satisfied then
            return solution
        c ← random unsatisfied clause
        with probability p
            flip a random variable in c
        else
            flip variable in c that maximizes number of satisfied clauses
return failure
The MaxWalkSAT Algorithm
for i ← 1 to max-tries do
    solution ← random truth assignment
    for j ← 1 to max-flips do
        if ∑ weights(sat. clauses) > threshold then
            return solution
        c ← random unsatisfied clause
        with probability p
            flip a random variable in c
        else
            flip variable in c that maximizes ∑ weights(sat. clauses)
return failure, best solution found
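To make the pseudocode concrete, here is a minimal Python sketch of MaxWalkSAT. The clause encoding (signed integer literals) and the default parameters are illustrative assumptions, not the reference implementation:

```python
import random

def clause_sat(clause, a):
    # clause = list of signed ints: 2 means x2, -3 means NOT x3
    return any(a[abs(lit)] == (lit > 0) for lit in clause)

def sat_weight(clauses, weights, a):
    return sum(w for c, w in zip(clauses, weights) if clause_sat(c, a))

def max_walk_sat(clauses, weights, n_vars,
                 max_tries=10, max_flips=10000, p=0.5, threshold=None):
    if threshold is None:
        threshold = sum(weights)            # i.e. demand all clauses satisfied
    best, best_w = None, float("-inf")
    for _ in range(max_tries):
        a = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            w = sat_weight(clauses, weights, a)
            if w > best_w:
                best, best_w = dict(a), w   # remember best solution found
            if w >= threshold:
                return a
            unsat = [c for c in clauses if not clause_sat(c, a)]
            if not unsat:                   # everything satisfied already
                return a
            c = random.choice(unsat)
            if random.random() < p:
                v = abs(random.choice(c))   # random-walk move
            else:                           # greedy move: best flip within c
                def gain(v):
                    a[v] = not a[v]
                    g = sat_weight(clauses, weights, a)
                    a[v] = not a[v]
                    return g
                v = max({abs(lit) for lit in c}, key=gain)
            a[v] = not a[v]
    return best                             # "failure": best solution found

# Toy run: (x1 ∨ ¬x2) with weight 2, (¬x1) with weight 1
print(max_walk_sat([[1, -2], [-1]], [2.0, 1.0], n_vars=2))
```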
But … Memory Explosion
Problem: If there are n constants and the highest clause arity is c, the ground network requires $O(n^c)$ memory
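A back-of-the-envelope illustration of the blowup (numbers made up):

```python
# Grounding a clause of arity c over n constants materializes n**c ground clauses.
n, c = 1000, 3
print(f"{n**c:,} ground clauses")   # 1,000,000,000 — infeasible to store
```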
Computing Probabilities

P(Formula | MLN, C) = ?
    MCMC: Sample worlds, check formula holds
P(Formula1 | Formula2, MLN, C) = ?
    If Formula2 = conjunction of ground atoms:
        First construct minimal subset of network necessary to answer query (generalization of KBMC)
        Then apply MCMC (or other inference)
    Can also do lifted inference [Braz et al., 2005]
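A rough sketch of the minimal-subnetwork construction, assuming the ground network exposes hypothetical accessors from atoms to the ground clauses touching them and back:

```python
from collections import deque

def min_subnetwork(query_atoms, evidence_atoms, clauses_of, atoms_of):
    """Expand outward from the query atoms, collecting ground clauses; stop
    expanding at evidence atoms, since fixing their values blocks influence
    flowing through them. clauses_of(atom) and atoms_of(clause) are
    placeholder accessors into the ground network."""
    atoms, clauses = set(query_atoms), set()
    frontier = deque(query_atoms)
    while frontier:
        atom = frontier.popleft()
        if atom in evidence_atoms:          # evidence: don't expand further
            continue
        for clause in clauses_of(atom):
            if clause in clauses:
                continue
            clauses.add(clause)
            for nb in atoms_of(clause):
                if nb not in atoms:
                    atoms.add(nb)
                    frontier.append(nb)
    return atoms, clauses
```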
MCMC: Gibbs Sampling
state ← random truth assignment
for i ← 1 to num-samples do
    for each variable x
        sample x according to P(x | neighbors(x))
        state ← state with new value of x
P(F) ← fraction of states in which F is true
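A minimal Python rendering of the sampler; the conditional-probability and formula-test callbacks are assumed to be supplied by the model:

```python
import random

def gibbs_estimate(variables, cond_prob, formula_holds, num_samples=10000):
    """Direct translation of the pseudocode. cond_prob(x, state) must return
    P(x = True | state of x's Markov blanket); formula_holds(state) tests
    the query formula F."""
    state = {x: random.random() < 0.5 for x in variables}   # random init
    hits = 0
    for _ in range(num_samples):
        for x in variables:
            state[x] = random.random() < cond_prob(x, state)
        hits += formula_holds(state)
    return hits / num_samples   # P(F) ≈ fraction of states where F is true
```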
But … Insufficient for Logic
Problem:
Deterministic dependencies break MCMC
Near-deterministic ones make it very slow
Learning

Data is a relational database
Closed world assumption (if not: EM)
Learning parameters (weights):
    Generatively
    Discriminatively
Learning structure (formulas)
Generative Weight Learning
Maximize likelihood
Use gradient ascent or L-BFGS
No local maxima
Requires inference at each step (slow!)

$\frac{\partial}{\partial w_i} \log P_w(x) = n_i(x) - E_w[n_i(x)]$

where $n_i(x)$ is the no. of true groundings of clause i in the data, and $E_w[n_i(x)]$ is the expected no. of true groundings according to the model.
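A sketch of one gradient step under these definitions; `expected_counts` is a placeholder for whatever inference procedure estimates $E_w[n_i(x)]$:

```python
def generative_step(weights, n_data, expected_counts, lr=0.01):
    """One gradient-ascent step on log P_w(x). n_data[i] = n_i(x), counted
    once from the database; expected_counts(weights)[i] = E_w[n_i(x)], which
    requires inference over the model (the slow part)."""
    e = expected_counts(weights)
    return [w + lr * (nd - ei) for w, nd, ei in zip(weights, n_data, e)]
```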
Pseudo-Likelihood
Likelihood of each variable given its neighbors in the data:

$PL(x) \equiv \prod_i P(x_i \mid \mathrm{neighbors}(x_i))$

Does not require inference at each step
Consistent estimator
Widely used in vision, spatial statistics, etc.
But PL parameters may not work well for long inference chains
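A sketch of the corresponding computation; `cond_prob` is a hypothetical per-variable conditional supplied by the model:

```python
import math

def pseudo_log_likelihood(variables, data_state, cond_prob):
    """log PL(x) = sum_i log P(x_i | neighbors(x_i)), all evaluated in the
    data; cond_prob(x, state) -> P(x = True | x's Markov blanket) needs only
    local clause counts, so no global inference is required."""
    ll = 0.0
    for x in variables:
        p = cond_prob(x, data_state)
        ll += math.log(p if data_state[x] else 1.0 - p)
    return ll
```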
Discriminative Weight Learning
Maximize conditional likelihood of query (y) given evidence (x):

$\frac{\partial}{\partial w_i} \log P_w(y \mid x) = n_i(x, y) - E_w[n_i(x, y)]$

where $n_i(x, y)$ is the no. of true groundings of clause i in the data, and $E_w[n_i(x, y)]$ is the expected no. of true groundings according to the model.

Expected counts ≈ counts in most probable state of y given x, found by MaxWalkSAT
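A sketch of this approximation as a weight update (a perceptron-style step; all helper names are placeholders):

```python
def discriminative_step(weights, count_groundings, x, y_data, mpe, lr=0.01):
    """count_groundings(x, y)[i] = n_i(x, y); mpe(weights, x) returns the
    most probable y via MaxWalkSAT. The intractable E_w[n_i(x, y)] is
    replaced by the counts in that single MPE state."""
    y_star = mpe(weights, x)
    n_true = count_groundings(x, y_data)
    n_star = count_groundings(x, y_star)
    return [w + lr * (nt - ns) for w, nt, ns in zip(weights, n_true, n_star)]
```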
Structure Learning
Generalizes feature induction in Markov nets
Any inductive logic programming approach can be used, but . . .
    Goal is to induce any clauses, not just Horn
    Evaluation function should be likelihood
    Requires learning weights for each candidate
        Turns out not to be the bottleneck
        Bottleneck is counting clause groundings
        Solution: subsampling
Structure Learning
Initial state: Unit clauses or hand-coded KB
Operators: Add/remove literal, flip sign
Evaluation function: likelihood (weights learned for each candidate; see the search sketch below)
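A minimal sketch of that search loop, with the operators and the weight-learning evaluation abstracted behind placeholder callbacks:

```python
def greedy_structure_search(initial_clauses, neighbors, score, max_steps=100):
    """Greedy search over clause sets. `neighbors` yields candidates produced
    by the operators (add/remove a literal, flip a sign); `score` learns
    weights for a candidate and returns its likelihood. Both stand in for
    the expensive pieces (weight learning, grounding counts)."""
    clauses = list(initial_clauses)        # unit clauses or hand-coded KB
    best = score(clauses)
    for _ in range(max_steps):
        scored = [(score(c), c) for c in neighbors(clauses)]
        if not scored:
            break
        s, c = max(scored, key=lambda sc: sc[0])
        if s <= best:
            break                          # no improving move: local optimum
        best, clauses = s, c
    return clauses
```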
Applications

Information extraction
Entity resolution
Link prediction
Collective classification
Web mining
Natural language processing
Computational biology
Social network analysis
Robot mapping
Activity recognition
Probabilistic Cyc
CALO
Etc.
Information Extraction
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, "Sound and Efficient Inference with Probabilistic and Deterministic Dependencies", in Proc. AAAI-06, Boston, MA, 2006.

P. Hoifung (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.
Segmentation
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, "Sound and Efficient Inference with Probabilistic and Deterministic Dependencies", in Proc. AAAI-06, Boston, MA, 2006.

P. Hoifung (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.
(Fields to segment: Author, Title, Venue)
Entity Resolution
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, "Sound and Efficient Inference with Probabilistic and Deterministic Dependencies", in Proc. AAAI-06, Boston, MA, 2006.

P. Hoifung (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.
State of the Art
Segmentation:
    HMM (or CRF) to assign each token to a field
Entity resolution:
    Logistic regression to predict same field/citation
    Transitive closure (see the sketch below)
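For the transitive-closure step, a small union-find sketch (the logistic-regression matcher itself is out of scope; `predicted_matches` is assumed input):

```python
class UnionFind:
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def citation_clusters(predicted_matches):
    """predicted_matches: pairs the classifier labeled 'same citation'.
    Transitive closure = connected components: if (a, b) and (b, c) match,
    then a, b, c land in one cluster."""
    uf = UnionFind()
    for a, b in predicted_matches:
        uf.union(a, b)
    groups = {}
    for x in list(uf.parent):
        groups.setdefault(uf.find(x), []).append(x)
    return list(groups.values())
```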
Future Directions

Markov logic theorems
Further improving scalability, robustness, and ease of use
Online learning and inference
Discovering deep structure
Generalizing across domains and tasks
Relational decision theory
Solving larger applications
Adversarial settings
Etc.
Summary
Markov logic combines the full power of first-order logic and probabilistic networks
Syntax: First-order logic + weights
Semantics: Templates for Markov networks
Inference: LazySAT, MC-SAT, etc.
Learning: Statistical learning, ILP, etc.
Applications: Information extraction, etc.
Software: alchemy.cs.washington.edu