Statistical Relational Learning Pedro Domingos Dept. Computer Science & Eng. University of Washington

Dec 22, 2015

Page 1:

Statistical Relational Learning

Pedro Domingos

Dept. Computer Science & Eng.

University of Washington

Page 2:

Overview

Motivation
Some approaches
Markov logic
Application: Information extraction
Challenges and open problems

Page 3:

Motivation

Most learners only apply to i.i.d. vectors

But we need to do learning and (uncertain) inference over arbitrary structures: trees, graphs, class hierarchies, relational databases, etc.

All these can be expressed in first-order logic

Let’s add learning and uncertain inference to first-order logic

Page 4:

Some Approaches

Probabilistic logic [Nilsson, 1986]
Statistics and beliefs [Halpern, 1990]
Knowledge-based model construction [Wellman et al., 1992]
Stochastic logic programs [Muggleton, 1996]
Probabilistic relational models [Friedman et al., 1999]
Relational Markov networks [Taskar et al., 2002]
Markov logic [Richardson & Domingos, 2004]
Bayesian logic [Milch et al., 2005]
Etc.

Page 5:

Markov Logic

Logical formulas are hard constraints on the possible states of the world

Let’s make them soft constraints: when a state violates a formula, it becomes less probable, not impossible

Give each formula a weight (higher weight ⇒ stronger constraint)

More precisely: consider each grounding of a formula

Page 6:

Example: Friends & Smokers

Two constants: Anna (A) and Bob (B)

Weighted formulas:

$1.5 \quad \forall x \; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$

$1.1 \quad \forall x, y \; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$

Ground atoms shown in the network figure: Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B)
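To make "grounding" concrete, the sketch below enumerates the ground atoms and ground formulas that the Friends & Smokers example produces for the two constants. It assumes the two formulas are "Smokes(x) => Cancer(x)" and "Friends(x,y) => (Smokes(x) <=> Smokes(y))". The function names and the string representation are illustrative choices, not Alchemy's API.

```python
from itertools import product

CONSTANTS = ["A", "B"]  # Anna and Bob

# Predicate name -> arity, as used in the example's formulas.
PREDICATES = {"Smokes": 1, "Cancer": 1, "Friends": 2}


def ground_atoms(predicates, constants):
    """Return every predicate applied to every tuple of constants."""
    atoms = []
    for name, arity in predicates.items():
        for args in product(constants, repeat=arity):
            atoms.append(f"{name}({','.join(args)})")
    return atoms


def ground_formulas(constants):
    """Instantiate both first-order formulas for all constants."""
    grounded = []
    for x in constants:
        grounded.append(f"Smokes({x}) => Cancer({x})")
    for x, y in product(constants, repeat=2):
        grounded.append(f"Friends({x},{y}) => (Smokes({x}) <=> Smokes({y}))")
    return grounded


atoms = ground_atoms(PREDICATES, CONSTANTS)
formulas = ground_formulas(CONSTANTS)
print(len(atoms), "ground atoms:", atoms)
print(len(formulas), "ground formulas")
```

With two constants this gives 8 ground atoms and 6 ground formulas. The counts grow polynomially in the number of constants, with the exponent set by the number of variables in each formula.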

Page 7:

Markov Logic (Contd.)

Probability of a state x:

$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i \, n_i(x) \right)$$

where $w_i$ is the weight of formula i and $n_i(x)$ is the number of true groundings of formula i in x

Most discrete statistical models are special cases (e.g., Bayes nets, HMMs, etc.)

First-order logic is the infinite-weight limit
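As a small worked check of the probability formula, the sketch below computes P(x) = exp(sum_i w_i n_i(x)) / Z for the Friends & Smokers example by enumerating all 2^8 states, where Z is the sum of the unnormalized scores. It assumes weights 1.5 and 1.1 for the two formulas. The helper names are hypothetical, and brute-force enumeration is only feasible for toy domains like this one.

```python
from itertools import product
from math import exp

CONSTANTS = ["A", "B"]
W_CANCER, W_FRIENDS = 1.5, 1.1  # assumed formula weights

ATOMS = (
    [("Smokes", c) for c in CONSTANTS]
    + [("Cancer", c) for c in CONSTANTS]
    + [("Friends", a, b) for a in CONSTANTS for b in CONSTANTS]
)


def counts(world):
    """Return (n_1, n_2): the number of true groundings of each formula."""
    n_cancer = sum(
        (not world[("Smokes", x)]) or world[("Cancer", x)] for x in CONSTANTS
    )
    n_friends = sum(
        (not world[("Friends", x, y)])
        or (world[("Smokes", x)] == world[("Smokes", y)])
        for x in CONSTANTS
        for y in CONSTANTS
    )
    return n_cancer, n_friends


def score(world):
    """Unnormalized weight exp(sum_i w_i * n_i(x))."""
    n_cancer, n_friends = counts(world)
    return exp(W_CANCER * n_cancer + W_FRIENDS * n_friends)


# Enumerate every truth assignment to compute Z exactly.
worlds = [
    dict(zip(ATOMS, values))
    for values in product([False, True], repeat=len(ATOMS))
]
Z = sum(score(w) for w in worlds)


def probability(world):
    return score(world) / Z


all_true = {atom: True for atom in ATOMS}
violating = dict(all_true)
violating[("Cancer", "A")] = False  # Anna smokes but does not have cancer

print(probability(all_true))
print(probability(all_true) / probability(violating))  # equals exp(1.5)
```

The second printed value shows the soft-constraint behavior: violating one grounding of a formula with weight w makes a state exp(w) times less probable, all else equal. As w grows the violating state's probability goes to zero, which is the hard-constraint limit.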

Page 8:

Key Ingredients

Logical inference: Satisfiability testing

Probabilistic inference: Markov chain Monte Carlo

Inductive logic programming: Search with clause refinement operators

Statistical learning: Weight optimization by conjugate gradient
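To illustrate the probabilistic-inference ingredient, here is a minimal Gibbs sampler (one common Markov chain Monte Carlo method, chosen for simplicity) on the two-constant Friends & Smokers network. It is a sketch under assumptions, not Alchemy's inference code: the weights 1.5 and 1.1, the evidence, the function names, and the sampler settings are all illustrative. A brute-force marginal is included only as a reference value.

```python
import random
from itertools import product
from math import exp

CONSTANTS = ["A", "B"]
WEIGHTS = {"cancer": 1.5, "friends": 1.1}  # assumed formula weights
ATOMS = (
    [("Smokes", c) for c in CONSTANTS]
    + [("Cancer", c) for c in CONSTANTS]
    + [("Friends", a, b) for a in CONSTANTS for b in CONSTANTS]
)


def log_score(world):
    """sum_i w_i * n_i(world) for the two weighted formulas."""
    n_cancer = sum(
        (not world[("Smokes", x)]) or world[("Cancer", x)] for x in CONSTANTS
    )
    n_friends = sum(
        (not world[("Friends", x, y)])
        or (world[("Smokes", x)] == world[("Smokes", y)])
        for x in CONSTANTS
        for y in CONSTANTS
    )
    return WEIGHTS["cancer"] * n_cancer + WEIGHTS["friends"] * n_friends


def gibbs_marginal(query, evidence, sweeps=5000, burn_in=500, seed=0):
    """Estimate P(query = True | evidence) by Gibbs sampling."""
    rng = random.Random(seed)
    hidden = [a for a in ATOMS if a not in evidence]
    world = dict(evidence)
    for atom in hidden:
        world[atom] = rng.random() < 0.5
    hits = 0
    for sweep in range(burn_in + sweeps):
        for atom in hidden:
            # Resample one atom from its conditional given all the others.
            world[atom] = True
            s_true = log_score(world)
            world[atom] = False
            s_false = log_score(world)
            p_true = 1.0 / (1.0 + exp(s_false - s_true))
            world[atom] = rng.random() < p_true
        if sweep >= burn_in:
            hits += world[query]
    return hits / sweeps


def exact_marginal(query, evidence):
    """Brute-force reference value, feasible only for tiny domains."""
    hidden = [a for a in ATOMS if a not in evidence]
    num = den = 0.0
    for values in product([False, True], repeat=len(hidden)):
        world = {**evidence, **dict(zip(hidden, values))}
        weight = exp(log_score(world))
        den += weight
        num += weight * world[query]
    return num / den


evidence = {
    ("Smokes", "A"): True,
    ("Friends", "A", "B"): True,
    ("Friends", "B", "A"): True,
}
query = ("Cancer", "B")
print("Gibbs:", gibbs_marginal(query, evidence))
print("Exact:", exact_marginal(query, evidence))
```

The sampler only needs score differences between the two values of one atom, so the normalizer Z never has to be computed. That is what makes sampling usable when enumeration is not.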

Page 9:

Alchemy

Open-source software available at: alchemy.cs.washington.edu

A new kind of programming language
Write formulas, learn weights, do inference
Haven’t we seen this before?
Yes, but without learning and uncertain inference

Page 10:

Example: Information Extraction

Parag Singla and Pedro Domingos, “Memory-Efficient Inference in Relational Domains” (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, Sound and Efficient Inference with Probabilistic and Deterministic Dependencies”, in Proc. AAAI-06, Boston, MA, 2006.

P. Hoifung (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.

Page 11:

Segmentation

The same four citations as on Page 10, with their parts labeled by field: Author, Title, Venue

Pages 12-13:

Entity Resolution

The same four citations as on Page 10 (repeated on both slides)

Page 14:

State of the Art

Segmentation:
HMM (or CRF) to assign each token to a field

Entity resolution:
Logistic regression to predict same field/citation
Transitive closure

Alchemy implementation: Seven formulas
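The transitive-closure step of the conventional entity-resolution pipeline can be sketched with a union-find structure: pairwise "same" predictions are merged into clusters so that matches become transitive. The function name and the example pairs are hypothetical. The pairs stand in for a classifier's output.

```python
def transitive_closure(items, matched_pairs):
    """Group items into clusters so that pairwise matches become transitive."""
    parent = {item: item for item in items}

    def find(item):
        # Follow parent links to the cluster root, shortening the path.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), []).append(item)
    return sorted(clusters.values())


citations = ["C1", "C2", "C3", "C4"]
# Hypothetical pairwise predictions: C1 matches C2, and C2 matches C3.
pairs = [("C1", "C2"), ("C2", "C3")]
print(transitive_closure(citations, pairs))  # [['C1', 'C2', 'C3'], ['C4']]
```

Here the closure is a separate post-processing step applied to hard decisions. In the Markov logic formulation, transitivity is instead written as a formula alongside the others.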

Pages 15-18:

Types and Predicates

Types (optional):
token = {Parag, Singla, and, Pedro, ...}
field = {Author, Title, Venue}
citation = {C1, C2, ...}
position = {0, 1, 2, ...}

Predicates:
Token(token, position, citation) (input)
InField(position, field, citation) (output)
SameField(field, citation, citation) (output)
SameCit(citation, citation) (output)
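As an illustration of how raw citations could become input for these predicates, the sketch below turns one citation string into Token(token, position, citation) atoms. Whitespace tokenization, the function name, and the printed string format are assumptions made for the example. The output is not meant to match Alchemy's evidence-file syntax.

```python
def token_atoms(citation_id, text):
    """Return one Token(token, position, citation) atom per token."""
    atoms = []
    for position, token in enumerate(text.split()):
        atoms.append(f'Token("{token}", {position}, {citation_id})')
    return atoms


atoms = token_atoms("C1", "Parag Singla and Pedro Domingos")
for atom in atoms:
    print(atom)
```

The remaining predicates (InField, SameField, SameCit) are what inference is asked to fill in, so no atoms are generated for them here.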

Pages 19-25:

Formulas

Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,+f,c) v !InField(i,+f',c))

Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c') ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c'') => SameField(f,c,c'')
SameCit(c,c') ^ SameCit(c',c'') => SameCit(c,c'')
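In Alchemy's formula syntax, a leading + on a variable requests a separate weight for each constant that variable can take. The sketch below shows what that expansion amounts to for the first formula, "Token(+t,i,c) => InField(i,+f,c)". The function name is hypothetical and plain string replacement is a simplification. The token list is only the sample shown in the types.

```python
from itertools import product

TOKENS = ["Parag", "Singla", "and", "Pedro"]  # sample of the token type
FIELDS = ["Author", "Title", "Venue"]


def expand_template(template, plus_vars):
    """Replace each +variable with every constant of its type.

    `plus_vars` maps a variable name to the constants it ranges over.
    Returns one formula string per combination; each would get its own weight.
    """
    names = list(plus_vars)
    formulas = []
    for combo in product(*(plus_vars[name] for name in names)):
        formula = template
        for name, constant in zip(names, combo):
            formula = formula.replace(f"+{name}", constant)
        formulas.append(formula)
    return formulas


expanded = expand_template(
    "Token(+t,i,c) => InField(i,+f,c)", {"t": TOKENS, "f": FIELDS}
)
print(len(expanded))  # 12
print(expanded[0])    # Token(Parag,i,c) => InField(i,Author,c)
```

Each expanded formula plays a role similar to a per-token, per-field emission parameter in an HMM, which is how a single template line can stand in for a whole table of learned weights.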

Page 26:

Formulas

Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) ^ !Token(".",i,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,+f,c) v !InField(i,+f',c))

Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c') ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c'') => SameField(f,c,c'')
SameCit(c,c') ^ SameCit(c',c'') => SameCit(c,c'')
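The effect of adding !Token(".",i,c) to the sequence formula can be seen by counting true groundings on a tiny labeled example. The sketch assumes the formula is read as (InField(i,f,c) ^ !Token(".",i,c)) <=> InField(i+1,f,c). The function name, tokens, and labels are made up for illustration.

```python
def count_true(tokens, labels, field, use_period_rule):
    """Count positions i where the sequence formula holds for `field`."""
    n_true = 0
    for i in range(len(tokens) - 1):
        left = labels[i] == field
        if use_period_rule:
            # A period at position i stops the field from carrying over.
            left = left and tokens[i] != "."
        right = labels[i + 1] == field
        n_true += left == right
    return n_true


# Hypothetical citation fragment whose author field ends at a period.
tokens = ["Singla", ".", "Memory", "Efficient"]
labels = ["Author", "Author", "Title", "Title"]

print(count_true(tokens, labels, "Author", use_period_rule=False))  # 2
print(count_true(tokens, labels, "Author", use_period_rule=True))   # 3
```

Without the period condition, the correct labeling violates the formula at the field boundary. With it, the boundary after the period no longer counts as a violation, so the correct segmentation scores higher.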

Page 27:

Results: Segmentation on Cora

[Figure: precision (y-axis, 0 to 1) vs. recall (x-axis, 0 to 1). Curves: Tokens; Tokens + Sequence; Tok. + Seq. + Period; Tok. + Seq. + P. + Comma]
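For readers less familiar with the two axes, the sketch below computes precision and recall for a set of predicted (position, field) labels against gold labels. The data is invented for illustration and is not taken from Cora. The function name is hypothetical.

```python
def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical labels for a four-token citation.
gold = {(0, "Author"), (1, "Author"), (2, "Title"), (3, "Title")}
predicted = {(0, "Author"), (1, "Title"), (2, "Title"), (3, "Title")}
print(precision_recall(predicted, gold))  # (0.75, 0.75)
```

A precision-recall curve is typically traced by varying a threshold on the predicted probabilities and recomputing these two numbers at each threshold.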

Page 28:

Results: Matching Venues on Cora

[Figure: precision (y-axis, 0 to 1) vs. recall (x-axis, 0 to 1). Curves: Similarity; Sim. + Relations; Sim. + Transitivity; Sim. + Rel. + Trans.]

Page 29:

Challenges and Open Problems

Scaling up learning and inference
Model design (aka knowledge engineering)
Generalizing across domain sizes
Continuous distributions
Relational data streams
Relational decision theory
Statistical predicate invention
Experiment design