Statistical Relational Learning



Joint Work with Sriraam Natarajan, Kristian Kersting, Jude Shavlik

BAYESIAN NETWORKS

[Figure: the classic alarm network. Burglary and Earthquake are the parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.]

P(Burglary = 1) = 0.01        P(Earthquake = 1) = 0.01

P(Alarm = 1 | Earthquake = e, Burglary = b):

  e  b  P(a = 1)
  0  0  0.1
  0  1  0.8
  1  0  0.6
  1  1  0.9
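To make the factorization concrete, here is a minimal sketch (not from the slides) of computing a joint probability with this network; the P(Alarm | E, B) values come from the table above, while the JohnCalls and MaryCalls probabilities are made-up placeholders.

```python
# Minimal sketch: joint probability in the alarm network.
# The P(Alarm | E, B) table is taken from the slide; the call
# probabilities below are illustrative placeholders.

p_b = 0.01                       # P(Burglary = 1)
p_e = 0.01                       # P(Earthquake = 1)
p_a = {(0, 0): 0.1, (0, 1): 0.8,
       (1, 0): 0.6, (1, 1): 0.9} # P(Alarm = 1 | e, b)
p_j = {0: 0.05, 1: 0.90}         # P(JohnCalls = 1 | a)  (placeholder)
p_m = {0: 0.01, 1: 0.70}         # P(MaryCalls = 1 | a)  (placeholder)

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) via the network's factorization."""
    def bern(p, v):              # P(X = v) for a Bernoulli with P(X = 1) = p
        return p if v == 1 else 1.0 - p
    return (bern(p_b, b) * bern(p_e, e) *
            bern(p_a[(e, b)], a) *
            bern(p_j[a], j) * bern(p_m[a], m))

print(joint(b=1, e=0, a=1, j=1, m=1))
```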

BAYESIAN NETWORK FOR A CITY

[Figure: the Burglary / Earthquake / Alarm fragment is replicated once for each house H1, ..., H5; each copy's Alarm has the Calls nodes of the neighboring houses as children (Calls(H1) and Calls(H3), Calls(H2) and Calls(H4), Calls(H3) and Calls(H5), and so on).]

SHARED VARIABLES

[Figure: the per-house copies share a single Earthquake(BL) variable. Each house Hi has its own Burglary(Hi) and Alarm(Hi); Earthquake(BL) and Burglary(Hi) are the parents of Alarm(Hi), and the alarms are the parents of the Calls(H1), ..., Calls(H5) nodes.]

FIRST ORDER LOGIC

Predicates: Burglary(house), Earthquake(city), Alarm(house), Calls(nhouse), HouseInCity(house, city), Neighbor(house, nhouse)

Parameterized rule:

  Alarm(house) :- HouseInCity(house, city), Earthquake(city), Burglary(house)

with a single shared conditional probability table:

  e  b  P(a = 1)
  0  0  0.1
  0  1  0.8
  1  0  0.6
  1  1  0.9
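As an informal illustration (my own, not from the slides) of how such a parameterized rule is grounded: one Alarm(h) node is created per house, and every grounding reuses the same CPT. The house and city facts below are made up for the example.

```python
# Minimal sketch of grounding the parameterized Alarm rule.
# The facts below (houses, city "BL") are illustrative, not from the slides.

houses = ["H1", "H2", "H3", "H4", "H5"]
house_in_city = {h: "BL" for h in houses}   # every house is in city "BL"

# Shared CPT: P(Alarm = 1 | Earthquake = e, Burglary = b), same for every house.
p_alarm = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.6, (1, 1): 0.9}

def ground_network():
    """Create one ground Alarm(h) node per house, all tied to the same CPT."""
    nodes = []
    for h in houses:
        c = house_in_city[h]
        nodes.append({
            "node": f"Alarm({h})",
            "parents": [f"Earthquake({c})", f"Burglary({h})"],
            "cpt": p_alarm,          # parameters shared across all groundings
        })
    return nodes

for n in ground_network():
    print(n["node"], "<-", ", ".join(n["parents"]))
```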

LOGIC + PROBABILITY = STATISTICAL RELATIONAL LEARNING MODELS

[Figure: starting from logic and adding probabilities, or starting from probabilistic models and adding relations, both lead to Statistical Relational Learning (SRL). The accompanying example is a clause relating course difficulty (Diff), course rating (CRating), and professor rating (PRating), with probabilities attached.]

ALPHABETIC SOUP

Knowledge-based model construction [Wellman et al., 1992]

PRISM [Sato & Kameya, 1997]

Stochastic logic programs [Muggleton, 1996]

Probabilistic relational models [Friedman et al., 1999]

Bayesian logic programs [Kersting & De Raedt, 2001]

Bayesian logic [Milch et al., 2005]

Markov logic [Richardson & Domingos, 2006]

Relational dependency networks [Neville & Jensen, 2007]

ProbLog [De Raedt et al., 2007]

And many others!

RELATIONAL DATABASE

Professor table: Prof, Level

Rating table: Prof, Course, Rating

Course table: Course, Diff

Grade table: Student, Course, Grade

Student table: Student, IQ, Satisfaction

FIRST ORDER LOGIC

Prof(P)

Level(P, L)

Course(C)

Diff(C)

taughtBy(P, C)

ratings(P, C, R)

Student(S)

IQ(S, I)

satis(S, B)

takes(S, C)

grade(S, C, G)

(One predicate per table or attribute of the relational schema above.)


GRAPHICAL MODEL

satisfaction(S, B) has aggregated parents: grades(S, C, G) is aggregated into avgGrade(S, G), and Diff(S, C, D) into avgDiff(S, D).

P(satisfaction(S, B) | avgGrade(S, G), avgDiff(S, D))
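For concreteness, a minimal sketch (my own illustration) of the aggregation step: compute avgGrade and avgDiff for a student from the relational tables and look up a conditional distribution over satisfaction. The student data, thresholds, and probabilities are placeholders, not values from the talk.

```python
# Minimal sketch: aggregate a student's courses, then condition on the
# aggregates. All data and probabilities below are illustrative placeholders.

grades = {("sue", "cs101"): 3.7, ("sue", "cs202"): 3.0, ("sue", "cs303"): 3.3}
diff   = {"cs101": 2.0, "cs202": 4.0, "cs303": 3.0}

def avg_grade(student):
    gs = [g for (s, _), g in grades.items() if s == student]
    return sum(gs) / len(gs)

def avg_diff(student):
    ds = [diff[c] for (s, c) in grades if s == student]
    return sum(ds) / len(ds)

def p_satisfied(student):
    """P(satisfaction = high | avgGrade, avgDiff), with placeholder numbers."""
    g, d = avg_grade(student), avg_diff(student)
    if g >= 3.5:
        return 0.9 if d <= 3.0 else 0.7
    return 0.6 if d <= 3.0 else 0.3

print(avg_grade("sue"), avg_diff("sue"), p_satisfied("sue"))
```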

RELATIONAL DECISION TREE

Target: will X be fined? The tree tests first-order conditions about X:

  speed(X, S), S > 120
  |-- no:  N
  |-- yes: job(X, politician)
           |-- yes: N
           |-- no:  knows(X, Y)
                    |-- no:  Y
                    |-- yes: job(Y, politician)
                             |-- yes: N
                             |-- no:  Y

Speeding table:

  Name   Speed  Job         Fine
  Bob    120    Teacher     N
  Alice  150    Writer      N
  John   180    Politician  N
  Mary   160    Student     Y
  Mike   140    Engineer    Y

Knows table (Person1 knows Person2):

  Person1  Person2
  Alice    John
  Mary     Mike
  Mary     Alice
  Bob      Mike
  Bob      Mary

RELATIONAL DECISION TREE (WORKED EXAMPLE)

Classifying Alice against the tree above:

speed(Alice, 150), 150 > 120 holds, so take the yes branch.

job(Alice, politician) fails, so take the no branch.

knows(Alice, Y) succeeds with Y = John.

job(John, politician) holds, so the tree reaches the leaf N: Alice is not fined, which matches the Fine column of the table.
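A minimal sketch (my own, not the authors' code) of this relational decision tree over the two tables; the tree follows the reconstruction above and reproduces the Fine column for all five people.

```python
# Minimal sketch of the speeding relational decision tree.
# Data is copied from the slide tables; the tree mirrors the one above.

speed = {"Bob": 120, "Alice": 150, "John": 180, "Mary": 160, "Mike": 140}
job   = {"Bob": "Teacher", "Alice": "Writer", "John": "Politician",
         "Mary": "Student", "Mike": "Engineer"}
knows = [("Alice", "John"), ("Mary", "Mike"), ("Mary", "Alice"),
         ("Bob", "Mike"), ("Bob", "Mary")]

def fined(x):
    """Walk the relational decision tree for person x; return 'Y' or 'N'."""
    if not speed[x] > 120:                 # speed(X, S), S > 120 ?
        return "N"
    if job[x] == "Politician":             # job(X, politician) ?
        return "N"
    friends = [y for (p, y) in knows if p == x]   # knows(X, Y) ?
    if not friends:
        return "Y"
    if any(job[y] == "Politician" for y in friends):  # job(Y, politician) ?
        return "N"
    return "Y"

for person in speed:
    print(person, fined(person))           # matches the Fine column
```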

RELATIONAL PROBABILITY TREES

Use probabilities on the leaves

Can be used to represent the conditional distributions

Can use regression values on leaves to represent regression functions

[Figure: the same speeding tree as above, with probabilities of being fined at the leaves in place of the Y/N labels: 0.1, 0.2, and 0.4 at the three "not fined" leaves, and 0.8 at the two "fined" leaves.]
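Continuing the previous sketch, the same tree can return a probability instead of a label. The leaf values below use the slide's 0.1 / 0.2 / 0.4 / 0.8 numbers, but the exact assignment of values to leaves is an assumption.

```python
# Sketch: the same tree with probabilities at the leaves (a relational
# probability tree). Which low value sits on which "not fined" leaf is
# an assumption; the data tables are copied from the slide.

speed = {"Bob": 120, "Alice": 150, "John": 180, "Mary": 160, "Mike": 140}
job   = {"Bob": "Teacher", "Alice": "Writer", "John": "Politician",
         "Mary": "Student", "Mike": "Engineer"}
knows = [("Alice", "John"), ("Mary", "Mike"), ("Mary", "Alice"),
         ("Bob", "Mike"), ("Bob", "Mary")]

def p_fined(x):
    """Return P(fined(x)) by walking the tree with probability leaves."""
    if not speed[x] > 120:
        return 0.1
    if job[x] == "Politician":
        return 0.2
    friends = [y for (p, y) in knows if p == x]
    if not friends:
        return 0.8
    if any(job[y] == "Politician" for y in friends):
        return 0.4
    return 0.8

for person in speed:
    print(person, p_fined(person))
```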

STRUCTURE LEARNING PROBLEM

Learn the structure of the conditional distributions: find the parents and the distribution for the target concept.

Target: satisfaction(S, B)

Candidate parents: avgGrade(S, G), avgDiff(S, D), IQ(S, I), level(P, L)

RELATIONAL TREE LEARNING

Fit a relational regression tree to (example, gradient) pairs. Each candidate literal (student(X), paper(X, Y), adviser(X), ...) partitions the examples; a leaf's value is the mean gradient of the examples that reach it.

Examples and gradients:

  X    Δ
  x1   0.7
  x2  -0.2
  x3  -0.9

Facts: student(x1), student(x2), paper(x1, y1), paper(x1, y2), paper(x3, y1)

Learned tree:

  student(X)
  |-- true:  paper(X, Y)
  |          |-- true:   0.7   (x1)
  |          |-- false: -0.2   (x2)
  |-- false: -0.9              (x3)

(Before splitting on paper(X, Y), the student(X) = true branch would hold the mean gradient (0.7 - 0.2) / 2 = 0.25.)
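A minimal sketch (my own illustration) of how a split is chosen on such example–gradient tables: evaluate each candidate literal, partition the gradients, and score the partition. Squared error with mean-valued leaves is used here as one common scoring choice.

```python
# Sketch: scoring candidate splits of a relational regression tree by how well
# the partition means fit the example gradients (squared-error criterion).

gradients = {"x1": 0.7, "x2": -0.2, "x3": -0.9}    # example -> gradient
student   = {"x1", "x2"}                           # student(x1), student(x2)
paper     = {("x1", "y1"), ("x1", "y2"), ("x3", "y1")}

def holds(literal, x):
    """Does the candidate literal succeed for example x?"""
    if literal == "student(X)":
        return x in student
    if literal == "paper(X,Y)":
        return any(p == x for (p, _) in paper)
    raise ValueError(literal)

def split_score(literal, examples):
    """Sum of squared errors when each side of the split predicts its mean gradient."""
    def sse(vals):
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)
    true_side  = [gradients[x] for x in examples if holds(literal, x)]
    false_side = [gradients[x] for x in examples if not holds(literal, x)]
    return sse(true_side) + sse(false_side)

# student(X) gives the lower error, matching the tree on the slide.
for lit in ["student(X)", "paper(X,Y)"]:
    print(lit, round(split_score(lit, gradients), 3))
```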

FUNCTIONAL GRADIENT BOOSTING

Sequentially learn models, where each subsequent model corrects the previous one.

[Figure: start from an initial model; subtract its predictions from the data to obtain residues; induce a regression tree ψ_m on the residues; add it to the model and iterate. The final model is the sum of the initial model and all induced trees.]

Natarajan et al., MLJ 2012
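As background for the algorithm on the next slide (standard for this model family rather than quoted from the talk): when the conditional probability of an example is a sigmoid of the sum of the regression trees, the pointwise functional gradient of the log-likelihood is

$$\Delta(y_i) \;=\; \frac{\partial \log P(y_i \mid \mathrm{Pa}(y_i))}{\partial \psi(y_i)} \;=\; I(y_i = 1) - P(y_i = 1 \mid \mathrm{Pa}(y_i)),$$

so each example's residue is simply its 0/1 label minus the currently predicted probability.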

BOOSTING ALGORITHM

For each gradient step m = 1 to M:
  For each query predicate P:
    Generate a training set using the previous model F_{m-1}:
      For each example x:
        Compute the gradient for x
        Add <x, gradient(x)> to the training set
    Learn a regression function T_{m,P} on this training set
    Add T_{m,P} to the model F_m
  Set F_m as the current model
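A heavily simplified sketch (my own, not the authors' RDN-Boost implementation) of this loop for a single binary query predicate: residues are computed as label minus predicted probability, and a small regression stump (an illustrative stand-in for the relational regression-tree learner) is fit to them at each iteration. The example data is made up.

```python
import math

# Simplified functional-gradient-boosting loop for one binary query predicate.
# The examples and features below are made up; fit_regression_stump is an
# illustrative stand-in for the relational regression-tree learner.

examples = [({"speeding": 1, "knows_politician": 1}, 0),
            ({"speeding": 1, "knows_politician": 0}, 1),
            ({"speeding": 0, "knows_politician": 0}, 0)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fit_regression_stump(data):
    """Fit a one-split regression stump to (features, residue) pairs."""
    best = None
    for feat in data[0][0]:
        for side in (0, 1):
            inside  = [r for f, r in data if f[feat] == side]
            outside = [r for f, r in data if f[feat] != side]
            if not inside or not outside:
                continue
            m_in, m_out = sum(inside) / len(inside), sum(outside) / len(outside)
            sse = (sum((r - m_in) ** 2 for r in inside) +
                   sum((r - m_out) ** 2 for r in outside))
            if best is None or sse < best[0]:
                best = (sse, feat, side, m_in, m_out)
    _, feat, side, m_in, m_out = best
    return lambda f: m_in if f[feat] == side else m_out

model = []                                  # the psi_m regression stumps

def F(features):                            # current model F_m = sum of stumps
    return sum(psi(features) for psi in model)

for m in range(10):                         # gradient steps m = 1..M
    trainset = [(f, y - sigmoid(F(f)))      # residue = I(y = 1) - P(y = 1 | ...)
                for f, y in examples]
    model.append(fit_regression_stump(trainset))

for f, y in examples:
    print(y, round(sigmoid(F(f)), 2))       # predictions move toward the labels
```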

UW-CSE

UW-CSE     AUC-ROC  AUC-PR  Likelihood  Training time
Boosting   0.96     0.93    0.81        9 s
RDN        0.88     0.78    0.80        1 s
Alchemy    0.53     0.62    0.73        93 hrs

• Predict the advisedBy relation
• Given student, professor, courseTA, courseProf, etc. relations
• 5-fold cross-validation

http://pages.cs.wisc.edu/~tushar/rdnboost/index.html

CARDIA

Family history, medical history, physical activity, nutrient intake, obesity questions, psychosocial factors, pulmonary function, etc.

Goal: identify risk factors in early adulthood that cause serious cardiovascular issues in older adults.

Extremely rich dataset with 25 years of information.

S. Natarajan, J. Carr

RESULTS

[Figure: CARDIA results plots.]

IMITATION LEARNING

An expert agent performs actions (trajectories).

Goal: learn a policy from these trajectories to suggest actions based on the current state.

Natarajan et al., IJCAI 2011

Domains: Gridworld and RoboCup

ALZHEIMER'S RESEARCH

AD: a progressive neurodegenerative condition resulting in loss of cognitive abilities and memory.

MRI: a neuroimaging method for visualization of brain anatomy.

Humans are not very good at identifying people with AD, especially before cognitive decline.

MRI data is a major source for distinguishing AD vs. CN (cognitively normal) or MCI (mild cognitive impairment) vs. CN.

Natarajan et al. Under review

PROPOSITIONAL MODELS (WITH AAL)

CONCLUSION

Statistical Relational Learning combines first-order logic with probabilistic models.

Relational trees can be used to represent conditional distributions.

Boosting trees can be used to efficiently learn the structure of SRL models.
