STATISTICAL RELATIONAL LEARNING Joint Work with Sriraam Natarajan, Kristian Kersting, Jude Shavlik

Statistical Relational Learning

Feb 23, 2016


Lois

Transcript
Page 1: Statistical Relational Learning

STATISTICAL RELATIONAL LEARNING

Joint Work with Sriraam Natarajan, Kristian Kersting, Jude Shavlik

Page 2: Statistical Relational Learning

BAYESIAN NETWORKS

Burglary Earthquake

Alarm

JohnCalls

e  b  P(a | e, b)
0  0  0.1
0  1  0.8
1  0  0.6
1  1  0.9

P(e) = 0.01

P(b) = 0.01

MaryCalls
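
As a worked example of what this network encodes, the sketch below marginalizes the Alarm node over its two parents. It assumes the table on this slide gives P(Alarm = 1 | e, b) for (e, b) = (0,0), (0,1), (1,0), (1,1) and that the two 0.01 entries are the priors of Earthquake and Burglary. The names `CPT_ALARM`, `p_alarm`, and `p_burglary_given_alarm` are illustrative and do not come from the slides.

```python
from itertools import product

# Assumed reading of the slide: priors and P(Alarm=1 | e, b).
P_E = 0.01
P_B = 0.01
CPT_ALARM = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.6, (1, 1): 0.9}


def prior(value, p_true):
    """Probability that a binary variable equals `value`, given P(var=1)."""
    return p_true if value == 1 else 1.0 - p_true


def p_alarm():
    """P(Alarm=1), summing the joint over Earthquake and Burglary."""
    return sum(
        prior(e, P_E) * prior(b, P_B) * CPT_ALARM[(e, b)]
        for e, b in product((0, 1), repeat=2)
    )


def p_burglary_given_alarm():
    """P(Burglary=1 | Alarm=1) by Bayes' rule."""
    joint = sum(prior(e, P_E) * P_B * CPT_ALARM[(e, 1)] for e in (0, 1))
    return joint / p_alarm()


if __name__ == "__main__":
    print(f"P(Alarm=1) = {p_alarm():.5f}")
    print(f"P(Burglary=1 | Alarm=1) = {p_burglary_given_alarm():.5f}")
```

The slide gives no table for JohnCalls or MaryCalls, so the sketch stops at the Alarm node.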

Page 3: Statistical Relational Learning

BAYESIAN NETWORK FOR A CITY

Burglary Earthquake

Alarm

Calls(H1) Calls(H3)

Burglary Earthquake

Alarm

Calls(H2)

Burglary Earthquake

Alarm

Calls(H2) Calls(H4)

Burglary Earthquake

Alarm

Calls(H3) Calls(H5)

Burglary Earthquake

Alarm

Calls(H4) Calls(H6)

H1

H2

H3

H4 H5

Page 4: Statistical Relational Learning

SHARED VARIABLES

Earthquake(BL)

Alarm(H1) Alarm(H2) Alarm(H3) Alarm(H4)

Burglary(H1) Burglary(H2) Burglary(H3) Burglary(H4)

Calls(H1) Calls(H2) Calls(H3) Calls(H4) Calls(H5)

Page 5: Statistical Relational Learning

FIRST ORDER LOGIC

Burglary(house) Earthquake(city)

Alarm(house)

Calls(nhouse)

HouseInCity(house, city)

Alarm(house) :- HouseInCity(house, city), Earthquake(city), Burglary(house)

e  b  P(a | e, b)
0  0  0.1
0  1  0.8
1  0  0.6
1  1  0.9

Neighbor(house, nhouse)
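
The rule on this slide works as a template: every binding of `house` and `city` yields one ground dependency, and all houses in the same city share one Earthquake variable. A minimal sketch of that grounding step follows. The facts in `HOUSE_IN_CITY` are hypothetical (four houses in a city called BL, echoing the shared-variables slide), and `ground_alarm_rule` is an illustrative name.

```python
# Hypothetical facts: four houses in one city, echoing the shared-variables slide.
HOUSE_IN_CITY = [("H1", "BL"), ("H2", "BL"), ("H3", "BL"), ("H4", "BL")]


def ground_alarm_rule(house_in_city):
    """Ground the template
        Alarm(house) :- HouseInCity(house, city), Earthquake(city), Burglary(house)
    and return a dict from each ground Alarm atom to its ground parent atoms.
    """
    parents = {}
    for house, city in house_in_city:
        parents[f"Alarm({house})"] = [f"Earthquake({city})", f"Burglary({house})"]
    return parents


if __name__ == "__main__":
    for child, pars in ground_alarm_rule(HOUSE_IN_CITY).items():
        print(child, "<-", ", ".join(pars))
```

Each ground Alarm atom reuses the same conditional probability table, which is what keeps the number of parameters independent of the number of houses.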

Page 6: Statistical Relational Learning

LOGIC + PROBABILITY = STATISTICAL RELATIONAL LEARNING MODELS

Logic

Probabilities

Add Probabilities

Statistical Relational Learning

(SRL)

Add Relations

[Figure: graph over the atoms Diff(C, m), CRating(P, C, R), and PRating(P, R); nodes PRating, CRating, Diff]

Page 7: Statistical Relational Learning

ALPHABETIC SOUP

Knowledge-based model construction [Wellman et al., 1992]

PRISM [Sato & Kameya, 1997]

Stochastic logic programs [Muggleton, 1996]

Probabilistic relational models [Friedman et al., 1999]

Bayesian logic programs [Kersting & De Raedt, 2001]

Bayesian logic [Milch et al., 2005]

Markov logic [Richardson & Domingos, 2006]

Relational dependency networks [Neville & Jensen, 2007]

ProbLog [De Raedt et al., 2007]

And many others!

Page 8: Statistical Relational Learning

RELATIONAL DATABASE

Prof Level

Prof Course Rating

Course Diff

Student Course Grade

Student IQ Satisfaction

Page 9: Statistical Relational Learning

FIRST ORDER LOGIC

Prof(P)

Level(P,L)

Course(C)

Diff(C)

taughtBy(P,C)

ratings(P,C,R)

Student(S)

IQ(S,I)

satis(S,B)

takes(S,C)

grade(S,C,G)

Prof Level

Prof Course Rating

Course Diff

Student Course Grade

Student IQ Satisfaction

Page 10: Statistical Relational Learning

GRAPHICAL MODEL

satisfaction(S, B)

Diff(S, C, D)

grades(S, C, G)

avgGrade(S, G) avgDiff(S, D)

P(satisfaction(S, B) | avgGrade(S, G), avgDiff(S, D))
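
A student can take any number of courses, so the aggregators avgGrade and avgDiff collapse a variable-size set of ground facts into a fixed number of parent values. The sketch below illustrates that idea under stated assumptions: the facts in `GRADES` and `DIFFS` are hypothetical (the slides give no concrete values), and `aggregate_avg` and `parent_values` are illustrative names.

```python
from statistics import mean

# Hypothetical ground facts; the slides do not give concrete values.
GRADES = [("s1", "c1", 4.0), ("s1", "c2", 3.0), ("s2", "c1", 2.0)]  # grades(S, C, G)
DIFFS = [("s1", "c1", 2.0), ("s1", "c2", 4.0), ("s2", "c1", 2.0)]   # Diff(S, C, D)


def aggregate_avg(facts, student):
    """Average the last argument over all facts whose first argument is `student`."""
    values = [value for s, _course, value in facts if s == student]
    return mean(values) if values else None


def parent_values(student):
    """Fixed-size parent assignment (avgGrade, avgDiff) for satisfaction(student, B)."""
    return aggregate_avg(GRADES, student), aggregate_avg(DIFFS, student)


if __name__ == "__main__":
    for s in ("s1", "s2"):
        print(s, parent_values(s))
```

A single conditional distribution over the two aggregated values can then be shared by every student, regardless of how many courses each one takes.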

Page 11: Statistical Relational Learning

RELATIONAL DECISION TREE

speed(X,S), S>120

job(X, politician)

knows(X,Y)

job(Y, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary
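
The transcript keeps the four tests and the five leaf labels of the tree but not its edges. The sketch below therefore uses an assumed branch layout, chosen because it reproduces the leaves (N, N, N, Y, Y) and the Fine column of the table; it is not necessarily the layout drawn on the slide. Job names are lowercased, `knows` is read as a directed relation, and `classify` is an illustrative name.

```python
# Tables from the slide (job names lowercased).
PEOPLE = {
    "Bob": (120, "teacher", "N"),
    "Alice": (150, "writer", "N"),
    "John": (180, "politician", "N"),
    "Mary": (160, "student", "Y"),
    "Mike": (140, "engineer", "Y"),
}
KNOWS = [("Alice", "John"), ("Mary", "Mike"), ("Mary", "Alice"),
         ("Bob", "Mike"), ("Bob", "Mary")]


def job(person):
    return PEOPLE[person][1]


def classify(x):
    """Predict Fine for person x using an ASSUMED branch layout.

    Each inner node is an existential query; a variable bound on the
    'yes' branch (here Y) stays bound in the tests below it.
    """
    speed = PEOPLE[x][0]
    if not speed > 120:                                # speed(X,S), S>120
        return "N"
    if job(x) == "politician":                         # job(X, politician)
        return "N"
    friends = [y for (p, y) in KNOWS if p == x]        # knows(X,Y), directed
    if not friends:
        return "Y"
    if any(job(y) == "politician" for y in friends):   # job(Y, politician)
        return "N"
    return "Y"


if __name__ == "__main__":
    for name, (_speed, _job, fine) in PEOPLE.items():
        print(name, "predicted:", classify(name), "table:", fine)
```

For Alice the tests succeed or fail in the order speed(Alice,150), job(Alice, politician), knows(Alice,John), job(John, politician), which is the sequence of bindings the following slides step through.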

Page 12: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(X, politician)

knows(X,Y)

job(Y, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 13: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(Alice, politician)

knows(X,Y)

job(Y, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 14: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(Alice, politician)

knows(Alice,John)

job(Y, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 15: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(Alice, politician)

knows(Alice,John)

job(John, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 16: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(Alice, politician)

knows(Alice,John)

job(John, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 17: Statistical Relational Learning

RELATIONAL DECISION TREE

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

speed(Alice,150), 150>120

job(Alice, politician)

knows(Alice,John)

job(John, politician)

[Tree figure: leaves N, N, N, Y, Y; branches labeled yes/no]

Page 18: Statistical Relational Learning

RELATIONAL PROBABILITY TREES

Use probabilities on the leaves

Can be used to represent the conditional distributions

Can use regression values on leaves to represent regression functions

speed(X,S), S>120

job(X, politician)

knows(X,Y)

job(Y, politician)

[Tree figure: leaf values 0.1, 0.2, 0.4, 0.8, 0.8; branches labeled yes/no]

Page 19: Statistical Relational Learning

STRUCTURE LEARNING PROBLEM

Learn the structure of the conditional distributions

Find the parents and the distribution for the target concept

satisfaction(S, B)

avgGrade(S, G) avgDiff(S, D)

IQ(S, I) level(P, L)

Page 20: Statistical Relational Learning

RELATIONAL TREE LEARNING

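[Tree figure: root test student(X); under student(X) = T, test paper(X,Y); leaf 0.7 for paper(X,Y) = T, leaf -0.2 for paper(X,Y) = F, leaf -0.9 for student(X) = F]

adviser(X)
X   Δ
x1  0.7
x2  -0.2
x3  -0.9

paper(X, Y)
X   Y
x1  y1
x1  y2
x3  y1

student(X)
X
x1
x2

student(X) = T
X   Δ
x1  0.7
x2  -0.2

student(X) = F
X   Δ
x3  -0.9

paper(X,Y) = T
X   Δ
x1  0.7

paper(X,Y) = F
X   Δ
x2  -0.2

0.25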
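
The example on this slide can be replayed as one step of regression-tree induction: try each candidate test, split the examples, and keep the test whose two groups are closest to their own means. The sketch below assumes a plain squared-error criterion, which the slide does not state, and treats paper(X,Y) as an existential test on Y. `TESTS`, `sse`, `split`, and `best_test` are illustrative names.

```python
from statistics import mean

# Tables from the slide: regression targets (Δ) for adviser(X), plus facts.
DELTA = {"x1": 0.7, "x2": -0.2, "x3": -0.9}
STUDENT = {"x1", "x2"}
PAPER = {("x1", "y1"), ("x1", "y2"), ("x3", "y1")}

TESTS = {
    "student(X)": lambda x: x in STUDENT,
    "paper(X,Y)": lambda x: any(a == x for a, _y in PAPER),  # exists Y
}


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def split(examples, test):
    """Partition examples into those that satisfy the test and those that do not."""
    yes = [x for x in examples if test(x)]
    no = [x for x in examples if not test(x)]
    return yes, no


def split_error(examples, test):
    yes, no = split(examples, test)
    return sse([DELTA[x] for x in yes]) + sse([DELTA[x] for x in no])


def best_test(examples):
    """Name of the candidate test with the lowest total squared error."""
    return min(TESTS, key=lambda name: split_error(examples, TESTS[name]))


if __name__ == "__main__":
    examples = list(DELTA)
    root = best_test(examples)
    yes, no = split(examples, TESTS[root])
    print("root test:", root)
    print("mean Δ on the T branch:", mean([DELTA[x] for x in yes]))
```

Under these assumptions student(X) is selected at the root, and the mean Δ of its T branch (x1 and x2) is 0.25, the same value that appears on the slide.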

Page 21: Statistical Relational Learning

FUNCTIONAL GRADIENT BOOSTING

Sequentially learn models where each subsequent model corrects the previous model

[Figure: boosting loop. Data - Predictions = Residues; Initial Model; Induce; Iterate; Final Model = sum of the induced models, labeled ψm]

Natarajan et al MLJ’12

Page 22: Statistical Relational Learning

BOOSTING ALGORITHM

For each gradient step m = 1 to M

    For each query predicate, P

        Generate trainset using previous model, Fm-1

            For each example, x

                Compute gradient for x

                Add <x, gradient(x)> to trainset

        Learn a regression function, Tm,p

        Add Tm,p to the model, Fm

    Set Fm as current model
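
The loop above can be sketched in a few lines for a single query predicate. This is a simplified, propositional stand-in and not the authors' implementation: examples are tuples of booleans instead of relational facts, each regression function is a one-split stump instead of a relational regression tree, and the gradient is assumed to be I(y = 1) - P(y = 1 | x) with P given by a sigmoid of the summed regression values. `fit_stump` and `boost` are illustrative names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(trainset, n_features):
    """Fit a one-split regression stump to <x, gradient> pairs by squared error."""
    best = None
    for j in range(n_features):
        yes = [g for x, g in trainset if x[j]]
        no = [g for x, g in trainset if not x[j]]
        mean_yes = sum(yes) / len(yes) if yes else 0.0
        mean_no = sum(no) / len(no) if no else 0.0
        err = (sum((g - mean_yes) ** 2 for g in yes)
               + sum((g - mean_no) ** 2 for g in no))
        if best is None or err < best[0]:
            best = (err, j, mean_yes, mean_no)
    _, j, mean_yes, mean_no = best
    return lambda x: mean_yes if x[j] else mean_no


def boost(examples, n_steps):
    """examples: list of (x, y) with x a tuple of booleans and y in {0, 1}."""
    n_features = len(examples[0][0])
    trees = []  # the initial model is the empty sum, so psi(x) = 0

    def psi(x):
        return sum(tree(x) for tree in trees)

    for _ in range(n_steps):  # gradient step m = 1..M
        # Generate the trainset using the previous model (assumed gradient form).
        trainset = [(x, y - sigmoid(psi(x))) for x, y in examples]
        # Learn a regression function and add it to the model.
        trees.append(fit_stump(trainset, n_features))
    return lambda x: sigmoid(psi(x))


if __name__ == "__main__":
    data = [((True, False), 1), ((True, True), 1),
            ((False, True), 0), ((False, False), 0)]
    model = boost(data, n_steps=10)
    for x, y in data:
        print(x, y, round(model(x), 3))
```

With several query predicates, the same loop would run once per predicate inside each gradient step, as in the pseudocode.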

Page 23: Statistical Relational Learning

UW-CSE

UW-CSE    AUC-ROC  AUC-PR  Likelihood  Training Time
Boosting  0.96     0.93    0.81        9 s
RDN       0.88     0.78    0.80        1 s
Alchemy   0.53     0.62    0.73        93 hrs

• Predict advisedBy relation

• Given student, professor, courseTA, courseProf, etc. relations

• 5-fold cross validation

http://pages.cs.wisc.edu/~tushar/rdnboost/index.html

Page 24: Statistical Relational Learning

CARDIA

Family history, medical history, physical activity, nutrient intake, obesity questions, psychosocial, pulmonary function, etc.

Goal is to identify risk factors in early adulthood that cause serious cardiovascular issues in older adults

Extremely rich dataset with 25 years of information

S. Natarajan , J. Carr

Page 25: Statistical Relational Learning

RESULTS

Page 26: Statistical Relational Learning

IMITATION LEARNING

Expert agent performs actions (trajectories)

Goal: Learn a policy from these trajectories to suggest actions based on current state

Natarajan et al. IJCAI’11
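
To make the goal concrete, the sketch below learns a policy from trajectories in the simplest possible way: a lookup table that returns the expert's most frequent action in each state. This tabular majority vote is a swapped-in baseline for illustration only, not the relational method of the cited paper. The gridworld trajectories and the name `learn_policy` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical expert trajectories in a gridworld: lists of (state, action) pairs,
# where a state is a (row, col) cell.
TRAJECTORIES = [
    [((0, 0), "right"), ((0, 1), "right"), ((0, 2), "down")],
    [((0, 0), "right"), ((0, 1), "down"), ((1, 1), "right")],
    [((0, 0), "down"), ((1, 0), "right"), ((1, 1), "right")],
    [((0, 1), "right")],
]


def learn_policy(trajectories):
    """Tabular imitation: map each state to the expert's most frequent action."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


if __name__ == "__main__":
    policy = learn_policy(TRAJECTORIES)
    for state in sorted(policy):
        print(state, "->", policy[state])
```

A table like this cannot suggest an action for a state the expert never visited, which is one motivation for learning policies over relational descriptions of the state instead.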

Page 27: Statistical Relational Learning

Gridworld domain Robocup domain

Page 28: Statistical Relational Learning

ALZHEIMER'S RESEARCH

AD – Progressive neurodegenerative condition resulting in loss of cognitive abilities and memory

MRI – neuroimaging method; visualization of brain anatomy

Humans are not very good at identifying people with AD, especially before cognitive decline

MRI data – major source for distinguishing AD vs CN (Cognitively normal) or MCI vs CN

Natarajan et al. Under review

Page 29: Statistical Relational Learning

PROPOSITIONAL MODELS (WITH AAL)

Page 30: Statistical Relational Learning

CONCLUSION

Statistical Relational Learning combines first-order logic with probabilistic models

Relational trees used to represent conditional distributions

Boosting trees can be used to efficiently learn structure of SRL models

Page 31: Statistical Relational Learning