Random Walk Inference Over Knowledge Bases and Text 11-805 class presentation Matt Gardner work done by: Ni Lao, Matt Gardner, and collaborators 1
Random Walk Inference Over Knowledge Bases and Textwcohen/10-605/2015-guest-lecture/matt-805-pres… · Random Walk Inference Over Knowledge Bases and Text 11-805 class presentation

May 24, 2020

Random Walk Inference Over Knowledge Bases and Text

11-805 class presentation
Matt Gardner

work done by: Ni Lao, Matt Gardner, and collaborators

1

What is a knowledge base?

2

Why knowledge bases?

3

But ...

6

So we do inference

9

So we do inference

- Predict missing facts given what we know

10

So we do inference

- Predict missing facts given what we know

- There are lots of ways to do this; today we’ll talk about random walk inference, or the Path Ranking Algorithm (PRA)

11

So we do inference

- Predict missing facts given what we know

- There are lots of ways to do this; today we’ll talk about random walk inference, or the Path Ranking Algorithm (PRA)

- [Lao, Mitchell, Cohen, EMNLP 2011]

12

Inference

If: competes with(x1, x2) and economic sector(x2, x3)

Then: economic sector(x1, x3)

[Lao et al, EMNLP 2011]

13
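
The rule on this slide can be read as a Horn clause applied to a set of triples. A minimal sketch, assuming a toy KB whose entity names (`CompanyA`, `CompanyB`, `Steel`) and underscore-style relation names are made up for illustration and do not come from the slides:

```python
def apply_rule(triples):
    """Infer economic_sector(x1, x3) from competes_with(x1, x2) and economic_sector(x2, x3)."""
    competes = [(s, o) for s, r, o in triples if r == "competes_with"]
    sector = [(s, o) for s, r, o in triples if r == "economic_sector"]
    inferred = set()
    for x1, x2 in competes:
        for y2, x3 in sector:
            fact = (x1, "economic_sector", x3)
            # Only report facts that are missing from the KB.
            if x2 == y2 and fact not in triples:
                inferred.add(fact)
    return inferred


# Hypothetical facts, invented for this example.
kb = {
    ("CompanyA", "competes_with", "CompanyB"),
    ("CompanyB", "economic_sector", "Steel"),
}
new_facts = apply_rule(kb)
print(new_facts)
```

A hard rule like this either fires or does not; PRA instead treats each such path as a feature and learns a weight for it.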

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count

[Lao et al, EMNLP 2011]

14

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 1

[Lao et al, EMNLP 2011]

16

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 1
CityInState, CityInState^-1, CityLocatedInCountry           1

[Lao et al, EMNLP 2011]

17

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 1
CityInState, CityInState^-1, CityLocatedInCountry           2

[Lao et al, EMNLP 2011]

18

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 1
CityInState, CityInState^-1, CityLocatedInCountry           2
AtLocation^-1, AtLocation, CityLocatedInCountry             1

[Lao et al, EMNLP 2011]

19

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 2
CityInState, CityInState^-1, CityLocatedInCountry           24
AtLocation^-1, AtLocation, CityLocatedInCountry             10
CityLiesOnRiver, CityLiesOnRiver^-1, CityLocatedInCountry   1
…                                                           …

[Lao et al, EMNLP 2011]

20

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 3,892
CityInState, CityInState^-1, CityLocatedInCountry           234
AtLocation^-1, AtLocation, CityLocatedInCountry             1,543
CityLiesOnRiver, CityLiesOnRiver^-1, CityLocatedInCountry   123
…                                                           …

[Lao et al, EMNLP 2011]

21

CityLocatedInCountry - Selecting path features

[Figure: example knowledge graph. Nodes: Pittsburgh, Philadelphia, Harrisburg, Pennsylvania, U.S., Delta, PPG, Atlanta, Dallas, Tokyo, Japan, Cincinnati, Ohio River. Edge labels: CityInState, StateInCountry, AtLocation, CityLocatedInCountry, CityLiesOnRiver.]

Path type                                                   Count
CityInState, StateInCountry                                 8,172
CityInState, CityInState^-1, CityLocatedInCountry           2,234
AtLocation^-1, AtLocation, CityLocatedInCountry             5,273
CityLiesOnRiver, CityLiesOnRiver^-1, CityLocatedInCountry   298
…                                                           …

Select the most frequent path types, and keep them as features in the model

[Lao et al, EMNLP 2011]

22
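
The selection step above can be sketched as counting, for each training pair, which relation sequences connect the source to the target, then keeping the most frequent ones. This is only an illustration under stated assumptions: the triples below are a hand-made toy graph (the slide's actual edges are not recoverable), paths are enumerated exhaustively rather than by random walks, each path type is counted once per training pair, and `build_graph`, `path_types`, and `select_path_features` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Toy triples (subject, relation, object); assumed for illustration only.
TRIPLES = [
    ("Pittsburgh", "CityInState", "Pennsylvania"),
    ("Philadelphia", "CityInState", "Pennsylvania"),
    ("Harrisburg", "CityInState", "Pennsylvania"),
    ("Pennsylvania", "StateInCountry", "U.S."),
    ("Philadelphia", "CityLocatedInCountry", "U.S."),
    ("Pittsburgh", "CityLiesOnRiver", "Ohio River"),
    ("Cincinnati", "CityLiesOnRiver", "Ohio River"),
    ("Cincinnati", "CityLocatedInCountry", "U.S."),
]


def build_graph(triples):
    """Adjacency lists with an explicit inverse edge (suffix ^-1) for every triple."""
    graph = defaultdict(list)
    for s, r, o in triples:
        graph[s].append((r, o))
        graph[o].append((r + "^-1", s))
    return graph


def path_types(graph, source, target, max_len=3):
    """Set of relation sequences of length <= max_len leading from source to target."""
    found = set()
    stack = [(source, ())]
    while stack:
        node, path = stack.pop()
        if node == target and path:
            found.add(path)
        if len(path) < max_len:
            for rel, nxt in graph[node]:
                stack.append((nxt, path + (rel,)))
    return found


def select_path_features(triples, training_pairs, k=2, max_len=3):
    """Count path types over the training pairs and keep the k most frequent."""
    graph = build_graph(triples)
    counts = Counter()
    for source, target in training_pairs:
        counts.update(path_types(graph, source, target, max_len))
    return counts, [path for path, _ in counts.most_common(k)]


counts, selected = select_path_features(
    TRIPLES, [("Pittsburgh", "U.S."), ("Harrisburg", "U.S.")]
)
for path, count in counts.most_common():
    print(", ".join(path), count)
```

Exhaustive enumeration is only practical on a graph this small; on a real KB the candidate paths would be found by sampling walks.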

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry

[Figure: walk starts at Pittsburgh]

[Lao et al, EMNLP 2011]

23

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania]

[Lao et al, EMNLP 2011]

24

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14)]

[Lao et al, EMNLP 2011]

25

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Lao et al, EMNLP 2011]

26

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0

Pr(U.S. | Pittsburgh, TypedPath)

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Lao et al, EMNLP 2011]

27

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0
AtLocation^-1, AtLocation, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Lao et al, EMNLP 2011]

28

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0
AtLocation^-1, AtLocation, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Figure: walk Pittsburgh -AtLocation^-1-> Delta, PPG]

[Lao et al, EMNLP 2011]

29

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0
AtLocation^-1, AtLocation, CityLocatedInCountry

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Figure: walk Pittsburgh -AtLocation^-1-> Delta, PPG -AtLocation-> Atlanta, Dallas, Tokyo]

[Lao et al, EMNLP 2011]

30

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0
AtLocation^-1, AtLocation, CityLocatedInCountry     0.6

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Figure: walk Pittsburgh -AtLocation^-1-> Delta, PPG -AtLocation-> Atlanta, Dallas, Tokyo -CityLocatedInCountry-> U.S., Japan]

[Lao et al, EMNLP 2011]

31

CityLocatedInCountry - Computing feature values for (Pittsburgh, USA)

Feature = Typed Path                                Feature Value
CityInState, CityInState^-1, CityLocatedInCountry   1.0
AtLocation^-1, AtLocation, CityLocatedInCountry     0.6
…                                                   …

[Figure: walk Pittsburgh -CityInState-> Pennsylvania -CityInState^-1-> Philadelphia, Harrisburg, …(14) -CityLocatedInCountry-> U.S.]

[Figure: walk Pittsburgh -AtLocation^-1-> Delta, PPG -AtLocation-> Atlanta, Dallas, Tokyo -CityLocatedInCountry-> U.S., Japan]

This is a row in a feature matrix! Use standard logistic regression

[Lao et al, EMNLP 2011]

32
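
The feature value Pr(target | source, typed path) can be sketched by pushing a probability distribution along the path, splitting mass uniformly over the matching edges at each step. The toy graph below is an assumption and is not meant to reproduce the 1.0 and 0.6 on the slides; `build_index` and `walk_distribution` are hypothetical names, and exact fractions are used only to keep the arithmetic easy to check.

```python
from collections import defaultdict
from fractions import Fraction

# Toy triples (subject, relation, object); assumed for illustration only.
TRIPLES = [
    ("Pittsburgh", "CityInState", "Pennsylvania"),
    ("Philadelphia", "CityInState", "Pennsylvania"),
    ("Harrisburg", "CityInState", "Pennsylvania"),
    ("Pittsburgh", "CityLocatedInCountry", "U.S."),
    ("Philadelphia", "CityLocatedInCountry", "U.S."),
    ("Harrisburg", "CityLocatedInCountry", "U.S."),
    ("Atlanta", "CityLocatedInCountry", "U.S."),
    ("Dallas", "CityLocatedInCountry", "U.S."),
    ("Tokyo", "CityLocatedInCountry", "Japan"),
    ("PPG", "AtLocation", "Pittsburgh"),
    ("Delta", "AtLocation", "Pittsburgh"),
    ("Delta", "AtLocation", "Atlanta"),
    ("Delta", "AtLocation", "Dallas"),
    ("Delta", "AtLocation", "Tokyo"),
]


def build_index(triples):
    """Map (node, relation) to neighbors; inverse relations carry the suffix ^-1."""
    index = defaultdict(list)
    for s, r, o in triples:
        index[(s, r)].append(o)
        index[(o, r + "^-1")].append(s)
    return index


def walk_distribution(index, source, path):
    """Distribution over end nodes of a walk that follows `path` from `source`."""
    dist = {source: Fraction(1)}
    for rel in path:
        nxt = defaultdict(Fraction)
        for node, prob in dist.items():
            neighbors = index.get((node, rel), [])
            # Mass at a node with no matching edge is dropped.
            for nb in neighbors:
                nxt[nb] += prob / len(neighbors)
        dist = dict(nxt)
    return dist


index = build_index(TRIPLES)
paths = [
    ("CityInState", "CityInState^-1", "CityLocatedInCountry"),
    ("AtLocation^-1", "AtLocation", "CityLocatedInCountry"),
]
# One row of the feature matrix, for the pair (Pittsburgh, U.S.).
row = [walk_distribution(index, "Pittsburgh", p).get("U.S.", Fraction(0)) for p in paths]
print([float(v) for v in row])
```

On a large graph the same quantity would typically be estimated by sampling walks instead of propagating the full distribution.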

PRA Feature Matrix: CityLocatedInCountry

                      CityInState,            AtLocation^-1,
                      CityInState^-1,         AtLocation,
                      CityLocatedInCountry    CityLocatedInCountry    …
(Pittsburgh, USA)     1.0                     0.6                     …
(Pittsburgh, Japan)   0.0                     0.4                     …
…
(Tokyo, USA)          0.0                     0.2                     …
(Tokyo, Japan)        0.2                     0.1                     …
…

33
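
Once the rows exist, fitting the classifier is ordinary logistic regression. A minimal sketch in plain Python, assuming the matrix rows read (1.0, 0.6), (0.0, 0.4), (0.0, 0.2), (0.2, 0.1) and assuming labels of 1 when the city really is in the country; the learning rate, step count, and function names are arbitrary choices for the example, not values from the talk.

```python
import math

# Feature rows as read from the slide (row assignment is an assumption).
PAIRS = [("Pittsburgh", "USA"), ("Pittsburgh", "Japan"), ("Tokyo", "USA"), ("Tokyo", "Japan")]
X = [[1.0, 0.6], [0.0, 0.4], [0.0, 0.2], [0.2, 0.1]]
y = [1, 0, 0, 1]  # assumed labels: 1 if the city is located in the country


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, row):
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def log_loss(weights, bias):
    """Mean negative log-likelihood over the four rows."""
    total = 0.0
    for row, label in zip(X, y):
        p = predict(weights, bias, row)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(X)


def train(steps=2000, lr=0.5):
    """Batch gradient descent on the logistic loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for row, label in zip(X, y):
            err = predict(weights, bias, row) - label
            grad_w = [g + err * x for g, x in zip(grad_w, row)]
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


initial_loss = log_loss([0.0, 0.0], 0.0)
weights, bias = train()
final_loss = log_loss(weights, bias)
probs = [predict(weights, bias, row) for row in X]
for pair, p in zip(PAIRS, probs):
    print(pair, round(p, 3))
```

The learned weights play the role of per-path confidences: a path type that reliably ends at the right country gets a large positive weight.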

PRA Feature Matrix: CityLocatedInCountry

                      CityInState,            AtLocation^-1,
                      CityInState^-1,         AtLocation,
                      CityLocatedInCountry    CityLocatedInCountry    …
(Pittsburgh, USA)     1.0                     0.6                     …
(Pittsburgh, Japan)   0.0                     0.4                     …
…
(Tokyo, USA)          0.0                     0.2                     …
(Tokyo, Japan)        0.2                     0.1                     …
…

34

Large data sets?

Implementation

38

Implementation

- Small graph: in memory

39

Implementation

- Small graph: in memory

- Larger graph: GraphChi

40

Implementation

- Small graph: in memory

- Larger graph: GraphChi

- Vertex-centric computation. Go through the graph sequentially, processing each walk at each vertex and sending it to the next stop.

41

Implementation

- Small graph: in memory

- Larger graph: GraphChi

- Vertex-centric computation. Go through the graph sequentially, processing each walk at each vertex and sending it to the next stop.

- On a cluster: have a graph server

42
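
The vertex-centric idea can be illustrated with a small simulation: walks are stored at the vertex where they currently sit, and each pass visits the vertices in order, advances every walk waiting there by one hop, and hands it to the next vertex. This is plain Python written for this example and does not use GraphChi's API; the edge list, the untyped edges, and `run_walks` are all assumptions.

```python
import random
from collections import defaultdict

# Toy undirected edges; assumed for illustration only.
EDGES = [
    ("Pittsburgh", "Pennsylvania"),
    ("Philadelphia", "Pennsylvania"),
    ("Harrisburg", "Pennsylvania"),
    ("Pennsylvania", "U.S."),
]


def build_adjacency(edges):
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return dict(adjacency)


def run_walks(adjacency, start_nodes, walks_per_node, hops, seed=0):
    """Advance all walks one hop per pass, visiting vertices sequentially."""
    rng = random.Random(seed)
    buckets = defaultdict(list)  # vertex -> walks currently waiting there
    for node in start_nodes:
        for _ in range(walks_per_node):
            buckets[node].append([node])
    for _ in range(hops):
        next_buckets = defaultdict(list)
        for vertex in sorted(adjacency):  # one sequential pass over the graph
            for walk in buckets.get(vertex, []):
                nxt = rng.choice(adjacency[vertex])
                next_buckets[nxt].append(walk + [nxt])  # send walk to its next stop
        buckets = next_buckets
    return [walk for walks in buckets.values() for walk in walks]


adjacency = build_adjacency(EDGES)
walks = run_walks(adjacency, ["Pittsburgh"], walks_per_node=5, hops=3)
for walk in walks:
    print(" -> ".join(walk))
```

The point of this layout is that each pass only needs the edges of the vertex being processed, which is what makes a sequential, disk-friendly sweep possible.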

So...

43

So...

44

- KB inference is all well and good, but…

So...

45

- KB inference is all well and good, but…

- What about text?

Inference over KB plus text

46

Inference over KB plus text

47

- Augment NELL or Freebase with information automatically extracted from text

Inference over KB plus text

48

- Augment NELL or Freebase with information automatically extracted from text

- From the combined graph, predict new NELL (or Freebase) relations

Inference over KB plus text

49

- Augment NELL or Freebase with information automatically extracted from text

- From the combined graph, predict new NELL (or Freebase) relations

- Basically “aggregate” relation extraction

Simple relation embeddings

PCA

50

“Steel City overlooks the Allegheny”
“Pittsburgh lies on the Mon”
“Pittsburgh sits on the Monongahela”

51

“Steel City overlooks the Allegheny”
“Pittsburgh lies on the Mon”
“Pittsburgh sits on the Monongahela”

[Figure: noun-phrase nodes “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”.]

52

“Steel City overlooks the Allegheny”
“Pittsburgh lies on the Mon”
“Pittsburgh sits on the Monongahela”

[Figure: noun-phrase nodes “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, connected by text edges “overlooks”, “lies on”, and “sits on”.]

53

“Steel City overlooks the Allegheny”
“Pittsburgh lies on the Mon”
“Pittsburgh sits on the Monongahela”

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”; text edges “overlooks”, “lies on”, and “sits on” between noun phrases.]

54

“Steel City overlooks the Allegheny”
“Pittsburgh lies on the Mon”
“Pittsburgh sits on the Monongahela”

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges “overlooks”, “lies on”, and “sits on” between noun phrases.]

55

Relation: cityLiesOnRiver

Path: canReferTo^-1, “lies on”, canReferTo

KB + Text

56

Relation: cityLiesOnRiver

Path: canReferTo^-1, “lies on”, canReferTo

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges “overlooks”, “lies on”, and “sits on” between noun phrases.]

KB + Text

57
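
Following the path canReferTo^-1, “lies on”, canReferTo through the combined graph can be sketched as below. The `np:` prefix for noun-phrase nodes and the specific canReferTo links (for example “the Mon” referring to the Monongahela River) are assumptions made for the example, as are the helper names.

```python
from collections import defaultdict

# Combined KB + text graph; noun phrases carry an "np:" prefix (an assumed convention).
TRIPLES = [
    ("np:Steel City", "canReferTo", "Pittsburgh"),
    ("np:Pittsburgh", "canReferTo", "Pittsburgh"),
    ("np:the Allegheny", "canReferTo", "Allegheny River"),
    ("np:the Mon", "canReferTo", "Monongahela River"),
    ("np:the Monongahela", "canReferTo", "Monongahela River"),
    ("np:Steel City", "overlooks", "np:the Allegheny"),
    ("np:Pittsburgh", "lies on", "np:the Mon"),
    ("np:Pittsburgh", "sits on", "np:the Monongahela"),
]


def build_index(triples):
    """Map (node, relation) to neighbors; inverse relations carry the suffix ^-1."""
    index = defaultdict(set)
    for s, r, o in triples:
        index[(s, r)].add(o)
        index[(o, r + "^-1")].add(s)
    return index


def follow_path(index, source, path):
    """Set of nodes reachable from `source` by following the relation sequence `path`."""
    frontier = {source}
    for rel in path:
        frontier = {nb for node in frontier for nb in index.get((node, rel), set())}
    return frontier


index = build_index(TRIPLES)
lies_on = follow_path(index, "Pittsburgh", ("canReferTo^-1", "lies on", "canReferTo"))
overlooks = follow_path(index, "Pittsburgh", ("canReferTo^-1", "overlooks", "canReferTo"))
print(lies_on, overlooks)
```

Each surface verb gives its own path type here, so evidence for the same relation ends up split across several rarely seen features.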

Relation: cityLiesOnRiver

Path: canReferTo^-1, “lies on”, canReferTo

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges “overlooks”, “lies on”, and “sits on” between noun phrases.]

- Large data problem: verb forms are sparse!

- Can clustering help? [Gardner et al., EMNLP 2013]

- “lies on” -> C1
- “sits on” -> C1
- “overlooks” -> C2

KB + Text

59

Relation: cityLiesOnRiver

Path: canReferTo^-1, “C1”, canReferTo

KB + Clustered Text

60
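
Replacing each surface verb by its cluster label is a simple relabeling of the text edges. A small sketch using the cluster assignment shown on the slides (“lies on” and “sits on” in C1, “overlooks” in C2); how the clusters are obtained is not covered here, and `cluster_edges` is a hypothetical helper.

```python
# Cluster assignment as given on the slides.
CLUSTERS = {"lies on": "C1", "sits on": "C1", "overlooks": "C2"}

# Text edges between noun phrases, taken from the three example sentences.
TEXT_EDGES = [
    ("Steel City", "overlooks", "the Allegheny"),
    ("Pittsburgh", "lies on", "the Mon"),
    ("Pittsburgh", "sits on", "the Monongahela"),
]


def cluster_edges(edges, clusters):
    """Relabel each edge with its cluster id, leaving unknown labels unchanged."""
    return [(s, clusters.get(r, r), o) for s, r, o in edges]


clustered = cluster_edges(TEXT_EDGES, CLUSTERS)
# Number of text edges that can match the middle step of the path feature.
raw_support = sum(1 for _, r, _ in TEXT_EDGES if r == "lies on")
clustered_support = sum(1 for _, r, _ in clustered if r == "C1")
print(raw_support, clustered_support)
```

After relabeling, “lies on” and “sits on” feed the same path feature, so it is observed more often.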

Relation: cityLiesOnRiver

Path: canReferTo^-1, “C1”, canReferTo

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges between noun phrases labeled “C2”, “C1”, and “C1”.]

KB + Clustered Text

61

Relation: cityLiesOnRiver

Path: canReferTo^-1, “C1”, canReferTo

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges between noun phrases labeled “C2”, “C1”, and “C1”.]

- Much better, but still can be sparse

- Use vector space similarity directly [Gardner et al., EMNLP 2014]

- “lies on” -> [.4, .3, -.1, .5]
- “sits on” -> [.35, .3, -.2, .4]
- “overlooks” -> [.2, .4, -.05, .2]

KB + Clustered Text

64

Relation: cityLiesOnRiver

Path: canReferTo^-1, “lies on”: [.4, .3, -.1, .5], canReferTo

KB + Text Vectors

65
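
With vectors attached to the text edges, a walk that is supposed to follow “lies on” can also follow edges whose vectors are close to it. The slides do not say how similarity enters the walk, so the sketch below is one plausible reading: cosine similarity turned into edge-choice weights with a softmax, where `beta` is a hypothetical sharpness parameter and `edge_type_weights` is a made-up helper name.

```python
import math

# Relation vectors as shown on the slides.
VECTORS = {
    "lies on": [0.4, 0.3, -0.1, 0.5],
    "sits on": [0.35, 0.3, -0.2, 0.4],
    "overlooks": [0.2, 0.4, -0.05, 0.2],
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def edge_type_weights(path_label, beta=5.0):
    """Softmax over cosine similarity to the path's edge label (beta is an assumption)."""
    target = VECTORS[path_label]
    scores = {label: math.exp(beta * cosine(target, vec)) for label, vec in VECTORS.items()}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


weights = edge_type_weights("lies on")
for label, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(label, round(weight, 3))
```

Unlike hard clusters, this keeps a graded preference: “sits on” is almost as good a match for “lies on” as the label itself, while “overlooks” is followed less often.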

Relation: cityLiesOnRiver

Path: canReferTo^-1, “lies on”: [.4, .3, -.1, .5], canReferTo

[Figure: KB entities Pittsburgh, Allegheny River, and Monongahela River; noun phrases “Steel City”, “Pittsburgh”, “the Allegheny”, “the Mon”, and “the Monongahela”, each linked to a KB entity by a canReferTo edge; text edges between noun phrases labeled “overlooks”: [.2, .4, -.05, .2], “lies on”: [.4, .3, -.1, .5], and “sits on”: [.35, .3, -.2, .4].]

KB + Text Vectors

66

Empirical Results

70

Questions?

71