Knowledge and Tree-Edits in Learnable Entailment Proofs
Asher Stern and Ido Dagan (earlier partial version by Roy Bar-Haim)
Download at: http://www.cs.biu.ac.il/~nlp/downloads/biutee
BIUTEE

Dec 26, 2015

Transcript
Page 1

Knowledge and Tree-Edits in Learnable Entailment Proofs

Asher Stern and Ido Dagan (earlier partial version by Roy Bar-Haim)

Download at: http://www.cs.biu.ac.il/~nlp/downloads/biutee

BIUTEE

Page 2

Transformation-based Inference

Sequence of transformations (A proof)

Tree edits: complete proofs via a limited, pre-defined set of operations; estimate confidence in each operation

Knowledge-based entailment rules: arbitrary knowledge-based transformations; formalize many types of knowledge

T = T0 → T1 → T2 → … → Tn = H

Page 3

Page 4

Transformation-based RTE - Example

T = T0 → T1 → T2 → … → Tn = H

Text: The boy was located by the police.

Hypothesis: Eventually, the police found the child.

Page 5

Transformation-based RTE - Example

T = T0 → T1 → T2 → … → Tn = H

Text: The boy was located by the police.

The police located the boy.

The police found the boy.

The police found the child.

Hypothesis: Eventually, the police found the child.

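As a toy illustration, the chain of transformations above can be replayed at the string level. This is only a sketch: real BIUTEE transformations operate on dependency parse trees, and the lambda rules below are invented string-level stand-ins.

```python
# Replay the proof T = T0 -> T1 -> ... -> Tn = H on plain strings.
# Each rule is a hypothetical string-level stand-in for a tree transformation.
steps = [
    ("passive to active",       lambda s: "The police located the boy."),
    ("X locate Y -> X find Y",  lambda s: s.replace("located", "found")),
    ("boy -> child",            lambda s: s.replace("boy", "child")),
    ("tree-edit insertion",     lambda s: "Eventually, " + s[0].lower() + s[1:]),
]

t = "The boy was located by the police."
chain = [t]
for name, op in steps:
    t = op(t)
    chain.append(t)

print(chain[-1])  # -> Eventually, the police found the child.
```

Each intermediate string in `chain` corresponds to one generated tree Ti in the proof.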

Page 6

Transformation-based RTE - Example

T = T0 → T1 → T2 → … → Tn = H

Page 7

BIUTEE’s Inference Formalism

Analogy to logic proof systems:

Parse trees ↔ Propositions

Tree transformation/generation ↔ Inference steps

Sequence of generated trees T → … → Ti → … → H ↔ Proof

Page 8

BIUTEE Goals

Entailment rules: supported by many types of knowledge

Tree edits: allow complete proofs

BIUTEE: integrates the benefits of both; estimates the confidence of both

Page 9

Challenges / System Components

How to…

1. generate linguistically motivated complete proofs?
2. estimate proof confidence?
3. find the best proof?
4. learn the model parameters?

Page 10

1. Generate linguistically motivated complete proofs

Page 11

Knowledge-based Entailment Rules

boy → child

Generic Syntactic

Lexical Syntactic

Lexical

Bar-Haim et al. 2007. Semantic inference at the lexical-syntactic level.

Page 12

Extended Tree Edits (On-the-Fly Operations)

Predefined custom tree edits: insert node on the fly; move node / move sub-tree on the fly; flip part of speech; …

Heuristically capture linguistic phenomena: operation definition; features to estimate confidence

Page 13

Proof over Parse Trees - Example

T = T0 → T1 → T2 → … → Tn = H

Text: The boy was located by the police.

Passive to active

The police located the boy.

X locate Y → X find Y

The police found the boy.

boy → child

The police found the child.

Tree-edit insertion

Hypothesis: Eventually, the police found the child.

Page 14

Co-reference Substitution

For co-referring subtrees S1, S2: copy the source tree containing S1 while replacing it with S2.

My brother is a musician. He plays the drums.
→ My brother plays the drums.

[Figure: dependency trees for "My brother is a musician. He plays the drums."; the subtree for "he" is replaced by the subtree for "my brother", yielding the tree for "My brother plays the drums."]
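The substitution itself can be sketched on toy trees. This is a minimal illustration, assuming a simple (word, children) tuple representation; BIUTEE works on full dependency parse trees with edge labels such as subj, obj, and gen.

```python
def substitute_coref(tree, s1, s2):
    """Copy `tree`, replacing every occurrence of subtree s1 with s2.
    Trees are (word, [children]) tuples."""
    if tree == s1:
        return s2
    word, children = tree
    return (word, [substitute_coref(c, s1, s2) for c in children])

# "He plays the drums."  with the co-reference  he -> "my brother"
he = ("he", [])
brother = ("brother", [("my", [])])
sentence = ("play", [he, ("drum", [("the", [])])])

print(substitute_coref(sentence, he, brother))
# -> ('play', [('brother', [('my', [])]), ('drum', [('the', [])])])
```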

Page 15

2. Estimate proof confidence

Page 16

Cost-based Model (variant of Raina et al., 2005)

Define operation cost: represent each operation as a feature vector; cost is a linear combination of feature values

Define proof cost as the sum of the operations' costs

Classify: entailment if and only if the proof cost is lower than a threshold
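A minimal sketch of this cost model follows; the weight values, feature names, and threshold are invented for illustration and are not BIUTEE's actual parameters.

```python
def operation_cost(w, f):
    """Cost of a single operation: linear combination of feature values."""
    return sum(w.get(name, 0.0) * value for name, value in f.items())

def proof_cost(w, operations):
    """Proof cost: sum of the operations' costs."""
    return sum(operation_cost(w, f) for f in operations)

def entails(w, operations, threshold):
    """Classify as entailment iff the proof cost is below the threshold."""
    return proof_cost(w, operations) < threshold

# Hypothetical weight vector and sparse per-operation feature vectors.
w = {"DIRT": 1.2, "WordNet": 0.8, "Insert-Verb": 3.0}
ops = [{"DIRT": 0.257}, {"WordNet": 0.5}]
print(proof_cost(w, ops))              # 0.257*1.2 + 0.5*0.8 = 0.7084
print(entails(w, ops, threshold=1.0))  # True
```

Sparse dictionaries stand in for the feature vectors: an operation only sets the features of the resource or edit that produced it.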

Page 17

Feature vector representation

Define operation feature values: represent each operation as a feature vector.
Features: (Insert-Named-Entity, Insert-Verb, …, WordNet, Lin, DIRT, …)

An operation:
The police located the boy.
DIRT: X locate Y → X find Y (score = 0.9)
The police found the boy.

Feature vector that represents the operation: (0, 0, …, 0.257, …, 0)
The feature value is a decreasing ("downward") function of the rule score.

Page 18

Cost-based Model

Define operation cost: cost is a standard linear combination of feature values

C_w(o) = w^T · f(o)   (weight-vector · feature-vector)

The weight vector w is learned automatically.

Page 19

Confidence Model

Define operation cost: represent each operation as a feature vector.

Define proof cost as the sum of the operations' costs:

C_w(P) = Σ_{i=1..n} C_w(o_i) = Σ_{i=1..n} w^T f(o_i) = w^T f(P)

where w is the weight vector and f(P) := Σ_{i=1..n} f(o_i) is the vector that represents the proof.

Page 20

Feature vector representation - example

T = T0 → T1 → T2 → … → Tn = H

  (0, 1, …, 0, 0)             passive to active
+ (0, 0, …, 0.457, …, 0, 0)   X locate Y → X find Y
+ (0, 0, …, 0.5, …, 0, 0)     boy → child
+ (0, 0, 1, …, 0, 0)          insertion on the fly
= (0, 1, …, 0.457, …, 0.5, …, 1, 0, 0)

Text: The boy was located by the police.

Passive to active

The police located the boy.

X locate Y → X find Y

The police found the boy.

boy → child

The police found the child.

Insertion on the fly

Hypothesis: Eventually, the police found the child.
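The element-wise summation of per-operation vectors into one proof vector can be sketched with sparse dictionaries; the feature names below are illustrative, not BIUTEE's actual feature set.

```python
from collections import Counter

def proof_vector(operation_vectors):
    """f(P) = sum_i f(o_i): element-wise sum of sparse feature vectors."""
    total = Counter()
    for f in operation_vectors:
        total.update(f)  # Counter.update adds values key-wise
    return dict(total)

ops = [
    {"passive-to-active": 1.0},
    {"DIRT": 0.457},
    {"WordNet": 0.5},
    {"Insert": 1.0},
]
print(proof_vector(ops))
# -> {'passive-to-active': 1.0, 'DIRT': 0.457, 'WordNet': 0.5, 'Insert': 1.0}
```

If two operations fired the same feature, their values would accumulate in the corresponding coordinate, matching the Σ f(o_i) definition.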

Page 21

Cost-based Model

Define operation cost: represent each operation as a feature vector

Define proof cost as the sum of the operations' costs

Classify: "entailing" if and only if the proof cost is smaller than a threshold:

w^T f(P) < b   (w and b are learned)

Page 22

3. Find the best proof

Page 23

Search the best proof

[Figure: the search space from T to H, with several candidate proofs (Proof #1 … Proof #4).]

Page 24

Search the best proof

• Need to consider the "best" proof for the positive pairs
• "Best proof" = the proof with the lowest cost
  ‒ assuming a weight vector is given
• Search space is exponential - AI-style search (ACL-12)
  ‒ gradient-based evaluation function
  ‒ local lookahead for "complex" operations

[Figure: the search space from T to H, with candidate proofs Proof #1 … Proof #4.]

Page 25

4. Learn model parameters

Page 26

Learning

Goal: learn the parameters (w, b)
Use a linear learning algorithm, e.g. logistic regression

Page 27

Inference vs. Learning

[Diagram: training samples → best proofs → (feature extraction) → vector representation → learning algorithm → (w, b)]

Page 28

Inference vs. Learning

[Diagram: training samples → best proofs → feature extraction → vector representation → learning algorithm → (w, b)]

Page 29

Iterative Learning Scheme

[Diagram: training samples → best proofs → vector representation → learning algorithm → (w, b), iterated]

1. w = reasonable guess
2. Find the best proofs
3. Learn new w and b
4. Repeat from step 2
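The iterative scheme can be sketched as follows. This is a minimal illustration, not BIUTEE's code: the linear learner is a tiny hand-rolled logistic regression, and `find_best_proof` is a stub that returns fixed, invented proof feature vectors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Tiny gradient-descent logistic regression: returns (w, b)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def iterative_learning(samples, find_best_proof, rounds=3, n_features=2):
    w, b = [1.0] * n_features, 0.0                            # 1. guess
    for _ in range(rounds):
        X = [find_best_proof(t, h, w) for t, h, _ in samples]  # 2. best proofs
        y = [label for _, _, label in samples]
        w, b = fit_logistic(X, y)                              # 3. learn w, b
    return w, b                                                # 4. (loop = step 2)

# Stub: pretend the search produced these proof feature vectors.
vectors = {"pair-1": [0.2, 0.1], "pair-2": [1.5, 2.0]}
samples = [("pair-1", "h1", 1), ("pair-2", "h2", 0)]
w, b = iterative_learning(samples, lambda t, h, w: vectors[t])
print(sigmoid(sum(wi * xi for wi, xi in zip(w, vectors["pair-1"])) + b) > 0.5)  # True
```

The key point of the loop is the mutual dependence: the best proof depends on (w, b), and (w, b) is learned from the best proofs, so the two are alternated until they stabilize.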

Page 30

Summary - System Components

How to…

1. Generate syntactically motivated complete proofs? Entailment rules; on-the-fly operations (extended tree edit operations)
2. Estimate proof validity? Confidence model
3. Find the best proof? Novel search algorithm
4. Learn the model parameters? Iterative learning scheme

Page 31

Results RTE 1-5


Evaluation by accuracy - comparison with transformation-based systems

System                      RTE-1   RTE-2   RTE-3   RTE-5
Raina et al. 2005           57.0    –       –       –
Harmeling, 2009             –       56.39   57.88   –
Wang and Manning, 2010      –       63.0    61.10   –
Bar-Haim et al., 2007       –       61.12   63.80   –
Mehdad and Magnini, 2009    58.62   59.87   62.4    60.2
Our System                  57.13   61.63   67.13   63.50

Page 32

Results RTE 6

Natural distribution of entailments; evaluation by recall / precision / F1

                                    RTE-6 (F1 %)
Baseline (use IR top-5 relevance)   34.63
Median (2010)                       36.14
Best (2010)                         48.01
Our system                          49.54

Page 33

Hybrid Mode: Transformation and Similarity

Page 34

Hybrid Mode

Recall: entailment rules alone are often insufficient for generating a complete proof of H from T.

The problem: incomplete proofs are insufficient for an entailment decision, nor are they comparable with each other, so learning and inference cannot be performed.

Solutions:
First solution - complete proofs: incorporate (less reliable) on-the-fly transformations to "force" a proof, even if it is incorrect.
Second solution - hybrid mode.

Page 35

Hybrid Mode

Generate partial proofs, using only the more reliable transformations.

Jointly estimate the cost of the transformations and of the remaining gap.

Note: the feature vectors of incomplete proofs + remaining gaps are comparable!

Feature vector in pure mode: (WordNet, Lin, DIRT, …, Insert-Named-Entity, Insert-Verb, …)
Feature vector in hybrid mode: (WordNet, Lin, DIRT, …, Missing predicate, Predicate mismatch, …)

Page 36

Hybrid Mode: feature categories

Lexical features:
1. H's words that are missing in T

Predicate-argument features:
1. Argument in H missing from T
2. The same argument is connected to different predicates in T and H
3. Argument partial match (e.g. "boy" vs. "little boy")
4. Predicate in H missing from T
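The first lexical feature can be sketched as a bag-of-words gap check. This is a simplification for illustration: real BIUTEE computes these gap features over parse trees, not word sets.

```python
import re

def tokens(sentence):
    """Lower-cased word tokens, punctuation stripped."""
    return re.findall(r"\w+", sentence.lower())

def missing_words(text, hypothesis):
    """Lexical gap feature: H's words that are missing in T."""
    t_words = set(tokens(text))
    return [w for w in tokens(hypothesis) if w not in t_words]

gap = missing_words("The boy was located by the police.",
                    "Eventually, the police found the child.")
print(gap)  # -> ['eventually', 'found', 'child']
```

The length of this list (possibly restricted to content words) would serve as the "missing word" feature value in the hybrid-mode vector.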

Page 37

Hybrid Mode: discussion

Hybrid mode pros: clear and explainable proofs.

Hybrid mode cons: cannot chain the on-the-fly transformations required to subsequently apply entailment rules; T and H are abstracted, so some gaps are not detected.

Page 38

Hybrid Mode: example

T: The death of Allama Hassan Turabi is likely to raise tensions … muslim … He was the leader of a Shia party, Islami Tehreek Pakistan …

H: Allama Hassan Turabi was a member of the Pakistani Shia Muslim in Karachi.

Pure mode:
Co-reference: He → Turabi
DIRT: X is leader of Y → X is member of Y
5 insertions ("a", "the", "pakistani", "in", "karachi")
One move ("muslim" under "member")

Hybrid mode:
Co-reference and DIRT as in pure mode
Gap detected: "Pakistani Shia Muslim" is not connected to the predicate "member"

Page 39

Hybrid Mode: results

              Accuracy %                        F1 %
              RTE-1    RTE-2    RTE-3    RTE-5    RTE-6    RTE-7
Pure mode     55.88    60.50    65.63    63.17    47.40    41.95
Hybrid mode   56.75    59.63    65.88    61.33    45.42    37.45

Run with the 6 resources that were most commonly applied.

Takeaways: pure mode usually performs better than hybrid mode. However, hybrid-mode proofs are clean and explainable, in contrast to pure mode, and deserve further exploration, possibly in conjunction with some on-the-fly operations.

Page 40

Conclusions - The BIUTEE Inference Engine

Inference as proof over parse trees: natural to incorporate many inference types
Results: close to the best, or the best, on the RTE benchmarks
Open source: configurable, extensible, visual tracing, support

Page 41

Adding extra-linguistic inferences

Some tasks may benefit from extra-linguistic "expert" inferences: temporal / arithmetic / spatial reasoning / …

2 soldiers and a civilian => 3 people

Need to integrate with the primary inference over language structures: an "expert" may detect on-the-fly inferences that bridge text and hypothesis, interleaved within the tree-generation process.

Page 42

Slide from Inderjeet Mani