At Loose Ends: Challenges and Opportunities in Lexical Composition
Vered Shwartz, Natural Language Processing Lab, Bar-Ilan University
Talk @ EPFL, January 30, 2019
Page 1:

At Loose Ends: Challenges and Opportunities in Lexical Composition

Vered Shwartz, Natural Language Processing Lab, Bar-Ilan University

Talk @ EPFL, January 30, 2019
Page 2: Representing Phrases

Word representations are pretty much sorted out:
[sentence with some w1] + distributional hypothesis + neural magic → v_w1 ("best embeddings ever")

How to represent a phrase p = w1...wk? Most straightforward: f(v_w1, v_w2, ..., v_wk)

"The whole is greater than the sum of its parts":
1. Meaning shift
2. Implicit meaning

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 · 2 / 39
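The straightforward composition f(v_w1, ..., v_wk) on this slide can be sketched in a few lines; a common baseline choice for f is element-wise averaging. The vectors below are made-up stand-ins, not outputs of a real embedding model.

```python
# Minimal sketch of a composition function f(v_w1, ..., v_wk):
# element-wise averaging, a common baseline. The vectors are
# hypothetical stand-ins for pretrained word embeddings.

def compose_average(word_vectors):
    """Average the k word vectors dimension by dimension."""
    k = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(v[i] for v in word_vectors) / k for i in range(dim)]

v_w1 = [1.0, 0.5, 0.0]  # hypothetical embedding of w1
v_w2 = [0.0, 0.5, 1.0]  # hypothetical embedding of w2
print(compose_average([v_w1, v_w2]))  # [0.5, 0.5, 0.5]
```

Order-insensitive baselines like this are exactly what the two phenomena on this slide (meaning shift and implicit meaning) break.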

Page 7: Meaning Shift

A constituent word may be used in a non-literal way

VPC (verb-particle construction) meanings differ from their verbs' meanings


Page 9: Implicit Meaning

In noun compounds

In adjective-noun compositions


Page 11: In this talk

1. Testing Existing Text Representations: can they handle the complexity of phrases?

2. Paraphrasing Noun Compounds: a model for explicating noun compounds through paraphrases

3. Future Directions: thoughts about the future of phrase representations


Page 14:

Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition

Vered Shwartz and Ido Dagan (in submission)

Page 15: Can existing representations address these phenomena? Probing Tasks

Simple tasks designed to test a single linguistic property [Adi et al., 2017; Conneau et al., 2018]

Representation     Minimal Model Prediction
SkipThoughts(s)    What is s's length?
InferSent(s)       Is w in s?
...                ...

We follow the same approach for phrases, with various representations

Vered Shwartz and Ido Dagan · Evaluating Text Representations on Lexical Composition · 7 / 39
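The probing protocol can be sketched as: freeze the representation, train only a minimal classifier on top, and check whether the property is recoverable. Everything below (the 2-d "representations" and the probed property) is a toy assumption for illustration; a real setup would feed frozen SkipThoughts/ELMo/BERT vectors into the same minimal model.

```python
# Toy probing sketch: a minimal model (logistic regression trained by
# SGD) on top of frozen, hypothetical representation vectors.
import math

def train_probe(vectors, labels, epochs=200, lr=0.5):
    """Train a logistic-regression probe; the input vectors stay frozen."""
    dim = len(vectors[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Hypothetical frozen vectors; here the probed property happens to be
# encoded as "dimension 0 larger than dimension 1".
X = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.7]]
y = [1, 1, 0, 0]
w, b = train_probe(X, y)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)
```

If the probe succeeds, the property is (linearly) present in the representation; if it fails, the representation likely does not encode it.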


Page 18: Representations

Word Embeddings (one vector per word, context-agnostic): word2vec, GloVe, fastText
Sentence Embeddings (one vector per sentence): SkipThoughts, InferSent*, GenSen*
Contextualized Word Embeddings (one vector per word, context-sensitive, named after characters from Sesame Street): ELMo, OpenAI GPT, BERT

* supervised


Page 21: Tasks and Results

[Bar charts giving an overview of results on all six tasks: Phrase Type, Noun Compound Literality, Noun Compound Relations, Adjective-Noun Relations, Adjective-Noun Entailment, and Verb-Particle Classification. The same numbers appear on the per-task slides that follow.]

Page 22: 1. Phrase Type

Authorities meted out summary justice in cases as this
O           B-MW_VPC I-MW_VPC B-MW_NC I-MW_NC O  O  O  O

F1 scores (FM1: multiword expressions; FN1: named entities):

Model        FM1    FN1
Majority     23.8   62.2
word2vec      0.2   20.6
GloVe         0.2   32.6
fastText      0.1   31.4
ELMo         18.8   61.5
OpenAI GPT    2.1   44.9
BERT         18.8   61.6
Human        70.6   95.9

(1) Failure to recognize phrase type; (2) Named entities are easier; (3) Context helps
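The tag sequence on this slide can be decoded back into typed phrase spans with standard BIO handling. The decoder below is a generic sketch, not the model evaluated in the paper:

```python
# Sketch of decoding the slide's BIO tagging into phrase spans.
# Tags follow the slide: B-/I- prefixes plus a phrase-type suffix
# (MW_VPC = verb-particle construction, MW_NC = noun compound).

def bio_to_spans(tokens, tags):
    """Collect (phrase_type, phrase_text) pairs from a BIO sequence."""
    spans, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:  # "O" closes any open span
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        spans.append((ctype, " ".join(current)))
    return spans

tokens = "Authorities meted out summary justice in cases as this".split()
tags = ["O", "B-MW_VPC", "I-MW_VPC", "B-MW_NC", "I-MW_NC", "O", "O", "O", "O"]
print(bio_to_spans(tokens, tags))
# [('MW_VPC', 'meted out'), ('MW_NC', 'summary justice')]
```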


Page 26: 2. Noun Compound Literality

The crash course in litigation made me a better lawyer
(crash: Non-Literal; course: Literal)

Model          Accuracy
Majority       20
word2vec       26.5
GloVe          28.8
fastText       30.3
SkipThoughts   34.2
InferSent      24.9
GenSen         35.5
ELMo           41.8
OpenAI GPT     50
BERT           44
Human          87

(1) word embeddings < sentence embeddings < contextualized; (2) Far from humans


Page 29: 2. Noun Compound Literality: Analysis

A search team located the [crash]L site and found small amounts of human remains.

Top substitutes for [crash]:
ELMo:       landfill, wreckage, Web, crash, burial
OpenAI GPT: body, place, man, missing, location
BERT:       archaeological, burial, wreck, excavation, grave

After a [crash]N course in tactics and maneuvers, the squadron was off to the war...

Top substitutes for [crash]:
ELMo:       crash, changing, collision, training, reversed
OpenAI GPT: few, while, moment, long, couple
BERT:       short, successful, rigorous, brief, training

(1) Literal: fewer errors
(2) BERT > ELMo, both reasonable
(3) OpenAI GPT errs due to uni-directionality


Page 32: 2. Noun Compound Literality: Analysis

Growing up with a [silver]N spoon in his mouth, he was always cheerful...

Top substitutes for [silver]:
ELMo:       silver, rubber, iron, tin, wooden
OpenAI GPT: mother, father, lot, big, man
BERT:       wooden, greasy, big, silver, little

Things get tougher when both constituent nouns are non-literal!

Page 33: 3. Noun Compound Relations

The township is served by three access roads.
✓ Road that makes access possible
✗ Road forecasted for access season

Model          Accuracy
Majority       50
word2vec       60.9
GloVe          60.1
fastText       60.7
SkipThoughts   51.3
InferSent      58.5
GenSen         65.6
ELMo           67
OpenAI GPT     50
BERT           74.2
Human          92

(1) word embeddings < sentence embeddings < contextualized; (2) Far from humans; (3) OpenAI GPT fails


Page 37: 3. Noun Compound Relations: Analysis

Example: stage area

No clear signal from BERT. Capturing implicit information is challenging!

Page 38: 4. Adjective-Noun Relations

... he receives warm support from his students ...
(warm: emotionality, not temperature)

Model          Accuracy
Majority       46.3
word2vec       41.2
GloVe          36
fastText       45.6
SkipThoughts   47.8
InferSent      51.5
GenSen         49.3
ELMo           43.4
OpenAI GPT     52.9
BERT           50
Human          77

Best model performs only slightly better than majority


Page 40: 5. Adjective-Noun Entailment

Most people die in the class to which they were born →
Most people die in the social class to which they were born

Model          F1
Majority       0
word2vec       36.6
GloVe          20.6
fastText       40.4
SkipThoughts   23.4
InferSent      48.4
GenSen         55.2
ELMo           45.2
OpenAI GPT     14.7
BERT           37.2
Human          74.4

(1) Bad performance for all models
(2) Best: sentence embeddings trained on RTE


Page 43: 6. Verb-Particle Classification

We did get on together (VPC) / Which response did you get on that? (Non-VPC)

Model          Accuracy
Majority       72.3
word2vec       68.6
GloVe          67.9
fastText       70
SkipThoughts   68.6
InferSent      67.9
GenSen         65.7
ELMo           76.4
OpenAI GPT     71.4
BERT           75
Human          82

Similar performance for all models. Is the good performance merely due to label imbalance?
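The label-imbalance concern can be made concrete: with 72.3% of examples in the majority class, always predicting that class already scores 72.3%, so a model near that number may not have learned much. The counts below are hypothetical, chosen only to reproduce the slide's baseline.

```python
# Sketch of the majority-class baseline under label imbalance.
# The 723/277 split is a hypothetical count matching the slide's 72.3%.
from collections import Counter

def majority_accuracy(labels):
    """Accuracy of a classifier that always predicts the most common label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

labels = ["VPC"] * 723 + ["Non-VPC"] * 277
print(round(majority_accuracy(labels) * 100, 1))  # 72.3
```

Comparing each model's accuracy against this baseline, rather than against 0, is what makes the ELMo (76.4) and BERT (75) gains look modest.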


Page 46: 6. Verb-Particle Classification: Analysis

Weak signal from ELMo. Mostly performs well due to label imbalance.

Page 47:

Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations

Vered Shwartz and Ido Dagan (ACL 2018)

Page 48: Interpreting Noun-Compounds

Noun compounds are "text compression devices" [Nakov, 2013]

We're pretty good at decompressing them, even when we see them for the first time

What is a "parsley cake"?
- cake eaten on a parsley?
- cake with parsley?
- cake for parsley?
- ...
(image from http://www.bazekalim.com)

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 · 21 / 39
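The candidate interpretations on this slide can be produced by instantiating paraphrase templates over the compound's modifier and head. The fixed template list below is purely illustrative (it mirrors the slide's three guesses); the ACL 2018 model scores and generates paraphrases rather than enumerating a hand-written inventory.

```python
# Toy sketch of explicating a noun compound via paraphrase templates.
# The templates are illustrative assumptions mirroring the slide.

TEMPLATES = [
    "{head} eaten on a {modifier}",
    "{head} with {modifier}",
    "{head} for {modifier}",
]

def candidate_paraphrases(modifier, head):
    """Instantiate each template with the compound's two nouns."""
    return [t.format(head=head, modifier=modifier) for t in TEMPLATES]

print(candidate_paraphrases("parsley", "cake"))
# ['cake eaten on a parsley', 'cake with parsley', 'cake for parsley']
```

The hard part, which the next slides address, is ranking such candidates: deciding that "cake with parsley" explicates the implicit relation better than the other guesses.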


Generalizing Existing Knowledge

What can cake be made of?

Parsley (sort of) fits into this distribution

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 22 / 39


Noun-Compound Paraphrasing

Given a noun-compound w1w2, express the relation between the head w2 and the modifier w1 with multiple prepositional and verbal paraphrases [Nakov and Hearst, 2006]

olive oil, apple cake, ground attack: “[w2] extracted from [w1]”, “[w2] made of [w1]”, “[w2] from [w1]”

boat whistle, sea bass: “[w2] located in [w1]”, “[w2] live in [w1]”

game room, service door, baby oil: “[w2] used for [w1]”, “[w2] for [w1]”

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 23 / 39
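The bracketed templates above can be instantiated mechanically by substituting the constituents; a minimal sketch:

```python
def instantiate(template, w1, w2):
    """Fill a paraphrase template with the modifier (w1) and the head (w2)."""
    return template.replace("[w1]", w1).replace("[w2]", w2)

print(instantiate("[w2] made of [w1]", "apple", "cake"))  # cake made of apple
print(instantiate("[w2] used for [w1]", "baby", "oil"))   # oil used for baby
```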


Prior Methods (1/2)

Based on constituent co-occurrences: “cake made of apple”

Problems:

1. Many unseen compounds, no paraphrases in the corpus
(rare: parsley cake, or highly lexicalized: ice cream)

2. Many compounds with just a few paraphrases
Can we infer “cake containing apple” given “cake made of apple”?

Prior work provides partial solutions to either (1) or (2)

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 24 / 39


Prior Methods (2/2)

1. MELODI [Van de Cruys et al., 2013]:
Represent the NC using compositional distributional representations
Predict paraphrase templates given the NC vector
Generalizes to similar unseen NCs, e.g. pear tart

2. IIITH [Surtani et al., 2013]:
Learn “is-a” relations between paraphrases,
e.g. “[w2] extracted from [w1]” ⊂ “[w2] made of [w1]”

Our solution: multi-task learning to address both problems

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 25 / 39


Multi-task Reformulation

Training example {w1 = apple, w2 = cake, p = “[w2] made of [w1]”}

1. Predict a paraphrase p for a given NC w1w2:
What is the relation between apple and cake?

2. Predict w1 given a paraphrase p and w2:
What can cake be made of?

3. Predict w2 given a paraphrase p and w1:
What can be made of apple?

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 26 / 39
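One training triple thus yields three task instances, one per direction. A minimal sketch of this reformulation (the tuple layout and task names are illustrative, not the paper's exact data format):

```python
def multitask_instances(w1, w2, template):
    """Derive the three task instances from one (modifier, head, paraphrase) triple:
    (1) predict the paraphrase given the compound,
    (2) predict w1 given the paraphrase and w2,
    (3) predict w2 given the paraphrase and w1."""
    return [
        ("predict_paraphrase", f"{w2} [p] {w1}", template),
        ("predict_w1", template.replace("[w2]", w2), w1),
        ("predict_w2", template.replace("[w1]", w1), w2),
    ]

for task, source, target in multitask_instances("apple", "cake", "[w2] made of [w1]"):
    print(task, "|", source, "->", target)
```

Running this on the slide's example prints "cake [p] apple -> [w2] made of [w1]", "cake made of [w1] -> apple", and "[w2] made of apple -> cake".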


Main Task (1): Predicting Paraphrases

What is the relation between apple and cake?

[Architecture diagram: the input “cake [p] apple” is encoded with a biLSTM over word-vocabulary indices ((23) made, (28) apple, (4145) cake, (7891) of) and placeholder indices ((1) [w1], (2) [w2], (3) [p]); the biLSTM output at [p] feeds MLP_p, which predicts an index in the paraphrase vocabulary ((78) “[w2] containing [w1]”, (131) “[w2] made of [w1]”); here p_i = 78.]

Encode the placeholder [p] in “cake [p] apple” using a biLSTM
Predict an index in the paraphrase vocabulary
Fixed word embeddings, learned placeholder embeddings
(1) Generalizes NCs: pear tart is expected to yield similar results

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 27 / 39
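The input side of this setup can be illustrated with the slide's own toy indices; the vocabularies below are stubs for illustration (the real model uses fixed pre-trained word embeddings over a full vocabulary):

```python
# Toy vocabularies mirroring the indices shown on the slide (illustrative stub)
word_vocab = {"[w1]": 1, "[w2]": 2, "[p]": 3, "made": 23, "apple": 28,
              "cake": 4145, "of": 7891}
paraphrase_vocab = {78: "[w2] containing [w1]", 131: "[w2] made of [w1]"}

def encode_query(w2, w1):
    """Index sequence for 'w2 [p] w1'; the biLSTM encoding at the [p]
    position is what MLP_p classifies over the paraphrase vocabulary."""
    return [word_vocab[w] for w in (w2, "[p]", w1)]

print(encode_query("cake", "apple"))  # [4145, 3, 28]
print(paraphrase_vocab[78])           # [w2] containing [w1]
```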


Helper Task (2): Predicting Missing Constituents

What can cake be made of?

[Architecture diagram: the input “cake made of [w1]” is encoded with a biLSTM over word-vocabulary indices ((23) made, (28) apple, (4145) cake, (7891) of) and placeholder indices ((1) [w1], (2) [w2], (3) [p]); the biLSTM output at [w1] feeds MLP_w, which predicts an index in the word vocabulary; here w1_i = 28 (apple).]

Encode the placeholder in “cake made of [w1]” using a biLSTM
Predict an index in the word vocabulary
(2) Generalizes paraphrases: “[w2] containing [w1]” is expected to yield similar results

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 28 / 39


Evaluation

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 29 / 39


Evaluation Setting

Available dataset: SemEval 2013 task 4 [Hendrickx et al., 2013]
Semi-supervised: infer templates of POS tags (e.g. “[w2] verb prep [w1]”) from the training data, then use Google N-grams to generate training data

A ranking rather than a retrieval task
Systems are expected to return a ranked list of paraphrases for each noun compound
We implemented a ranking model that re-ranks the top k paraphrases retrieved by the model

Evaluation: based on n-gram overlap, using the provided evaluation script
Gold paraphrase score: how many annotators suggested it?

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 30 / 39
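A simplified stand-in for n-gram-overlap scoring can make the evaluation idea concrete (this is an illustrative sketch, not the official SemEval evaluation script):

```python
def ngrams(tokens, n):
    """Set of n-grams of a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(predicted, gold, n=2):
    """Fraction of the gold paraphrase's n-grams recovered by the prediction.
    A simplified stand-in for the task's n-gram-overlap evaluation."""
    p, g = ngrams(predicted.split(), n), ngrams(gold.split(), n)
    return len(p & g) / len(g) if g else 0.0

# Only "cake made" is shared among the three bigrams of each phrase
print(overlap_score("cake made of apple", "cake made from apple"))  # 0.3333333333333333
```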


Results

[Bar chart: non-isomorphic vs. isomorphic scores for MELODI [Van de Cruys et al., 2013], the SemEval 2013 baseline [Hendrickx et al., 2013], SFS [Versley, 2013], IIITH [Surtani et al., 2013], and PaNiC [Shwartz and Dagan, 2018]. The isomorphic measure rewards recall and precision; the non-isomorphic measure rewards only precision and thus favors “conservative” models.]

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 31 / 39


Error Analysis: False Positives

[Pie chart:]
1. Valid, missing from the gold standard (44%): “discussion by group”
2. Too specific (15%): “life of women in community”
3. Incorrect prepositions (14%): e.g. n-grams don’t respect syntactic structure: “rinse away the oil from baby ’s head” ⇒ “oil from baby”
4. Syntactic errors (8%)
5. Borderline grammatical (5%): “force of coalition forces”
6. Other errors (14%)

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 32 / 39


Error Analysis: False Negatives

[Pie chart:]
1. Long paraphrase, n > 5 (30%)
2. Determiners (25%): “mutation of a gene”
3. Inflected constituents (10%): “holding of shares”
4. Other errors (35%)

Vered Shwartz and Ido Dagan · Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations · ACL 2018 33 / 39


Future Directions


Can we learn phrase meanings like humans do?

[Cooper, 1999]: how do L2 learners process idioms?
Infer from context: 28% (57% success rate)
Rely on literal meaning: 19% (22% success rate)
...

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 35 / 39


Inferring from context

We need “extended” contexts
[Asl, 2013]: more successful idiom interpretation with extended contexts (stories)

We need richer context modeling
Characters in the story
Relationships between them
Dialogues
...

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 36 / 39


Page 100

Relying on literal meaning

“Robert knew he was robbing the cradle by dating a sixteen-year-old girl”

We need world knowledge: "Cradle is something you put the baby in."

We need to be able to reason: "You're stealing a child from a mother."

"So robbing the cradle is like dating a really young person."

[Cooper, 1999]

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 37 / 39


Page 102

Recap

1. Testing Existing Pre-trained Representations: contextualized word embeddings provide better phrase representations, but there is still a long way to go.

2. Paraphrasing Noun-Compounds: representations of compositional phrases can rely upon and generalize existing knowledge about similar concepts.

3. Future Directions: to represent phrases like humans do, we need better context and world knowledge modeling.
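The composition baseline the talk starts from, representing a phrase p = w1...wk as a function f of its word vectors, can be sketched with element-wise averaging. This is a toy illustration, not the talk's method: the two 4-dimensional vectors and the vocabulary are made up for the example, and averaging is exactly the kind of f that misses meaning shift and implicit meaning.

```python
import numpy as np

# Toy pretrained word vectors (invented 4-d values for illustration only;
# a real system would use embeddings trained on a corpus).
toy_embeddings = {
    "olive": np.array([0.9, 0.1, 0.0, 0.2]),
    "oil":   np.array([0.2, 0.8, 0.1, 0.0]),
}

def compose_average(words, embeddings):
    """Represent a phrase as the element-wise mean of its word vectors.

    This baseline ignores word order and any non-compositional meaning,
    which is why it fails on phrases like "hot dog" or "loose ends".
    """
    return np.mean([embeddings[w] for w in words], axis=0)

phrase_vec = compose_average(["olive", "oil"], toy_embeddings)
print(phrase_vec)  # prints [0.55 0.45 0.05 0.1 ]
```

The averaged vector sits between the two constituent vectors, which is reasonable for a transparent compound like "olive oil" but, as the recap notes, breaks down whenever the whole is more than the sum of its parts.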

Thank you!

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 38 / 39


Page 106

References

[Adi et al., 2017] Adi, Y., Kermany, E., Belinkov, Y., Lavi, O., and Goldberg, Y. (2017). Fine-grained analysis of sentence embeddings using auxiliary prediction tasks. In Proceedings of ICLR Conference Track.

[Asl, 2013] Asl, F. M. (2013). The impact of context on learning idioms in EFL classes. TESOL Journal, 37(1):2.

[Conneau et al., 2018] Conneau, A., Kruszewski, G., Lample, G., Barrault, L., and Baroni, M. (2018). What you can cram into a single vector: Probing sentence embeddings for linguistic properties. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2126–2136. Association for Computational Linguistics.

[Cooper, 1999] Cooper, T. C. (1999). Processing of idioms by L2 learners of English. TESOL Quarterly, 33(2):233–262.

[Hendrickx et al., 2013] Hendrickx, I., Kozareva, Z., Nakov, P., Ó Séaghdha, D., Szpakowicz, S., and Veale, T. (2013). SemEval-2013 task 4: Free paraphrases of noun compounds. In SemEval, pages 138–143.

[Nakov, 2013] Nakov, P. (2013). On the interpretation of noun compounds: Syntax, semantics, and entailment. Natural Language Engineering, 19(3):291–330.

[Nakov and Hearst, 2006] Nakov, P. and Hearst, M. (2006). Using verbs to characterize noun-noun relations. In International Conference on Artificial Intelligence: Methodology, Systems, and Applications, pages 233–244. Springer.

[Shwartz and Dagan, 2018] Shwartz, V. and Dagan, I. (2018). Paraphrase to explicate: Revealing implicit noun-compound relations. In ACL, Melbourne, Australia.

[Surtani et al., 2013] Surtani, N., Batra, A., Ghosh, U., and Paul, S. (2013). IIIT-H: A corpus-driven co-occurrence based probabilistic model for noun compound paraphrasing. In SemEval, pages 153–157.

[Van de Cruys et al., 2013] Van de Cruys, T., Afantenos, S., and Muller, P. (2013). MELODI: A supervised distributional approach for free paraphrasing of noun compounds. In SemEval, pages 144–147.

[Versley, 2013] Versley, Y. (2013). SFS-TUE: Compound paraphrasing with a language model and discriminative reranking. In SemEval, pages 148–152.

Vered Shwartz · MWUs Under the Magnifying Glass · January 2019 39 / 39