GENERALIZING SEMANTIC RELATIONS, December 7 research meeting, 祭都援炉 (マットエンロ)


Dec 28, 2015



Eleanor Barnett
Page 1: GENERALIZING SEMANTIC RELATIONS, December 7 research meeting, 祭都援炉 (マットエンロ)


Page 2:

Up until now: Getting to know NLP
• “Speech and Language Processing” (Jurafsky & Martin)
• Papers:
  • On-Demand Information Extraction (Sekine)
  • Learning First-Order Horn Clauses From Web Text [Sherlock] (Schoenmackers 2010)
  • Coupled Semi-Supervised Learning for Information Extraction [NELL] (Carlson)
  • Identifying Relations for Open Information Extraction [ReVerb] (Fader)
  • Relation Acquisition using Word Classes and Partial Patterns (De Saeger)
  • Interpretation as Abduction (Hobbs)
  • An ILP Formulation of Abductive Inference for Discourse Interpretation (Inoue)
  • Learning Dependency-Based Compositional Semantics (Liang)

Page 3:

Motivation
• Ultimate goal: inference
• Inference requires knowledge
• Large-scale databases of semantic relations have been created from web text

Page 4:

ReVerb (Fader et al., 2011)
• relation(arg1,arg2) tuples acquired from large-scale Web data
• Over 14.5 million semantic relations released to the public

Page 5:

• Many different ways to express equivalent meaning
• Consider the resides relation
• Table shows counts of ReVerb relations containing live or reside
• Relations in red should be generalized to:
  • reside(<PERSON>,<PLACE>)
• We aim to generalize through semantic clustering

Frequency | Relation
27,383 | lives in
10,315 | live in
8,653 | lived in
5,185 | currently resides in
4,002 | currently lives in
3,310 | now lives in
1,933 | resides in
1,548 | is a resident of
1,468 | live on
1,308 | now resides in
1,191 | has lived in
1,055 | resided in
876 | lives on
590 | lived on
531 | live at
515 | still lives in
461 | can live up to
456 | is a lifelong resident of
444 | was a resident of
413 | live for
382 | must be residents of
332 | lives with
332 | lived for
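
A count table like the one above could be produced along these lines. This is only a sketch: the sample tuples and the names `SAMPLE_TUPLES`, `STEMS`, and `relation_counts` are made up for illustration, and plain substring matching is an assumed stand-in for however the real counts were gathered.

```python
from collections import Counter

# Hypothetical (arg1, relation, arg2) tuples; not real ReVerb output.
SAMPLE_TUPLES = [
    ("Matt", "resides in", "Sendai"),
    ("Eric", "lives in", "Japan"),
    ("Ann", "lives in", "Tokyo"),
    ("Eric", "lives on", "donuts"),
    ("Bob", "works in", "Osaka"),
]

# Substrings used to select relation phrases (assumption: simple substring match).
STEMS = ("live", "reside")


def relation_counts(tuples, stems=STEMS):
    """Count relation phrases that contain any of the given stems."""
    counts = Counter()
    for _arg1, relation, _arg2 in tuples:
        if any(stem in relation.lower() for stem in stems):
            counts[relation] += 1
    return counts


if __name__ == "__main__":
    # Print in the same "Frequency | Relation" layout as the table.
    for relation, freq in relation_counts(SAMPLE_TUPLES).most_common():
        print(f"{freq:,} | {relation}")
```

Note that substring matching also picks up phrases such as "can live up to" or "live for", which is why the table contains relations that do not mean reside.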

Page 6:

Semantic Clustering Goals

1. Semantic Relations Dictionary
  • Mapping from ReVerb’s specific instances to a generalized semantic placeholder that looks like <Generalized-Rel>(<Arg1 Type>,<Arg2 Type>)

2. Method of mapping real-world relation instances to generalized semantic form
  • Can be accomplished with a semantic similarity function
  • Clustering and generalizing relations
  • Looking up new relations from text
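
One possible shape for such a dictionary is sketched below. The class `GeneralizedRelation`, the table `RELATION_DICTIONARY`, and the helpers `normalize` and `lookup` are hypothetical names, and the entries are illustrative rather than taken from the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralizedRelation:
    """A placeholder of the form <Generalized-Rel>(<Arg1 Type>,<Arg2 Type>)."""
    name: str
    arg1_type: str
    arg2_type: str

    def __str__(self):
        return f"{self.name}(<{self.arg1_type}>,<{self.arg2_type}>)"


RESIDE = GeneralizedRelation("reside", "Person", "Place")

# Hypothetical entries: surface relation phrase -> list of generalized forms.
# A list is used so that one phrase can carry more than one sense.
RELATION_DICTIONARY = {
    "lives in": [RESIDE],
    "resides in": [RESIDE],
    "is a resident of": [RESIDE],
}


def normalize(surface):
    """Lowercase and collapse whitespace (an assumed, minimal normalization)."""
    return " ".join(surface.lower().split())


def lookup(surface):
    """Return all generalized relations recorded for a surface phrase."""
    return RELATION_DICTIONARY.get(normalize(surface), [])


if __name__ == "__main__":
    for candidate in lookup("Lives  in"):
        print(candidate)
```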

Page 7:

Semantic Similarity
• Ontological: Similarity based on the arguments’ hierarchy of semantic types
• Lexical: Similarity based on lexical features of the relation
• Contextual: Similarity based on surrounding text

Page 8:

Ontological Similarity (Clustering)

1. Matt resides in Sendai

2. Eric lives in Japan

• Should these be clustered together? (Yes!)
• Matching arg1 type <Person>
• Matching arg2 type <Place>
• High ontological similarity means good chance of
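
The simplest version of this idea, grouping tuples whose argument types match exactly, might look like the sketch below. The type table `ARG_TYPES` is a made-up stand-in for a real type lookup, and exact matching is a simplification of a graded ontological similarity score.

```python
from collections import defaultdict

# Hypothetical argument-type table standing in for a real type lookup.
ARG_TYPES = {
    "Matt": "Person",
    "Eric": "Person",
    "Sendai": "Place",
    "Japan": "Place",
    "donuts": "Food",
}


def type_signature(arg1, arg2, types=ARG_TYPES):
    """Return the (arg1 type, arg2 type) pair, or "Unknown" when not covered."""
    return (types.get(arg1, "Unknown"), types.get(arg2, "Unknown"))


def group_by_signature(tuples):
    """Group (arg1, relation, arg2) tuples that share an argument-type signature."""
    groups = defaultdict(list)
    for arg1, relation, arg2 in tuples:
        groups[type_signature(arg1, arg2)].append((arg1, relation, arg2))
    return dict(groups)


if __name__ == "__main__":
    examples = [
        ("Matt", "resides in", "Sendai"),
        ("Eric", "lives in", "Japan"),
        ("Eric", "lives on", "donuts"),
    ]
    for signature, members in group_by_signature(examples).items():
        print(signature, [relation for _, relation, _ in members])
```

Here both example sentences land in the (Person, Place) group, while "Eric lives on donuts" is kept apart.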


Page 9:

Ontological Similarity (Lookup)

1. Matt lives on a farm => ???

2. Eric lives on donuts => ???

• Are these the same semantic relation? (NO!)
• Multiple entries in dictionary for lives_on:
  • resides(<Living Thing>,<Place>)
  • nourished_by(<Living Thing>,<Nourishment>)
• Use argument type similarity testing to differentiate between senses of lives_on

Page 10:

Ontological Similarity (Lookup cont.)
• Which version will ontological similarity suggest we return for each example?

1. Matt lives on a farm
   <Person> lives on<?> <Place>

2. Eric lives on donuts
   <Person> lives on<?> <Food>

• onto_sim(<Food>,<Nourishment>) is greater than onto_sim(<Food>,<Place>), so we know Eric is nourished_by donuts
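
A toy version of this lookup is sketched below. The hierarchy in `PARENT` is invented for the example, and the path-based score 1 / (1 + path length) is an assumed choice, not necessarily the similarity function used in the project. The helper names `ancestors` and `best_sense` are likewise hypothetical.

```python
# Invented toy type hierarchy (child -> parent); "Entity" is the root.
PARENT = {
    "Living Thing": "Entity",
    "Person": "Living Thing",
    "Place": "Entity",
    "Substance": "Entity",
    "Nourishment": "Substance",
    "Food": "Nourishment",
}

# The two dictionary senses of lives_on from the slides.
LIVES_ON_SENSES = [
    ("resides", "Living Thing", "Place"),
    ("nourished_by", "Living Thing", "Nourishment"),
]


def ancestors(node):
    """Return the node followed by its ancestors up to the root."""
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def onto_sim(type_a, type_b):
    """Path-based similarity: 1 / (1 + edges between the two types)."""
    depth_b = {node: steps for steps, node in enumerate(ancestors(type_b))}
    for steps_a, node in enumerate(ancestors(type_a)):
        if node in depth_b:  # first hit is the lowest common ancestor
            return 1.0 / (1.0 + steps_a + depth_b[node])
    return 0.0


def best_sense(arg1_type, arg2_type, senses=LIVES_ON_SENSES):
    """Pick the sense whose argument types are closest to the observed ones."""
    return max(
        senses,
        key=lambda s: onto_sim(arg1_type, s[1]) + onto_sim(arg2_type, s[2]),
    )


if __name__ == "__main__":
    print(best_sense("Person", "Place")[0])  # Matt lives on a farm
    print(best_sense("Person", "Food")[0])   # Eric lives on donuts
```

With this toy hierarchy, <Food> is one step from <Nourishment> but four steps from <Place>, so the nourished_by sense wins for the donuts example.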

Page 11:

Lexical Similarity
• Use relationship features to score similarity
• N-gram overlap, bag-of-words, …
• Weighting content/functional words differently
• etc.

Page 12:

Lexical Similarity

• Correctly groups together
  • Lives at
  • Live in
• But erroneously clusters
  • Lives for
  • Lives with
• And doesn’t cluster
  • resides in
  • (relying on ontological sim. for that)
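
The behaviour above can be reproduced with a small bag-of-words score. The sketch below uses a weighted Jaccard overlap in which function words count less than content words; the word list `FUNCTION_WORDS`, the lemma table `LEMMAS`, and the 0.2 weight are all assumptions made for illustration.

```python
# Assumed function-word list and toy lemma table (illustrative only).
FUNCTION_WORDS = {"in", "on", "at", "for", "with", "of", "a", "is", "to"}
LEMMAS = {"lives": "live", "lived": "live", "resides": "reside", "resided": "reside"}


def tokens(relation):
    """Lowercase, split on whitespace, and map known inflections to a lemma."""
    return {LEMMAS.get(word, word) for word in relation.lower().split()}


def weight(token, function_weight=0.2):
    """Content words weigh 1.0; function words weigh less (assumed 0.2)."""
    return function_weight if token in FUNCTION_WORDS else 1.0


def lexical_sim(rel_a, rel_b):
    """Weighted Jaccard overlap between the token sets of two relations."""
    a, b = tokens(rel_a), tokens(rel_b)
    union = sum(weight(t) for t in a | b)
    if union == 0:
        return 0.0
    return sum(weight(t) for t in a & b) / union


if __name__ == "__main__":
    for pair in [("lives at", "live in"), ("lives for", "lives in"),
                 ("resides in", "lives in")]:
        print(pair, round(lexical_sim(*pair), 3))
```

"lives at" and "live in" score high because they share the content word live, but so do "lives for" and "lives in". "resides in" and "lives in" share only a function word, so they score low.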


Page 13:

Contextual Similarity
• How similar is the surrounding text?
• To answer this, we need the original text
  • Will have to hunt down sentences on the web
  • Time consuming
  • Feasible?
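
Assuming the original sentences can be recovered, one simple way to compare surrounding text is cosine similarity over bag-of-words context vectors. The sentences below are invented, and `context_vector` and `cosine` are hypothetical helper names.

```python
import math
from collections import Counter


def context_vector(sentence, exclude=()):
    """Bag of words for a sentence, leaving out the tuple's own words."""
    skip = {word.lower() for word in exclude}
    words = [word.strip(".,;:!?").lower() for word in sentence.split()]
    return Counter(word for word in words if word and word not in skip)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[word] * v[word] for word in u if word in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Invented example sentences, not taken from the ReVerb source text.
    c1 = context_vector("Matt resides in Sendai with his family near the university.",
                        exclude=("Matt", "resides", "in", "Sendai"))
    c2 = context_vector("Eric lives in Japan with his family near the station.",
                        exclude=("Eric", "lives", "in", "Japan"))
    c3 = context_vector("Eric lives on donuts and coffee every morning.",
                        exclude=("Eric", "lives", "on", "donuts"))
    print(round(cosine(c1, c2), 3), round(cosine(c1, c3), 3))
```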

Page 14:

Other Issues
• Word tense
  • Does lived in belong with lives in?
• Detection of conflicting polarity
  x (Acesulfame_Potassium does_not_promote tooth_decay)
  x (Conservatives should_not_promote democracy)
  x (Website must_not_promote hate)
  x ? (Environmentalists are_not_alone_in_promoting renewable_en
• Semantic type coverage problems
  • Use lexical similarity-based lookup for semantic type too?
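
A first-pass polarity check could simply look for negation tokens inside the relation string, as in the sketch below. The token list `NEGATION_TOKENS` and the function `split_polarity` are assumptions for illustration, not the project's method.

```python
# Assumed list of negation markers (illustrative, not exhaustive).
NEGATION_TOKENS = {"not", "never", "no"}


def split_polarity(relation):
    """Return (relation without negation tokens, True if any were found).

    Relations are expected in the underscore-joined form used above,
    e.g. "does_not_promote".
    """
    parts = relation.lower().split("_")
    negated = any(part in NEGATION_TOKENS for part in parts)
    core = "_".join(part for part in parts if part not in NEGATION_TOKENS)
    return core, negated


if __name__ == "__main__":
    for rel in ["does_not_promote", "should_not_promote",
                "are_not_alone_in_promoting", "promotes"]:
        print(rel, split_polarity(rel))
```

This heuristic flags are_not_alone_in_promoting as negated even though the sentence implies that promotion does happen, which is the kind of case the "?" above points at.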

Page 15:

Progress Report
• Set up git repository
• Implemented:
  • Wrapper for ReVerb (data lookup)
  • WordNet type-lookup
  • Sherlock type-lookup
  • Ontological similarity
• Made a slideshow for the research meeting
  • (the one you are looking at right now)

Page 16:

Plan!!!!!
• Finish similarity score
  • Selecting a WordNet ontological similarity function
    • (Over 5 different evaluations already exist)
  • Implement lexical similarity
    • (Should already be in NLTK somewhere)
  • Implementing contextual similarity
    • (Prepare for the hunt!)
• Selecting & implementing a clustering method
• Test on ReVerb data
  • First on Wikipedia…
  • Then on ClueWeb