Coupling Semi-Supervised Learning of Categories and Relations. Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr., and Tom M. Mitchell, Carnegie Mellon University. 9/18/2012. CS 652, Peter Lindes.

Coupling Semi-Supervised Learning of Categories and Relations

Feb 23, 2016


Celina Nitu

Transcript
Page 1: Coupling Semi-Supervised Learning of Categories and Relations

CS 652, Peter Lindes 1

Coupling Semi-Supervised Learning of Categories and Relations

Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr., and Tom M. Mitchell

Carnegie Mellon University

9/18/2012

Page 2: Coupling Semi-Supervised Learning of Categories and Relations


The Problem

“We present an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors.”

Page 3: Coupling Semi-Supervised Learning of Categories and Relations


Page 4: Coupling Semi-Supervised Learning of Categories and Relations


• Predefined Categories
– Unary predicates (instances are noun phrases)
– Mutually exclusive relationships
– Some subset relationships
– Flag: proper nouns, common nouns, or both
– 10-20 seed instances
– 5 seed patterns (automatically derived - Hearst, 1992)

• Predefined Relations
– Binary predicates (an instance is a pair of noun phrases)
– Mutually exclusive relationships
– 10-20 seed instances
– No seed patterns
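The predicate definitions above could be represented roughly as follows. This is only a minimal sketch: the class names, field names, the `violates_mutual_exclusion` helper, and the example seeds are all hypothetical and are not taken from the slides or from the authors' system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Category:
    """A unary predicate; its instances are noun phrases."""
    name: str
    seed_instances: Set[str]
    seed_patterns: List[str]
    noun_type: str = "both"  # "proper", "common", or "both"
    mutually_exclusive_with: Set[str] = field(default_factory=set)
    subset_of: Set[str] = field(default_factory=set)


@dataclass
class Relation:
    """A binary predicate; its instances are pairs of noun phrases."""
    name: str
    seed_instances: Set[Tuple[str, str]]
    mutually_exclusive_with: Set[str] = field(default_factory=set)


def violates_mutual_exclusion(phrase: str, category: Category,
                              known: Dict[str, Set[str]]) -> bool:
    """Return True if `phrase` already belongs to a category that is
    declared mutually exclusive with `category`.

    `known` maps a category name to its currently known instances.
    """
    return any(phrase in known.get(other, set())
               for other in category.mutually_exclusive_with)


# Hypothetical example predicates and seeds (illustration only).
city = Category("city", {"Pittsburgh", "Boston"}, ["cities such as X"],
                noun_type="proper", mutually_exclusive_with={"country"})
country = Category("country", {"France", "Japan"}, ["countries such as X"],
                   noun_type="proper", mutually_exclusive_with={"city"})
known = {c.name: set(c.seed_instances) for c in (city, country)}
```

With these assumed definitions, a candidate such as "France" would be rejected as a city because it is already a known country, which is one simple way to read the "mutually exclusive relationships" bullet.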


Page 5: Coupling Semi-Supervised Learning of Categories and Relations


The Predicates

Page 6: Coupling Semi-Supervised Learning of Categories and Relations


• Taken from “a 200-million page web crawl”
• Filtered for English “using a stop word ratio threshold”
• Filtered out web spam and adult content “using a ‘bad word’ list”
• Segmented, tokenized, and tagged
• Noisy sentences filtered out
• 514-million sentences used for experiment
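A stop word ratio filter of the kind mentioned above might look like the sketch below. The stop word list, the 0.2 threshold, and the function names are assumptions chosen for illustration; the slides do not give the values that were actually used.

```python
# Tiny illustrative stop word list; a real filter would use a larger one.
STOP_WORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "is", "that", "it"}
)


def stop_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that are English stop words."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in STOP_WORDS for t in tokens) / len(tokens)


def looks_english(text: str, threshold: float = 0.2) -> bool:
    """Keep text whose stop word ratio reaches the (assumed) threshold."""
    return stop_word_ratio(text) >= threshold
```

The idea is that ordinary English prose contains a high share of function words, so text in other languages or keyword lists tends to fall below the threshold.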


Page 7: Coupling Semi-Supervised Learning of Categories and Relations


Evaluation

• 3 Questions:
– “Can CBL iterate many times and still achieve high precision?”
– “How helpful are the types of coupling that we employ?”
– “Can we extend existing semantic resources?”

• 3 Configurations
– Full
– NS: no sharing of promoted items, seeds shared
– NCR: no type checking


Page 8: Coupling Semi-Supervised Learning of Categories and Relations


Results - Precision


Categories (precision, %)

Iterations   Full   NS   NCR
5            92     84   89
10           82     70   84
15           83     63   79

Relations (precision, %)

Iterations   Full   NS   NCR
5            92     86   74
10           83     76   68
15           84     64   62

Precision estimated by human judging of correctness for 30 samples of each predicate.
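The sampling-based estimate described above can be sketched as follows. The function names and the fixed random seed are assumptions for illustration, and how the per-predicate estimates were combined into the reported figures is not stated in the slides.

```python
import random
from typing import Iterable, List, Sequence


def sample_for_judging(promoted: Iterable[str], k: int = 30,
                       seed: int = 0) -> List[str]:
    """Draw up to k promoted items uniformly at random for human judging."""
    rng = random.Random(seed)   # fixed seed only to make the sketch repeatable
    items = sorted(promoted)    # sort so the draw does not depend on set order
    return rng.sample(items, min(k, len(items)))


def estimate_precision(judgments: Sequence[bool]) -> float:
    """Percentage of sampled items that the judge marked correct."""
    if not judgments:
        raise ValueError("no judgments to score")
    return 100.0 * sum(judgments) / len(judgments)
```

For example, if a judge marks 27 of 30 sampled instances of a predicate as correct, `estimate_precision` gives 90.0 for that predicate.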

Page 9: Coupling Semi-Supervised Learning of Categories and Relations


Results - Recall


Promoted categories and relations – 15 iterations

“At this stage of development, obtaining high recall is not a priority … it is our hope that high recall will come with time.”

Page 10: Coupling Semi-Supervised Learning of Categories and Relations


Example Extracted Facts


“We have presented a method of coupling the semi-supervised learning of categories and relations and demonstrated empirically that the coupling forestalls the problem of semantic drift associated with bootstrap learning methods.”

Page 11: Coupling Semi-Supervised Learning of Categories and Relations


Comparison to Freebase


“… our methods can contribute new facts to existing resources.”