
Dec 22, 2015


Semi-Supervised Natural Language Learning Reading Group

• I set up a site at: http://www.cs.cmu.edu/~acarlson/semisupervised/

• Cover other applications of semi-supervised learning?

• Volunteers?

• Every week or bi-weekly?

• Time change? 1pm? Noon?


Unsupervised Word Sense Disambiguation Rivaling Supervised Methods

Author: David Yarowsky (1995)

Presented by: Andy Carlson


Word Sense Disambiguation

• Determining what sense of a word is meant in a given sentence

• “Toyota is considering opening a plant in Detroit.”

• “The banana plant is grown all over the tropics for its fruit.”

• Different from sense induction: we assume the distinct senses are already known


Using unlabeled data

• Two properties of language let us use unlabeled data:

• One sense per collocation: nearby words provide strong and consistent clues

• One sense per discourse: within a document, the sense of a word is highly consistent

• We can base an iterative bootstrapping algorithm on these two properties


One sense per discourse

• How accurate?

• How frequently does it apply?


Decision Lists

• List of rules of the form “collocation => sense”

• Example: life (within 2-10 words) => biological sense of plant

• Rules are ordered by log-likelihood ratio
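
A minimal Python sketch of such a decision list, assuming two senses and treating each example as a set of collocation features. The function names, the `alpha` smoothing constant, and the sense labels are illustrative assumptions, not taken from the slides. Each rule is scored by the absolute value of a smoothed log-likelihood ratio, log(count(sense A, collocation) / count(sense B, collocation)), and a new example is classified by the single strongest rule that matches it.

```python
import math
from collections import defaultdict


def train_decision_list(examples, senses=("living", "factory"), alpha=0.1):
    """Build a decision list from (feature_set, sense) pairs.

    alpha is an assumed additive-smoothing constant that avoids log(0).
    Returns rules as (score, collocation, sense), strongest first.
    """
    sense_a, sense_b = senses
    counts = defaultdict(lambda: {sense_a: 0, sense_b: 0})
    for features, sense in examples:
        for feature in features:
            counts[feature][sense] += 1

    rules = []
    for feature, c in counts.items():
        # Smoothed log-likelihood ratio of sense A versus sense B.
        llr = math.log((c[sense_a] + alpha) / (c[sense_b] + alpha))
        if llr == 0:
            continue  # the collocation carries no evidence either way
        rules.append((abs(llr), feature, sense_a if llr > 0 else sense_b))

    # Order rules by strength of evidence, strongest first.
    rules.sort(key=lambda rule: rule[0], reverse=True)
    return rules


def classify(rules, features):
    """Return (sense, score) from the first matching rule, else (None, 0.0)."""
    for score, feature, sense in rules:
        if feature in features:
            return sense, score
    return None, 0.0
```

Only the highest-ranked matching rule decides the sense; weaker matching rules are ignored rather than combined.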


The algorithm – step 1

• Find all occurrences of the given polysemous word

• We follow examples for the word plant


Step 2 – Initial Labeling

• For each sense of the word, identify a small number of training examples

• Strategies: dictionary words, human-labelling of most frequent collocates, or human-chosen collocates

• Example: the words life and manufacturing are used as seed collocations
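
The seed-collocation strategy can be sketched as follows, assuming each occurrence of the target word is given as a list of lowercase context tokens. The function name `seed_labels` and the choice to leave a context unlabeled when seeds for both senses appear are assumptions made for illustration.

```python
def seed_labels(contexts, seeds):
    """Assign an initial sense to each context that contains a seed collocate.

    contexts: list of token lists, one per occurrence of the target word.
    seeds: dict mapping a seed collocate to its sense.
    Returns a parallel list of senses; None marks an unlabeled example.
    """
    labels = []
    for tokens in contexts:
        matched = {seeds[token] for token in tokens if token in seeds}
        # Label only when exactly one sense is indicated (an assumption);
        # contexts with no seed or with conflicting seeds stay unlabeled.
        labels.append(matched.pop() if len(matched) == 1 else None)
    return labels


contexts = [
    ["plant", "life", "on", "earth"],
    ["manufacturing", "plant", "in", "detroit"],
    ["the", "plant", "closed"],
]
initial = seed_labels(contexts, {"life": "living", "manufacturing": "factory"})
```

Here the first two contexts receive the 'living' and 'factory' labels and the third stays in the unlabeled residual.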


Labeled as ‘living’ plant


Unlabeled examples


Labeled as ‘factory’ plant


Sample initial state


Step 3a

• Train the decision list based on the current labeling of the state space


Step 3b

• Apply learned classifier to all examples


Step 3c

• Optionally, apply the one-sense-per-discourse constraint
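
One possible reading of this constraint, sketched in Python: within each document, every occurrence takes the document's majority sense, which both labels previously unlabeled occurrences and overrides minority labels. The function name, the `min_margin` parameter, and the simple majority vote are assumptions for illustration; the slides do not specify how the constraint is applied.

```python
from collections import Counter


def apply_one_sense_per_discourse(labels, doc_ids, min_margin=1):
    """Propagate each document's majority sense to all of its examples.

    labels: list of senses, with None for unlabeled examples.
    doc_ids: parallel list of document identifiers.
    min_margin: assumed minimum lead of the top sense over the runner-up.
    """
    votes_by_doc = {}
    for label, doc in zip(labels, doc_ids):
        if label is not None:
            votes_by_doc.setdefault(doc, Counter())[label] += 1

    result = list(labels)
    for i, doc in enumerate(doc_ids):
        votes = votes_by_doc.get(doc)
        if not votes:
            continue  # no labeled occurrence in this document
        ranked = votes.most_common(2)
        top_sense, top_count = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if top_count - runner_up >= min_margin:
            result[i] = top_sense
    return result
```

Documents whose labeled occurrences are tied between senses are left unchanged under the default margin.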


Step 3c


Step 3c


After steps 3b and 3c


Step 3d

• Repeat step 3 iteratively

• Details – grow window size for collocations, and randomly perturb the class inclusion threshold


Step 4

• Stop. The algorithm converges to a stable residual set.
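
Steps 3 and 4 together form a loop that can be sketched as below. This is a simplified illustration, not the presented implementation: `train` and `classify` are caller-supplied placeholders for any classifier that returns a sense and a confidence score, `threshold` and `max_iters` are assumed parameters, and the refinements from step 3d (growing the collocation window, perturbing the threshold) as well as the optional step 3c are omitted.

```python
def bootstrap(examples, seed_labels, train, classify, threshold, max_iters=50):
    """Iteratively retrain and relabel until the labeling stops changing.

    examples: list of feature sets.
    seed_labels: parallel list of senses, None for unlabeled examples.
    train(labeled_pairs) -> model
    classify(model, features) -> (sense or None, score)
    Returns (final_labels, final_model).
    """
    labels = list(seed_labels)
    model = None
    for _ in range(max_iters):
        # Step 3a: train on the currently labeled examples.
        labeled = [(x, y) for x, y in zip(examples, labels) if y is not None]
        model = train(labeled)

        # Step 3b: relabel every example; low-confidence ones stay unlabeled.
        new_labels = []
        for features in examples:
            sense, score = classify(model, features)
            confident = sense is not None and score >= threshold
            new_labels.append(sense if confident else None)

        # Step 4: stop once the labeling, and hence the residual, is stable.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, model
```

Passing the classifier in as functions keeps the loop independent of the decision-list details; examples that never reach the threshold make up the stable residual set.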


Sample final state


Final decision list


Results
