Weakly Supervised Machine Reading
Isabelle Augenstein, University College London, October 2016

Transcript
  1. Weakly Supervised Machine Reading. Isabelle Augenstein, University College London, October 2016
  2. What is Machine Reading?
     - Automatic reading (i.e. encoding) of text
     - Automatic understanding of text
     - Useful ingredients for machine reading: representation learning, structured prediction, generating training data
  3. Machine Reading
     [Figure: a question ("What is a good method for machine reading?"), supporting text ("RNNs are a popular method for machine reading"), and a structured query method_for(MR, XXX), with model components u(q), r(s), g(x)]
  4. Machine Reading Tasks
     Word Representation Learning
     - Output: a vector for each word
     - Learn relations between words; learn to distinguish words from one another
     - Unsupervised objective: word embeddings
     Sequence Representation Learning
     - Output: a vector for each sentence / paragraph
     - Learn how likely a sequence is given a corpus; learn which next word is most likely given a sequence of words
     - Unsupervised objective: unconditional language models, natural language generation
     - Supervised objective: sequence classification tasks
  5. Machine Reading Tasks
     Pairwise Sequence Representation Learning
     - Output: a vector for each pair of sentences / paragraphs
     - Learn how likely a sequence is given another sequence and a corpus
     - Pairs of sequences can be encoded independently or conditioned on one another
     - Unsupervised objective: conditional language models
     - Supervised objectives: stance detection, knowledge base slot filling, question answering
  6. Talk Outline
     - Learning emoji2vec Embeddings from their Description: word representation learning, generating training data
     - Numerically Grounded and KB Conditioned Language Models: (conditional) sequence representation learning
     - Stance Detection with Bidirectional Conditional Encoding: conditional sequence representation learning, generating training data
  7. Machine Reading: Word Representation Learning
     [Figure: the question, supporting text, and structured query example from slide 3]
  8. emoji2vec
     - Emoji use has increased
     - Emoji carry sentiment, which could be useful, e.g. for sentiment analysis
  9. emoji2vec
  10. emoji2vec
      - Task: learn representations for emoji
      - Problem: many emoji are used infrequently, and typical word representation learning methods (e.g. word2vec) require them to be seen several times
      - Solution: learn emoji representations from their descriptions
  11. emoji2vec
      - Method: the emoji embedding is the sum of the word embeddings of the words in its description
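A minimal sketch of the summation idea on this slide, assuming pre-trained word vectors (e.g. GoogleNews word2vec) are available as a dict-like lookup; the function and the loader mentioned in the comments are hypothetical, not the released emoji2vec code.

```python
import numpy as np

def emoji_vector(description, word_vectors, dim=300):
    """Emoji embedding as the sum of the word embeddings of its description words.
    Words missing from the vocabulary are skipped."""
    vec = np.zeros(dim)
    for word in description.lower().split():
        if word in word_vectors:
            vec += word_vectors[word]
    return vec

# Hypothetical usage: even a rarely used emoji gets a vector from its description.
# word_vectors = load_pretrained_word2vec()        # assumed helper
# v = emoji_vector("face with tears of joy", word_vectors)
```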
  12. emoji2vec Results
      - Emoji vectors are useful in addition to GoogleNews vectors for a sentiment analysis task
      - The analogy task also works for emoji
  13. emoji2vec Conclusions
      - An alternative source for learning representations (descriptions) is very useful, especially for rare words
  14. Machine Reading: Sequence Representation Learning (Unsupervised)
      [Figure: the question, supporting text, and structured query example from slide 3]
  15. Numerically Grounded + KB Conditioned Language Models
      - Semantic Error Correction with Language Models
  16. Numerically Grounded + KB Conditioned Language Models
      - Problem: clinical data contains many numbers, many of which are unseen at test time
      - Solution: concatenate the RNN input embeddings with numerical representations (see the sketch below)
      - Problem: in addition to the report, clinical data contains an incomplete and inconsistent KB entry for each patient; how can it be used?
      - Solution: lexicalise the KB and condition on it
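A minimal sketch of the numerical grounding step, assuming a PyTorch-style model; the class name and the single-scalar numeric feature are illustrative simplifications, not the authors' implementation.

```python
import torch
import torch.nn as nn

def to_float(token):
    """Numeric representation of a token: its float value if it parses as a number, else 0."""
    try:
        return float(token)
    except ValueError:
        return 0.0

class GroundedInput(nn.Module):
    """Concatenate each token embedding with its numeric representation."""

    def __init__(self, vocab_size, emb_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)

    def forward(self, token_ids, tokens):
        emb = self.embed(token_ids)                            # (seq_len, emb_dim)
        nums = torch.tensor([[to_float(t)] for t in tokens])   # (seq_len, 1)
        return torch.cat([emb, nums], dim=-1)                  # (seq_len, emb_dim + 1)
```

The concatenated vectors would then be fed to the RNN language model; the lexicalised KB entry could, in the same spirit, be prepended to the input sequence, though the exact conditioning mechanism is not shown here.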
  17. Numerically Grounded + KB Conditioned Language Models
  18. Numerically Grounded + KB Conditioned Language Models
      Semantic Error Correction Results
      Model    | MAP   | P     | R     | F1
      Random   | 27.75 |  5.73 | 10.29 |  7.36
      Base LM  | 64.37 | 39.54 | 64.66 | 49.07
      Cond     | 62.76 | 37.46 | 62.20 | 46.76
      Num      | 68.21 | 44.25 | 71.19 | 54.58
      Cond+Num | 69.14 | 45.36 | 71.43 | 55.48
  19. Numerically Grounded + KB Conditioned Language Models: Conclusions
      - Accounting for out-of-vocabulary tokens at test time increases performance
      - Duplicate information from lexicalising the KB can help further
  20. Machine Reading: Pairwise Sequence Representation Learning (Supervised)
      [Figure: the question, supporting text, and structured query example from slide 3]
  21. Stance Detection with Conditional Encoding
      - Example tweet: "@realDonaldTrump is the only honest voice of the @GOP"
      - Task: classify the attitude of a text towards a given target as positive, negative, or neutral
      - The example tweet is positive towards Donald Trump, but (implicitly) negative towards Hillary Clinton
  22. Stance Detection with Conditional Encoding: Challenges
      - Learn a model that interprets the stance of a tweet towards a target that might not be mentioned in the tweet itself
      - Learn a model without labelled training data for the target with respect to which we are predicting the stance
  23. Stance Detection with Conditional Encoding: Challenges
      - Learn a model that interprets the stance of a tweet towards a target that might not be mentioned in the tweet itself
        Solution: a bidirectional conditional model (a sketch follows below)
      - Learn a model without labelled training data for the target with respect to which we are predicting the stance
        Solution 1: use training data labelled for other targets (domain adaptation setting)
        Solution 2: automatically label training data for the target, using a small set of manually defined hashtags (weakly labelled setting)
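A minimal sketch of the conditional encoding idea, assuming PyTorch; only one direction is shown, and all names and sizes are illustrative rather than the paper's implementation. The target is read first, and its final LSTM state initialises the LSTM that reads the tweet, so the tweet representation is conditioned on the target; the BiCond model applies this conditioning in both directions.

```python
import torch
import torch.nn as nn

class ConditionalEncoder(nn.Module):
    """One direction of conditional encoding for stance detection."""

    def __init__(self, vocab_size, emb_dim=100, hidden_dim=100, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.target_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.tweet_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)  # FAVOR / AGAINST / NONE

    def forward(self, target_ids, tweet_ids):
        _, state = self.target_lstm(self.embed(target_ids))        # encode the target
        _, (h, _) = self.tweet_lstm(self.embed(tweet_ids), state)  # tweet conditioned on target
        return self.classify(h[-1])                                # class logits
```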
  24. Stance Detection with Conditional Encoding
  25. Stance Detection with Conditional Encoding: Domain Adaptation Setting
      Train on Legalization of Abortion, Atheism, Feminist Movement, Climate Change is a Real Concern, and Hillary Clinton; evaluate on Donald Trump tweets
      Model  | Stance   | P      | R      | F1
      Concat | FAVOR    | 0.3145 | 0.5270 | 0.3939
      Concat | AGAINST  | 0.4452 | 0.4348 | 0.4399
      Concat | Macro    |        |        | 0.4169
      BiCond | FAVOR    | 0.3033 | 0.5470 | 0.3902
      BiCond | AGAINST  | 0.6788 | 0.5216 | 0.5899
      BiCond | Macro    |        |        | 0.4901
  26. Stance Detection with Conditional Encoding: Weakly Supervised Setting
      Weakly label Donald Trump tweets using hashtags; evaluate on Donald Trump tweets (a sketch of the hashtag labelling follows below)
      Model  | Stance   | P      | R      | F1
      Concat | FAVOR    | 0.5506 | 0.5878 | 0.5686
      Concat | AGAINST  | 0.5794 | 0.4883 | 0.5299
      Concat | Macro    |        |        | 0.5493
      BiCond | FAVOR    | 0.6268 | 0.6014 | 0.6138
      BiCond | AGAINST  | 0.6057 | 0.4983 | 0.5468
      BiCond | Macro    |        |        | 0.5803
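A minimal sketch of the weak labelling step with hypothetical cue hashtags; the exact hashtags, filtering, and preprocessing used in the paper are not reproduced here.

```python
# Illustrative cue hashtags (assumptions, not the paper's exact lists).
FAVOR_TAGS = {"#makeamericagreatagain", "#trump2016"}
AGAINST_TAGS = {"#dumptrump", "#nevertrump"}

def weak_label(tweet):
    """Assign a stance from cue hashtags, or return None if no cue is present.
    The cue hashtags are stripped so the model cannot simply memorise them."""
    tokens = tweet.lower().split()
    if any(t in FAVOR_TAGS for t in tokens):
        stance = "FAVOR"
    elif any(t in AGAINST_TAGS for t in tokens):
        stance = "AGAINST"
    else:
        return None
    cleaned = " ".join(t for t in tweet.split()
                       if t.lower() not in FAVOR_TAGS | AGAINST_TAGS)
    return cleaned, stance

# Hypothetical usage:
# weak_label("Vote #Trump2016 in November")  ->  ("Vote in November", "FAVOR")
```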
  27. Stance Detection with Conditional Encoding: Other Findings
      - Pre-training word embeddings on a large in-domain corpus with an unsupervised objective, then continuing to optimise them towards the supervised objective, works well (see the sketch after this slide)
      - Better than pre-training without further optimisation, random initialisation, or Google News embeddings
      - LSTM encoding of tweets and targets works better than a sum-of-word-embeddings baseline, despite the small training set (7k to 14k instances)
      - Almost all instances in which the target is mentioned in the tweet have a non-neutral stance
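A minimal sketch of that initialisation, assuming PyTorch and a hypothetical matrix `pretrained` of word2vec vectors trained on an in-domain tweet corpus with an unsupervised objective.

```python
import torch
import torch.nn as nn

def init_embedding(pretrained):
    """Initialise an embedding layer from pre-trained vectors and keep it trainable,
    so the vectors continue to be optimised towards the supervised objective."""
    weights = torch.as_tensor(pretrained, dtype=torch.float)
    # freeze=True would correspond to "pre-training without further optimisation"
    return nn.Embedding.from_pretrained(weights, freeze=False)
```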
  28. Stance Detection with Conditional Encoding: Conclusions
      - Modelling the relationship between sentence pairs is important
      - Automatic labelling of in-domain tweets is even more important
      - Learning sequence representations is also a good approach for small data
  29. Thank you!
      isabelleaugenstein.github.io | i.augenstein@ucl.ac.uk | @IAugenstein | github.com/isabelleaugenstein
  30. References
      - Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel. emoji2vec: Learning Emoji Representations from their Description. SocialNLP at EMNLP 2016. https://arxiv.org/abs/1609.08359
      - Georgios Spithourakis, Isabelle Augenstein, Sebastian Riedel. Numerically Grounded Language Models for Semantic Error Correction. EMNLP 2016. https://arxiv.org/abs/1608.04147
      - Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva. Stance Detection with Bidirectional Conditional Encoding. EMNLP 2016. https://arxiv.org/abs/1606.05464
  31. Collaborators
      - Kalina Bontcheva, University of Sheffield
      - Andreas Vlachos, University of Sheffield
      - George Spithourakis, UCL
      - Matko Bošnjak, UCL
      - Sebastian Riedel, UCL
      - Tim Rocktäschel, UCL
      - Ben Eisner, Princeton