Transcript
1. Weakly Supervised Machine Reading. Isabelle Augenstein, University College London, October 2016.
2. What is Machine Reading? Automatic reading (i.e. encoding of text) and automatic understanding of text. Useful ingredients for machine reading: representation learning, structured prediction, generating training data.
3. Machine Reading [Figure: supporting text "RNNs are a popular method for machine reading"; question "What is a good method for machine reading?"; structured answer method_for(MR, XXX); encodings r(s), u(q), g(x)]
4. Machine Reading Tasks. Word Representation Learning: output is a vector for each word; learn relations between words and learn to distinguish words from one another; unsupervised objective: word embeddings. Sequence Representation Learning: output is a vector for each sentence / paragraph; learn how likely a sequence is given a corpus, and what the most likely next word is given a sequence of words; unsupervised objectives: unconditional language models, natural language generation; supervised objective: sequence classification tasks.
5. Machine Reading Tasks. Pairwise Sequence Representation Learning: output is a vector for each pair of sentences / paragraphs; learn how likely a sequence is given another sequence and a corpus; the two sequences can be encoded independently or conditioned on one another; unsupervised objective: conditional language models; supervised objectives: stance detection, knowledge base slot filling, question answering.
6. Talk Outline. Learning emoji2vec Embeddings from their Description (word representation learning, generating training data). Numerically Grounded and KB Conditioned Language Models ((conditional) sequence representation learning). Stance Detection with Bidirectional Conditional Encoding (conditional sequence representation learning, generating training data).
7. Machine Reading: Word Representation Learning [Figure: supporting text "RNNs are a popular method for machine reading"; question "What is a good method for machine reading?"; structured answer method_for(MR, XXX)]
8. emoji2vec. Emoji use has increased. Emoji carry sentiment, which could be useful e.g. for sentiment analysis.
9. emoji2vec
10. emoji2vec. Task: learn representations for emojis. Problem: many emojis are used infrequently, and typical word representation learning methods (e.g. word2vec) require a token to be seen several times. Solution: learn emoji representations from their descriptions.
11. emoji2vec. Method: the emoji embedding is the sum of the word embeddings of the words in its description, as sketched below.
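A minimal sketch of the summation described on this slide, assuming a hypothetical dictionary of pretrained word vectors (e.g. from the GoogleNews word2vec model); the trained emoji2vec model involves more than this sum, which is all that is shown here:

```python
import numpy as np

# Hypothetical pretrained word vectors (word -> 300-d array); random values
# stand in for real word2vec embeddings in this sketch.
word_vectors = {w: np.random.randn(300)
                for w in "face with tears of joy".split()}

def emoji_vector(description, word_vectors):
    """Emoji embedding as the sum of the word embeddings of its description."""
    words = [w for w in description.lower().split() if w in word_vectors]
    return np.sum([word_vectors[w] for w in words], axis=0)

v_joy = emoji_vector("face with tears of joy", word_vectors)  # vector for the joy emoji
```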
12. emoji2vec. Results: emoji vectors are useful in addition to GoogleNews vectors for a sentiment analysis task; the analogy task also works for emojis (see the sketch below).
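The analogy test is the usual vector-offset search run directly on the learned vectors; the helper below is a generic sketch, and the candidate set and any specific analogies are illustrative rather than taken from the paper:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def analogy(a, b, c, vectors):
    """Return the key whose vector is closest to v(b) - v(a) + v(c),
    excluding the three query items themselves."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = {k: v for k, v in vectors.items() if k not in {a, b, c}}
    return max(candidates, key=lambda k: cosine(candidates[k], target))
```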
13. emoji2vec. Conclusions: an alternative source for learning representations (descriptions) is very useful, especially for rare words.
14. Machine Reading: Sequence Representation Learning (Unsupervised) [Figure: supporting text "RNNs are a popular method for machine reading"; question "What is a good method for machine reading?"; structured answer method_for(MR, XXX)]
15. Numerically Grounded + KB Conditioned Language Models: Semantic Error Correction with Language Models.
16. Numerically Grounded + KB Conditioned Language Models. Problem: clinical data contains many numbers, and many of them are unseen at test time. Solution: concatenate the RNN input embeddings with numerical representations. Problem: clinical data also contains, in addition to the report, an incomplete and inconsistent KB entry for each patient; how can it be used? Solution: lexicalise the KB and condition on it (see the sketch below).
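A minimal sketch of how such a grounded input could be assembled per token, assuming a lookup table of word embeddings and some fixed vector kb_vector encoding the lexicalised KB entry; using the token's own value as the numerical representation is one simple choice for illustration, not necessarily the one used in the paper:

```python
import numpy as np

EMB_DIM = 50  # illustrative embedding size

def numeric_feature(token):
    """Scalar grounding: the token's value if it parses as a number, else 0."""
    try:
        return np.array([float(token)])
    except ValueError:
        return np.array([0.0])

def grounded_input(token, word_vectors, kb_vector):
    """RNN input = [word embedding ; numeric representation ; lexicalised KB context]."""
    emb = word_vectors.get(token, np.zeros(EMB_DIM))  # unseen numbers get a zero embedding
    return np.concatenate([emb, numeric_feature(token), kb_vector])
```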
17. Numerically Grounded + KB Conditioned Language Models
18. Numerically Grounded + KB Conditioned Language Models. Semantic Error Correction Results:

Model     MAP    P      R      F1
Random    27.75  5.73   10.29  7.36
Base LM   64.37  39.54  64.66  49.07
Cond      62.76  37.46  62.20  46.76
Num       68.21  44.25  71.19  54.58
Cond+Num  69.14  45.36  71.43  55.48
19. Numerically Grounded + KB Conditioned Language Models. Conclusions: accounting for out-of-vocabulary tokens at test time increases performance; the duplicate information from lexicalising the KB can help further.
20. Machine Reading: Pairwise Sequence Representation Learning (Supervised) [Figure: supporting text "RNNs are a popular method for machine reading"; question "What is a good method for machine reading?"; structured answer method_for(MR, XXX); encodings r(s), u(q), g(x)]
21. Stance Detection with Conditional Encoding. Example tweet: "@realDonaldTrump is the only honest voice of the @GOP". Task: classify the attitude of a text towards a given target as positive, negative, or neutral. The example tweet is positive towards Donald Trump, but (implicitly) negative towards Hillary Clinton.
22. Stance Detection with Conditional Encoding. Challenges: (1) learn a model that interprets the stance of a tweet towards a target that might not be mentioned in the tweet itself; (2) learn the model without labelled training data for the target with respect to which we are predicting the stance.
23. Stance Detection with Conditional Encoding. Challenge 1: learn a model that interprets the stance of a tweet towards a target that might not be mentioned in the tweet itself. Solution: a bidirectional conditional model. Challenge 2: learn the model without labelled training data for the target with respect to which we are predicting the stance. Solution 1: use training data labelled for other targets (domain adaptation setting). Solution 2: automatically label training data for the target using a small set of manually defined hashtags (weakly labelled setting), as sketched below.
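A minimal sketch of the hashtag-based weak labelling in Solution 2; the hashtag sets below are illustrative placeholders, not the ones used in the paper:

```python
# Small, manually defined hashtag sets (illustrative only).
FAVOR_TAGS = {"#makeamericagreatagain", "#trump2016"}
AGAINST_TAGS = {"#dumptrump", "#nevertrump"}

def weak_label(tweet):
    """Assign a stance label from hashtags; return None if ambiguous or unmatched."""
    tags = {t.lower() for t in tweet.split() if t.startswith("#")}
    favor, against = bool(tags & FAVOR_TAGS), bool(tags & AGAINST_TAGS)
    if favor == against:          # neither tag set matched, or both did
        return None
    return "FAVOR" if favor else "AGAINST"
```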
24. Stance Detection with Conditional Encoding
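Below is a minimal PyTorch sketch of the conditional-encoding idea behind the model on this slide, with illustrative sizes; only one encoding direction is shown, whereas the BiCond model of the talk conditions in both directions:

```python
import torch
import torch.nn as nn

class ConditionalEncoder(nn.Module):
    """One-directional sketch: encode the target with an LSTM, then initialise
    a second LSTM over the tweet with the target's final (h, c) state."""

    def __init__(self, vocab_size, emb_dim=100, hidden_dim=64, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.target_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.tweet_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)  # FAVOR / AGAINST / NONE

    def forward(self, target_ids, tweet_ids):
        _, target_state = self.target_lstm(self.embed(target_ids))
        # Condition the tweet encoder on the target by reusing its final state.
        _, (h, _) = self.tweet_lstm(self.embed(tweet_ids), target_state)
        return self.classify(h[-1])

# Toy usage with random token ids (batch of 2, hypothetical vocabulary of 5000).
model = ConditionalEncoder(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 4)), torch.randint(0, 5000, (2, 20)))
```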
25. Stance Detection with Conditional Encoding. Domain Adaptation Setting: train on Legalization of Abortion, Atheism, Feminist Movement, Climate Change is a Real Concern, and Hillary Clinton; evaluate on Donald Trump tweets.

Model   Stance    P       R       F1
Concat  FAVOR     0.3145  0.5270  0.3939
Concat  AGAINST   0.4452  0.4348  0.4399
Concat  Macro                     0.4169
BiCond  FAVOR     0.3033  0.5470  0.3902
BiCond  AGAINST   0.6788  0.5216  0.5899
BiCond  Macro                     0.4901
26. Stance Detection with Conditional Encoding. Weakly Supervised Setting: weakly label Donald Trump tweets using hashtags, evaluate on Donald Trump tweets.

Model   Stance    P       R       F1
Concat  FAVOR     0.5506  0.5878  0.5686
Concat  AGAINST   0.5794  0.4883  0.5299
Concat  Macro                     0.5493
BiCond  FAVOR     0.6268  0.6014  0.6138
BiCond  AGAINST   0.6057  0.4983  0.5468
BiCond  Macro                     0.5803
27. Stance Detection with Conditional Encoding. Other findings: pre-training word embeddings on a large in-domain corpus with an unsupervised objective and continuing to optimise them towards the supervised objective works well (see the sketch below); this is better than pre-training without further optimisation, random initialisation, or Google News embeddings. LSTM encoding of tweets and targets works better than a sum-of-word-embeddings baseline, despite the small training set (7k-14k instances). Almost all instances in which the target is mentioned in the tweet have a non-neutral stance.
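The pre-training finding amounts to initialising the embedding layer with the in-domain vectors and leaving it trainable under the supervised objective; a minimal PyTorch sketch, where the pretrained matrix and sizes are placeholders:

```python
import torch
import torch.nn as nn

# Hypothetical pretrained in-domain embedding matrix (vocab_size x emb_dim),
# e.g. from word2vec trained on a large unlabelled tweet corpus.
pretrained = torch.randn(5000, 100)

# Initialise from the pretrained vectors and keep them trainable, so the
# supervised stance objective continues to optimise them; freeze=True would
# correspond to the "pre-training without further optimisation" baseline.
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)
```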
28. Stance Detection with Conditional Encoding. Conclusions: modelling the sentence pair relationship is important; automatic labelling of in-domain tweets is even more important; learning sequence representations is also a good approach for small data.
30. References
Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel. emoji2vec: Learning Emoji Representations from their Description. SocialNLP at EMNLP 2016. https://arxiv.org/abs/1609.08359
Georgios Spithourakis, Isabelle Augenstein, Sebastian Riedel. Numerically Grounded Language Models for Semantic Error Correction. EMNLP 2016. https://arxiv.org/abs/1608.04147
Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva. Stance Detection with Bidirectional Conditional Encoding. EMNLP 2016. https://arxiv.org/abs/1606.05464
31. Collaborators: Kalina Bontcheva (University of Sheffield), Andreas Vlachos (University of Sheffield), George Spithourakis (UCL), Matko Bošnjak (UCL), Sebastian Riedel (UCL), Tim Rocktäschel (UCL), Ben Eisner (Princeton).