A FRAMEWORK FOR AUTOMATIC QUESTION GENERATION FROM TEXT USING DEEP REINFORCEMENT LEARNING

SCAI 2019, 12/08/2019

VISHWAJEET KUMAR 1,2,3, GANESH RAMAKRISHNAN 2, YUAN-FANG LI 3

1 IITB-MONASH RESEARCH ACADEMY, 2 IIT BOMBAY, 3 MONASH UNIVERSITY



OUTLINE

• Introduction & motivation

• The generator-evaluator framework

• Evaluation

• Conclusion

WHEN/WHERE/WHY DO WE ASK QUESTIONS?

• Organisation: policies, product & service documentation, patents, meeting minutes, FAQ, …

• Education: reading comprehension assessment

• Healthcare: clinical notes

• Technology: chatbots, customer support, …

THE QUESTION GENERATION TASK

• Goal

  • Automatically generating questions

  • From sentences or paragraphs

• Challenges

  • Questions must be well-formed

  • Questions must be relevant

  • Questions must be answerable

MOTIVATION

• QG: a (relatively) recent task: a Seq2Seq problem

• RNN-based models with attention perform well for short sentences

• However, they perform poorly on longer text

• Cross-entropy loss may make the training process brittle: the exposure bias problem

EXAMPLE GENERATED QUESTIONS

Example text: “new york city traces its roots to its 1624 founding as a trading post by colonists of the dutch republic and was named new amsterdam in 1626 .”

MODEL                              QUESTION
Seq2Seq with cross-entropy loss    what year was new york named ?
Copy-aware seq2seq                 what year was new new amsterdam named ?
GE (Seq2seq with BLEU)             what year was new york founded ?

TO BE MORE SPECIFIC

• QG performance is evaluated using discrete metrics like BLEU, ROUGE etc., not cross-entropy loss

• Need for a mechanism to deal with relatively rare words and important words

• Need to handle the word repetition problem while decoding
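As a rough illustration of why metrics such as BLEU are discrete (and so cannot be optimised directly by a cross-entropy objective), the sketch below computes clipped n-gram precision, the core quantity inside BLEU. It is a minimal sketch, not the full BLEU definition (no brevity penalty, no geometric mean over n). The reference question is hypothetical and invented for illustration; the candidate is the repeated-word output from the example slide.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


# Hypothetical reference question (an assumption, not taken from the dataset).
reference = "what year was new amsterdam named ?".split()
# Output with a repeated word, as in the copy-aware seq2seq example.
repeated = "what year was new new amsterdam named ?".split()

p1 = clipped_precision(repeated, reference, 1)
p2 = clipped_precision(repeated, reference, 2)
print(p1, p2)
```

The score changes only in steps as whole tokens match or fail to match, and the repeated "new" is penalised by the clipping.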

A GENERATOR-EVALUATOR FRAMEWORK FOR QG

• Generator (semantics)

  • Identifies pivotal answers (Pointer Networks)

  • Recognises contextually important keywords (Copy)

  • Avoids redundancy (Coverage)

• Evaluator (structure)

  • Optimises conformity towards ground-truth questions

  • Reinforcement learning with performance metrics as rewards
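The coverage idea can be sketched as follows. This assumes the common formulation from pointer-generator models, where the coverage vector is the running sum of past attention distributions and the penalty at each step is the overlap between current attention and accumulated coverage. The slides do not state the exact formulation used, and the attention values below are made up for illustration.

```python
def update_coverage(coverage, attention):
    """Add the current attention distribution to the running coverage vector."""
    return [c + a for c, a in zip(coverage, attention)]


def coverage_penalty(coverage, attention):
    """Penalty for re-attending to source positions that are already covered."""
    return sum(min(c, a) for c, a in zip(coverage, attention))


# Hypothetical attention over three source tokens at two decoding steps.
attention_steps = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
]

coverage = [0.0, 0.0, 0.0]
penalties = []
for attention in attention_steps:
    # The penalty is computed against coverage from earlier steps only.
    penalties.append(coverage_penalty(coverage, attention))
    coverage = update_coverage(coverage, attention)

print(penalties, coverage)
```

The second step attends mostly to the same position as the first, so it receives a large penalty; a decoder trained with such a term is discouraged from repeating itself.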

REINFORCEMENT LEARNING FOR QG

[Figure: reinforcement learning loop for QG. Labels: Generator; words and the context vector; BLEU, ROUGE-L, METEOR, etc.; parameter update.]
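The loop on this slide (the generator emits words, a metric scores them, the score drives a parameter update) can be sketched with a plain REINFORCE-style surrogate loss. This is a minimal sketch under assumptions: the slides do not say which policy-gradient variant or baseline is used, and the probabilities and rewards below are invented for illustration.

```python
import math


def reinforce_loss(log_probs, reward, baseline=0.0):
    """Surrogate loss for one sampled question.

    Minimising it raises the log-probability of the sample when the
    reward exceeds the baseline and lowers it otherwise.
    """
    return -(reward - baseline) * sum(log_probs)


# Hypothetical per-token probabilities of one sampled question.
sample_log_probs = [math.log(0.5), math.log(0.4), math.log(0.8)]

# Hypothetical metric scores (for example BLEU or ROUGE-L scaled to [0, 1]).
loss_good = reinforce_loss(sample_log_probs, reward=0.6, baseline=0.4)
loss_bad = reinforce_loss(sample_log_probs, reward=0.2, baseline=0.4)
print(loss_good, loss_bad)
```

In a real system `log_probs` would come from the decoder and the gradient would flow through them; the reward itself is treated as a constant.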

ARCHITECTURE

[Figure: generator-evaluator architecture. Generator labels: Bi-LSTM Answer Encoded Sentence Encoder; Pointer Network Answer Encoder; LSTM Question Decoder; attention distribution; context vector; word coverage vector; vocabulary distribution; Pcg; final distribution. Evaluator labels: Ygold; Ysamples; reward; training data.]
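One plausible reading of how the attention distribution, vocabulary distribution and Pcg combine into the final distribution is the pointer-generator style mixture sketched below. It is an assumption that Pcg weights generation against copying in this way; the slides only name the components. All numbers and tokens are hypothetical.

```python
def final_distribution(p_cg, vocab_dist, attention, source_tokens):
    """Mix a vocabulary distribution with a copy distribution over source tokens.

    p_cg is assumed to be the probability of generating from the vocabulary;
    (1 - p_cg) is the probability of copying a source token via attention.
    """
    final = {word: p_cg * prob for word, prob in vocab_dist.items()}
    for token, weight in zip(source_tokens, attention):
        # Source tokens outside the vocabulary receive probability only by copying.
        final[token] = final.get(token, 0.0) + (1.0 - p_cg) * weight
    return final


# Hypothetical values for illustration.
vocab_dist = {"what": 0.5, "year": 0.3, "<unk>": 0.2}
source_tokens = ["new", "amsterdam", "year"]
attention = [0.2, 0.7, 0.1]

dist = final_distribution(0.6, vocab_dist, attention, source_tokens)
print(dist)
```

The rare word "amsterdam" is absent from the toy vocabulary but still gets probability mass through the copy path, which is the purpose of the copy mechanism.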

REWARD FUNCTIONS

• General rewards

  • BLEU, GLEU, METEOR, ROUGE-L

  • DAS: decomposable attention that considers variability

• QG-specific rewards

  • QSS: degree of overlap between generated question & source sentence

  • ANSS: degree of overlap between predicted answer & gold answer
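Two of these rewards can be sketched in a few lines. The ROUGE-L function below uses the longest common subsequence with a plain F1 (the standard metric allows a weighted F-measure), and `token_overlap` is a hypothetical stand-in for an overlap reward such as QSS; the slides do not give the exact QSS or ANSS formulas. The inputs are the questions and sentence from the example slide, and treating one generated question as the reference is done purely for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as an unweighted F1 over LCS precision and recall."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


def token_overlap(question, sentence):
    """Fraction of question tokens that also occur in the sentence (assumed QSS-like score)."""
    if not question:
        return 0.0
    sentence_vocab = set(sentence)
    return sum(tok in sentence_vocab for tok in question) / len(question)


sentence = ("new york city traces its roots to its 1624 founding as a trading post "
            "by colonists of the dutch republic and was named new amsterdam in 1626 .").split()
q_named = "what year was new york named ?".split()
q_founded = "what year was new york founded ?".split()

rouge = rouge_l_f1(q_founded, q_named)
overlap = token_overlap(q_founded, sentence)
print(rouge, overlap)
```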

EVALUATION: DATASET & BASELINES

• Dataset: SQuAD

  • Train: 70,484

  • Valid: 10,570

  • Test: 11,877

• Baselines

  • Learning to ask (L2A): vanilla Seq2Seq model (ACL’17)

  • NQGLC: Seq2Seq + ground-truth answer encoding (NAACL’18)

  • AutoQG: Seq2Seq + answer prediction (PAKDD’18)

  • SUM: RL-based summarisation (ICLR’18)

AUTOMATIC EVALUATION

MODEL                BLEU1   BLEU2   BLEU3   BLEU4     METEOR    ROUGE-L
L2A                  43.21   24.77   15.93   10.60     16.39     38.98
AutoQG               44.68   26.96   18.18   12.68     17.86     40.59
NQGLC                -       -       -       (13.98)   (18.77)   (42.72)
SUM BLEU             11.20   3.50    1.21    0.45      6.68      15.25
SUM ROUGE            11.94   3.95    1.65    0.082     6.61      16.17
GE BLEU              46.84   29.38   20.33   14.47     19.08     41.07
GE BLEU+QSS+ANSS     46.59   29.68   20.79   15.04     19.32     41.73
GE DAS               44.64   28.25   19.63   14.07     18.12     42.07
GE DAS+QSS+ANSS      46.07   29.78   21.43   16.22     19.44     42.84
GE GLEU              45.20   29.22   20.79   15.26     18.98     43.47
GE GLEU+QSS+ANSS     47.04   30.03   21.15   15.92     19.05     43.55
GE ROUGE             47.01   30.67   21.95   16.17     19.85     43.90
GE ROUGE+QSS+ANSS    48.13   31.15   22.01   16.48     20.21     44.11

HUMAN EVALUATION

MODEL                SYNTAX           SEMANTICS        RELEVANCE
                     SCORE   KAPPA    SCORE   KAPPA    SCORE   KAPPA
L2A                  39.2    0.49     39      0.49     29      0.40
AutoQG               51.5    0.49     48      0.78     48      0.50
GE BLEU              47.5    0.52     49      0.45     41.5    0.44
GE BLEU+QSS+ANSS     82      0.63     75.3    0.68     78.33   0.46
GE DAS               68      0.40     63      0.33     41      0.40
GE DAS+QSS+ANSS      84      0.57     81.3    0.60     74      0.47
GE GLEU              60.5    0.50     62      0.52     44      0.41
GE GLEU+QSS+ANSS     78.3    0.68     74.6    0.71     72      0.40
GE ROUGE             69.5    0.56     68      0.58     53      0.43
GE ROUGE+QSS+ANSS    79.3    0.52     72      0.41     67      0.41
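The kappa columns report inter-rater agreement. As a minimal sketch, assuming two annotators and Cohen's kappa (the slides do not say which kappa variant or how many annotators were used), agreement can be computed as below. The ratings are hypothetical and do not come from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical binary judgements (1 = acceptable, 0 = not) on eight questions.
rater_a = [1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 0, 1, 0, 1, 1, 1]

kappa = cohen_kappa(rater_a, rater_b)
print(kappa)
```

Here the raters agree on 6 of 8 items (0.75), but chance agreement is already about 0.53, so kappa is noticeably lower than raw agreement.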

CONCLUSION

• A generator-evaluator framework for question generation from text

• Takes into account both semantics & structure

• Proposes novel reward functions

• Evaluation shows state-of-the-art performance

ANY QUESTIONS?

THANK YOU!

REFERENCES

• Xinya Du, Junru Shao, and Claire Cardie. Learning to ask: Neural question generation for reading comprehension. In ACL, volume 1, pages 1342–1352, 2017.

• Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, and Yuan-Fang Li. Automating reading comprehension by generating question and answer pairs. In PAKDD, 2018.

• Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ questions for machine comprehension of text. In EMNLP 2016, pages 2383–2392. ACL, November 2016.

• Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, and Daniel Gildea. Leveraging context information for natural question generation. In NAACL, pages 569–574, 2018.

• Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. In ICLR, 2018.

SOME MORE EXAMPLES

Text: “critics such as economist paul krugman and u.s. treasury secretary timothy geithner have argued that the regulatory framework did not keep pace with financial innovation, such as the increasing importance of the shadow banking system, derivatives and off-balance sheet financing.”

MODEL       QUESTION
AutoQG      who argued that the regulatory framework was not keep to take pace with financial innovation ?
GE BLEU     what was the name of the increasing importance of the shadow banking system ?
GE DAS      what was the main focus of the problem with the shadow banking system ?
GE GLEU     what was not keep pace with financial innovation ?
GE ROUGE    what did paul krugman and u.s. treasury secretary disagree with ?

“Legislative power in Warsaw is vested in a unicameral Warsaw City Council (Rada Miasta), which comprises 60 members. Council members are elected directly every four years. Like most legislative bodies, the City Council divides itself into committees which have the oversight of various functions of the city government.”

– HTTPS://EN.WIKIPEDIA.ORG/WIKI/WARSAW

1 How many members are in the Warsaw City Council?

2 How often are the Rada Miasta elected?

3 The City Council divides itself into what?