Neural Module Networks for Reasoning Over Text, by Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh & Matt Gardner. Presented by: Jigyasa Gupta

Neural Module Networks for Reasoning Over Textmausam/courses/col873/spring... · Neural Modules • Introduced in the paper “Deep Compositional Question Answering with Neural Module

Jun 29, 2020

Transcript
Neural Module Networks for Reasoning Over Text

Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh & Matt Gardner

Presented by: Jigyasa Gupta

Neural Modules

• Introduced in the paper “Deep Compositional Question Answering with Neural Module Networks” by Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein for the visual QA (VQA) task

Slides on Neural Modules taken from Berthy Feng, a student at Princeton University

Motivation: Compositional Nature of VQA

Motivation: Combine Both Approaches

Modules

• Attention (Find)
• Re-Attention (Transform)
• Combination
• Classification (Describe)
• Measurement

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, and Matt Gardner

Neural Module Networks for Reasoning Over Text

• Use Neural Module Networks (NMNs) to answer compositional questions against a paragraph of text.
• These questions require multiple steps of reasoning: discrete, symbolic operations (as shown in the DROP dataset)
• NMNs are
  • Interpretable
  • Modular
  • Compositional

Example

NMN components

• Modules: differentiable modules that perform reasoning over text and symbols in a probabilistic manner
• Contextual token representations for the question (n × d) and the paragraph (m × d)
  • n and m are the numbers of tokens in the question and the paragraph; d is the embedding size (bidirectional GRU or pretrained BERT)
• Question parser: encoder-decoder model with attention that maps the question into an executable program
• Learning uses two likelihood terms:
  • the likelihood of the program under the question-parser model, p(z|q)
  • for any given program z, the likelihood of the gold answer, p(y∗|z)
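
The two likelihood terms in the learning bullet can be combined into one objective. The following is a minimal sketch, assuming the objective marginalizes over a small set of candidate programs (for example, the parser's beam). The function name and the toy probabilities are hypothetical and do not come from the slides.

```python
import math

def marginal_log_likelihood(program_probs, answer_probs):
    """log sum_z p(z|q) * p(y*|z) over a set of candidate programs.

    program_probs: p(z|q) for each candidate program z (parser output).
    answer_probs:  p(y*|z) for the same programs (executor output).
    """
    total = sum(pz * py for pz, py in zip(program_probs, answer_probs))
    return math.log(total)

# Toy numbers (illustrative only): two candidate programs.
loss = -marginal_log_likelihood([0.7, 0.3], [0.9, 0.1])
```

Because both factors sit inside the same sum, a gradient step can raise the parser's probability of programs that execute to the gold answer and the executor's probability of the gold answer under likely programs at the same time.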

NMN components

[Figure: the question embedding feeds the question parser (an encoder followed by decoder steps); the program executor (z) runs Module 1 to Module 4 over the paragraph embedding to produce the answer (y*); parser and executor are learned jointly.]

Learning Challenges

• Question parser: free-form real-world questions have diverse grammar and lexical variability
• Program executor: no intermediate feedback is available for modules, so errors get propagated
• Joint learning: supervision is available only at the gold-answer level, so it is difficult to learn the question parser and program executor jointly

Modules

find(Q) → P
For question spans in the input, find similar spans in the passage

• Compute a similarity matrix S between question and paragraph token embeddings
• Normalize S to get an attention matrix
• Compute the expected paragraph attention

Input: question attention map

Output: paragraph attention map
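
The three steps of find can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the similarity is a plain dot product (a learned scoring function could be used instead), and the names `find`, `q_emb`, and `p_emb` are hypothetical.

```python
import math

def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def find(question_attn, q_emb, p_emb):
    """Map a question attention (length n) to a paragraph attention (length m)."""
    # S[i][j]: similarity between question token i and paragraph token j.
    # Assumption: dot product stands in for the real scoring function.
    S = [[dot(q, p) for p in p_emb] for q in q_emb]
    # Normalize each row over paragraph tokens to get the attention matrix A.
    A = [softmax(row) for row in S]
    # Expected paragraph attention: sum_i question_attn[i] * A[i][j].
    return [sum(question_attn[i] * A[i][j] for i in range(len(q_emb)))
            for j in range(len(p_emb))]
```

Since every row of A sums to 1 and the question attention sums to 1, the output is again a distribution over paragraph tokens.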

find(Q) → P: Example

The question attention map is available from the encoder-decoder of the parser

filter(Q, P) → P
Based on the question, select a subset of spans from the input

• Compute a weighted sum of the question-token embeddings
• Compute a locally-normalized paragraph-token mask
• Output is the normalized, masked input paragraph attention
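
A minimal sketch of the filter steps, assuming a dot-product score passed through a sigmoid as the "locally normalized" mask (each paragraph token is scored independently). The function and parameter names are hypothetical, and a learned scorer would replace the dot product in practice.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def filter_attention(question_attn, q_emb, p_emb, paragraph_attn):
    d = len(q_emb[0])
    # Weighted sum of question-token embeddings.
    q = [sum(a * e[k] for a, e in zip(question_attn, q_emb)) for k in range(d)]
    # Locally normalized mask: an independent score in (0, 1) per paragraph token.
    mask = [sigmoid(dot(q, p)) for p in p_emb]
    # Mask the input paragraph attention, then renormalize it to sum to 1.
    masked = [m * a for m, a in zip(mask, paragraph_attn)]
    total = sum(masked)
    return [x / total for x in masked]
```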

filter(Q, P) → P: Example

relocate(Q, P) → P
Find the argument asked for in the question for the input paragraph spans

• Compute a weighted sum of the question-token embeddings using the attention map
• Compute a paragraph-to-paragraph attention matrix R
• Output attention is a weighted sum of the rows of R, weighted by the input paragraph attention
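
The relocate steps can be sketched the same way. The scoring used to build R here, a dot product between (q + P_i) and P_j, is an assumption made for illustration; the slide does not give the scoring function, and all names are hypothetical.

```python
import math

def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relocate(question_attn, q_emb, p_emb, paragraph_attn):
    d = len(q_emb[0])
    # Weighted sum of question-token embeddings using the question attention.
    q = [sum(a * e[k] for a, e in zip(question_attn, q_emb)) for k in range(d)]
    # R[i][j]: attention from paragraph token i to paragraph token j, given q.
    R = [softmax([dot([qk + pk for qk, pk in zip(q, p_i)], p_j) for p_j in p_emb])
         for p_i in p_emb]
    # Output: rows of R weighted by the input paragraph attention.
    m = len(p_emb)
    return [sum(paragraph_attn[i] * R[i][j] for i in range(m)) for j in range(m)]
```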

find-num(P) → N and find-date(P) → D
Find the number(s)/date(s) associated with the input paragraph spans

• Extract numbers and dates as a pre-processing step, e.g. [2, 2, 3, 4]
• Compute a token-to-number similarity matrix
• Compute an expected distribution over the number tokens
• Aggregate the probabilities for number tokens that share the same value
  • Example: {2, 3, 4} with N = [0.5, 0.3, 0.2]
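
The aggregation step can be shown with a small example. The per-token probabilities below are hypothetical, chosen so that the number tokens [2, 2, 3, 4] reproduce the slide's result N = [0.5, 0.3, 0.2] over the values {2, 3, 4}.

```python
from collections import defaultdict

def aggregate_by_value(number_tokens, token_probs):
    """Sum the probabilities of number tokens that carry the same value."""
    totals = defaultdict(float)
    for value, prob in zip(number_tokens, token_probs):
        totals[value] += prob
    values = sorted(totals)
    return values, [totals[v] for v in values]

# Hypothetical distribution over the four number tokens.
values, N = aggregate_by_value([2, 2, 3, 4], [0.1, 0.4, 0.3, 0.2])
```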

find-num(P) → N: Example

count(P) → C
Count the number of input passage spans

• count([0, 0, 0.3, 0.3, 0, 0.4]) = 2
• The module first scales the attention using the values [1, 2, 5, 10] to convert it into a matrix P_scaled ∈ R^{m×4}
• Pretraining this module on synthetic data of attention and count values helps
• The passage attention is normalized and passage lengths are typically 400-500 tokens, so scaling the attention by values > 1 helps the model differentiate among small values
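
The scaling step alone is easy to illustrate. The sketch below only builds the m × 4 scaled matrix; how the module turns that matrix into a count is not described on the slide and is left out. The function name is hypothetical.

```python
def scale_attention(paragraph_attn, scales=(1, 2, 5, 10)):
    """Return an m x 4 matrix whose row i is paragraph_attn[i] times each scale."""
    return [[a * s for s in scales] for a in paragraph_attn]

# The attention vector from the slide's count example.
P_scaled = scale_attention([0, 0, 0.3, 0.3, 0, 0.4])
```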

compare-num-lt(P1, P2) → P
Output the span associated with the smaller number

• N1 = find-num(P1), N2 = find-num(P2)
• Computes two soft boolean values, p(N1 < N2) and p(N2 < N1)
• Outputs a weighted sum of the input paragraph attentions
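
A minimal sketch of the comparison, assuming N1 and N2 are independent distributions over the same list of number values, so that p(N1 < N2) is the total probability of all value pairs where the first is smaller. Function names are hypothetical.

```python
def prob_less_than(values, N1, N2):
    """p(N1 < N2) for two distributions over the same list of number values."""
    return sum(N1[i] * N2[j]
               for i in range(len(values))
               for j in range(len(values))
               if values[i] < values[j])

def compare_num_lt(P1, P2, values, N1, N2):
    # Soft booleans: how likely each span's number is the smaller one.
    p1_smaller = prob_less_than(values, N1, N2)
    p2_smaller = prob_less_than(values, N2, N1)
    # Weighted sum of the two input paragraph attentions.
    return [p1_smaller * a + p2_smaller * b for a, b in zip(P1, P2)]
```

When the two number distributions put mass on equal values, the two soft booleans sum to less than 1, so the output is not guaranteed to be normalized in this sketch.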

time-diff(P1, P2) → TD
Difference between the dates associated with the paragraph spans

• The module internally calls the find-date module to get a date distribution for each of the two paragraph attentions, D1 and D2
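
Given the two date distributions, a distribution over differences can be sketched as below. This assumes D1 and D2 are independent and, as a simplification for illustration, represents each date as an integer year; the function name is hypothetical.

```python
from collections import defaultdict

def time_diff(dates, D1, D2):
    """Distribution over dates[i] - dates[j], with i drawn from D1 and j from D2."""
    diff = defaultdict(float)
    for i, d_i in enumerate(dates):
        for j, d_j in enumerate(dates):
            diff[d_i - d_j] += D1[i] * D2[j]
    return dict(diff)
```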

find-max-num(P) → P, find-min-num(P) → P
Select the span that is associated with the largest (or, for find-min-num, the smallest) number

• Compute an expected number-token distribution T using find-num
• Compute the expected probability that each number token is the one with the maximum value, T^max ∈ R^{n_tokens}
• Reweight the contribution from the i-th paragraph token to the j-th number token

span(P) → S
Identify a contiguous span from the attended tokens

• Only appears as the outermost module in a program
• Outputs two probability distributions, P_s and P_e ∈ R^m, denoting the start and end of a span
• This module is implemented similarly to the count module

Auxiliary supervision

• An unsupervised auxiliary loss provides an inductive bias to the execution of the find-num, find-date, and relocate modules
• Heuristically obtained supervision for the question program and for intermediate module outputs is provided for a subset of questions (5–10%)

Unsupervised auxiliary loss for IE

• The find-num, find-date, and relocate modules perform information extraction
• The objective increases the sum of the attention probabilities for output tokens that appear within a window W = 10 of the input token
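
One way to write such an objective is as a negative log of the attention mass that falls inside the window. This is a sketch under assumptions: the window is centered on each input paragraph token, the loss form and the small epsilon are illustrative choices, and the names are hypothetical.

```python
import math

def window_loss(attention, number_positions, window=10):
    """attention[i][j]: attention from paragraph token i to number token j.
    number_positions[j]: index of number token j in the paragraph."""
    loss = 0.0
    for i, row in enumerate(attention):
        # Attention mass on number tokens within `window` positions of token i.
        inside = sum(a for a, pos in zip(row, number_positions)
                     if abs(pos - i) <= window)
        loss -= math.log(inside + 1e-12)  # epsilon guards against log(0)
    return loss
```

Minimizing this value pushes each token's attention toward nearby numbers, which matches the stated goal of increasing the in-window probability mass.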

Question Parse Supervision

• Heuristic patterns provide the program and the corresponding question-attention supervision for a subset of the training data (10%)

Intermediate Module Output Supervision

• Used for the find-num and find-date modules
• Applied to a subset of the questions (5%)
• E.g.: “how many yards was the longest/shortest touchdown?”
  • Identify all instances of the token “touchdown”
  • Assume the closest number to each instance should be an output of the find-num module
  • Supervise this as a multi-hot vector N∗ and use an auxiliary loss
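
The heuristic in the example can be sketched as follows. The sentence, the token positions, and the function name are made up for illustration, and distance is measured in tokens, which is an assumption.

```python
def closest_number_supervision(tokens, number_positions, keyword="touchdown"):
    """Multi-hot vector over number tokens: 1 for the number nearest each keyword."""
    target = [0] * len(number_positions)
    if not number_positions:
        return target
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword:
            nearest = min(range(len(number_positions)),
                          key=lambda j: abs(number_positions[j] - i))
            target[nearest] = 1
    return target

# Hypothetical passage; the numbers sit at token indices 1, 7, and 13.
tokens = "a 5 yard touchdown run and a 40 yard touchdown pass in the 3 quarter".split()
n_star = closest_number_supervision(tokens, [1, 7, 13])
```

Here the two "touchdown" tokens select 5 and 40, while 3 stays unmarked, giving the multi-hot target that an auxiliary loss can compare against the find-num output.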

Dataset

• 20,000 questions for training/validation and 1,800 questions for testing (25% of DROP)
• Questions in the scope of the model were extracted automatically based on their first n-gram


RESULTS

RESULTS: Question Types

Effect of Auxiliary Supervision

Incorrect Program Predictions

• How many touchdown passes did Tom Brady throw in the season? Predicted: count(find)
  • The correct answer requires a simple lookup from the paragraph.
• Which happened last, the failed assassination attempt on Lenin, or the Red Terror? Predicted: date-compare-gt(find, find)
  • The correct answer requires natural language inference about the order of events, not symbolic comparison between dates.
• Who caught the most touchdown passes? Predicted: relocate(find-max-num(find))
  • Requires nested counting, which is out of scope.

Future Work

• Design additional modules to handle questions such as
  • How many languages each had less than 115,000 speakers in the population?
  • Which quarterback threw the most touchdown passes?
  • How many points did the Packers fall behind during the game?
• Use the complete DROP dataset: in the current system, training the model on questions for which the modules can’t express the correct reasoning harms their ability to execute their intended operations
• Opens up avenues for transfer learning, where modules can be trained independently using indirect or distant supervision from different tasks
• Combine black-box operations with the interpretable modules to capture more expressivity

Review Comments - Pros

• Interesting idea [Atishya, Rajas, Keshav, Siddhant, Lovish]
• Interpretable and modular [Atishya, Rajas, Siddhant, Lovish, Vipul]
• Better than BERT for symbolic reasoning [Keshav]
• Auxiliary loss formulation seems a very novel idea [Vipul]
• The question parser has a new role: parse to return a composition of modules [Pawan]

Review Comments - Cons

• Module descriptions are difficult to understand [Atishya, Siddhant]
• Auxiliary loss is not generalizable [Atishya, Rajas]
• Contribution of each module is not studied [Atishya, Rajas, Siddhant, Lovish, Pawan]
• Only 22% of the DROP dataset is used [Rajas, Keshav, Lovish]
• Compositional reasoning queries like “Who is the mother of PM of India?” are not handled [Keshav]
• An endless number of modules is required to achieve full reasoning capability [Vipul]

Review Comments - Extensions

• Study the contribution of each module [Atishya]
• Pre-train all the modules by collecting data using specific heuristics [Atishya, Rajas]
• RL framework to predict whether a given question can be sufficiently reasoned [Rajas]
• Module to predict open predicates of the type PM(India, x) & Mother(x, y) [Keshav, Vipul]
• Train multipurpose modules (to predict citizen-of and president-of relationships) [Vipul]
• Combine an end-to-end neural system and the NMN [Keshav]
• Learn new modules from the dataset automatically (learn new SPARQL templates from data) [Siddhant, Pawan]
• Curriculum learning [Siddhant]
• Meta-learning to automatically determine the modules [Lovish]