Constructing Software Knowledge Graph from Software Text
Sirui Li (u5831882)
Supervisor: Dr. Zhenchang Xing
Team Members: Hongwei Li, Jiamou Sun
COMP8755 Individual Computing Project, Semester 1, 2018

Constructing Software Knowledge Graph from Software Textcourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/finalTalks/u... · Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Dec 09, 2018


What is Software Text?

• API documentation
• Stack Overflow post
• Bug report

Motivation (Formative Study)

• Accessibility issue of API usage directives (API caveats)

Summary: Our formative study shows that many programming issues could be avoided if developers were aware of the relevant API caveats in API documentation. Unfortunately, developers mostly discover API caveats post mortem, after something has gone wrong, rather than being aware of them beforehand and avoiding the mistakes in the first place.

Goal

To tackle the accessibility of API caveats by mining an API caveats knowledge graph from multiple sources of API documentation.

Approach

1. Preprocess input documentation
2. Build an API skeleton graph from semi-structured API reference documentation
3. Extract API caveat sentences from API textual descriptions
4. Construct the API-caveats knowledge graph by linking API caveat sentences to relevant APIs

Approach (Preprocess input documentation)

• Input documentation: API reference documentation and API tutorials, e.g. Android API reference, Android Developer Guides

• Software-specific tokenizer (Ye et al.) + Stanford CoreNLP, e.g. “void setOnBufferAvailableListener (Allocation.OnBufferAvailableListener callback) Set a notification handler for USAGE_IO_INPUT”
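The tokenization step can be illustrated with a minimal sketch. It is neither the tokenizer of Ye et al. nor Stanford CoreNLP; it is a hypothetical regex-based stand-in which assumes that API mentions look like qualified names, empty-parenthesis calls, or identifiers, and keeps each of them as a single token.

```python
import re

# Hypothetical token pattern (an assumption for illustration, not the
# tokenizer of Ye et al.): code-like alternatives are tried before words.
TOKEN_RE = re.compile(
    r"""
    [A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+(?:\(\))?   # qualified name such as Allocation.OnBufferAvailableListener
  | [A-Za-z_]\w*\(\)                           # bare call such as release()
  | \w+                                        # ordinary word, identifier or constant such as USAGE_IO_INPUT
  | [^\w\s]                                    # any other single symbol
    """,
    re.VERBOSE,
)

def tokenize(text: str) -> list:
    """Split text into tokens while keeping code-like tokens whole."""
    return TOKEN_RE.findall(text)

tokens = tokenize(
    "void setOnBufferAvailableListener (Allocation.OnBufferAvailableListener callback) "
    "Set a notification handler for USAGE_IO_INPUT"
)
```

A general-purpose tokenizer would typically split the qualified type name at the dot; here it stays together so that a later step can match it against an API entity.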

Approach (Build an API skeleton graph)

• From API reference documentation
1. Entities: classes, interfaces, fields, methods and parameters.
2. Relations: containment, inheritance/implementation, field data type, method return type, method parameter type, and method-thrown-exception.
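The entity and relation types listed above could be held in a small typed graph. The following is a sketch under assumptions: the class name `SkeletonGraph`, the relation identifiers, and the fully qualified example names are illustrative choices, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Entity and relation kinds follow the slide; the identifiers themselves
# are naming assumptions made for this sketch.
ENTITY_KINDS = {"class", "interface", "field", "method", "parameter"}
RELATION_KINDS = {
    "contains", "inherits", "implements", "field_type",
    "return_type", "parameter_type", "throws",
}

@dataclass
class SkeletonGraph:
    entities: dict = field(default_factory=dict)   # qualified name -> kind
    relations: set = field(default_factory=set)    # (source, relation, target)

    def add_entity(self, name: str, kind: str) -> None:
        if kind not in ENTITY_KINDS:
            raise ValueError(f"unknown entity kind: {kind}")
        self.entities[name] = kind

    def add_relation(self, source: str, relation: str, target: str) -> None:
        if relation not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {relation}")
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must be added as entities first")
        self.relations.add((source, relation, target))

    def neighbours(self, name: str, relation: str) -> list:
        """Targets reachable from `name` over one edge of the given kind."""
        return sorted(t for s, r, t in self.relations if s == name and r == relation)

# Illustrative entries only; a real graph would be filled from the reference pages.
graph = SkeletonGraph()
graph.add_entity("android.view.View", "class")
graph.add_entity("android.view.View.setId(int)", "method")
graph.add_relation("android.view.View", "contains", "android.view.View.setId(int)")
```

Rejecting unknown kinds at insertion time keeps the skeleton restricted to the vocabulary above, which makes later caveat linking a lookup rather than a free-text match.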

Approach (Extract API Caveat Sentences)

Each pattern is defined by a regular expression:
1. Error/Exception: You must store a strong reference to the listener, otherwise it will be susceptible to garbage collection
2. Recommendation: you are better off using JobIntentService, which uses jobs instead of services...
3. Alternative: You must call release() when you are done using the camera, otherwise it will remain locked and be unavailable to applications
4. Imperative: Do not confuse this method with activity lifecycle callbacks such as onPause()...
5. Note: Note: For all activities, you must declare your intent filters in the manifest file
6. Conditional: When using a subclass of AsyncTask to run network operations, you must be cautious..
7. Temporal: This may be null if the service is being restarted after its process has gone away
8. Affirmative: The identifier does not have to be unique in View, but it should be positive
9. Negative: Any activities that are not declared there will not be seen by the system and will never be run
10. Emphasis: Only objects running on the UI thread have access to other objects on that thread
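Pattern-based extraction of this kind can be sketched as one regular expression per category. The talk does not give the actual expressions, so every pattern below is an assumption chosen only to show the mechanism; the category names are the ones listed above.

```python
import re

# Hypothetical patterns, one per category named on the slide. The real
# regular expressions are not given in the talk, so these are assumptions.
CAVEAT_PATTERNS = {
    "Error/Exception": r"\b(?:errors?|exceptions?|susceptible)\b",
    "Recommendation": r"\b(?:recommended|better off|should)\b",
    "Alternative": r"\b(?:instead|otherwise|rather than)\b",
    "Imperative": r"^\s*do not\b",
    "Note": r"^\s*note\b",
    "Conditional": r"\b(?:if|when)\b",
    "Temporal": r"\b(?:before|after)\b",
    "Affirmative": r"\b(?:must|has to|have to)\b",
    "Negative": r"\b(?:not|never)\b",
    "Emphasis": r"\b(?:only|always)\b",
}
COMPILED = {name: re.compile(p, re.IGNORECASE) for name, p in CAVEAT_PATTERNS.items()}

def caveat_categories(sentence: str) -> list:
    """Return every category whose pattern matches; an empty list means no caveat."""
    return [name for name, rx in COMPILED.items() if rx.search(sentence)]

note = caveat_categories(
    "Note: For all activities, you must declare your intent filters in the manifest file"
)
emphasis = caveat_categories(
    "Only objects running on the UI thread have access to other objects on that thread"
)
```

A sentence may match several categories at once, which is why the function returns a list instead of a single label.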

Approach (Build API Caveats Knowledge Graph)

1. Co-reference Resolution
a) Stanford CoreNLP
b) Declaration-based Resolution: “this method”, “this class”, etc.
e.g. Activity.onActionModeStarted: “Activity subclasses overriding this method should call the superclass implementation. If you override this method you must call through to the superclass implementation.”

2. Linking API Caveats to API Entities
a) Hyperlink based
e.g. “It throws ActivityNotFoundException if...” is hyperlinked to the reference page of “ActivityNotFoundException”, so we can link this sentence to “ActivityNotFoundException”
b) Declaration based
e.g. View.setId(int): “Do not pass a resource ID” and “The identifier should be a positive number”.
c) Open linking (Subject-Verb-Object triples)
e.g. “You must call release()...” links with “release()”
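Declaration-based resolution and linking can be sketched in a few lines. This is a simplified illustration, not the project's implementation: open linking is approximated by matching `name()` mentions against a lookup table instead of extracting Subject-Verb-Object triples, and the `Camera` names in the usage lines are hypothetical.

```python
import re

# Hypothetical anaphor table: phrases such as "this method" are replaced by
# the API whose description contains the sentence.
ANAPHORS = {
    "method": re.compile(r"\bthis method\b", re.IGNORECASE),
    "class": re.compile(r"\bthis class\b", re.IGNORECASE),
}
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\(\)")  # mentions such as release()

def resolve_declaration(sentence: str, declaring_api: str, kind: str) -> str:
    """Replace 'this <kind>' with the name of the declaring API."""
    pattern = ANAPHORS.get(kind)
    if pattern is None:
        return sentence
    return pattern.sub(lambda _match: declaring_api, sentence)

def link_caveat(sentence: str, declaring_api: str, known_methods: dict) -> set:
    """Link a caveat to its declaring API and to any known method it mentions."""
    links = {declaring_api}                      # declaration-based link
    for simple_name in CALL_RE.findall(sentence):
        if simple_name in known_methods:         # simplified stand-in for open linking
            links.add(known_methods[simple_name])
    return links

resolved = resolve_declaration(
    "If you override this method you must call through to the superclass implementation.",
    "Activity.onActionModeStarted",
    "method",
)
# "Camera" and "Camera.release()" are hypothetical names for illustration.
links = link_caveat(
    "You must call release() when you are done using the camera",
    "Camera",
    {"release": "Camera.release()"},
)
```

Resolving the anaphor first means the caveat sentence still makes sense when it is shown outside the reference page it came from.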

Evaluation

• RQ1: What is the abundance of different subcategories of API caveats in API documentation?
• RQ2: Can our approach accurately extract API caveat sentences, resolve co-references in these sentences, and link API caveat sentences to API entities?
• RQ3: Can our API caveats knowledge graph and API caveats search improve the accessibility of API caveats, compared with traditional documentation search?

Evaluation (RQ1: The Abundance of API Caveats)

• ACFOR scale (Abundant, Common, Frequent, Occasional, Rare)¹

¹ http://en.wikipedia.org/wiki/Abundance_(ecology)

Conditional, Affirmative, Negative and Emphasis API caveats are dominant in our knowledge graph. The fact that most API caveats do not have explicit caveat indicators could help to explain why API caveats are hard to notice in API descriptions, especially lengthy ones.

Evaluation (RQ2: The Quality of API Caveats Knowledge Graph)

• Accuracy of Extracting API Caveat Sentences
a) Cohen’s kappa metric: 88% -> almost perfect agreement
b) Accuracy: 100%

• Accuracy of Co-reference Resolution
a) Cohen’s kappa metric: 97% -> almost perfect agreement
b) Accuracy: 74.22%

• Accuracy of Caveat-Sentence-API Linking
a) Cohen’s kappa metric: 99.74% (declaration-based), 98.70% (open linking) -> almost perfect agreement
b) Accuracy: 99.48% (declaration-based), 98.44% (open linking)
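For readers unfamiliar with the agreement metric, Cohen's kappa compares the observed agreement p_o between two annotators with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two label lists; the labels in the usage lines are made-up values for illustration, not the annotations from this study.

```python
from collections import Counter

def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up annotations (1 = correct, 0 = incorrect), for illustration only.
annotator_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
annotator_b = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(annotator_a, annotator_b)
```

In this made-up example the annotators agree on 8 of 10 items, yet kappa is well below 0.8 because part of that agreement would be expected by chance.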

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Participants: 12 third- and fourth-year undergraduate students from our school (none of them has Android development experience).

1. Control Group: use the Google search engine to search for API caveats on the Android Developers website
2. Experimental Group: use our search tool to search for API caveats in the Android API caveats knowledge graph we constructed

Experiment Procedure: A simple application records their answers, the time taken, the difficulty of each question and participants’ confidence in the submitted answer, using a 5-point Likert scale.

Data Analysis: question-completion-time statistics, and the question-difficulty and answer-confidence ratings of the two groups

Page 15: Constructing Software Knowledge Graph from Software Textcourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/finalTalks/u... · Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

The participants in the experimental group complete the questions faster than those in the control group (86.13±22.26 seconds versus 142.33±50.02 seconds), and the API caveats that the experimental group finds are more accurate than those found by the control group (62.91±6.76% versus 40.94±15.42%). The Wilcoxon Rank Sum Test shows that both the difference in question-completion time and the difference in correct-API-caveats percentage between the two groups are statistically significant at p-value < 0.05.
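The significance test named above can be sketched in pure Python. This version uses the normal approximation to the rank-sum distribution, which is only a rough guide for small groups where an exact test is preferable. The completion times in the usage lines are invented for illustration and are not the measurements from this study.

```python
import math

def rank_sum_test(x: list, y: list) -> tuple:
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (W, p) where W is the rank sum of x. Ties receive average
    ranks; no tie or continuity correction is applied to the variance.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # average of the ranks i+1 .. j
        ranks_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2           # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (ranks_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided p-value
    return ranks_x, p

# Invented completion times in seconds, for illustration only.
experimental = [60, 70, 85, 90, 100, 110]
control = [120, 130, 140, 150, 160, 200]
w, p = rank_sum_test(experimental, control)
```

Because the test works on ranks instead of raw values, it does not assume that completion times are normally distributed, which suits a small user study.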

Page 16: Constructing Software Knowledge Graph from Software Textcourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/finalTalks/u... · Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

[Figure: (a) Question Difficulty; (b) Answer Confidence]

The objective performance results and the “surprising” subjective ratings reveal the accessibility issue of API caveats in API documentation, which could create an illusion of already knowing the right API usage. As our knowledge graph makes API caveats more easily accessible, being aware of the API caveats would make developers realize that using an API properly may not be as easy as it looks. This improved awareness of API caveats could make developers more cautious when using an API, and thus potentially avoid some mistakes in the first place.

Page 17: Constructing Software Knowledge Graph from Software Textcourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/finalTalks/u... · Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Questions? Suggestions? Thank you! ☺