
Information Needs in Collocated Software Development Teams

May 03, 2023


Andrew J. Ko
Human-Computer Interaction Institute
Carnegie Mellon University
Forbes Ave, Pittsburgh PA
[email protected]

Robert DeLine and Gina Venolia
Microsoft Research
One Microsoft Way
Redmond, WA
{rdeline, ginav}@microsoft.com

Abstract

Previous research has documented the fragmented nature of software development work. To explain this in more detail, we analyzed software developers’ day-to-day information needs. We observed seventeen developers at a large software company and transcribed their activities in -minute sessions. We analyzed these logs for the information that developers sought, the sources that they used, and the situations that prevented information from being acquired. We identified twenty-one information types and cataloged the outcome and source when each type of information was sought. The most frequently sought information included awareness about artifacts and coworkers. The most often deferred searches included knowledge about design and program behavior, such as why code was written a particular way, what a program was supposed to do, and the cause of a program state. Developers often had to defer tasks because the only source of knowledge was unavailable coworkers.

1. Introduction

Software development is an expensive and time-intensive endeavor. Projects ship late and buggy, despite developers’ best efforts, and what seem like simple projects become difficult and intractable [2]. Given the complex work involved, this should not be surprising. Designing software with a consistent vision requires the consensus of many people, developers exert great efforts at understanding a system’s dependencies and behaviors [11], and bugs can arise from large chasms between the cause and the symptom, often making tools inapplicable [6].

One approach to understanding why these activities are so difficult is to understand them from an information perspective. Some studies have investigated information sources, such as people [13], code repositories [5], and bug reports [16]. Others have studied means of acquiring information, such as email, instant messages (IM), and informal conversations [16]. Studies have even characterized developers’ strategies [9], for example, how they decide whom to ask for help.

While these studies provide several concrete insights about aspects of software development work, we still know little about what information developers look for and why they look for it. For example, what information do developers use to triage bugs? What knowledge do developers seek from their coworkers? What are developers looking for when they search source code or use a debugger? By identifying the types of information that developers seek, we might better understand what tools, processes and practices could help them more easily find such information.

To understand these information needs in more detail, we performed a two-month field study of software developers at Microsoft. We took a broad look, observing groups across the corporation, focusing on three specific questions:
· What information do software developers seek?
· Where do developers seek this information?
· What prevents them from finding information?

In our observations, we found several information needs. The most difficult to satisfy were design questions: for example, developers needed to know the intent behind existing code and code yet to be written. Other information seeking was deferred because the coworkers who had the knowledge were unavailable. Some information was nearly impossible to find, like bug reproduction steps and the root causes of failures.

In this paper, we discuss prior field studies of software development, and then describe our study’s methodology. We then discuss the information needs that we identified in both qualitative and quantitative terms. Finally, we discuss our findings’ implications for software design and engineering.

2. Related Work

Several previous studies have documented the social nature of development work. Perry, Staudenmayer and Votta reported that over half of developers’ time was spent interacting with coworkers [15]. Much of this communication is to maintain awareness. De Souza, Redmiles, Penix and Sierhuis found that developers send emails before check-ins to allow their peers to prepare for changes [5]. Collocation is a central factor in determining the quality of awareness information. Seaman and Basili found that local mobility facilitates awareness in ways that are unavailable in distributed situations [18]. Similarly, coordination problems can be exaggerated across sites because of the lack of spontaneous communication channels [8].

Developers also communicate to obtain knowledge [9]. LaToza, Venolia and DeLine describe the role of the “team historian,” who possesses knowledge about the origins of a project and its architecture [13]. To determine who to ask, developers often gauge expertise by inspecting check-in logs [5], but such information is not always accurate [12].

One consequence of developers’ frequent communication is the fragmentation of time. Gonzalez, Mark and Harris found that developers average about minutes on a task and about minutes in an area of work before switching [7]. These switches occur due to changing task priorities and getting blocked [15]. Perlow related how one software group’s frequent interruptions created a sense of a “time famine,” having too much to do and not enough time [14].

Dependencies are also a central factor in software development. Developers use bug reports, content management systems, and version control systems to manage dependencies and notify coworkers of new dependencies [5]. Teams will clone software to avoid dependencies, even though they later have to duplicate fixes to the cloned code [13]. Developers also rush their activities to minimize dependencies between their code and recently committed changes in the repository [5].

These previous studies provide a general sense of the importance of communication among developers to maintain awareness, share knowledge, and manage dependencies. Using similar methods, both this study and a recent study by Sillito, Murphy and De Volder [1] dissect this communication from an information needs perspective by cataloging the questions that arise during development tasks. Sillito, Murphy and De Volder present questions about code that developers asked during programming tasks. In this study, we present questions that developers asked during their daily work (designing, coding, debugging, bug triage). Our questions cover a broader scope of work and are therefore more abstract than their questions. Unifying these results is future work.

3. Method

Our method was to record notes while observing developers’ normal work. To recruit developers, we surveyed developers from the corporate address book. Of these, responded and volunteered for observation.

Each observation session was about minutes and involved a single observer taking handwritten notes. To encourage the participant to narrate his work, we asked the participant to think of us as a newcomer to the team, doing a “job shadow.” We focused on recording goal-oriented events like “finding the method that computed the wrong value” rather than low-level events like keystrokes or menu selections. Since we shared the participants’ programming background, we understood much of the work and where and how information was obtained, without inquiry. In some cases, we prompted with questions like “what are you looking for?” to learn their information needs, but most developers thought aloud without prompting. We timestamped the recorded events and conversations each minute. After minutes, we looked for a good stopping point and wrapped up. Immediately after each observation, we transcribed the handwritten notes, as in the excerpt shown in Figure 1.

During the allotted time for the study we were able to observe 17 developers, which was enough to see common patterns in their information needs. (Section 6.2 touches on the potential value of observing more developers.) Figure 2 describes these developers’ experience levels, types of work, and phases of development and introduces the initials we use to refer to them in this paper. In Microsoft’s terminology, dev leads manage software development engineers (SDEs or devs) while also performing a development role.

4. Task Structure

9:41 am So this copies the files onto the server, then allocates a machine to do the setup. In the meantime, I'm going to get this local fix [of this other bug] over [checked in].

9:41 am [opens diff tool to see changes he’s made to code]

9:43 am Oh damn, I have some mixed changes. Some are part of this DCR [design change request] I'm working on and some are part of a bug fix, so I have to mix it out.

Figure 1. An excerpt from J’s observation log.

Our observations spanned hours of work. We partitioned the logged activities into work categories common across the participants: writing code; submitting code (check-ins); triaging bugs; reproducing failures; understanding program behavior; reasoning about design; maintaining awareness; and non-work activities (e.g. personal phone calls). We also identified causes of task switching: face-to-face dialogue; phone calls; instant messages (IM); email alerts; meetings; task avoidance; getting blocked; and task completion. We annotated the logs with these switches, based on remarks like “I want to get back to my repro…” All of these causes of task switches are forms of interruption, except getting blocked and task completion. When the participant voluntarily switched activities, we labeled the switch as blocked if the participant could no longer make progress on the current activity (typically due to an information need) and as task avoidance if she could make progress but chose to switch anyway.
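The labeling rule for task switches can be restated as a small sketch. This is only an illustration of the rule as described in the text, not the authors' tooling; the names `SwitchCause`, `label_voluntary_switch`, and `is_interruption` are hypothetical, and the observers applied the rule by hand to their notes.

```python
from enum import Enum


class SwitchCause(Enum):
    """Causes of task switching listed in the text (names are illustrative)."""
    FACE_TO_FACE = "face-to-face dialogue"
    PHONE_CALL = "phone call"
    INSTANT_MESSAGE = "instant message"
    EMAIL_ALERT = "email alert"
    MEETING = "meeting"
    TASK_AVOIDANCE = "task avoidance"
    BLOCKED = "getting blocked"
    TASK_COMPLETION = "task completion"


def label_voluntary_switch(task_done: bool, can_make_progress: bool) -> SwitchCause:
    """Label a switch that the participant initiated.

    Assumption: a finished task is labeled as task completion before the
    blocked / task-avoidance distinction is considered.
    """
    if task_done:
        return SwitchCause.TASK_COMPLETION
    if not can_make_progress:
        # No further progress was possible, typically due to an information need.
        return SwitchCause.BLOCKED
    # Progress was possible, but the participant chose to switch anyway.
    return SwitchCause.TASK_AVOIDANCE


def is_interruption(cause: SwitchCause) -> bool:
    """Every cause counts as an interruption except blocking and completion."""
    return cause not in (SwitchCause.BLOCKED, SwitchCause.TASK_COMPLETION)
```

Under this reading, task avoidance still counts as an interruption, because the text excludes only getting blocked and task completion.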

Figure 2 visualizes these task switches, which occurred an average of every minutes (±.), mirroring the rate reported in Gonzalez, Mark and Harris [7]. Time fragmentation varied considerably per participant. For example, reproduced failures without interruption, whereas was frequently blocked. Many interruptions were due to face-to-face, IM, or phone conversations, which occurred from to times per session (median of ), each lasting for much of the session. Developers were also interrupted by notifications, such as email and alerts about changes to the bug database. Developers experienced from to notifications per session (median of ).

Blocking, shown in Figure 2 as dark vertical bars, occurred when information was unavailable. Some blocks were about waiting for the results of compilations and test suites. Developers also waited for email replies and for other teams to submit changes or fix bugs. Other blocks were due to missing knowledge, like when a developer stops coding to learn about an API. Developers were blocked a median of times per session and between and times overall.

5. Information Needs

During our observations, there were instances of information seeking, which we abstracted from the particulars of the work context into general information needs. Here we present these information needs clustered by the work category in which they arose. Throughout, we list within braces the initials of the developers for whom we observed an information need or other trend.

Figure 2. The backgrounds and task structures of the 17 observed developers. The right edge of each task block indicates the reason for the task switch (thin line for done, thick line for blocked, jagged line for interrupted). When a task gets broken up by interruptions or blocking, we draw its fragments at the same vertical level to show resumption.

5.1 Writing Code

Developers had several questions situated in the code they were writing:

(c1) What data structures or functions can be used to implement this behavior?
(c2) How do I use this data structure or function?
(c3) How can I coordinate this code with this other data structure or function?

Although the first of these questions was uncommon, when it occurred, developers searched documentation {} and inspected other code for examples {}. These searches can be thought of as a search through the space of existing reusable code; for example, looked for an appropriate serialization by searching a large database of public documentation.

Once a developer had a candidate in mind, they sought its syntactic usage rules (c2). For example, which method is appropriate to call? What data structures does this require? What constructors does this class have? Developers used documentation when it was available {}, but sometimes needed to use code that was only fully understood by its author {}. Others found example code from which to infer rules {}.

Because developers had to coordinate APIs with their own code, they also sought behavioral usage rules (c3), implicit in the design. For example, is it legal to call this method after calling this other method? What state do I have to be in before this call? Such information was rarely explicit. Developers used their colleagues {}, documentation {}, and example code {} to infer these rules.

5.2 Submitting a Change

Developers had three primary questions when exposing their code to their teammates:

(s1) Did I make any mistakes in my new code?
(s2) Did I follow my team’s conventions?
(s3) Which changes are part of this submission?

Besides building their code to assess its syntactic correctness, developers answered questions of correctness (s1) by considering the scenarios and range of input that they intended to cover. They used debuggers {}, diff tools {} and unit tests {}, but primarily relied on their own reasoning {}. Another common filter for mistakes was code reviews. Before a review, developers first looked for mistakes:

K: I think I'm ready to check in, so I'm just making sure I didn't do anything stupid. Like, I forgot to write those tests! Yeah, stupid like that.

One developer wrote assertions {}, but these interfered with other developers’ work {}:

A: I never want to see [that product’s] asserts but they always pop up. They have nothing to do with my work!

Three developers used static analysis tools to check for fault-prone design patterns {}, but expressed disdain for such tools’ false positives or could not understand the tools’ recommendations.

Developers also considered their team’s conventions (s2). Some teams required tags or other documentation to be embedded in method headers, which developers were careful to remember, often with the help of tools {}. Sometimes two submissions intersected (like in the transcript in Figure 1) or developers had to merge their code with another’s, and they then had to determine which differences were part of the current submission (s3) {}.

5.3 Triaging Bugs

Most developers were swamped with bug reports from tests, customers, and internal employees. Triage occurred in isolation as a developer partitioned their time {}, but also in triage meetings {}. For each report, the goal was to determine:

(b1) Is this a legitimate problem?
(b2) How difficult will this problem be to fix?
(b3) Is the problem worth fixing?

Assessing legitimacy (b1) involved deciding whether a failure was due to a problem with the code or an unrealistic configuration of a test {}:

B: It might but not really be a failure. It might just be a setup problem. This particular component doesn't depend on anything. Probably locked a file, so it’s returning an exit code. Not a real failure.

Legitimacy also depended on whether a report was a duplicate. People reported similar failure symptoms {}, but also different failure symptoms that developers believed had a common cause {}:

A: These subjects are just busted! I have a feeling I'm seeing the same bug. I'm going to do a quick search to see if there are busted subjects [in the bug database]; this one kinda sounds like it, blah blah, category name is corrupted? Ooh, screenshots are the same!

Bug triage is a cost/benefit analysis. To assess the cost of repair (b2), developers considered whether a redesign would be necessary {}, whether other teams might be affected {}, and whether a fix could be written and tested by a deadline {}. Teams also closed bugs “by design,” treating them as work items for later releases:

V’s teammate: I think the best thing is a new overlay to indicate something's going on.

V: We can't do that by the release. Looks like a work item.

Another factor affecting the repair cost was a report’s clarity {}. Does it have detailed reproduction steps? Is the failure clearly described? Does it have hints about possible causes or an error message? Reports with inadequate clarity were rejected {}.

On the benefit side of the analysis (b3), developers considered the number of users affected {} and the user experience {}. For example, V discussed a fix:

V’s teammate: If we want to push it back, we can, but I think an overlay is easiest.

V: But it's a totally broken experience for the user.

If there was a known workaround, developers might focus on more severe bugs {}.

5.4 Reproducing a Failure

To reproduce a failure, developers asked:

(r1) What does the failure look like?
(r2) In what situations does this failure occur?

The primary source for both of these types of information was bug reports. Reports would often include screenshots {}, but more often developers relied on the descriptions of the failure to help them imagine its appearance {}.

Developers relied heavily on bug reports’ reproduction steps to understand the situations in which a failure occurred (r2). Given the complex configurations that were necessary to reproduce some problems, even detailed steps omitted crucial state {}. In other cases, the state was known, but difficult to reproduce {}:

A: Originally, the repro steps said I need a blog count [as a test case] but I couldn't set one up, so I went back and forth.

To overcome this, some developers set up a remote desktop connection with the report’s author, so that the full configuration was available for debugging {}. Developers would also guess what state was wrong and begin modifying their environment and test cases until reproducing the failure:

A: I'm looking at [the report] to see if I have this configured the same way, but I'm not getting the problem. Maybe we've changed it in the past half year this has been open.

In one situation a failure could not be reproduced and the bug had to be deferred {}. The developer documented his attempts in the report for the sake of other testers and developers.

5.5 Understanding Execution Behavior

Developers had to understand unfamiliar code in several circumstances: using vendor code {}; joining a new team {}; obtaining ownership of code {}; during workload balancing {}; or when debugging, with unfamiliar code on the call stack {}. Each time, they addressed three basic questions:

(u1) What code could have caused this behavior?
(u2) What’s statically related to this code?
(u3) What code caused this program state?

Developers began these tasks with a why question and a hypothesis about the cause of the failure:

A: Why did I get gibberish? Storing field, given PPack, what is an MPField? I have no idea what this data structure contains. SPSField? I suspect SPS is just busted.

Developers acquired their hypotheses (u1) by using their intuition {}, asking coworkers for opinions {}, looking at execution logs {}, scouring bug reports for hints {}, and using the debugger {}. Although developers used many sources to obtain hypotheses, only a few gathered and considered more than one at a time {}. The accuracy of developers’ hypotheses was only obvious in hindsight.

To test and refine hypotheses, developers asked a broad array of questions with a variety of tools. Many of these questions were about the structure of the code (u2), like what is the definition of this? and what calls this method? Such questions were easy to answer with tools. Other, more broadly scoped questions, like what code does a similar operation?, had no dedicated tool support, but developers were good at answering them with search tools.

Developers answered questions about causality (u3), such as where did this value come from? {} and how did the program arrive at this method? {}, by a series of lower level questions, such as what thread is the program in right now? {} and what is the value of this variable or data structure now? {}. (Sillito, Murphy and De Volder report similar indirect questioning [1].) This was done primarily with breakpoint debuggers, which required developers to translate their questions into an awkward series of actions:

A: Here we're formatting WSTValue…I can't do highlighting, so I go to Source Insight. Find where I am in devns—this is the guy that screwed up. Shift F8, highlight all occurrences, where it gets its value from. Here's where we set it. So I want a breakpoint here.

As developers refined their hypotheses, they changed their concern from the behavior of the existing system to the hypothetical behavior after some change:

T: There's no file there, so something forgot it and I have a suspicion of what it is. Might mean that the free code has to get moved later.

Intuition was essential in answering all of these questions. The cost of testing hypotheses and the risk of a false hypothesis often prevented developers from finding a root cause. Instead, developers frequently assessed the value in continuing their investigation, stopping when they were satisfied {}.


5.6 Reasoning about Design

Developers sought four kinds of design knowledge:

(d1) What is the purpose of this code?
(d2) What is the program supposed to do?
(d3) Why was this code implemented this way?
(d4) What are the implications of this change?

The purpose of code (d1) was often unclear when developers found an API to use: Is it a public artifact or intended only for a particular component? Is it regularly maintained or no longer used? Some developers inferred purpose by finding example uses {}; sometimes they directly asked the code’s author {}.

Developers needed to know what the program was supposed to do (d2), for example, to evaluate the correctness of a variable’s value {}:

D, yelling across the hall: Is 'B' not a legal license key letter?

Sometimes this assessment was obvious. For example, a crash in a basic use case must be unintended. In other cases, what a program was supposed to do was an explicit, documented decision:

M: I just want to double check and make sure the convert key only shows up in languages that it's supposed to, based on the spec.

It was rarely sufficient to understand the cause of a program behavior. Developers also needed to know the historical reason for its current implementation (d3) {}. For example, when assessing whether a variable’s value was “wrong,” developers had to consider whether the value was anticipated by the designer and explicitly ignored or whether it was overlooked. They would do this by investigating the code’s change history {} or by looking for bug reports that contained hints about its current design {}. Developers would seek this design rationale from the author of the code through face-to-face conversation or some other means {}, but in one case the author was unavailable {}. Even when developers found a person to ask, the information that they sought was hard to express, as developers struggled to translate detailed and complex runtime scenarios into words and diagrams.

The consequences of decisions were also important (d4). For example, when triaging, developers often discussed hypothetical scenarios {}:

V’s teammate: Let’s go ahead and block and make it into a single operation.

V: But the upgrade script needs to look for individualization.

Design knowledge of all types was scattered among design documents {}, bug reports {}, and personal notebooks {}. Email threads sometimes contained design rationale {}, but were not shared globally. Code comments sometimes contained design rationale {}, but developers hesitated to write them because of the cost of submitting code changes. Developers rarely searched these sources, because such sources were thought to be inaccurate and out of date:

H: Given that I'll be the one fixing the bugs, I need to make sure I know not what we are doing, but why we are doing it. We have these big long design meetings, and everybody states their ideas, and we come to a consensus, but what never gets written in the spec is why we decided on that. Keeping track of that is really hard.

These problems led all but two developers to defer decisions because of missing design knowledge.

5.7 Maintaining Awareness

Developers worked to keep track of hardware, people and information needed for their tasks:

(a1) How have resources I depend on changed?
(a2) What have my coworkers been doing?
(a3) What information was relevant to my task?

Some awareness information was “pushed” to developers through clients and alert tools {}, and through check-in emails {}. Developers obtained other types of awareness by actively seeking it. One group had brief meetings throughout the day, to keep aware of problems that teammates were working on and issues on which they were blocked {}; other groups had weekly meetings to keep awareness about triage and design choices. Developers would stop by coworkers’ offices to update them on problems or to see what problems they were facing {}:

F: I talked to [Joe] a bit about the execution, and gather objects is on track, but I still need to make the base class.

F’s boss: Yeah, [Joe] talked to me about it. We need to make sure files are not delay assigned. He's in this big whoop-de-doo about it.

Developers tracked their time and others’, checking their calendars, glancing at schedules and asking their managers about priorities {}. Managers communicated to their developers about upcoming changes in informal meetings, email announcements, or planning meetings {}. Because developers were often interrupted, they also sought awareness about their own work (a3):

G: Sometimes I have like 20 windows, 5 or 6 build windows, each one is a state that I'm working on and I lose it! If I could just save it…I would be really happy! I hate those midnight reboots.


6. Quantifying Information Needs

The information needs we have discussed are summarized in Figure 3. The time spent searching, search frequencies, search outcomes, and source frequencies are based on our observational data. The outcomes include when developers acquired information, deferred a search with the intent of resuming it, or gave up with no intent of resuming it; a few searches continued beyond our observations. Also, in two cases, a need was initially deferred, then satisfied afterward by a coworker’s email response; we coded these as acquired.
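The outcome coding described above can be sketched in a few lines. This is a hedged illustration rather than the analysis actually run for the study: the names `Outcome`, `code_outcome`, and `unsatisfied_share` are hypothetical, and treating "unsatisfied" as deferred plus abandoned over all recorded searches is an assumption based on the wording of Section 6.3.

```python
from collections import Counter
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    """Search outcomes named in the text (identifiers are illustrative)."""
    ACQUIRED = "acquired"
    DEFERRED = "deferred"      # set aside with the intent of resuming
    GAVE_UP = "gave up"        # abandoned with no intent of resuming
    CONTINUING = "continuing"  # still open when the observation ended


def code_outcome(observed: Outcome, answered_later_by_email: bool = False) -> Outcome:
    """Apply the recoding rule: a deferred need that a coworker's email
    later satisfied is coded as acquired."""
    if observed is Outcome.DEFERRED and answered_later_by_email:
        return Outcome.ACQUIRED
    return observed


def unsatisfied_share(outcomes: Sequence[Outcome]) -> float:
    """Fraction of searches that were deferred or abandoned.

    Assumption: the denominator is every recorded search, including
    those still continuing at the end of the session.
    """
    if not outcomes:
        return 0.0
    counts = Counter(outcomes)
    unsatisfied = counts[Outcome.DEFERRED] + counts[Outcome.GAVE_UP]
    return unsatisfied / len(outcomes)
```

For instance, four hypothetical searches coded as acquired, deferred, gave up, and acquired would give an unsatisfied share of 0.5.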

The most frequently sought and acquired information includes whether any mistakes (syntax or otherwise) were made in code and what a developer’s coworkers have been doing. The most often deferred information was the cause of a particular program state and the situations in which a failure occurs. Developers rarely gave up searching. There was no relationship between deferring a search and whether the source involved people (bug reports, face-to-face, IM, email) (χ()=., p>.).
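A relationship of this kind is commonly checked with a chi-square test of independence. The sketch below assumes a 2x2 table (deferred or not, source involves people or not) and a plain Pearson statistic without continuity correction; neither the table layout nor the counts come from the paper, and the function name `chi_square_2x2` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(table: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pearson chi-square test of independence for a 2x2 contingency table.

    Returns (statistic, p_value) with one degree of freedom.
    Assumes every row total and column total is nonzero.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented counts, for illustration only:
# rows = source involves people (yes, no); columns = search deferred (yes, no).
stat, p = chi_square_2x2([[10, 20], [30, 40]])
```

A large p-value, as reported in the text, means the data give no evidence that deferral depends on whether the source involved people.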

Based on medians, the information that took the longest to acquire was whether conventions were followed (s2); based on maximums, the longest to acquire was knowledge about design (d, d) and behavior (u, u). No one source of information took longer to acquire than another (F(, )=., p>.), nor was there a difference in search times between sources involving people and sources that did not (F(, )=., p>.). These times are misleading, however, as many of the maximums were on deferred searches, so they were likely longer than shown here. Further, developers gave up or deferred searches because they depended on a person known to be unavailable. They were also expert at assessing the likelihood of the search succeeding and would abandon a search if the information was not important enough.

6.1 Rating Information Needs

The percentages in the middle of Figure 3 come from a survey of different developers ( of contacted), asking them to rate their agreement with statements about each of these information types, based on a -point scale from strongly disagree to strongly agree. The bars represent the percent of developers who agreed or strongly agreed that the information was (from left to right) important to making progress, unavailable or difficult to obtain, and had questionable accuracy.

The survey results reveal interesting trends. The majority of developers rated the most frequently sought information in our observations as more important, and they also rated frequently deferred information as more unavailable. One discrepancy is that developers rated coworker awareness (a2) as relatively unimportant, which conflicts with its frequency in our observations. It may be that coworker awareness is so frequently sought and successfully obtained that developers do not think about it. We also observed developers successfully obtain knowledge about the implications of a change (d4), whereas developers rated it relatively difficult to acquire. The survey also begins to reveal which information types have more questionable accuracy, namely knowledge about design (d, d), behavior (u), and triage (b, b).

Figure 3. Types of information developers sought, with search times in minutes; perceptions of the information’s importance, availability, and accuracy; frequencies and outcomes of searches; and sources, with the most common in boldface.


6.2 Most Common Information Needs

For each information need, Figure 4 lists those participants who had that need at least once during their observations. The most common need across participants was coworker awareness. Most of the information needs occurred among several developers from different teams in different business divisions, which suggests that these are representative of development work in general. A few of the information needs occurred for only a few participants, which suggests that this list is not complete. Observing more developers over longer periods of time could reveal other less frequent needs. (Had we not observed , for instance, we would not have seen need a3.)

6.3 Unsatisfied Information Needs

Many of the frequent information needs are problematic, in that searches for the information were often unsatisfied (deferred or abandoned) and had long search times. The most frequently unsatisfied information needs were the following, with their percentage of unsatisfied queries and maximum observed search times:

What code caused this program state?             % min
Why was the code implemented this way?           % min
In what situations does this failure occur?      % min
What code could have caused this behavior?       % min
How have the resources I depend on changed?      % min
What is the program supposed to do?              % min
What have my coworkers been doing?               % min

This ranking may reflect that of the participants’ teams were in a bug fixing phase. In particular, the information needs ranked , and are largely about bug reproduction and the ones ranked and are largely about evaluating possible fixes for bugs. Nonetheless, the fact that these information needs are so often unsatisfied and take such a long time clearly hindered developer productivity.
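As an illustration of how the two measures behind this ranking (the share of unsatisfied searches and the maximum search time) could be derived from an observation log, the following is a minimal Python sketch. The record layout, the outcome label "acquired", and every value in `observations` are assumptions invented for the example; they are not the study's data or its actual coding scheme.

```python
from collections import defaultdict

# Hypothetical log records: (information need, outcome, search time in minutes).
# All values are invented for illustration and are NOT the study's data.
observations = [
    ("What code caused this program state?", "deferred", 21),
    ("What code caused this program state?", "acquired", 2),
    ("Why was the code implemented this way?", "abandoned", 10),
    ("Why was the code implemented this way?", "acquired", 1),
    ("Why was the code implemented this way?", "acquired", 3),
    ("What have my coworkers been doing?", "acquired", 1),
]

# A search counts as unsatisfied when it was deferred or abandoned.
UNSATISFIED = {"deferred", "abandoned"}


def summarize(records):
    """Return {need: (percent_unsatisfied, max_search_minutes)}."""
    grouped = defaultdict(list)
    for need, outcome, minutes in records:
        grouped[need].append((outcome, minutes))
    summary = {}
    for need, searches in grouped.items():
        unsatisfied = sum(1 for outcome, _ in searches if outcome in UNSATISFIED)
        percent = 100.0 * unsatisfied / len(searches)
        longest = max(minutes for _, minutes in searches)
        summary[need] = (percent, longest)
    return summary


def rank_by_unsatisfied(summary):
    """Order needs from most to least often unsatisfied."""
    return sorted(summary.items(), key=lambda item: item[1][0], reverse=True)


summary = summarize(observations)
ranking = rank_by_unsatisfied(summary)
for need, (percent, longest) in ranking:
    print(f"{need:45s} {percent:5.1f}% {longest:3d} min")
```

Sorting on the percentage alone mirrors the ranking described in the text; a real analysis might also break ties by search time or weight needs by how often they occurred.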

7. Discussion

Our motivation for this study was to identify and characterize software developers’ information needs. While the list we have identified may not be complete, it has several implications.

7.1 Coworkers as Information Sources

Coworkers were the most frequent source of information, accessed at least once for of the information needs and in of the instances of information seeking. The importance of coworkers as information sources probably explains why coworker awareness is the second most frequent information need. (Developers checked on coworker availability almost as many times as they looked at output from the compiler or debugger.) This is consistent with prior work on awareness in software development, with regard to sources, strategies, and frequency of information seeking [5][9][15].

Why should developers turn so often to coworkers? One possibility is the topics being discussed. Outside of awareness, the information needs where coworkers were most often consulted were either about design, i.e.
· What are the implications of this change? ( times)
· What is the program supposed to do? ()
· Why was the code implemented this way? ()
or about execution behavior, i.e.
· Is this problem worth fixing? ()
· In what situations does this failure occur? ()
· What code could have caused this behavior? ()
In several instances coworkers were unavailable for these questions, and the developers’ tasks were blocked once they sent their questions via email {}.

Developers consulted coworkers about design because in most cases, design knowledge was only in coworkers’ minds. The lack of design documentation may be due to inadequate notations, particularly for design intent and rationale. Two of our observed developers did have design documentation (a prototype for a user interface, a syntax grammar for a parser), which answered some of their questions {}. However, they still turned to coworkers when they questioned the accuracy of the documents.

Information about program behavior was difficult to acquire because of the number of possible explanations. Developers had to use primitive tools to search this explanation space, and so searches were driven by intuition or expert opinion. Developers also went to great lengths to learn behavior information. As one example, to fix a bug recently assigned to him, one developer had a tester nine time zones away reproduce the bug (at am) since no one else had the right machine configuration. Because behavior information was hard to acquire, developers made triage decisions quickly based on implementation concerns and resource availability, rather than the organization’s overall goals {}. That is, developers would favor those tasks with the fewest information needs.

ID  Information need                                  Participants
a2  What have my coworkers been doing?                15
u3  What code caused this program state?              11
a1  How have resources I depend on changed?           10
u1  What code could have caused this behavior?         9
c3  How do I use this data structure or function?      9
s1  Did I make any mistakes in my new code?            9
d2  What is the program supposed to do?                7
r2  In what situations does the failure occur?         7
b3  Is this problem worth fixing?                      7
u2  What's statically related to this code?            6
d3  Why was this code implemented this way?            6
d4  What are the implications for this change?         6
r1  What does the failure look like?                   5
c3  How can I coordinate this with the other code?     4
s2  Did I follow my team's conventions?                4
d1  What is the purpose of this code?                  4
b2  Is this a legitimate problem?                      3
s3  What changes are part of this submission?          2
b2  How difficult will this problem be to fix?         2
a3  What information was relevant to my task?          1

Figure 4. Information needs per participant.

7.2 Automating Information Sources

One approach to reducing this communication burden is to automate the acquisition of information. Given the frequent desire for awareness information, it is no surprise that researchers are already creating awareness displays for development teams, like FASTDash [1] and Palantír [17].

For example, many of developers’ questions about static relationships depended on metadata such as build numbers and version histories, but developers manually incorporated such data in their searches. Similarly, tools for analyzing programs’ dynamic behavior only partially helped with determining the cause of a program state; the rest had to be determined by hand using a breakpoint debugger and through guesswork. Task-specific applications of program slicing would be a way to automate some of this searching [19]. Implementation questions (c, c, c) also lacked adequate tools (it is worth noting that these needs have also been discussed relative to end-user programming [10]). Tools were often appropriated for unanticipated uses, so it is within tool designers’ interests to design tools that are amenable to appropriation. This might entail using standards, so that information may be passed between tools and transformed as needed.
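To make the slicing idea concrete, here is a minimal Python sketch of a backward slice computed as reachability over a dependence graph. It is a generic textbook formulation, not the thin slicing technique of [19], and the tiny program, the statement labels, and the hand-written `depends_on` graph are all hypothetical; a real tool would derive the graph from the code itself.

```python
from collections import deque

# Hypothetical program, one labeled statement per entry.
source = {
    "s1": "x = read_input()",
    "s2": "y = 10",
    "s3": "z = x + 1",
    "s4": "if z > 5:",
    "s5": "    y = z * 2",
    "s6": "log(x)",
    "s7": "print(y)",
}

# Hand-written dependence graph (an assumption for this sketch): each statement
# maps to the statements whose values or control decisions it depends on.
depends_on = {
    "s1": [],
    "s2": [],
    "s3": ["s1"],
    "s4": ["s3"],
    "s5": ["s3", "s4"],
    "s6": ["s1"],
    "s7": ["s2", "s5"],
}


def backward_slice(graph, criterion):
    """Return every statement the criterion transitively depends on, itself included."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        stmt = queue.popleft()
        for dep in graph.get(stmt, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Which code could have caused the value printed at s7?
relevant = backward_slice(depends_on, "s7")
for label in sorted(relevant):
    print(label, source[label])
```

In this example the slice for `s7` omits `s6`, which is the kind of narrowing that could shorten a search for the cause of a program state.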

Some information seeking cannot be automated because the information is currently unavailable. For example, when developers could not reproduce a failure, there was little they could do to find it. Tracing tools that can record the failure context would be a major advance. Failures could be debugged separately from their original context and the trace could be analyzed by multiple people. Design intent was also difficult to find. Information about rationale and intent existed sometimes in unsearchable places like whiteboards and personal notebooks or in unexpected places like bug reports. Some awareness information is difficult to acquire; for example, developers often wondered who is reading their code. Tools could make this available.
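As a rough sketch of what recording failure context might involve, the Python decorator below captures the arguments, error, stack trace, and basic machine details whenever a wrapped function raises. The names `record_failure_context`, `failures`, and `parse_quantity` are hypothetical, and a production tracing tool would need to capture far more (program state, inputs, configuration) than this example does.

```python
import functools
import json
import platform
import traceback


def record_failure_context(log):
    """Decorator factory: append a context record to `log` when the call fails."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                log.append({
                    "function": func.__name__,
                    "args": repr(args),
                    "kwargs": repr(kwargs),
                    "error": repr(exc),
                    "stack": traceback.format_exc(),
                    "python": platform.python_version(),
                    "machine": platform.machine(),
                })
                raise  # the failure still propagates; it is only recorded here
        return wrapper
    return decorator


failures = []


@record_failure_context(failures)
def parse_quantity(text):
    # Hypothetical function under observation.
    return int(text)


try:
    parse_quantity("12a")
except ValueError:
    pass

# The record is plain data, so it can be saved and shared with coworkers.
print(json.dumps(failures[0], indent=2))
```

Because the record is serializable, several people could inspect the same failure without needing the original machine configuration.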

Aside from tools, one could address these information needs through process change, for example Agile methods. The frequent need to consult coworkers for information is an important motivation for Scrum meetings and radical collocation. Chong and Siino recently compared interruptions among radically collocated pair programmers versus cubicle-based solo programmers and found that the Agile team’s interruptions were shorter, more on-topic, and less disruptive [4]. Our data about the importance of design knowledge provide evidence about the value of prototyping in software design, as well as the value of prototypes during implementation. Our observations about the importance of error checking, coupled with the distributed nature of design knowledge, also support the claims of pair programming: with two developers, with slightly different design knowledge, errors seem more likely to be caught or even prevented.

Last is the issue of notations for software design. While there is already considerable research on architecture description languages, , and various forms of model checking, our observations raise several pertinent questions. What can be written down cost-effectively? How can it be written to be searchable and so that its accuracy and trustworthiness are assessable by developers who consult it? It is worth noting that several participants perceived face-to-face meetings to be a pleasant and efficient way to transfer design knowledge. The frequent conversations promote camaraderie and no effort is wasted recording design information that might never be read or might go stale before being read. Hence, a demand-driven approach to recording design knowledge might succeed over an eager “record everything” approach.

7.3 Study Limitations

Because we performed this study in the context of developers’ real work, the external validity of our results is high. The diversity of our sample gives us confidence in generalizing across different products, team structures and development phases within the organization we studied. We were unable to control for corporate culture, although the communication patterns and development processes we observed are consistent with studies of other corporations. Other variations, such as testing practices, the talent and expertise of a company’s developers, and more or less formal development processes, may have biased our findings.

Studies that rely on observations are subject to observers’ biases, so it was essential that we have multiple observers. For example, we may have misunderstood what developers were looking for. To minimize intrusion, we chose to have a single observer per session, who took only written notes. (In several cases, the participants worked in shared or noisy offices.) Even with a single observer, there were several instances of missed interruptions, where a visitor peeked inside the office, saw the observer, and chose to leave rather than interrupt. There were also many information seeking tasks that we could not observe because they were either too subtle, like glancing at a coworker’s status, or invisible, like the use of memory to recall facts about the code. Our data was also biased by those issues that developers chose to mention during think aloud.

The timestamps in our logs are accurate within the minute, but are subject to typical clerical errors during transcription and copying. Also, some time was spent talking to the observers, but this bias was likely distributed throughout our observations. We only had a single coder categorize our logs, which affects our quantitative data; however, we feel the orders of magnitude in our data are accurate.

8. Conclusions

Our goals in this study were to identify software developers’ information needs and characterize the role of these needs in developers’ decision making. What we found were twenty-one types of information. Some of these were easy to satisfy accurately (awareness) but others with only questionable accuracy (the value of a fix and the implications of a change). Other needs were deferred often (knowledge about behavior and design), whereas some were impossible to satisfy in certain cases (reproduction steps). Not only do these needs call for innovations in tools, processes, and notations, but they also reveal how the collective responsibility for design knowledge can lead to the intense awareness and communication needs observed in this study and others.

There are many future directions for this work. One issue we did not investigate was the decisions that developers made and the true accuracy of the information on which they were based. We have also proposed some explanations for the needs we observed, which should be tested. There are other populations, namely testers and architects, whose roles and information needs should also be studied. We hope that these investigations and others will bring us a more complete understanding of software development work and eventual improvements in software quality.

Acknowledgements

We extend our thanks to the developers who participated in our study for their valuable time. We also thank the , and Visual Studio User Experience teams at Microsoft for their feedback. The first author was an intern at Microsoft Research over the summer of .

References

[1] Biehl, J.T., Czerwinski, M., Smith, G., Robertson, G.G., Bailey, B. (). FASTDash: A Visual Dashboard for Fostering Awareness in Software Teams. To appear at CHI.

[2] Brooks, F.P. Jr. (). The Mythical Man-Month: Essays on Software Engineering. Addison Wesley, Reading, .

[3] Cataldo, M., P. Wagstrom, J.D. Herbsleb, K. Carley (). Identification of Coordination Requirements: Implications for the Design of Collaboration and Awareness Tools. Computer Supported Cooperative Work, Banff, Alberta, –.

[4] Chong, J., Rosanne Siino. Interruptions on Software Teams: A Comparison of Paired and Solo Programmers. Computer Supported Cooperative Work, Banff, Alberta. p. –.

[5] de Souza, C.R.B., D.F. Redmiles, G. Mark, J. Penix, M. Sierhuis () Management of Interdependencies in Collaborative Software Development: A Field Study. ISESE, Rome, Italy, –.

[6] Eisenstadt, M. (). “My Hairiest Bug” War Stories. CACM, (), –.

[7] Gonzalez, V., G. Mark, J. Harris (). No Task Left Behind? Examining the Nature of Fragmented Work. CHI, Portland, , –.

[8] Gutwin, C., R. Penner, K. Schneider (). Group Awareness in Distributed Software Development. CSCW, Chicago, , –.

[9] Hertzum, M. (). The Importance of Trust in Software Engineers’ Assessment and Choice of Information Sources. Information and Organization, (), –.

[10] Ko, A.J., B.A. Myers, H.H. Aung (). Six Learning Barriers in End-User Programming Systems. VL/HCC, Rome, Italy, –.

[11] Ko, A.J., B.A. Myers, M.J. Coblenz, H.H. Aung (). An Exploratory Study of How Developers Seek, Relate, and Collect Relevant Information during Software Maintenance Tasks. TSE, –.

[12] McDonald, D.W., M.S. Ackerman (). Just Talk to Me: A Field Study of Expertise Location. CSCW, Seattle, WA, –.

[13] LaToza, T.D., G. Venolia, R. DeLine (). Maintaining Mental Models: A Study of Developer Work Habits. ICSE, Shanghai, China, –.

[14] Perlow, L.A. (). The Time Famine: Toward a Sociology of Work Time. Administrative Science Quarterly, (), –.

[15] Perry, D.E., N.A. Staudenmayer, L.G. Votta (). People, Organizations and Process Improvement. IEEE Software, July, –.

[16] Sandusky, R.J., L. Gasser (). Negotiation and Coordination of Information and Activity in Distributed Software Problem Management. GROUP, Sanibel Island, , –.

[17] Sarma, A., Z. Noroozi, A. van der Hoek. Palantír: Raising Awareness among Configuration Management Workspaces. ICSE, , Portland, , –.

[18] Seaman, C.B., V.R. Basili (). Communication and Organization: An Empirical Study of Discussion in Inspection Meetings. TSE, (), –.

[19] Sridharan, M., S.J. Fink, R. Bodik. Thin Slicing. To appear at PLDI.

[20] Sillito, J., G. Murphy, K. De Volder (). Questions Programmers Ask During Software Evolution Tasks. SIGSOFT/FSE, Portland, , –.