arXiv:1902.03520v1 [cs.SE] 10 Feb 2019


arXiv Computer Science manuscript No. (will be inserted by the editor)

Swarm Debugging: the Collective Intelligence on Interactive Debugging

Fabio Petrillo1, Yann-Gaël Guéhéneuc3, Marcelo Pimenta2, Carla Dal Sasso Freitas2, Foutse Khomh4

Received: date / Accepted: date

Abstract One of the most important tasks in software maintenance is debugging. To start an interactive debugging session, developers usually set breakpoints in an integrated development environment and navigate through different paths in their debuggers. We started our work by asking what debugging information is useful to share among developers and study two pieces of information: breakpoints (and their locations) and sessions (debugging paths). To answer our question, we introduce the Swarm Debugging concept to frame the sharing of debugging information, the Swarm Debugging Infrastructure (SDI), with which practitioners and researchers can collect and share data about developers' interactive debugging sessions, and the Swarm Debugging Global View (GV) to display debugging paths. Using the SDI, we conducted a large study with professional developers to understand how developers set breakpoints. Using the GV, we also analyzed professional developers in two studies and collected data about their debugging sessions. Our observations and the answers to our research questions suggest that sharing and visualizing debugging data can support debugging activities.

Keywords Debugging, debugging effort, software visualization, empirical studies, distributed systems, information foraging

1 Introduction

Debug. To detect, locate, and correct faults in a computer program. Techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces. (IEEE Standard Glossary of SE Terminology, 1990)

1 Université du Québec à Chicoutimi, 2 Federal University of Rio Grande do Sul, 3 Concordia University, 4 Polytechnique Montréal, Canada



Debugging is a common activity during software development, maintenance, and evolution [1]. Developers use debugging tools to detect, locate, and correct faults. Debugging tools can be interactive or automated.

Interactive debugging tools, a.k.a. debuggers, such as sdb [2], dbx [3], or gdb [4], have been used by developers for decades. Modern debuggers are often integrated in interactive environments, e.g., DDD [5] or the debuggers of Eclipse, NetBeans, IntelliJ IDEA, and Visual Studio. They allow developers to navigate through the code, look for locations to place breakpoints, and step over/into statements. While stepping, debuggers can traverse method invocations and allow developers to toggle one or more breakpoints and stop/restart executions. Thus, they allow developers to gain knowledge about programs and the causes of faults to fix them.

Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other hybrid tools, such as slicing and query languages, may help developers, but there is insufficient evidence that they help developers during debugging.

Although Integrated Development Environments (IDEs) encourage developers to work collaboratively, exchanging code through Git or assessing code quality with SonarQube, one activity remains solitary: debugging. Debugging is still an individual activity during which a developer explores the source code of the system under development or maintenance using the debugger provided by an IDE. She steps into hundreds of statements and traverses dozens of method invocations, painstakingly, to gain an understanding of the system. Moreover, within modern interactive debugging tools, such as those included in Eclipse or IntelliJ, a debugging session cannot start if the developer does not set a breakpoint. Consequently, it is mandatory to set at least one breakpoint to launch an interactive debugging session.

Several studies have shown that developers spend over two-thirds of their time investigating code, and one-third of this time is spent in debugging [8,9,10]. However, developers do not reuse the knowledge accumulated during debugging directly. When debugging is over, they lose track of the paths that they followed into the code and of the breakpoints that they toggled. Moreover, they cannot share this knowledge with other developers easily. If a fault re-appears in the system or if a new fault similar to a previous one is logged, the developer must restart the exploration from the beginning.

In fact, debugging tools have not changed substantially in the last 30 years: developers' primary tools for debugging their programs are still breakpoint debuggers and print statements. Indeed, changing the way developers debug their programs is one of the main motivations of our work. We are convinced that a collaborative way of using contextual information of (previous) debugging sessions to support (future) debugging activities is a very interesting approach.

Rößler [7] advocated for the development of a new family of debugging tools that use contextual information. To build context-aware debugging tools, researchers need an understanding of developers' debugging sessions to use this information as context for their debugging. Thus, researchers need tools to collect and share data about developers' debugging sessions.

Maalej et al. [11] observed that capturing contextual information requires the instrumentation of the IDE and continuous observation of the developers' activities within the IDE. Studies by Storey et al. [12] showed that the newer generation of developers, who are proficient in social media, are comfortable with sharing such information. Developers are nowadays open, transparent, eager to share their knowledge, and generally willing to allow information about their activities to be collected by the IDEs automatically [12].

Considering this context, we introduce the concept of Swarm Debugging (SD) to (1) capture debugging contextual information, (2) share it, and (3) reuse it across debugging sessions and developers. We build the concept of Swarm Debugging on the idea that many developers performing debugging sessions independently are in fact building collective knowledge, which can be shared and reused with adequate support. Thus, we are convinced that developers need support to collect, store, and share this knowledge, i.e., information from and about their debugging sessions, including but not limited to breakpoint locations, visited statements, and traversed paths. To provide such support, Swarm Debugging includes (i) the Swarm Debugging Infrastructure (SDI), with which practitioners and researchers can collect and share data about developers' interactive debugging sessions, and (ii) the Swarm Debugging Global View (GV) to display debugging paths.

As a consequence of adopting SD, an interesting question emerges: what debugging information is useful to share among developers to ease debugging? Debugging provides a lot of information that could possibly be considered useful to improve software comprehension, but we are particularly interested in two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are essential for the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

In general, developers initiate an interactive debugging session by setting a breakpoint. Setting a breakpoint is one of the most frequently used features of IDEs [13]. To decide where to set a breakpoint, developers use their observations, recall their experiences with similar debugging tasks, and formulate hypotheses about their tasks [14]. Tiarks and Röhm [15] observed that developers have difficulties in finding locations for setting the breakpoints, suggesting that this is a demanding activity and that supporting developers to set appropriate breakpoints could reduce debugging effort.

We conducted two sets of studies with the aim of understanding how developers set breakpoints and navigate (step) during debugging sessions. In observational studies, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems. These observational studies help us understand how developers use breakpoints (RQ1 to RQ4).


We also conducted two studies with 30 professional developers, a qualitative evaluation and a controlled experiment, to assess whether debugging sessions shared through our Global View visualisation support developers in their debugging tasks and are useful for sharing debugging tasks among developers (RQ5 and RQ6). We collected participants' answers in electronic forms and more than 3 hours of debugging sessions on video.

This paper has the following contributions:

– We introduce a novel approach for debugging, named Swarm Debugging (SD), based on the concepts of Swarm Intelligence and Information Foraging Theory.

– We present an infrastructure, the Swarm Debugging Infrastructure (SDI), to gather, store, and share data about interactive debugging activities to support SD.

– We provide evidence about the relation between tasks' elapsed time, developers' expertise, breakpoint setting, and debugging patterns.

– We present a new visualisation technique, Global View (GV), built on debugging sessions shared by developers, to ease debugging.

– We provide evidence about the usefulness of sharing debugging sessions to ease developers' debugging.

This paper extends our previous works [16,17,18] as follows. First, we summarize the main characteristics of the Swarm Debugging approach, providing a theoretical foundation to Swarm Debugging using Swarm Intelligence and Information Foraging Theory. Second, we present the Swarm Debugging Infrastructure (SDI). Third, we perform an experiment on the debugging behavior of 30 professional developers to evaluate if sharing debugging sessions adequately supports their debugging tasks.

The remainder of this article is organized as follows. Section 2 provides some fundamentals of debugging and the foundations of SD: the concepts of swarm intelligence and information foraging theory. Section 3 describes our approach and its implementation, the Swarm Debugging Infrastructure. Section 6 presents an experiment to assess the benefits that our SD approach can bring to developers, and Section 5 reports two experiments that were conducted using the SDI to understand developers' debugging habits. Next, Section 7 discusses implications of our results, while Section 8 presents threats to the validity of our study. Section 9 summarizes related work and, finally, Section 10 concludes the paper and outlines future work.

2 Background

This section provides background information about the debugging activity and setting breakpoints. In the following, we use failures as unintended behaviours of a program, i.e., when the program does something that it should not, and faults as the incorrect statements in source code causing failures. The purpose of debugging is to locate and correct faults, hence to fix failures.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process where developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging because several IDEs provide debuggers to support debugging. However, it must be noted that, while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on what is the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes: a means of acquiring knowledge about a program during its execution, for example, to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular, the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause unconditionally the execution of a program, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints depending on the programming languages and their run-time environment. Fig. 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].
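For illustration, the short hypothetical Java fragment below marks in comments where these different kinds of breakpoints could be placed; the class and the threshold are invented, and the breakpoints themselves would be set through the debugger (e.g., in the editor margin of Eclipse), not written in the code.

// Hypothetical example of where each breakpoint type could be placed.
public class OrderProcessor {

    private double total; // a watchpoint on this field pauses whenever it is read and/or written

    public void process(java.util.List<Double> prices) {
        for (double price : prices) {
            total += price;   // a static breakpoint here pauses on every iteration
            // a dynamic (conditional) breakpoint on the same line pauses only when,
            // e.g., the condition "total > 1000" holds, or after a given hit count
        }
    }
}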

There are different mechanisms for setting a breakpoint within the code:


Fig. 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: Most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint, e.g., Chrome1, Visual Studio2, IntelliJ3, and Xcode4.

– Command line: Some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code, e.g., JDB5, PDB6, and GDB7.

– Code: Some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are the Ruby debugger8, Firefox9, and Chrome10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run, step by step, the entire program flow.

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints

2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx

3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html

4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode

5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html

6 https://docs.python.org/2/library/pdb.html

7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html

8 https://github.com/cldwalker/debugger

9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger

10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands (a short annotated example follows this list):

– Step Over: the debugger steps over a given line. If the line contains a function, then the function is executed and the result returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called.
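The following minimal Java sketch (hypothetical code, not taken from the studied systems) shows, in comments, what each command would do when execution is paused in main():

// Hypothetical program illustrating Step Over, Step Into, and Step Out.
public class SteppingExample {

    static int square(int x) {
        int result = x * x;   // Step Into from the call in main() lands here
        return result;        // Step Out here returns to the caller, just after the call site
    }

    public static void main(String[] args) {
        int a = 2;            // suppose execution is paused on the next line
        int b = square(a);    // Step Over executes square() entirely and stops on the
                              // following line; Step Into enters square() instead
        System.out.println(b);
    }
}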

To start an interactive debugging session, developers set a breakpoint; if not, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from Social Sciences and Biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence. Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.


Fig. 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory, developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT on a one prey/one predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in an SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.
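To give a concrete idea of what such a listener (Label A) could look like, the sketch below registers a breakpoint listener with the Eclipse debug core and forwards each event to a repository client (Label C). It is a minimal illustration of the idea only, not the actual SDI implementation; the SwarmRepository interface is assumed for the example.

import org.eclipse.core.resources.IMarker;
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal sketch: observe breakpoint events in the IDE and forward them to a
// shared debugging-knowledge repository. SwarmRepository is hypothetical.
public class BreakpointCollector implements IBreakpointListener {

    interface SwarmRepository {
        void store(String kind, String resource, int line);
    }

    private final SwarmRepository repository;

    public BreakpointCollector(SwarmRepository repository) {
        this.repository = repository;
        // register with the Eclipse debug core so that breakpoint events are reported
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        IMarker marker = breakpoint.getMarker();
        int line = marker.getAttribute(IMarker.LINE_NUMBER, -1);
        repository.store("breakpoint-added", marker.getResource().getName(), line);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        repository.store("breakpoint-removed", breakpoint.getMarker().getResource().getName(), -1);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        // a condition, hit count, or enablement was edited; such adjustments can also be shared
    }
}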

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations, e.g., by Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study from Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (4544), removal (4362), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views.

11 https://www.eclipse.org


Fig. 3 GV elements: types (nodes), invocations (edges), and task filter area

All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on a directed call graph [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using previous debugging-session context data collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying an automatic layout manager (breadthfirst). As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with the adjacent nodes in an invocation sequence.


Fig. 4 GV on all tasks

Besides, developers can directly go to a type in the Eclipse editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows for collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12, version 3.2, as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, for understanding the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to automatically collect breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (a short annotated example follows this list):
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loops, iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
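For illustration, the small hypothetical Java fragment below is annotated with the five statement types on which we classify breakpoints:

import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical fragment annotated with the five statement types used to classify breakpoints.
public class StatementKinds {

    static void drain(Queue<String> queue) {
        int remaining = queue.size();          // assignment
        if (remaining == 0) {                  // if-statement
            return;                            // return
        }
        while (remaining > 0) {                // while-loop
            System.out.println(queue.poll());  // call (method/function invocation)
            remaining = remaining - 1;         // assignment
        }
    }

    public static void main(String[] args) {
        Queue<String> q = new ArrayDeque<>();
        q.add("BibTeX");
        q.add("JabRef");
        drain(q);
    }
}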

Based on these data, we obtained or computed the following metrics, per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T - ST
– Elapsed Time of First Breakpoint (EF): EF = FB - ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
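For instance, in a hypothetical session that starts at minute 0, with the first breakpoint set at minute 5 (EF = 5 min) and the task finished at minute 20 (ET = 20 min), MFB = 5/20 = 0.25, i.e., one quarter of the session elapsed before the first breakpoint was set.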

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, it took participants, on average, 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). So this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint.

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements      Numbers of Breakpoints   %
call            111                      53%
if-statement    39                       19%
assignment      36                       17%
return          18                       10%
while-loop      3                        1%

Table 4 Study 2 - Breakpoints per type of statement

Statements      Numbers of Breakpoints   %
call            43                       43%
if-statement    22                       22%
assignment      27                       27%
return          4                        4%
while-loop      4                        4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks Classes Lines of Code Breakpoints

0318 AuthorsFormatter 43 5

0318 AuthorsFormatter 131 3

0667 BasePanel 935 2

0667 BasePanel 969 3

0667 JabRefDesktop 430 2

0669 OpenDatabaseAction 268 2

0669 OpenDatabaseAction 433 4

0669 OpenDatabaseAction 451 4

0993 EntryEditor 717 2

0993 EntryEditor 720 2

0993 EntryEditor 723 2

0993 BibDatabase 187 2

0993 BibDatabase 456 2

1026 EntryEditor 1184 2

1026 BibtexParser 160 2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes Lines of Code Breakpoints

PdfReader 230 2

PdfReader 806 2

PdfReader 1923 2

ConsoleServicesFacade 89 2

ConsoleClient 81 2

PdfUtility 94 2

PdfUtility 96 2

PdfUtility 102 2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes Lines of Code Breakpoints

icsUtils 333 3

Game 1751 2

ExamineController 41 2

ExamineController 84 3

ExamineController 87 2

ExamineController 92 2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                      Lines of Code            Breakpoints
BibtexParser                 138, 151, 159            2, 2, 2
                             160, 165, 168            3, 2, 3
                             176, 198, 199, 299       2, 2, 2, 2
EntryEditor                  717, 720, 721            3, 4, 2
                             723, 837, 842            2, 3, 2
                             1184, 1393               3, 2
BibDatabase                  175, 187, 223, 456       2, 3, 2, 6
OpenDatabaseAction           433, 450, 451            4, 2, 4
JabRefDesktop                408, 44, 430             2, 2, 3
SaveDatabaseAction           177, 188                 4, 2
BasePanel                    935, 969                 2, 5
AuthorsFormatter             43, 131                  5, 4
EntryTableTransferHandler    346                      2
FieldTextMenu                84                       2
JabRefFrame                  1119                     2
JabRefMain                   8                        5
URLUtil                      95                       2

Fig. 6 Methods with 5 or more breakpoints

Finally, we count how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

SaveDatabaseAction Yes Yes Yes 7 2

BasePanel Yes Yes Yes Yes 14 7

JabRefDesktop Yes Yes 9 4

EntryEditor Yes Yes Yes 36 4

BibtexParser Yes Yes Yes 44 6

OpenDatabaseAction Yes Yes Yes 19 13

JabRef Yes Yes Yes 3 3

JabRefMain Yes Yes Yes Yes 5 4

URLUtil Yes Yes 4 2

BibDatabase Yes Yes Yes 19 4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate if sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev four years). They have, on average, 4.8 years of Java experience (std dev 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV), and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants).

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants if GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44               126%
Time to start      00:04:44      00:05:18         -33               112%
Elapsed time       00:30:08      00:16:05         843               53%

Task 1026

Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126              177%
Time to start      00:04:02      00:03:43         19                92%
Elapsed time       00:24:58      00:20:41         257               83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we noticed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system, but that as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available online (http://github.com/swarmdebugging), and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, identify debugging strategies that are more efficient in the context of their projects, and improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.



We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant for the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults had already been fixed in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others like Mylyn [41] use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is a SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of Eclipse projects (1 or more).

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data that is collected when a developer performs some action during a debugging session.
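To make these concepts concrete, the minimal sketch below shows how a breakpoint record could be represented on the client side; the field names are illustrative assumptions and not the actual SDI schema.

// Minimal sketch of a breakpoint record as described by the meta-model above.
// Field names are illustrative assumptions, not the actual SDI schema.
public class BreakpointRecord {
    private long sessionId;      // session in which the breakpoint was toggled
    private String namespace;    // package containing the type
    private String typeName;     // class or interface where the breakpoint was set
    private String methodName;   // enclosing method, if any
    private int lineNumber;      // source line of the breakpoint
    private long createdAt;      // timestamp of the toggle event

    public BreakpointRecord(long sessionId, String namespace, String typeName,
                            String methodName, int lineNumber, long createdAt) {
        this.sessionId = sessionId;
        this.namespace = namespace;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
        this.createdAt = createdAt;
    }
    // Getters omitted for brevity.
}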

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
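For illustration, such a request can be issued from any HTTP client; the following sketch uses Java's standard HttpURLConnection and assumes the endpoint shown above, with the response simply printed as raw JSON.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the (assumed) developer search endpoint of the SDS RESTful API.
        URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            StringBuilder json = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line);
            }
            // The SDS responds with a JSON document listing the matching developers.
            System.out.println(json);
        }
    }
}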

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard



Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
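To give a flavour of such queries, the sketch below runs a hypothetical Cypher query through the Neo4j Java driver (4.x API assumed); the node labels and relationship types are illustrative and do not necessarily match the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class CypherQueryExample {
    public static void main(String[] args) {
        // Connection settings are placeholders for a local Neo4j instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            // Hypothetical query: methods most often targeted by breakpoints,
            // assuming a (:Breakpoint)-[:SET_ON]->(:Method) pattern in the graph.
            Result result = session.run(
                "MATCH (b:Breakpoint)-[:SET_ON]->(m:Method) "
              + "RETURN m.name AS method, count(b) AS breakpoints "
              + "ORDER BY breakpoints DESC LIMIT 10");

            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("method").asString()
                        + " -> " + record.get("breakpoints").asLong());
            }
        }
    }
}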

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
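A minimal skeleton of such a tracer, registering both listeners with the Eclipse debug core, is sketched below; it is a simplified illustration rather than the actual SDT implementation, and the data forwarding is only indicated by comments.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Simplified sketch of a debug tracer; the real SDT adds authentication,
// session management, and RESTful forwarding to the SDS.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void register() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A stepping event (Step Into/Over/Return) finished:
                // inspect the stack frames here and send invocation data to the SDS.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // A breakpoint was toggled: extract its type, method, and line, then send it to the SDS.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}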

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word "fcatory".


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α and VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β and VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
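As an illustration, these two sets can be computed directly from the collected invocation pairs; the sketch below assumes invocations are available as simple (caller, callee) name pairs, which is a simplification of the SDI data model.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    // An invocation edge: 'caller' invokes 'callee'.
    record Invocation(String caller, String callee) { }

    // Starting methods: invoke others (in alpha) but are never invoked (not in beta).
    static Set<String> startingPoints(List<Invocation> edges) {
        Set<String> alpha = new HashSet<>();  // methods that invoke
        Set<String> beta = new HashSet<>();   // methods that are invoked
        for (Invocation e : edges) {
            alpha.add(e.caller());
            beta.add(e.callee());
        }
        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);
        return starting;
    }

    // Ending methods: are invoked (in beta) but never invoke others (not in alpha).
    static Set<String> endingPoints(List<Invocation> edges) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Invocation e : edges) {
            alpha.add(e.caller());
            beta.add(e.callee());
        }
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);
        return ending;
    }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "parse"),
                new Invocation("parse", "readEntry"),
                new Invocation("main", "render"));
        System.out.println("Starting: " + startingPoints(edges));  // contains main
        System.out.println("Ending: " + endingPoints(edges));      // contains readEntry and render
    }
}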

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions and their call graphs makes them easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


    2Please give a shorter version with authorrunning and titlerunning prior to maketitle

    Debugging is a common activity during software development mainte-nance and evolution [1] Developers use debugging tools to detect locateand correct faults Debugging tools can be interactive or automated

    Interactive debugging tools aka debuggers such as sdb [2] dbx [3] orgdb [4] have been used by developers for decades Modern debuggers are of-ten integrated in interactive environments eg DDD [5] or the debuggers ofEclipse NetBeans IntelliJ IDEA and Visual Studio They allow developersto navigate through the code look for locations to place breakpoints and stepoverinto statements While stepping debuggers can traverse method invoca-tions and allow developers to toggle one or more breakpoints and stoprestartexecutions Thus they allow developers to gain knowledge about programsand the causes of faults to fix them

    Automated debugging tools require both successful and failed runs and donot support programs with interactive inputs [6] Consequently they have notbeen widely adopted in practice Moreover automated debugging approachesare often unable to indicate the ldquotruerdquo locations of faults [7] Other hybridtools such as slicing and query languages may help developers but there isinsufficient evidence that they help developers during debugging

    Although Integrated Development Environments (IDEs) encourage devel-opers to work collaboratively exchanging code through Git or assessing codequality with SonarQube one activity remains solitary debugging Debuggingis still an individual activity during which a developer explores the sourcecode of the system under development or maintenance using the debugger pro-vided by an IDE She steps into hundreds of statements and traverses dozensof method invocations painstakingly to gain an understanding of the systemMoreover within modern interactive debugging tools such as those includedin Eclipse or IntelliJ a debugging session cannot start if the developer does notset a breakpoint Consequently it is mandatory to set at least one breakpointto launch an interactive debugging session

    Several studies have shown that developers spend over two-thirds of theirtime investigating code and one-third of this time is spent in debugging [8910] However developers do not reuse the knowledge accumulated duringdebugging directly When debugging is over they loose track of the pathsthat they followed into the code and of the breakpoints that they toggledMoreover they cannot share this knowledge with other developers easily If afault re-appears in the system or if a new fault similar to a previous one islogged the developer must restart the exploration from the beginning

    In fact debugging tools have not changed substantially in the last 30 yearsdevelopersrsquo primary tools for debugging their programs are still breakpoint de-buggers and print statements Indeed changing the way developers debug theirprograms is one of the main motivations of our work We are convinced thata collaborative way of using contextual information of (previous) debuggingsessions to support (future) debugging activities is a very interesting approach

    Roszligler [7] advocated for the development of a new family of debuggingtools that use contextual information To build context-aware debugging toolsresearchers need an understanding of developersrsquo debugging sessions to use

    Swarm Debugging the Collective Intelligence on Interactive Debugging 3

    this information as context for their debugging Thus researchers need toolsto collect and share data about developersrsquo debugging sessions

    Maalej et al [11] observed that capturing contextual information requiresthe instrumentation of the IDE and continuous observation of the developersrsquoactivities within the IDE Studies by Storey et al [12] showed that the newergeneration of developers who are proficient in social media are comfortablewith sharing such information Developers are nowadays open transparenteager to share their knowledge and generally willing to allow informationabout their activities to be collected by the IDEs automatically [12]

    Considering this context we introduce the concept of Swarm Debug-ging (SD) to (1) capture debugging contextual information (2) share it and(3) reuse it across debugging sessions and developers We build the concept ofSwarm Debugging based on the idea that many developers performing debug-ging sessions independently are in fact building collective knowledge whichcan be shared and reused with adequate support Thus we are convinced thatdevelopers need support to collect store and share this knowledge ie in-formation from and about their debugging sessions including but not limitedto breakpoints locations visited statements and traversed paths To providesuch support Swarm Debugging includes (i) the Swarm Debugging Infrastruc-ture (SDI) with which practitioners and researchers can collect and share dataabout developersrsquo interactive debugging sessions and (ii) the Swarm Debug-ging Global View (GV) to display debugging paths

    As a consequence of adopting SD an interesting question emerges whatdebugging information is useful to share among developers to ease debuggingDebugging provides a lot of information which could be possibly considereduseful to improve software comprehension but we are particularly interestedin two pieces of debugging information breakpoints (and their locations) andsessions (debugging paths) because these pieces of information are essentialfor the two main activities during debugging setting breakpoints and steppinginoverout statements

    In general developers initiate an interactive debugging session by settinga breakpoint Setting a breakpoint is one of the most frequently used fea-tures of IDEs [13] To decide where to set a breakpoint developers use theirobservations recall their experiences with similar debugging tasks and formu-late hypotheses about their tasks [14] Tiarks and Rohms [15] observed thatdevelopers have difficulties in finding locations for setting the breakpointssuggesting that this is a demanding activity and that supporting developersto set appropriate breakpoints could reduce debugging effort

    We conducted two sets of studies with the aim of understanding how de-velopers set breakpoints and navigate (step) during debugging sessions Inobservational studies we collected and analyzed more than 10 hours of devel-opersrsquo videos in 45 debugging sessions performed by 28 different independentdevelopers containing 307 breakpoints on three software systems These ob-servational studies help us understand how developers use breakpoints (RQ1to RQ4)

    4Please give a shorter version with authorrunning and titlerunning prior to maketitle

    We also conducted with 30 professional developers two studies a qualitativeevaluation and a controlled experiment to assess whether debugging sessionsshared through our Global View visualisation support developers in theirdebugging tasks and is useful for sharing debugging tasks among developers(R5 and RQ6) We collected participantsrsquo answers in electronic forms and morethan 3 hours of debugging sessions on video

    This paper has the following contributions

    ndash We introduce a novel approach for debugging named Swarm Debugging(SD) based on the concept of Swarm Intelligence and Information ForagingTheory

    ndash We present an infrastructure the Swarm Debugging Infrastructure (SDI)to gather store and share data about interactive debugging activities tosupport SD

    ndash We provide evidence about the relation between tasksrsquo elapsed time de-velopersrsquo expertise breakpoints setting and debugging patterns

    ndash We present a new visualisation technique Global View (GV) built onshared debugging sessions by developers to ease debugging

    ndash We provide evidence about the usefulness of sharing debugging session toease developersrsquo debugging

This paper extends our previous works [16,17,18] as follows. First, we summarize the main characteristics of the Swarm Debugging approach, providing a theoretical foundation for Swarm Debugging using Swarm Intelligence and Information Foraging Theory. Second, we present the Swarm Debugging Infrastructure (SDI). Third, we perform an experiment on the debugging behavior of 30 professional developers to evaluate whether sharing debugging sessions adequately supports their debugging tasks.

The remainder of this article is organized as follows. Section 2 provides some fundamentals of debugging and the foundations of SD: the concepts of swarm intelligence and information foraging theory. Section 3 describes our approach and its implementation, the Swarm Debugging Infrastructure. Section 6 presents an experiment to assess the benefits that our SD approach can bring to developers, and Section 5 reports two experiments that were conducted using SDI to understand developers' debugging habits. Next, Section 7 discusses implications of our results, while Section 8 presents threats to the validity of our study. Section 9 summarizes related work and, finally, Section 10 concludes the paper and outlines future work.

    2 Background

This section provides background information about the debugging activity and setting breakpoints. In the following, we use failures as unintended behaviours of a program, i.e., when the program does something that it should not, and faults as the incorrect statements in source code causing failures. The purpose of debugging is to locate and correct faults, hence to fix failures.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process in which developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging, because several IDEs provide debuggers to support debugging. However, it must be noted that, while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes, a means of acquiring knowledge about a program during its execution, for example to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints, depending on the programming languages and their run-time environments. Fig 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].
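As an illustration of these breakpoint types, consider the following minimal Java sketch; the class, method, and condition are our own invented example (not from the studied systems), and the comments describe how each type of breakpoint would behave in a debugger such as Eclipse's.

    public class OrderProcessor {

        private double balance;                   // a watchpoint on this field would pause
                                                  // whenever "balance" is read and/or written

        void process(java.util.List<Double> orders) {
            for (int i = 0; i < orders.size(); i++) {
                double amount = orders.get(i);    // a static breakpoint here pauses on every
                                                  // iteration; a dynamic (conditional) breakpoint
                                                  // with the condition "i == 42", or with a hit
                                                  // count, pauses only when that condition holds
                balance += amount;
            }
        }
    }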

There are different mechanisms for setting a breakpoint within the code:


Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint, e.g., Chrome^1, Visual Studio^2, IntelliJ^3, and Xcode^4.

– Command line: some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code, e.g., JDB^5, PDB^6, and GDB^7.

– Code: some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are the Ruby debugger^8, Firefox^9, and Chrome^10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run the entire program flow step by step.

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, the function is executed and the result is returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called.
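To make these commands concrete, the following sketch (our own example, not taken from the studied systems) shows, in comments, what each stepping command does when the debugger is suspended at the marked line.

    public class Checkout {

        double total(double price, int quantity) {
            double subtotal = price * quantity;
            double taxed = withTax(subtotal);   // suspended here:
                                                // - Step Over runs withTax() entirely and suspends
                                                //   at the next line (the return statement)
                                                // - Step Into suspends at the first line inside withTax()
            return taxed;
        }

        double withTax(double amount) {
            double rate = 0.15;                 // Step Into lands here
            return amount * (1 + rate);        // Step Out finishes withTax() and suspends back
                                                // at the calling line in total()
        }
    }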

To start an interactive debugging session, developers set a breakpoint; otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the Social Sciences and Biology. It is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multi-version, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs


emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT), developed by Pirolli and Card [27] and based on optimal foraging theory, seeks to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in a SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

    3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. SDI collects only visited areas and paths (chains of invocations, e.g., by Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.
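A minimal sketch of how such events can be observed with the standard Eclipse debug APIs is shown below; it is our illustration of the idea, not the actual Swarm Debugging Tracer code, and persist() is a placeholder for the SDI storage services.

    import org.eclipse.core.resources.IMarker;
    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class SwarmTracerSketch implements IBreakpointListener, IDebugEventSetListener {

        public void start() {
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.getBreakpointManager().addBreakpointListener(this); // breakpoint toggling
            plugin.addDebugEventListener(this);                        // stepping events
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            IMarker marker = breakpoint.getMarker();
            int line = marker.getAttribute(IMarker.LINE_NUMBER, -1);
            persist("breakpoint", marker.getResource().getName(), line);
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // a Step Into (F5) issued by the developer produces a RESUME event with this detail
                if (event.getKind() == DebugEvent.RESUME && event.getDetail() == DebugEvent.STEP_INTO) {
                    persist("step-into", event.getSource().toString(), -1);
                }
            }
        }

        private void persist(String kind, String where, int line) {
            // placeholder: the real infrastructure stores the event in the Swarm Debugging repository
            System.out.println(kind + " at " + where + ":" + line);
        }
    }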

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

    4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE^11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views.

11 https://www.eclipse.org


Fig 3 GV elements – types (nodes), invocations (edges), and task filter area

All the implementation details of SDI are available in the Appendix section.
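To illustrate the decoupling between the Tracer and the cloud services, the sketch below posts a collected breakpoint to a REST service over HTTP; the endpoint URL and the JSON fields are our own assumptions for illustration, not the documented SDI API.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class SwarmServiceClientSketch {

        // hypothetical endpoint; the actual SDI services may expose a different API
        private static final String ENDPOINT = "http://server.swarmdebugging.org/breakpoints";

        public static void send(String developer, String type, int line) throws Exception {
            String json = String.format(
                "{\"developer\":\"%s\",\"type\":\"%s\",\"line\":%d}", developer, type, line);

            HttpURLConnection connection = (HttpURLConnection) new URL(ENDPOINT).openConnection();
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            try (OutputStream out = connection.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("service answered HTTP " + connection.getResponseCode());
            connection.disconnect();
        }
    }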

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call-graph visualisation for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded grey boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using debugging-session context data previously collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph JavaScript framework, applying an automatic layout manager (breadthfirst). As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes its context, and the visualisation is generated by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated at the top of the tree, followed by the adjacent nodes in the invocation sequence.


    Fig 4 GV on all tasks

Besides, developers can go directly to a type in the Eclipse editor by double-clicking a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.
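One possible way to derive the GV data is to aggregate the collected Step Into events into weighted edges, as in the sketch below; this is our own simplification of how the view could be computed, and all names are illustrative.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class GlobalViewBuilderSketch {

        /** A collected Step Into event: the caller type invokes a method of the callee type during a task. */
        static class Invocation {
            final String callerType, calleeType, task;
            Invocation(String callerType, String calleeType, String task) {
                this.callerType = callerType;
                this.calleeType = calleeType;
                this.task = task;
            }
        }

        /**
         * Aggregates invocations into edges "caller->callee (task)"; the number of traversals
         * gives the weight that drives the edge thickness in the visualisation.
         */
        static Map<String, Integer> buildEdges(List<Invocation> invocations) {
            Map<String, Integer> edgeWeights = new HashMap<>();
            for (Invocation invocation : invocations) {
                String edge = invocation.callerType + "->" + invocation.calleeType
                        + " (" + invocation.task + ")";
                edgeWeights.merge(edge, 1, Integer::sum);
            }
            return edgeWeights;
        }
    }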

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

    5 Using SDI to Understand Debugging Activities

The first benefit of SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef^12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service^13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study^14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issue   Summary
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial^15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video^16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to automatically collect breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software^17 (OBS), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire that included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST.

– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge^18 (PDFSaM) and Raptor^19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET        (1)
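As an illustrative example, a session in which the first breakpoint is set 5 minutes after the start of a 20-minute task yields MFB = 5/20 = 0.25, i.e., a quarter of the session elapsed before the first breakpoint was set.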

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)   Std Dev (min)
318      44                   64
667      28                   29
669      22                   25
993      25                   25
1026     25                   17
PdfSam   54                   18
Raptor   59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session^a. This effort is therefore important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one or many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β        (2)

where α = 1.2 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.
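For clarity, the snippet below is our own illustrative example (not from the subject systems), marking one statement of each category used in this classification.

    public class StatementKinds {

        int findFirstNegative(int[] values) {
            int index = 0;                      // assignment: sets a value
            while (index < values.length) {     // while-loop: iteration
                if (values[index] < 0) {        // if-statement: conditional
                    log(index);                 // call: method invocation
                    return index;               // return: returns a value
                }
                index++;
            }
            return -1;
        }

        void log(int index) {
            System.out.println("negative value at index " + index);
        }
    }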

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, with the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

    JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing comparison of breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code                                       Breakpoints
BibtexParser                138, 151, 159, 160, 165, 168, 176, 198, 199, 299    2, 2, 2, 3, 2, 3, 2, 2, 2, 2
EntryEditor                 717, 720, 721, 723, 837, 842, 1184, 1393            3, 4, 2, 2, 3, 2, 3, 2
BibDatabase                 175, 187, 223, 456                                  2, 3, 2, 6
OpenDatabaseAction          433, 450, 451                                       4, 2, 4
JabRefDesktop               40, 84, 430                                         2, 2, 3
SaveDatabaseAction          177, 188                                            4, 2
BasePanel                   935, 969                                            2, 5
AuthorsFormatter            43, 131                                             5, 4
EntryTableTransferHandler   346                                                 2
FieldTextMenu               84                                                  2
JabRefFrame                 1119                                                2
JabRefMain                  8                                                   5
URLUtil                     95                                                  2

    Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks (Issues 318, 667, 669, 993, and 1026)

Type                 Tasks with breakpoints   Breakpoints   Dev. diversity
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

    6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behavior of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef^20 as subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

    Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers^21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV^22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial^23 that explained how to install and configure the tools required to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line^24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups:

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line^25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server^26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software^27 as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

    Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

    Fig 9 GV for Task 0667


    Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


    Fig 11 GV usefulness - experimental phase one

    Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so I was just looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope of finding some hints. And indeed, I've found the BibtexParser class, where the method with the most breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


    debugging sessions It encourages us to pursue our research work in this di-rection and perform more experiments to point further ways of improving ourapproach

    7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) are open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.
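To illustrate how such a recommendation could be computed from shared data, the following sketch ranks previously collected breakpoint locations by how many distinct developer–task pairs toggled them. It is only an illustration: the class and field names are our own assumptions and do not correspond to the actual SDI API.

import java.util.*;
import java.util.stream.Collectors;

// Illustrative ranking of shared breakpoint locations (names are assumptions, not the SDI API).
class BreakpointRecommender {

    // One shared breakpoint observation: where it was set, by whom, and for which task.
    record SharedBreakpoint(String location, String developer, String task) { }

    // Rank locations by the number of distinct (developer, task) pairs that used them.
    static List<String> recommend(List<SharedBreakpoint> history, int limit) {
        Map<String, Set<String>> usage = new HashMap<>();
        for (SharedBreakpoint bp : history) {
            usage.computeIfAbsent(bp.location(), k -> new HashSet<>())
                 .add(bp.developer() + "#" + bp.task());
        }
        return usage.entrySet().stream()
                .sorted((a, b) -> Integer.compare(b.getValue().size(), a.getValue().size()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}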

28 https://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

    8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

    9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis, using IDE navigational functionalities, before they made changes. They also did dynamic change impact analysis, by running the programs, after they made changes. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

    10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

    11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

    References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


    Appendix - Implementation of Swarm Debugging

    Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

    Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of Eclipse projects (1 or more).

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


    Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some action during a debugging session.
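To make the meta-model more concrete, the fragment below sketches how two of these concepts could be represented as simple Java classes. The field names are illustrative assumptions and do not necessarily match the actual SDS schema.

// Illustrative sketch of two meta-model concepts (field names are assumptions).
class Breakpoint {
    String typeName;      // fully qualified type where the breakpoint was toggled
    String methodName;    // enclosing method, if any
    int lineNumber;       // source line of the breakpoint
    long sessionId;       // the Session during which it was toggled
}

class Invocation {
    String invokingMethod;  // caller visited by the developer
    String invokedMethod;   // callee reached by, e.g., a Step Into
    long sessionId;         // the Session that produced this invocation
}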

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure.

29 http://projects.spring.io/spring-boot

For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
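For illustration, such a request could also be issued programmatically; the snippet below is a minimal sketch using the standard JDK HTTP client and assumes the endpoint above is reachable.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        // The SDS answers with a JSON document describing the matching developers.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}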

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, which is a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

    Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


    Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

    Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
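As a rough sketch of how such listeners can be hooked into the Eclipse debug framework (the listener registration uses the standard Eclipse Debug Core API; the class name and the forwarding logic are illustrative assumptions, not the actual SDT code):

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Illustrative tracer skeleton: it registers for debug and breakpoint events.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND && event.getDetail() == DebugEvent.STEP_END) {
                // A step (e.g., Step Into) just finished: the stack can be inspected here
                // and the invoking/invoked pair sent to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Send the breakpoint location (type, method, line) to the Swarm Debugging Services.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}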

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


    Fig 17 The Swarm Tracer architecture [17]

    Fig 18 The Swarm Manager view

Fig 19 Breakpoint search tool (fuzzy search example)

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.
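The sketch below illustrates, under our own simplifying assumptions, how a paused stack (top frame first) could be turned into such invoking/invoked pairs; it is not the actual SDT implementation.

import java.util.ArrayList;
import java.util.List;

// One invoking/invoked pair (illustrative value object).
record InvocationPair(String invokingMethod, String invokedMethod) { }

class InvocationExtractor {
    // Given the method signatures of the current stack (top frame first),
    // produce one invocation entry per caller/callee pair.
    static List<InvocationPair> fromStack(List<String> framesTopFirst) {
        List<InvocationPair> invocations = new ArrayList<>();
        for (int i = framesTopFirst.size() - 1; i > 0; i--) {
            String invoking = framesTopFirst.get(i);     // caller (deeper in the stack)
            String invoked = framesTopFirst.get(i - 1);  // callee (closer to the top)
            invocations.add(new InvocationPair(invoking, invoked));
        }
        return invocations;
    }
}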

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

    Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are directed call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


    Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

    Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.


    Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj> where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the Starting and Ending methods are:

StartingPoint = { V_SP | V_SP ∈ α ∧ V_SP ∉ β }

EndingPoint = { V_EP | V_EP ∈ β ∧ V_EP ∉ α }

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
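Computing these sets from the collected invocations is straightforward; the sketch below, using the same notation, derives the starting and ending methods from a list of (invoking, invoked) pairs. Class and method names are illustrative, not the actual SDS implementation.

import java.util.*;

class StartEndFinder {
    // invocations: each element is a two-element array {invoking, invoked}.
    static Set<String> startingMethods(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] pair : invocations) {
            invoking.add(pair[0]);
            invoked.add(pair[1]);
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);   // in alpha but not in beta
        return starting;
    }

    static Set<String> endingMethods(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] pair : invocations) {
            invoking.add(pair[0]);
            invoked.add(pair[1]);
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);    // in beta but not in alpha
        return ending;
    }
}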

    Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, software exploration is divided by sessions, and its call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


      Swarm Debugging the Collective Intelligence on Interactive Debugging 3

      this information as context for their debugging Thus researchers need toolsto collect and share data about developersrsquo debugging sessions

      Maalej et al [11] observed that capturing contextual information requiresthe instrumentation of the IDE and continuous observation of the developersrsquoactivities within the IDE Studies by Storey et al [12] showed that the newergeneration of developers who are proficient in social media are comfortablewith sharing such information Developers are nowadays open transparenteager to share their knowledge and generally willing to allow informationabout their activities to be collected by the IDEs automatically [12]

      Considering this context we introduce the concept of Swarm Debug-ging (SD) to (1) capture debugging contextual information (2) share it and(3) reuse it across debugging sessions and developers We build the concept ofSwarm Debugging based on the idea that many developers performing debug-ging sessions independently are in fact building collective knowledge whichcan be shared and reused with adequate support Thus we are convinced thatdevelopers need support to collect store and share this knowledge ie in-formation from and about their debugging sessions including but not limitedto breakpoints locations visited statements and traversed paths To providesuch support Swarm Debugging includes (i) the Swarm Debugging Infrastruc-ture (SDI) with which practitioners and researchers can collect and share dataabout developersrsquo interactive debugging sessions and (ii) the Swarm Debug-ging Global View (GV) to display debugging paths

      As a consequence of adopting SD an interesting question emerges whatdebugging information is useful to share among developers to ease debuggingDebugging provides a lot of information which could be possibly considereduseful to improve software comprehension but we are particularly interestedin two pieces of debugging information breakpoints (and their locations) andsessions (debugging paths) because these pieces of information are essentialfor the two main activities during debugging setting breakpoints and steppinginoverout statements

      In general developers initiate an interactive debugging session by settinga breakpoint Setting a breakpoint is one of the most frequently used fea-tures of IDEs [13] To decide where to set a breakpoint developers use theirobservations recall their experiences with similar debugging tasks and formu-late hypotheses about their tasks [14] Tiarks and Rohms [15] observed thatdevelopers have difficulties in finding locations for setting the breakpointssuggesting that this is a demanding activity and that supporting developersto set appropriate breakpoints could reduce debugging effort

      We conducted two sets of studies with the aim of understanding how de-velopers set breakpoints and navigate (step) during debugging sessions Inobservational studies we collected and analyzed more than 10 hours of devel-opersrsquo videos in 45 debugging sessions performed by 28 different independentdevelopers containing 307 breakpoints on three software systems These ob-servational studies help us understand how developers use breakpoints (RQ1to RQ4)

      4Please give a shorter version with authorrunning and titlerunning prior to maketitle

      We also conducted with 30 professional developers two studies a qualitativeevaluation and a controlled experiment to assess whether debugging sessionsshared through our Global View visualisation support developers in theirdebugging tasks and is useful for sharing debugging tasks among developers(R5 and RQ6) We collected participantsrsquo answers in electronic forms and morethan 3 hours of debugging sessions on video

      This paper has the following contributions

      ndash We introduce a novel approach for debugging named Swarm Debugging(SD) based on the concept of Swarm Intelligence and Information ForagingTheory

      ndash We present an infrastructure the Swarm Debugging Infrastructure (SDI)to gather store and share data about interactive debugging activities tosupport SD

      ndash We provide evidence about the relation between tasksrsquo elapsed time de-velopersrsquo expertise breakpoints setting and debugging patterns

      ndash We present a new visualisation technique Global View (GV) built onshared debugging sessions by developers to ease debugging

      ndash We provide evidence about the usefulness of sharing debugging session toease developersrsquo debugging

      This paper extends our previous works [161718] as follows First we sum-marize the main characteristics of the Swarm Debugging approach providinga theoretical foundation to Swarm Debugging using Swarm Intelligence andInformation Foraging Theory Second we present the Swarm Debugging Infras-tructure (SDI) Third we perform an experiment on the debugging behaviorof 30 professional developers to evaluate if sharing debugging sessions supportsadequately their debugging tasks

      The remainder of this article is organized as follows Section 2 providessome fundamentals of debugging and the foundations of SD the conceptsof swarm intelligence and information foraging theory Section 3 describesour approach and its implementation the Swarm Debugging InfrastructureSection 6 presents an experiment to assess the benefits that our SD approachcan bring to developers and Section 5 reports two experiments that wereconducted using SDI to understand developers debugging habits Next Section7 discusses implications of our results while Section 8 presents threats to thevalidity of our study Section 9 summarizes related work and finally Section10 concludes the paper and outlines future work

      2 Background

      This section provides background information about the debugging activityand setting breakpoints In the following we use failures as unintended be-haviours of a program ie when the program does something that it shouldnot and faults as the incorrect statements in source code causing failuresThe purpose of debugging is to locate and correct faults hence to fix failures

      Swarm Debugging the Collective Intelligence on Interactive Debugging 5

      21 Debugging and Interactive Debugging

      The IEEE Standard Glossary of Software Engineering Terminology (see thedefinition at the beginning of Section 1) defines debugging as the act of de-tecting locating and correcting bugs in a computer program Debugging tech-niques include the use of breakpoints desk checking dumps inspection re-versible execution single-step operations and traces

      Araki et al [19] describe debugging as a process where developers makehypotheses about the root-cause of a problem or defect and verify these hy-potheses by examining different parts of the source code of the program

      Interactive debugging consists of using a tool ie a debugger to detectlocate and correct a fault in a program It is a process also known as programanimation stepping or following execution [20] Developers often refer to thisprocess simply as debugging because several IDEs provide debuggers to sup-port debugging However it must be noted that while debugging is the processof finding faults interactive debugging is one particular debugging approachin which developers use interactive tools Expressions such as interactive de-bugging stepping and debugging are used interchangeably and there is not yeta consensus on what is the best name for this process

      22 Breakpoints and Supporting Mechanisms

      Generally breakpoints allow pausing intentionally the execution of a programfor debugging purposes a means of acquiring knowledge about a program dur-ing its execution for example to examine the call stack and variable valueswhen the control flow reaches the locations of the breakpoints Thus a break-point indicates the location (line) in the source code of a program where apause occurs during its execution

      Depending on the programming language its run-time environment (inparticular the capabilities of its virtual machines if any) and the debuggersdifferent types of breakpoints may be available to developers These types in-clude static breakpoints [21] that pause unconditionally the execution of aprogram and dynamic breakpoints [22] that pause depending on some con-ditions or threads or numbers of hits

      Other types of breakpoints include watchpoints that pause the executionwhen a variable being watched is read andndashor written IDEs offer the meansto specify the different types of breakpoints depending on the programminglanguages and their run-time environment Fig 1-A and 1-B show examples ofstatic and dynamic breakpoints in Eclipse In the rest of this paper we focuson static breakpoints because they are the most used of all types [14]

      There are different mechanisms for setting a breakpoint within the code

      6Please give a shorter version with authorrunning and titlerunning prior to maketitle

      Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B)using Eclipse IDE

      ndash GUI Most IDEs or browsers offer a visual way of adding a breakpoint usu-ally by clicking at the beginning of the line on which to set the breakpointChrome1 Visual Studio2 IntelliJ 3 and Xcode4

      ndash Command line Some programming languages offer debugging tools on thecommand line so an IDE is not necessary to debug the code JDB5 PDB6and GDB7

      ndash Code Some programming languages allow using syntactical elements to setbreakpoints as they were lsquoannotationsrsquo in the code This approach oftenonly supports the setting of a breakpoint and it is necessary to use itin conjunction with the command line or GUI Some examples are Rubydebugger8 Firefox 9 and Chrome10

      There is a set of features in a debugger that allows developers to control theflow of the execution within the breakpoints ie Call Stack features whichenable continuing or stepping

      A developer can opt for continuing in which case the debugger resumesexecution until the next breakpoint is reached or the program exits Con-versely stepping allows the developer to run step by step the entire program

      1httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

      2httpsmsdnmicrosoftcomen-uslibrary5557y8b4aspx

      3httpswwwjetbrainscomhelpidea20163debugger-basicshtml

      4httpjeffreysambellscom20140114using-breakpoints-in-xcode

      5httpdocsoraclecomjavase7docstechnotestoolswindowsjdbhtml

      6httpsdocspythonorg2librarypdbhtml

      7ftpftpgnuorgoldgnuManualsgdb511html nodegdb 37html

      8httpsgithubcomcldwalkerdebugger

      9httpsdevelopermozillaorgpt-BRdocsWebJavaScriptReferenceStatementsdebugger

      10httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

      Swarm Debugging the Collective Intelligence on Interactive Debugging 7

      flow The definition of a step varies across programming languages and debug-gers but it generally includes invoking a method and executing a statementWhile Stepping a developer can navigate between steps using the followingcommands

      ndash Step Over the debugger steps over a given line If the line contains afunction then the function is executed and the result returned withoutstepping through each of its lines

      ndash Step Into the debugger enters the function at the current line and continuestepping from there line-by-line

      ndash Step Out this action would take the debugger back to the line where thecurrent function was called

      To start an interactive debugging session developers set a breakpoint Ifnot the IDE would not stop and enter its interactive mode For exampleEclipse IDE automatically opens the ldquoDebugging Perspectiverdquo when executionhits a breakpoint A developer can run a system in debugging mode withoutsetting breakpoints but she must set a breakpoint to be able to stop theexecution step in and observe variable states Briefly there is no interactivedebugging session without at least one breakpoint set in the codeFinally some debuggers allow debugging remotely for example to performhot-fixes or to test mobile applications and systems operating in remote con-figurations

      23 Self-organization and Swarm Intelligence

      Self-organization is a concept emerged from Social Sciences and Biology and itis defined as the set of dynamic mechanisms enabling structures to appear atthe global level of a system from interactions among its lower-level componentswithout being explicitly coded at the lower levels Swarm intelligence (SI)describes the behavior resulting from the self-organization of social agents(as insects) [23] Ant nests and the societies that they house are examples ofSI [24] Individual ants can only perform relatively simple activities yet thewhole colony can collectively accomplish sophisticated activities Ants achieveSI by exchanging information encoded as chemical signalsmdashpheromones egindicating a path to follow or an obstacle to avoid

      Similarly SI could be used as a metaphor to understand or explain thedevelopment of a multiversion large and complex software systems built bysoftware teams Individual developers can usually perform activities withouthaving a global understanding of the whole system [25] In a birdrsquos eye viewsoftware development is analogous to some SI in which groups of agents in-teracting locally with one another and with their environment and follow-ing simple rules lead to the emergence of global behaviors previously un-knownimpossible to the individual agents We claim that the similarities be-tween the SI of ant nests and complex software systems are not a coincidenceCockburn [26] suggested that the best architectures requirements and designs

      8Please give a shorter version with authorrunning and titlerunning prior to maketitle

      emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

      Dev1

      Dev2

      Dev3

      DevN

      VisualisationsSearching Tools

      Recommendation Systems

      Single Debugging Session Crowd Debugging Sessions Debugging Information

      Positive feedback

      Collect data Store data

      Transform information

      A B C

      D

      Fig 2 Overview of the Swarm Debugging approach

      24 Information Foraging

      Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

      However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

      These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

      3 The Swarm Debugging Approach

      Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

      Swarm Debugging the Collective Intelligence on Interactive Debugging 9

      First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.
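As an illustration of this decoupling, the snippet below posts one collected breakpoint event to a remote service over HTTP. The endpoint path and the JSON fields are assumptions made for this sketch, not the documented SDI service API.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sketch of how a tracer could push one collected event to cloud services. */
public class SwarmServiceClient {

    // Hypothetical endpoint; the real SDI routes are not listed here.
    private static final String SERVICE_URL = "http://server.swarmdebugging.org/breakpoints";

    public static void main(String[] args) throws Exception {
        String json = "{\"type\":\"BasePanel\",\"lineNumber\":969,\"task\":\"667\"}";
        HttpURLConnection con = (HttpURLConnection) new URL(SERVICE_URL).openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8)); // send the event payload
        }
        System.out.println("Service answered: " + con.getResponseCode());
    }
}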

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. SDI collects only visited areas and paths (chains of invocations triggered by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, were much less used [30].

      4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of SDI are available in the Appendix section.

11 https://www.eclipse.org

Fig 3 GV elements: types (nodes), invocations (edges), and task filter area

4.1 Swarm Debugging Global View

Swarm Debugging Global View (GV) is a call graph for modeling software, based on a directed call graph [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using previous debugging session context data collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying an automatic breadth-first layout manager. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.
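The following minimal SWT snippet illustrates how a web-based visualisation such as GV can be embedded through a Browser widget; inside Eclipse, the same widget would typically live in a ViewPart rather than a standalone Shell, and the URL is the GV server used in our studies.

import org.eclipse.swt.SWT;
import org.eclipse.swt.browser.Browser;
import org.eclipse.swt.layout.FillLayout;
import org.eclipse.swt.widgets.Display;
import org.eclipse.swt.widgets.Shell;

/** Minimal sketch: embed a web visualisation in an SWT Browser widget. */
public class GlobalViewBrowser {
    public static void main(String[] args) {
        Display display = new Display();
        Shell shell = new Shell(display);
        shell.setLayout(new FillLayout());
        Browser browser = new Browser(shell, SWT.NONE);
        browser.setUrl("http://server.swarmdebugging.org"); // GV server
        shell.open();
        while (!shell.isDisposed()) {
            if (!display.readAndDispatch()) display.sleep();
        }
        display.dispose();
    }
}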

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 on Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, above the adjacent nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig 4 GV on all tasks
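A minimal sketch of the aggregation behind this view is shown below: each Step Into recorded during a session yields a caller-to-callee invocation, and invocations are grouped per task into weighted edges (rendered as edge colour and line thickness). The class and method names are illustrative, not the actual GV implementation.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative aggregation of recorded invocations into weighted GV edges. */
public class InvocationAggregator {

    /** One Step Into recorded during a debugging session. */
    record Invocation(String task, String callerType, String calleeType) {}

    /** Key "task|caller|callee" -> number of recorded invocations (edge weight). */
    public static Map<String, Integer> aggregate(Iterable<Invocation> invocations) {
        Map<String, Integer> edgeWeights = new HashMap<>();
        for (Invocation inv : invocations) {
            String edge = inv.task() + "|" + inv.callerType() + "|" + inv.calleeType();
            edgeWeights.merge(edge, 1, Integer::sum);
        }
        return edgeWeights;
    }

    public static void main(String[] args) {
        List<Invocation> sample = List.of(
                new Invocation("667", "BasePanel", "JabRefDesktop"),
                new Invocation("667", "BasePanel", "JabRefDesktop"),
                new Invocation("669", "OpenDatabaseAction", "BibtexParser"));
        aggregate(sample).forEach((edge, weight) ->
                System.out.println(edge + " x" + weight));
    }
}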

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

      5 Using SDI to Understand Debugging Activities

The first benefit of SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


      Table 1 Summary of the issues considered in JabRef in Study 1

      Issues Summaries

318    "Normalize to Bibtex name format"

667    "hash/pound sign causes URL link to fail"

669    "JabRef 3.1/3.2 writes bib file in a format that it will not read"

993    "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"

1026   "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who took part in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (see the illustrative snippet after this list):
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
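The following illustrative Java method, which is not taken from the study materials or from JabRef, marks the five statement types on which we classified breakpoints.

/** Illustrative only: the five statement types used to classify breakpoints. */
public class StatementKinds {

    static int[] values = {3, 1, 2};

    public static int sum(int limit) {
        int total = 0;                 // assignment
        int i = 0;
        while (i < values.length) {    // while-loop
            if (values[i] > limit) {   // if-statement
                total += values[i];    // assignment
            }
            i++;
        }
        log(total);                    // call (method/function invocation)
        return total;                  // return
    }

    static void log(int value) {
        System.out.println("sum = " + value);
    }

    public static void main(String[] args) {
        sum(0);
    }
}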

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T - ST
– Elapsed Time to First Breakpoint (EF): EF = FB - ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
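For instance, in a hypothetical session where a participant sets her first breakpoint 5 minutes into a task that takes 20 minutes in total, Equation 1 gives:

MFB = EF / ET = 5 / 20 = 0.25

that is, a quarter of the session elapsed before the first breakpoint was set.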

      Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

      Tasks Average Times (min) Std Devs (min)

      318 44 64

      667 28 29

      669 22 25

      993 25 25

      1026 25 17

      PdfSam 54 18

      Raptor 59 13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a So this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint. The fitted curve is given by Equation 2:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

      Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints     Percentage

      call 111 53

      if-statement 39 19

      assignment 36 17

      return 18 10

      while-loop 3 1

      Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints     Percentage

      call 43 43

      if-statement 22 22

      assignment 27 27

      return 4 4

      while-loop 4 4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

      Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

      Tasks Classes Lines of Code Breakpoints

      0318 AuthorsFormatter 43 5

      0318 AuthorsFormatter 131 3

      0667 BasePanel 935 2

      0667 BasePanel 969 3

      0667 JabRefDesktop 430 2

      0669 OpenDatabaseAction 268 2

      0669 OpenDatabaseAction 433 4

      0669 OpenDatabaseAction 451 4

      0993 EntryEditor 717 2

      0993 EntryEditor 720 2

      0993 EntryEditor 723 2

      0993 BibDatabase 187 2

      0993 BibDatabase 456 2

      1026 EntryEditor 1184 2

      1026 BibtexParser 160 2


      Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

      Classes Lines of Code Breakpoints

      PdfReader 230 2

      PdfReader 806 2

      PdfReader 1923 2

      ConsoleServicesFacade 89 2

      ConsoleClient 81 2

      PdfUtility 94 2

      PdfUtility 96 2

      PdfUtility 102 2

      Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

      Classes Lines of Code Breakpoints

      icsUtils 333 3

      Game 1751 2

      ExamineController 41 2

      ExamineController 84 3

      ExamineController 87 2

      ExamineController 92 2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing comparison of breakpoints across tasks.)


      Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

      Classes Lines of Code Breakpoints

BibtexParser                138, 151, 159, 160, 165, 168, 176, 198, 199, 299    2, 2, 2, 3, 2, 3, 2, 2, 2, 2

EntryEditor                 717, 720, 721, 723, 837, 842, 1184, 1393            3, 4, 2, 2, 3, 2, 3, 2

BibDatabase                 175, 187, 223, 456                                   2, 3, 2, 6

OpenDatabaseAction          433, 450, 451                                        4, 2, 4

JabRefDesktop               40, 84, 430                                          2, 2, 3

SaveDatabaseAction          177, 188                                             4, 2

BasePanel                   935, 969                                             2, 5

AuthorsFormatter            43, 131                                              5, 4

EntryTableTransferHandler   346                                                  2

FieldTextMenu               84                                                   2

JabRefFrame                 1119                                                 2

JabRefMain                  8                                                    5

URLUtil                     95                                                   2

      Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


      Table 9 Study 1 - Breakpoints by class across different tasks

      Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

      SaveDatabaseAction Yes Yes Yes 7 2

      BasePanel Yes Yes Yes Yes 14 7

      JabRefDesktop Yes Yes 9 4

      EntryEditor Yes Yes Yes 36 4

      BibtexParser Yes Yes Yes 44 6

      OpenDatabaseAction Yes Yes Yes 19 13

      JabRef Yes Yes Yes 3 3

      JabRefMain Yes Yes Yes Yes 5 4

      URLUtil Yes Yes 4 2

      BibDatabase Yes Yes Yes 19 4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

      6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

      Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in the qualitative evaluation of GV and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, answering questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669 using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) video recording tool.

6.2 Results

      We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

      Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

      Fig 9 GV for Task 0667


      Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without having seen the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected node. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


      Fig 11 GV usefulness - experimental phase one

      Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort of searching the code.


      Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026

Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

      7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, thus improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.
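A possible shape for such a breakpoint search is sketched below: query the shared repository for the breakpoints that previous developers set on a given type. The URL and query parameter are assumptions made for illustration, not the actual SDI service routes.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch of a breakpoint search against a shared debugging repository. */
public class BreakpointSearch {

    public static void main(String[] args) throws Exception {
        // Hypothetical query endpoint and parameter.
        URL url = new URL("http://server.swarmdebugging.org/breakpoints?type=BibtexParser");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // e.g. entries with class, line number, task
            }
        }
    }
}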

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

      8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

      9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

      Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the time actually saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debugging, an approach that supports back-in-time navigation across previous program states. Hofer et al. [53] discuss delta debugging, which builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many debugging tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should follow during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies performed static change-impact analysis before making changes, using the navigational functionalities of the IDE. They also performed dynamic change-impact analysis after making changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

      10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the observation that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because they are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and of sharing debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions, performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining the stepping paths of several debugging sessions into a graph visualisation provides elements that support developers' hypotheses about fault locations without requiring them to look at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on developers' use of breakpoints. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis of a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

      11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

      References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


      Appendix - Implementation of Swarm Debugging

      Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialised persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

      Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


      Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
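For illustration only, the JSON returned for such a query might look like the following sketch; the field names and the "_embedded" wrapper are assumptions based on Spring Data REST defaults, not the exact SDS payload:

  {
    "_embedded": {
      "developers": [
        { "id": 1, "name": "petrillo" }
      ]
    }
  }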

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
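As an illustration, a query in the spirit of this console could aggregate shared breakpoints per type; the table and column names below are hypothetical, since the actual schema follows the SDS meta-model:

  -- Count collected breakpoints per type, most debugged types first
  SELECT t.full_name, COUNT(b.id) AS breakpoint_count
  FROM breakpoint b
  JOIN type t ON b.type_id = t.id
  GROUP BY t.full_name
  ORDER BY breakpoint_count DESC;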

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

      Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


      Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
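For illustration, a Cypher query over such a graph could list the invocations recorded for a given session and count how often each pair was stepped through; the node labels, relationship types, and property names below are hypothetical, as the actual schema is defined by the SDS:

  MATCH (s:Session {id: 42})-[:RECORDED]->(m:Method)-[inv:INVOKES]->(callee:Method)
  RETURN m.name AS invoking, callee.name AS invoked, count(inv) AS times
  ORDER BY times DESC;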

      Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
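A minimal sketch of such a tracer, using the standard Eclipse debug platform APIs, is shown below; the class is simplified for illustration (the actual SDT adds authentication, session management, and the RESTful calls to the SDS):

  import org.eclipse.core.resources.IMarkerDelta;
  import org.eclipse.debug.core.DebugEvent;
  import org.eclipse.debug.core.DebugPlugin;
  import org.eclipse.debug.core.IBreakpointListener;
  import org.eclipse.debug.core.IDebugEventSetListener;
  import org.eclipse.debug.core.model.IBreakpoint;

  public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

      // Register the tracer with the Eclipse debug platform.
      public void start() {
          DebugPlugin.getDefault().addDebugEventListener(this);
          DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
      }

      @Override
      public void handleDebugEvents(DebugEvent[] events) {
          for (DebugEvent event : events) {
              // A SUSPEND event with STEP_END detail marks the end of a Step Into/Over/Return;
              // here the stack trace would be inspected and the invocation sent to the SDS.
              if (event.getKind() == DebugEvent.SUSPEND
                      && event.getDetail() == DebugEvent.STEP_END) {
                  // extract invoking/invoked methods from the suspended thread
              }
          }
      }

      @Override
      public void breakpointAdded(IBreakpoint breakpoint) {
          // persist the breakpoint location (type, method, line) through the SDS RESTful API
      }

      @Override
      public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

      @Override
      public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
  }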

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


      Fig 17 The Swarm Tracer architecture [17]

      Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and the related data.


      Fig 19 Breakpoint search tool (fuzzy search example)

It also stores data about the called methods, keeping an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

      Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


      Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

      Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
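Such a fuzzy search could be expressed, for instance, as an ElasticSearch query of the following form; the index and field names are hypothetical:

  POST /breakpoints/_search
  {
    "query": {
      "fuzzy": {
        "typeName": {
          "value": "fcatory",
          "fuzziness": "AUTO"
        }
      }
    }
  }

With fuzziness enabled, the misspelled term still matches breakpoints toggled in types whose names contain "factory".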


      Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
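The sets α and β, and thus the starting and ending methods, can be computed directly from the collected invocation pairs. The sketch below illustrates the idea in plain Java; methods are simplified to strings here, whereas the SDS works on its Invocation entities:

  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  public class StartEndFinder {

      /** One collected invocation: invoking method -> invoked method. */
      public record Invocation(String invoking, String invoked) {}

      // Starting methods invoke others but are never invoked (alpha \ beta);
      // ending methods are invoked but never invoke anything (beta \ alpha).
      public static Set<String> startingPoints(List<Invocation> invocations) {
          Set<String> alpha = new HashSet<>();
          Set<String> beta = new HashSet<>();
          for (Invocation inv : invocations) {
              alpha.add(inv.invoking());
              beta.add(inv.invoked());
          }
          Set<String> starts = new HashSet<>(alpha);
          starts.removeAll(beta);
          return starts;
      }

      public static Set<String> endingPoints(List<Invocation> invocations) {
          Set<String> alpha = new HashSet<>();
          Set<String> beta = new HashSet<>();
          for (Invocation inv : invocations) {
              alpha.add(inv.invoking());
              beta.add(inv.invoked());
          }
          Set<String> ends = new HashSet<>(beta);
          ends.removeAll(alpha);
          return ends;
      }
  }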

      Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


        We also conducted with 30 professional developers two studies a qualitativeevaluation and a controlled experiment to assess whether debugging sessionsshared through our Global View visualisation support developers in theirdebugging tasks and is useful for sharing debugging tasks among developers(R5 and RQ6) We collected participantsrsquo answers in electronic forms and morethan 3 hours of debugging sessions on video

        This paper has the following contributions

        ndash We introduce a novel approach for debugging named Swarm Debugging(SD) based on the concept of Swarm Intelligence and Information ForagingTheory

        ndash We present an infrastructure the Swarm Debugging Infrastructure (SDI)to gather store and share data about interactive debugging activities tosupport SD

        ndash We provide evidence about the relation between tasksrsquo elapsed time de-velopersrsquo expertise breakpoints setting and debugging patterns

        ndash We present a new visualisation technique Global View (GV) built onshared debugging sessions by developers to ease debugging

        ndash We provide evidence about the usefulness of sharing debugging session toease developersrsquo debugging

        This paper extends our previous works [161718] as follows First we sum-marize the main characteristics of the Swarm Debugging approach providinga theoretical foundation to Swarm Debugging using Swarm Intelligence andInformation Foraging Theory Second we present the Swarm Debugging Infras-tructure (SDI) Third we perform an experiment on the debugging behaviorof 30 professional developers to evaluate if sharing debugging sessions supportsadequately their debugging tasks

        The remainder of this article is organized as follows Section 2 providessome fundamentals of debugging and the foundations of SD the conceptsof swarm intelligence and information foraging theory Section 3 describesour approach and its implementation the Swarm Debugging InfrastructureSection 6 presents an experiment to assess the benefits that our SD approachcan bring to developers and Section 5 reports two experiments that wereconducted using SDI to understand developers debugging habits Next Section7 discusses implications of our results while Section 8 presents threats to thevalidity of our study Section 9 summarizes related work and finally Section10 concludes the paper and outlines future work

        2 Background

        This section provides background information about the debugging activityand setting breakpoints In the following we use failures as unintended be-haviours of a program ie when the program does something that it shouldnot and faults as the incorrect statements in source code causing failuresThe purpose of debugging is to locate and correct faults hence to fix failures

        Swarm Debugging the Collective Intelligence on Interactive Debugging 5

        21 Debugging and Interactive Debugging

        The IEEE Standard Glossary of Software Engineering Terminology (see thedefinition at the beginning of Section 1) defines debugging as the act of de-tecting locating and correcting bugs in a computer program Debugging tech-niques include the use of breakpoints desk checking dumps inspection re-versible execution single-step operations and traces

        Araki et al [19] describe debugging as a process where developers makehypotheses about the root-cause of a problem or defect and verify these hy-potheses by examining different parts of the source code of the program

        Interactive debugging consists of using a tool ie a debugger to detectlocate and correct a fault in a program It is a process also known as programanimation stepping or following execution [20] Developers often refer to thisprocess simply as debugging because several IDEs provide debuggers to sup-port debugging However it must be noted that while debugging is the processof finding faults interactive debugging is one particular debugging approachin which developers use interactive tools Expressions such as interactive de-bugging stepping and debugging are used interchangeably and there is not yeta consensus on what is the best name for this process

        22 Breakpoints and Supporting Mechanisms

        Generally breakpoints allow pausing intentionally the execution of a programfor debugging purposes a means of acquiring knowledge about a program dur-ing its execution for example to examine the call stack and variable valueswhen the control flow reaches the locations of the breakpoints Thus a break-point indicates the location (line) in the source code of a program where apause occurs during its execution

        Depending on the programming language its run-time environment (inparticular the capabilities of its virtual machines if any) and the debuggersdifferent types of breakpoints may be available to developers These types in-clude static breakpoints [21] that pause unconditionally the execution of aprogram and dynamic breakpoints [22] that pause depending on some con-ditions or threads or numbers of hits

        Other types of breakpoints include watchpoints that pause the executionwhen a variable being watched is read andndashor written IDEs offer the meansto specify the different types of breakpoints depending on the programminglanguages and their run-time environment Fig 1-A and 1-B show examples ofstatic and dynamic breakpoints in Eclipse In the rest of this paper we focuson static breakpoints because they are the most used of all types [14]

        There are different mechanisms for setting a breakpoint within the code

        6Please give a shorter version with authorrunning and titlerunning prior to maketitle

        Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B)using Eclipse IDE

        ndash GUI Most IDEs or browsers offer a visual way of adding a breakpoint usu-ally by clicking at the beginning of the line on which to set the breakpointChrome1 Visual Studio2 IntelliJ 3 and Xcode4

        ndash Command line Some programming languages offer debugging tools on thecommand line so an IDE is not necessary to debug the code JDB5 PDB6and GDB7

        ndash Code Some programming languages allow using syntactical elements to setbreakpoints as they were lsquoannotationsrsquo in the code This approach oftenonly supports the setting of a breakpoint and it is necessary to use itin conjunction with the command line or GUI Some examples are Rubydebugger8 Firefox 9 and Chrome10

        There is a set of features in a debugger that allows developers to control theflow of the execution within the breakpoints ie Call Stack features whichenable continuing or stepping

        A developer can opt for continuing in which case the debugger resumesexecution until the next breakpoint is reached or the program exits Con-versely stepping allows the developer to run step by step the entire program

        1httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

        2httpsmsdnmicrosoftcomen-uslibrary5557y8b4aspx

        3httpswwwjetbrainscomhelpidea20163debugger-basicshtml

        4httpjeffreysambellscom20140114using-breakpoints-in-xcode

        5httpdocsoraclecomjavase7docstechnotestoolswindowsjdbhtml

        6httpsdocspythonorg2librarypdbhtml

        7ftpftpgnuorgoldgnuManualsgdb511html nodegdb 37html

        8httpsgithubcomcldwalkerdebugger

        9httpsdevelopermozillaorgpt-BRdocsWebJavaScriptReferenceStatementsdebugger

        10httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

        Swarm Debugging the Collective Intelligence on Interactive Debugging 7

        flow The definition of a step varies across programming languages and debug-gers but it generally includes invoking a method and executing a statementWhile Stepping a developer can navigate between steps using the followingcommands

        ndash Step Over the debugger steps over a given line If the line contains afunction then the function is executed and the result returned withoutstepping through each of its lines

        ndash Step Into the debugger enters the function at the current line and continuestepping from there line-by-line

        ndash Step Out this action would take the debugger back to the line where thecurrent function was called

        To start an interactive debugging session developers set a breakpoint Ifnot the IDE would not stop and enter its interactive mode For exampleEclipse IDE automatically opens the ldquoDebugging Perspectiverdquo when executionhits a breakpoint A developer can run a system in debugging mode withoutsetting breakpoints but she must set a breakpoint to be able to stop theexecution step in and observe variable states Briefly there is no interactivedebugging session without at least one breakpoint set in the codeFinally some debuggers allow debugging remotely for example to performhot-fixes or to test mobile applications and systems operating in remote con-figurations

        23 Self-organization and Swarm Intelligence

        Self-organization is a concept emerged from Social Sciences and Biology and itis defined as the set of dynamic mechanisms enabling structures to appear atthe global level of a system from interactions among its lower-level componentswithout being explicitly coded at the lower levels Swarm intelligence (SI)describes the behavior resulting from the self-organization of social agents(as insects) [23] Ant nests and the societies that they house are examples ofSI [24] Individual ants can only perform relatively simple activities yet thewhole colony can collectively accomplish sophisticated activities Ants achieveSI by exchanging information encoded as chemical signalsmdashpheromones egindicating a path to follow or an obstacle to avoid

        Similarly SI could be used as a metaphor to understand or explain thedevelopment of a multiversion large and complex software systems built bysoftware teams Individual developers can usually perform activities withouthaving a global understanding of the whole system [25] In a birdrsquos eye viewsoftware development is analogous to some SI in which groups of agents in-teracting locally with one another and with their environment and follow-ing simple rules lead to the emergence of global behaviors previously un-knownimpossible to the individual agents We claim that the similarities be-tween the SI of ant nests and complex software systems are not a coincidenceCockburn [26] suggested that the best architectures requirements and designs

        8Please give a shorter version with authorrunning and titlerunning prior to maketitle

        emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

        Dev1

        Dev2

        Dev3

        DevN

        VisualisationsSearching Tools

        Recommendation Systems

        Single Debugging Session Crowd Debugging Sessions Debugging Information

        Positive feedback

        Collect data Store data

        Transform information

        A B C

        D

        Fig 2 Overview of the Swarm Debugging approach

        24 Information Foraging

        Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

        However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

        These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

        3 The Swarm Debugging Approach

        Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

        Swarm Debugging the Collective Intelligence on Interactive Debugging 9

        First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

        We chose to instrument the Eclipse IDE a popular IDE to implementSwarm Debugging and to reach a large number of users Also we use services inthe cloud to collect the debugging events to process these events and to providevisualizations and recommendations from these events Thus we decoupleddata collection from data usage allowing other researcherstools vendors touse the collected data

        During debugging developers analyze the code toggling breakpoints andstepping in and through statements While traditional dynamic analysis ap-proaches collect all interactions states or events SD collects only invocationsexplicitly explored by developers SDI collects only visited areas and paths(chains of invocations by egStep Into or F5 in Eclipse IDE) and thus doesnot suffer from performance or memory issues as omniscient debuggers [29] ortracing-based approaches could

        Our decision to record information about breakpoints and stepping is wellsupported by a study from Beller et al [30] A finding of this study is thatsetting breakpoints and stepping through code are the most used debuggingfeatures They showed that most of the recorded debugging events are relatedto the creation (4544) removal (4362) or adjustment of breakpoints hittingthem during debugging and stepping through the source code Furthermoreother advanced debugging features like defining watches and modifying vari-able values have been much less used [30]

        4 SDI in a Nutshell

        To evaluate the Swarm Debugging approach we have implemented the SwarmDebugging Infrastructure (see httpsgithubcomSwarmDebugging)The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools forcollecting storing sharing retrieving and visualizing data collected duringdevelopersrsquo debugging activities The SDI is an Eclipse IDE11 plug-in inte-grated with Eclipse Debug core It is organized in three main modules (1) theSwarm Debugging Services (2) the Swarm Debugging Tracer and (3) Swarm

        11 httpswwweclipseorg

        10Please give a shorter version with authorrunning and titlerunning prior to maketitle

        Fig 3 GV elements - Types (nodes) invocations (edge) and Task filter area

        Debugging Views All the implementation details of SDI are available in theAppendix section

        41 Swarm Debugging Global View

        Swarm Debugging Global View (GV) is a call graph for modeling softwarebased on directed call graph [31] to explicit the hierarchical relationship byinvocated methods This visualization use rounded gray boxes (Figure 3-A) torepresent types or classes (nodes) and oriented arrows (Figure 3-B) to expressinvocations (edges) GV is built using previous debugging session context datacollected by developers for different tasks

        GV was implemented using CytoscapeJS [32] a Graph API JavaScriptframework applying an automatic layout manager breadthfirst As a web appli-cation the SD visualisations can be integrated into an Eclipse view as an SWTBrowser Widget or accessed through a traditional browser such as MozillaFirefox or Google Chrome

        In this view the grey boxes are types that developers visited during debug-ging sessions The edges represent method calls (Step Into or F5 on Eclipse)performed by all developers in all traced tasks on a software project Eachedge colour represents a task and line thickness is proportional to the numberof invocations Each debugging session contributes with a context generat-ing the visualisation combining all collected invocations The visualisation isorganised in layers or stacks and each line is a layer of invocations The start-ing points (non-invoked methods) are allocated on top of a tree the adjacent

        Swarm Debugging the Collective Intelligence on Interactive Debugging 11

        Fig 4 GV on all tasks

        nodes in an invocation sequence Besides developers can directly go to a typein the Eclipse Editor by double-clicking over a node in the diagram In the leftcorner developers can use radio buttons to filter invocations by task (figure 3-C) showing the paths used by developers during previous debugging sessionsby a task Finally developers can use the mouse to pan and zoom inout onthe visualisation Figure 4 shows an example of GV with all tasks for JabRefsystem and we have data about 8 tasks

        GV is a contextual visualization that shows only the paths explicitlyand intentionally visited by developers including type declarations andmethod invocations explored by developers based on their decisions

        5 Using SDI to Understand Debugging Activities

        The first benefit of SDI is the fact that it allows for collecting detailed in-formation about debugging sessions Using this information researchers caninvestigate developers behaviors during debugging activities To illustrate thispoint we conducted two experiments using SDI to understand developers de-bugging habits the times and effort with which they set breakpoints and thelocations where they set breakpoints

        Our analysis builds upon three independent sets of observations involvingin total three systems Studies 1 and 2 involved JabRef PDFSaM and Raptoras subject systems We analysed 45 video-recorded debugging sessions avail-able from our own collected videos (Study 1) and an empirical study performedby Jiang et al [33] (Study 2)

        In this study we answered the following research questions

        RQ1 Is there a correlation between the time of the first breakpoint and a de-bugging taskrsquos elapsed time

        RQ2 What is the effort in time for setting the first breakpoint in relation to thedebugging taskrsquos elapsed time

        12Please give a shorter version with authorrunning and titlerunning prior to maketitle

        RQ3 Are there consistent common trends with respect to the types of state-ments on which developers set breakpoints

        RQ4 Are there consistent common trends with respect to the lines methodsor classes on which developers set breakpoints

        In this section we elaborate more on each of the studies

        51 Study 1 Observational Study on JabRef

        511 Subject System

        To conduct this first study we selected JabRef12 version 32 as subject sys-tem This choice was motivated by the fact that JabRefrsquos domain is easy tounderstand thus reducing any learning effect It is composed of relatively inde-pendent packages and classes ie high cohesion low coupling thus reducingthe potential commingle effect of low code quality

        512 Participants

        We recruited eight male professional developers via an Internet-based free-lancer service13 Two participants are experts and three are intermediate inJava Developers self-reported their expertise levels which thus should betaken with caution Also we recruited 12 undergraduate and graduate stu-dents at Polytechnique Montreal to participate in our study We surveyedall the participantsrsquo background information before the study14 The surveyincluded questions about participantsrsquo self-assessment on their level of pro-gramming expertise (Java IDE and Eclipse) gender first natural languageschooling level and knowledge about TDD interactive debugging and whyusually they use a debugger All participants stated that they had experiencein Java and worked regularly with the debugger of Eclipse

        513 Task Description

        We selected five defects reported in the issue-tracking system of JabRef Wechose the task of fixing the faults that would potentially require developers toset breakpoints in different Java classes To ensure this we manually conductedthe debugging ourselves and verified that for understanding the root causeof the faults we had to set at least two breakpoints during our interactivedebugging sessions Then we asked participants to find the locations of thefaults described in Issues 318 667 669 993 and 1026 Table 1 summarisesthe faults using their titles from the issue-tracking system

        12 httpwwwjabreforg13 httpswwwfreelancercom14 Survey available on httpsgooglformsdxCQaBke2l2cqjB42

        Swarm Debugging the Collective Intelligence on Interactive Debugging 13

        Table 1 Summary of the issues considered in JabRef in Study 1

        Issues Summaries

        318 ldquoNormalize to Bibtex name formatrdquo

        667 ldquohashpound sign causes URL link to failrdquo

        669 ldquoJabRef 3132 writes bib file in a format

        that it will not readrdquo

        993 ldquoIssues in BibTeX source opens save dialog

        and opens dialog Problem with parsing entryrsquo

        multiple timesrdquo

        1026 ldquoJabref removes comments

        inside the Bibtex coderdquo

        514 Artifacts and Working Environment

        We provided the participants with a tutorial15 explaining how to install andconfigure the tools required for the study and how to use them through awarm-up task We also presented a video16 to guide the participants during thewarm-up task In a second document we described the five faults and the stepsto reproduce them We also provided participants with a video demonstratingstep-by-step how to reproduce the five defects to help them get started

        We provided a pre-configured Eclipse workspace to the participants andasked them to install Java 8 Eclipse Mars 2 with the Swarm Debugging Tracerplug-in [17] to collect automatically breakpoint-related events The Eclipseworkspace contained two Java projects a Tetris game for the warm-up taskand JabRef v32 for the study We also required that the participants installand configure the Open Broadcaster Software17 (OBS) open-source softwarefor live streaming and recording We used the OBS to record the participantsrsquoscreens

        515 Study Procedure

        After installing their environments we asked participants to perform a warm-up task with a Tetris game The task consisted of starting a debugging sessionsetting a breakpoint and debugging the Tetris program to locate a givenmethod We used this task to confirm that the participantsrsquo environmentswere properly configured and also to accustom the participants with the studysettings It was a trivial task that we also used to filter the participants whowould have too little knowledge of Java Eclipse and Eclipse Java debugger

        15 httpswarmdebuggingorgpublication16 httpsyoutubeU1sBMpfL2jc17 httpsobsprojectcom

        14Please give a shorter version with authorrunning and titlerunning prior to maketitle

        All participants who participated in our study correctly executed the warm-uptask

        After performing the warm-up task each participant performed debuggingto locate the faults We established a maximum limit of one-hour per task andinformed the participants that the task would require about 20 minutes foreach fault which we will discuss as a possible threat to validity We based thislimit on previous experiences with these tasks during mock trials After theparticipants performed each task we asked them to answer a post-experimentquestionnaire to collect information about the study asking if they found thefaults where were the faults why the faults happened if they were tired anda general summary of their debugging experience

        516 Data Collection

        The Swarm Debugging Tracer plug-in automatically and transparently col-lected all debugging data (breakpoints stepping method invocations) Alsowe recorded the participantrsquos screens during their debugging sessions withOBS We collected the following data

        ndash 28 video recordings one per participant and task which are essential tocontrol the quality of each session and to produce a reliable and repro-ducible chain of evidence for our results

        ndash The statements (lines in the source code) where the participants set break-points We considered the following types of statements because they arerepresentative of the main concepts in any programming languagesndash call methodfunction invocationsndash return returns of valuesndash assignment settings of valuesndash if-statement conditional statementsndash while-loop loops iterations

        ndash Summaries of the results of the study one per participant via a question-naire which included the following questionsndash Did you locate the faultndash Where was the faultndash Why did the fault happenndash Were you tiredndash How was your debugging experience

        Based on this data we obtained or computed the following metrics perparticipant and task

        ndash Start Time (ST ) the timestamp when the participant started a task Weanalysed each video and we started to count when effectively the partic-ipant started a task ie when she started the Swarm Debugging Tracerplug-in for example

        ndash Time of First Breakpoint (FB) the time when the participant set her firstbreakpoint

        ndash End time (T ) the time when the participant finished a task

        Swarm Debugging the Collective Intelligence on Interactive Debugging 15

        ndash Elapsed End time (ET ) ET = T minus STndash Elapsed Time First Breakpoint (EF ) EF = FB minus ST

        We manually verified whether participants were successful or not at com-pleting their tasks by analysing the answers provided in the questionnaireand the videos We knew the locations of the faults because all tasks weresolved by JabRefrsquos developers who completed the corresponding reports inthe issue-tracking system with the changes that they made

        52 Study 2 Empirical Study on PDFSaM and Raptor

        The second study consisted of the re-analysis of 20 videos of debugging sessionsavailable from an empirical study on change-impact analysis with professionaldevelopers [33] The authors conducted their work in two phases In the firstphase they asked nine developers to read two fault reports from two open-source systems and to fix these faults The objective was to observe the devel-opersrsquo behaviour as they fixed the faults In the second phase they analysedthe developersrsquo behaviour to determine whether the developers used any toolsfor change-impact analysis and if not whether they performed change-impactanalysis manually

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault location tasks (developers did not correct the faults), while the ones in Study 2 were fault correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)    Std Devs (min)
318      44                     64
667      28                     29
669      22                     25
993      25                     25
1026     25                     17
PdfSam   54                     18
Raptor   59                     13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session^a. So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

^a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint. This relation can be approximated by the function

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 shows 17% while Study 2 shows 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments. For illustration, one possible way of automating such a classification is sketched after Table 4.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints    %
call           111                       53
if-statement   39                        19
assignment     36                        17
return         18                        10
while-loop     3                         1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints    %
call           43                        43
if-statement   22                        22
assignment     27                        27
return         4                         4
while-loop     4                         4
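The following sketch illustrates one possible way of automating the classification above with simple lexical heuristics; it is only an illustration under our own assumptions, the paper does not prescribe a particular classifier, and the regular expressions are deliberately simplified.

    import java.util.regex.Pattern;

    // Minimal sketch: heuristically tag a Java source line with one of the five statement types.
    public class StatementClassifier {

        private static final Pattern IF_STATEMENT = Pattern.compile("^\\s*if\\s*\\(");
        private static final Pattern WHILE_LOOP = Pattern.compile("^\\s*while\\s*\\(");
        private static final Pattern RETURN_STATEMENT = Pattern.compile("^\\s*return\\b");
        private static final Pattern ASSIGNMENT = Pattern.compile("^[^=<>!]*=[^=]");
        private static final Pattern CALL = Pattern.compile("\\w+\\s*\\(");

        public static String classify(String line) {
            if (IF_STATEMENT.matcher(line).find()) return "if-statement";
            if (WHILE_LOOP.matcher(line).find()) return "while-loop";
            if (RETURN_STATEMENT.matcher(line).find()) return "return";
            if (ASSIGNMENT.matcher(line).find()) return "assignment"; // checked before 'call': an assignment may contain a call
            if (CALL.matcher(line).find()) return "call";
            return "other";
        }

        public static void main(String[] args) {
            // Prints "call".
            System.out.println(classify("JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);"));
        }
    }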


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.
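The counting described above amounts to a simple grouping by class and line. The sketch below illustrates it on a few hypothetical records; the data format is a simplification, not the SDI schema.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    // Minimal sketch: count how many participants set a breakpoint on the same (task, class, line).
    public class RecurringBreakpoints {

        public static void main(String[] args) {
            // Hypothetical sample of collected breakpoints: {task, class, line, participant}.
            List<String[]> breakpoints = new ArrayList<>();
            breakpoints.add(new String[] {"0667", "BasePanel", "969", "P01"});
            breakpoints.add(new String[] {"0667", "BasePanel", "969", "P07"});
            breakpoints.add(new String[] {"0667", "BasePanel", "969", "P12"});
            breakpoints.add(new String[] {"0318", "AuthorsFormatter", "43", "P03"});

            Map<String, Integer> counts = new TreeMap<>();
            for (String[] bp : breakpoints) {
                String key = bp[0] + " " + bp[1] + ":" + bp[2];
                counts.merge(key, 1, Integer::sum);
            }
            // Lines with a count of two or more are the recurring breakpoints reported in Tables 5 to 7.
            counts.forEach((location, n) -> System.out.println(location + " -> " + n));
        }
    }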

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed if the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code         Breakpoints
BibtexParser                138, 151, 159         2, 2, 2
                            160, 165, 168         3, 2, 3
                            176, 198, 199, 299    2, 2, 2, 2
EntryEditor                 717, 720, 721         3, 4, 2
                            723, 837, 842         2, 3, 2
                            1184, 1393            3, 2
BibDatabase                 175, 187, 223, 456    2, 3, 2, 6
OpenDatabaseAction          433, 450, 451         4, 2, 4
JabRefDesktop               40, 84, 430           2, 2, 3
SaveDatabaseAction          177, 188              4, 2
BasePanel                   935, 969              2, 5
AuthorsFormatter            43, 131               5, 4
EntryTableTransferHandler   346                   2
FieldTextMenu               84                    2
JabRefFrame                 1119                  2
JabRefMain                  8                     5
URLUtil                     95                    2

Fig. 6 Methods with 5 or more breakpoints

Finally, we count how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints (of 5)   Breakpoints   Dev Diversities
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

        6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate if sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants).

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


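For illustration, the aggregation behind such a graph can be sketched as follows: every invocation stepped through in any previous session adds weight to a directed edge between the caller and callee types. The class and data shown are hypothetical simplifications of the SDI data, not GV's actual implementation.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Minimal sketch: aggregate invocations from several debugging sessions into a weighted call graph.
    public class InvocationGraph {

        // Edge "Caller -> Callee" mapped to the number of recorded invocations.
        private final Map<String, Integer> edges = new LinkedHashMap<>();

        public void addInvocation(String callerType, String calleeType) {
            edges.merge(callerType + " -> " + calleeType, 1, Integer::sum);
        }

        public void print() {
            edges.forEach((edge, weight) -> System.out.println(edge + " (weight " + weight + ")"));
        }

        public static void main(String[] args) {
            InvocationGraph graph = new InvocationGraph();
            // Hypothetical invocations stepped through in two different sessions.
            graph.addInvocation("BasePanel", "JabRefDesktop");
            graph.addInvocation("BasePanel", "JabRefDesktop");
            graph.addInvocation("EntryEditor", "BibtexParser");
            graph.print();
        }
    }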

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the type, double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigating between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

        7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.
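As an illustration of the kind of breakpoint search we have in mind, the sketch below filters previously shared breakpoints by type name; the in-memory list and the query are hypothetical stand-ins for the actual SDI services.

    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;

    // Minimal sketch: retrieve previously shared breakpoints for a given type as candidate starting points.
    public class BreakpointSearch {

        // Hypothetical, simplified records of shared breakpoints: "type:line (task)".
        private static final List<String> SHARED = Arrays.asList(
                "BibtexParser:160 (1026)",
                "EntryEditor:717 (0993)",
                "BasePanel:969 (0667)");

        public static List<String> byType(String typeName) {
            return SHARED.stream()
                         .filter(bp -> bp.startsWith(typeName + ":"))
                         .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            // A developer starting a new session on BibtexParser gets candidate starting points.
            System.out.println(byType("BibtexParser"));
        }
    }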

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks, by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

        8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We ask participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that all participants did not look at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

        9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful for editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive, methods such as slicing and query languages help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

        10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

        11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

        References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573

8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575

22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6

25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract

        33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

        34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

        35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

        36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

        doiacmorg1011452622669

        37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


        Appendix - Implementation of Swarm Debugging

        Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

        Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


        Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.
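To make the meta-model more concrete, the following minimal sketch shows how two of these concepts could be represented as plain Java classes; the field names are assumptions made for this illustration and are not necessarily the actual SDI schema.

    // Minimal sketch of two Swarm Debugging domain concepts (illustrative only;
    // field names are assumptions, not the actual SDI schema).
    class Session {
        long id;
        String developerName;   // the Developer running this session
        String projectName;     // the Product under debugging
        String taskLabel;       // the Task, e.g., an issue number
        long startedAt;         // epoch milliseconds
    }

    class Breakpoint {
        long sessionId;         // owning Session
        String typeName;        // fully qualified type where the breakpoint was toggled
        String methodName;      // enclosing method, if any
        int lineNumber;         // source line of the breakpoint
    }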

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

    http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
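For illustration, such a request could be issued from plain Java as sketched below; the endpoint is the one quoted above, and the sketch only assumes that the service answers with a JSON document.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Illustrative client for the Swarm RESTful API (sketch; error handling omitted).
    public class SwarmApiExample {
        public static void main(String[] args) throws Exception {
            URL url = new URL(
                "http://swarmdebugging.org/developers/search/findByName?name=petrillo");
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setRequestMethod("GET");
            con.setRequestProperty("Accept", "application/json");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(con.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // JSON list of matching developers
                }
            }
        }
    }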

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides an ElasticSearch31, which is a highly scalable open-source full-text search and analytic engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

        Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


        Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
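To give a flavour of such queries, the sketch below runs a Cypher pattern through the Neo4j Java driver. The node label and relationship type used here (Method, INVOKES) are assumptions made for this illustration and may differ from the actual Swarm Debugging graph model.

    import org.neo4j.driver.v1.AuthTokens;
    import org.neo4j.driver.v1.Driver;
    import org.neo4j.driver.v1.GraphDatabase;
    import org.neo4j.driver.v1.Record;
    import org.neo4j.driver.v1.Session;
    import org.neo4j.driver.v1.StatementResult;

    // Illustrative Cypher query over an invocation graph (schema names are assumptions).
    public class CypherExample {
        public static void main(String[] args) {
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                StatementResult result = session.run(
                    "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                    + "RETURN a.name AS caller, b.name AS callee LIMIT 25");
                while (result.hasNext()) {
                    Record row = result.next();
                    System.out.println(row.get("caller").asString()
                            + " -> " + row.get("callee").asString());
                }
            }
        }
    }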

        Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA integration, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
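A minimal sketch of such a tracer is shown below, using the standard Eclipse Debug core listener interfaces; the forwarding of the captured data to the Swarm services is reduced to console output in this illustration.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    // Sketch of a debug tracer registering the two listeners mentioned above.
    public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.addDebugEventListener(this);
            plugin.getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // SUSPEND events with detail STEP_END correspond to completed step actions.
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    System.out.println("Step finished on: " + event.getSource());
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            System.out.println("Breakpoint toggled in: "
                    + breakpoint.getMarker().getResource());
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }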

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


        Fig 17 The Swarm Tracer architecture [17]

        Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


        Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

        Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


        Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

        Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
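For illustration, a fuzzy query like the one in Figure 19 could be sent to an ElasticSearch index roughly as sketched below; the index name and field name (breakpoints, typeName) are assumptions made for this example, not the actual SDS index layout.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    // Illustrative ElasticSearch fuzzy query (index and field names are assumptions).
    public class BreakpointFuzzySearchExample {
        public static void main(String[] args) throws Exception {
            String body = "{ \"query\": { \"fuzzy\": { \"typeName\": \"fcatory\" } } }";
            URL url = new URL("http://localhost:9200/breakpoints/_search");
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);
            try (OutputStream out = con.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            // The hits would list close matches such as types whose names contain "factory".
            System.out.println("HTTP " + con.getResponseCode());
        }
    }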


        Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α and VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β and VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
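As an illustration, the sketch below computes these two sets from a list of invocation edges, following the definition above.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Computes starting methods (never invoked) and ending methods (never invoking)
    // from invocation edges <invoking, invoked>, following the definitions above.
    class InvocationEdge {
        final String invoking;
        final String invoked;
        InvocationEdge(String invoking, String invoked) {
            this.invoking = invoking;
            this.invoked = invoked;
        }
    }

    class StartEndFinder {
        static Set<String> startingPoints(List<InvocationEdge> edges) {
            Set<String> invoking = new HashSet<>();
            Set<String> invoked = new HashSet<>();
            for (InvocationEdge e : edges) {
                invoking.add(e.invoking);
                invoked.add(e.invoked);
            }
            Set<String> starts = new HashSet<>(invoking);
            starts.removeAll(invoked);   // invokes others, never invoked itself
            return starts;
        }

        static Set<String> endingPoints(List<InvocationEdge> edges) {
            Set<String> invoking = new HashSet<>();
            Set<String> invoked = new HashSet<>();
            for (InvocationEdge e : edges) {
                invoking.add(e.invoking);
                invoked.add(e.invoked);
            }
            Set<String> ends = new HashSet<>(invoked);
            ends.removeAll(invoking);    // invoked by others, never invokes
            return ends;
        }
    }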

        Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process where developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging because several IDEs provide debuggers to support debugging. However, it must be noted that, while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on what is the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes: a means of acquiring knowledge about a program during its execution, for example to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints depending on the programming languages and their run-time environment. Fig. 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].

There are different mechanisms for setting a breakpoint within the code:


Fig. 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: Most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint: Chrome1, Visual Studio2, IntelliJ3, and Xcode4.
– Command line: Some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code: JDB5, PDB6, and GDB7.
– Code: Some programming languages allow using syntactical elements to set breakpoints as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are the Ruby debugger8, Firefox9, and Chrome10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run step by step the entire program

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


flow. The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, the function is executed and the result returned without stepping through each of its lines.
– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.
– Step Out: this action takes the debugger back to the line where the current function was called.

To start an interactive debugging session, developers set a breakpoint; if they do not, the IDE will not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from Social Sciences and Biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown or impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence. Cockburn [26] suggested that the best architectures, requirements, and designs


emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.


          Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one prey/one predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in a SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

          3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), that are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. SDI collects only visited areas and paths (chains of invocations, e.g., by Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (4544), removal (4362), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

          4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig. 3 GV elements: Types (nodes), invocations (edges), and Task filter area

Debugging Views. All the implementation details of SDI are available in the Appendix section.

4.1 Swarm Debugging Global View

Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships created by invoked methods. This visualization uses rounded grey boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using previous debugging session context data collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying an automatic layout manager (breadthfirst). As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser Widget or accessed through a traditional browser such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 on Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes a context, generating the visualisation that combines all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree; the adjacent


          Fig 4 GV on all tasks

nodes follow in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

          5 Using SDI to Understand Debugging Activities

The first benefit of SDI is the fact that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behavior during debugging activities. To illustrate this point, we conducted two experiments using SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing the faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, for understanding the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issue   Summary
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out the participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking if they found the faults, where the faults were, why the faults happened, if they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (a small illustrative fragment marking each kind is shown after this list):
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
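To make these statement categories concrete, the small illustrative Java fragment below marks one line of each kind considered in the study.

    import java.util.List;

    // Illustrative example marking the statement categories considered in the study.
    class StatementKinds {
        int countValid(List<String> items) {
            int total = 0;                          // assignment
            int i = 0;                              // assignment
            while (i < items.size()) {              // while-loop
                if (items.get(i).startsWith("v")) { // if-statement
                    total = total + 1;              // assignment
                }
                i = i + 1;                          // assignment
            }
            System.out.println(total);              // call (method invocation)
            return total;                           // return
        }
    }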

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T - ST
– Elapsed Time to First Breakpoint (EF): EF = FB - ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault location tasks (developers did not correct the faults), while the ones in Study 2 were fault correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)   Std Dev (min)
318      44                   64
667      28                   29
669      22                   25
993      25                   25
1026     25                   17
PdfSam   54                   18
Raptor   59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47), showing that task elapsed time is inversely correlated to the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

          Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

          Tasks Classes Lines of Code Breakpoints

          0318 AuthorsFormatter 43 5

          0318 AuthorsFormatter 131 3

          0667 BasePanel 935 2

          0667 BasePanel 969 3

          0667 JabRefDesktop 430 2

          0669 OpenDatabaseAction 268 2

          0669 OpenDatabaseAction 433 4

          0669 OpenDatabaseAction 451 4

          0993 EntryEditor 717 2

          0993 EntryEditor 720 2

          0993 EntryEditor 723 2

          0993 BibDatabase 187 2

          0993 BibDatabase 456 2

          1026 EntryEditor 1184 2

          1026 BibtexParser 160 2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints, for different tasks, by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


          Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

          Classes Lines of Code Breakpoints

          BibtexParser 138151159 222

          160165168 323

          176198199299 2222

          EntryEditor 717720721 342

          723837842 232

          11841393 32

          BibDatabase 175187223456 2326

          OpenDatabaseAction 433450451 424

          JabRefDesktop 4084430 223

          SaveDatabaseAction 177188 42

          BasePanel 935969 25

          AuthorsFormatter 43131 54

          EntryTableTransferHandler 346 2

          FieldTextMenu 84 2

          JabRefFrame 1119 2

          JabRefMain 8 5

          URLUtil 95 2

          Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 break-


          Table 9 Study 1 - Breakpoints by class across different tasks

          Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

          SaveDatabaseAction Yes Yes Yes 7 2

          BasePanel Yes Yes Yes Yes 14 7

          JabRefDesktop Yes Yes 9 4

          EntryEditor Yes Yes Yes 36 4

          BibtexParser Yes Yes Yes 44 6

          OpenDatabaseAction Yes Yes Yes 19 13

          JabRef Yes Yes Yes 3 3

          JabRefMain Yes Yes Yes Yes 5 4

          URLUtil Yes Yes 4 2

          BibDatabase Yes Yes Yes 19 4

points, and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

          6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behavior of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment - 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

          614 Artifacts and Working Environment

          After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

          For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

          615 Study Procedure

          The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

          The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

          22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
          23 http://swarmdebugging.org/publications/experimenttutorial.html
          24 https://youtu.be/U1sBMpfL2jc


          group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

          616 Data Collection

          In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

          In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) commenting on the usefulness of GV during each task.

          All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

          62 Results

          We now discuss the results of our evaluation

          RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

          During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

          25 http://server.swarmdebugging.org    26 http://server.swarmdebugging.org    27 OBS is available on https://obsproject.com


          number of participants who could propose a solution and the correctness of the solutions.

          For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

          Fig 8 GV for Task 0318

          For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

          Fig 9 GV for Task 0667

          Swarm Debugging the Collective Intelligence on Interactive Debugging 27

          Fig 10 GV for Task 0669

          Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


          Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

          RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

          We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


          Fig 11 GV usefulness - experimental phase one

          Fig 12 GV usefulness - experimental phase two

          The analysis of our results suggests that GV is useful to support software-maintenance tasks

          Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort of searching code.


          Table 10 Results from control and experimental groups (average)

          Task 0993
          Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
          First breakpoint   00:02:55      00:03:40         -44           126%
          Time to start      00:04:44      00:05:18         -33           112%
          Elapsed time       00:30:08      00:16:05         843            53%

          Task 1026
          Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
          First breakpoint   00:02:42      00:04:48         -126          177%
          Time to start      00:04:02      00:03:43         19             92%
          Elapsed time       00:24:58      00:20:41         257            83%

          63 Comparing Results from the Control and Experimental Groups

          We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.
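          For instance, reading Table 10 for the elapsed time of Task 0993: the difference is 00:30:08 - 00:16:05 = 14:03, i.e., 843 s, and the ratio is E/C = 965 s / 1808 s ≈ 53%, which appears to be how the ∆ and [E/C] columns of the table are computed.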

          Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully, but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

          Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


          64 Participantsrsquo Feedback

          As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

          641 Intrinsic Advantage

          Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

          Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

          642 Intrinsic Limitations

          Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

          However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

          Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


          Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

          One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

          Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

          We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

          Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

          Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

          643 Accidental Advantages

          Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

          Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

          Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


          644 Accidental Limitations

          Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and-or colours to identify faulty classes/methods.

          Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

          One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

          Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

          Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and-or clusters of methods (independently of their classes).

          645 General Feedback

          Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

          It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

          This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

          7 Discussion

          We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

          Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

          Developers can share their debugging activities, such as breakpoints and-or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and-or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

          Developers of debuggers can use SDI to understand developers' debugging habits to create new tools - using novel data-mining techniques - to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

          Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

          There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.
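          As a naive illustration of such a recommendation, one could simply rank locations by how many breakpoints they accumulated across past sessions and suggest the most frequent ones; the sketch below does exactly that over an assumed list of past breakpoint locations, and is only an illustration, not the recommender the paper proposes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BreakpointRecommender {

    /** Rank candidate locations by how often breakpoints were set on them in past sessions. */
    static List<String> topLocations(List<String> pastBreakpointLocations, int k) {
        // Count occurrences of each location (e.g., "BibtexParser.parseFileContent").
        Map<String, Long> counts = pastBreakpointLocations.stream()
                .collect(Collectors.groupingBy(loc -> loc, Collectors.counting()));
        // Keep the k most frequently used locations as candidates for a new session.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```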

          28 http://github.com/swarmdebugging


          We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

          8 Threats to Validity

          Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

          As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

          Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

          Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

          We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


          Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

          Internal Validity Threats are related to the tools used to collect the data and the subject systems, and if the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

          Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

          External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

          9 Related work

          We now summarise works related to debugging to allow better positioning of our study among the published research.

          Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

          Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

          Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

          Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

          DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

          Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

          Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

          Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

          Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.

          Swarm Debugging the Collective Intelligence on Interactive Debugging 37

          Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

          Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

          Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

          Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

          10 Conclusion

          Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

          To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

          The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and-or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

          Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

          Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

          In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

          Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

          Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and-or more adapted to the particularity of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

          Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

          11 Acknowledgment

          This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

          References

          1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

          2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

          3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211-220
          4 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

          Press 2002)
          5 P Wainwright GNU DDD - Data Display Debugger (2010)
          6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

          '06 p 989 (2006) DOI 10.1145/1134285.1134471
          7 J Rößler in 2012 1st International Workshop on User Evaluation for Software Engi-

          neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

          8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

          9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

          10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492-501
          11 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

          neering and Methodology 23(4) 1 (2014) DOI 10.1145/2622669
          12 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

          on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

          13 M Beller N Spruit A Zaidman How developers debug (2017) URL https://doi.org/10.7287/peerj.preprints.2743v1

          14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616

          40Please give a shorter version with authorrunning and titlerunning prior to maketitle

          15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

          16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

          17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

          18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

          19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

          101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

          oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

          1218575

          22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

          neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

          conditional_breakpointhtmampcp=1_3_6_0_5

          23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

          24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

          linkspringercom101007s10818-015-9203-6

          25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

          (Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

          actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

          C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

          29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

          30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

          31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

          32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

          pmcentrezamprendertype=abstract

          33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

          34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

          35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

          36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

          doiacmorg1011452622669

          37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

          Swarm Debugging the Collective Intelligence on Interactive Debugging 41

          38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

          39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

          40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

          41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

          42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

          43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

          44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

          45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

          46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

          47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

          48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

          49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

          50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

          51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

          52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

          53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

          54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

          55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

          56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

          57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

          58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

          42Please give a shorter version with authorrunning and titlerunning prior to maketitle

          Appendix - Implementation of Swarm Debugging

          Swarm Debugging Services

          The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

          Fig 13 The Swarm Debugging Services architecture

          The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

          We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

          - Developer is a SDT user. She creates and executes debugging sessions.
          - Product is the target software product. A product is a set of Eclipse projects (1 or more).
          - Task is the task to be executed by developers.
          - Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


          Fig 14 The Swarm Debugging metadata [17]

          - Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
          - Method is a method associated with a type, which can be invoked during debugging sessions.
          - Namespace is a container for types. In Java, namespaces are declared with the keyword package.
          - Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
          - Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
          - Event is event data that is collected when a developer performs some actions during a debugging session.

          The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

          Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update,

          29 http://projects.spring.io/spring-boot


          and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

          http://swarmdebugging.org/developers/search/findByName?name=petrillo

          the SDS responds with a list of developers whose names are "petrillo", in JSON format.
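          For illustration, a minimal Java sketch of a client issuing this request is shown below. The URL is the one from the example above; the JSON handling (printing the raw body) is our simplification, since the exact payload layout of the SDS response is not specified here.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmRestClientExample {
    public static void main(String[] args) throws Exception {
        // Endpoint taken from the example above; the response format is an assumption.
        URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            StringBuilder json = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line);
            }
            // Print the JSON document listing the matching developers.
            System.out.println(json);
        }
    }
}
```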

          SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
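          As a rough illustration of the kind of query the console can answer, the sketch below counts breakpoints per type via JDBC; the connection string, credentials, and the table/column names (breakpoint, type) are assumptions based on the meta-model in Figure 14, not the actual SDS schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeQuery {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders; table and column names are assumed.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Count how many breakpoints were toggled on each type across all sessions.
             ResultSet rs = stmt.executeQuery(
                 "SELECT t.name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint b JOIN type t ON b.type_id = t.id " +
                 "GROUP BY t.name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("name") + ": " + rs.getLong("breakpoints"));
            }
        }
    }
}
```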

          Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

          Dashboard Service. The ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

          Fig 15 Swarm Debugging Dashboard

          30 http://db.swarmdebugging.org    31 https://www.elastic.co


          Fig 16 Neo4J Browser - a Cypher query example

          Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

          Figure 16 shows an example of a Cypher query and the resulting graph.
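          To give an idea of such a query, the sketch below lists method-to-method invocations through the Neo4j Java driver; the bolt URI, the credentials, and the Method label and INVOKES relationship type are assumptions for illustration, not the actual SDS graph schema.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        // URI, credentials, node label, and relationship type below are assumed.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) " +
                "RETURN a.name AS caller, b.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record record = result.next();
                // Print each collected invocation as caller -> callee.
                System.out.println(record.get("caller").asString()
                        + " -> " + record.get("callee").asString());
            }
        }
    }
}
```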

          Swarm Debugging Tracer

          Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
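          A minimal sketch of how such listeners can be registered with the Eclipse debug framework is shown below; IDebugEventSetListener and IBreakpointListener are the standard Eclipse interfaces named above, while the class name and the comments describing what would be sent to the services are ours, not the actual SDT implementation.

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

/** Illustrative tracer skeleton, not the actual SDT implementation. */
public class DebugTracerSkeleton implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Receive suspend/resume/step events and breakpoint changes from the debugger.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                // A breakpoint was hit; here the stack trace would be inspected
                // and invocation data sent to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here the breakpoint location would be sent to the Swarm Debugging Services.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}
```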

          After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

          To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

          32 http://neo4j.com


          Fig 17 The Swarm Tracer architecture [17]

          Fig 18 The Swarm Manager view

          Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated to these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


          Fig 19 Breakpoint search tool (fuzzy search example)

          an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

          To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

          Swarm Debugging Views

          On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

          Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

          Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


          Fig 20 Sequence stack diagram for Bridge design pattern

          Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

          Breakpoint Search Tool

          Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

          Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.
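          To give an idea of such a fuzzy query, the sketch below posts a fuzzy match against an assumed breakpoints index; the index name, the field name (typeName), and the mapping are assumptions, only the idea of fuzzy matching the misspelled term comes from the tool described above.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class FuzzyBreakpointSearchExample {
    public static void main(String[] args) throws Exception {
        // Index and field names are assumptions for illustration.
        URL url = new URL("http://localhost:9200/breakpoints/_search");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);

        // Fuzzy match so that the misspelled 'fcatory' still finds 'factory' types.
        String query = "{ \"query\": { \"match\": {"
                + " \"typeName\": { \"query\": \"fcatory\", \"fuzziness\": \"AUTO\" } } } }";
        try (OutputStream out = connection.getOutputStream()) {
            out.write(query.getBytes(StandardCharsets.UTF_8));
        }
        try (Scanner scanner = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (scanner.hasNextLine()) {
                System.out.println(scanner.nextLine());
            }
        }
    }
}
```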


          Fig 21 Method call graph for Bridge design pattern [17]

          Starting/Ending Method Search Tool

          This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

          Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

          StartingPoint = VSP | VSP isin α and VSP isin β

          EndingPoint = VEP | VEP isin β and VEP isin α

          Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime
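A small sketch of this computation, assuming the collected invocations are available as plain invoking/invoked pairs (our own simplified model), might look as follows.

    import java.util.LinkedHashSet;
    import java.util.List;
    import java.util.Set;

    public class StartingEndingMethods {

        // One collected invocation: invoking method -> invoked method.
        record Invocation(String invoking, String invoked) {}

        // Starting points: methods that invoke others but are never invoked (alpha minus beta).
        static Set<String> startingPoints(List<Invocation> edges) {
            Set<String> alpha = new LinkedHashSet<>();
            Set<String> beta = new LinkedHashSet<>();
            for (Invocation e : edges) { alpha.add(e.invoking()); beta.add(e.invoked()); }
            alpha.removeAll(beta);
            return alpha;
        }

        // Ending points: methods that are invoked but never invoke others (beta minus alpha).
        static Set<String> endingPoints(List<Invocation> edges) {
            Set<String> alpha = new LinkedHashSet<>();
            Set<String> beta = new LinkedHashSet<>();
            for (Invocation e : edges) { alpha.add(e.invoking()); beta.add(e.invoked()); }
            beta.removeAll(alpha);
            return beta;
        }

        public static void main(String[] args) {
            List<Invocation> session = List.of(
                new Invocation("Client.main", "Shape.draw"),
                new Invocation("Shape.draw", "Circle.draw"));
            System.out.println(startingPoints(session)); // [Client.main]
            System.out.println(endingPoints(session));   // [Circle.draw]
        }
    }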

          Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.
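On the Eclipse side, a tracer can hook into the debug framework roughly as sketched below; this is a minimal illustration using the public Eclipse Debug Core listener API, not the actual Swarm Debugging Tracer code.

    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IDebugEventSetListener;

    public class SwarmTracerSketch implements IDebugEventSetListener {

        // Registers this listener with the Eclipse debug framework.
        public void start() {
            DebugPlugin.getDefault().addDebugEventListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // A SUSPEND event with detail STEP_END or BREAKPOINT marks the
                // stepping and breakpoint hits that the tracer records.
                if (event.getKind() == DebugEvent.SUSPEND
                        && (event.getDetail() == DebugEvent.STEP_END
                            || event.getDetail() == DebugEvent.BREAKPOINT)) {
                    // Here the real tracer would inspect the suspended stack frame
                    // and send an event to the Swarm Debugging Services via REST.
                    System.out.println("Suspended: " + event.getSource());
                }
            }
        }
    }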



Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using Eclipse IDE

– GUI: Most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint: Chrome1, Visual Studio2, IntelliJ3, and Xcode4.

– Command line: Some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code: JDB5, PDB6, and GDB7.

– Code: Some programming languages allow using syntactical elements to set breakpoints as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are Ruby debugger8, Firefox9, and Chrome10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run, step by step, the entire program flow.

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, then the function is executed and the result returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called (see the sketch below).
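The hypothetical Java fragment below, written only for this illustration, shows where each command leaves the debugger when execution is suspended at the call to parse.

    public class SteppingExample {

        public static void main(String[] args) {
            int n = parse("42");       // suspended here:
                                       //  - Step Over runs parse() fully and stops at the println below
                                       //  - Step Into stops at the first line of parse()
            System.out.println(n);     // Step Out from inside parse() resumes in main() after the call
        }

        static int parse(String text) {
            return Integer.parseInt(text); // Step Into lands on this line
        }
    }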

To start an interactive debugging session, developers set a breakpoint. Otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the social sciences and biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multi-version, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.


            Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT on a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in an SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

            3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), that are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.
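As a rough illustration of the kind of record such a repository might hold, the following Java value class is our own assumption of a minimal event model; the real SDI schema may differ.

    import java.time.Instant;

    // Assumed, minimal model of one debugging event shared through the repository.
    public class DebuggingEvent {

        public enum Kind { BREAKPOINT_TOGGLED, STEP_INTO, STEP_OVER, STEP_RETURN }

        private final String sessionId;  // which debugging session produced the event
        private final String developer;  // who was debugging
        private final Kind kind;         // what happened
        private final String type;       // fully qualified class name of the location
        private final int line;          // source line of the breakpoint or step
        private final Instant at;        // when it happened

        public DebuggingEvent(String sessionId, String developer, Kind kind,
                              String type, int line, Instant at) {
            this.sessionId = sessionId;
            this.developer = developer;
            this.kind = kind;
            this.type = type;
            this.line = line;
            this.at = at;
        }

        @Override
        public String toString() {
            return kind + " at " + type + ":" + line + " by " + developer;
        }
    }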

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. SDI collects only visited areas and paths (chains of invocations by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study from Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

            4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of the SDI are available in the Appendix section.

11 https://www.eclipse.org

Fig 3 GV elements - Types (nodes), invocations (edges), and Task filter area

4.1 Swarm Debugging Global View

Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships created by invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using previous debugging session context data collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying an automatic layout manager (breadthfirst). As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.
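For instance, embedding the web-based view inside an Eclipse view part could be done roughly as below with the standard SWT Browser widget; the URL is the public demo server mentioned in this article, and the view-part boilerplate is simplified.

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.browser.Browser;
    import org.eclipse.swt.widgets.Composite;
    import org.eclipse.ui.part.ViewPart;

    // Simplified sketch of an Eclipse view embedding the web-based Global View.
    public class GlobalViewPart extends ViewPart {

        @Override
        public void createPartControl(Composite parent) {
            Browser browser = new Browser(parent, SWT.NONE);
            browser.setUrl("http://server.swarmdebugging.org/");
        }

        @Override
        public void setFocus() {
            // Nothing to focus in this sketch.
        }
    }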

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 on Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with the adjacent nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig 4 GV on all tasks

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

            5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows for collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving in total three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingle effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


            Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries

318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect automatically breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger. All participants who participated in our study correctly executed the warm-up task.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking if they found the faults, where the faults were, why the faults happened, if they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST

– Elapsed Time to First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault location tasks (developers did not correct the faults), while the ones in Study 2 were fault correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET     (1)
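For instance, under this normalisation, a session in which the first breakpoint is set 5 minutes into a 20-minute task yields MFB = 5/20 = 0.25; MFB always lies between 0 and 1, which makes sessions of different durations comparable.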

            Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks     Average Times (min)   Std Devs (min)

318       44                    64
667       28                    29
669       22                    25
993       25                    25
1026      25                    17
PdfSam    54                    18
Raptor    59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). So this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint. We fitted this relation with the function

f(x) = α / x^β     (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

            Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %

call           111                      53
if-statement   39                       19
assignment     36                       17
return         18                       10
while-loop     3                        1

            Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %

call           43                       43
if-statement   22                       22
assignment     27                       27
return         4                        4
while-loop     4                        4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

            Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints

0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


            Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints

PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

            Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints

icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


            Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

            Classes Lines of Code Breakpoints

            BibtexParser 138151159 222

            160165168 323

            176198199299 2222

            EntryEditor 717720721 342

            723837842 232

            11841393 32

            BibDatabase 175187223456 2326

            OpenDatabaseAction 433450451 424

            JabRefDesktop 4084430 223

            SaveDatabaseAction 177188 42

            BasePanel 935969 25

            AuthorsFormatter 43131 54

            EntryTableTransferHandler 346 2

            FieldTextMenu 84 2

            JabRefFrame 1119 2

            JabRefMain 8 5

            URLUtil 95 2

            Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


            Table 9 Study 1 - Breakpoints by class across different tasks

            Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

            SaveDatabaseAction Yes Yes Yes 7 2

            BasePanel Yes Yes Yes Yes 14 7

            JabRefDesktop Yes Yes 9 4

            EntryEditor Yes Yes Yes 36 4

            BibtexParser Yes Yes Yes 44 6

            OpenDatabaseAction Yes Yes Yes 19 13

            JabRef Yes Yes Yes 3 3

            JabRefMain Yes Yes Yes Yes 5 4

            URLUtil Yes Yes 4 2

            BibDatabase Yes Yes Yes 19 4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

            6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing debugging behaviors from 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21: 23 male and seven female. Our participants have on average six years of experience in software development (std dev four years). They have on average 4.8 years of Java experience (std dev 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667

Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.

Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the types by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.

Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


            Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 18% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle carefully the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

            7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) are open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.
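A simple, frequency-based recommender along these lines could rank locations by how many distinct developers and tasks toggled breakpoints on them. The sketch below is a minimal illustration of that idea; the BreakpointRecord class and its fields are hypothetical stand-ins for the data the SDI stores, not the actual SDI API.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class BreakpointRecommender {

    // Hypothetical record of one shared breakpoint (location, developer, task).
    record BreakpointRecord(String typeName, int line, String developer, String task) {}

    // Ranks locations by the number of distinct (developer, task) pairs that toggled them.
    static List<String> recommend(List<BreakpointRecord> history, int topN) {
        Map<String, Set<String>> usersByLocation = history.stream()
                .collect(Collectors.groupingBy(
                        b -> b.typeName() + ":" + b.line(),
                        Collectors.mapping(b -> b.developer() + "#" + b.task(), Collectors.toSet())));
        return usersByLocation.entrySet().stream()
                .sorted(Comparator.comparingInt((Map.Entry<String, Set<String>> e) -> e.getValue().size()).reversed())
                .limit(topN)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}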

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

            8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants, more systems, and more tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies on these research questions and others, possibly drawn from the questions that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

            9 Related work

We now summarise works related to debugging to better position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

            10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task and, also, different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

            11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

            References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. N/A
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


            Appendix - Implementation of Swarm Debugging

            Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

            Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


            Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data collected when a developer performs some actions during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
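Programmatically, a client can issue the same request with any HTTP library. The sketch below uses Java's standard java.net.http client (Java 11+) against the endpoint shown above; it only assumes that the service answers plain GET requests with JSON.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmRestClient {
    public static void main(String[] args) throws Exception {
        // Queries the SDS for developers named "petrillo", using the endpoint shown above.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // JSON list of matching developers
    }
}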

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
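To illustrate the kind of relational aggregation the console supports, the JDBC sketch below counts breakpoints per type; the connection string and the breakpoint table and column names are assumptions for illustration, not the actual SDS schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointHotspots {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection and schema; the query ranks types by the number of breakpoints set on them.
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints "
                 + "FROM breakpoint GROUP BY type_name "
                 + "ORDER BY breakpoints DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n", rs.getString("type_name"), rs.getInt("breakpoints"));
            }
        }
    }
}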

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

            Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


            Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
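To give a flavour of such queries, the sketch below runs a Cypher pattern over the collected invocations using the Neo4J Java driver; the Method label, the INVOKES relationship, the property names, and the connection settings are assumptions for illustration, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        // Hypothetical Bolt URI, credentials, and graph schema.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                + "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString() + " -> " + row.get("callee").asString());
            }
        }
    }
}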

            Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
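A tracer of this kind hooks into the Eclipse debug framework roughly as sketched below. This is a minimal illustration of registering the two listener interfaces named above, not the actual SDT code.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register for debug events (stepping, suspend, resume) and for breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND && event.getDetail() == DebugEvent.STEP_END) {
                // A step (e.g., Step Into) just finished: inspect the stack and record the invocation.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Send the breakpoint location to the Swarm Debugging Services.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}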

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


            Fig 17 The Swarm Tracer architecture [17]

            Fig 18 The Swarm Manager view

Fig 19 Breakpoint search tool (fuzzy search example)

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.
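The invoking/invoked pairs can be extracted from the suspended thread's stack roughly as in the sketch below, which uses the JDT debug model; it illustrates the idea and is not the actual SDT code.

import org.eclipse.debug.core.DebugException;
import org.eclipse.debug.core.model.IStackFrame;
import org.eclipse.jdt.debug.core.IJavaStackFrame;
import org.eclipse.jdt.debug.core.IJavaThread;

public class InvocationExtractorSketch {

    // Records one (invoking, invoked) pair per adjacent pair of frames of the suspended thread.
    public void recordInvocations(IJavaThread thread) throws DebugException {
        IStackFrame[] frames = thread.getStackFrames(); // index 0 is the top of the stack
        for (int i = frames.length - 1; i > 0; i--) {
            IJavaStackFrame caller = (IJavaStackFrame) frames[i];
            IJavaStackFrame callee = (IJavaStackFrame) frames[i - 1];
            String invoking = caller.getDeclaringTypeName() + "." + caller.getMethodName();
            String invoked = callee.getDeclaringTypeName() + "." + callee.getMethodName();
            System.out.println(invoking + " -> " + invoked); // would be sent to the SDS instead
        }
    }
}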

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

            Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


            Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

            Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
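Such a fuzzy query can also be sent directly to the underlying ElasticSearch engine. In the sketch below, the index name ("breakpoints"), the field name ("typeName"), and the host are assumptions for illustration, not the actual SDS configuration.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query for the misspelled word "fcatory".
        String query = "{\"query\": {\"fuzzy\": {\"typeName\": {\"value\": \"fcatory\", \"fuzziness\": \"AUTO\"}}}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // hits include types such as "...Factory" despite the typo
    }
}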


            Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes invoking methods and $\beta$ is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$

$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
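A direct way to compute these sets from the collected invocation edges is sketched below; it is a small illustration of the definitions above, not the SDS implementation.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {

    // One collected invocation edge: invoking method -> invoked method.
    record Edge(String invoking, String invoked) {}

    // StartingPoint = vertexes in alpha (invoking) and not in beta (invoked).
    static Set<String> startingPoints(List<Edge> edges) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Edge e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }
        Set<String> result = new HashSet<>(alpha);
        result.removeAll(beta);
        return result;
    }

    // EndingPoint = vertexes in beta (invoked) and not in alpha (invoking).
    static Set<String> endingPoints(List<Edge> edges) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Edge e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }
        Set<String> result = new HashSet<>(beta);
        result.removeAll(alpha);
        return result;
    }
}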

            Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, software exploration is divided by sessions, and its call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


              Swarm Debugging the Collective Intelligence on Interactive Debugging 7

              flow The definition of a step varies across programming languages and debug-gers but it generally includes invoking a method and executing a statementWhile Stepping a developer can navigate between steps using the followingcommands

              ndash Step Over the debugger steps over a given line If the line contains afunction then the function is executed and the result returned withoutstepping through each of its lines

              ndash Step Into the debugger enters the function at the current line and continuestepping from there line-by-line

              ndash Step Out this action would take the debugger back to the line where thecurrent function was called

              To start an interactive debugging session developers set a breakpoint Ifnot the IDE would not stop and enter its interactive mode For exampleEclipse IDE automatically opens the ldquoDebugging Perspectiverdquo when executionhits a breakpoint A developer can run a system in debugging mode withoutsetting breakpoints but she must set a breakpoint to be able to stop theexecution step in and observe variable states Briefly there is no interactivedebugging session without at least one breakpoint set in the codeFinally some debuggers allow debugging remotely for example to performhot-fixes or to test mobile applications and systems operating in remote con-figurations

              23 Self-organization and Swarm Intelligence

              Self-organization is a concept emerged from Social Sciences and Biology and itis defined as the set of dynamic mechanisms enabling structures to appear atthe global level of a system from interactions among its lower-level componentswithout being explicitly coded at the lower levels Swarm intelligence (SI)describes the behavior resulting from the self-organization of social agents(as insects) [23] Ant nests and the societies that they house are examples ofSI [24] Individual ants can only perform relatively simple activities yet thewhole colony can collectively accomplish sophisticated activities Ants achieveSI by exchanging information encoded as chemical signalsmdashpheromones egindicating a path to follow or an obstacle to avoid

              Similarly SI could be used as a metaphor to understand or explain thedevelopment of a multiversion large and complex software systems built bysoftware teams Individual developers can usually perform activities withouthaving a global understanding of the whole system [25] In a birdrsquos eye viewsoftware development is analogous to some SI in which groups of agents in-teracting locally with one another and with their environment and follow-ing simple rules lead to the emergence of global behaviors previously un-knownimpossible to the individual agents We claim that the similarities be-tween the SI of ant nests and complex software systems are not a coincidenceCockburn [26] suggested that the best architectures requirements and designs

              8Please give a shorter version with authorrunning and titlerunning prior to maketitle

              emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

              Dev1

              Dev2

              Dev3

              DevN

              VisualisationsSearching Tools

              Recommendation Systems

              Single Debugging Session Crowd Debugging Sessions Debugging Information

              Positive feedback

              Collect data Store data

              Transform information

              A B C

              D

              Fig 2 Overview of the Swarm Debugging approach

              24 Information Foraging

              Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

              However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

              These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

              3 The Swarm Debugging Approach

              Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

              Swarm Debugging the Collective Intelligence on Interactive Debugging 9

              First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

              We chose to instrument the Eclipse IDE a popular IDE to implementSwarm Debugging and to reach a large number of users Also we use services inthe cloud to collect the debugging events to process these events and to providevisualizations and recommendations from these events Thus we decoupleddata collection from data usage allowing other researcherstools vendors touse the collected data

              During debugging developers analyze the code toggling breakpoints andstepping in and through statements While traditional dynamic analysis ap-proaches collect all interactions states or events SD collects only invocationsexplicitly explored by developers SDI collects only visited areas and paths(chains of invocations by egStep Into or F5 in Eclipse IDE) and thus doesnot suffer from performance or memory issues as omniscient debuggers [29] ortracing-based approaches could

              Our decision to record information about breakpoints and stepping is wellsupported by a study from Beller et al [30] A finding of this study is thatsetting breakpoints and stepping through code are the most used debuggingfeatures They showed that most of the recorded debugging events are relatedto the creation (4544) removal (4362) or adjustment of breakpoints hittingthem during debugging and stepping through the source code Furthermoreother advanced debugging features like defining watches and modifying vari-able values have been much less used [30]

              4 SDI in a Nutshell

              To evaluate the Swarm Debugging approach we have implemented the SwarmDebugging Infrastructure (see httpsgithubcomSwarmDebugging)The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools forcollecting storing sharing retrieving and visualizing data collected duringdevelopersrsquo debugging activities The SDI is an Eclipse IDE11 plug-in inte-grated with Eclipse Debug core It is organized in three main modules (1) theSwarm Debugging Services (2) the Swarm Debugging Tracer and (3) Swarm

              11 httpswwweclipseorg

              10Please give a shorter version with authorrunning and titlerunning prior to maketitle

              Fig 3 GV elements - Types (nodes) invocations (edge) and Task filter area

              Debugging Views All the implementation details of SDI are available in theAppendix section

              41 Swarm Debugging Global View

              Swarm Debugging Global View (GV) is a call graph for modeling softwarebased on directed call graph [31] to explicit the hierarchical relationship byinvocated methods This visualization use rounded gray boxes (Figure 3-A) torepresent types or classes (nodes) and oriented arrows (Figure 3-B) to expressinvocations (edges) GV is built using previous debugging session context datacollected by developers for different tasks

              GV was implemented using CytoscapeJS [32] a Graph API JavaScriptframework applying an automatic layout manager breadthfirst As a web appli-cation the SD visualisations can be integrated into an Eclipse view as an SWTBrowser Widget or accessed through a traditional browser such as MozillaFirefox or Google Chrome

              In this view the grey boxes are types that developers visited during debug-ging sessions The edges represent method calls (Step Into or F5 on Eclipse)performed by all developers in all traced tasks on a software project Eachedge colour represents a task and line thickness is proportional to the numberof invocations Each debugging session contributes with a context generat-ing the visualisation combining all collected invocations The visualisation isorganised in layers or stacks and each line is a layer of invocations The start-ing points (non-invoked methods) are allocated on top of a tree the adjacent

              Swarm Debugging the Collective Intelligence on Interactive Debugging 11

              Fig 4 GV on all tasks

              nodes in an invocation sequence Besides developers can directly go to a typein the Eclipse Editor by double-clicking over a node in the diagram In the leftcorner developers can use radio buttons to filter invocations by task (figure 3-C) showing the paths used by developers during previous debugging sessionsby a task Finally developers can use the mouse to pan and zoom inout onthe visualisation Figure 4 shows an example of GV with all tasks for JabRefsystem and we have data about 8 tasks

              GV is a contextual visualization that shows only the paths explicitlyand intentionally visited by developers including type declarations andmethod invocations explored by developers based on their decisions

              5 Using SDI to Understand Debugging Activities

              The first benefit of SDI is the fact that it allows for collecting detailed in-formation about debugging sessions Using this information researchers caninvestigate developers behaviors during debugging activities To illustrate thispoint we conducted two experiments using SDI to understand developers de-bugging habits the times and effort with which they set breakpoints and thelocations where they set breakpoints

              Our analysis builds upon three independent sets of observations involvingin total three systems Studies 1 and 2 involved JabRef PDFSaM and Raptoras subject systems We analysed 45 video-recorded debugging sessions avail-able from our own collected videos (Study 1) and an empirical study performedby Jiang et al [33] (Study 2)

              In this study we answered the following research questions

              RQ1 Is there a correlation between the time of the first breakpoint and a de-bugging taskrsquos elapsed time

              RQ2 What is the effort in time for setting the first breakpoint in relation to thedebugging taskrsquos elapsed time

              12Please give a shorter version with authorrunning and titlerunning prior to maketitle

              RQ3 Are there consistent common trends with respect to the types of state-ments on which developers set breakpoints

              RQ4 Are there consistent common trends with respect to the lines methodsor classes on which developers set breakpoints

              In this section we elaborate more on each of the studies

              51 Study 1 Observational Study on JabRef

              511 Subject System

              To conduct this first study we selected JabRef12 version 32 as subject sys-tem This choice was motivated by the fact that JabRefrsquos domain is easy tounderstand thus reducing any learning effect It is composed of relatively inde-pendent packages and classes ie high cohesion low coupling thus reducingthe potential commingle effect of low code quality

              512 Participants

              We recruited eight male professional developers via an Internet-based free-lancer service13 Two participants are experts and three are intermediate inJava Developers self-reported their expertise levels which thus should betaken with caution Also we recruited 12 undergraduate and graduate stu-dents at Polytechnique Montreal to participate in our study We surveyedall the participantsrsquo background information before the study14 The surveyincluded questions about participantsrsquo self-assessment on their level of pro-gramming expertise (Java IDE and Eclipse) gender first natural languageschooling level and knowledge about TDD interactive debugging and whyusually they use a debugger All participants stated that they had experiencein Java and worked regularly with the debugger of Eclipse

              513 Task Description

              We selected five defects reported in the issue-tracking system of JabRef Wechose the task of fixing the faults that would potentially require developers toset breakpoints in different Java classes To ensure this we manually conductedthe debugging ourselves and verified that for understanding the root causeof the faults we had to set at least two breakpoints during our interactivedebugging sessions Then we asked participants to find the locations of thefaults described in Issues 318 667 669 993 and 1026 Table 1 summarisesthe faults using their titles from the issue-tracking system

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1: Summary of the issues considered in JabRef in Study 1

Issue  Summary
318    "Normalize to Bibtex name format"
667    "hash/pound sign causes URL link to fail"
669    "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993    "Issues in BibTeX source: opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026   "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results;

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (see the illustrative fragment after this list):
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations;

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
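For illustration, the small Java fragment below (written for this paper, not taken from the subject systems) shows one plausible breakpoint location for each of the five statement types we considered:

    // Illustrative only: candidate breakpoint locations per statement type.
    int countBibtexEntries(java.util.List<String> lines) {
        int count = 0;                          // assignment
        int i = 0;
        while (i < lines.size()) {              // while-loop
            String line = lines.get(i).trim();  // call (method invocation)
            if (line.startsWith("@")) {         // if-statement
                count = count + 1;              // assignment
            }
            i = i + 1;
        }
        return count;                           // return
    }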

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video and started counting when the participant effectively started the task, i.e., when she started the Swarm Debugging Tracer plug-in, for example;

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint;

– End Time (T): the time when the participant finished a task;


– Elapsed End Time (ET): ET = T − ST;
– Elapsed Time of First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 followed different approaches. The tasks in Study 1 were fault-location tasks: developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
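As a worked example with hypothetical values: for a session that starts at ST = 00:00:00, has its first breakpoint at FB = 00:08:00, and ends at T = 00:30:00,

    EF  = FB - ST = 8 min
    ET  = T - ST  = 30 min
    MFB = EF / ET = 8 / 30 ≈ 0.27

that is, about 27% of the session elapsed before the first breakpoint was set.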

Table 2: Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)  Std Dev (min)
318      44                  64
667      28                  29
669      22                  25
993      25                  25
1026     25                  17
PdfSam   54                  18
Raptor   59                  13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). This effort is therefore important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalised the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated to the time of the task's first breakpoint.

The fitted relation is

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5: Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3: Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4: Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements received comparatively fewer breakpoints, the while-loop statement being the least common (2–4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.
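A minimal sketch of this counting step, assuming each breakpoint is available as a (task, class, line, participant) record — the names below are illustrative and are not the actual SDI analysis code:

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class BreakpointAggregation {
        // One collected breakpoint event (illustrative record).
        record Breakpoint(String task, String className, int line, String participant) {}

        // Count distinct participants per (task, class, line) location.
        static Map<String, Long> sharedLocations(List<Breakpoint> breakpoints) {
            return breakpoints.stream().collect(Collectors.groupingBy(
                bp -> bp.task() + " " + bp.className() + ":" + bp.line(),
                Collectors.mapping(Breakpoint::participant,
                    Collectors.collectingAndThen(Collectors.toSet(), s -> (long) s.size()))));
        }
    }

Locations with a count of two or more participants correspond to the recurring breakpoints reported in Tables 5, 6, and 7.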

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5: Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line   Breakpoints
0318   AuthorsFormatter     43     5
0318   AuthorsFormatter     131    3
0667   BasePanel            935    2
0667   BasePanel            969    3
0667   JabRefDesktop        430    2
0669   OpenDatabaseAction   268    2
0669   OpenDatabaseAction   433    4
0669   OpenDatabaseAction   451    4
0993   EntryEditor          717    2
0993   EntryEditor          720    2
0993   EntryEditor          723    2
0993   BibDatabase          187    2
0993   BibDatabase          456    2
1026   EntryEditor          1184   2
1026   BibtexParser         160    2


Table 6: Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line   Breakpoints
PdfReader               230    2
PdfReader               806    2
PdfReader               1923   2
ConsoleServicesFacade   89     2
ConsoleClient           81     2
PdfUtility              94     2
PdfUtility              96     2
PdfUtility              102    2

Table 7: Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line   Breakpoints
icsUtils            333    3
Game                1751   2
ExamineController   41     2
ExamineController   84     3
ExamineController   87     2
ExamineController   92     2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints in different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class in different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8: Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code           Breakpoints
BibtexParser                138, 151, 159           2, 2, 2
                            160, 165, 168           3, 2, 3
                            176, 198, 199, 299      2, 2, 2, 2
EntryEditor                 717, 720, 721           3, 4, 2
                            723, 837, 842           2, 3, 2
                            1184, 1393              3, 2
BibDatabase                 175, 187, 223, 456      2, 3, 2, 6
OpenDatabaseAction          433, 450, 451           4, 2, 4
JabRefDesktop               40, 84, 430             2, 2, 3
SaveDatabaseAction          177, 188                4, 2
BasePanel                   935, 969                2, 5
AuthorsFormatter            43, 131                 5, 4
EntryTableTransferHandler   346                     2
FieldTextMenu               84                      2
JabRefFrame                 1119                    2
JabRefMain                  8                       5
URLUtil                     95                      2

              Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set on the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9: Study 1 - Breakpoints by class across different tasks

Type                 Tasks with breakpoints (of 5)   Breakpoints   Developer diversity
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale behind their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

              6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviour of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

              Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have on average six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in the qualitative evaluation of GV and 13 participated in fault location (controlled experiment: 7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669 using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with data collected for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) as video-recording tool.

6.2 Results

              We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

              Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

              Fig 9 GV for Task 0667


              Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without having seen the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


              Fig 11 GV usefulness - experimental phase one

              Fig 12 GV usefulness - experimental phase two

              The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10: Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   Δ [C−E] (s)   [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   Δ [C−E] (s)   [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully setting the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding where to set the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both kinds of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by updating the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting a system's main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.

              7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint-searching tool can decrease the time spent toggling a new breakpoint.
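As an illustration of the idea only — the host, endpoint, and response layout below are hypothetical and do not describe the actual SDI API — a client could ask a Swarm Debugging server for the breakpoints previously shared on a type before starting a new session:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class BreakpointSearchExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical query: breakpoints shared for type "BibtexParser".
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://server.example.org/breakpoints?type=BibtexParser"))
                .GET()
                .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            // e.g., a JSON list of {class, line, task, developer} entries.
            System.out.println(response.body());
        }
    }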

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.
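One possible shape for such a recommender — a sketch under the assumption that candidate locations are simply ranked by how often they received breakpoints in past sessions, which is not an implemented SDI feature:

    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;

    public class BreakpointRecommender {
        // locationCounts: "Class:line" -> number of past sessions with a breakpoint there.
        static List<String> recommend(Map<String, Long> locationCounts, int k) {
            return locationCounts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
        }
    }

More elaborate rankings could weight locations by task similarity or by whether the originating session located the fault.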

              8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and allowed researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system affects our results.

              9 Related work

We now summarise work related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

              10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviour has not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions give insights to developers and can be starting points when they build debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

              11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

              References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


              Appendix - Implementation of Swarm Debugging

              Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

              Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts, inspired by the FAMIX data model [56], include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


              Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.
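To make the meta-model concrete, the sketch below shows how the Breakpoint concept could be mapped to a persistent entity. Since the SDS is built with Spring Boot and PostgreSQL, a JPA mapping is plausible, but the class and field names here are illustrative assumptions, not the actual SDS schema.

  import javax.persistence.Entity;
  import javax.persistence.GeneratedValue;
  import javax.persistence.Id;

  // Illustrative persistent entity for the Breakpoint concept (not the actual SDS code).
  @Entity
  public class Breakpoint {

      @Id
      @GeneratedValue
      private Long id;

      private Long sessionId;         // Session in which the breakpoint was toggled
      private String typeName;        // Type (class or interface) that received the breakpoint
      private String methodSignature; // Enclosing method, if any
      private int lineNumber;         // Line on which the breakpoint was set
      private long createdAt;         // Timestamp of the toggle event
  }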

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

  http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.
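For illustration, the response could resemble the following excerpt; the exact payload depends on the Spring Data REST conventions used by the SDS, so the fields shown here are an assumption:

  {
    "_embedded": {
      "developers": [
        {
          "name": "petrillo",
          "_links": {
            "self": { "href": "http://swarmdebugging.org/developers/1" }
          }
        }
      ]
    }
  }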

SQL Query Console. The SDS provides a console (at http://db.swarmdebugging.org) that receives SQL queries on the debugging data, providing relational aggregations and functions.
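For example, a query such as the following could be typed into the console to rank types by the number of breakpoints toggled on them; the table and column names are assumptions for this sketch, not necessarily the actual SDS schema:

  -- Rank types by the number of breakpoints toggled on them (illustrative schema).
  SELECT t.name AS type_name, COUNT(b.id) AS breakpoints
  FROM breakpoint b
  JOIN type t ON t.id = b.type_id
  GROUP BY t.name
  ORDER BY breakpoints DESC;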

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.
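As an illustration, a simple query against the search console could look like the following; the index and field names are assumptions for this sketch:

  {
    "query": {
      "match": { "typeName": "BibtexParser" }
    }
  }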

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

              Fig 15 Swarm Debugging Dashboard



              Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
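A Cypher query in the spirit of Figure 16 could look as follows; the node labels and the relationship type are assumptions for this sketch:

  // Methods reached from a given starting method in any recorded session (illustrative labels).
  MATCH (caller:Method {name: "main"})-[i:INVOKES]->(callee:Method)
  RETURN caller, i, callee
  LIMIT 25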

              Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
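The following sketch illustrates how a tracer can register these two listeners with the Eclipse debug infrastructure. It relies on the standard org.eclipse.debug.core API, but the class name and the handling logic are simplified assumptions rather than the actual SDT implementation.

  import org.eclipse.core.resources.IMarkerDelta;
  import org.eclipse.debug.core.DebugEvent;
  import org.eclipse.debug.core.DebugPlugin;
  import org.eclipse.debug.core.IBreakpointListener;
  import org.eclipse.debug.core.IDebugEventSetListener;
  import org.eclipse.debug.core.model.IBreakpoint;

  // Simplified tracer: forwards stepping and breakpoint events to the Swarm services.
  public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

      public void start() {
          DebugPlugin plugin = DebugPlugin.getDefault();
          plugin.addDebugEventListener(this);                         // stepping, suspend, resume
          plugin.getBreakpointManager().addBreakpointListener(this);  // breakpoints added/removed
      }

      @Override
      public void handleDebugEvents(DebugEvent[] events) {
          for (DebugEvent event : events) {
              if (event.getKind() == DebugEvent.SUSPEND
                      && event.getDetail() == DebugEvent.STEP_END) {
                  // A step (e.g., Step Into) just finished: inspect the stack trace here
                  // and send the invoking/invoked pair to the Swarm Debugging Services.
              }
          }
      }

      @Override
      public void breakpointAdded(IBreakpoint breakpoint) {
          // Send the breakpoint location (type, method, line) to the Swarm Debugging Services.
      }

      @Override
      public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

      @Override
      public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
  }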

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



              Fig 17 The Swarm Tracer architecture [17]

              Fig 18 The Swarm Manager view

Fig 19 Breakpoint search tool (fuzzy search example)

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.
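Conceptually, each collected invocation can be seen as a small record sent to the SDS; the field names and values below are illustrative, not the actual SDT payload:

  {
    "session": 42,
    "invoking": "BasePanel.runCommand",
    "invoked": "JabRefDesktop.openExternalViewer",
    "timestamp": "2016-05-10T14:32:11Z"
  }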

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

              Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the current stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


              Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

              Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
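Under the hood, such a search can be expressed as an ElasticSearch fuzzy query; the index and field names below are assumptions for this sketch:

  {
    "query": {
      "fuzzy": {
        "typeName": {
          "value": "fcatory",
          "fuzziness": "AUTO"
        }
      }
    }
  }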


              Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the starting and ending methods are:

  StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

  EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
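A minimal sketch of how these sets could be computed from the collected invocation pairs is shown below; the data representation and names are illustrative assumptions, not the SDS implementation:

  import java.util.*;

  // Given invocation edges (invoking -> invoked), compute starting and ending methods.
  public class StartEndFinder {

      public static Set<String> startingPoints(List<String[]> edges) {
          Set<String> invoking = new HashSet<>(), invoked = new HashSet<>();
          for (String[] e : edges) { invoking.add(e[0]); invoked.add(e[1]); }
          Set<String> result = new HashSet<>(invoking);
          result.removeAll(invoked);   // in α but not in β
          return result;
      }

      public static Set<String> endingPoints(List<String[]> edges) {
          Set<String> invoking = new HashSet<>(), invoked = new HashSet<>();
          for (String[] e : edges) { invoking.add(e[0]); invoked.add(e[1]); }
          Set<String> result = new HashSet<>(invoked);
          result.removeAll(invoking);  // in β but not in α
          return result;
      }

      public static void main(String[] args) {
          List<String[]> edges = Arrays.asList(
              new String[]{"main", "parse"},
              new String[]{"parse", "readEntry"});
          System.out.println(startingPoints(edges)); // [main]
          System.out.println(endingPoints(edges));   // [readEntry]
      }
  }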

              Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.
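For example, an external tracer could record a breakpoint with a plain HTTP call such as the following; the endpoint path and payload fields are assumptions for this sketch, not the documented SDS contract:

  curl -X POST http://swarmdebugging.org/breakpoints \
       -H "Content-Type: application/json" \
       -d '{"session": 42, "type": "BibtexParser", "lineNumber": 160}'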


                8Please give a shorter version with authorrunning and titlerunning prior to maketitle

                emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

                Dev1

                Dev2

                Dev3

                DevN

                VisualisationsSearching Tools

                Recommendation Systems

                Single Debugging Session Crowd Debugging Sessions Debugging Information

                Positive feedback

                Collect data Store data

                Transform information

                A B C

                D

                Fig 2 Overview of the Swarm Debugging approach

                24 Information Foraging

                Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

                However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

                These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

                3 The Swarm Debugging Approach

                Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

                Swarm Debugging the Collective Intelligence on Interactive Debugging 9

                First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

                We chose to instrument the Eclipse IDE a popular IDE to implementSwarm Debugging and to reach a large number of users Also we use services inthe cloud to collect the debugging events to process these events and to providevisualizations and recommendations from these events Thus we decoupleddata collection from data usage allowing other researcherstools vendors touse the collected data

                During debugging developers analyze the code toggling breakpoints andstepping in and through statements While traditional dynamic analysis ap-proaches collect all interactions states or events SD collects only invocationsexplicitly explored by developers SDI collects only visited areas and paths(chains of invocations by egStep Into or F5 in Eclipse IDE) and thus doesnot suffer from performance or memory issues as omniscient debuggers [29] ortracing-based approaches could

                Our decision to record information about breakpoints and stepping is wellsupported by a study from Beller et al [30] A finding of this study is thatsetting breakpoints and stepping through code are the most used debuggingfeatures They showed that most of the recorded debugging events are relatedto the creation (4544) removal (4362) or adjustment of breakpoints hittingthem during debugging and stepping through the source code Furthermoreother advanced debugging features like defining watches and modifying vari-able values have been much less used [30]

                4 SDI in a Nutshell

                To evaluate the Swarm Debugging approach we have implemented the SwarmDebugging Infrastructure (see httpsgithubcomSwarmDebugging)The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools forcollecting storing sharing retrieving and visualizing data collected duringdevelopersrsquo debugging activities The SDI is an Eclipse IDE11 plug-in inte-grated with Eclipse Debug core It is organized in three main modules (1) theSwarm Debugging Services (2) the Swarm Debugging Tracer and (3) Swarm

                11 httpswwweclipseorg

                10Please give a shorter version with authorrunning and titlerunning prior to maketitle

                Fig 3 GV elements - Types (nodes) invocations (edge) and Task filter area

                Debugging Views All the implementation details of SDI are available in theAppendix section

                41 Swarm Debugging Global View

                Swarm Debugging Global View (GV) is a call graph for modeling softwarebased on directed call graph [31] to explicit the hierarchical relationship byinvocated methods This visualization use rounded gray boxes (Figure 3-A) torepresent types or classes (nodes) and oriented arrows (Figure 3-B) to expressinvocations (edges) GV is built using previous debugging session context datacollected by developers for different tasks

                GV was implemented using CytoscapeJS [32] a Graph API JavaScriptframework applying an automatic layout manager breadthfirst As a web appli-cation the SD visualisations can be integrated into an Eclipse view as an SWTBrowser Widget or accessed through a traditional browser such as MozillaFirefox or Google Chrome

                In this view the grey boxes are types that developers visited during debug-ging sessions The edges represent method calls (Step Into or F5 on Eclipse)performed by all developers in all traced tasks on a software project Eachedge colour represents a task and line thickness is proportional to the numberof invocations Each debugging session contributes with a context generat-ing the visualisation combining all collected invocations The visualisation isorganised in layers or stacks and each line is a layer of invocations The start-ing points (non-invoked methods) are allocated on top of a tree the adjacent

                Swarm Debugging the Collective Intelligence on Interactive Debugging 11

                Fig 4 GV on all tasks

                nodes in an invocation sequence Besides developers can directly go to a typein the Eclipse Editor by double-clicking over a node in the diagram In the leftcorner developers can use radio buttons to filter invocations by task (figure 3-C) showing the paths used by developers during previous debugging sessionsby a task Finally developers can use the mouse to pan and zoom inout onthe visualisation Figure 4 shows an example of GV with all tasks for JabRefsystem and we have data about 8 tasks

                GV is a contextual visualization that shows only the paths explicitlyand intentionally visited by developers including type declarations andmethod invocations explored by developers based on their decisions

                5 Using SDI to Understand Debugging Activities

                The first benefit of SDI is the fact that it allows for collecting detailed in-formation about debugging sessions Using this information researchers caninvestigate developers behaviors during debugging activities To illustrate thispoint we conducted two experiments using SDI to understand developers de-bugging habits the times and effort with which they set breakpoints and thelocations where they set breakpoints

                Our analysis builds upon three independent sets of observations involvingin total three systems Studies 1 and 2 involved JabRef PDFSaM and Raptoras subject systems We analysed 45 video-recorded debugging sessions avail-able from our own collected videos (Study 1) and an empirical study performedby Jiang et al [33] (Study 2)

                In this study we answered the following research questions

                RQ1 Is there a correlation between the time of the first breakpoint and a de-bugging taskrsquos elapsed time

                RQ2 What is the effort in time for setting the first breakpoint in relation to thedebugging taskrsquos elapsed time

                12Please give a shorter version with authorrunning and titlerunning prior to maketitle

                RQ3 Are there consistent common trends with respect to the types of state-ments on which developers set breakpoints

                RQ4 Are there consistent common trends with respect to the lines methodsor classes on which developers set breakpoints

                In this section we elaborate more on each of the studies

                51 Study 1 Observational Study on JabRef

                511 Subject System

                To conduct this first study we selected JabRef12 version 32 as subject sys-tem This choice was motivated by the fact that JabRefrsquos domain is easy tounderstand thus reducing any learning effect It is composed of relatively inde-pendent packages and classes ie high cohesion low coupling thus reducingthe potential commingle effect of low code quality

                512 Participants

                We recruited eight male professional developers via an Internet-based free-lancer service13 Two participants are experts and three are intermediate inJava Developers self-reported their expertise levels which thus should betaken with caution Also we recruited 12 undergraduate and graduate stu-dents at Polytechnique Montreal to participate in our study We surveyedall the participantsrsquo background information before the study14 The surveyincluded questions about participantsrsquo self-assessment on their level of pro-gramming expertise (Java IDE and Eclipse) gender first natural languageschooling level and knowledge about TDD interactive debugging and whyusually they use a debugger All participants stated that they had experiencein Java and worked regularly with the debugger of Eclipse

                513 Task Description

                We selected five defects reported in the issue-tracking system of JabRef Wechose the task of fixing the faults that would potentially require developers toset breakpoints in different Java classes To ensure this we manually conductedthe debugging ourselves and verified that for understanding the root causeof the faults we had to set at least two breakpoints during our interactivedebugging sessions Then we asked participants to find the locations of thefaults described in Issues 318 667 669 993 and 1026 Table 1 summarisesthe faults using their titles from the issue-tracking system

                12 httpwwwjabreforg13 httpswwwfreelancercom14 Survey available on httpsgooglformsdxCQaBke2l2cqjB42

                Swarm Debugging the Collective Intelligence on Interactive Debugging 13

                Table 1 Summary of the issues considered in JabRef in Study 1

                Issues Summaries

                318 ldquoNormalize to Bibtex name formatrdquo

                667 ldquohashpound sign causes URL link to failrdquo

                669 ldquoJabRef 3132 writes bib file in a format

                that it will not readrdquo

                993 ldquoIssues in BibTeX source opens save dialog

                and opens dialog Problem with parsing entryrsquo

                multiple timesrdquo

                1026 ldquoJabref removes comments

                inside the Bibtex coderdquo

                514 Artifacts and Working Environment

                We provided the participants with a tutorial15 explaining how to install andconfigure the tools required for the study and how to use them through awarm-up task We also presented a video16 to guide the participants during thewarm-up task In a second document we described the five faults and the stepsto reproduce them We also provided participants with a video demonstratingstep-by-step how to reproduce the five defects to help them get started

                We provided a pre-configured Eclipse workspace to the participants andasked them to install Java 8 Eclipse Mars 2 with the Swarm Debugging Tracerplug-in [17] to collect automatically breakpoint-related events The Eclipseworkspace contained two Java projects a Tetris game for the warm-up taskand JabRef v32 for the study We also required that the participants installand configure the Open Broadcaster Software17 (OBS) open-source softwarefor live streaming and recording We used the OBS to record the participantsrsquoscreens

                515 Study Procedure

                After installing their environments we asked participants to perform a warm-up task with a Tetris game The task consisted of starting a debugging sessionsetting a breakpoint and debugging the Tetris program to locate a givenmethod We used this task to confirm that the participantsrsquo environmentswere properly configured and also to accustom the participants with the studysettings It was a trivial task that we also used to filter the participants whowould have too little knowledge of Java Eclipse and Eclipse Java debugger

                15 httpswarmdebuggingorgpublication16 httpsyoutubeU1sBMpfL2jc17 httpsobsprojectcom

                14Please give a shorter version with authorrunning and titlerunning prior to maketitle

                All participants who participated in our study correctly executed the warm-uptask

                After performing the warm-up task each participant performed debuggingto locate the faults We established a maximum limit of one-hour per task andinformed the participants that the task would require about 20 minutes foreach fault which we will discuss as a possible threat to validity We based thislimit on previous experiences with these tasks during mock trials After theparticipants performed each task we asked them to answer a post-experimentquestionnaire to collect information about the study asking if they found thefaults where were the faults why the faults happened if they were tired anda general summary of their debugging experience

                516 Data Collection

                The Swarm Debugging Tracer plug-in automatically and transparently col-lected all debugging data (breakpoints stepping method invocations) Alsowe recorded the participantrsquos screens during their debugging sessions withOBS We collected the following data

                ndash 28 video recordings one per participant and task which are essential tocontrol the quality of each session and to produce a reliable and repro-ducible chain of evidence for our results

                ndash The statements (lines in the source code) where the participants set break-points We considered the following types of statements because they arerepresentative of the main concepts in any programming languagesndash call methodfunction invocationsndash return returns of valuesndash assignment settings of valuesndash if-statement conditional statementsndash while-loop loops iterations

                ndash Summaries of the results of the study one per participant via a question-naire which included the following questionsndash Did you locate the faultndash Where was the faultndash Why did the fault happenndash Were you tiredndash How was your debugging experience

                Based on this data we obtained or computed the following metrics perparticipant and task

                ndash Start Time (ST ) the timestamp when the participant started a task Weanalysed each video and we started to count when effectively the partic-ipant started a task ie when she started the Swarm Debugging Tracerplug-in for example

                ndash Time of First Breakpoint (FB) the time when the participant set her firstbreakpoint

                ndash End time (T ) the time when the participant finished a task

                Swarm Debugging the Collective Intelligence on Interactive Debugging 15

                ndash Elapsed End time (ET ) ET = T minus STndash Elapsed Time First Breakpoint (EF ) EF = FB minus ST

                We manually verified whether participants were successful or not at com-pleting their tasks by analysing the answers provided in the questionnaireand the videos We knew the locations of the faults because all tasks weresolved by JabRefrsquos developers who completed the corresponding reports inthe issue-tracking system with the changes that they made

                52 Study 2 Empirical Study on PDFSaM and Raptor

                The second study consisted of the re-analysis of 20 videos of debugging sessionsavailable from an empirical study on change-impact analysis with professionaldevelopers [33] The authors conducted their work in two phases In the firstphase they asked nine developers to read two fault reports from two open-source systems and to fix these faults The objective was to observe the devel-opersrsquo behaviour as they fixed the faults In the second phase they analysedthe developersrsquo behaviour to determine whether the developers used any toolsfor change-impact analysis and if not whether they performed change-impactanalysis manually

                The two systems analysed in their study are PDF Split and Merge18 (PDF-SaM) and Raptor19 They chose one fault report per system for their studyThey chose these systems due to their non-trivial size and because the pur-poses and domains of these systems were clear and easy to understand [33]The choice of the fault reports followed the criteria that they were alreadysolved and that they could be understood by developers who did not knowthe systems Alongside each fault report they presented the developers withinformation about the systems their purpose their main entry points andinstructions for replicating the faults

                53 Results

                As can be noticed Studies 1 and 2 have different approaches The tasks inStudy 1 were fault location tasks developers did not correct the faults whilethe ones in Study 2 were fault correction tasks Moreover Study 1 exploredfive different faults while Study 2 only analysed one fault per system Thecollected data provide a diversity of cases and allow a rich in-depth view ofhow developers set breakpoints during different debugging sessions

                In the following we present the results regarding each research questionaddressed in the two studies

                18 httpwwwpdfsamorg19 httpscodegooglecompraptor-chess-interface

                16Please give a shorter version with authorrunning and titlerunning prior to maketitle

                RQ1 Is there a correlation between the time of the first breakpoint and adebugging taskrsquos elapsed time

                We normalised the elapsed time between the start of a debugging session andthe setting of the first breakpoint EF by dividing it by the total durationof the task ET to compare the performance of participants across tasks (seeEquation 1)

                MFB =EF

                ET(1)

                Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

                Tasks Average Times (min) Std Devs (min)

                318 44 64

                667 28 29

                669 22 25

                993 25 25

                1026 25 17

                PdfSam 54 18

                Raptor 59 13

                Table 2 shows the average effort (in minutes) for each task We find inStudy 1 that on average participants spend 27 of the total task duration toset the first breakpoint (std dev 17) In Study 2 it took on average 23 ofthe task time to participants to set the first breakpoint (std dev 17)

                We conclude that the effort for setting the firstbreakpoint takes near one-quarter of the total ef-fort of a single debugging sessiona So this effortis important and this result suggest that debuggingtime could be reduced by providing tool support forsetting breakpoints

                a In fact there is a ldquodebugging taskrdquo that starts when adeveloper starts to investigate the issue to understand andsolve it There is also an ldquointeractive debugging sessionrdquothat starts when a developer sets their first breakpoint anddecides to run an application in ldquodebugging moderdquo Alsoa developer could need to conclude one debugging task inone-to-many interactive debugging sessions

                Swarm Debugging the Collective Intelligence on Interactive Debugging 17

                RQ2 What is the effort in time for setting the first breakpoint in relation tothe debugging taskrsquos elapsed time

                For each session we normalized the data using Equation 1 and associated theratios with their respective task elapsed times Figure 5 combines the data fromthe debugging sessions each point in the plot represents a debugging sessionwith a specific rate of breakpoints per minute Analysing the first breakpointdata we found a correlation between task elapsed time and time of the firstbreakpoint (ρ = minus047) resulting that task elapsed time is inversely correlatedto the time of taskrsquos first breakpoint

                f(x) =α

                xβ(2)

                where α = 12 and β = 044

                Fig 5 Relation between time of the first breakpoint and task elapsed time(data from the two studies)

                We observe that when developers toggle break-points carefully they complete tasks faster thandevelopers who set breakpoints quickly

                This finding also corroborates previous results found with a different set oftasks [17]

                18Please give a shorter version with authorrunning and titlerunning prior to maketitle

                RQ3 Are there consistent common trends with respect to the types of state-ments on which developers set breakpoints

                We classified the types of statements on which the participants set their break-points and analysed each breakpoint For Study 1 Table 3 shows for examplethat 53 (111207) of the breakpoints are set on call statements while only1 (3207) are set on while-loop statements For Study 2 Table 4 shows sim-ilar trends 43 (43100) of breakpoints are set on call statements and only4 (3207) on while-loop statements The only difference is on assignmentstatements where in Study 1 we found 17 while Study 2 showed 27 Aftergrouping if-statement return and while-loop into control-flow statements wefound that 30 of breakpoints are on control-flow statements while 53 areon call statements and 17 on assignments

                Table 3 Study 1 - Breakpoints per type of statement

                Statements Numbers of Breakpoints

                call 111 53

                if-statement 39 19

                assignment 36 17

                return 18 10

                while-loop 3 1

                Table 4 Study 2 - Breakpoints per type of statement

                Statements Numbers of Breakpoints

                call 43 43

                if-statement 22 22

                assignment 27 27

                return 4 4

                while-loop 4 4

                13

                Our results show that in both studies 50 ofthe breakpoints were set on call statements whilecontrol-flow related statements were comparativelyfewer being the while-loop statement the leastcommon (2-4)

                Swarm Debugging the Collective Intelligence on Interactive Debugging 19

                RQ4 Are there consistent common trends with respect to the lines methodsor classes on which developers set breakpoints

                We investigated each breakpoint to assess whether there were breakpoints onthe same line of code for different participants performing the same tasksie resolving the same fault by comparing the breakpoints on the same taskand different tasks We sorted all the breakpoints from our data by the Classin which they were set and line number and we counted how many times abreakpoint was set on exactly the same line of code across participants Wereport the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2

                In Study 1 we found 15 lines of code with two or more breakpoints onthe same line for the same task by different participants In Study 2 we ob-served breakpoints on exactly the same lines for eight lines of code in PDFSaMand six in Raptor For example in Study 1 on line 969 in Class BasePanelparticipants set a breakpoint on

                JabRefDesktopopenExternalViewer(metaData()

                linktoString() field)

                Three different participants set three breakpoints on that line for issue667 Tables 5 6 and 7 report all recurring breakpoints These observationsshow that participants do not choose breakpoints purposelessly as suggestedby Tiarks and Rohm [15] We suggest that there is an underlying rationaleon that decision because different participants set breakpoints on exactly thesame lines of code

                Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

                Tasks Classes Lines of Code Breakpoints

                0318 AuthorsFormatter 43 5

                0318 AuthorsFormatter 131 3

                0667 BasePanel 935 2

                0667 BasePanel 969 3

                0667 JabRefDesktop 430 2

                0669 OpenDatabaseAction 268 2

                0669 OpenDatabaseAction 433 4

                0669 OpenDatabaseAction 451 4

                0993 EntryEditor 717 2

                0993 EntryEditor 720 2

                0993 EntryEditor 723 2

                0993 BibDatabase 187 2

                0993 BibDatabase 456 2

                1026 EntryEditor 1184 2

                1026 BibtexParser 160 2

                20Please give a shorter version with authorrunning and titlerunning prior to maketitle

                Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

                Classes Lines of Code Breakpoints

                PdfReader 230 2

                PdfReader 806 2

                PdfReader 1923 2

                ConsoleServicesFacade 89 2

                ConsoleClient 81 2

                PdfUtility 94 2

                PdfUtility 96 2

                PdfUtility 102 2

                Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

                Classes Lines of Code Breakpoints

                icsUtils 333 3

                Game 1751 2

                ExamineController 41 2

                ExamineController 84 3

                ExamineController 87 2

                ExamineController 92 2

                When analysing Table 8 we found 135 lines of code having two or morebreakpoints for different tasks by different participants For example five dif-ferent participants set five breakpoints on the line of code 969 in Class BaseP-anel independently of their tasks (in that case for three different tasks)This result suggests a potential opportunity to recommend those locations ascandidates for new debugging sessions

                We also analysed if the same class received breakpoints for different tasksWe grouped all breakpoints by class and counted how many breakpoints wereset on the classes for different tasks putting ldquoYesrdquo if a type had a breakpointproducing Table 9 We also counted the numbers of breakpoints by type andhow many participants set breakpoints on a type

                For Study 1 we observe that ten classes received breakpoints in differenttasks by different participants resulting in 77 (160207) of breakpoints Forexample class BibtexParser had 21 (44207) of breakpoints in 3 out of5 tasks by 13 different participants (This analysis only applies to Study 1because Study 2 has only one task per system thus not allowing to comparebreakpoints across tasks)

                Swarm Debugging the Collective Intelligence on Interactive Debugging 21

                Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

                Classes Lines of Code Breakpoints

                BibtexParser 138151159 222

                160165168 323

                176198199299 2222

                EntryEditor 717720721 342

                723837842 232

                11841393 32

                BibDatabase 175187223456 2326

                OpenDatabaseAction 433450451 424

                JabRefDesktop 4084430 223

                SaveDatabaseAction 177188 42

                BasePanel 935969 25

                AuthorsFormatter 43131 54

                EntryTableTransferHandler 346 2

                FieldTextMenu 84 2

                JabRefFrame 1119 2

                JabRefMain 8 5

                URLUtil 95 2

                Fig 6 Methods with 5 or more breakpoints

                Finally we count how many breakpoints are in the same method acrosstasks and participants indicating that there were ldquopreferredrdquo methods forsetting breakpoints independently of task or participant We find that 37methods received at least two breakpoints and 13 methods received five ormore breakpoints during different tasks by different developers as reported inFigure 6 In particular the method EntityEditorstoreSource received 24 break-

                22Please give a shorter version with authorrunning and titlerunning prior to maketitle

                Table 9 Study 1 - Breakpoints by class across different tasks

                Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                SaveDatabaseAction Yes Yes Yes 7 2

                BasePanel Yes Yes Yes Yes 14 7

                JabRefDesktop Yes Yes 9 4

                EntryEditor Yes Yes Yes 36 4

                BibtexParser Yes Yes Yes 44 6

                OpenDatabaseAction Yes Yes Yes 19 13

                JabRef Yes Yes Yes 3 3

                JabRefMain Yes Yes Yes Yes 5 4

                URLUtil Yes Yes 4 2

                BibDatabase Yes Yes Yes 19 4

                points and the method BibtexParserparseFileContent received 20 breakpointsby different developers on different tasks

                Our results suggest that developers do not choosebreakpoints lightly and there is a rationale intheir setting breakpoints because different devel-opers set breakpoints on the same line of code forthe same task and different developers set break-points on the same type or method for differenttasks Furthermore our results show that differentdevelopers for different tasks set breakpoints atthe same locations These results show the useful-ness of collecting and sharing breakpoints to assistdevelopers during maintenance tasks

                6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviour of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse (with a Tetris program used for the warm-up task). The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open source and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants have on average six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV and 13 participated in the fault-location controlled experiment (7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org    21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants).

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than three hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org    26 http://server.swarmdebugging.org    27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                Fig 9 GV for Task 0667


                Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                Fig 11 GV usefulness - experimental phase one

                Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.
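To make the last two columns of Table 10 explicit, here is a worked example for the elapsed time of Task 0993, under the assumption (suggested by the column headers) that ∆ is the control time minus the experimental time in seconds and that [E/C] is their ratio expressed as a percentage:

\[
\Delta = 00{:}30{:}08 - 00{:}16{:}05 = 1808\,\mathrm{s} - 965\,\mathrm{s} = 843\,\mathrm{s},
\qquad
E/C = 965/1808 \approx 0.53 = 53\%.
\]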

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation, under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.



                7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools that use novel data-mining techniques to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, more systems, and more tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We use only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context, whereas our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., exploits the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort in debugging.

                10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.



Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than three hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.l. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                Appendix - Implementation of Swarm Debugging

                Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a minimal code sketch of two of these entities follows the list below):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the event data collected when a developer performs some action during a debugging session.
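To make the meta-model more concrete, the following minimal sketch shows how two of these entities could be represented in Java. The field names are illustrative assumptions for this sketch, not the exact SDI schema.

// Illustrative sketch of two Swarm Debugging domain entities.
// Field names are assumptions; the actual SDI schema may differ.
import java.time.Instant;

class BreakpointRecord {
    long id;
    String typeName;      // fully qualified type where the breakpoint was toggled
    String methodName;    // enclosing method, if any
    int lineNumber;       // source line of the breakpoint
    long sessionId;       // owning debugging session
    Instant createdAt;    // when the developer toggled the breakpoint
}

class InvocationRecord {
    long id;
    String invokingMethod;  // caller, e.g., "BibtexParser.parse"
    String invokedMethod;   // callee, e.g., "BibtexParser.parseFileContent"
    long sessionId;         // session in which the invocation was observed
}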

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot



http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
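As an illustration, such a request can be issued with the standard Java HTTP client; the sketch below uses the example URL above and simply prints the raw JSON response (the exact JSON structure returned by the SDS is not reproduced here).

// Minimal sketch of querying the Swarm RESTful API with the Java 11+ HttpClient.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // JSON list of developers named "petrillo"
    }
}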

SQL Query Console. The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
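As a sketch of the kind of aggregation such a console supports, the JDBC snippet below counts breakpoints per type against the SDS PostgreSQL store. The connection settings and the table and column names (breakpoint, type_name) are assumptions made for illustration only.

// Sketch of an SQL aggregation over the SDS relational store
// (requires the PostgreSQL JDBC driver on the classpath).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}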

Full-text Search Engine. The SDS also provides an ElasticSearch31 instance, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org    31 https://www.elastic.co


                Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
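For readers unfamiliar with Cypher, the sketch below shows how a similar query could be run programmatically with the official Neo4J Java driver. The node label (Method), the relationship type (INVOKES), and the connection settings are assumptions for illustration; the actual SDS graph schema may use different names.

// Sketch of querying invocation pairs from the SDS graph store with the Neo4J Java driver.
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQueryExample {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(
                    "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                  + "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
            result.forEachRemaining(record ->
                    System.out.println(record.get("caller").asString()
                            + " -> " + record.get("callee").asString()));
        }
    }
}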

                Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners, IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints or trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.
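The following simplified sketch illustrates how such a tracer can hook into the Eclipse debug framework through the two listener interfaces mentioned above. The sendToSwarmServices method is a placeholder for the SDT logic that forwards the collected data to the Swarm Debugging Services; it is not the actual SDT implementation.

// Simplified sketch of a debug tracer registering the two Eclipse listeners.
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping and breakpoint hits arrive as SUSPEND events with a detail code
            // (e.g., DebugEvent.STEP_END or DebugEvent.BREAKPOINT).
            if (event.getKind() == DebugEvent.SUSPEND) {
                sendToSwarmServices("suspend", event.getDetail(), event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSwarmServices("breakpointAdded", 0, breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarmServices("breakpointRemoved", 0, breakpoint);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarmServices("breakpointChanged", 0, breakpoint);
    }

    private void sendToSwarmServices(String kind, int detail, Object payload) {
        // Placeholder: the real SDT sends this data as RESTful messages to the SDS.
        System.out.println(kind + " detail=" + detail + " payload=" + payload);
    }
}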

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                Fig 17 The Swarm Tracer architecture [17]

                Fig 18 The Swarm Manager view

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, recording


                Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

                Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
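For illustration, a fuzzy search similar to the one in Figure 19 could be sent directly to the underlying ElasticSearch REST endpoint. In the sketch below, the index name (breakpoints) and the field name (className) are assumptions; the actual index layout used by the SDS may differ.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearchExample {
    public static void main(String[] args) throws Exception {
        // ElasticSearch "fuzzy" query: tolerates small typos such as "fcatory".
        String query = "{ \"query\": { \"fuzzy\": { \"className\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Hits should include breakpoints toggled on types such as *Factory despite the typo.
        System.out.println(response.body());
    }
}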


                Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
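The definition above translates directly into code. The sketch below is illustrative only (not the SDS implementation): it computes the starting and ending methods from a list of invoking/invoked pairs.

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** Each edge is a pair {invoking, invoked}. */
    static Set<String> startingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }
        Set<String> starting = new HashSet<>(invoking); // alpha
        starting.removeAll(invoked);                    // remove anything also in beta
        return starting;
    }

    static Set<String> endingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }
        Set<String> ending = new HashSet<>(invoked);    // beta
        ending.removeAll(invoking);                     // remove anything also in alpha
        return ending;
    }

    public static void main(String[] args) {
        List<String[]> edges = Arrays.asList(
            new String[] {"main", "parse"},
            new String[] {"parse", "readEntry"},
            new String[] {"readEntry", "format"});
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [format]
    }
}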

                Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                • 1 Introduction
                • 2 Background
                • 3 The Swarm Debugging Approach
                • 4 SDI in a Nutshell
                • 5 Using SDI to Understand Debugging Activities
                • 6 Evaluation of Swarm Debugging using GV
                • 7 Discussion
                • 8 Threats to Validity
                • 9 Related work
                • 10 Conclusion
                • 11 Acknowledgment


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers and tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic-analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. SDI collects only visited areas and paths (chains of invocations, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, were used much less [30].

                  4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig 3 GV elements - Types (nodes), invocations (edges), and task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call-graph visualization for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships created by method invocations. This visualization uses rounded grey boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using debugging-session context data previously collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying the automatic layout manager breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes a context, and the visualisation combines all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are placed at the top of a tree, with the adjacent


                  Fig 4 GV on all tasks

nodes representing an invocation sequence. Besides, developers can go directly to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including the type declarations and method invocations they explored, based on their decisions.

                  5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviour during debugging activities. To illustrate this point, we conducted two studies using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In these studies, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service.13 Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study.14 The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the Eclipse debugger.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issue   Summary
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog. Problem with parsing entry multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All the participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results;

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
– call: method/function invocations;
– return: returns of values;
– assignment: settings of values;
– if-statement: conditional statements;
– while-loop: loop iterations;

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
– Did you locate the fault?
– Where was the fault?
– Why did the fault happen?
– Were you tired?
– How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example;

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint;

– End Time (T): the time when the participant finished a task;


– Elapsed End Time (ET): ET = T - ST;
– Elapsed Time of First Breakpoint (EF): EF = FB - ST (see the small example below).
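As an illustration of these metrics, with made-up values: if a participant started a task at 00:05:00 (ST), set her first breakpoint at 00:12:00 (FB), and finished the task at 00:35:00 (T), then EF = 7 minutes and ET = 30 minutes.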

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor.19 They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while those in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
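For example, with made-up numbers: a participant who sets her first breakpoint 6 minutes into a task whose total duration is 24 minutes obtains MFB = 6/24 = 0.25, i.e., one quarter of the task elapsed before the first breakpoint was set.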

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)   Std Dev (min)
318      44                   64
667      28                   29
669      22                   25
993      25                   25
1026     25                   17
PdfSam   54                   18
Raptor   59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one quarter of the total effort of a single debugging session.a So this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint, following Equation 2:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (1-4%).


RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class               Line of Code   Breakpoints
0318   AuthorsFormatter    43             5
0318   AuthorsFormatter    131            3
0667   BasePanel           935            2
0667   BasePanel           969            3
0667   JabRefDesktop       430            2
0669   OpenDatabaseAction  268            2
0669   OpenDatabaseAction  433            4
0669   OpenDatabaseAction  451            4
0993   EntryEditor         717            2
0993   EntryEditor         720            2
0993   EntryEditor         723            2
0993   BibDatabase         187            2
0993   BibDatabase         456            2
1026   EntryEditor         1184           2
1026   BibtexParser        160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                  Line of Code   Breakpoints
PdfReader              230            2
PdfReader              806            2
PdfReader              1923           2
ConsoleServicesFacade  89             2
ConsoleClient          81             2
PdfUtility             94             2
PdfUtility             96             2
PdfUtility             102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class              Line of Code   Breakpoints
icsUtils           333            3
Game               1751           2
ExamineController  41             2
ExamineController  84             3
ExamineController  87             2
ExamineController  92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints, set for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                      Lines of Code          Breakpoints
BibtexParser               138, 151, 159          2, 2, 2
                           160, 165, 168          3, 2, 3
                           176, 198, 199, 299     2, 2, 2, 2
EntryEditor                717, 720, 721          3, 4, 2
                           723, 837, 842          2, 3, 2
                           1184, 1393             3, 2
BibDatabase                175, 187, 223, 456     2, 3, 2, 6
OpenDatabaseAction         433, 450, 451          4, 2, 4
JabRefDesktop              40, 84, 430            2, 2, 3
SaveDatabaseAction         177, 188               4, 2
BasePanel                  935, 969               2, 5
AuthorsFormatter           43, 131                5, 4
EntryTableTransferHandler  346                    2
FieldTextMenu              84                     2
JabRefFrame                1119                   2
JabRefMain                 8                      5
URLUtil                    95                     2

                  Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints.


Table 9 Study 1 - Breakpoints by class across different tasks (number of the five issues in which each class received breakpoints, total breakpoints, and developer diversity)

Type                Tasks with Breakpoints   Breakpoints   Developer Diversity
SaveDatabaseAction  3                        7             2
BasePanel           4                        14            7
JabRefDesktop       2                        9             4
EntryEditor         3                        36            4
BibtexParser        3                        44            6
OpenDatabaseAction  3                        19            13
JabRef              3                        3             3
JabRefMain          4                        5             4
URLUtil             2                        4             2
BibDatabase         3                        19            4

The method BibtexParser.parseFileContent received 20 breakpoints from different developers working on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                  6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviour of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks, with a Tetris program for warm-up, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                  Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers,21 23 male and seven female. Our participants have on average six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation of GV, and 11 participated in fault location (controlled experiment: 7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject-profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools needed to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described in a video to guide the participants, which we make available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (only for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph contained invocations collected from previous debugging sessions. These invocations were generated during "good" sessions, in which the fault was found (27/31 sessions), and "bad" sessions, in which it was not (4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                  Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                  Fig 9 GV for Task 0667


                  Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without having seen the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                  Fig 11 GV usefulness - experimental phase one

                  Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:55      00:03:40         -44               126
Time to start     00:04:44      00:05:18         -33               112
Elapsed time      00:30:08      00:16:05         843               53

Task 1026

Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:42      00:04:48         -126              177
Time to start     00:04:02      00:03:43         19                92
Elapsed time      00:24:58      00:20:41         257               83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to be scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff shows where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct: developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                  7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                  8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their effort during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that their answers do not represent their real effort, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware that all faults had already been solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach serves editing mode (finding breakpoints or visualising paths) as well as interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before making changes by using IDE navigational functionalities. They also did dynamic change impact analysis after making changes by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                  10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                  11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                  References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. N.A.
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Roehm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                  Appendix - Implementation of Swarm Debugging

                  Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                  Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (an illustrative sketch of two of these concepts as Java classes follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                  Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some action during a debugging session.
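The sketch below is only an illustration of how two of these concepts could be represented as plain Java entities; the field names are assumptions made for this example and not the actual SDS schema.

// Illustrative sketch (not the SDS implementation): two SDI concepts as Java entities.
class Breakpoint {
    private long id;
    private String typeName;    // fully qualified type where the breakpoint was toggled
    private String methodName;  // enclosing method, if any
    private int lineNumber;     // source line of the breakpoint
    private long sessionId;     // owning debugging session
    // getters/setters omitted for brevity
}

class Invocation {
    private long id;
    private String invokingMethod; // caller captured from the stack trace
    private String invokedMethod;  // callee reached by a stepping event
    private long sessionId;        // owning debugging session
    // getters/setters omitted for brevity
}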

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
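As a hedged illustration, the snippet below shows how such a request could be issued from Java 11+ using the standard java.net.http client; the endpoint is the one quoted above, while the exact shape of the JSON response is an assumption.

// Minimal sketch: calling the SDS RESTful API from Java (assumes Java 11+).
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // The body is expected to be a JSON document listing the matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}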

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
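A minimal sketch of the kind of relational aggregation such a console supports is given below, run here over JDBC; the connection URL, credentials, table, and column names are assumptions for illustration only, not the real SDS schema.

// Hedged sketch: an aggregate SQL query over the debugging data via JDBC
// (requires the PostgreSQL driver on the classpath).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SwarmSqlExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://localhost:5432/swarm"; // assumed URL
        try (Connection con = DriverManager.getConnection(url, "swarm", "secret");
             Statement st = con.createStatement();
             // Hypothetical query: count breakpoints per type across all sessions.
             ResultSet rs = st.executeQuery(
                 "SELECT type_name, COUNT(*) AS total " +
                 "FROM breakpoint GROUP BY type_name ORDER BY total DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getLong("total"));
            }
        }
    }
}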

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                  Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                  Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32

graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
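For illustration, a Cypher query of this kind could also be issued programmatically with the official Neo4j Java driver, as in the hedged sketch below; the node labels, relationship type, and property names are assumptions, not the actual SDS graph schema.

// Hedged sketch: running a Cypher query from Java with the Neo4j driver (org.neo4j.driver).
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class SwarmCypherExample {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver(
                "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Hypothetical query: methods invoked from a given type during any session.
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                "WHERE caller.typeName = 'BibtexParser' " +
                "RETURN callee.name AS name LIMIT 25");
            while (result.hasNext()) {
                System.out.println(result.next().get("name").asString());
            }
        }
    }
}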

                  Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
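The sketch below illustrates, under stated assumptions, the kind of Eclipse debug listeners such a tracer registers. The Eclipse Debug Platform types are real; the class name, registration method, and persistToSds() helper are illustrative placeholders for the SDT's actual code and its RESTful calls to the SDS.

// Hedged sketch of a debug tracer registering the two Eclipse listeners.
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void register() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping and suspend/resume events arrive here; relevant ones would
            // be turned into Swarm events and sent to the SDS.
            persistToSds("debugEvent", event.getKind());
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        persistToSds("breakpointAdded", 0);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

    private void persistToSds(String kind, int detail) {
        // Placeholder for the RESTful call to the Swarm Debugging Services.
    }
}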

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                  Fig 17 The Swarm Tracer architecture [17]

                  Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                  Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                  Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                  Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
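As a hedged illustration of what such a fuzzy query could look like against an ElasticSearch backend, the sketch below posts a fuzzy query over HTTP from Java; the index name ("breakpoints") and field name ("typeName") are assumptions for illustration only.

// Hedged sketch: a fuzzy ElasticSearch query for breakpoints, issued from Java 11+.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearchExample {
    public static void main(String[] args) throws Exception {
        // ElasticSearch "fuzzy" query for the misspelled term "fcatory".
        String query =
            "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Hits should include breakpoints on types such as "...Factory" despite the typo.
        System.out.println(response.body());
    }
}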


                  Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices invoking methods and β is the subset of all vertices invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
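The sketch below is a minimal Java illustration of the set definitions above: given the invocation edges recorded in a session, starting methods invoke others without being invoked, and ending methods are invoked without invoking. The record type is illustrative and not the SDS data model.

// Hedged sketch: computing Starting/Ending methods from invocation edges.
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    record Invocation(String invoking, String invoked) { }

    public static Set<String> startingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked); // in α and not in β
        return result;
    }

    public static Set<String> endingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking); // in β and not in α
        return result;
    }
}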

                  Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, software exploration is divided by sessions, and the resulting call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



Fig 3 GV elements - Types (nodes), invocations (edges), and Task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix section.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], to make explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). The GV is built using previous debugging session context data, collected by developers for different tasks.

The GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying the automatic layout manager breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with the adjacent nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig 4 GV on all tasks

The GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

                    5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions, available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults, using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


                    Table 1 Summary of the issues considered in JabRef in Study 1

Issues  Summaries
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to automatically collect breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST.
– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaires and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
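A one-line sketch of this normalisation (our own illustration), where ef and et are the durations EF and ET defined above:

    import java.time.Duration;

    final class FirstBreakpointRatio {
        /** MFB = EF / ET: fraction of the task duration spent before the first breakpoint. */
        static double mfb(Duration ef, Duration et) {
            return (double) ef.toMillis() / (double) et.toMillis();
        }
    }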

Table 2 Elapsed time by task (average) – Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a Thus, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a “debugging task” that starts when a developer starts to investigate the issue to understand and solve it. There is also an “interactive debugging session” that starts when a developer sets their first breakpoint and decides to run an application in “debugging mode”. Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalised the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint. We fitted the data with the function in Equation 2:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           111                     53
if-statement   39                      19
assignment     36                      17
return         18                      10
while-loop     3                       1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           43                      43
if-statement   22                      22
assignment     27                      27
return         4                       4
while-loop     4                       4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2–4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale behind that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting “Yes” if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.
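This grouping is a plain aggregation over the collected breakpoint records; the sketch below (our own illustration, with invented field names, not the SDI implementation) counts, per class, the breakpoints and the distinct tasks and developers that set them:

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    record BreakpointRecord(String developer, String task, String className, int line) {}

    class BreakpointHotspots {
        /** For each class: number of breakpoints, distinct tasks, and distinct developers. */
        static Map<String, String> summarise(List<BreakpointRecord> breakpoints) {
            return breakpoints.stream().collect(Collectors.groupingBy(
                    BreakpointRecord::className,
                    Collectors.collectingAndThen(Collectors.toList(), bps -> String.format(
                            "%d breakpoints, %d tasks, %d developers",
                            bps.size(),
                            bps.stream().map(BreakpointRecord::task).distinct().count(),
                            bps.stream().map(BreakpointRecord::developer).distinct().count()))));
        }
    }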

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were “preferred” methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 - Breakpoints by class across different tasks (Issues 318, 667, 669, 993, and 1026)

Types                Tasks with Breakpoints   Breakpoints   Dev Diversities
SaveDatabaseAction   3                        7             2
BasePanel            4                        14            7
JabRefDesktop        2                        9             4
EntryEditor          3                        36            4
BibtexParser         3                        44            6
OpenDatabaseAction   3                        19            13
JabRef               3                        3             3
JabRefMain           4                        5             4
URLUtil              2                        4             2
BibDatabase          3                        19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                    6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks (with a Tetris program for the warm-up and JabRef as the subject system) using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers,21 of whom 23 were male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as “Where was the fault for Task 318?” or “For Task 1173, where would you toggle a breakpoint to fix the fault?”, and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the “type” (classes) in which the faults were located for Issues 318, 667, and 669 using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants).

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during “good” sessions (the fault was found – 27/31 sessions) and “bad” sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.
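As a rough sketch of how such a view can be assembled (our own simplified illustration; GV's actual implementation may differ), every invocation traced during a session adds a directed edge between the caller's and callee's types, and edge weights accumulate over sessions:

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    /** One method invocation captured during a debugging session. */
    record Invocation(String callerType, String calleeType) {}

    class InvocationGraph {
        // callerType -> (calleeType -> number of times that edge was traversed)
        private final Map<String, Map<String, Integer>> edges = new HashMap<>();

        /** Adds one traced invocation as a weighted, directed edge between two types. */
        void add(Invocation inv) {
            edges.computeIfAbsent(inv.callerType(), k -> new HashMap<>())
                 .merge(inv.calleeType(), 1, Integer::sum);
        }

        /** Merges the invocations of one (successful or not) debugging session into the graph. */
        void addSession(Collection<Invocation> session) {
            session.forEach(this::add);
        }
    }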

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a “candidate” type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a “candidate” type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a “candidate” type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants in the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on the selected type. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         −44           126
Time to start      00:04:44      00:05:18         −33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         −126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). For example, for the elapsed time of Task 0993, 00:30:08 − 00:16:05 = 843 s, i.e., the experimental group needed only 53% of the control group's time. This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that “[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact”, which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding “bottlenecks”. In particular, one participant wrote that our visualisation “allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from”.

6.4.2 Intrinsic Limitations

Location. One participant commented that “the location where [an] issue occurs is not the same as the one that is responsible for the issue”. We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the “importance” of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system, but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a “trap” if all developers whose paths are displayed followed the “wrong” paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation “[a]llows to quickly see where the problem probably has been before it got fixed”, while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as “execution time metrics [by] invocations” and “failure/success rate [by] invocations”, could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a “method-level” view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question “Describe your debugging experience”. All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                    7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line,28 and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.
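Such a search could be as simple as filtering the shared breakpoints by the class a developer is about to debug and ranking its lines by how often they were used; the sketch below is our own illustration of the idea, not the actual SDI API. Ranking by frequency follows directly from the "hot-spot" observation above: lines chosen independently by several developers are promising starting points.

    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    class BreakpointSearch {
        record SharedBreakpoint(String developer, String task, String className, int line) {}

        /** Lines of a class that received shared breakpoints, most frequently used first. */
        static List<Integer> candidateLines(List<SharedBreakpoint> shared, String className) {
            Map<Integer, Long> byLine = shared.stream()
                    .filter(bp -> bp.className().equals(className))
                    .collect(Collectors.groupingBy(SharedBreakpoint::line, Collectors.counting()));
            return byLine.entrySet().stream()
                    .sorted(Map.Entry.<Integer, Long>comparingByValue(Comparator.reverseOrder()))
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
        }
    }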

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                    8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumptions.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the systems used can affect our results.

                    9 Related work

We now summarise work related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the “true” locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                    10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important “debugging hot-spots” (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated “hunters”, and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated “builders”. Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour,


which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                    11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                    References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. N.A.
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


                    15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

                    16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

                    17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

                    18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

                    19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

                    101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

                    oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

                    1218575

                    22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

                    neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

                    conditional_breakpointhtmampcp=1_3_6_0_5

                    23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

                    24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

                    linkspringercom101007s10818-015-9203-6

                    25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

                    (Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

                    actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

                    C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

                    29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

                    30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

                    31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

                    32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

                    pmcentrezamprendertype=abstract

                    33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

                    34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

                    35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

                    36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

                    doiacmorg1011452622669

                    37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

                    Swarm Debugging the Collective Intelligence on Interactive Debugging 41

                    38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

                    39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

                    40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

                    41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

                    42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

                    43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

                    44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

                    45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

                    46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

                    47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

                    48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

                    49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

                    50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

                    51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

                    52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

                    53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

                    54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

                    55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

                    56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

                    57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

                    58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551


                    Appendix - Implementation of Swarm Debugging

                    Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                    Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include the following (a minimal code sketch is given after the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.

                    Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.
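To make the meta-model concrete, the sketch below shows how two of these concepts could be represented as plain Java classes; the field names are illustrative assumptions derived from the descriptions above, not the actual SDS schema.

    import java.util.Date;

    // Illustrative only: a session groups the events recorded for one developer,
    // one product, and one task.
    class Session {
        long id;
        String developerName;   // Developer who runs the session
        String productName;     // Product (set of Eclipse projects) under debugging
        String taskDescription; // Task being executed
        Date startedAt;
    }

    // Illustrative only: a breakpoint is recorded together with its location.
    class Breakpoint {
        long sessionId;     // Session in which the breakpoint was toggled
        String typeName;    // Type (class or interface) that owns the code
        String methodName;  // Method, if the breakpoint falls inside one
        int lineNumber;     // Line on which the breakpoint was set
    }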

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.
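As an illustration, the minimal Java sketch below issues that request and prints the JSON response; the endpoint follows the example above, while the class name and the absence of error handling are simplifications for illustration.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class SwarmApiExample {
        public static void main(String[] args) throws Exception {
            // Endpoint taken from the example request above.
            URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.setRequestProperty("Accept", "application/json");

            // Print the JSON document returned by the service.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }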

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) that receives SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

                    Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
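To give a flavour of such queries, the sketch below runs a Cypher pattern through the Neo4j Java driver (1.x API); the Method label, the INVOKES relationship, and the connection settings are assumptions for illustration and do not necessarily match the actual SDS graph schema.

    import org.neo4j.driver.v1.AuthTokens;
    import org.neo4j.driver.v1.Driver;
    import org.neo4j.driver.v1.GraphDatabase;
    import org.neo4j.driver.v1.Record;
    import org.neo4j.driver.v1.Session;
    import org.neo4j.driver.v1.StatementResult;

    public class CypherExample {
        public static void main(String[] args) {
            // Connection settings are placeholders.
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "secret"));
                 Session session = driver.session()) {
                // Hypothetical schema: Method nodes linked by INVOKES relationships.
                StatementResult result = session.run(
                        "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                      + "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
                while (result.hasNext()) {
                    Record row = result.next();
                    System.out.println(row.get("caller").asString() + " -> "
                            + row.get("callee").asString());
                }
            }
        }
    }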

                    Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
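A minimal sketch of such listeners is shown below, using the standard Eclipse debug APIs; the class name and the println calls are illustrative only and do not reproduce the actual SDT implementation.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class MinimalDebugTracer implements IDebugEventSetListener, IBreakpointListener {

        // Register the tracer with the Eclipse debug plug-in.
        public void start() {
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // Stepping and suspend/resume events arrive here.
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    System.out.println("Step finished on: " + event.getSource());
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            System.out.println("Breakpoint added in: " + breakpoint.getMarker().getResource());
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }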

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

                    Fig 17 The Swarm Tracer architecture [17]

                    Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

                    Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                    Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".

Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
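A small sketch of how starting and ending points can be derived from the collected invocation pairs is shown below; the data shape (caller/callee pairs of method names) is an illustrative assumption, not the SDS implementation.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class StartEndFinder {
        public static void main(String[] args) {
            // Each invocation is a pair: caller -> callee (illustrative data).
            List<String[]> invocations = Arrays.asList(
                    new String[] {"main", "parse"},
                    new String[] {"parse", "readEntry"},
                    new String[] {"main", "render"});

            Set<String> invoking = new HashSet<>();  // alpha: methods that invoke others
            Set<String> invoked  = new HashSet<>();  // beta: methods that are invoked
            for (String[] edge : invocations) {
                invoking.add(edge[0]);
                invoked.add(edge[1]);
            }

            Set<String> startingPoints = new HashSet<>(invoking);
            startingPoints.removeAll(invoked);       // in alpha and not in beta

            Set<String> endingPoints = new HashSet<>(invoked);
            endingPoints.removeAll(invoking);        // in beta and not in alpha

            System.out.println("Starting points: " + startingPoints); // [main]
            System.out.println("Ending points: " + endingPoints);     // [readEntry, render]
        }
    }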

                    Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                    • 1 Introduction
                    • 2 Background
                    • 3 The Swarm Debugging Approach
                    • 4 SDI in a Nutshell
                    • 5 Using SDI to Understand Debugging Activities
                    • 6 Evaluation of Swarm Debugging using GV
                    • 7 Discussion
                    • 8 Threats to Validity
                    • 9 Related work
                    • 10 Conclusion
                    • 11 Acknowledgment

Fig 4 GV on all tasks

nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

                      5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times at and effort with which they set breakpoints, and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef (http://www.jabref.org) version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service (https://www.freelancer.com). Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study (survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42). The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the Eclipse debugger.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose faults whose fixing would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial (http://swarmdebugging.org/publication) explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video (https://youtu.be/U1sBMpfL2jc) to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to automatically collect breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software (OBS, https://obsproject.com), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who had too little knowledge of Java, Eclipse, and the Eclipse Java debugger. All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.

– Elapsed End Time (ET): ET = T − ST
– Elapsed Time of First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge (PDFSaM, http://www.pdfsam.org) and Raptor (https://code.google.com/p/raptor-chess-interface). They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
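For instance (an illustrative example, not a measurement from our data sets), a participant who sets her first breakpoint 5 minutes into a task that takes 20 minutes in total obtains MFB = 5/20 = 0.25, i.e., one quarter of the task duration was spent before the first breakpoint.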

                      Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks     Average Times (min)   Std Devs (min)
318       44                    64
667       28                    29
669       22                    25
993       25                    25
1026      25                    17
PdfSam    54                    18
Raptor    59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took on average 23% of the task time for participants to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint. This relation can be described by

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

                      Table 3 Study 1 - Breakpoints per type of statement

Statements      Numbers of Breakpoints    %
call            111                       53
if-statement    39                        19
assignment      36                        17
return          18                        10
while-loop      3                         1

                      Table 4 Study 2 - Breakpoints per type of statement

Statements      Numbers of Breakpoints    %
call            43                        43
if-statement    22                        22
assignment      27                        27
return          4                         4
while-loop      4                         4

Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

    JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

                      Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes               Lines of Code   Breakpoints
0318    AuthorsFormatter      43              5
0318    AuthorsFormatter      131             3
0667    BasePanel             935             2
0667    BasePanel             969             3
0667    JabRefDesktop         430             2
0669    OpenDatabaseAction    268             2
0669    OpenDatabaseAction    433             4
0669    OpenDatabaseAction    451             4
0993    EntryEditor           717             2
0993    EntryEditor           720             2
0993    EntryEditor           723             2
0993    BibDatabase           187             2
0993    BibDatabase           456             2
1026    EntryEditor           1184            2
1026    BibtexParser          160             2


                      Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                  Lines of Code   Breakpoints
PdfReader                230             2
PdfReader                806             2
PdfReader                1923            2
ConsoleServicesFacade    89              2
ConsoleClient            81              2
PdfUtility               94              2
PdfUtility               96              2
PdfUtility               102             2

                      Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes              Lines of Code   Breakpoints
icsUtils             333             3
Game                 1751            2
ExamineController    41              2
ExamineController    84              3
ExamineController    87              2
ExamineController    92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


                      Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                      Lines of Code           Breakpoints
BibtexParser                 138, 151, 159           2, 2, 2
BibtexParser                 160, 165, 168           3, 2, 3
BibtexParser                 176, 198, 199, 299      2, 2, 2, 2
EntryEditor                  717, 720, 721           3, 4, 2
EntryEditor                  723, 837, 842           2, 3, 2
EntryEditor                  1184, 1393              3, 2
BibDatabase                  175, 187, 223, 456      2, 3, 2, 6
OpenDatabaseAction           433, 450, 451           4, 2, 4
JabRefDesktop                40, 84, 430             2, 2, 3
SaveDatabaseAction           177, 188                4, 2
BasePanel                    935, 969                2, 5
AuthorsFormatter             43, 131                 5, 4
EntryTableTransferHandler    346                     2
FieldTextMenu                84                      2
JabRefFrame                  1119                    2
JabRefMain                   8                       5
URLUtil                      95                      2

                      Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


                      Table 9 Study 1 - Breakpoints by class across different tasks

                      Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                      SaveDatabaseAction Yes Yes Yes 7 2

                      BasePanel Yes Yes Yes Yes 14 7

                      JabRefDesktop Yes Yes 9 4

                      EntryEditor Yes Yes Yes 36 4

                      BibtexParser Yes Yes Yes 44 6

                      OpenDatabaseAction Yes Yes Yes 19 13

                      JabRef Yes Yes Yes 3 3

                      JabRefMain Yes Yes Yes Yes 5 4

                      URLUtil Yes Yes 4 2

                      BibDatabase Yes Yes Yes 19 4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                      6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using the GV in a browser, and (2) a controlled experiment on fault-location tasks using the GV integrated into Eclipse, preceded by a warm-up task on a Tetris program. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef (http://www.jabref.org) as subject system. JabRef is reference-management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers (https://www.freelancer.com), 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in a qualitative evaluation (the qualitative evaluation of the GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of the GV (the full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2).

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial (http://swarmdebugging.org/publications/experiment/tutorial.html) that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we made this video available online (https://youtu.be/U1sBMpfL2jc). The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with the GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using the GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "types" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV, available online (http://server.swarmdebugging.org), with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of the GV during each task.

All services were available on our server (http://server.swarmdebugging.org) during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software (https://obsproject.com) as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                      Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                      Fig 9 GV for Task 0667

Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.

Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used the GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants whether the GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                      Fig 11 GV usefulness - experimental phase one

                      Fig 12 GV usefulness - experimental phase two

                      The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:55      00:03:40         -44               126
Time to start     00:04:44      00:05:18         -33               112
Elapsed time      00:30:08      00:16:05         843               53

Task 1026
Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:42      00:04:48         -126              177
Time to start     00:04:02      00:03:43         19                92
Elapsed time      00:24:58      00:20:41         257               83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as a developer performs some changes, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                      7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available online28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, thus improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                      8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, more systems, and more tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant became familiar with the system after executing a task and was knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

                      9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach supports editing mode (finding breakpoints or visualising paths) as well as interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence for the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                      10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                      11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                      References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering - E (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                      Appendix - Implementation of Swarm Debugging

                      Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                      Fig 13 The Swarm Debugging Services architecture
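As a minimal illustration of this ingestion flow, the sketch below shows a hypothetical Spring Boot endpoint that receives a breakpoint event from the SDT and keeps it in memory. The class, field, and path names (BreakpointEndpoint, BreakpointEvent, /breakpoints) are our own assumptions for illustration; the actual SDS persists the events in PostgreSQL, ElasticSearch, and Neo4J rather than in a list.

    // Hypothetical sketch of an SDS endpoint receiving SDT messages (names assumed).
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.*;

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    @SpringBootApplication
    @RestController
    @RequestMapping("/breakpoints")
    public class BreakpointEndpoint {

        // In-memory stand-in for the PostgreSQL/ElasticSearch/Neo4J back ends.
        private final List<BreakpointEvent> store = new CopyOnWriteArrayList<>();

        @PostMapping
        public BreakpointEvent add(@RequestBody BreakpointEvent event) {
            store.add(event);   // a real SDS instance would persist it in all three stores
            return event;
        }

        @GetMapping
        public List<BreakpointEvent> all() {
            return store;
        }

        public static class BreakpointEvent {
            public String session;  // debugging session identifier
            public String type;     // fully qualified class name
            public int line;        // line where the breakpoint was toggled
        }

        public static void main(String[] args) {
            SpringApplication.run(BreakpointEndpoint.class, args);
        }
    }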

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                      Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data collected when a developer performs some action during a debugging session. A minimal sketch of these entities and their relations is shown below.
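The following plain-Java sketch illustrates these entities and how they relate; the field names are illustrative assumptions and not the actual SDS schema.

    // Plain-Java sketch of the main Swarm Debugging entities (illustrative names only).
    import java.util.ArrayList;
    import java.util.List;

    class Developer { String name; }
    class Product   { String name; }                 // one or more Eclipse projects
    class Task      { String title; }

    class Type      { String fullName; String sourcePath; }
    class Method    { Type declaringType; String signature; }

    class Breakpoint {
        Type type; Method method; int line;          // where the breakpoint was toggled
    }

    class Invocation {
        Method invoking; Method invoked;             // one edge of the call graph
    }

    class Session {
        Developer developer; Product product; Task task;
        List<Breakpoint> breakpoints = new ArrayList<>();
        List<Invocation> invocations = new ArrayList<>();
        List<String> events = new ArrayList<>();     // Step Into, Step Over, ...
    }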

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
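For illustration, a minimal Java 8 client for this query could look as follows; it assumes only the endpoint shown above and prints the JSON body returned by the SDS.

    // Hypothetical client-side call to the query shown above (plain Java 8, no extra libraries).
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class FindDeveloperByName {
        public static void main(String[] args) throws Exception {
            URL url = new URL(
                "http://swarmdebugging.org/developers/search/findByName?name=petrillo");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept", "application/json");

            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                StringBuilder json = new StringBuilder();
                String line;
                while ((line = in.readLine()) != null) {
                    json.append(line);
                }
                System.out.println(json);  // JSON list of developers named "petrillo"
            }
        }
    }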

SQL Query Console. The SDS provides a console30 that accepts SQL queries over the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                      Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                      Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                      Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners, IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations. A minimal sketch of the two listeners involved is shown below.
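The sketch below is a simplification of the actual tracer (the class name and comments are ours), but the listener interfaces, registration calls, and event constants are the standard Eclipse Debug Core API.

    // Minimal sketch of the two Eclipse listeners the SDT relies on (simplified).
    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

        public void register() {
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // a Step Into/Over/Return just finished: inspect the stack here
                    // and send invoking/invoked method pairs to the SDS
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // send the breakpoint location (type, line) to the Swarm Debugging Services
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }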

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


                      Fig 17 The Swarm Tracer architecture [17]

                      Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing


                      Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                      Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                      Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                      Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.


                      Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj> where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α and V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β and V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime. A small sketch of this computation over collected invocation edges is shown below.
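As an illustration, the following sketch computes starting and ending methods from a small, hypothetical set of invocation edges, following the set definitions above; the method names are invented for the example.

    // Sketch: derive starting and ending methods from invoking/invoked pairs.
    import java.util.HashSet;
    import java.util.Set;

    public class StartingEndingMethods {
        public static void main(String[] args) {
            // Each edge is {invoking method, invoked method}, as collected by the SDT.
            String[][] edges = {
                {"Main.main", "Parser.parse"},
                {"Parser.parse", "Parser.parseEntry"},
                {"Parser.parseEntry", "Field.normalize"}
            };

            Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
            Set<String> beta  = new HashSet<>(); // vertexes that are invoked
            for (String[] e : edges) {
                alpha.add(e[0]);
                beta.add(e[1]);
            }

            Set<String> starting = new HashSet<>(alpha);
            starting.removeAll(beta);            // in alpha but not in beta
            Set<String> ending = new HashSet<>(beta);
            ending.removeAll(alpha);             // in beta but not in alpha

            System.out.println("Starting methods: " + starting); // [Main.main]
            System.out.println("Ending methods: " + ending);     // [Field.normalize]
        }
    }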

                      Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions and their call graphs is easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

                        In this section we elaborate more on each of the studies

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12, version 3.2, as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. We also recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


                        Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog Problem with parsing entry multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software (OBS), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who had too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results;

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (a short illustrative fragment follows this list):
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations;

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
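For readers less familiar with this classification of statements, the fragment below (written by us purely for illustration) contains one statement of each considered type on which a breakpoint could be set:

    // Illustrative fragment only: one statement of each type considered in the study.
    class StatementKinds {
        static int countNonEmpty(java.util.List<String> entries) {
            int total = 0;                                       // assignment
            java.util.Iterator<String> it = entries.iterator();  // call
            while (it.hasNext()) {                               // while-loop
                String entry = it.next();                        // call (and assignment)
                if (!entry.isEmpty()) {                          // if-statement
                    total = total + 1;                           // assignment
                }
            }
            return total;                                        // return
        }
    }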

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video and started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example;

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint;

– End Time (T): the time when the participant finished a task;


– Elapsed End Time (ET): ET = T − ST;
– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge (PDFSaM) and Raptor. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks, and developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 analysed only one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
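As a worked example (with made-up values, only to illustrate Equation 1): if a participant starts a task at minute 0, sets the first breakpoint after 5 minutes (EF = 5), and finishes after 20 minutes (ET = 20), then MFB = 5 / 20 = 0.25, i.e., one quarter of the session elapsed before the first breakpoint.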

                        Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks     Average Times (min)   Std Devs (min)
318       44                    64
667       28                    29
669       22                    25
993       25                    25
1026      25                    17
PdfSam    54                    18
Raptor    59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). Thus, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalised the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.
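Under our reading of Figure 5, in which x is the normalised time of the first breakpoint (MFB) and f(x) the fitted task elapsed time in minutes (this interpretation is ours, since the variables are not restated here), the fit gives, for instance, f(0.25) = 12 / 0.25^0.44 ≈ 22 minutes, which is of the same order as the average task durations in Table 2.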

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that, when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

                        Table 3 Study 1 - Breakpoints per type of statement

Statements      Numbers of Breakpoints   %
call            111                      53
if-statement    39                       19
assignment      36                       17
return          18                       10
while-loop      3                        1

                        Table 4 Study 2 - Breakpoints per type of statement

Statements      Numbers of Breakpoints   %
call            43                       43
if-statement    22                       22
assignment      27                       27
return          4                        4
while-loop      4                        4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2–4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

                        Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes               Lines of Code   Breakpoints
0318    AuthorsFormatter      43              5
0318    AuthorsFormatter      131             3
0667    BasePanel             935             2
0667    BasePanel             969             3
0667    JabRefDesktop         430             2
0669    OpenDatabaseAction    268             2
0669    OpenDatabaseAction    433             4
0669    OpenDatabaseAction    451             4
0993    EntryEditor           717             2
0993    EntryEditor           720             2
0993    EntryEditor           723             2
0993    BibDatabase           187             2
0993    BibDatabase           456             2
1026    EntryEditor           1184            2
1026    BibtexParser          160             2


                        Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                  Lines of Code   Breakpoints
PdfReader                230             2
PdfReader                806             2
PdfReader                1923            2
ConsoleServicesFacade    89              2
ConsoleClient            81              2
PdfUtility               94              2
PdfUtility               96              2
PdfUtility               102             2

                        Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes              Lines of Code   Breakpoints
icsUtils             333             3
Game                 1751            2
ExamineController    41              2
ExamineController    84              3
ExamineController    87              2
ExamineController    92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing comparison of breakpoints across tasks.)


                        Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                      Lines of Code          Breakpoints
BibtexParser                 138, 151, 159          2, 2, 2
                             160, 165, 168          3, 2, 3
                             176, 198, 199, 299     2, 2, 2, 2
EntryEditor                  717, 720, 721          3, 4, 2
                             723, 837, 842          2, 3, 2
                             1184, 1393             3, 2
BibDatabase                  175, 187, 223, 456     2, 3, 2, 6
OpenDatabaseAction           433, 450, 451          4, 2, 4
JabRefDesktop                408, 44, 430           2, 2, 3
SaveDatabaseAction           177, 188               4, 2
BasePanel                    935, 969               2, 5
AuthorsFormatter             43, 131                5, 4
EntryTableTransferHandler    346                    2
FieldTextMenu                84                     2
JabRefFrame                  1119                   2
JabRefMain                   8                      5
URLUtil                      95                     2

                        Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints,


                        Table 9 Study 1 - Breakpoints by class across different tasks

Types                 Tasks with breakpoints   Breakpoints   Dev. diversities
SaveDatabaseAction    3 of 5                   7             2
BasePanel             4 of 5                   14            7
JabRefDesktop         2 of 5                   9             4
EntryEditor           3 of 5                   36            4
BibtexParser          3 of 5                   44            6
OpenDatabaseAction    3 of 5                   19            13
JabRef                3 of 5                   3             3
JabRefMain            4 of 5                   5             4
URLUtil               2 of 5                   4             2
BibDatabase           3 of 5                   19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                        6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef as subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                        Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers, 23 male and seven female. Our participants have on average six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available online. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to manage their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than three hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software (OBS) video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph showed invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.
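For intuition, the sketch below shows one way such a view could aggregate the invocations of several shared sessions into a single weighted call graph, where heavier edges would be drawn more prominently; the class and method names are ours, and this is not the actual GV implementation:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Illustrative sketch only: merge invocations from many debugging sessions into a weighted graph.
    class DebuggingGraph {
        // Key "CallerType->CalleeType"; value = number of collected invocations across all sessions.
        private final Map<String, Integer> edgeWeights = new HashMap<>();

        void addSession(List<String[]> invocations) { // each entry: {callerType, calleeType}
            for (String[] invocation : invocations) {
                edgeWeights.merge(invocation[0] + "->" + invocation[1], 1, Integer::sum);
            }
        }

        Map<String, Integer> edges() { return edgeWeights; }
    }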

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                        Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                        Fig 9 GV for Task 0667


                        Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis): navigating, reorganising, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                        Fig 11 GV usefulness - experimental phase one

                        Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


                        Table 10 Results from control and experimental groups (average)

Task 0993
Metric              Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint    00:02:55      00:03:40         −44           126
Time to start       00:04:44      00:05:18         −33           112
Elapsed time        00:30:08      00:16:05         843           53

Task 1026
Metric              Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint    00:02:42      00:04:48         −126          177
Time to start       00:04:02      00:03:43         19            92
Elapsed time        00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling their first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on their first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both kinds of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff shows where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                        7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. SDI (and GV) is open and freely available online, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.
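One simple way to exploit such shared data, sketched below under our own assumptions about the stored fields, is to rank the breakpoint lines of a given class by how many distinct developers set them; this is an illustration, not the SDI's actual search implementation:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Illustrative sketch only: recommend breakpoint lines for a class from shared history.
    class BreakpointSearch {
        static final class SharedBreakpoint {
            final String type; final int line; final String developer;
            SharedBreakpoint(String type, int line, String developer) {
                this.type = type; this.line = line; this.developer = developer;
            }
        }

        // Lines of 'type', ordered by how many distinct developers set a breakpoint there.
        static List<Integer> recommend(List<SharedBreakpoint> history, String type) {
            Map<Integer, Set<String>> developersPerLine = new HashMap<>();
            for (SharedBreakpoint bp : history) {
                if (bp.type.equals(type)) {
                    developersPerLine.computeIfAbsent(bp.line, l -> new HashSet<>()).add(bp.developer);
                }
            }
            List<Integer> lines = new ArrayList<>(developersPerLine.keySet());
            lines.sort((a, b) -> developersPerLine.get(b).size() - developersPerLine.get(a).size());
            return lines;
        }
    }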

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks, by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Furthermore, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                        8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables, in particular the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                        9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy-paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], used by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities, and did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                        10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more fault-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation gave developers elements to support their hypotheses about fault locations without looking at the code beforehand and (2) sharing previous debugging sessions supports the formulation of debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions give insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and context-aware IDEs: programs that monitor and reason about how developers interact with them, providing support for crowd software engineering.

                        11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                        References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                        Appendix - Implementation of Swarm Debugging

                        Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                        Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

Fig 14 The Swarm Debugging metadata [17]

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.
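To make these concepts concrete, the following minimal Java sketch shows how two of them could be represented as plain data classes; the field names are illustrative assumptions based on the descriptions above, not the actual SDS schema.

// Illustrative data classes for two Swarm Debugging concepts (assumed fields).
public class Breakpoint {
    private final String typeName;   // fully qualified class or interface name
    private final String methodName; // enclosing method, if any
    private final int lineNumber;    // line where the breakpoint was toggled
    private final long sessionId;    // debugging session that owns this breakpoint

    public Breakpoint(String typeName, String methodName, int lineNumber, long sessionId) {
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
        this.sessionId = sessionId;
    }
    // getters omitted for brevity
}

class Invocation {
    private final String invokingMethod; // caller visited by the developer
    private final String invokedMethod;  // callee visited by the developer
    private final long sessionId;

    Invocation(String invokingMethod, String invokedMethod, long sessionId) {
        this.invokingMethod = invokingMethod;
        this.invokedMethod = invokedMethod;
        this.sessionId = sessionId;
    }
}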

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
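For illustration, such a request could also be issued programmatically; the sketch below uses the standard Java HttpURLConnection API and assumes only that the endpoint above returns a JSON document.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        // Query the (assumed) developer search endpoint of the SDS.
        URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");

        StringBuilder json = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line);
            }
        }
        System.out.println(json); // JSON list of matching developers
    }
}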

SQL Query Console. The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
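As an illustration, a researcher could run such an aggregation over JDBC; the connection settings and the table and column names below are assumptions made for the example, not the actual SDS schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointStats {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings; use the credentials of your SDS instance.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Count breakpoints per type (assumed table and column names).
             ResultSet rs = stmt.executeQuery(
                 "SELECT type_name, COUNT(*) AS total " +
                 "FROM breakpoint GROUP BY type_name ORDER BY total DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("total"));
            }
        }
    }
}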

Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                        Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                        Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
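For illustration, the same kind of pattern query could be issued programmatically; the sketch below uses recent versions of the official Neo4j Java driver and assumes an illustrative graph schema (Method nodes linked by INVOKES relationships), which may differ from the actual SDS graph.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        // Connection details and graph schema (Method label, INVOKES relationship) are assumptions.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("caller").asString()
                        + " -> " + record.get("callee").asString());
            }
        }
    }
}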

                        Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners, IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
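A minimal sketch of such a tracer is shown below. It uses the standard Eclipse Debug Platform APIs (DebugPlugin, IDebugEventSetListener, IBreakpointListener); the logging bodies only illustrate the kind of data the SDT might record and are not the actual SDT implementation.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Illustrative tracer: reacts to stepping/suspend events and breakpoint changes.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register with the Eclipse debug infrastructure.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A step (into/over/return) just finished; a real tracer would inspect
                // the stack trace here and store the visited invocation.
                System.out.println("Step finished on: " + event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint added in: " + breakpoint.getMarker().getResource());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}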

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints and trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                        Fig 17 The Swarm Tracer architecture [17]

                        Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about called methods, recording an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                        Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                        Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                        Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.


                        Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by other methods but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
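A direct way to compute these sets from the collected invocations is by set difference; the sketch below is illustrative and assumes that invocations are available as simple caller/callee string pairs.

import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map.Entry;
import java.util.Set;

public class StartingEndingMethods {
    // Each entry is an invocation edge: key = invoking method, value = invoked method.
    public static Set<String> startingPoints(List<Entry<String, String>> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Entry<String, String> edge : invocations) {
            invoking.add(edge.getKey());
            invoked.add(edge.getValue());
        }
        invoking.removeAll(invoked); // methods that invoke but are never invoked
        return invoking;
    }

    public static Set<String> endingPoints(List<Entry<String, String>> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Entry<String, String> edge : invocations) {
            invoking.add(edge.getKey());
            invoked.add(edge.getValue());
        }
        invoked.removeAll(invoking); // methods that are invoked but never invoke
        return invoked;
    }

    public static void main(String[] args) {
        List<Entry<String, String>> edges = Arrays.asList(
            new SimpleEntry<String, String>("main", "Circle.draw"),
            new SimpleEntry<String, String>("Circle.draw", "Canvas.paint"));
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [Canvas.paint]
    }
}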

                        Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using Web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who took part in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.
– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.
– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.
– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.
– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST.
– Elapsed Time of First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 followed different approaches. The tasks in Study 1 were fault-location tasks, in which developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
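As a concrete illustration, the metric reduces to simple timestamp arithmetic; the values below are made up for the example.

// Illustrative computation of ET, EF, and MFB from task timestamps (values are made up).
public class FirstBreakpointMetric {
    public static void main(String[] args) {
        long startTime = 0;            // ST: task start (minutes, relative)
        long firstBreakpointTime = 7;  // FB: first breakpoint set at minute 7
        long endTime = 28;             // T: task finished at minute 28

        double elapsedTime = endTime - startTime;                        // ET = T - ST
        double elapsedFirstBreakpoint = firstBreakpointTime - startTime; // EF = FB - ST
        double mfb = elapsedFirstBreakpoint / elapsedTime;               // Equation (1)

        System.out.printf("MFB = %.2f (fraction of the task spent before the first breakpoint)%n", mfb);
    }
}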

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks     Average Times (min)   Std Devs (min)
318       44                    64
667       28                    29
669       22                    25
993       25                    25
1026      25                    17
PdfSam    54                    18
Raptor    59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints were set on control-flow statements, while 53% were set on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements      Breakpoints   Percentage
call            111           53%
if-statement    39            19%
assignment      36            17%
return          18            10%
while-loop      3             1%

Table 4 Study 2 - Breakpoints per type of statement

Statements      Breakpoints   Percentage
call            43            43%
if-statement    22            22%
assignment      27            27%
return          4             4%
while-loop      4             4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements received comparatively fewer breakpoints, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in these decisions because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes               Lines of Code   Breakpoints
0318    AuthorsFormatter      43              5
0318    AuthorsFormatter      131             3
0667    BasePanel             935             2
0667    BasePanel             969             3
0667    JabRefDesktop         430             2
0669    OpenDatabaseAction    268             2
0669    OpenDatabaseAction    433             4
0669    OpenDatabaseAction    451             4
0993    EntryEditor           717             2
0993    EntryEditor           720             2
0993    EntryEditor           723             2
0993    BibDatabase           187             2
0993    BibDatabase           456             2
1026    EntryEditor           1184            2
1026    BibtexParser          160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                  Lines of Code   Breakpoints
PdfReader                230             2
PdfReader                806             2
PdfReader                1923            2
ConsoleServicesFacade    89              2
ConsoleClient            81              2
PdfUtility               94              2
PdfUtility               96              2
PdfUtility               102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes              Lines of Code   Breakpoints
icsUtils             333             3
Game                 1751            2
ExamineController    41              2
ExamineController    84              3
ExamineController    87              2
ExamineController    92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, which does not allow comparing breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code           Breakpoints
BibtexParser                138, 151, 159           2, 2, 2
BibtexParser                160, 165, 168           3, 2, 3
BibtexParser                176, 198, 199, 299      2, 2, 2, 2
EntryEditor                 717, 720, 721           3, 4, 2
EntryEditor                 723, 837, 842           2, 3, 2
EntryEditor                 1184, 1393              3, 2
BibDatabase                 175, 187, 223, 456      2, 3, 2, 6
OpenDatabaseAction          433, 450, 451           4, 2, 4
JabRefDesktop               408, 4, 430             2, 2, 3
SaveDatabaseAction          177, 188                4, 2
BasePanel                   935, 969                2, 5
AuthorsFormatter            43, 131                 5, 4
EntryTableTransferHandler   346                     2
FieldTextMenu               84                      2
JabRefFrame                 1119                    2
JabRefMain                  8                       5
URLUtil                     95                      2

                          Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

                          Table 9 Study 1 - Breakpoints by class across different tasks

                          Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                          SaveDatabaseAction Yes Yes Yes 7 2

                          BasePanel Yes Yes Yes Yes 14 7

                          JabRefDesktop Yes Yes 9 4

                          EntryEditor Yes Yes Yes 36 4

                          BibtexParser Yes Yes Yes 44 6

                          OpenDatabaseAction Yes Yes Yes 19 13

                          JabRef Yes Yes Yes 3 3

                          JabRefMain Yes Yes Yes Yes 5 4

                          URLUtil Yes Yes 4 2

                          BibDatabase Yes Yes Yes 19 4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers working on different tasks set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                          6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                          Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and 7 female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described in a video to guide the participants, which we make available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than three hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                          Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                          Fig 9 GV for Task 0667

                          Swarm Debugging the Collective Intelligence on Interactive Debugging 27

                          Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                          Fig 11 GV usefulness - experimental phase one

                          Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will add it also into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

                          7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. The SDI (and GV) are open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes reopened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                          8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                          9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

                          10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Secondly, the time of setting the first


breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                          11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                          References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 101002spe4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10114511342851134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 101109USER20126226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10114518067991806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, 2005, ICSE 2005 (2005), pp. 126–135. DOI 101109ICSE20051553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 1011452622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10114525938822593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 104304jsw83603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 101007BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 101109VISSOFT20157332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 101109QRS201739
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 1011095288939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 101016jinfsof201510010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10114512185631218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 101007s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 101007s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 1011092488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 101109TSE2010111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 101109ICSM20157332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 101109MS2009169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 101145263698264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 101038nmeth2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 101007s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10110935021BIGCOMP20157072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 1011452622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10114525970082597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 101109ICSE20126227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10114526358682635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 101109WCRE20136671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 101109MSR201517
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 101016jijhcs200707005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 101207s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering on European software engineering conference and foundations of software engineering symposium - E (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10114515956961595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10114520014202001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10114511438441143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10114515187011518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 1011552012628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 101109ICSE20126227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 101109ICGSE201321
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10114524664162466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10114524305452430551


                          Appendix - Implementation of Swarm Debugging

                          Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                          Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include the following (a minimal entity sketch is given after the list):

– Developer is a SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                          Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some actions during a debugging session.
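To make the meta-model more concrete, the sketch below shows how a breakpoint record could be represented when it travels from the tracer to the services; the class and field names are illustrative assumptions, not the actual SDI schema.

// Illustrative sketch of a Breakpoint record in the spirit of the meta-model;
// field names are assumptions, not the actual SDI schema.
public class Breakpoint {
    private Long id;
    private Long sessionId;     // debugging session that produced the breakpoint
    private String namespace;   // package of the type, e.g. "net.sf.jabref.importer"
    private String typeName;    // class or interface where the breakpoint was toggled
    private String methodName;  // enclosing method, if any
    private int lineNumber;     // source line of the breakpoint

    // getters and setters omitted for brevity
}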

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
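Assuming the endpoint follows standard Spring Data REST conventions, a repository along the lines of the hypothetical sketch below would expose such a search method; the interface and entity names are ours, not necessarily those of the SDS code base.

// Hypothetical Spring Data REST repository exposing /developers/search/findByName;
// names are illustrative, not taken from the SDS source code.
import java.util.List;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

@RepositoryRestResource(collectionResourceRel = "developers", path = "developers")
public interface DeveloperRepository extends CrudRepository<Developer, Long> {

    // exposed by Spring Data REST as /developers/search/findByName?name=...
    List<Developer> findByName(@Param("name") String name);
}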

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                          Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                          Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                          Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
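As a rough illustration of how such listeners are wired into the Eclipse debug framework, the sketch below registers a tracer for debug events and breakpoint changes; the class name and the handling logic are assumptions on our part, not the actual SDT code.

// Hedged sketch: registering debug-event and breakpoint listeners in Eclipse;
// class name and handling logic are illustrative, not the actual SDT code.
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class SwarmTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);                         // stepping, suspend, resume
        plugin.getBreakpointManager().addBreakpointListener(this);  // breakpoint added/removed
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // inspect the suspended thread's stack frames here and
                // send an invocation record to the Swarm Debugging Services
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // send the breakpoint location (type, method, line) to the services
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}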

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                          Fig 17 The Swarm Tracer architecture [17]

                          Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing


                          Fig 19 Breakpoint search tool (fuzzy search example)

invocation entries for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                          Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

                          Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences


                          Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                          Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
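
To make this concrete, the sketch below shows one way such a fuzzy query could be issued with the Elasticsearch high-level REST Java client (7.x). The index name "breakpoints" and the field name "typeName" are assumptions made for illustration; the SDS may store and query breakpoints with a different schema or client.

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // Fuzzy matching tolerates the misspelling "fcatory" and still
            // matches indexed terms such as "factory" within a small edit distance.
            SearchRequest request = new SearchRequest("breakpoints");
            request.source(new SearchSourceBuilder()
                    .query(QueryBuilders.fuzzyQuery("typeName", "fcatory")));
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            response.getHits().forEach(hit ->
                    System.out.println(hit.getSourceAsString()));
        }
    }
}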


                          Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
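
The following self-contained Java sketch illustrates this definition on a handful of hypothetical invocation edges (the method names are made up and are not from the studies): α and β are derived from the edges, and the set differences give the starting and ending points.

import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingPoints {
    // One collected invocation edge: invoking method -> invoked method.
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Main.run", "Circle.draw"),
                new Invocation("Circle.draw", "Pen.down"),
                new Invocation("Main.run", "Square.draw"));

        Set<String> alpha = new LinkedHashSet<>(); // vertices that invoke
        Set<String> beta = new LinkedHashSet<>();  // vertices that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new LinkedHashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new LinkedHashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha

        System.out.println("Starting points: " + starting); // [Main.run]
        System.out.println("Ending points: " + ending);     // [Pen.down, Square.draw]
    }
}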

                          Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking if they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST.
– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
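
For illustration, the small Java sketch below computes EF, ET, and MFB for one hypothetical session (the timestamps are invented, not data from the studies): a first breakpoint set 8 minutes into a 30-minute task gives MFB ≈ 0.27.

import java.time.Duration;
import java.time.LocalTime;

public class FirstBreakpointRatio {
    public static void main(String[] args) {
        LocalTime st = LocalTime.of(14, 0, 0);  // ST: task start
        LocalTime fb = LocalTime.of(14, 8, 0);  // FB: first breakpoint
        LocalTime t = LocalTime.of(14, 30, 0);  // T: task end

        double ef = Duration.between(st, fb).toSeconds(); // EF = FB - ST
        double et = Duration.between(st, t).toSeconds();  // ET = T - ST
        double mfb = ef / et;                             // Equation (1)

        System.out.printf("MFB = %.2f%n", mfb); // prints MFB = 0.27
    }
}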

                            Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task      Average Time (min)   Std Dev (min)
318       44                   64
667       28                   29
669       22                   25
993       25                   25
1026      25                   17
PdfSam    54                   18
Raptor    59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.ᵃ So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

ᵃ In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), meaning that task elapsed time is inversely correlated with the time of the task's first breakpoint. The relation is approximated by the fitted curve in Equation 2:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

                            Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

                            Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2–4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

                            Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


                            Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

                            Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


                            Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code            Breakpoints
BibtexParser                138, 151, 159            2, 2, 2
                            160, 165, 168            3, 2, 3
                            176, 198, 199, 299       2, 2, 2, 2
EntryEditor                 717, 720, 721            3, 4, 2
                            723, 837, 842            2, 3, 2
                            1184, 1393               3, 2
BibDatabase                 175, 187, 223, 456       2, 3, 2, 6
OpenDatabaseAction          433, 450, 451            4, 2, 4
JabRefDesktop               408, 44, 430             2, 2, 3
SaveDatabaseAction          177, 188                 4, 2
BasePanel                   935, 969                 2, 5
AuthorsFormatter            43, 131                  5, 4
EntryTableTransferHandler   346                      2
FieldTextMenu               84                       2
JabRefFrame                 1119                     2
JabRefMain                  8                        5
URLUtil                     95                       2

                            Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


                            Table 9 Study 1 - Breakpoints by class across different tasks

Type                 Tasks with breakpoints (of 5)   Breakpoints   Developer diversity
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                            6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 of them male and seven female. Our participants have, on average, six years of experience in software development (std dev four years). They have on average 4.8 years of Java experience (std dev 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available online24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                            Fig 9 GV for Task 0667


                            Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                            Fig 11 GV usefulness - experimental phase one

                            Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


                            Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so I just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

                            7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                            8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                            9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach serves both editing mode (finding breakpoints or visualizing paths) and interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many debugging tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                            10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the observation that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour,


which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence can improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                            11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                            References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                            Appendix - Implementation of Swarm Debugging

                            Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer: an SDT user. She creates and executes debugging sessions.
– Product: the target software product. A product is a set of one or more Eclipse projects.
– Task: the task to be executed by developers.
– Session: represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                            Fig 14 The Swarm Debugging metadata [17]

– Type: represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.

– Method: a method associated with a type, which can be invoked during debugging sessions.

– Namespace: a container for types. In Java, namespaces are declared with the keyword package.

– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event: the data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework.29 Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
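As an illustration, such an endpoint could be queried from any HTTP client; the minimal sketch below uses Java's built-in HttpClient against the example request above (the URL and the JSON layout of the response are taken from the example; nothing else about the API is assumed).

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiClientExample {
    public static void main(String[] args) throws Exception {
        // Query the developer-search endpoint shown in the example above.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        // The SDS is expected to answer with a JSON list of matching developers.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("HTTP " + response.statusCode());
        System.out.println(response.body());
    }
}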

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
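For instance, a researcher could aggregate the stored breakpoints per type. The sketch below runs such a query over JDBC against the PostgreSQL store; the same SQL could be pasted into the web console. The connection settings and the table and column names (breakpoint, type, type_id, full_name) are illustrative assumptions, not the actual SDS schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointAggregationExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings; adjust to the actual SDS database.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Hypothetical schema: one row per collected breakpoint, linked to its type.
             ResultSet rs = stmt.executeQuery(
                 "SELECT t.full_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint b JOIN type t ON b.type_id = t.id " +
                 "GROUP BY t.full_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("full_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}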

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                            Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
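For example, invocation chains could be retrieved directly from the graph store. The sketch below uses the official Neo4J Java driver with an illustrative Cypher pattern; the node label Method, the INVOKES relationship, the session property, and the connection credentials are assumptions about the SDS graph model, not its documented schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class CypherQueryExample {
    public static void main(String[] args) {
        // Placeholder connection details for a local Neo4J instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {

            // Illustrative query: caller/callee pairs recorded for one debugging session.
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                "WHERE caller.session = 42 " +
                "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");

            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString()
                        + " -> " + row.get("callee").asString());
            }
        }
    }
}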

                            Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse debug framework, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
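A minimal sketch of such a tracer is shown below: it registers the two listener interfaces named above with the Eclipse debug framework and forwards the interesting events. This is an illustrative reimplementation under the stated assumptions, not the SDT's actual code, and the sendToSwarm helper is hypothetical.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    // Register for debugger events and breakpoint changes.
    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.RESUME
                    && (event.getDetail() == DebugEvent.STEP_INTO
                        || event.getDetail() == DebugEvent.STEP_OVER
                        || event.getDetail() == DebugEvent.STEP_RETURN)) {
                // A stepping action (Step Into / Over / Return) was triggered.
                sendToSwarm("step", event.getSource());
            } else if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                // Execution stopped on a breakpoint.
                sendToSwarm("breakpointHit", event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSwarm("breakpointAdded", breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarm("breakpointRemoved", breakpoint);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        // Not forwarded in this sketch.
    }

    // Hypothetical helper: would POST the event to the Swarm Debugging Services.
    private void sendToSwarm(String kind, Object payload) {
        System.out.println(kind + ": " + payload);
    }
}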

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                            Fig 17 The Swarm Tracer architecture [17]

                            Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


                            Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                            Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                            Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                            Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
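To illustrate, such a fuzzy lookup can be expressed with the standard ElasticSearch query DSL; the sketch below posts one over HTTP. The index name (breakpoints), the field name (typeName), and the local ElasticSearch address are assumptions about how the SDS stores breakpoints, not its documented layout.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearchExample {
    public static void main(String[] args) throws Exception {
        // Fuzzy "match" query: the misspelled term "fcatory" still matches types named "factory".
        String query = """
            {
              "query": {
                "match": {
                  "typeName": {
                    "query": "fcatory",
                    "fuzziness": "AUTO"
                  }
                }
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}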


                            Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = { V_SP | V_SP ∈ α ∧ V_SP ∉ β }

EndingPoint = { V_EP | V_EP ∈ β ∧ V_EP ∉ α }

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
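These sets can be computed directly from the collected invocation pairs. The sketch below derives both sets from a list of caller/callee edges; the method names in the example data are illustrative only.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** An invocation edge: 'caller' invokes 'callee'. */
    record Invocation(String caller, String callee) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Main.main", "Parser.parse"),
                new Invocation("Parser.parse", "Lexer.next"),
                new Invocation("Parser.parse", "Node.add"));

        Set<String> invoking = new HashSet<>(); // alpha: methods that invoke others
        Set<String> invoked = new HashSet<>();  // beta: methods invoked by others
        for (Invocation e : edges) {
            invoking.add(e.caller());
            invoked.add(e.callee());
        }

        // StartingPoint = { v | v in alpha and v not in beta }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);

        // EndingPoint = { v | v in beta and v not in alpha }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);

        System.out.println("Starting methods: " + starting); // [Main.main]
        System.out.println("Ending methods: " + ending);     // [Lexer.next, Node.add]
    }
}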

                            Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, software exploration is divided by sessions, and the resulting call graphs are easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



– Elapsed End Time (ET): ET = T − ST
– Elapsed Time to First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET     (1)
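For illustration (with made-up numbers): a session whose first breakpoint is set after EF = 10 minutes in a task lasting ET = 40 minutes yields MFB = 10/40 = 0.25, i.e., one quarter of the session elapsed before the first breakpoint was set.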

                              Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

                              Tasks Average Times (min) Std Devs (min)

                              318 44 64

                              667 28 29

                              669 22 25

                              993 25 25

                              1026 25 17

                              PdfSam 54 18

                              Raptor 59 13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). This effort is thus important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), meaning that task elapsed time is inversely correlated with the time of the task's first breakpoint.

f(x) = α / x^β     (2)

where α = 1.2 and β = 0.44.

                              Fig 5 Relation between time of the first breakpoint and task elapsed time(data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

                              Table 3 Study 1 - Breakpoints per type of statement

Statements      Numbers of Breakpoints    %
call            111                       53%
if-statement    39                        19%
assignment      36                        17%
return          18                        10%
while-loop      3                         1%

                              Table 4 Study 2 - Breakpoints per type of statement

Statements      Numbers of Breakpoints    %
call            43                        43%
if-statement    22                        22%
assignment      27                        27%
return          4                         4%
while-loop      4                         4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same task, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

                              Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

                              Tasks Classes Lines of Code Breakpoints

                              0318 AuthorsFormatter 43 5

                              0318 AuthorsFormatter 131 3

                              0667 BasePanel 935 2

                              0667 BasePanel 969 3

                              0667 JabRefDesktop 430 2

                              0669 OpenDatabaseAction 268 2

                              0669 OpenDatabaseAction 433 4

                              0669 OpenDatabaseAction 451 4

                              0993 EntryEditor 717 2

                              0993 EntryEditor 720 2

                              0993 EntryEditor 723 2

                              0993 BibDatabase 187 2

                              0993 BibDatabase 456 2

                              1026 EntryEditor 1184 2

                              1026 BibtexParser 160 2


                              Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

                              Classes Lines of Code Breakpoints

                              PdfReader 230 2

                              PdfReader 806 2

                              PdfReader 1923 2

                              ConsoleServicesFacade 89 2

                              ConsoleClient 81 2

                              PdfUtility 94 2

                              PdfUtility 96 2

                              PdfUtility 102 2

                              Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

                              Classes Lines of Code Breakpoints

                              icsUtils 333 3

                              Game 1751 2

                              ExamineController 41 2

                              ExamineController 84 3

                              ExamineController 87 2

                              ExamineController 92 2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


                              Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                      Lines of Code (breakpoints)
BibtexParser                 138 (2), 151 (2), 159 (2), 160 (3), 165 (2), 168 (3), 176 (2), 198 (2), 199 (2), 299 (2)
EntryEditor                  717 (3), 720 (4), 721 (2), 723 (2), 837 (3), 842 (2), 1184 (3), 1393 (2)
BibDatabase                  175 (2), 187 (3), 223 (2), 456 (6)
OpenDatabaseAction           433 (4), 450 (2), 451 (4)
JabRefDesktop                40 (2), 84 (2), 430 (3)
SaveDatabaseAction           177 (4), 188 (2)
BasePanel                    935 (2), 969 (5)
AuthorsFormatter             43 (5), 131 (4)
EntryTableTransferHandler    346 (2)
FieldTextMenu                84 (2)
JabRefFrame                  1119 (2)
JabRefMain                   8 (5)
URLUtil                      95 (2)

                              Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


                              Table 9 Study 1 - Breakpoints by class across different tasks

                              Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                              SaveDatabaseAction Yes Yes Yes 7 2

                              BasePanel Yes Yes Yes Yes 14 7

                              JabRefDesktop Yes Yes 9 4

                              EntryEditor Yes Yes Yes 36 4

                              BibtexParser Yes Yes Yes 44 6

                              OpenDatabaseAction Yes Yes Yes 19 13

                              JabRef Yes Yes Yes 3 3

                              JabRefMain Yes Yes Yes Yes 5 4

                              URLUtil Yes Yes 4 2

                              BibDatabase Yes Yes Yes 19 4

and the method BibtexParser.parseFileContent received 20 breakpoints, set by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                              6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks, warmed up on a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV and 11 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; this video is available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


                              group (seven participants) and an experimental group (six participants) Par-ticipants from the control group performed fault location for Issues 993 and1026 without using the GV while those from the experimental group didthe same tasks using the GV

                              616 Data Collection

                              In the qualitative evaluation the participants answered the questions directlyin an electronic form They used the GV available on-line25 with collected datafor JabRef Issues 318 667 669

                              In the controlled experiment each participant executed the warm-up taskThis task consisted in starting a debugging session toggling a breakpointand debugging a Tetris program to locate a given method After the warm-up task each participant executed debugging sessions to find the locationof the faults described in the five issues We set a time constraint of onehour We asked participants to control their fatigue asking them to go tothe next task if they felt tired while informing us of this situation in theirreports Finally each participant filled a report to provide answers and otherinformation like whether they completed the tasks successfully or not and(just for the experimental group) commenting on the usefulness of GV duringeach task

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type in which each fault was located, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                              Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                              Fig 9 GV for Task 0667


                              Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used the GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether the GV is useful to support software maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                              Fig 11 GV usefulness - experimental phase one

                              Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44               126
Time to start      00:04:44      00:05:18         -33               112
Elapsed time       00:30:08      00:16:05         843               53

Task 1026
Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126              177
Time to start      00:04:02      00:03:43         19                92
Elapsed time       00:24:58      00:20:41         257               83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Looking at the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system, but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions.


It encourages us to pursue our research work in this direction and to perform more experiments that point to further ways of improving our approach.

                              7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                              8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

Like any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last task. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].



Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive, methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                              10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Secondly, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task.


Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour,


which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                              11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                              References

                              1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

                              2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

                              3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

                              Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

                              rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

                              neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

                              8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

                              9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

                              10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

                              neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

                              on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

                              13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

                              org107287peerjpreprints2743v1

                              14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616


                              15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

                              16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

                              17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

                              18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

                              19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

                              101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

                              oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

                              1218575

                              22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

                              neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

                              conditional_breakpointhtmampcp=1_3_6_0_5

                              23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

                              24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

                              linkspringercom101007s10818-015-9203-6

                              25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

                              (Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

                              actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

                              C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

                              29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

                              30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

                              31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

                              32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

                              pmcentrezamprendertype=abstract

                              33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

                              34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

                              35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

                              36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

                              doiacmorg1011452622669

                              37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)


                              38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

                              39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

                              40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

                              41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

                              42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

                              43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

                              44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

                              45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

                              46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

                              47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

                              48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

                              49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

                              50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

                              51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

                              52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

                              53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

                              54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

                              55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

                              56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

                              57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

                              58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551


                              Appendix - Implementation of Swarm Debugging

                              Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                              Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a small illustrative sketch in Java follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                              Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.
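As a small illustration (not the actual SDS schema; the field names are assumptions and Java 14+ records are used for brevity), two of these concepts could be carried as plain values in the RESTful messages:

// Hypothetical value objects mirroring the Breakpoint and Invocation concepts;
// the real SDS entities and fields may differ.
record Breakpoint(long sessionId, String namespace, String typeName,
                  String methodName, int lineNumber) { }

record Invocation(long sessionId, String invokingMethod, String invokedMethod) { }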

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29.

29 http://projects.spring.io/spring-boot


Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
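To illustrate how such a request could be issued programmatically, the following minimal sketch uses Java 11's standard java.net.http client against the endpoint shown above; the exact JSON schema of the response depends on the SDS deployment and is not shown here.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: queries the SDS RESTful API for developers named "petrillo".
public class FindDeveloperExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .build(); // GET is the default method
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}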

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                              Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                              Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database.

Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
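Since the query itself appears only in the figure, here is a sketch of what such a query could look like; the node label (:Method), relationship type (:INVOKES), and property names below are assumptions for illustration, not the actual SDS graph schema. The sketch uses the official Neo4j Java driver.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

// Minimal sketch: runs a Cypher pattern query against the SDS graph database.
public class InvocationGraphQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Hypothetical schema: (:Method)-[:INVOKES]->(:Method), scoped by a session id.
            session.run("MATCH (a:Method)-[i:INVOKES]->(b:Method) "
                      + "WHERE i.sessionId = 42 RETURN a.name AS caller, b.name AS callee")
                   .forEachRemaining(record ->
                       System.out.println(record.get("caller").asString()
                               + " -> " + record.get("callee").asString()));
        }
    }
}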

                              Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
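As an illustration (a minimal sketch, not the actual SDT code; the SwarmRestClient facade is hypothetical), a tracer can hook into the Eclipse debug framework through these two listener interfaces roughly as follows:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal sketch of a tracer that forwards debug events to the Swarm services.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    interface SwarmRestClient { // hypothetical REST facade towards the SDS
        void post(String kind, String payload);
    }

    private final SwarmRestClient client;

    public DebugTracer(SwarmRestClient client) {
        this.client = client;
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping commands end with a SUSPEND event carrying a STEP_END detail.
            if (event.getKind() == DebugEvent.SUSPEND && event.getDetail() == DebugEvent.STEP_END) {
                client.post("step", String.valueOf(event.getSource()));
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        client.post("breakpoint", breakpoint.getMarker().getResource().getName());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}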

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                              Fig 17 The Swarm Tracer architecture [17]

                              Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the called methods, storing an invocation entry for each pair of invoking/invoked methods.


                              Fig 19 Breakpoint search tool (fuzzy search example)

Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                              Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                              Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                              Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
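For illustration only, the sketch below issues such a fuzzy query against an ElasticSearch index of breakpoints over HTTP. The index name ("breakpoints") and field name ("typeName") are assumptions for the example, not the actual SDS schema.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Hedged sketch: a fuzzy search that can still match "factory" although
    // the query text is the misspelled "fcatory".
    public class BreakpointFuzzySearch {

        public static void main(String[] args) throws Exception {
            String query =
                "{ \"query\": { \"match\": { \"typeName\": "
              + "{ \"query\": \"fcatory\", \"fuzziness\": \"AUTO\" } } } }";

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search")) // assumed index
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }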


                              Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj> where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
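The sketch below shows one way to compute these two sets from the invocation edges collected in a session. It is an illustration under assumed data structures and sample data, not the tool's implementation.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Hedged sketch: derive starting methods (invoke but are never invoked)
    // and ending methods (are invoked but never invoke) from invocation edges.
    public class StartEndMethodFinder {

        record Invocation(String invoking, String invoked) {}

        public static void main(String[] args) {
            List<Invocation> edges = List.of( // hypothetical session data
                new Invocation("Client.main", "Shape.draw"),
                new Invocation("Shape.draw", "Circle.draw"));

            Set<String> invoking = new HashSet<>();
            Set<String> invoked = new HashSet<>();
            for (Invocation e : edges) {
                invoking.add(e.invoking());
                invoked.add(e.invoked());
            }

            Set<String> starting = new HashSet<>(invoking);
            starting.removeAll(invoked); // in alpha but not in beta
            Set<String> ending = new HashSet<>(invoked);
            ending.removeAll(invoking);  // in beta but not in alpha

            System.out.println("Starting methods: " + starting);
            System.out.println("Ending methods: " + ending);
        }
    }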

                              Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
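For example, under this normalisation, a participant who sets the first breakpoint 5 minutes into a 20-minute task obtains MFB = 5/20 = 0.25; these figures are illustrative and are not taken from the studies.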

                                Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks     Average Times (min)   Std Devs (min)
318       44                    64
667       28                    29
669       22                    25
993       25                    25
1026      25                    17
PdfSam    54                    18
Raptor    59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.ᵃ This effort is therefore important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

ᵃ In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

                                Fig 5 Relation between time of the first breakpoint and task elapsed time(data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

                                This finding also corroborates previous results found with a different set oftasks [17]


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% on while-loop statements. The only difference is on assignment statements, where Study 1 shows 17% while Study 2 shows 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

                                Table 3 Study 1 - Breakpoints per type of statement

Statements     Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

                                Table 4 Study 2 - Breakpoints per type of statement

Statements     Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.
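As an illustration of this counting step, the sketch below groups breakpoints by task, class, and line and keeps the locations chosen by two or more participants. The record fields and sample data are assumptions for the example, not our analysis scripts.

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    // Hedged sketch: count breakpoints set on exactly the same (task, class, line).
    public class RecurringBreakpointCounter {

        record Breakpoint(String task, String className, int line, String participant) {}

        public static void main(String[] args) {
            List<Breakpoint> breakpoints = List.of( // hypothetical data points
                new Breakpoint("0667", "BasePanel", 969, "P1"),
                new Breakpoint("0667", "BasePanel", 969, "P2"),
                new Breakpoint("0667", "BasePanel", 969, "P3"));

            Map<String, Long> perLocation = breakpoints.stream()
                .collect(Collectors.groupingBy(
                    b -> b.task() + " " + b.className() + ":" + b.line(),
                    Collectors.counting()));

            perLocation.entrySet().stream()
                .filter(e -> e.getValue() >= 2) // recurring locations only
                .forEach(e -> System.out.println(e.getKey() + " -> " + e.getValue()));
        }
    }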

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PdfSam and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

                                Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


                                Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

                                Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


                                Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code         Breakpoints
BibtexParser                138, 151, 159         2, 2, 2
                            160, 165, 168         3, 2, 3
                            176, 198, 199, 299    2, 2, 2, 2
EntryEditor                 717, 720, 721         3, 4, 2
                            723, 837, 842         2, 3, 2
                            1184, 1393            3, 2
BibDatabase                 175, 187, 223, 456    2, 3, 2, 6
OpenDatabaseAction          433, 450, 451         4, 2, 4
JabRefDesktop               4084430               223
SaveDatabaseAction          177, 188              4, 2
BasePanel                   935, 969              2, 5
AuthorsFormatter            43, 131               5, 4
EntryTableTransferHandler   346                   2
FieldTextMenu               84                    2
JabRefFrame                 1119                  2
JabRefMain                  8                     5
URLUtil                     95                    2

                                Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 break-


                                Table 9 Study 1 - Breakpoints by class across different tasks

                                Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                                SaveDatabaseAction Yes Yes Yes 7 2

                                BasePanel Yes Yes Yes Yes 14 7

                                JabRefDesktop Yes Yes 9 4

                                EntryEditor Yes Yes Yes 36 4

                                BibtexParser Yes Yes Yes 44 6

                                OpenDatabaseAction Yes Yes Yes 19 13

                                JabRef Yes Yes Yes 3 3

                                JabRefMain Yes Yes Yes Yes 5 4

                                URLUtil Yes Yes 4 2

                                BibDatabase Yes Yes Yes 19 4

points, and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse, with a Tetris program used for warm-up. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev four years). They have on average 4.8 years of Java experience (std dev 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in the qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                Fig 9 GV for Task 0667


                                Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0%, 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without having seen the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                Fig 11 GV usefulness - experimental phase one

                                Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


                                Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting a system's main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

                                7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                                8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


                                breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation gives developers elements to support their hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse, Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                Appendix - Implementation of Swarm Debugging

                                Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of one or more Eclipse projects.

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is an event datum that is collected when a developer performs some actions during a debugging session. (A minimal sketch of these entities as plain Java classes follows this list.)
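To make the meta-model concrete, the sketch below expresses a few of these entities as plain Java classes. It is only an illustration of the relationships described above; the actual SDS persistence classes, field names, and types are not given in the paper, so all names here are assumptions.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the Swarm Debugging meta-model (assumed names and fields).
class Developer { String name; }
class Product { String name; List<String> eclipseProjects = new ArrayList<>(); }
class Task { String description; }

class Type { String namespace; String fullName; String sourcePath; }
class Method { Type declaringType; String signature; }

class Breakpoint {
    Type type;          // where the breakpoint was toggled
    Method method;      // enclosing method, if any
    int lineNumber;
}

class Invocation { Method invoking; Method invoked; }

class Session {
    Developer developer;                 // who debugged
    Product product;                     // what was debugged
    Task task;                           // why it was debugged
    List<Breakpoint> breakpoints = new ArrayList<>();
    List<Invocation> invocations = new ArrayList<>();
}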

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
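As an illustration of how this API can be consumed, the sketch below issues the query above from plain Java 11+ code. The endpoint is the one quoted in the text; the exact host and the shape of the JSON response are assumptions for the example.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        // Query the Swarm RESTful API for developers named "petrillo".
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The SDS answers with a JSON document describing the matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}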

SQL Query Console. The SDS provides a console to run SQL queries on the debugging data, providing relational aggregations and functions.
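For example, a query such as the one embedded in the sketch below could be run through the console (or programmatically over JDBC) to count breakpoints per type. The table and column names (breakpoint, type_id, full_name) and the connection settings are assumptions; the actual SDS schema follows the meta-model of Figure 14 but is not spelled out in the paper.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerType {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection string and schema for the SDS PostgreSQL database.
        String url = "jdbc:postgresql://localhost:5432/swarm";
        String sql = "SELECT t.full_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint b JOIN type t ON b.type_id = t.id "
                   + "GROUP BY t.full_name ORDER BY breakpoints DESC";

        try (Connection conn = DriverManager.getConnection(url, "swarm", "swarm");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString("full_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}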

Full-text Search Engine. The SDS also provides ElasticSearch, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
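The sketch below shows how such a Cypher query could be issued programmatically with the Neo4j Java driver, as an alternative to the Neo4J Browser. The Cypher pattern (Method nodes linked by an INVOKES relationship), the node property names, and the connection settings are assumptions for illustration; the labels actually used by the SDS follow its meta-model but are not listed in the paper.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationQuery {
    public static void main(String[] args) {
        // Hypothetical connection to the SDS Neo4J instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "swarm"));
             Session session = driver.session()) {

            // Which methods does EntryEditor.storeSource reach during debugging sessions?
            Result result = session.run(
                "MATCH (m:Method {name: 'storeSource'})-[:INVOKES*1..3]->(n:Method) "
              + "RETURN DISTINCT n.name AS invoked");

            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("invoked").asString());
            }
        }
    }
}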

                                Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
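A minimal sketch of such a tracer, using the standard Eclipse debug APIs named in the text, is shown below. It only prints the captured events; the real SDT additionally analyses stack traces and forwards the data to the SDS via REST, and its exact class layout is not reproduced here.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    /** Registers the tracer with the Eclipse debug infrastructure. */
    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Step Into, Step Over, Step Return, and breakpoint hits arrive as debug events.
            System.out.println("Debug event: kind=" + event.getKind()
                    + " detail=" + event.getDetail());
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint added in: " + breakpoint.getMarker().getResource());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}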

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                Fig 17 The Swarm Tracer architecture [17]

                                Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing


                                Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


                                Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                                Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
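For illustration, the fuzzy search of Figure 19 corresponds to an ElasticSearch query along the lines of the sketch below, issued here with the Java 11+ HTTP APIs against a hypothetical breakpoint index. The index and field names (breakpoint, typeName) and the host are assumptions; only the ElasticSearch query syntax itself is standard.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy match: "fcatory" should still hit types whose name contains "factory".
        String query = "{ \"query\": { \"match\": {"
                     + " \"typeName\": { \"query\": \"fcatory\", \"fuzziness\": \"AUTO\" }"
                     + " } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoint/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}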


                                Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α and VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β and VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
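The following sketch computes these two sets from a list of invocation edges, directly mirroring the set definitions above. It is an illustration, not the SDS implementation, and the example edges are made up.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** An invocation edge: 'invoking' calls 'invoked'. */
    record Invocation(String invoking, String invoked) { }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("BasePanel.runCommand", "JabRefDesktop.openExternalViewer"),
                new Invocation("JabRefDesktop.openExternalViewer", "JabRefDesktop.openBrowser"));

        Set<String> alpha = new HashSet<>();  // vertexes that invoke methods
        Set<String> beta = new HashSet<>();   // vertexes that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> startingPoints = new HashSet<>(alpha);
        startingPoints.removeAll(beta);       // in alpha and not in beta
        Set<String> endingPoints = new HashSet<>(beta);
        endingPoints.removeAll(alpha);        // in beta and not in alpha

        System.out.println("Starting methods: " + startingPoints);
        System.out.println("Ending methods: " + endingPoints);
    }
}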

                                Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53%
if-statement   39            19%
assignment     36            17%
return         18            10%
while-loop     3             1%

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43%
if-statement   22            22%
assignment     27            27%
return         4             4%
while-loop     4             4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, with while-loop statements being the least common (2-4).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same lines of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

                                  Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9 Study 1 - Breakpoints by class across different tasks

Class                Tasks with breakpoints (of 5)   Breakpoints   Dev diversity
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                  6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviour of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef as the subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                                  Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers: 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.

6.1.4 Artifacts and Working Environment

After the participants' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line, with data collected for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to move on to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software as the video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                  Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                  Fig 9 GV for Task 0667


                                  Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type containing the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                  Fig 11 GV usefulness - experimental phase one

                                  Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] %
First breakpoint   00:02:55      00:03:40         −44           126%
Time to start      00:04:44      00:05:18         −33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] %
First breakpoint   00:02:42      00:04:48         −126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

                                  7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                                  8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As with any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that their answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                  9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

                                  10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                  11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                  References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.l. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC/FSE '09) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                  Appendix - Implementation of Swarm Debugging

                                  Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                  Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of Eclipse projects (1 or more).

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.

                                  Swarm Debugging the Collective Intelligence on Interactive Debugging 43

                                  Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data that is collected when a developer performs some actions during a debugging session.
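To make the meta-model more concrete, the following minimal Java sketch shows how two of these concepts could be represented as plain entity classes. It is only an illustration: the field names and types are our assumptions, not the actual SDS schema.

// Illustrative sketch of two Swarm Debugging meta-model concepts.
// Field names are assumptions; the real SDS schema may differ.
class Session {
    long id;
    String developer;   // Developer who creates and executes the session
    String product;     // target software product (a set of Eclipse projects)
    String task;        // task to be executed by the developer
}

class Breakpoint {
    long id;
    long sessionId;     // session in which the breakpoint was toggled
    String typeName;    // type (class or interface) where the breakpoint was set
    String methodName;  // enclosing method, if appropriate
    int lineNumber;     // line on which the breakpoint was toggled
}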

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
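As an illustration, the Java sketch below issues that request with the JDK 11 HttpClient and prints the JSON answer. The endpoint is the one shown above; everything else (class name, error handling) is only illustrative.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class FindDeveloperExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Query the SDS RESTful API for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // The SDS answers with a JSON list of matching developers.
        System.out.println(response.body());
    }
}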

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
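For instance, assuming direct access to the underlying PostgreSQL database, a JDBC sketch such as the one below would compute the kind of relational aggregation the console is meant for; the connection string, table, and column names are hypothetical.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

class BreakpointsPerTypeQuery {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection string and schema; adjust to the actual SDS database.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT type_name, COUNT(*) AS total "
                   + "FROM breakpoint GROUP BY type_name ORDER BY total DESC")) {
            while (rs.next()) {
                // Print how many breakpoints were toggled on each type.
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("total"));
            }
        }
    }
}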

Full-text Search Engine. The SDS also provides ElasticSearch31, which is a highly scalable, open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                  Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                  Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
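The same kind of pattern query can also be issued programmatically. The sketch below uses the Neo4j Java driver; the Method label, the INVOKES relationship, and the connection settings are assumptions made for illustration, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

class InvocationGraphQuery {
    public static void main(String[] args) {
        // Connection details and the Method/INVOKES schema are illustrative assumptions.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            Result result = session.run(
                  "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                + "RETURN a.name AS caller, b.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record row = result.next();
                // Print each invoking/invoked method pair returned by the query.
                System.out.println(row.get("caller").asString() + " -> " + row.get("callee").asString());
            }
        }
    }
}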

                                  Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
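A minimal sketch of such a listener, using the Eclipse Debug Core API, is shown below. The event handling mirrors what the text describes; the logging is only a placeholder for the data the SDT would actually send to the Swarm Debugging Services.

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

// Minimal listener sketch: reacts to stepping events and breakpoint hits.
class SwarmEventListenerSketch implements IDebugEventSetListener {

    void register() {
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A step request produces a RESUME event with a STEP_* detail;
            // hitting a breakpoint produces a SUSPEND event with the BREAKPOINT detail.
            if (event.getKind() == DebugEvent.RESUME
                    && (event.getDetail() == DebugEvent.STEP_INTO
                        || event.getDetail() == DebugEvent.STEP_OVER
                        || event.getDetail() == DebugEvent.STEP_RETURN)) {
                System.out.println("Stepping event: " + event);   // placeholder for SDS upload
            } else if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                System.out.println("Breakpoint hit: " + event);   // placeholder for SDS upload
            }
        }
    }
}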

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


                                  Fig 17 The Swarm Tracer architecture [17]

                                  Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.
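One possible shape for such a facade is sketched below: events are posted asynchronously so the debugger thread is never blocked. The endpoint, payload format, and class name are illustrative assumptions rather than the actual SDT code.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative facade: posts debugging events without blocking the debugger thread.
class SwarmRestFacadeSketch {
    private final HttpClient client = HttpClient.newHttpClient();

    void sendEvent(long sessionId, String kind, String detail) {
        String json = String.format(
                "{\"session\": %d, \"kind\": \"%s\", \"detail\": \"%s\"}",
                sessionId, kind, detail);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/events"))   // hypothetical endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        // sendAsync returns immediately; the response is handled on a background thread.
        client.sendAsync(request, HttpResponse.BodyHandlers.ofString());
    }
}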

                                  Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, used to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                  Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                  Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
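A fuzzy query like the one in Figure 19 could be expressed directly against the ElasticSearch index, as in the sketch below. The fuzzy-query shape follows the standard ElasticSearch query DSL, but the index name and field are assumptions about how the SDS stores breakpoints.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy match for the misspelled term "fcatory"; index and field names are assumptions.
        String query = "{ \"query\": { \"fuzzy\": { \"className\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());   // matching breakpoints are returned as JSON hits
    }
}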


                                  Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
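The definitions above translate directly into code. The following sketch computes both sets from a list of invocation edges; the edge representation is an illustrative assumption.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Computes starting methods (invoke others but are never invoked) and
// ending methods (are invoked but never invoke) from invocation edges.
class StartEndMethodFinder {

    record Invocation(String invoking, String invoked) { }

    static Set<String> startingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();   // alpha: vertexes that invoke
        Set<String> invoked = new HashSet<>();    // beta: vertexes that are invoked
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked);                // alpha minus beta
        return result;
    }

    static Set<String> endingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking);               // beta minus alpha
        return result;
    }
}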

                                  Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Number of Breakpoints   %
call           111                     53
if-statement   39                      19
assignment     36                      17
return         18                      10
while-loop     3                       1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Number of Breakpoints   %
call           43                      43
if-statement   22                      22
assignment     27                      27
return         4                       4
while-loop     4                       4


Our results show that in both studies about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

                                    We investigated each breakpoint to assess whether there were breakpoints onthe same line of code for different participants performing the same tasksie resolving the same fault by comparing the breakpoints on the same taskand different tasks We sorted all the breakpoints from our data by the Classin which they were set and line number and we counted how many times abreakpoint was set on exactly the same line of code across participants Wereport the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2
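A sketch of this grouping and counting, assuming the breakpoints are available as simple (task, class, line, participant) records, could look as follows; the Breakpoint record and its field names are illustrative, not the actual SDI data model.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RecurringBreakpoints {

    // Simplified stand-in for a breakpoint record collected by the SDI.
    record Breakpoint(String task, String className, int line, String participant) {}

    // Groups breakpoints by (task, class, line) and counts distinct participants per location.
    static Map<String, Long> participantsPerLocation(List<Breakpoint> breakpoints) {
        return breakpoints.stream().collect(Collectors.groupingBy(
                b -> b.task() + " " + b.className() + ":" + b.line(),
                Collectors.mapping(Breakpoint::participant,
                        Collectors.collectingAndThen(Collectors.toSet(), set -> (long) set.size()))));
    }
}

Locations whose count is two or more correspond to the recurring breakpoints reported in Tables 5, 6, and 7.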

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PdfSam and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale behind that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints, for different tasks, by different participants. For example, five different participants set five breakpoints on the line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code            Breakpoints
BibtexParser                138, 151, 159            2, 2, 2
                            160, 165, 168            3, 2, 3
                            176, 198, 199, 299       2, 2, 2, 2
EntryEditor                 717, 720, 721            3, 4, 2
                            723, 837, 842            2, 3, 2
                            1184, 1393               3, 2
BibDatabase                 175, 187, 223, 456       2, 3, 2, 6
OpenDatabaseAction          433, 450, 451            4, 2, 4
JabRefDesktop               408, 44, 430             2, 2, 3
SaveDatabaseAction          177, 188                 4, 2
BasePanel                   935, 969                 2, 5
AuthorsFormatter            43, 131                  5, 4
EntryTableTransferHandler   346                      2
FieldTextMenu               84                       2
JabRefFrame                 1119                     2
JabRefMain                  8                        5
URLUtil                     95                       2

Fig. 6 Methods with 5 or more breakpoints

Finally, we count how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers, on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints (of 5)   Breakpoints   Developers
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                    6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse (with a Tetris program used for the warm-up task). The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open-source and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com

                                    24Please give a shorter version with authorrunning and titlerunning prior to maketitle

6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools, how to perform a warm-up task, and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc



6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations, without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the types, double-clicking on a selected type. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   E/C (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   E/C (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully, but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.
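As a worked example of how these percentages follow from Table 10, consider the elapsed time for Task 0993: the control group needed 00:30:08 (1,808 s) and the experimental group 00:16:05 (965 s), so E/C = 965/1808 ≈ 0.53; the experimental group thus used about 53% of the control group's time, i.e., roughly 47% less.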

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept, with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantage

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system, but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers, by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation, under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.



                                    7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.
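As an illustration, a breakpoint search against the SDI server could be as simple as the sketch below; the /breakpoints endpoint and its type query parameter are assumptions made for this example, not the documented SDI API.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of querying previously shared breakpoints for a given type (hypothetical endpoint).
public class BreakpointSearch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://server.swarmdebugging.org/breakpoints?type=BibtexParser"))
                .GET()
                .build();
        // The response would list breakpoints from earlier sessions (class, line, number of developers),
        // giving valid starting points for a new debugging session.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}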

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks, by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time, by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                    8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants, and with more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                    9 Related work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].



Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort in debugging.

                                    10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos, in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.



Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements that support developers' hypotheses about fault locations, without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.



Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                    11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                    References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2012). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)

                                    Swarm Debugging the Collective Intelligence on Interactive Debugging 41

                                    38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

                                    39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

                                    40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

                                    41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

                                    42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

                                    43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

                                    44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

                                    45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

                                    46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

                                    47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

                                    48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

                                    49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

                                    50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

                                    51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

                                    52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

                                    53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

                                    54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

                                    55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

                                    56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

                                    57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

                                    58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

                                    42Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                    Appendix - Implementation of Swarm Debugging

                                    Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialised persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                    Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts, inspired by the FAMIX data model [56], include the following (a small illustrative sketch follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                                    Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the event data collected when a developer performs actions during a debugging session.
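To make the meta-model concrete, the sketch below renders two of these concepts as plain Java classes; the class and field names are illustrative assumptions, not the actual SDS schema.

// Illustrative rendering of two meta-model concepts as Java classes;
// the field names are assumptions, not the actual SDS schema.
class Session {
    long id;
    String developer;   // the developer who creates and executes the session
    String product;     // the target software product
    String task;        // the task being worked on
}

class Breakpoint {
    long sessionId;     // the session in which the breakpoint was toggled
    String type;        // class or interface on which the breakpoint was set
    String method;      // enclosing method, if any
    int lineNumber;     // line on which the breakpoint was set
}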

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
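Such search endpoints are the ones Spring Data REST generates for repository query methods. A minimal sketch, assuming an entity named Developer and a repository interface that the SDS may or may not name this way:

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

// Hypothetical entity; the SDS stores developers with at least a name.
@Entity
class Developer {
    @Id @GeneratedValue Long id;
    String name;
}

// Spring Data REST exposes this query method at
// /developers/search/findByName?name=... and returns the matches as JSON.
@RepositoryRestResource(collectionResourceRel = "developers", path = "developers")
interface DeveloperRepository extends CrudRepository<Developer, Long> {
    List<Developer> findByName(@Param("name") String name);
}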

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch,31 a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                    Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                    Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                                    Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
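A minimal sketch of how such a tracer can hook into the Eclipse Debug Core API is shown below; only the listener interfaces and registration calls are standard Eclipse API, while the class name and handling logic are illustrative assumptions.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Sketch of a tracer hooking into the Eclipse Debug Core.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping and breakpoint hits arrive as SUSPEND events with a detail code.
            if (event.getKind() == DebugEvent.SUSPEND
                    && (event.getDetail() == DebugEvent.STEP_END
                        || event.getDetail() == DebugEvent.BREAKPOINT)) {
                // inspect the suspended thread's stack frames and send invocations to the SDS
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // send the breakpoint's type, method, and line number to the SDS
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}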

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer to extract method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                    Fig 17 The Swarm Tracer architecture [17]

                                    Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                    Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on it. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                    Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                    Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
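As an illustration, fuzzy and wildcard queries of this kind can be built with the ElasticSearch Java API as sketched below; the field name "typeName" is an assumption made for illustration.

import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Sketch of the kinds of queries the breakpoint search tool can issue.
class BreakpointQueries {
    // Fuzzy query: still matches "factory" when the user types "fcatory".
    static QueryBuilder fuzzy(String term) {
        return QueryBuilders.fuzzyQuery("typeName", term);
    }

    // Wildcard query: matches any type whose name contains "Parser".
    static QueryBuilder wildcard() {
        return QueryBuilders.wildcardQuery("typeName", "*Parser*");
    }
}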


                                    Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
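A minimal sketch of this computation over the collected invocation pairs (methods are simplified to their names; this is illustrative, not the SDS implementation):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Computes starting methods (invoke but are never invoked) and ending methods
// (are invoked but never invoke) from a list of invoking/invoked pairs.
class StartEndPoints {
    static Set<String> starting(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] pair : invocations) {
            invoking.add(pair[0]);
            invoked.add(pair[1]);
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked); // alpha \ beta
        return result;
    }

    static Set<String> ending(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] pair : invocations) {
            invoking.add(pair[0]);
            invoked.add(pair[1]);
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking); // beta \ alpha
        return result;
    }
}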

                                    Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                    • 1 Introduction
                                    • 2 Background
                                    • 3 The Swarm Debugging Approach
                                    • 4 SDI in a Nutshell
                                    • 5 Using SDI to Understand Debugging Activities
                                    • 6 Evaluation of Swarm Debugging using GV
                                    • 7 Discussion
                                    • 8 Threats to Validity
                                    • 9 Related work
                                    • 10 Conclusion
                                    • 11 Acknowledgment


RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale behind these decisions, because different participants set breakpoints on exactly the same lines of code.

                                      Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

                                      Tasks Classes Lines of Code Breakpoints

                                      0318 AuthorsFormatter 43 5

                                      0318 AuthorsFormatter 131 3

                                      0667 BasePanel 935 2

                                      0667 BasePanel 969 3

                                      0667 JabRefDesktop 430 2

                                      0669 OpenDatabaseAction 268 2

                                      0669 OpenDatabaseAction 433 4

                                      0669 OpenDatabaseAction 451 4

                                      0993 EntryEditor 717 2

                                      0993 EntryEditor 720 2

                                      0993 EntryEditor 723 2

                                      0993 BibDatabase 187 2

                                      0993 BibDatabase 456 2

                                      1026 EntryEditor 1184 2

                                      1026 BibtexParser 160 2


                                      Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

                                      Classes Lines of Code Breakpoints

                                      PdfReader 230 2

                                      PdfReader 806 2

                                      PdfReader 1923 2

                                      ConsoleServicesFacade 89 2

                                      ConsoleClient 81 2

                                      PdfUtility 94 2

                                      PdfUtility 96 2

                                      PdfUtility 102 2

                                      Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

                                      Classes Lines of Code Breakpoints

                                      icsUtils 333 3

                                      Game 1751 2

                                      ExamineController 41 2

                                      ExamineController 84 3

                                      ExamineController 87 2

                                      ExamineController 92 2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, which does not allow comparing breakpoints across tasks.)


                                      Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes | Lines of Code | Breakpoints
BibtexParser | 138, 151, 159, 160, 165, 168, 176, 198, 199, 299 | 2, 2, 2, 3, 2, 3, 2, 2, 2, 2
EntryEditor | 717, 720, 721, 723, 837, 842, 1184, 1393 | 3, 4, 2, 2, 3, 2, 3, 2
BibDatabase | 175, 187, 223, 456 | 2, 3, 2, 6
OpenDatabaseAction | 433, 450, 451 | 4, 2, 4
JabRefDesktop | 40, 84, 430 | 2, 2, 3
SaveDatabaseAction | 177, 188 | 4, 2
BasePanel | 935, 969 | 2, 5
AuthorsFormatter | 43, 131 | 5, 4
EntryTableTransferHandler | 346 | 2
FieldTextMenu | 84 | 2
JabRefFrame | 1119 | 2
JabRefMain | 8 | 5
URLUtil | 95 | 2

                                      Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntityEditor.storeSource received 24 break-


                                      Table 9 Study 1 - Breakpoints by class across different tasks

                                      Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                                      SaveDatabaseAction Yes Yes Yes 7 2

                                      BasePanel Yes Yes Yes Yes 14 7

                                      JabRefDesktop Yes Yes 9 4

                                      EntryEditor Yes Yes Yes 36 4

                                      BibtexParser Yes Yes Yes 44 6

                                      OpenDatabaseAction Yes Yes Yes 19 13

                                      JabRef Yes Yes Yes 3 3

                                      JabRefMain Yes Yes Yes Yes 5 4

                                      URLUtil Yes Yes 4 2

                                      BibDatabase Yes Yes Yes 19 4

points, and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                      6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


                                      61 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

                                      611 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

                                      612 Participants

                                      Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21 (23 male and seven female). Our participants had, on average, six years of experience in software development (std. dev. four years). They had, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (the qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


                                      613 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

                                      614 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we made this video available online.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

                                      615 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

                                      616 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online,25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to manage their fatigue, asking them to move to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video recording tool.

                                      62 Results

                                      We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                      Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                      Fig 9 GV for Task 0667


                                      Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.

Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                      Fig 11 GV usefulness - experimental phase one

                                      Fig 12 GV usefulness - experimental phase two

                                      The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


                                      Table 10 Results from control and experimental groups (average)

                                      Task 0993

Metric | Control [C] | Experiment [E] | ∆ [C−E] (s) | % [E/C]
First breakpoint | 00:02:55 | 00:03:40 | -44 | 126
Time to start | 00:04:44 | 00:05:18 | -33 | 112
Elapsed time | 00:30:08 | 00:16:05 | 843 | 53

Task 1026

Metric | Control [C] | Experiment [E] | ∆ [C−E] (s) | % [E/C]
First breakpoint | 00:02:42 | 00:04:48 | -126 | 177
Time to start | 00:04:02 | 00:03:43 | 19 | 92
Elapsed time | 00:24:58 | 00:20:41 | 257 | 83

                                      63 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.
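For instance, the elapsed-time row for Task 0993 in Table 10 is derived as follows:

∆ [C−E] = 00:30:08 − 00:16:05 = 1808 s − 965 s = 843 s, and E/C = 965/1808 ≈ 53%.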

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


64 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation; we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

641 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

                                      642 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

                                      643 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


                                      644 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known, difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

                                      645 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed I've found the BibtexParser class where the method with the most number of breakpoints was the one where I later found the fault. However only this knowledge was not enough so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                                      7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available online,28 and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and identify the debugging strategies that are most efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes reopened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                                      8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that their answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                      9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces.


These interaction traces have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualizing paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., systematically minimises a failure-inducing input: the smaller the failure-inducing input, the less program code is covered. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                      10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints.


Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.


Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                      11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                      References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 86(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering on European software engineering conference and foundations of software engineering symposium - E (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                      Appendix - Implementation of Swarm Debugging

                                      Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                      Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                      Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
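For illustration, such a request could be issued with Java's built-in HTTP client as sketched below; the endpoint is the one quoted above, while the header and the response handling are assumptions of this sketch rather than documented SDS behaviour.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch of querying the SDS RESTful API (error handling and JSON
// parsing are intentionally left out).
public class SdsClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // e.g., 200
        System.out.println(response.body());       // JSON list of matching developers
    }
}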

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                      Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                      Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                                      Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
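To illustrate how a tracer hooks into these Eclipse extension points, the sketch below registers both listener interfaces named above with the Eclipse debug plug-in; the class name and the forwarding comments are our own simplification, not the actual SDT source.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Illustrative sketch (not the actual SDT code): a tracer that receives
// debug events and breakpoint changes from the Eclipse debug framework.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);                        // stepping, suspend, resume, ...
        plugin.getBreakpointManager().addBreakpointListener(this); // breakpoint added/removed/changed
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // a real tracer would inspect the stack trace here and
                // send invocation entries to the Swarm Debugging Services
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // a real tracer would extract the type and line and POST them to the SDS
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}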

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                      Fig 17 The Swarm Tracer architecture [17]

                                      Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


                                      Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                      Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, used to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                      Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                      Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.


                                      Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α and V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β and V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
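A minimal sketch of this computation over a list of invoking/invoked pairs is shown below; the record and method names are illustrative assumptions and do not correspond to the SDS implementation.

import java.util.*;

// Illustrative computation of Starting/Ending methods from invocation edges,
// following the set definitions above (not the actual SDS code).
public class StartEndMethods {

    record Invocation(String invoking, String invoked) {}

    static Set<String> startingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);   // in α but not in β
        return starting;
    }

    static Set<String> endingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);    // in β but not in α
        return ending;
    }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "parse"),
                new Invocation("parse", "readEntry"));
        System.out.println(startingMethods(edges)); // [main]
        System.out.println(endingMethods(edges));   // [readEntry]
    }
}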

                                      Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes the call graphs easy to understand, because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



                                        Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                  Lines of Code   Breakpoints
PdfReader                230             2
PdfReader                806             2
PdfReader                1923            2
ConsoleServicesFacade    89              2
ConsoleClient            81              2
PdfUtility               94              2
PdfUtility               96              2
PdfUtility               102             2

                                        Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes              Lines of Code   Breakpoints
icsUtils             333             3
Game                 1751            2
ExamineController    41              2
ExamineController    84              3
ExamineController    87              2
ExamineController    92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints, set during different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


                                        Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code   Breakpoints
BibtexParser                138             2
BibtexParser                151             2
BibtexParser                159             2
BibtexParser                160             3
BibtexParser                165             2
BibtexParser                168             3
BibtexParser                176             2
BibtexParser                198             2
BibtexParser                199             2
BibtexParser                299             2
EntryEditor                 717             3
EntryEditor                 720             4
EntryEditor                 721             2
EntryEditor                 723             2
EntryEditor                 837             3
EntryEditor                 842             2
EntryEditor                 1184            3
EntryEditor                 1393            2
BibDatabase                 175             2
BibDatabase                 187             3
BibDatabase                 223             2
BibDatabase                 456             6
OpenDatabaseAction          433             4
OpenDatabaseAction          450             2
OpenDatabaseAction          451             4
JabRefDesktop               40              2
JabRefDesktop               84              2
JabRefDesktop               430             3
SaveDatabaseAction          177             4
SaveDatabaseAction          188             2
BasePanel                   935             2
BasePanel                   969             5
AuthorsFormatter            43              5
AuthorsFormatter            131             4
EntryTableTransferHandler   346             2
FieldTextMenu               84              2
JabRefFrame                 1119            2
JabRefMain                  8               5
URLUtil                     95              2

                                        Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


                                        Table 9 Study 1 - Breakpoints by class across different tasks

                                        Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

                                        SaveDatabaseAction Yes Yes Yes 7 2

                                        BasePanel Yes Yes Yes Yes 14 7

                                        JabRefDesktop Yes Yes 9 4

                                        EntryEditor Yes Yes Yes 36 4

                                        BibtexParser Yes Yes Yes 44 6

                                        OpenDatabaseAction Yes Yes Yes 19 13

                                        JabRef Yes Yes Yes 3 3

                                        JabRefMain Yes Yes Yes Yes 5 4

                                        URLUtil Yes Yes 4 2

                                        BibDatabase Yes Yes Yes 19 4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                        6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate if sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                                        Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have on average 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment – 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools, how to perform a warm-up task, and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We made this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

                                        616 Data Collection

                                        In the qualitative evaluation the participants answered the questions directlyin an electronic form They used the GV available on-line25 with collected datafor JabRef Issues 318 667 669

                                        In the controlled experiment each participant executed the warm-up taskThis task consisted in starting a debugging session toggling a breakpointand debugging a Tetris program to locate a given method After the warm-up task each participant executed debugging sessions to find the locationof the faults described in the five issues We set a time constraint of onehour We asked participants to control their fatigue asking them to go tothe next task if they felt tired while informing us of this situation in theirreports Finally each participant filled a report to provide answers and otherinformation like whether they completed the tasks successfully or not and(just for the experimental group) commenting on the usefulness of GV duringeach task

All services were available on our server²⁶ during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software²⁷ video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                        Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                        Fig 9 GV for Task 0667


                                        Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used the GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether the GV is useful to support software maintenance tasks. In our qualitative study, 87% of participants agreed that the GV is useful or very useful (100% at least useful) (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                        Fig 11 GV usefulness - experimental phase one

                                        Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from the control and experimental groups (average). Times are in h:mm:ss; ∆ [C-E] is the difference in seconds; [E/C] is the ratio in percent.

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

                                        7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line²⁸, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                        8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, and more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that the participants did not look at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                        9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as described by Hofer et al., relies on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                        10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions give insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                        11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                        References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                        Appendix - Implementation of Swarm Debugging

                                        Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                        Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, also sketched in code after the list below, include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                        Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data collected when a developer performs some action during a debugging session.
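To make these concepts concrete, the sketch below renders them as plain Java classes. This is only an illustrative approximation: the field names and types are our assumptions for the purpose of the example, not the actual SDI schema, and relations are simplified to plain references and lists.

    import java.util.List;

    // Hypothetical, simplified rendering of the Swarm Debugging meta-model.
    class Developer { String name; List<Session> sessions; }

    class Product { String name; List<String> eclipseProjects; } // 1..n projects

    class Task { String title; String description; }

    class Session {                      // one interactive debugging session
        Developer developer;             // who debugged
        Product product;                 // what was debugged
        Task task;                       // why it was debugged
        List<Breakpoint> breakpoints;
        List<Invocation> invocations;
        List<Event> events;
    }

    class Namespace { String name; }     // a Java package

    class Type { String fullName; Namespace namespace; String sourcePath; }

    class Method { String signature; Type declaringType; }

    class Invocation { Method invoking; Method invoked; }

    class Breakpoint { Type type; Method method; int lineNumber; }

    class Event { String kind; long timestamp; } // e.g., "StepInto", "Breakpoint"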

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework²⁹. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
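As an illustration, the snippet below issues that request with the HTTP client that ships with Java 11+ and prints the raw JSON answer. The endpoint follows the example above; the client-side code is our own sketch, not part of the SDI.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SwarmApiExample {
        public static void main(String[] args) throws Exception {
            // Query the SDS RESTful API for developers named "petrillo".
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(
                        "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // The SDS answers with a JSON list of matching developers.
            System.out.println(response.body());
        }
    }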

SQL Query Console. The SDS provides a console³⁰ to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides an ElasticSearch³¹ instance, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                        Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                        Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J³² graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
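For illustration only, the sketch below runs a similar Cypher pattern query through the Neo4j Java driver (assuming a 4.x driver on the classpath). The node label Method and the relationship type INVOKES are our assumptions about how invocations could be stored, not the documented SDI graph schema; the connection URI and credentials are placeholders.

    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Record;
    import org.neo4j.driver.Result;
    import org.neo4j.driver.Session;

    public class SwarmGraphQueryExample {
        public static void main(String[] args) {
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                // Hypothetical schema: (:Method)-[:INVOKES]->(:Method).
                Result result = session.run(
                    "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                    + "RETURN a.signature AS caller, b.signature AS callee LIMIT 25");
                while (result.hasNext()) {
                    Record record = result.next();
                    System.out.println(record.get("caller").asString()
                        + " -> " + record.get("callee").asString());
                }
            }
        }
    }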

                                        Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA integration, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
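A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below. The listener interfaces and the DebugPlugin registration come from the Eclipse Platform Debug API; the method bodies only print the captured events and stand in for the RESTful calls that the actual SDT makes to the SDS.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    // Simplified tracer: reacts to stepping events and breakpoint changes.
    public class SimpleDebugTracer implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.addDebugEventListener(this);
            plugin.getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // A Step Into/Over/Return just finished: a real tracer would
                    // inspect the stack frames here and send invocations to the SDS.
                    System.out.println("Step finished on " + event.getSource());
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // A real tracer would send the breakpoint location to the SDS.
            System.out.println("Breakpoint added: " + breakpoint.getMarker().getResource());
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }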

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                        Fig 17 The Swarm Tracer architecture [17]

                                        Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing

                                        Swarm Debugging the Collective Intelligence on Interactive Debugging 47

                                        Fig 19 Breakpoint search tool (fuzzy search example)

                                        invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed
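The sketch below illustrates one way such invoking/invoked pairs could be derived from a suspended thread using the Eclipse JDT debug model; it is a simplified assumption of how the SDT proceeds, not its actual implementation.

import org.eclipse.debug.core.DebugException;
import org.eclipse.debug.core.model.IStackFrame;
import org.eclipse.jdt.debug.core.IJavaStackFrame;
import org.eclipse.jdt.debug.core.IJavaThread;

public class InvocationExtractor {

    // Derives the caller/callee pair at the top of a suspended Java thread.
    public void recordTopInvocation(IJavaThread thread) throws DebugException {
        IStackFrame[] frames = thread.getStackFrames();
        if (frames.length < 2) {
            return; // no caller/callee pair available
        }
        IJavaStackFrame invoked = (IJavaStackFrame) frames[0];   // current method
        IJavaStackFrame invoking = (IJavaStackFrame) frames[1];  // its caller
        String pair = invoking.getDeclaringTypeName() + "." + invoking.getMethodName()
                + " -> " + invoked.getDeclaringTypeName() + "." + invoked.getMethodName();
        // In the real SDT this pair would be sent to the Swarm Debugging Services.
        System.out.println(pair);
    }
}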

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                        Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                        Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                        Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
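For illustration, the sketch below issues such a fuzzy query with the Elasticsearch high-level REST client (6.x/7.x API); the index name breakpoints and the field typeName are assumptions, since the actual SDS mapping is not described here.

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws IOException {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Fuzzy query tolerating the misspelled term "fcatory".
        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(QueryBuilders.fuzzyQuery("typeName", "fcatory"));
        SearchRequest request = new SearchRequest("breakpoints").source(source);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        client.close();
    }
}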


                                        Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertices $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertices invoking methods and $\beta$ is the subset of all vertices invoked by methods, then the Starting and Ending methods are:

$$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$$

$$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
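In code, these two sets can be computed with a pass over the collected invocation edges, as in the following sketch (illustrative only, not the SDI implementation):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    // Each edge is a pair {invoking method, invoked method}.
    public static Set<String> startingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>();  // alpha: methods that invoke others
        Set<String> invoked = new HashSet<>();   // beta: methods that are invoked
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);             // in alpha but not in beta
        return starting;
    }

    public static Set<String> endingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);              // in beta but not in alpha
        return ending;
    }
}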

                                        Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions and their call graphs makes them easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.
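For example, an external tracer could post a collected breakpoint to the SDS over HTTP as sketched below; the endpoint path and the JSON fields are hypothetical, since the actual REST resources are not listed here.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RestTracerClient {
    public static void main(String[] args) throws Exception {
        // Endpoint and payload are illustrative; the real SDS resources may differ.
        URL url = new URL("http://localhost:8080/api/breakpoints");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        String json = "{\"sessionId\": 42, \"type\": \"BibtexParser\", \"lineNumber\": 159}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP status: " + conn.getResponseCode());
        conn.disconnect();
    }
}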



                                          Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code           Breakpoints
BibtexParser                138, 151, 159           2, 2, 2
                            160, 165, 168           3, 2, 3
                            176, 198, 199, 299      2, 2, 2, 2
EntryEditor                 717, 720, 721           3, 4, 2
                            723, 837, 842           2, 3, 2
                            1184, 1393              3, 2
BibDatabase                 175, 187, 223, 456      2, 3, 2, 6
OpenDatabaseAction          433, 450, 451           4, 2, 4
JabRefDesktop               4084430                 223
SaveDatabaseAction          177, 188                4, 2
BasePanel                   935, 969                2, 5
AuthorsFormatter            43, 131                 5, 4
EntryTableTransferHandler   346                     2
FieldTextMenu               84                      2
JabRefFrame                 1119                    2
JabRefMain                  8                       5
URLUtil                     95                      2

                                          Fig 6 Methods with 5 or more breakpoints

Finally, we count how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


                                          Table 9 Study 1 - Breakpoints by class across different tasks

Types                 Tasks with breakpoints (of Issues 318, 667, 669, 993, 1026)   Breakpoints   Dev. diversity
SaveDatabaseAction    3                                                              7             2
BasePanel             4                                                              14            7
JabRefDesktop         2                                                              9             4
EntryEditor           3                                                              36            4
BibtexParser          3                                                              44            6
OpenDatabaseAction    3                                                              19            13
JabRef                3                                                              3             3
JabRefMain            4                                                              5             4
URLUtil               2                                                              4             2
BibDatabase           3                                                              19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                          6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference-management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                                          Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have on average six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject-profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                          Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                          Fig 9 GV for Task 0667


                                          Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                          Fig 11 GV usefulness - experimental phase one

                                          Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


                                          Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44               126
Time to start      00:04:44      00:05:18         -33               112
Elapsed time       00:30:08      00:16:05         843               53

Task 1026
Metric             Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126              177
Time to start      00:04:02      00:03:43         19                92
Elapsed time       00:24:58      00:20:41         257               83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.
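For instance, taking the elapsed time of Task 0993 in Table 10, the last two columns follow directly from the raw averages (00:30:08 = 1808 s for the control group and 00:16:05 = 965 s for the experimental group):

\[
\Delta = 1808\,\text{s} - 965\,\text{s} = 843\,\text{s}, \qquad
\frac{E}{C} = \frac{965}{1808} \approx 53\%.
\]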

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to be scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

                                          7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools (using novel data-mining techniques) to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                          8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

                                          9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works in editing mode (finding breakpoints or visualizing paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as used by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                          10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                          11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                          References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso. In: Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton. In: Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler. In: 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers. In: Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine. In: ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky. In: Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas. In: 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc. In: Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc. In: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder. In: Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath. In: 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman. In: 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau. In: 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo. In: Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo. In: 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy. In: Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez. In: Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard. In: Proceedings, International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan. In: Proceedings, Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams. In: Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan. In: Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala. In: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers. In: CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz. In: Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza. In: Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                          Appendix - Implementation of Swarm Debugging

                                          Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                          Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.

                                          Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.
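To make the meta-model above concrete, the following is a minimal sketch of how one of these concepts could be mapped to a Java entity; the class and field names are assumptions made for illustration and do not reproduce the actual SDS schema.

// Hypothetical, simplified entity for the Breakpoint concept; field names are assumptions.
public class Breakpoint {

    private Long id;
    private String typeName;        // fully qualified type where the breakpoint was toggled
    private String methodSignature; // enclosing method, if any
    private int lineNumber;         // line in the source file
    private Long sessionId;         // owning Swarm Debugging session

    public Breakpoint() { }

    public Breakpoint(String typeName, String methodSignature, int lineNumber, Long sessionId) {
        this.typeName = typeName;
        this.methodSignature = methodSignature;
        this.lineNumber = lineNumber;
        this.sessionId = sessionId;
    }

    // getters and setters omitted for brevity
}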

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29.

29 http://projects.spring.io/spring-boot


Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
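For illustration, the request above could also be issued programmatically, as in the following sketch; it assumes Java 11's java.net.http client and the Spring Data REST endpoint convention shown above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmRestClientExample {

    public static void main(String[] args) throws Exception {
        // Query the SDS endpoint for developers named "petrillo".
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();

        // The SDS answers with a JSON document listing the matching developers.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}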

SQL Query Console. The SDS provides a console30 to run SQL queries on the debugging data, providing relational aggregations and functions.
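As a sketch of the kind of relational query the console supports, the snippet below runs a hypothetical aggregation over a breakpoint table through JDBC; the connection URL, credentials, and table and column names are assumptions, not the actual SDS schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeQuery {

    public static void main(String[] args) throws Exception {
        // Hypothetical connection parameters; adjust to the real SDS database.
        String url = "jdbc:postgresql://db.swarmdebugging.org:5432/swarm";
        try (Connection conn = DriverManager.getConnection(url, "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Count how many breakpoints were toggled on each type (assumed schema).
             ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints " +
                     "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}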

Full-text Search Engine. The SDS also provides an ElasticSearch31 instance, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                          Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                          Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
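As an illustration of such a query, the sketch below uses the Neo4j Java driver (4.x API) to retrieve invocation pairs; the node labels, relationship type, and property names are assumptions made for the example, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {

    public static void main(String[] args) {
        // Hypothetical connection settings for the SDS Neo4j instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {

            // Assumed schema: (:Method)-[:INVOKES]->(:Method), with a 'signature' property.
            Result result = session.run(
                    "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                    "RETURN caller.signature AS caller, callee.signature AS callee LIMIT 25");

            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("caller").asString()
                        + " -> " + record.get("callee").asString());
            }
        }
    }
}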

                                          Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
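A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below; it registers the two listeners mentioned above, and the body of each callback (sending data to the SDS) is only suggested in comments, since the actual SDT implementation is not reproduced here.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    /** Registers this tracer with the Eclipse debug framework. */
    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event with a STEP_END or BREAKPOINT detail means the developer
            // stepped or hit a breakpoint; the stack frames can then be inspected and
            // the resulting invocations posted to the SDS (omitted).
            if (event.getKind() == DebugEvent.SUSPEND
                    && (event.getDetail() == DebugEvent.STEP_END
                        || event.getDetail() == DebugEvent.BREAKPOINT)) {
                // collectAndSendInvocations(event);
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Read the breakpoint marker (type, line number) and post it to the SDS (omitted).
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        // Record breakpoint removals if needed (omitted).
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        // Record breakpoint changes, e.g., conditions being edited (omitted).
    }
}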

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Then, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                          Fig 17 The Swarm Tracer architecture [17]

                                          Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

                                          Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                          Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                          Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
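The sketch below illustrates the kind of fuzzy query this tool can run against the SDS search engine, using the Elasticsearch high-level REST client; the index and field names are assumptions made for the example, not the actual SDS mapping.

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class BreakpointFuzzySearch {

    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // Fuzzy query: the misspelled "fcatory" still matches breakpoints on "factory" types.
            SearchRequest request = new SearchRequest("breakpoints");  // assumed index name
            request.source(new SearchSourceBuilder()
                    .query(QueryBuilders.fuzzyQuery("typeName", "fcatory")));  // assumed field

            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            response.getHits().forEach(hit -> System.out.println(hit.getSourceAsString()));
        }
    }
}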


                                          Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertices $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is a pair $\langle V_i, V_j \rangle$, where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertices invoking methods and $\beta$ is the subset of all vertices invoked by methods, then the Starting and Ending methods are:

$$StartingPoint = \{ V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta \}$$

$$EndingPoint = \{ V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha \}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
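A small sketch of how these sets can be computed from the collected invocation edges follows; the edge representation (pairs of method names) is an assumption made for illustration.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** Starting methods: invoke others but are never invoked themselves (in alpha, not in beta). */
    public static Set<String> startingMethods(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] edge : invocations) {
            invoking.add(edge[0]);   // edge[0] invokes edge[1]
            invoked.add(edge[1]);
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);
        return starting;
    }

    /** Ending methods: invoked by others but never invoke anything (in beta, not in alpha). */
    public static Set<String> endingMethods(List<String[]> invocations) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] edge : invocations) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);
        return ending;
    }

    public static void main(String[] args) {
        List<String[]> edges = List.of(
                new String[] {"main", "Circle.draw"},
                new String[] {"Circle.draw", "Canvas.paint"});
        System.out.println(startingMethods(edges)); // [main]
        System.out.println(endingMethods(edges));   // [Canvas.paint]
    }
}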

                                          Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, by dividing software exploration into sessions, the resulting call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                          • 1 Introduction
                                          • 2 Background
                                          • 3 The Swarm Debugging Approach
                                          • 4 SDI in a Nutshell
                                          • 5 Using SDI to Understand Debugging Activities
                                          • 6 Evaluation of Swarm Debugging using GV
                                          • 7 Discussion
                                          • 8 Threats to Validity
                                          • 9 Related work
                                          • 10 Conclusion
                                          • 11 Acknowledgment


                                            Table 9 Study 1 - Breakpoints by class across different tasks

Types | Issue 318 | Issue 667 | Issue 669 | Issue 993 | Issue 1026 | Breakpoints | Dev Diversities
SaveDatabaseAction Yes Yes Yes 7 2
BasePanel Yes Yes Yes Yes 14 7
JabRefDesktop Yes Yes 9 4
EntryEditor Yes Yes Yes 36 4
BibtexParser Yes Yes Yes 44 6
OpenDatabaseAction Yes Yes Yes 19 13
JabRef Yes Yes Yes 3 3
JabRefMain Yes Yes Yes Yes 5 4
URLUtil Yes Yes 4 2
BibDatabase Yes Yes Yes 19 4

points, and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

                                            6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks, with a Tetris program as warm-up, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                                            Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                            Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                            Fig 9 GV for Task 0667


                                            Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% – 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                            Fig 11 GV usefulness - experimental phase one

                                            Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


                                            Table 10 Results from control and experimental groups (average)

Task 0993
Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:55      00:03:40         -44               126
Time to start     00:04:44      00:05:18         -33               112
Elapsed time      00:30:08      00:16:05         843               53

Task 1026
Metric            Control [C]   Experiment [E]   Delta [C-E] (s)   [E/C] (%)
First breakpoint  00:02:42      00:04:48         -126              177
Time to start     00:04:02      00:03:43         19                92
Elapsed time      00:24:58      00:20:41         257               83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation, because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation, because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation, because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                                            7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                            8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As with any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With seven participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI in obtaining insights about developers' debugging activities. Future studies with a more significant number of participants, and with more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI in collecting and sharing data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their effort during the tasks, their levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real effort, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant for the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine which parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done by developers during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                            10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                            11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                            References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering - E (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                            Appendix - Implementation of Swarm Debugging

                                            Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                            Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include the following (a sketch showing how one of them could be persisted appears after the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                            Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.
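To make the meta-model more concrete, the sketch below shows how one of these concepts, Breakpoint, could be persisted. It is only an illustration assuming a JPA mapping on top of the PostgreSQL store; the field names are ours and not the actual SDS schema.

// Hypothetical JPA mapping of the Breakpoint concept (illustration only;
// the use of JPA and the field names are assumptions, not the actual SDS code).
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Breakpoint {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private Long sessionId;     // Swarm session during which the breakpoint was toggled
    private String typeName;    // fully qualified type where the breakpoint was set
    private String methodName;  // enclosing method, if any
    private int lineNumber;     // line on which the breakpoint was toggled

    // getters and setters omitted for brevity
}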

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
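As an illustration, such a request can be issued programmatically with only the JDK; the snippet below is a minimal sketch (error handling and JSON parsing omitted), and the exact endpoint path is reconstructed from the example above.

// Minimal sketch: querying the SDS RESTful API from Java using only the JDK.
// The endpoint follows the example above; the response is printed as raw JSON.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // JSON list of developers named "petrillo"
            }
        }
    }
}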

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
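For instance, a researcher could compute simple aggregations over the collected breakpoints. The snippet below is a hedged sketch using plain JDBC against the PostgreSQL store; the connection parameters and the table and column names (breakpoint, type_name) are assumptions rather than the actual SDS schema.

// Hypothetical aggregation over collected breakpoints via JDBC.
// Connection parameters, table, and column names are illustrative only.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointStats {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints "
                 + "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                // Types with many breakpoints hint at debugging hot-spots.
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}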

Full-text Search Engine. The SDS also provides an ElasticSearch31, which is a highly scalable, open-source, full-text search and analytic engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. The ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                            Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                            Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
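As a further illustration, the same kind of query can be issued programmatically. The sketch below assumes the Neo4J Java driver (1.x, Bolt protocol); the node label Method and the relationship type INVOKES are our assumptions about the stored graph, not the documented SDS schema.

// Hypothetical Cypher query against the SDS graph store using the Neo4J Java driver.
// Address, credentials, labels, and relationship types are illustrative only.
import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Record;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.StatementResult;

public class InvocationQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                 AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            StatementResult result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                + "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("caller").asString()
                        + " -> " + record.get("callee").asString());
            }
        }
    }
}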

                                            Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
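The sketch below illustrates how such listeners can be registered with the Eclipse debug framework; it is a simplified approximation of what the SDT does, and the structure and comments are ours, not the actual SDT source code.

// Simplified sketch of a tracer hooking into the Eclipse debug framework.
// The structure is illustrative; the actual SDT is more elaborate.
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Receive stepping/suspend/resume events and breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A step (into/over/return) finished: the stack trace can be inspected
                // here and an invocation record sent to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Capture the type, method, and line of the toggled breakpoint and send it to the SDS.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}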

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm Session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                            Fig 17 The Swarm Tracer architecture [17]

                                            Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods.


                                            Fig 19 Breakpoint search tool (fuzzy search example)

Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                            Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                            Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                            Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
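Such a fuzzy match can be expressed directly in the ElasticSearch query DSL. The sketch below sends the query over plain HTTP with the JDK; the index and field names (breakpoints, typeName) are assumptions rather than the actual SDS mapping.

// Hypothetical fuzzy search for the misspelled term "fcatory" using the
// ElasticSearch query DSL; index and field names are illustrative only.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        URL url = new URL("http://localhost:9200/breakpoints/_search");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(query.getBytes(StandardCharsets.UTF_8));
        }
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // hits should include types whose name resembles "factory"
            }
        }
    }
}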


                                            Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$, where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes invoking methods and $\beta$ is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$

$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
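This definition translates directly into a simple set computation over the collected invocation edges, as the illustrative sketch below shows; the Invocation pair type and the method identifiers are ours, not the SDS API.

// Illustrative computation of starting and ending methods from invocation edges.
// The edge representation is hypothetical; the SDS stores invocations differently.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {

    /** An invocation edge: 'caller' invokes 'callee'. */
    static final class Invocation {
        final String caller;
        final String callee;
        Invocation(String caller, String callee) { this.caller = caller; this.callee = callee; }
    }

    /** Methods that invoke others (alpha) but are never invoked (not in beta). */
    static Set<String> startingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.caller);
            invoked.add(e.callee);
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);
        return starting;
    }

    /** Methods that are invoked (beta) but never invoke others (not in alpha). */
    static Set<String> endingMethods(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.caller);
            invoked.add(e.callee);
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);
        return ending;
    }

    public static void main(String[] args) {
        List<Invocation> edges = Arrays.asList(
            new Invocation("Main.main", "Circle.draw"),
            new Invocation("Circle.draw", "Canvas.paint"));
        System.out.println(startingMethods(edges)); // [Main.main]
        System.out.println(endingMethods(edges));   // [Canvas.paint]
    }
}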

                                            Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions and their call graphs is easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

                                              Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have on average 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

                                              We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions).

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                              Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                              Fig 9 GV for Task 0667


                                              Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                              Fig 11 GV usefulness - experimental phase one

                                              Fig 12 GV usefulness - experimental phase two

                                              The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort spent searching the code.


Table 10  Results from the control and experimental groups (average)

Task 0993
Metric              Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
First breakpoint    00:02:55      00:03:40         -44           126%
Time to start       00:04:44      00:05:18         -33           112%
Elapsed time        00:30:08      00:16:05         843           53%

Task 1026
Metric              Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
First breakpoint    00:02:42      00:04:48         -126          177%
Time to start       00:04:02      00:03:43         19            92%
Elapsed time        00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we found that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 18% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both information on the same graph, possibly by combining size and colours: size could relate to the developers' paths while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigating between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point further ways of improving our approach.

                                              7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (line of code, class, or method) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.
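To illustrate one possible realisation of such a frequency-based recommendation, the sketch below ranks previously shared breakpoints by how many distinct developer/task pairs set them at the same location. It is a minimal illustration only; the BreakpointRecord class and its fields are hypothetical and are not part of the SDI API.

import java.util.*;
import java.util.stream.*;

// Hypothetical record of a shared breakpoint, as it could be returned by the SDS.
final class BreakpointRecord {
    final String location;   // e.g., "BibtexParser:342" (illustrative)
    final String developer;
    final String task;
    BreakpointRecord(String location, String developer, String task) {
        this.location = location;
        this.developer = developer;
        this.task = task;
    }
}

final class BreakpointRecommender {
    // Ranks locations by the number of distinct (developer, task) pairs that used them.
    static List<String> recommend(List<BreakpointRecord> history, int limit) {
        Map<String, Set<String>> usage = new HashMap<>();
        for (BreakpointRecord r : history) {
            usage.computeIfAbsent(r.location, k -> new HashSet<>())
                 .add(r.developer + "/" + r.task);
        }
        return usage.entrySet().stream()
                .sorted((a, b) -> Integer.compare(b.getValue().size(), a.getValue().size()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}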

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                              8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant for the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that all participants did not look at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                              9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is for editing mode (finding breakpoints or visualizing paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

                                              10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. At first, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Secondly, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularity of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                              11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                              References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                              Appendix - Implementation of Swarm Debugging

                                              Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                              Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is a SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                              Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event represents the data collected when a developer performs some action during a debugging session.
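To make the meta-model more concrete, the sketch below shows how two of these concepts could be represented as plain Java value classes. It is only an illustration of the entities described above; the class and field names are ours and do not necessarily match the SDI source code.

// Illustrative representations of two SDI concepts; names are assumptions, not the actual SDI classes.
public class Session {
    private long id;
    private String developer;   // the Developer who created the session
    private String product;     // the target Product
    private String task;        // the Task being debugged
}

public class Breakpoint {
    private long sessionId;     // the Session in which the breakpoint was toggled
    private String namespace;   // the Namespace (package) of the type
    private String type;        // the Type (class) that owns the breakpoint
    private String method;      // the enclosing Method, if any
    private int lineNumber;     // location inside the type's source file
}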

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
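As a rough illustration of how a client could consume this API, the snippet below issues the example request with the JDK's HttpURLConnection and prints the JSON response. It is a minimal sketch; only the endpoint shown above is taken from the text, and the response structure is whatever the SDS actually returns.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Endpoint taken from the example request above.
        URL url = new URL("http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");

        // Read and print the JSON document returned by the SDS.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            connection.disconnect();
        }
    }
}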

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                              Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                              Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
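For readers unfamiliar with Cypher, the sketch below shows how a query over the collected invocations could be issued programmatically with the Neo4j Java driver. The query text and the node/relationship labels (Method, INVOKES) are our own assumptions for illustration, not the SDS schema, and the exact driver API may vary between Neo4j versions.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class SwarmGraphQueryExample {
    public static void main(String[] args) {
        // Connection settings are placeholders; use the SDS Neo4J instance credentials.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            // Hypothetical query: methods reached from a given starting method.
            Result result = session.run(
                "MATCH (m:Method {name: $name})-[:INVOKES*1..3]->(callee:Method) " +
                "RETURN DISTINCT callee.name AS callee",
                Values.parameters("name", "BibtexParser.parse"));

            while (result.hasNext()) {
                System.out.println(result.next().get("callee").asString());
            }
        }
    }
}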

                                              Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
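The following sketch illustrates how such a tracer can hook into the Eclipse debug framework. It registers the two listener interfaces named above through DebugPlugin; the event-handling bodies are simplified placeholders rather than the actual SDT implementation.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register for stepping/suspend events and breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // STEP_INTO, STEP_OVER, STEP_RETURN and BREAKPOINT are reported as details of suspend events.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                // Placeholder: here the SDT would analyse the stack trace and record invocations.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Placeholder: send the breakpoint location to the Swarm Debugging Services.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}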

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                              Fig 17 The Swarm Tracer architecture [17]

                                              Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                              Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, used to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                              Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                              Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
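To give an idea of what such a fuzzy lookup can look like at the ElasticSearch level, the sketch below posts a standard ElasticSearch fuzzy query for the misspelled term fcatory to a search endpoint. The index name ("breakpoints") and field name ("type_name") are illustrative assumptions, not the actual SDS mapping.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BreakpointFuzzySearchExample {
    public static void main(String[] args) throws Exception {
        // Index and field names are assumptions for illustration.
        URL url = new URL("http://localhost:9200/breakpoints/_search");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);

        // ElasticSearch fuzzy query: can match "factory" even though the search term is misspelled.
        String query = "{ \"query\": { \"fuzzy\": { \"type_name\": { \"value\": \"fcatory\" } } } }";
        try (OutputStream out = connection.getOutputStream()) {
            out.write(query.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + connection.getResponseCode());
    }
}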


                                              Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
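A direct way to read this definition is as two set differences over the invocation edges collected in a session. The sketch below computes both sets from a list of invoking/invoked pairs; the Invocation pair class is a stand-in for the SDI invocation records, not the actual SDI class.

import java.util.*;

public class StartEndMethodFinder {

    // Minimal stand-in for an SDI invocation record (invoking -> invoked).
    static final class Invocation {
        final String invoking;
        final String invoked;
        Invocation(String invoking, String invoked) {
            this.invoking = invoking;
            this.invoked = invoked;
        }
    }

    // StartingPoint = methods that invoke others but are never invoked themselves.
    static Set<String> startingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking);
            invoked.add(e.invoked);
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked);
        return result;
    }

    // EndingPoint = methods that are invoked but never invoke others.
    static Set<String> endingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking);
            invoked.add(e.invoked);
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking);
        return result;
    }
}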

                                              Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669 using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                                Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                                Fig 9 GV for Task 0667

                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 27

                                                Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the types by double-clicking on a selected type. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.

                                                28Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                Fig 11 GV usefulness - experimental phase one

                                                Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort spent searching code.

                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 29

Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 18% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.
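As a concrete reading of the ratio column, and assuming the times in Table 10 are given as hh:mm:ss, the elapsed-time entry for Task 0993 is obtained as follows:

\[
[E/C] = \frac{00{:}16{:}05}{00{:}30{:}08} = \frac{965\,\mathrm{s}}{1808\,\mathrm{s}} \approx 53\%,
\qquad 100\% - 53\% = 47\% \text{ less time for the experimental group.}
\]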

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                                                7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                                                8 Threats to Validity

Despite its promising results, there are threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                9 Related work

We now summarise works related to debugging to better position our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach applies both to editing mode (finding breakpoints or visualizing paths) and to interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions performed by 28 different independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                                11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 86(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                Appendix - Implementation of Swarm Debugging

                                                Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the event data collected when a developer performs some action during a debugging session.
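
To make the meta-model concrete, the sketch below shows how a few of these concepts could be represented as Java entity classes on the SDS side. It assumes a JPA-style mapping on top of the SQL database; the class and field names are illustrative, not the actual SDS schema.

    // Illustrative sketch only: possible JPA-style entities for some of the
    // concepts above. Names and fields are assumptions, not the SDS schema.
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.ManyToOne;

    @Entity
    class Session {
        @Id @GeneratedValue Long id;
        String purpose;               // e.g., the task or issue being debugged
    }

    @Entity
    class Breakpoint {
        @Id @GeneratedValue Long id;
        @ManyToOne Session session;   // debugging session in which it was toggled
        String typeName;              // type where the breakpoint was set
        String methodName;            // enclosing method, if any
        int lineNumber;               // line where the developer toggled it
    }

    @Entity
    class Invocation {
        @Id @GeneratedValue Long id;
        @ManyToOne Session session;
        String invokingMethod;        // caller visited by the developer
        String invokedMethod;         // callee reached while stepping
    }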

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
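
A minimal Java sketch of issuing this request is shown below. It assumes only the endpoint given above and the standard JDK 11 HTTP client; it is not part of the SDT code.

    // Minimal sketch: querying the SDS RESTful API for developers named "petrillo".
    // Uses only the JDK 11 HTTP client; error handling is omitted for brevity.
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SwarmRestExample {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(
                        "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                    .header("Accept", "application/json")
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // The SDS answers with a JSON document listing the matching developers.
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }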

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
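
As an illustration only: if, for instance, method invocations were stored as INVOKES relationships between Method nodes (these labels are assumptions, not the actual SDS graph schema), a Cypher query such as MATCH (a:Method)-[:INVOKES]->(b:Method) RETURN a, b LIMIT 25 would return part of the invocation graph, similar in spirit to the result shown in Figure 16.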

                                                Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
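
The sketch below illustrates how such a tracer can hook into the Eclipse debug framework through the two listener interfaces named above. It is a simplified example based on the standard Eclipse Debug Core API: the class name and the forwarding method are assumptions, not the actual SDT implementation.

    // Simplified sketch of a tracer registering the two listeners mentioned above.
    // sendToSwarmServices is a placeholder for the SDT's RESTful calls to the SDS.
    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // Suspend events cover breakpoint hits and Step Into/Over/Return.
                if (event.getKind() == DebugEvent.SUSPEND) {
                    sendToSwarmServices("suspend", event.getDetail());
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // Location details (type, method, line) would be extracted here.
            sendToSwarmServices("breakpointAdded", 0);
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

        private void sendToSwarmServices(String kind, int detail) {
            // Placeholder: the SDT sends this information to the SDS as RESTful messages.
        }
    }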

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                Fig 17 The Swarm Tracer architecture [17]

                                                Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
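
For illustration, assuming the search box forwards queries using ElasticSearch's Lucene-style query syntax (an assumption about the SDS query mapping), a fuzzy query such as fcatory~ could still match breakpoints toggled in types whose names contain "Factory", while a wildcard query such as *Parser* would list breakpoints toggled in any type whose name contains "Parser".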


                                                Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
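
As a minimal illustration of this definition, the following sketch computes the starting and ending methods from a set of invocation edges. The edges and method names are hypothetical; the logic is independent of the SDS schema.

    // Minimal sketch: computing starting and ending methods from invocation edges.
    // An edge (invoking, invoked) corresponds to one Invocation recorded in a session.
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class StartEndPoints {
        public static void main(String[] args) {
            // Hypothetical invocation edges collected during a debugging session.
            List<String[]> edges = List.of(
                    new String[] {"Main.main", "Shape.draw"},
                    new String[] {"Shape.draw", "Circle.draw"},
                    new String[] {"Circle.draw", "DrawingAPI.drawCircle"});

            Set<String> invoking = new HashSet<>(); // α: methods that invoke others
            Set<String> invoked  = new HashSet<>(); // β: methods that are invoked
            for (String[] e : edges) {
                invoking.add(e[0]);
                invoked.add(e[1]);
            }

            Set<String> starting = new HashSet<>(invoking); // α \ β
            starting.removeAll(invoked);
            Set<String> ending = new HashSet<>(invoked);    // β \ α
            ending.removeAll(invoking);

            System.out.println("Starting methods: " + starting); // e.g., [Main.main]
            System.out.println("Ending methods: " + ending);     // e.g., [DrawingAPI.drawCircle]
        }
    }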

                                                Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, because software exploration is divided by sessions, its call graphs are easy to understand: only intentionally visited areas are shown on these graphs, so one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to manage their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of the GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                                  Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                                  Fig 9 GV for Task 0667


                                                  Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used the GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the types by double-clicking on a selected type. We asked participants if the GV is useful to support software maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                                  Fig 11 GV usefulness - experimental phase one

                                                  Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we found that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments.

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

                                                  7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                                  8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                  9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is for editing mode (finding breakpoints or visualizing paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                  10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                  11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                  References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse, Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K. De Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                  Appendix - Implementation of Swarm Debugging

                                                  Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                  Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. They include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.

Fig 14 The Swarm Debugging metadata [17]

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, implemented with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
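As an illustration, the following minimal Java sketch issues the request shown above and prints the raw JSON response. It is only an example of how a client might consume the SDS API; the endpoint follows the request shown above, but the class name and the exact shape of the returned JSON are assumptions made for this sketch.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmClientExample {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo" (endpoint as shown above).
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // The SDS is expected to answer with a JSON document listing the matching developers.
        System.out.println(response.body());
    }
}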

SQL Query Console. The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
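For instance, a query along the following lines could summarize how often each type received breakpoints across sessions; the table and column names are illustrative assumptions and do not necessarily match the actual SDS schema.

SELECT type_name, COUNT(*) AS breakpoint_count
FROM breakpoint
GROUP BY type_name
ORDER BY breakpoint_count DESC;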

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                  Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                  Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
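As a sketch of the kind of pattern such queries can express, the following hypothetical Cypher query retrieves the method invocations recorded for one session; the node labels, relationship type, and property names are assumptions made for illustration and do not necessarily reflect the exact SDS graph schema.

MATCH (caller:Method)-[inv:INVOKES]->(callee:Method)
WHERE inv.session = 42
RETURN caller.name, callee.name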

                                                  Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
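A minimal sketch of such a tracer is shown below, assuming the Eclipse debug core plug-in (org.eclipse.debug.core) is on the classpath. The Eclipse listener interfaces are real; the class name and the println calls are hypothetical stand-ins for the SDT, which instead forwards these events to the SDS.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    // Register this tracer with the Eclipse debug framework.
    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping events (Step Into/Over/Return) and suspend/resume events arrive here.
            System.out.println("Debug event: kind=" + event.getKind()
                    + ", detail=" + event.getDetail());
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Called when a developer toggles a new breakpoint.
        System.out.println("Breakpoint added in "
                + breakpoint.getMarker().getResource().getName());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        System.out.println("Breakpoint removed");
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        System.out.println("Breakpoint changed");
    }
}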

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects the developer's interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                  Fig 17 The Swarm Tracer architecture [17]

                                                  Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                  Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                  Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                  Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
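Since ElasticSearch accepts Lucene-style query strings, searches along the following lines would match the intended breakpoints despite a typo or a partial name; the field name and the matched terms are assumptions about the index layout, shown only to illustrate the fuzzy and wildcard syntax.

fcatory~   (fuzzy query: matches terms within a small edit distance, e.g., "factory")
type:*Parser*   (wildcard query: matches any indexed type name containing "Parser")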


                                                  Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is a pair <V_i, V_j> where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the starting and ending methods are:

StartingPoint = { V_SP | V_SP ∈ α ∧ V_SP ∉ β }

EndingPoint = { V_EP | V_EP ∈ β ∧ V_EP ∉ α }

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
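To make the definition concrete, the following small Java sketch computes the starting and ending methods from a list of invocation edges; the edge representation and the example method names are assumptions made for illustration and do not reflect the SDS data model.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {
    // An invocation edge: "invoking" calls "invoked".
    record Edge(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("Main.main", "Parser.parse"),
                new Edge("Parser.parse", "Parser.readEntry"),
                new Edge("Parser.readEntry", "Util.trim"));

        Set<String> alpha = new HashSet<>(); // vertices that invoke other methods
        Set<String> beta = new HashSet<>();  // vertices that are invoked
        for (Edge e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha); // in alpha but not in beta
        starting.removeAll(beta);
        Set<String> ending = new HashSet<>(beta);    // in beta but not in alpha
        ending.removeAll(alpha);

        System.out.println("Starting methods: " + starting); // e.g., [Main.main]
        System.out.println("Ending methods: " + ending);     // e.g., [Util.trim]
    }
}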

                                                  Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

                                                    Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

                                                    Fig 9 GV for Task 0667


                                                    Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants in the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                                    Fig 11 GV usefulness - experimental phase one

                                                    Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric           | Control [C] | Experiment [E] | ∆ [C−E] (s) | [E/C] (%)
First breakpoint | 00:02:55    | 00:03:40       | -44         | 126
Time to start    | 00:04:44    | 00:05:18       | -33         | 112
Elapsed time     | 00:30:08    | 00:16:05       | 843         | 53

Task 1026
Metric           | Control [C] | Experiment [E] | ∆ [C−E] (s) | [E/C] (%)
First breakpoint | 00:02:42    | 00:04:48       | -126        | 177
Time to start    | 00:04:02    | 00:03:43       | 19          | 92
Elapsed time     | 00:24:58    | 00:20:41       | 257         | 83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed" while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

                                                    7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debuggers' developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits, to create new tools, using novel data-mining techniques, and to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                                    8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                    9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to the task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is for editing mode (finding breakpoints or visualizing paths) as well as for interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                    10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time to set the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when they build debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental models involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers and use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes; this latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                    11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also thank all the participants in our experiments and the anonymous reviewers for their insightful comments.

                                                    References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, in 2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1, p. 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616

15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                    Appendix - Implementation of Swarm Debugging

                                                    Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                    Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                    Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.
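To make the meta-model concrete, the sketch below shows how a few of these concepts could be mapped to plain Java classes. This is a minimal illustration only: the class and field names are assumptions and are not taken from the actual SDI code base.

// Hypothetical, simplified Java mapping of part of the Swarm Debugging
// meta-model; names are illustrative, not the real SDI entities.
import java.util.ArrayList;
import java.util.List;

class Breakpoint {
    String typeName;      // fully qualified type where the breakpoint was toggled
    String methodName;    // enclosing method, if any
    int lineNumber;       // source line of the breakpoint
}

class Invocation {
    String invokingMethod;
    String invokedMethod;
}

class Session {
    String developer;     // who ran the debugging session
    String product;       // target software product
    String task;          // task being debugged
    List<Breakpoint> breakpoints = new ArrayList<>();
    List<Invocation> invocations = new ArrayList<>();
}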

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
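For illustration, such an endpoint could be queried from Java as sketched below. The URL mirrors the example above; the exact resource paths and JSON fields exposed by the SDS are assumptions here, and error handling is omitted.

// Minimal sketch of calling the SDS RESTful API from Java.
// The JSON response is printed as-is, without parsing.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmRestClient {
    public static void main(String[] args) throws Exception {
        URL url = new URL(
            "http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // raw JSON response
            }
        }
    }
}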

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
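As an example of the kind of relational aggregation the console supports, the JDBC sketch below counts breakpoints per type. The table and column names (breakpoint, type_name) and the connection details are hypothetical, since the actual SDS schema is not described here.

// Hypothetical JDBC query against the SDS PostgreSQL database.
// Table, column names, and credentials are assumptions for illustration only.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointStats {
    public static void main(String[] args) throws Exception {
        String sql = "SELECT type_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint GROUP BY type_name "
                   + "ORDER BY breakpoints DESC";
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/swarm", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name")
                    + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}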

Full-text Search Engine. The SDS also provides ElasticSearch,31 a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                    Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                    Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
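A typical Cypher pattern over the invocation data might look like the sketch below, here issued through the Neo4J Java driver. The node label and relationship type (Method, INVOKES) are assumptions, since the exact graph schema used by the SDS is not detailed in this appendix.

// Hypothetical Cypher query over the invocation graph via the Neo4J Java driver.
// Node labels, relationship type, host, and credentials are assumptions.
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver(
                 "bolt://localhost:7687", AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) "
              + "RETURN a.name AS invoking, b.name AS invoked LIMIT 25");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("invoking").asString()
                    + " -> " + record.get("invoked").asString());
            }
        }
    }
}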

                                                    Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
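A minimal sketch of how such listeners can be hooked into the Eclipse debug framework is shown below. It only prints the intercepted events; the actual SDT analyses stack trace items and forwards the extracted data to the Swarm Debugging Services.

// Minimal sketch of registering Eclipse debug listeners, in the spirit of the SDT.
// Only event logging is shown; the real tracer extracts method invocations.
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void install() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // SUSPEND events cover breakpoint hits and the end of stepping actions.
            if (event.getKind() == DebugEvent.SUSPEND) {
                System.out.println("Suspended (detail=" + event.getDetail()
                    + "): " + event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint toggled in: "
            + breakpoint.getMarker().getResource());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}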

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                    Fig 17 The Swarm Tracer architecture [17]

                                                    Fig 18 The Swarm Manager view

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                                    Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow represents a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                    Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                                                    Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
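Such a fuzzy lookup can be expressed directly against the ElasticSearch REST endpoint, as in the sketch below. The index and field names (breakpoints, typeName) and the host are assumptions, since the actual mapping used by the SDS is not given here.

// Hypothetical ElasticSearch fuzzy query for the misspelled word "fcatory".
// Index name, field name, and host are assumptions for illustration only.
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        String query =
            "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        URL url = new URL("http://localhost:9200/breakpoints/_search");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(query.getBytes(StandardCharsets.UTF_8));
        }
        try (Scanner scanner = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (scanner.hasNextLine()) {
                System.out.println(scanner.nextLine()); // raw JSON hits
            }
        }
    }
}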


                                                    Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}
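Concretely, given the set of collected invocation pairs, starting and ending methods are simple set differences (α \ β and β \ α), as in the sketch below; the method names used here are arbitrary examples, not data from our studies.

// Computing starting and ending methods from invocation pairs.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {
    public static void main(String[] args) {
        // Each pair is {invoking method, invoked method}; example data only.
        List<String[]> invocations = Arrays.asList(
            new String[] {"Main.main", "Parser.parse"},
            new String[] {"Parser.parse", "Parser.readEntry"},
            new String[] {"Parser.readEntry", "Field.trim"});

        Set<String> invoking = new HashSet<>(); // α: methods that invoke others
        Set<String> invoked = new HashSet<>();  // β: methods invoked by others
        for (String[] pair : invocations) {
            invoking.add(pair[0]);
            invoked.add(pair[1]);
        }

        Set<String> startingPoints = new HashSet<>(invoking);
        startingPoints.removeAll(invoked);      // α \ β
        Set<String> endingPoints = new HashSet<>(invoked);
        endingPoints.removeAll(invoking);       // β \ α

        System.out.println("Starting points: " + startingPoints);
        System.out.println("Ending points: " + endingPoints);
    }
}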

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.

                                                    Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



                                                      Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0%, 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used the GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether the GV is useful to support software maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


                                                      Fig 11 GV usefulness - experimental phase one

                                                      Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we noted that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by updating the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                                                      7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line,28 and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on developers' setting of breakpoints and stepping, to improve debuggers and other tool support.

                                                      8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their effort during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that their answers do not represent their real effort, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumptions.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

                                                      External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

                                                      9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                      10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                      11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                      References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204

2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA

3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220

4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)

5. P. Wainwright, GNU DDD - Data Display Debugger (2010)

6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471

7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573

8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829

9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555

10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501

11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669

12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887

13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1

14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460

16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425

17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10

18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39

19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939

20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010

21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575

22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5

23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y

24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6

25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299

26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)

27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111

28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447

29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015

30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583

31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352

32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract

33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9

34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812

35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)

36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669

37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260

43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465

45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                      Appendix - Implementation of Swarm Debugging

                                                      Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                      Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, also sketched in code after the list below, include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.

Fig 14 The Swarm Debugging metadata [17]
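
To make the meta-model more concrete, the following minimal Java sketch shows how two of these entities could look. The field names and the reduced shape of the classes are illustrative assumptions only; the actual SDS entities are persisted with more attributes and relationships.

    // Illustrative sketch only: field names are assumptions, not the actual SDS schema.
    class Session {
        Long id;
        String developer;   // the developer who created the session
        String task;        // the task being debugged
        String product;     // the target software product
    }

    class Breakpoint {
        Long id;
        String typeName;    // fully qualified name of the type where the breakpoint was toggled
        String methodName;  // enclosing method, if any
        int lineNumber;     // line on which the breakpoint was set
        Session session;    // debugging session during which it was collected
    }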

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

    http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
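
As an illustration, the following sketch issues that request with the JDK 11 HTTP client and prints the raw JSON response. It is only a sketch: it assumes the SDS instance at swarmdebugging.org is reachable and requires no authentication.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class FindDeveloperExample {
        public static void main(String[] args) throws Exception {
            // Query the SDS RESTful API for developers named "petrillo".
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // JSON list of matching developers
        }
    }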

SQL Query Console The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                      Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                      Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
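
To give an idea of the kind of pattern query that can be run against this database, the sketch below uses the standard Neo4j Java driver to list methods reached from a given method in the recorded sessions. The node label (Method), relationship type (INVOKES), property names, address, and credentials are assumptions made for illustration; the actual SDS graph schema may differ.

    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Record;
    import org.neo4j.driver.Result;
    import org.neo4j.driver.Session;

    public class InvocationQueryExample {
        public static void main(String[] args) {
            // Connect to the Neo4J instance backing the SDS (placeholder address and credentials).
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                // All methods invoked, directly or up to three calls away, from a method named "parse".
                Result result = session.run(
                    "MATCH (m:Method {name: 'parse'})-[:INVOKES*1..3]->(callee:Method) "
                    + "RETURN DISTINCT callee.name AS name");
                while (result.hasNext()) {
                    Record record = result.next();
                    System.out.println(record.get("name").asString());
                }
            }
        }
    }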

                                                      Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
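
A minimal sketch of such a tracer is shown below, using the Eclipse debug core API; the println calls stand in for the code that, in the real SDT, analyses the stack trace and ships the data to the SDS.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class MinimalDebugTracer implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            // Register the tracer with the Eclipse debug plug-in.
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.RESUME
                        && (event.getDetail() == DebugEvent.STEP_INTO
                            || event.getDetail() == DebugEvent.STEP_OVER
                            || event.getDetail() == DebugEvent.STEP_RETURN)) {
                    // The developer triggered a stepping action.
                    System.out.println("Step event, detail=" + event.getDetail());
                } else if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.BREAKPOINT) {
                    // Execution suspended at a breakpoint.
                    System.out.println("Suspended at a breakpoint");
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // Record the breakpoint location.
            System.out.println("Breakpoint added: " + breakpoint.getMarker());
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }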

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                      Fig 17 The Swarm Tracer architecture [17]

                                                      Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                      Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                                      Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                      Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
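
For instance, a fuzzy ElasticSearch query tolerating that misspelling could look like the sketch below; the host, index name (breakpoints), and field name (className) are assumptions for illustration, not the actual SDS schema.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class FuzzyBreakpointSearch {
        public static void main(String[] args) throws Exception {
            // ElasticSearch fuzzy query: the misspelling "fcatory" can still match "factory".
            String query = "{ \"query\": { \"fuzzy\": { \"className\": { \"value\": \"fcatory\" } } } }";
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // matching breakpoints as JSON hits
        }
    }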


                                                      Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
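
These sets are straightforward to compute from the collected invocations; the self-contained sketch below (with made-up method names as vertexes) shows one way to do it.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class StartingEndingMethods {

        // One invocation edge: 'invoking' calls 'invoked'.
        record Invocation(String invoking, String invoked) { }

        public static void main(String[] args) {
            List<Invocation> invocations = List.of(
                new Invocation("Main.main", "Parser.parse"),
                new Invocation("Parser.parse", "Parser.parseEntry"),
                new Invocation("Parser.parseEntry", "Field.read"));

            Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
            Set<String> beta = new HashSet<>();  // vertexes that are invoked
            for (Invocation i : invocations) {
                alpha.add(i.invoking());
                beta.add(i.invoked());
            }

            Set<String> startingPoints = new HashSet<>(alpha);
            startingPoints.removeAll(beta); // in alpha and not in beta

            Set<String> endingPoints = new HashSet<>(beta);
            endingPoints.removeAll(alpha);  // in beta and not in alpha

            System.out.println("Starting methods: " + startingPoints); // [Main.main]
            System.out.println("Ending methods: " + endingPoints);     // [Field.read]
        }
    }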

                                                      Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



                                                        Fig 11 GV usefulness - experimental phase one

                                                        Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   E/C (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   E/C (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation, under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

                                                        7 Discussion

                                                        We now discuss some implications of our work for Software Engineering re-searchers developers debuggersrsquo developers and educators SDI (and GV) isopen and freely available on-line28 and researchers can use them to performnew empirical studies about debugging activities

                                                        Developers can use SDI to record their debugging patterns toidentify debugging strategies that are more efficient in the context of theirprojects to improve their debugging skills

                                                        Developers can share their debugging activities such as breakpointsandndashor stepping paths to improve collaborative work and ease debuggingWhile developers usually work on specific tasks there are sometimes re-openissues andndashor similar tasks that need to understand or toggle breakpoints onthe same entity Thus using breakpoints previously toggled by a developercould help to assist another developer working on a similar task For instancethe breakpoint search tools can be used to retrieve breakpoints from previousdebugging sessions which could help speed up a new one providing devel-opers with valid starting points Therefore the breakpoint searching tool candecrease the time spent to toggle a new breakpoint

                                                        Developers of debuggers can use SDI to understand developersrsquodebugging habits to create new tools ndash using novel data-mining techniques ndashto integrate different data sources SDI provides a transparent framework fordevelopers to share debugging information creating a collective intelligenceabout their projects

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                                        8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, more systems, and more tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in the study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                        9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour and Hamdar [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as described by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

                                                        10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                        11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                        References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2012.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551

                                                        Appendix - Implementation of Swarm Debugging

                                                        Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                        Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data that is collected when a developer performs some actions during a debugging session.

Fig 14 The Swarm Debugging metadata [17]
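To make the meta-model more concrete, the sketch below renders a few of these concepts as plain Java classes. The field names are illustrative assumptions, not the SDI's actual schema.

```java
// Minimal sketch of the Swarm Debugging meta-model (illustrative fields only).
import java.util.ArrayList;
import java.util.List;

class Developer { String name; }

class Session {
    Developer developer;                       // who debugs
    String product;                            // target software product
    String task;                               // task being executed
    List<Breakpoint> breakpoints = new ArrayList<>();
    List<Invocation> invocations = new ArrayList<>();
}

class Type { String namespace; String fullName; String sourceFile; }

class Method { Type declaringType; String signature; }

class Breakpoint { Type type; Method method; int lineNumber; }

class Invocation { Method invoking; Method invoked; }

class Event { String kind; long timestamp; }   // e.g., "StepInto", "Breakpoint"
```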

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
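For illustration, such a request can be issued with any HTTP client; the sketch below uses Java's standard HttpClient (Java 11+) against the endpoint shown above and simply prints the returned JSON.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Query the SDS RESTful API for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());  // JSON list of matching developers
    }
}
```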

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
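As an example of the kind of relational aggregation the console supports, the following sketch runs a hypothetical query over the breakpoint data through plain JDBC; the connection string, table, and column names are assumptions, not the SDS's actual schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerType {
    public static void main(String[] args) throws Exception {
        // Hypothetical PostgreSQL connection to the SDS database.
        try (Connection con = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             // Count breakpoints per type to spot debugging "hot-spots".
             ResultSet rs = st.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}
```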

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                        Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                        Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
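As an illustration of the kind of pattern such a console can answer, the sketch below runs a Cypher query through the Neo4j Java driver (4.x API assumed); the node labels, relationship type, and properties are assumptions about how invocations might be stored, not the SDS's actual graph schema.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                                                  AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Hypothetical schema: (:Method)-[:INVOKES]->(:Method) recorded per debugging session.
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                "RETURN caller.name AS caller, callee.name AS callee LIMIT 25");
            result.forEachRemaining(r ->
                System.out.println(r.get("caller").asString() + " -> " + r.get("callee").asString()));
        }
    }
}
```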

                                                        Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
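A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below; it registers the two listeners named above and only prints what it observes, whereas the actual SDT additionally analyses stack traces and ships the data to the SDS.

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class MinimalDebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Hook into the Eclipse debug framework.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Suspend events cover breakpoint hits and step ends (Step Into/Over/Return).
            if (event.getKind() == DebugEvent.SUSPEND) {
                System.out.println("Suspended, detail=" + event.getDetail());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint toggled in: "
                + breakpoint.getMarker().getResource().getName());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { /* not traced */ }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { /* not traced */ }
}
```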

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                        Fig 17 The Swarm Tracer architecture [17]

                                                        Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)
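The invocation pairs can be derived from the suspended thread's stack. The sketch below shows one way to do this with the Eclipse/JDT debug model (a simplification of what the SDT does: error handling is omitted and the pairs are only printed rather than sent to the SDS).

```java
import org.eclipse.debug.core.DebugException;
import org.eclipse.debug.core.model.IStackFrame;
import org.eclipse.debug.core.model.IThread;
import org.eclipse.jdt.debug.core.IJavaStackFrame;

public class InvocationExtractor {

    /** Records an invoking/invoked pair for each adjacent couple of frames on the stack. */
    public void extractInvocations(IThread thread) throws DebugException {
        IStackFrame[] frames = thread.getStackFrames(); // top of the stack first
        for (int i = frames.length - 1; i > 0; i--) {
            IJavaStackFrame invoking = (IJavaStackFrame) frames[i];
            IJavaStackFrame invoked = (IJavaStackFrame) frames[i - 1];
            System.out.println(signature(invoking) + " -> " + signature(invoked));
        }
    }

    private String signature(IJavaStackFrame frame) throws DebugException {
        return frame.getDeclaringTypeName() + "." + frame.getMethodName();
    }
}
```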

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                        Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                                        Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                        Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
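A fuzzy query like the one in Figure 19 can be expressed directly in the ElasticSearch query DSL; the sketch below posts such a query with Java's HttpClient. The index and field names ("breakpoints", "typeName") are assumptions for illustration, not the SDS's actual mapping.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy match tolerates the misspelled term "fcatory" and still finds "factory" types.
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with the matching breakpoints
    }
}
```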


                                                        Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj> where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α and VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β and VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
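Given the invocation edges collected in a session, the two sets can be computed with a couple of set differences, as in the sketch below (a straightforward reading of the definition above, with hypothetical example inputs).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** An invocation edge: "invoking" calls "invoked". */
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Main.main", "Parser.parse"),
                new Invocation("Parser.parse", "Parser.readEntry"));

        Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
        Set<String> beta = new HashSet<>();  // vertexes that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);            // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);             // in beta and not in alpha

        System.out.println("Starting methods: " + starting); // [Main.main]
        System.out.println("Ending methods: " + ending);     // [Parser.readEntry]
    }
}
```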

                                                        Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                                          Swarm Debugging the Collective Intelligence on Interactive Debugging 29

                                                          Table 10 Results from control and experimental groups (average)

                                                          Task 0993

                                                          Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

                                                          First breakpoint 000255 000340 -44 126

                                                          Time to start 000444 000518 -33 112

                                                          Elapsed time 003008 001605 843 53

                                                          Task 1026

                                                          Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

                                                          First breakpoint 000242 000448 -126 177

                                                          Time to start 000402 000343 19 92

                                                          Elapsed time 002458 002041 257 83

                                                          63 Comparing Results from the Control and Experimental Groups

                                                          We compared the control and experimental groups using three metrics (1) thetime for setting the first breakpoint (2) the time to start a debugging sessionand (3) the elapsed time to finish the task We analysed recording sessionsof Tasks 0993 and 1026 compiling the average results from the two groups inTable 10

                                                          Observing the results in Table 10 we observed that the experimental groupspent more time to set the first breakpoint (26 more time for Task 0993 and77 more time for Task 1026) The times to start a debugging session are nearthe same (12 more time for Task 0993 and 18 less time for Task 1026) whencompared to the control group However participants who used our approachspent less time to finish both tasks (47 less time to Task 0993 and 17less time for Task 1026) This result suggests that participants invested moretime to toggle carefully the first breakpoint but consecutively completed thetasks faster than participants who toggled breakpoints quickly corroboratingour results in RQ2

                                                          Our results show that participants who used theshared debugging data invested more time to de-cide the first breakpoint but reduced their timeto finish the tasks These results suggest thatsharing debugging information using Swarm De-bugging can reduce the time spent on debuggingtasks

                                                          30Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                          64 Participantsrsquo Feedback

                                                          As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

                                                          641 Intrinsic Advantage

                                                          Visualisation of Debugging Paths Participants commended our visualisationfor presenting useful information related to the classes and methods followedby other developers during debugging In particular one participant reportedthat ldquo[i]t seems a fairly simple way to visualize classes and to demonstratehow they interactrdquo which comforts us in our choice of both the visualisationtechnique (graphs) and the data to display (developersrsquo debugging paths)

                                                          Effort in Debugging Three participants also mentioned that our visualisationshows where developers spent their debugging effort and where there are un-derstanding ldquobottlenecksrdquo In particular one participant wrote that our vi-sualisation ldquoallows the developer to skip several steps in debugging knowingfrom the graph where the problem probably comes fromrdquo

                                                          642 Intrinsic Limitations

                                                          Location One participant commented that ldquothe location where [an] issue oc-curs is not the same as the one that is responsible for the issuerdquo We are wellaware of this difference between the location where a fault occurs for exam-ple a null-pointer exception and the location of the source of the fault forexample a constructor where the field is not initialisedrdquo

                                                          However we build our visualisation on the premise that developers canshare their debugging activities for that particular reason by sharing theycould readily identify the source of a fault rather than only the location whereit occurs We plan to perform further studies to assess the usefulness of ourvisualisation to validate (or not) our premise

                                                          Scalability Several participants commented on the possible lack of scalabilityof our visualisation Graphs are well known to be not scalable so we areexpecting issues with larger graphs [34] Strategies to mitigate these issuesinclude graph sampling and clustering We plan to add these features in thenext release of our technique

                                                          Swarm Debugging the Collective Intelligence on Interactive Debugging 31

                                                          Presentation Several participants also commented on the (relative) lack ofinformation brought by the visualisation which is complementary to the lim-itation in scalability

                                                          One participant commented on the difference between the graph showingthe developersrsquo paths and the relative importance of classes during executionFuture work should seek to combine both information on the same graph pos-sibly by combining size and colours size could relate to the developersrsquo pathswhile colours could indicate the ldquoimportancerdquo of a class during execution

                                                          Evolution One participant commented that the graph is relevant for one ver-sion of the system but that as soon as some changes are performed by adeveloper the paths (or parts thereof) may become irrelevant

                                                          We agree with the participant and accept this limitation because our vi-sualisation is currently implemented for one version We will explore in futurework how to handle evolution by changing the graph as new versions are cre-ated

                                                          Trap One participant warned that our visualisation could lead developers intoa ldquotraprdquo if all developers whose paths are displayed followed the ldquowrongrdquopaths We agree with the participant but accept this limitation because devel-opers can always choose appropriate paths

                                                          Understanding One participant reported that the visualisation alone does notbring enough information to understand the task at hand We accept thislimitation because our visualisation is built to be complementary to otherviews available in the IDE

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation “[a]llows to quickly see where the problem probably has been before it got fixed”, while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also contain many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as “execution time metrics [by] invocations” and “failure/success rate [by] invocations”, could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a “method-level” view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question “Describe your debugging experience”. All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct: developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to identify further ways of improving our approach.

                                                          7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                                                          8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

Like any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalisation of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants, more systems, and more tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, to the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

                                                          9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities: they concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the “true” locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                          10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time to set the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important “debugging hot-spots” (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated “hunters”, and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated “builders”. Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                                          11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                          References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso. in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton. in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD – Data Display Debugger (2010)
6. A. Ko. in Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler. in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers. in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine. in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky. in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas. in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc. in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc. in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder. in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath. in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman. in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau. in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo. in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo. in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu. in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy. in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez. in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard. in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan. in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams. in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan. in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala. in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers. in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz. in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza. in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming. in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                          Appendix - Implementation of Swarm Debugging

                                                          Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialised persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                          Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                                                          Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event is event data collected when a developer performs some action during a debugging session.
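To make the meta-model concrete, the sketch below maps two of these concepts onto plain Java classes. It is a minimal illustration only: the class layout and field names are assumptions chosen for readability, not the actual SDI persistence classes.

```java
// Illustrative sketch: field names are assumed, not the real SDI schema.
import java.time.Instant;

class Session {
    Long id;
    Long developerId;   // the Developer who runs the debugging session
    Long productId;     // the Product (set of Eclipse projects) under debugging
    Long taskId;        // the Task the developer is working on
    Instant startedAt;
}

class Breakpoint {
    Long id;
    Long sessionId;     // the Session in which the breakpoint was toggled
    String namespace;   // package containing the type
    String typeName;    // class or interface owning the breakpoint
    String methodName;  // enclosing method, if any
    int lineNumber;     // source line where the breakpoint was set
    Instant createdAt;
}
```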

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are “petrillo”, in JSON format.

29 http://projects.spring.io/spring-boot
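As an illustration, the snippet below issues that same request with the standard java.net.http client (Java 11+). It is a sketch under the assumption that the endpoint behaves as described above; error handling is kept minimal.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FindDeveloperExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Same query as in the text: find developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}
```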

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
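For instance, a researcher could aggregate the collected breakpoints per type directly against the PostgreSQL database. The JDBC sketch below is hypothetical: the connection settings, table, and column names are assumptions used only to illustrate the kind of relational aggregation the console supports.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeQuery {
    public static void main(String[] args) throws Exception {
        // Connection string and schema names are illustrative only.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             // Count how many breakpoints were toggled on each type, most hit first.
             ResultSet rs = st.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n",
                        rs.getString("type_name"), rs.getInt("breakpoints"));
            }
        }
    }
}
```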

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                          Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                          Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
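A query in the spirit of Figure 16 could also be issued programmatically. The sketch below uses the Neo4j Java driver; the node label and relationship name (Method, INVOKES) are assumptions made to illustrate the invocation graph, and are not necessarily the labels used by the SDS.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Methods that invoke others but are never invoked themselves
            // (the "starting points" discussed later in this appendix).
            Result result = session.run(
                    "MATCH (m:Method) "
                  + "WHERE (m)-[:INVOKES]->(:Method) AND NOT (:Method)-[:INVOKES]->(m) "
                  + "RETURN m.name AS name");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("name").asString());
            }
        }
    }
}
```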

                                                          Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
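A minimal sketch of such a tracer is shown below, registering for debug events and breakpoint changes through the Eclipse Debug Core API. The forwarding methods (sendStepEvent, sendSuspendEvent, sendBreakpoint) are hypothetical placeholders for the SDT's RESTful calls, not the plug-in's actual code.

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register with the Eclipse debug framework.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping commands appear as RESUME events with a step detail.
            if (event.getKind() == DebugEvent.RESUME
                    && (event.getDetail() == DebugEvent.STEP_INTO
                        || event.getDetail() == DebugEvent.STEP_OVER
                        || event.getDetail() == DebugEvent.STEP_RETURN)) {
                sendStepEvent(event);
            }
            // A suspension caused by a breakpoint hit.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                sendSuspendEvent(event);
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendBreakpoint(breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

    // Hypothetical forwarding to the Swarm Debugging Services (bodies omitted).
    private void sendStepEvent(DebugEvent event) { /* POST to SDS */ }
    private void sendSuspendEvent(DebugEvent event) { /* POST to SDS */ }
    private void sendBreakpoint(IBreakpoint breakpoint) { /* POST to SDS */ }
}
```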

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the “Swarm Manager” view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                          Fig 17 The Swarm Tracer architecture [17]

                                                          Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                          Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


                                                          Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                                                          Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
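For reference, a fuzzy ElasticSearch query tolerating such a misspelling could look like the request below. The index and field names (breakpoints, typeName) are assumptions for illustration; the actual SDS mapping may differ.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // A "fuzzy" query matches despite small spelling mistakes, e.g. "fcatory" vs "factory".
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": "
                     + "{ \"value\": \"fcatory\", \"fuzziness\": \"AUTO\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search")) // assumed index
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits ranked by match score
    }
}
```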


                                                          Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V = \{V_1, V_2, \ldots, V_n\}$ is a set of vertexes and $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$ is a set of edges, each edge is a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes that invoke methods and $\beta$ is the subset of all vertexes that are invoked, then the Starting and Ending methods are

$StartingPoint = \{ V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta \}$

$EndingPoint = \{ V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha \}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
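A direct translation of these definitions is sketched below. It computes both sets from a list of invoking/invoked method pairs; the pair representation and method names are assumptions chosen only to mirror the definition, not the SDS data model.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {

    /** An invocation edge: "invoking" calls "invoked". */
    record Invocation(String invoking, String invoked) { }

    /** Starting methods: invoke others but are never invoked (alpha minus beta). */
    static Set<String> startingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>(); // alpha
        Set<String> invoked = new HashSet<>();  // beta
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked);
        return result;
    }

    /** Ending methods: are invoked but never invoke others (beta minus alpha). */
    static Set<String> endingPoints(List<Invocation> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Invocation e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking);
        return result;
    }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "parse"),
                new Invocation("parse", "readEntry"),
                new Invocation("readEntry", "trim"));
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [trim]
    }
}
```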

                                                          Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualising breakpoints and events during these sessions. We created real-time and interactive visualisations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


                                                            30Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                            64 Participantsrsquo Feedback

                                                            As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

                                                            641 Intrinsic Advantage

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

                                                            7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) are open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, identify debugging strategies that are more efficient in the context of their projects, and improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

                                                            8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

Like any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                            9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                            10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                            11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                            References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                            Appendix - Implementation of Swarm Debugging

                                                            Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                            Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                            Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
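As an illustration, the sketch below issues the same request from plain Java using the JDK 11 HttpClient. The host and endpoint path are taken from the example above, but the exact fields of the returned JSON are an assumption, so treat this as a minimal client sketch rather than an official SDS client.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo",
        // as in the example request shown above.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The SDS answers with a JSON document listing the matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}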

SQL Query Console The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
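To give a concrete flavour of such relational aggregations, the following sketch queries the collected breakpoints through JDBC. The connection URL, credentials, and the breakpoint table and column names are hypothetical, since the paper does not spell out the SDS relational schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointQueryExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection to the SDS PostgreSQL database;
        // URL, user, and password are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Hypothetical schema: count breakpoints per type to spot
             // "debugging hot-spots" across sessions.
             ResultSet rs = stmt.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint GROUP BY type_name " +
                 "ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name")
                        + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}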

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                            Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                            Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
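For readers unfamiliar with Cypher, the sketch below runs a pattern query of this kind through the Neo4j Java driver (4.x API). The Method label, the INVOKES relationship, the name property, and the connection settings are assumptions made for illustration, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class InvocationGraphQueryExample {
    public static void main(String[] args) {
        // Hypothetical connection settings for the SDS Neo4J instance.
        try (Driver driver = GraphDatabase.driver(
                "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Find methods reached from a given entry method within three
            // invocation steps (node labels and properties are assumed).
            Result result = session.run(
                "MATCH (m:Method {name: $name})-[:INVOKES*1..3]->(callee:Method) " +
                "RETURN DISTINCT callee.name AS callee",
                Values.parameters("name", "main"));
            while (result.hasNext()) {
                System.out.println(result.next().get("callee").asString());
            }
        }
    }
}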

                                                            Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
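A minimal sketch of how such a tracer can hook into these events is shown below. The listener interfaces and the DebugPlugin registration are standard Eclipse Debug Core API, while the sendEventToSwarmServices and sendBreakpointToSwarmServices methods are placeholders standing in for the SDT's actual persistence logic.

import org.eclipse.core.resources.IMarker;
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    /** Registers this tracer with the Eclipse debug framework. */
    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A Step Into/Over/Return arrives as a RESUME event whose detail is
            // STEP_INTO, STEP_OVER, or STEP_RETURN; a breakpoint hit arrives as
            // a SUSPEND event with detail BREAKPOINT.
            if (event.getKind() == DebugEvent.RESUME && event.isStepStart()) {
                sendEventToSwarmServices(event.getSource(), event.getDetail()); // placeholder
            } else if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.BREAKPOINT) {
                sendEventToSwarmServices(event.getSource(), event.getDetail()); // placeholder
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        IMarker marker = breakpoint.getMarker();
        int line = marker.getAttribute(IMarker.LINE_NUMBER, -1);
        sendBreakpointToSwarmServices(marker.getResource().getName(), line); // placeholder
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { /* ignored */ }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { /* ignored */ }

    // Placeholders for the RESTful calls to the Swarm Debugging Services.
    private void sendEventToSwarmServices(Object source, int detail) { }
    private void sendBreakpointToSwarmServices(String resource, int line) { }
}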

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm Session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


                                                            Fig 17 The Swarm Tracer architecture [17]

                                                            Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                            Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs They are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


                                                            Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                            Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".


                                                            Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by other methods but do not invoke any.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes invoking methods and $\beta$ is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$

$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
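A direct translation of this definition into code is straightforward. The sketch below computes starting and ending methods from a list of invocation edges; the class and method names are illustrative, not part of the SDI.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the definition above: starting methods invoke others
// but are never invoked; ending methods are invoked but never invoke.
public class StartEndPoints {

    public record Edge(String invoking, String invoked) { }

    public static Set<String> startingPoints(List<Edge> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Edge e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);  // in alpha and not in beta
        return starting;
    }

    public static Set<String> endingPoints(List<Edge> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Edge e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);   // in beta and not in alpha
        return ending;
    }
}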

                                                            Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, because software exploration is divided into sessions, the resulting call graphs are easy to understand: only intentionally visited areas are shown on these graphs, so one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



Presentation. Several participants also commented on the (relative) lack of information provided by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments that point to further ways of improving our approach.

                                                              7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available online at http://github.com/swarmdebugging, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools (using novel data-mining techniques) that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.



We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems that help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                                                              8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their effort during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real effort, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                              9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have also been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualizing paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging, studied by Hofer et al. [53], exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their studies, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                              10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                              11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                              References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                              Appendix - Implementation of Swarm Debugging

                                                              Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                              Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a brief illustrative sketch follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates a developer, a project, and debugging events.


                                                              Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.
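For illustration only, the sketch below models two of these concepts as plain Java classes; the field names are assumptions based on the descriptions above, not the actual SDS schema.

// Illustrative sketch of two of the concepts above as plain Java classes.
// Field names are assumptions based on the description, not the SDS schema.
public class SwarmMetadataSketch {

    // A debugging session relates a developer, a product, a task, and events.
    public static class Session {
        long id;
        String developerName;
        String productName;
        String taskLabel;
    }

    // A breakpoint is toggled in a type and, when appropriate, in a method.
    public static class Breakpoint {
        long sessionId;
        String namespace;   // e.g., the Java package
        String typeName;    // class or interface where it was toggled
        String methodName;  // may be null when not toggled inside a method
        int lineNumber;
    }
}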

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.
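Such a request can be issued from any HTTP client. The short sketch below uses the standard JDK HttpClient against the URL shown above; it makes no assumption about the fields of the returned JSON beyond printing the body.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative sketch: issuing the findByName request shown above with the
// standard JDK HTTP client and printing the JSON body returned by the SDS.
public class FindDeveloperByName {

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}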

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) that receives SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                              Fig 15 Swarm Debugging Dashboard



                                                              Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J graph database (http://neo4j.com). Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                                                              Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, which extracts method invocations.
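A minimal sketch of how such a listener is hooked into the Eclipse debug framework is shown below; the handling logic is simplified to a print statement and does not reflect the actual SDT code.

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

// Minimal sketch: registering a debug-event listener with the Eclipse debug
// framework, as a tracer like the SDT does. Handling is simplified to prints.
public class TracerRegistration {

    public static void register() {
        IDebugEventSetListener listener = (DebugEvent[] events) -> {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // A step (Into/Over/Return) just finished: a real tracer
                    // would inspect the suspended thread's stack frames here.
                    System.out.println("Step completed on " + event.getSource());
                }
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.BREAKPOINT) {
                    System.out.println("Breakpoint hit on " + event.getSource());
                }
            }
        };
        DebugPlugin.getDefault().addDebugEventListener(listener);
    }
}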

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



                                                              Fig 17 The Swarm Tracer architecture [17]

                                                              Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods.
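The breakpoint side can be observed in a similar way. The sketch below shows how a toggled location could be read from the breakpoint's marker when it is added; it is simplified, and the real tracer's handling of resources and locations is more involved.

import org.eclipse.core.resources.IMarker;
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal sketch: observing toggled breakpoints and reading their location
// from the underlying marker, as a tracer like the SDT could do.
public class BreakpointObserver implements IBreakpointListener {

    public static void install() {
        DebugPlugin.getDefault().getBreakpointManager()
                .addBreakpointListener(new BreakpointObserver());
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        IMarker marker = breakpoint.getMarker();
        String resource = marker.getResource().getName();
        int line = marker.getAttribute(IMarker.LINE_NUMBER, -1);
        System.out.println("Breakpoint added at " + resource + ":" + line);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}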


                                                              Fig 19 Breakpoint search tool (fuzzy search example)


                                                              50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                              graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                              Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers



6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to identify further ways of improving our approach.

                                                                7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, thus improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                                                                8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and to the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant became familiar with the system after executing a task and was knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the systems used can affect our results.

                                                                9 Related work

We now summarise work related to debugging to better position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach applies both to editing mode (finding breakpoints or visualising paths) and to interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system that improves debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                                10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                                11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                                References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering - ESEC/FSE '09 (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                Appendix - Implementation of Swarm Debugging

                                                                Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialised persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a minimal sketch of two of these entities is given after the list):

– Developer: an SDT user. She creates and executes debugging sessions.
– Product: the target software product. A product is a set of one or more Eclipse projects.
– Task: the task to be executed by developers.
– Session: represents a Swarm Debugging session. It relates a developer, a project, and debugging events.


                                                                Fig 14 The Swarm Debugging metadata [17]

– Type: represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method: a method associated with a type, which can be invoked during debugging sessions.
– Namespace: a container for types. In Java, namespaces are declared with the keyword package.
– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event: event data collected when a developer performs some action during a debugging session.
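To make the meta-model concrete, the following is a minimal Java sketch of two of these entities; the class and field names are illustrative assumptions and may not match the actual SDS schema.

    // Illustrative sketch only: plain data holders mirroring the Breakpoint
    // and Invocation concepts described above (field names are assumptions).
    class Breakpoint {
        long sessionId;     // Session during which the breakpoint was toggled
        String typeName;    // Type (class or interface) that owns the breakpoint
        String methodName;  // Enclosing method, if any
        int lineNumber;     // Location inside the type
    }

    class Invocation {
        long sessionId;        // Session during which the invocation was observed
        String invokingMethod; // Caller
        String invokedMethod;  // Callee
    }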

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
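As an illustration, such a request could be issued from Java as in the following minimal sketch, which uses the standard HttpURLConnection API; it is not the SDT's actual client code, and error handling is omitted.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Query the Swarm RESTful API for developers named "petrillo"
    // and print the JSON response line by line.
    public class SwarmApiClient {
        public static void main(String[] args) throws Exception {
            URL url = new URL(
                "http://swarmdebugging.org/developers/search/findByName?name=petrillo");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/json");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // JSON list of matching developers
                }
            }
        }
    }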

SQL Query Console. The SDS provides a console30 for submitting SQL queries on the debugging data, offering relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard, and the SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
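For illustration, the kind of query a researcher might type into that console is sketched below as a Java string; the node labels, relationship type, and property names are assumptions made for illustration, not the actual SDS graph schema.

    // Hypothetical Cypher query: follow the invocation edges recorded for one
    // debugging session and return the invoking/invoked method names.
    String cypher =
        "MATCH (a:Method)-[i:INVOKES]->(b:Method) "
      + "WHERE i.sessionId = 42 "
      + "RETURN a.name AS invoking, b.name AS invoked";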

                                                                Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, which extracts method invocations.
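To give an idea of how such listeners are wired into Eclipse, the following is a simplified, hypothetical sketch based on the standard org.eclipse.debug.core API; it is not the SDT's actual source code.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    // Simplified tracer: reacts to breakpoint changes and stepping events.
    public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.addDebugEventListener(this);
            plugin.getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.BREAKPOINT) {
                    // capture a breakpoint hit and forward it to the SDS (omitted)
                } else if (event.getKind() == DebugEvent.RESUME
                        && event.getDetail() == DebugEvent.STEP_INTO) {
                    // capture Step Into events (Step Over/Return handled similarly)
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // record the toggled breakpoint (type, method, line) for sharing
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }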

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not yet in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association consists of linking a Swarm session with a project in the Eclipse workspace. Next, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                                Fig 17 The Swarm Tracer architecture [17]

                                                                Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.
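A rough sketch of what one of those RESTful calls could look like is shown below; the endpoint path and the JSON payload are assumptions made for illustration, not the documented SDS API.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    // Hypothetical write path: post one captured debugging event to the SDS.
    public class EventSender {
        public static void send(String json) throws Exception {
            URL url = new URL("http://swarmdebugging.org/events"); // assumed endpoint
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println(conn.getResponseCode()); // e.g., 201 on success
        }
    }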

                                                                Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                                Fig 20 Sequence stack diagram for Bridge design pattern

                                                                Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

                                                                Breakpoint Search Tool

                                                                Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

                                                                Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 49

                                                                Fig 21 Method call graph for Bridge design pattern [17]

                                                                StartingEnding Method Search Tool

                                                                This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

                                                                Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

                                                                StartingPoint = VSP | VSP isin α and VSP isin β

                                                                EndingPoint = VEP | VEP isin β and VEP isin α

                                                                Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

                                                                Summary

                                                                Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

                                                                50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                                Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

                                                                • 1 Introduction
                                                                • 2 Background
                                                                • 3 The Swarm Debugging Approach
                                                                • 4 SDI in a Nutshell
                                                                • 5 Using SDI to Understand Debugging Activities
                                                                • 6 Evaluation of Swarm Debugging using GV
                                                                • 7 Discussion
                                                                • 8 Threats to Validity
                                                                • 9 Related work
                                                                • 10 Conclusion
                                                                • 11 Acknowledgment

                                                                  Swarm Debugging the Collective Intelligence on Interactive Debugging 33

debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to point to further ways of improving our approach.

                                                                  7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

                                                                  8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, and more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                                  9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy-and-paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

                                                                  10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                                                  11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                                  References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering on European software engineering conference and foundations of software engineering symposium - E (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                  Appendix - Implementation of Swarm Debugging

                                                                  Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                  Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                                  Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method (a minimal sketch of such an entity is given after this list).
– Event is the event data collected when a developer performs some action during a debugging session.
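To make these concepts concrete, the following is a minimal sketch of how the Breakpoint entity could be represented in Java; the field names and types are illustrative assumptions derived from the meta-model above, not the actual SDS implementation.

public class Breakpoint {
    // Illustrative fields only: the real SDS schema may differ.
    private final long sessionId;      // Session during which the breakpoint was toggled
    private final String typeName;     // Fully qualified name of the Type
    private final String methodName;   // Enclosing Method, if any (may be null)
    private final int lineNumber;      // Location of the breakpoint in the source file
    private final String condition;    // Optional condition; null for unconditional breakpoints

    public Breakpoint(long sessionId, String typeName, String methodName,
                      int lineNumber, String condition) {
        this.sessionId = sessionId;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
        this.condition = condition;
    }

    public long getSessionId() { return sessionId; }
    public String getTypeName() { return typeName; }
    public String getMethodName() { return methodName; }
    public int getLineNumber() { return lineNumber; }
    public String getCondition() { return condition; }
}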

The SDS provides several services for manipulating, querying, and searching collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
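As an illustration, the same request could be issued programmatically. The sketch below uses Java's standard java.net.http client; it assumes only that the endpoint above answers with a JSON body, as described in the text.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the Swarm RESTful API for developers named "petrillo".
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The SDS answers with a JSON structure listing the matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}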

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
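For instance, a researcher could aggregate breakpoints per type. The sketch below is only an illustration: the table and column names are assumptions derived from the meta-model above, not the actual SDS schema, and the JDBC connection settings are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerType {
    public static void main(String[] args) throws Exception {
        // Hypothetical table/column names inspired by the Breakpoint and Type concepts.
        String sql = "SELECT t.full_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint b JOIN type t ON b.type_id = t.id "
                   + "GROUP BY t.full_name ORDER BY breakpoints DESC";

        // Placeholder connection settings for the PostgreSQL database behind the SDS.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.printf("%s: %d%n", rs.getString(1), rs.getLong(2));
            }
        }
    }
}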

Full-text Search Engine The SDS also provides an ElasticSearch31 instance, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                  Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                  Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
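As an illustration of the kind of query a researcher might run, the sketch below retrieves the method invocations recorded for one session using the Neo4J Java driver. The node labels, relationship type, and property names are assumptions, not the actual SDS graph schema, and the connection settings are placeholders.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class SessionInvocations {
    public static void main(String[] args) {
        // Hypothetical labels and properties (:Method, :INVOKES, sessionId); adapt to the real schema.
        String cypher = "MATCH (caller:Method)-[i:INVOKES]->(callee:Method) "
                      + "WHERE i.sessionId = $session "
                      + "RETURN caller.name AS caller, callee.name AS callee";

        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(cypher, Values.parameters("session", 42));
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString() + " -> " + row.get("callee").asString());
            }
        }
    }
}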

                                                                  Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
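The following is a minimal sketch of how such listeners can be registered with the Eclipse Debug Core API. It only logs the events it receives, whereas the actual SDT additionally analyses stack traces and forwards the collected data to the SDS.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class MinimalDebugTracer implements IDebugEventSetListener, IBreakpointListener {

    // Register this tracer with the Eclipse debug framework (e.g., from the plug-in's start() method).
    public void install() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        // Stepping events (Step Into, Step Over, Step Return) and breakpoint hits arrive here.
        for (DebugEvent event : events) {
            System.out.println("Debug event: " + event);
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint toggled on: " + breakpoint.getMarker().getResource());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}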

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                                  Fig 17 The Swarm Tracer architecture [17]

                                                                  Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an


                                                                  Fig 19 Breakpoint search tool (fuzzy search example)

invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                                                  Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs They are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


                                                                  Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                  Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.
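A fuzzy query like the one behind this example can also be sent directly to the underlying search engine over HTTP. In the sketch below, the index and field names are assumptions, not the actual SDS mapping, and the host is a placeholder.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query: the misspelled "fcatory" still matches factory-related types.
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoint/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The response is a JSON document listing the matching breakpoints.
        System.out.println(response.body());
    }
}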


                                                                  Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes invoking methods and $\beta$ is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

$$StartingPoint = \{ V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta \}$$

$$EndingPoint = \{ V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha \}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
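The definitions above translate directly into set operations over the collected invocation pairs. The sketch below is an illustrative implementation over method names, not the actual SDS query:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndFinder {

    /** An invoking/invoked pair collected during a debugging session. */
    public record Invocation(String invoking, String invoked) { }

    /** Starting methods: in alpha (they invoke others) but not in beta (they are never invoked). */
    public static Set<String> startingPoints(List<Invocation> invocations) {
        Set<String> starting = new HashSet<>(invokingMethods(invocations));
        starting.removeAll(invokedMethods(invocations));
        return starting;
    }

    /** Ending methods: in beta (they are invoked) but not in alpha (they never invoke others). */
    public static Set<String> endingPoints(List<Invocation> invocations) {
        Set<String> ending = new HashSet<>(invokedMethods(invocations));
        ending.removeAll(invokingMethods(invocations));
        return ending;
    }

    private static Set<String> invokingMethods(List<Invocation> invocations) {
        Set<String> alpha = new HashSet<>();
        for (Invocation i : invocations) {
            alpha.add(i.invoking());
        }
        return alpha;
    }

    private static Set<String> invokedMethods(List<Invocation> invocations) {
        Set<String> beta = new HashSet<>();
        for (Invocation i : invocations) {
            beta.add(i.invoked());
        }
        return beta;
    }
}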

                                                                  Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, software exploration is divided by sessions, and its call graphs are easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

                                                                    8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies on these research questions and others, possibly drawn from the questions that developers asked in the study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their effort during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that their answers do not represent their real effort, levels of experience, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware that all faults had already been fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats concern the possibility to generalise our results. We used only one system in Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

                                                                    9 Related work

We now summarise works related to debugging to better position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to the task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it only filters files in an Eclipse view according to a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system that improves debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as discussed by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies performed static change impact analysis before making changes, using IDE navigational functionalities, and dynamic change impact analysis after making changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

                                                                    10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                                    11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                                    References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204

2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)

3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220

4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)

5. P. Wainwright, GNU DDD - Data Display Debugger (2010)

6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471

7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573

8. T.D. LaToza, B.A. Myers, in 2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1, p. 185 (2010). DOI 10.1145/1806799.1806829

9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555

10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501

11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669

12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887

13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1

14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616

15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460

16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425

17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10

18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39

19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939

20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010

21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575

22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5

23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y

24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6

25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299

26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)

27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111

28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447

29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015

30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583

31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352

32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract

33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9

34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812

35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)

36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669

37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1-11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260

43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31-40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456-465

45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                    Appendix - Implementation of Swarm Debugging

                                                                    Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                    Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include the following (a minimal code sketch is given after the list):

- Developer is an SDT user. She creates and executes debugging sessions.
- Product is the target software product. A product is a set of one or more Eclipse projects.
- Task is the task to be executed by developers.
- Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


                                                                    Fig 14 The Swarm Debugging metadata [17]

- Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
- Method is a method associated with a type, which can be invoked during debugging sessions.
- Namespace is a container for types. In Java, namespaces are declared with the keyword package.
- Invocation is a method invoked from another method (or from the JVM in the case of the main method).
- Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
- Event represents the event data collected when a developer performs some action during a debugging session.
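To make these concepts concrete, the following minimal sketch shows how a breakpoint record could be represented in Java; the class and field names (sessionId, typeName, methodName, lineNumber) are illustrative assumptions, not the actual SDS classes.

// Minimal illustrative sketch (not the actual SDS code): a breakpoint record
// linking a debugging session to the type, method, and line where it was toggled.
public class Breakpoint {
    private long id;            // assumed surrogate key
    private long sessionId;     // the Swarm session in which the breakpoint was set
    private String typeName;    // fully qualified name of the class or interface
    private String methodName;  // enclosing method, if any
    private int lineNumber;     // source line of the breakpoint

    public Breakpoint(long id, long sessionId, String typeName,
                      String methodName, int lineNumber) {
        this.id = id;
        this.sessionId = sessionId;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
    }

    public String getTypeName() { return typeName; }
    public int getLineNumber() { return lineNumber; }
}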

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
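For illustration, such a request could be issued with the standard Java 11 HTTP client; this sketch only shows the call pattern and prints the raw JSON answer, and is not part of the SDS code base.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the developer search endpoint shown above.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The SDS is expected to answer with a JSON list of matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}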

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                    Fig 15 Swarm Debugging Dashboard



                                                                    Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

                                                                    Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
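As an illustration of this mechanism, a minimal tracer registering for Eclipse debug events could look like the sketch below; the forwarding method sendToSwarmServices is a hypothetical placeholder for the RESTful calls described above, not the actual SDT code.

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

// Minimal sketch of a tracer hooking into the Eclipse debug event stream.
public class MinimalDebugTracer implements IDebugEventSetListener {

    public void start() {
        // Register this listener with the Eclipse debug plug-in.
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Suspensions caused by breakpoints or step ends are the events
            // relevant to Swarm Debugging.
            if (event.getKind() == DebugEvent.SUSPEND
                    && (event.getDetail() == DebugEvent.BREAKPOINT
                        || event.getDetail() == DebugEvent.STEP_END)) {
                sendToSwarmServices(event); // hypothetical forwarding to the SDS
            }
        }
    }

    private void sendToSwarmServices(DebugEvent event) {
        // Placeholder: in the real SDT, domain services send RESTful messages here.
        System.out.println("Captured debug event: " + event);
    }
}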

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not yet in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger: the SDT collects developers' interaction events in the background, with no visible performance decrease.



                                                                    Fig 17 The Swarm Tracer architecture [17]

                                                                    Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


                                                                    Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                                                    Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on the graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                                    Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                    Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
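Such a fuzzy lookup can be expressed with a standard ElasticSearch fuzzy query. The sketch below posts one with the Java 11 HTTP client; the index name (breakpoints), field name (typeName), and host are assumptions for illustration only, not the actual SDS schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query: matches "factory" even though the term is misspelled.
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search")) // assumed index
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.body()); // JSON hits listing candidate breakpoints
    }
}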


                                                                    Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by other methods but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes that invoke methods and $\beta$ is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$

$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
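In code, these definitions reduce to a set difference over the invocation edges. The sketch below illustrates the computation on a few hypothetical invocations (the method names are made up for the example).

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {

    // An invocation edge: 'caller' invokes 'callee'.
    record Invocation(String caller, String callee) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Client.main", "Shape.draw"),
                new Invocation("Shape.draw", "Circle.draw"),
                new Invocation("Circle.draw", "DrawingAPI.drawCircle"));

        Set<String> invoking = new HashSet<>();  // alpha: methods that invoke others
        Set<String> invoked = new HashSet<>();   // beta: methods invoked by others
        for (Invocation e : edges) {
            invoking.add(e.caller());
            invoked.add(e.callee());
        }

        Set<String> starting = new HashSet<>(invoking); // alpha minus beta
        starting.removeAll(invoked);
        Set<String> ending = new HashSet<>(invoked);    // beta minus alpha
        ending.removeAll(invoking);

        System.out.println("Starting methods: " + starting); // [Client.main]
        System.out.println("Ending methods: " + ending);     // [DrawingAPI.drawCircle]
    }
}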

                                                                    Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                                                    • 1 Introduction
                                                                    • 2 Background
                                                                    • 3 The Swarm Debugging Approach
                                                                    • 4 SDI in a Nutshell
                                                                    • 5 Using SDI to Understand Debugging Activities
                                                                    • 6 Evaluation of Swarm Debugging using GV
                                                                    • 7 Discussion
                                                                    • 8 Threats to Validity
                                                                    • 9 Related work
                                                                    • 10 Conclusion
                                                                    • 11 Acknowledgment

                                                                      Swarm Debugging the Collective Intelligence on Interactive Debugging 35

                                                                      Conclusion Validity Threats concern the relations found between inde-pendent and dependent variables In particular they concern the assumptionsof the statistical tests performed on the data and how diverse is the data Wedid not perform any statistical analysis to answer our research questions soour results do not depend on any statistical assumption

                                                                      Internal Validity Threats are related to the tools used to collect thedata and the subject systems and if the collected data is sufficient to answerthe research questions We collected data using our visualisation We are wellaware that our visualisation does not scale for large systems but for JabRef itallowed participants to share paths during debugging and researchers to collectrelevant data including shared paths We plan to revise our visualisation inthe near future to identify possibilities to improve it so that it scales up tolarge systems

                                                                      Each participant performed more than one task on the same system It ispossible that a participant may have become familiar with the system after ex-ecuting a task and would be knowledgeable enough to toggle breakpoints whenperforming the subsequent ones However we did not observe any significantdifference in performance when comparing the results for the same participantfor the first and last task Therefore we accept this threat but still plan for fu-ture studies with more tasks on more systems The participants probably wereaware of the fact that all faults were already solved in Github We controlledthis issue using the video recordings observing that all participants did notlook at the commit history during the experiment

                                                                      External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

                                                                      9 Related work

                                                                      We now summarise works related to debugging to allow better positioning ofour study among the published research

                                                                      Program Understanding Previous work studied program comprehension andprovided tools to support program comprehension Maalej et al [36] observedand surveyed developers during program comprehension activities They con-cluded that developers need runtime information and reported that developersfrequently execute programs using a debugger Ko et al [37] observed that de-velopers spend large amounts of times navigating between program elements

                                                                      Feature and fault location approaches are used to identify and recommendprogram elements that are relevant to a task at hand [38] These approachesuse defect report [39] domain knowledge [40] version history and defect reportsimilarity [38] while others like Mylyn [41] use developersrsquo interaction traces

                                                                      36Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                      which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

                                                                      Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

                                                                      Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

                                                                      DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

                                                                      Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

                                                                      Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many debugging tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, using IDE navigational functionalities, and did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

                                                                      10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions performed by 28 different, independent developers and containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on developers' use of breakpoints. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

                                                                      11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                                      References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204

2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)

3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220

4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)

5. P. Wainwright, GNU DDD - Data Display Debugger (2010)

6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471

7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573

8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829

9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555

10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501

11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669

12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887

13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1

14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616

15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460

16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425

17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10

18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39

19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939

20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010

21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575

22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5

23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y

24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://link.springer.com/10.1007/s10818-015-9203-6

25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299

26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)

27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111

28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447

29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015

30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583

31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352

32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract

33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9

34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812

35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)

36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669

37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260

43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465

45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                      Appendix - Implementation of Swarm Debugging

                                                                      Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                      Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (two of them are also sketched as code after the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                                      Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some action during a debugging session.
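To make the meta-model more concrete, the sketch below renders two of these concepts as plain Java classes. The class and field names are our own illustration of the description above, not the actual SDS schema.

// Hypothetical sketch of two SDS domain concepts; the field names are
// assumptions based on the meta-model described above.
class Breakpoint {
    long sessionId;      // Swarm session in which the breakpoint was toggled
    String typeName;     // fully qualified type where the breakpoint was set
    String methodName;   // enclosing method, if any
    int lineNumber;      // line in the type's source file
}

class Invocation {
    long sessionId;          // session during which the invocation was observed
    String invokingMethod;   // caller visited by the developer while stepping
    String invokedMethod;    // callee visited by the developer while stepping
}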

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework.29

29 http://projects.spring.io/spring-boot

Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
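For illustration, such a request could be issued with Java's standard HTTP client, as in the sketch below; the endpoint follows the example above, and the JSON handling is left to the caller. This is a sketch, not part of the SDS code base.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmQueryExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Query the developer search endpoint shown above.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        // The SDS answers with a JSON document listing the matching developers.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}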

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch,31 a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                      Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                      Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
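For instance, a researcher could run a Cypher pattern over the collected invocation graph from Java using the Neo4J driver, as sketched below; the Method label, the INVOKES relationship, the connection settings, and the credentials are assumptions for illustration, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class CypherQueryExample {
    public static void main(String[] args) {
        // Connection settings and credentials are placeholders.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Find methods reachable from a given starting method
            // (Method label and INVOKES relationship are assumed names).
            session.run("MATCH (m:Method {name: $start})-[:INVOKES*]->(n:Method) "
                            + "RETURN DISTINCT n.name",
                        Values.parameters("start", "Main.main()"))
                   .list()
                   .forEach(rec -> System.out.println(rec.get(0).asString()));
        }
    }
}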

                                                                      Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
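A minimal sketch of such a tracer, assuming only the standard Eclipse Debug Core APIs, is shown below; the class name, the SwarmRestClient stub, and the exact data it sends are placeholders, not the actual SDT implementation.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.core.runtime.CoreException;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;
import org.eclipse.debug.core.model.ILineBreakpoint;

// Sketch of a debug tracer; SwarmRestClient stands in for the component
// that would POST the collected events to the Swarm Debugging Services.
public class TracerSketch implements IDebugEventSetListener, IBreakpointListener {

    private final SwarmRestClient client = new SwarmRestClient();

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping requests arrive as RESUME events and breakpoint hits as
            // SUSPEND events; here we only look at the detail code.
            int detail = event.getDetail();
            if (detail == DebugEvent.STEP_INTO || detail == DebugEvent.STEP_OVER
                    || detail == DebugEvent.STEP_RETURN || detail == DebugEvent.BREAKPOINT) {
                client.sendEvent(detail, event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        try {
            int line = (breakpoint instanceof ILineBreakpoint)
                    ? ((ILineBreakpoint) breakpoint).getLineNumber() : -1;
            client.sendBreakpoint(breakpoint.getMarker().getResource().getName(), line);
        } catch (CoreException e) {
            // Ignore breakpoints whose markers cannot be read.
        }
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

    // Hypothetical stub; a real client would send RESTful messages to the SDS.
    static class SwarmRestClient {
        void sendEvent(int detail, Object source) { /* POST to SDS (omitted) */ }
        void sendBreakpoint(String resourceName, int line) { /* POST to SDS (omitted) */ }
    }
}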

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                                      Fig 17 The Swarm Tracer architecture [17]

                                                                      Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                                      Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                                      Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                      Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
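For illustration, the sketch below issues a standard ElasticSearch fuzzy query over HTTP for that misspelled term; the index name swarm-breakpoints and the field typeName are assumptions, not the actual SDS mapping.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query matching the misspelled term "fcatory";
        // the index and field names are placeholders.
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/swarm-breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits, e.g. breakpoints in *Factory types
    }
}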


                                                                      Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj> where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
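A straightforward way to compute these sets from the collected invocation pairs is sketched below; it assumes the invocations are available as caller/callee pairs, as described for the SDT, and simply takes the set differences α \ β and β \ α.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {
    // An invocation is a caller/callee pair, as collected during a session.
    record Invocation(String invoking, String invoked) { }

    // Starting methods invoke others but are never invoked (alpha \ beta).
    static Set<String> startingMethods(List<Invocation> invocations) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Invocation i : invocations) { alpha.add(i.invoking()); beta.add(i.invoked()); }
        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);
        return starting;
    }

    // Ending methods are invoked but never invoke (beta \ alpha).
    static Set<String> endingMethods(List<Invocation> invocations) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Invocation i : invocations) { alpha.add(i.invoking()); beta.add(i.invoked()); }
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);
        return ending;
    }

    public static void main(String[] args) {
        List<Invocation> invocations = List.of(
                new Invocation("Main.main()", "Shape.draw()"),
                new Invocation("Shape.draw()", "Circle.draw()"));
        System.out.println(startingMethods(invocations)); // [Main.main()]
        System.out.println(endingMethods(invocations));   // [Circle.draw()]
    }
}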

                                                                      Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


                                                                        36Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                        which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

                                                                        Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

                                                                        Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

                                                                        DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

                                                                        Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

                                                                        Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

                                                                        Recent studies showed that empirical evidence of the usefulness of manyautomated debugging techniques is limited [50] Researchers also found thatautomated debugging tools are rarely used in practice [50] At least in somescenarios the time to collect coverage information manually label the testcases as failing or passing and run the calculations may exceed the actualtime saved by using the automated debugging tools

                                                                        Advanced Debugging Approaches Zheng et al [51] presented a systematic ap-proach to the statistical debugging of programs in the presence of multiplefaults using probability inference and common voting framework to accom-modate more general faults and predicate settings Ko and Myers [652] intro-duced interrogative debugging a process with which developers ask questionsabout their programs outputs to determine what parts of the programs tounderstand

                                                                        Swarm Debugging the Collective Intelligence on Interactive Debugging 37

                                                                        Pothier and Tanter [29] proposed Omniscient debuggers an approach tosupport back-in-time navigation across previous program states Delta debug-ging [53] by Hofer et al means that the smaller the failure-inducing inputthe less program code is covered It can be used to minimise a failure-inducinginput systematically Ressia [54] proposed object-centric debugging focusingon objects as the key abstraction execution for many tasks

                                                                        Estler et al [55] discussed collaborative debugging suggesting that collab-oration in debugging activities is perceived as important by developers andcan improve their experience Our approach is consistent with this findingalthough we use asynchronous debugging sessions

                                                                        Empirical Studies on Debugging Jiang et al [33] studied the change impactanalysis process that should be done during software maintenance by devel-opers to make sure changes do not introduce new faults They conducted twostudies about change impact analysis during debugging sessions They foundthat the programmers in their studies did static change impact analysis beforethey made changes by using IDE navigational functionalities They also diddynamic change impact analysis after they made changes by running the pro-grams In their study programmers did not use any change impact analysistools

                                                                        Zhang et al [14] proposed a method to generate breakpoints based onexisting fault localization techniques showing that the generated breakpointscan usually save some human effort for debugging

                                                                        10 Conclusion

                                                                        Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

                                                                        To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

                                                                        The first study allowed us to draw four main conclusions At first settingthe first breakpoint is not an easy task and developers need tools to locatethe places where to toggle breakpoints Secondly the time of setting the first

                                                                        38Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                        breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

                                                                        Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

                                                                        Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

                                                                        In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

                                                                        Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

                                                                        Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

                                                                        Swarm Debugging the Collective Intelligence on Interactive Debugging 39

                                                                        haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

                                                                        Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

                                                                        11 Acknowledgment

                                                                        This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

                                                                        References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6; http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                        Appendix - Implementation of Swarm Debugging

                                                                        Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                        Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. They include the following (a minimal sketch of two of these entities follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


                                                                        Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.
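To make the meta-model concrete, the sketch below shows how two of these entities might look as plain Java classes. It is an illustration only: the field names are assumptions derived from the descriptions above and from Figure 14, not the actual SDI source code.

class Session {
    private long id;
    private String developerName;   // the Developer running this session
    private String productName;     // the Product (set of Eclipse projects) under debug
    private String taskLabel;       // the Task being executed
    private long startedAtMillis;   // when the debugging session started
    // getters/setters omitted for brevity
}

class Breakpoint {
    private long id;
    private long sessionId;         // Session in which the breakpoint was toggled
    private String typeName;        // Type (class or interface) owning the breakpoint
    private String methodName;      // Method, if the breakpoint falls inside one
    private int lineNumber;         // source line where the breakpoint was set
    private String condition;       // optional condition, null for plain breakpoints
    // getters/setters omitted for brevity
}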

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API, built with the Spring Boot framework,29 to manipulate debugging data. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
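As a rough illustration of how a client might consume this API, the snippet below issues the same query with Java 11's built-in HTTP client. The endpoint path is taken from the example above; the exact JSON shape of the response is an assumption, so the snippet only prints the raw body.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch of querying the SDS RESTful API (endpoint as shown above).
public class FindDeveloperExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        // The SDS is expected to answer with a JSON list of matching developers.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}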

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch,31 a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                        Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                        Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
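A researcher could also run a Cypher query programmatically instead of through the browser. The sketch below uses the Neo4j Java driver; the node label and relationship name (Method, INVOKES) are assumptions based on the meta-model above, not necessarily the actual SDI schema, and the connection settings are placeholders.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

// Minimal sketch: list methods that invoke other methods but are never invoked
// themselves (i.e., starting points), assuming Method nodes and INVOKES edges.
public class CypherQueryExample {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver(
                "bolt://localhost:7687", AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            Result result = session.run(
                "MATCH (m:Method) " +
                "WHERE (m)-[:INVOKES]->() AND NOT ()-[:INVOKES]->(m) " +
                "RETURN m.name AS name");

            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("name").asString());
            }
        }
    }
}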

                                                                        Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
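The fragment below sketches how such listeners are typically wired into the Eclipse debug framework; it is a simplified illustration built on the standard org.eclipse.debug.core API, not the actual SDT source.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Simplified tracer skeleton: reacts to stepping events and breakpoint changes.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A Step Into/Over/Return just finished: inspect the stack frames
                // here and send an invocation record to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // A breakpoint was toggled: capture its type, method, and line, then send it.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}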

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                                        Fig 17 The Swarm Tracer architecture [17]

                                                                        Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

                                                                        Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                                        Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                        Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
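As an illustration, a fuzzy breakpoint lookup like the one in Figure 19 could be expressed with ElasticSearch's standard match query and fuzziness option. The index and field names used here (breakpoints, typeName) are assumptions, as is the local endpoint; only the query syntax itself is standard ElasticSearch.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a fuzzy full-text search for breakpoints toggled on types whose
// name resembles the (misspelled) term "fcatory".
public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        String query = "{ \"query\": { \"match\": {"
                + " \"typeName\": { \"query\": \"fcatory\", \"fuzziness\": \"AUTO\" }"
                + " } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());  // hits contain the matching breakpoints
    }
}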


                                                                        Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}
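A direct way to read this definition is as set operations over the invocation edges collected in a session. The sketch below computes both sets from a list of invoking/invoked pairs; it is an illustration of the definition with made-up method names, not SDI code.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Computes starting methods (invoke but are never invoked) and ending methods
// (are invoked but never invoke) from invocation pairs <invoking, invoked>.
public class StartingEndingPoints {

    record Invocation(String invoking, String invoked) { }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Main.main", "Shape.draw"),
                new Invocation("Shape.draw", "Circle.draw"));

        Set<String> alpha = new HashSet<>();  // vertexes that invoke methods
        Set<String> beta = new HashSet<>();   // vertexes that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> startingPoints = new HashSet<>(alpha);
        startingPoints.removeAll(beta);       // in alpha and not in beta

        Set<String> endingPoints = new HashSet<>(beta);
        endingPoints.removeAll(alpha);        // in beta and not in alpha

        System.out.println("Starting points: " + startingPoints);
        System.out.println("Ending points: " + endingPoints);
    }
}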

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.

                                                                        Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, because software exploration is divided by sessions and only intentionally visited areas are shown on the call graphs, these graphs are easy to understand: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.




                                                                          Swarm Debugging Views

                                                                          On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

                                                                          Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

                                                                          Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

                                                                          48Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                          Fig 20 Sequence stack diagram for Bridge design pattern

                                                                          Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

                                                                          Breakpoint Search Tool

                                                                          Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

                                                                          Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

                                                                          Swarm Debugging the Collective Intelligence on Interactive Debugging 49

                                                                          Fig 21 Method call graph for Bridge design pattern [17]

                                                                          StartingEnding Method Search Tool

                                                                          This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

                                                                          Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

                                                                          StartingPoint = VSP | VSP isin α and VSP isin β

                                                                          EndingPoint = VEP | VEP isin β and VEP isin α

                                                                          Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

                                                                          Summary

                                                                          Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

                                                                          50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                          graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                                          Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers



breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

                                                                            11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

                                                                            References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering – ESEC/FSE '09 (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                            Appendix - Implementation of Swarm Debugging

                                                                            Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. They include the following (a hypothetical JSON sketch of the collected data is given after the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event is the data collected when a developer performs some action during a debugging session.
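To make these concepts concrete, the sketch below shows how one collected breakpoint and one stepping event could be serialized by a tracer. The field names and values are hypothetical and only illustrate the entities of Figure 14; they are not the exact payload used by the SDT.

{
  "session": {
    "developer": "petrillo",
    "product": "BridgeSample",
    "task": "Fix drawing fault",
    "breakpoints": [
      { "type": "shapes.Circle", "method": "draw", "line": 42 }
    ],
    "events": [
      { "kind": "StepInto", "method": "shapes.Circle.draw", "timestamp": "2016-03-01T14:06:10Z" }
    ]
  }
}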

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, implemented with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure.

29 http://projects.spring.io/spring-boot

For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.
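For illustration, such a response could look like the snippet below. The wrapping follows the usual Spring Data REST convention of an _embedded collection, but the concrete field names shown here are assumptions rather than the documented SDS payload.

{
  "_embedded": {
    "developers": [
      { "id": 1, "name": "petrillo" }
    ]
  }
}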

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
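As an illustration, a query such as the following could rank the types on which most breakpoints were toggled; the table and column names are assumptions about the SDS schema, not its documented layout.

SELECT t.full_name, COUNT(b.id) AS breakpoints
FROM breakpoint b
JOIN type t ON b.type_id = t.id
GROUP BY t.full_name
ORDER BY breakpoints DESC;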

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.
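For example, a terms aggregation like the one below could count debugging events per kind; the index and field names are hypothetical.

POST /events/_search
{
  "size": 0,
  "aggs": {
    "events_per_kind": {
      "terms": { "field": "kind" }
    }
  }
}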

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser – a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
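A query in the same spirit as the one in Figure 16 could look as follows; the node label and relationship type (Method, INVOKES) are assumptions about how the SDS maps its metadata to the graph, not the actual schema.

MATCH (caller:Method)-[i:INVOKES]->(callee:Method)
WHERE callee.name = 'draw'
RETURN caller, i, callee
LIMIT 25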

                                                                            Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
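A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below. It uses the standard org.eclipse.debug.core API, but the class body and the way data is forwarded to the SDS are simplified assumptions, not the actual SDT source.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

  // Registers this tracer with the Eclipse debug framework.
  public void start() {
    DebugPlugin.getDefault().addDebugEventListener(this);
    DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
  }

  // Called for every batch of debug events (suspend, resume, step, ...).
  @Override
  public void handleDebugEvents(DebugEvent[] events) {
    for (DebugEvent event : events) {
      if (event.getKind() == DebugEvent.SUSPEND
          && event.getDetail() == DebugEvent.STEP_END) {
        // A Step Into/Over/Return just completed: send the event and the
        // current stack trace to the Swarm Debugging Services (omitted).
      }
    }
  }

  // Called when the developer toggles a breakpoint in the IDE.
  @Override
  public void breakpointAdded(IBreakpoint breakpoint) {
    // Send the breakpoint location (type, line, condition) to the SDS (omitted).
  }

  @Override
  public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

  @Override
  public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}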

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Fig. 19 Breakpoint search tool (fuzzy search example)

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing one invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.
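One possible way to extract those invoking/invoked pairs from a suspended thread is sketched below using the JDT debug model. It is an illustrative reconstruction under assumed names (InvocationExtractor, the omitted SDS upload step), not the SDT implementation.

import org.eclipse.debug.core.DebugException;
import org.eclipse.debug.core.model.IStackFrame;
import org.eclipse.debug.core.model.IThread;
import org.eclipse.jdt.debug.core.IJavaStackFrame;

public final class InvocationExtractor {

  // Walks the current stack (index 0 is the top frame) and records one
  // invocation entry per pair of invoking/invoked methods.
  public static void extractInvocations(IThread thread) throws DebugException {
    IStackFrame[] frames = thread.getStackFrames();
    for (int i = frames.length - 1; i > 0; i--) {
      IJavaStackFrame invoking = (IJavaStackFrame) frames[i];
      IJavaStackFrame invoked = (IJavaStackFrame) frames[i - 1];
      String from = invoking.getDeclaringTypeName() + "." + invoking.getMethodName();
      String to = invoked.getDeclaringTypeName() + "." + invoked.getMethodName();
      // Send the (from, to) invocation to the SDS through its RESTful API (omitted).
    }
  }
}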

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                                                            Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                            Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
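A fuzzy ElasticSearch query of that kind could be expressed roughly as follows; the index and field names are assumptions made for illustration.

POST /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": {
        "value": "fcatory",
        "fuzziness": "AUTO"
      }
    }
  }
}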


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
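Given the collected invocation edges, both sets can be computed with a simple set difference, as the sketch below illustrates; the Invocation record used to represent an edge is an assumed simplification of the SDS data, not its actual API.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class EntryExitPoints {

  // One invocation edge: the invoking method (α side) calls the invoked one (β side).
  public record Invocation(String invoking, String invoked) { }

  // Starting methods invoke others but are never invoked themselves (in α, not in β).
  public static Set<String> startingPoints(List<Invocation> edges) {
    Set<String> alpha = new HashSet<>();
    Set<String> beta = new HashSet<>();
    for (Invocation e : edges) {
      alpha.add(e.invoking());
      beta.add(e.invoked());
    }
    alpha.removeAll(beta);
    return alpha;
  }

  // Ending methods are invoked but never invoke anything (in β, not in α).
  public static Set<String> endingPoints(List<Invocation> edges) {
    Set<String> alpha = new HashSet<>();
    Set<String> beta = new HashSet<>();
    for (Invocation e : edges) {
      alpha.add(e.invoking());
      beta.add(e.invoked());
    }
    beta.removeAll(alpha);
    return beta;
  }
}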

                                                                            Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, software exploration is divided by sessions, and the resulting call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



                                                                              48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

                                                                              49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

                                                                              50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

                                                                              51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

                                                                              52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

                                                                              53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

                                                                              54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

                                                                              55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

                                                                              56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

                                                                              57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

                                                                              58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551


                                                                              Appendix - Implementation of Swarm Debugging

                                                                              Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages, which are received by an SDS instance that stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                              Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, illustrated by a small Java sketch after the list, include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.

Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.
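To make the meta-model more concrete, the sketch below renders these concepts as plain Java classes. It is only an illustration: the field names (typeName, lineNumber, kind, and so on) are assumptions, and the actual SDS schema in Figure 14 may name and relate the entities differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, minimal rendering of the Swarm Debugging domain concepts.
class Developer { long id; String name; }
class Product   { long id; String name; }  // a set of one or more Eclipse projects
class Task      { long id; String title; }

class Session {
    long id;
    Developer developer;                               // who debugs
    Product product;                                   // what is debugged
    Task task;                                         // why it is debugged
    List<Breakpoint> breakpoints = new ArrayList<>();  // breakpoints toggled in this session
    List<Event> events = new ArrayList<>();            // stepping and breakpoint events
}

class Breakpoint {
    long id;
    String typeName;    // fully qualified type where the breakpoint was toggled
    String methodName;  // enclosing method, if any
    int lineNumber;
}

class Event {
    long id;
    String kind;        // e.g., "StepInto", "StepOver", "Breakpoint"
    long timestamp;
}
```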

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
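The same query can also be issued programmatically. The sketch below uses the standard Java 11 HttpClient against the endpoint quoted above; error handling and any authentication the SDS might require are omitted.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        // The SDS answers with a JSON document listing the matching developers.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```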

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
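For instance, a researcher could aggregate breakpoints per type directly over the PostgreSQL store, as in the JDBC sketch below. The connection string, credentials, and the table and column names (breakpoint, type_name) are assumptions for illustration; the real schema follows the meta-model of Figure 14.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointStats {
    public static void main(String[] args) throws Exception {
        // Hypothetical JDBC URL and credentials for the SDS PostgreSQL database.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints " +
                     "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n",
                        rs.getString("type_name"), rs.getInt("breakpoints"));
            }
        }
    }
}
```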

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                              Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                              Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
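A query in the spirit of Figure 16 can also be run from code. The sketch below uses the official Neo4j Java driver; the node label Method, the relationship type INVOKES, and the connection details are assumptions about how the SDS stores invocations, not the documented schema.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                                                  AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Which methods does a given method reach in the recorded sessions?
            Result result = session.run(
                    "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                  + "WHERE caller.name = $name RETURN callee.name AS callee",
                    Values.parameters("name", "main"));
            while (result.hasNext()) {
                System.out.println(result.next().get("callee").asString());
            }
        }
    }
}
```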

                                                                              Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
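A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below, assuming the standard org.eclipse.debug.core APIs. The persist() method is a placeholder for the SDT's actual RESTful reporting, whose implementation is not shown here.

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register for stepping/suspend events and for breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND) {
                persist("suspend", event.getSource()); // e.g., a breakpoint hit or a step
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        persist("breakpointAdded", breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        persist("breakpointRemoved", breakpoint);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        persist("breakpointChanged", breakpoint);
    }

    private void persist(String kind, Object payload) {
        // Placeholder: the real SDT sends this data to the SDS over REST.
        System.out.println(kind + ": " + payload);
    }
}
```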

After an authentication step, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not yet in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association links a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


                                                                              Fig 17 The Swarm Tracer architecture [17]

                                                                              Fig 18 The Swarm Manager view

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode; the program stops at the first breakpoint reached. From then on, for each event such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, recording one invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes; the Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

                                                                              Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns: the first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow represents a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


                                                                              Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

                                                                              Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled; thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
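For reference, a fuzzy query in the spirit of Figure 19 could be expressed against the underlying ElasticSearch index roughly as follows. The index name breakpoints and the field typeName are assumptions for illustration, not the SDS's documented schema, and whether a given misspelling matches depends on how the field is analysed and on the configured fuzziness.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy match: a query for "fcatory" should still find breakpoints in factory types.
        String query =
            "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with the matching breakpoints
    }
}
```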


                                                                              Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
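A direct way to read this definition is as two set differences over the invocation edges. The small sketch below computes both sets from a list of invoking/invoked pairs (the Invocation concept above); the method identifiers are hypothetical strings used only for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {
    public static void main(String[] args) {
        // Each pair is <invoking method, invoked method>, as collected by the SDT.
        List<String[]> invocations = List.of(
                new String[] {"Main.main", "Shape.draw"},
                new String[] {"Shape.draw", "Circle.draw"},
                new String[] {"Circle.draw", "DrawingAPI.drawCircle"});

        Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> invoked  = new HashSet<>(); // beta: vertexes that are invoked
        for (String[] edge : invocations) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }

        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);            // in alpha and not in beta
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);             // in beta and not in alpha

        System.out.println("Starting methods: " + starting); // [Main.main]
        System.out.println("Ending methods: " + ending);     // [DrawingAPI.drawCircle]
    }
}
```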

                                                                              Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Debugging Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                                                                40Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

                                                                                16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

                                                                                17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

                                                                                18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

                                                                                19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

                                                                                101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

                                                                                oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

                                                                                1218575

                                                                                22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

                                                                                neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

                                                                                conditional_breakpointhtmampcp=1_3_6_0_5

                                                                                23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

                                                                                24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

                                                                                linkspringercom101007s10818-015-9203-6

                                                                                25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

                                                                                (Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

                                                                                actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

                                                                                C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

                                                                                29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

                                                                                30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

                                                                                31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

                                                                                32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

                                                                                pmcentrezamprendertype=abstract

                                                                                33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

                                                                                34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

                                                                                35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

                                                                                36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

                                                                                doiacmorg1011452622669

                                                                                37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

                                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 41

                                                                                38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

                                                                                39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

                                                                                40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

                                                                                41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

                                                                                42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

                                                                                43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

                                                                                44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

                                                                                45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

                                                                                46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

                                                                                47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

                                                                                48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

                                                                                49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

                                                                                50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

                                                                                51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

                                                                                52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

                                                                                53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

                                                                                54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

                                                                                55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

                                                                                56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

                                                                                57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

                                                                                58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

                                                                                42Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                Appendix - Implementation of Swarm Debugging

                                                                                Swarm Debugging Services

                                                                                The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

                                                                                Fig 13 The Swarm Debugging Services architecture

                                                                                The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

                                                                                We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

                                                                                ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

                                                                                projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

                                                                                and debugging events

                                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 43

                                                                                Fig 14 The Swarm Debugging metadata [17]

                                                                                ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

                                                                                ndash Method is a method associated with a type which can be invoked duringdebugging sessions

                                                                                ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

                                                                                ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

                                                                                ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

                                                                                ndash Event is an event data that is collected when a developer performs someactions during a debugging session

                                                                                The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

                                                                                Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

                                                                                29 httpprojectsspringiospring-boot

                                                                                44Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

                                                                                httpswarmdebuggingorgdevelopers

                                                                                searchfindByNamename=petrillo

                                                                                the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

                                                                                SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

                                                                                Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

                                                                                Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

                                                                                Fig 15 Swarm Debugging Dashboard

                                                                                30 httpdbswarmdebuggingorg31 httpswwwelasticco

                                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 45

                                                                                Fig 16 Neo4J Browser - a Cypher query example

                                                                                Graph Querying Console The SDS also persists debugging data in a Neo4J32

                                                                                graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

                                                                                Figure 16 shows an example of Cypher query and the resulting graph

                                                                                Swarm Debugging Tracer

                                                                                Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

                                                                                After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

                                                                                To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

                                                                                32 httpneo4jcom

                                                                                46Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                Fig 17 The Swarm Tracer architecture [17]

                                                                                Fig 18 The Swarm Manager view

                                                                                Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

                                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 47

                                                                                Fig 19 Breakpoint search tool (fuzzy search example)

                                                                                invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

                                                                                To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

                                                                                Swarm Debugging Views

                                                                                On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

                                                                                Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

                                                                                Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

                                                                                48Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                Fig 20 Sequence stack diagram for Bridge design pattern

                                                                                Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

                                                                                Breakpoint Search Tool

                                                                                Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

                                                                                Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

                                                                                Swarm Debugging the Collective Intelligence on Interactive Debugging 49

                                                                                Fig 21 Method call graph for Bridge design pattern [17]

                                                                                StartingEnding Method Search Tool

                                                                                This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

                                                                                Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

                                                                                StartingPoint = VSP | VSP isin α and VSP isin β

                                                                                EndingPoint = VEP | VEP isin β and VEP isin α

                                                                                Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

                                                                                Summary

                                                                                Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

                                                                                50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                                                Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

                                                                                • 1 Introduction
                                                                                • 2 Background
                                                                                • 3 The Swarm Debugging Approach
                                                                                • 4 SDI in a Nutshell
                                                                                • 5 Using SDI to Understand Debugging Activities
                                                                                • 6 Evaluation of Swarm Debugging using GV
                                                                                • 7 Discussion
                                                                                • 8 Threats to Validity
                                                                                • 9 Related work
                                                                                • 10 Conclusion
                                                                                • 11 Acknowledgment

                                                                                  Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260

43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465

45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 (2011), p. 199. DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 (2013), pp. 110–119. DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


                                                                                  Appendix - Implementation of Swarm Debugging

                                                                                  Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

                                                                                  Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts, inspired by the FAMIX data model [56], include:

– Developer is an SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of one or more Eclipse projects.

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates a developer, a project, and debugging events.


                                                                                  Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event is the event data collected when a developer performs actions during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API, built using the Spring Boot framework,29 to manipulate debugging data. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
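As an illustration of consuming this API, the following sketch uses the standard java.net.http client (Java 11+) to issue the request above and print the JSON response. The endpoint is the one shown in the example; error handling is omitted.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FindDeveloperExample {
    public static void main(String[] args) throws Exception {
        // Query the Swarm RESTful API for developers named "petrillo"
        // (URL taken from the example request above).
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The SDS answers with a JSON structure describing the matching developers.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}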

SQL Query Console The SDS provides a console30 for running SQL queries on the debugging data, providing relational aggregations and functions.
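To give an idea of the relational queries the console supports, the sketch below runs a hypothetical aggregation over the debugging data through plain JDBC. The table and column names (breakpoint, type, and so on) are assumptions derived from the meta-model in Figure 14, not the actual SDS schema, and the connection settings are placeholders; the same SQL could be typed directly into the console.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeQuery {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings for the PostgreSQL database behind the SDS.
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = con.createStatement();
             // Hypothetical schema: one row per breakpoint, joined to the type it was toggled in.
             ResultSet rs = stmt.executeQuery(
                 "SELECT t.name AS type_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint b JOIN type t ON b.type_id = t.id " +
                 "GROUP BY t.name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n",
                        rs.getString("type_name"), rs.getLong("breakpoints"));
            }
        }
    }
}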

Full-text Search Engine The SDS also provides ElasticSearch,31 a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

                                                                                  Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


                                                                                  Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
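As a sketch of such a query issued programmatically, the snippet below uses the Neo4J Java driver (4.x) to list the invocation edges recorded for one session. The node label Method, the relationship INVOKES, and the session property are assumptions based on the meta-model, not the exact graph schema used by the SDS; the same Cypher string can be pasted into the Neo4J Browser.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        // Placeholder connection settings for the Neo4J instance behind the SDS.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {

            // Hypothetical labels and relationship: methods linked by INVOKES edges
            // recorded during a given Swarm session.
            Result result = session.run(
                    "MATCH (caller:Method)-[:INVOKES]->(callee:Method) " +
                    "WHERE caller.session = $session " +
                    "RETURN caller.name AS caller, callee.name AS callee",
                    Values.parameters("session", 42));

            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString()
                        + " -> " + row.get("callee").asString());
            }
        }
    }
}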

                                                                                  Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are received by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
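The following minimal sketch shows how such listeners plug into the Eclipse debug framework. The class name and the printed messages are illustrative only; a real tracer would forward the captured breakpoints and events to the SDS instead of printing them.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

/** Minimal sketch of a tracer hooked into the Eclipse debug framework. */
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register for debug events (suspensions, stepping) and breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Step Into, Step Over, Step Return, and breakpoint hits arrive as SUSPEND events;
            // a real tracer would inspect the stack frames and send an Event to the SDS.
            if (event.getKind() == DebugEvent.SUSPEND) {
                System.out.println("Suspended, detail=" + event.getDetail());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // A real tracer would post a Breakpoint entity (type, location, condition) to the SDS.
        System.out.println("Breakpoint added: " + breakpoint.getMarker());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}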

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer to extract method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not yet in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association links a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background with no visible performance decrease.

32 http://neo4j.com


                                                                                  Fig 17 The Swarm Tracer architecture [17]

                                                                                  Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


                                                                                  Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then complete.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

                                                                                  Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the current stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of the tree, and adjacent nodes represent invocation sequences.


                                                                                  Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate method-invocation sequences by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

                                                                                  Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
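As an illustration, the sketch below sends a fuzzy query directly to the ElasticSearch HTTP search endpoint. The index name breakpoints and the field methodName are assumptions, not the actual index layout used by the SDS; the point is that the misspelled term "fcatory" still matches "factory" thanks to the fuzzy query.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy query body in the ElasticSearch query DSL (hypothetical index and field).
        String query = "{ \"query\": { \"fuzzy\": { \"methodName\": { \"value\": \"fcatory\" } } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Hits are returned as JSON; the Breakpoint Search view renders them in a table.
        System.out.println(response.body());
    }
}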


                                                                                  Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not themselves invoked during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertices $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$, where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertices that invoke methods and $\beta$ is the subset of all vertices that are invoked, then the Starting and Ending methods are:

$$\mathit{StartingPoint} = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$$

$$\mathit{EndingPoint} = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
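These set definitions translate directly into two set differences over the recorded invocation edges, as the following sketch illustrates on made-up invoking/invoked pairs (the method names are illustrative, loosely echoing the Bridge example of Figures 20 and 21).

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartingEndingMethods {
    public static void main(String[] args) {
        // Each pair <invoking, invoked> is one recorded invocation (made-up data).
        List<String[]> invocations = List.of(
                new String[] {"Client.main", "Shape.draw"},
                new String[] {"Shape.draw", "Circle.draw"},
                new String[] {"Circle.draw", "DrawingAPI.drawCircle"});

        Set<String> invoking = new HashSet<>(); // alpha: vertices that invoke methods
        Set<String> invoked  = new HashSet<>(); // beta: vertices that are invoked
        for (String[] edge : invocations) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }

        // StartingPoint = alpha \ beta, EndingPoint = beta \ alpha
        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);

        System.out.println("Starting methods: " + starting); // [Client.main]
        System.out.println("Ending methods: " + ending);     // [DrawingAPI.drawCircle]
    }
}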

                                                                                  Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, because software exploration is divided into sessions, the resulting call graphs are easy to understand: only intentionally visited areas are shown on these


graphs, so one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

                                                                                  • 1 Introduction
                                                                                  • 2 Background
                                                                                  • 3 The Swarm Debugging Approach
                                                                                  • 4 SDI in a Nutshell
                                                                                  • 5 Using SDI to Understand Debugging Activities
                                                                                  • 6 Evaluation of Swarm Debugging using GV
                                                                                  • 7 Discussion
                                                                                  • 8 Threats to Validity
                                                                                  • 9 Related work
                                                                                  • 10 Conclusion
                                                                                  • 11 Acknowledgment

                                                                                    42Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                    Appendix - Implementation of Swarm Debugging

                                                                                    Swarm Debugging Services

                                                                                    The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

                                                                                    Fig 13 The Swarm Debugging Services architecture

                                                                                    The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

                                                                                    We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

                                                                                    ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

                                                                                    projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

                                                                                    and debugging events

                                                                                    Swarm Debugging the Collective Intelligence on Interactive Debugging 43

                                                                                    Fig 14 The Swarm Debugging metadata [17]

                                                                                    ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

                                                                                    ndash Method is a method associated with a type which can be invoked duringdebugging sessions

                                                                                    ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

                                                                                    ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

                                                                                    ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

                                                                                    ndash Event is an event data that is collected when a developer performs someactions during a debugging session

                                                                                    The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

                                                                                    Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

                                                                                    29 httpprojectsspringiospring-boot

                                                                                    44Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                    and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

                                                                                    httpswarmdebuggingorgdevelopers

                                                                                    searchfindByNamename=petrillo

                                                                                    the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

                                                                                    SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

                                                                                    Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

                                                                                    Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

                                                                                    Fig 15 Swarm Debugging Dashboard

                                                                                    30 httpdbswarmdebuggingorg31 httpswwwelasticco

                                                                                    Swarm Debugging the Collective Intelligence on Interactive Debugging 45

                                                                                    Fig 16 Neo4J Browser - a Cypher query example

                                                                                    Graph Querying Console The SDS also persists debugging data in a Neo4J32

                                                                                    graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

                                                                                    Figure 16 shows an example of Cypher query and the resulting graph

                                                                                    Swarm Debugging Tracer

                                                                                    Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

                                                                                    After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

                                                                                    To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

                                                                                    32 httpneo4jcom

                                                                                    46Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                    Fig 17 The Swarm Tracer architecture [17]

                                                                                    Fig 18 The Swarm Manager view

                                                                                    Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

                                                                                    Swarm Debugging the Collective Intelligence on Interactive Debugging 47

                                                                                    Fig 19 Breakpoint search tool (fuzzy search example)

                                                                                    invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

                                                                                    To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

                                                                                    Swarm Debugging Views

                                                                                    On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

                                                                                    Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

                                                                                    Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

                                                                                    48Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                    Fig 20 Sequence stack diagram for Bridge design pattern

                                                                                    Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

                                                                                    Breakpoint Search Tool

                                                                                    Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

                                                                                    Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

                                                                                    Swarm Debugging the Collective Intelligence on Interactive Debugging 49

                                                                                    Fig 21 Method call graph for Bridge design pattern [17]

                                                                                    StartingEnding Method Search Tool

                                                                                    This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

                                                                                    Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

                                                                                    StartingPoint = VSP | VSP isin α and VSP isin β

                                                                                    EndingPoint = VEP | VEP isin β and VEP isin α

                                                                                    Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

                                                                                    Summary

                                                                                    Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

                                                                                    50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                    graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                                                    Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

                                                                                    • 1 Introduction
                                                                                    • 2 Background
                                                                                    • 3 The Swarm Debugging Approach
                                                                                    • 4 SDI in a Nutshell
                                                                                    • 5 Using SDI to Understand Debugging Activities
                                                                                    • 6 Evaluation of Swarm Debugging using GV
                                                                                    • 7 Discussion
                                                                                    • 8 Threats to Validity
                                                                                    • 9 Related work
                                                                                    • 10 Conclusion
                                                                                    • 11 Acknowledgment

                                                                                      Swarm Debugging the Collective Intelligence on Interactive Debugging 43

                                                                                      Fig 14 The Swarm Debugging metadata [17]

                                                                                      ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

                                                                                      ndash Method is a method associated with a type which can be invoked duringdebugging sessions

                                                                                      ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

                                                                                      ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

                                                                                      ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

                                                                                      ndash Event is an event data that is collected when a developer performs someactions during a debugging session

                                                                                      The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

                                                                                      Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

                                                                                      29 httpprojectsspringiospring-boot

                                                                                      44Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                      and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

                                                                                      httpswarmdebuggingorgdevelopers

                                                                                      searchfindByNamename=petrillo

                                                                                      the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

                                                                                      SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

                                                                                      Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

                                                                                      Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

                                                                                      Fig 15 Swarm Debugging Dashboard

                                                                                      30 httpdbswarmdebuggingorg31 httpswwwelasticco

                                                                                      Swarm Debugging the Collective Intelligence on Interactive Debugging 45

                                                                                      Fig 16 Neo4J Browser - a Cypher query example

                                                                                      Graph Querying Console The SDS also persists debugging data in a Neo4J32

                                                                                      graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

                                                                                      Figure 16 shows an example of Cypher query and the resulting graph

                                                                                      Swarm Debugging Tracer

                                                                                      Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

                                                                                      After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

                                                                                      To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

                                                                                      32 httpneo4jcom

                                                                                      46Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                      Fig 17 The Swarm Tracer architecture [17]

                                                                                      Fig 18 The Swarm Manager view

                                                                                      Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

                                                                                      Swarm Debugging the Collective Intelligence on Interactive Debugging 47

                                                                                      Fig 19 Breakpoint search tool (fuzzy search example)

                                                                                      invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

                                                                                      To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

                                                                                      Swarm Debugging Views

                                                                                      On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

                                                                                      Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

                                                                                      Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

                                                                                      48Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                      Fig 20 Sequence stack diagram for Bridge design pattern

                                                                                      Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

                                                                                      Breakpoint Search Tool

                                                                                      Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

                                                                                      Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

                                                                                      Swarm Debugging the Collective Intelligence on Interactive Debugging 49

                                                                                      Fig 21 Method call graph for Bridge design pattern [17]

                                                                                      StartingEnding Method Search Tool

                                                                                      This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

                                                                                      Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

                                                                                      StartingPoint = VSP | VSP isin α and VSP isin β

                                                                                      EndingPoint = VEP | VEP isin β and VEP isin α

                                                                                      Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

                                                                                      Summary

                                                                                      Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

                                                                                      50Please give a shorter version with authorrunning and titlerunning prior to maketitle

                                                                                      graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

                                                                                      Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

                                                                                      • 1 Introduction
                                                                                      • 2 Background
                                                                                      • 3 The Swarm Debugging Approach
                                                                                      • 4 SDI in a Nutshell
                                                                                      • 5 Using SDI to Understand Debugging Activities
                                                                                      • 6 Evaluation of Swarm Debugging using GV
                                                                                      • 7 Discussion
                                                                                      • 8 Threats to Validity
                                                                                      • 9 Related work
                                                                                      • 10 Conclusion
                                                                                      • 11 Acknowledgment


