Using Emergent Team Structure to Focus Collaboration

by

Shawn Minto

B.Sc., The University of British Columbia, 2005

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University Of British Columbia

January 30, 2007

© Shawn Minto 2007
Abstract

To build successful complex software systems, developers must collaborate with each other to solve issues. To facilitate this collaboration, specialized tools are being integrated into development environments. Although these tools facilitate collaboration, they do not foster it. The problem is that the tools require the developers to maintain a list of other developers with whom they may wish to communicate. In any given situation, it is the developer who must determine who within this list has expertise for the specific situation. Unless the team is small and static, maintaining the knowledge about who is expert in particular parts of the system is difficult. As many organizations are beginning to use agile development and distributed software practices, which result in teams with dynamic membership, maintaining this knowledge is impossible.
This thesis investigates whether emergent team structure can be used to support collaboration amongst software developers. The membership of an emergent team is determined from analysis of software artifacts. We first show that emergent teams exist within a particular open-source software project, the Eclipse integrated development environment. We then present a tool called Emergent Expertise Locator (EEL) that uses emergent team information to propose experts to a developer within their development environment as the developer works.
We validated this approach to support collaboration by applying our approach to historical data gathered from the Eclipse project, Firefox and Bugzilla and comparing the results to an existing heuristic for recommending experts that produces a list of experts based on the revision history of individual files. We found that EEL produces, on average, results with higher precision and higher recall than the existing heuristic.
Acknowledgements

I would like to thank my supervisor Gail Murphy for introducing me to research during a co-op work term as an undergraduate. I would not have known about the interesting world of research if it were not for her. Also, through working on Mylar with Mik, my research interests in collaboration and task-based development were made apparent. Furthermore, I would like to thank all of the members of the Software Practices Lab for engaging conversations about a range of research topics related to software engineering.
Finally, I could not have finished this thesis without all of the love and support from Kenedee over the past two years.
Chapter 1
Introduction
Software developers must collaborate with each other at all stages of the software life-cycle to build successful complex software systems. To enable this collaboration, integrated development environments (IDEs) are including an increasing number of tools to support collaboration, such as chat support (e.g., ECF¹ and the Team Work Facilitation in IntelliJ²) and screen sharing (e.g., IBM Jazz³).
All of these tools have two limitations that make them harder to use than necessary. First, the tools require the user to spend time and effort describing to the tool the members of a team with whom he may want to communicate over time (i.e., a buddy list). Given that the composition of software teams is increasingly dynamic for many organizations due to agile development processes, distributed software development and other similar trends, it may not be straightforward for a developer to keep a description of colleagues on the many teams in which she may work up-to-date.⁴ Second, the tools require the user to determine with whom he should collaborate in a particular situation. This requirement forces the user to have some knowledge of who has expertise on particular parts of the system.
To support collaboration amongst members of such dynamic teams, there is a need for a mechanism to determine the composition of the team automatically so that developers do not need to spend time configuring membership lists for the many teams to which they may belong. We believe that, for many cases in which collaboration needs to occur, the context from which a developer initiates communication, combined with information about the activity of developers on items related to that context, can be used to determine the appropriate composition of the team. We consider that the team structure emerges from the activity, and thus refer to this problem as determining emergent team structure.
In this thesis, we describe an approach and tool, called Emergent Expertise Locator (EEL), that overcomes these limitations for developers working on code.
¹ ECF is the Eclipse Communications Framework, http://www.eclipse.org/ecf/, verified 12/17/06.
² IntelliJ is a Java development environment, http://www.jetbrains.com/idea/, verified 12/17/06.
³ Jazz is an IBM software development environment supporting team development and designed to incorporate all development artifacts and processes for a company. Some of the features included are source control, issue tracking and synchronous communication through chat.
⁴ As one example, the Eclipse development process uses dynamic teams as described by Gamma and Wiegand in an EclipseCon 2005 presentation, http://eclipsecon.org/2005/presentations/econ2005-eclipse-way.pdf, verified 12/17/06.
The intuition is that a useful definition of a team, from the point of view of aiding collaboration, is those colleagues who can provide useful help in solving a particular problem. We approximate the nature of a problem by the file(s) on which a developer is working. Based on the history of how files have changed together in the past and who has participated in the changes, we can recommend members of an emergent team for the current problem of interest. Our approach uses the framework from Cataldo et al. [2], adapting their matrix-based computation to support on-line recommendations using different information, specifically files rather than task communication evidence. EEL produces a collaboration matrix C through the computation C = (F_A F_D) F_A^T, where F_A is a file authorship matrix and F_D is a file dependency matrix. Further information on the matrices and the computation is presented in Section 4.1.1. After this computation, a value in C_ij describes which developers should interact based on the revisions that they have committed in the past to the repository. We use these values in C to recommend a ranked list of the likely emergent team members with whom to communicate given a set of files currently of interest.
1.1 Scenario
To describe why and how EEL can help developers as they work, we describe a scenario of a common development task that may require communication between two developers to gather knowledge required to solve an issue. This scenario provides insight into the simplicity and usefulness of EEL.
Selena, Dave and John are developers on the same open-source project. Selena, Dave and John each live in a different part of the world. This project has been released and the developers are currently working to fix bugs that have been reported by users. Even though these three developers all work on the same project, each is more knowledgeable about a different area of the code than the others because he or she has worked on that part of the code base more frequently.
Many of the bugs that have been reported refer to the core data model for the system, a part of the system for which Selena is the most knowledgeable. Dave, on the other hand, was in charge of the external representation of the model for the release. Since Dave is somewhat knowledgeable about the model, and there have been no bugs reported related to his portion of the system, he decides to help Selena and fix some bugs that are more appropriate for her. Dave picks a bug. To start working on it, he investigates a stack trace provided in the bug that is related to the error. Although he can reproduce the problem, he is unable to determine the source of the problem because it requires extensive knowledge of how events are sent and handled within the model. Since he is unaware of how this works, he right-clicks on the file on which he is currently working and views a list of people to contact on the team associated with that part of the software as determined by EEL. This ranked list shows Selena listed as the most knowledgeable, followed by John. Since Dave knows that Selena is busy, he does not want to contact her. However, EEL has made it evident that John is also knowledgeable, a fact of which Dave was unaware. John is listed because he worked part time on the model but is very knowledgeable about it. Dave decides to contact John through chat and is able to gain the knowledge that he needed to fix the bug. If John were not online, Dave could have used an asynchronous communication method such as e-mail to contact him.
EEL is not only useful during the exploration of a system to fix a bug; it can also be useful during regular development and testing, and during mentoring when junior developers are becoming familiar with a system. EEL is integrated into an IDE such that, at any point while working on a system, a developer can open the context menu on a file, determine whom they may contact to gather more information about the task on which they are currently working, and easily initiate communication with the appropriate colleague.
1.2 Validation Approach
To determine the accuracy of EEL in predicting emergent teams, we applied the approach to historical data for the Eclipse project, Firefox and Bugzilla. The validation of EEL was fully automated and did not involve users, since it is difficult to recruit developers if there is no evidence of the usefulness of the tool. To perform the validation, we needed two pieces of information: a bug report and the list of files needed to fix the bug report. Bug reports provide a record of communication on a particular issue and we use the commenters on the bug as a list of potential experts. We populate EEL using the list of files needed to fix the bug as determined using a standard means of associating bugs and file revisions (see Section 5.2). We then used the recommendations from EEL and the list of potential experts from the bug report to calculate the precision and recall. We then compared the performance of EEL to an existing heuristic for recommending experts that produces a list of experts based on the revision history of individual files. We found that EEL produces, on average, results with higher precision and higher recall than the existing heuristic.
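The precision and recall computation used in this comparison is simple to sketch. The Python fragment below is an illustration of the measures, not the thesis's actual validation code; the developer names are invented. The recommended list plays the role of EEL's output and the gold standard is the set of developers who commented on the bug.

```python
def precision_recall(recommended, actual_experts):
    """Precision and recall of a recommended expert list against the
    set of people who commented on the bug (the gold standard)."""
    recommended = list(recommended)
    actual = set(actual_experts)
    if not recommended or not actual:
        return 0.0, 0.0
    hits = sum(1 for dev in recommended if dev in actual)
    precision = hits / len(recommended)   # fraction of recommendations that are experts
    recall = hits / len(actual)           # fraction of experts that were recommended
    return precision, recall

# Hypothetical data: three recommendations, two actual commenters.
p, r = precision_recall(["selena", "john", "dave"], {"selena", "mik"})
```

Here one of the three recommendations is correct (precision 1/3) and one of the two commenters is found (recall 1/2).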
1.3 Thesis Structure
We begin by comparing our approach with existing work on locating experts (Chapter 2). Next, we show that emergent teams exist (Chapter 3). We then describe our approach and implementation (Chapter 4) before presenting our validation of the approach (Chapter 5). Before summarizing, we discuss outstanding issues with our approach and validation (Chapter 6).
Chapter 2
Related Work
Three types of approaches have been used to recommend experts for a software development project: heuristic-based (e.g., [15]), social network-based (e.g., [16]) and machine learning-based (e.g., [1]).
Heuristic-based recommenders apply heuristics against data collected from and about the development to determine who is expert in various areas of the system. Some approaches require users to maintain profiles that describe their area of expertise (i.e., Hewlett-Packard’s CONNEX¹) or organizational position (i.e., [13]). CONNEX is a traditional expertise finder which requires users to maintain a profile of their expertise. CONNEX then allows users to search or browse the directory of profiles looking for a person with the expertise in which they are interested. Expertise Recommender (ER) uses an organizational chart of departments within a company to determine if an expert should be recommended based on the “distance” the departments are from each other [13]. This allows ER to limit the recommendations to people who are in departments that are “connected” to the department of the developer requesting the recommendation. These profiles can be effective because they gather information from the source of the expertise. Unfortunately, it is difficult to keep such profiles up-to-date. During a field study of expertise location, it was found that a seven-year-old profile-based system was available but the profiles had never been updated [12]. To avoid this problem, EEL does not use any profile-based information.
Other heuristic-based expertise recommenders are based solely on data extracted from the archives of the software development. The Expertise Browser (ExB), for example, uses experience atoms (EAs), basic units of experience, as the basis for recommending experts [15]. Experience atoms are created by mining the version control system for the author of each file revision and the changes made to the file. A mined experience atom is then associated with multiple domains (e.g., the file containing a modification, the technology used, the purpose of the change and/or the release of the software). A simple counting of experience atoms for each domain in question is then used to determine the experience in that area. Similar to our approach, ExB equates experience to expertise. In contrast, our approach accounts for how files are modified together (see Section 4.1), which we believe contains rich information about the expertise of the developer who made the change.
As another example, the Expertise Recommender (ER) by McDonald [13] was deployed using two heuristics: tech support and change history. The change history heuristic, which is related to our work, uses the “Line 10” rule that states that the revision authors are the experts for a file. These experts are ranked according to revision time so that the last developer to modify the file has the highest rank [13]. If multiple modules are selected as the target for an expertise request in ER, an intersection of the experts is performed, raising the possibility of ER producing an empty set of experts. In contrast, EEL uses the frequency of file modifications that occur together and can always produce a recommendation.
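The behaviour described here is simple enough to sketch. The following Python fragment is our paraphrase of the “Line 10” rule and the intersection step, not McDonald's implementation; the timestamps and names are invented. It shows why the intersection can come up empty.

```python
def line10_experts(revisions):
    """Rank a file's revision authors by recency: the most recent
    modifier gets the highest rank. `revisions` is a list of
    (timestamp, author) pairs for one file."""
    latest = {}
    for ts, author in revisions:
        if author not in latest or ts > latest[author]:
            latest[author] = ts
    return sorted(latest, key=lambda a: latest[a], reverse=True)

def intersect_experts(per_module_experts):
    """For a multi-module request, ER intersects the per-module
    expert sets, which may leave nobody."""
    sets = [set(e) for e in per_module_experts]
    return set.intersection(*sets) if sets else set()

# selena modified last (t=4), then john (t=3), then dave (t=1).
experts = line10_experts([(1, "dave"), (2, "selena"), (3, "john"), (4, "selena")])
# Two modules with disjoint experts produce an empty recommendation.
empty = intersect_experts([["selena", "john"], ["dave"]])
```

The empty result for disjoint modules is exactly the failure mode that EEL's frequency-based approach avoids.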
Both ER and ExB require the user to switch from the application in which they are currently working to a special one designed for supplying the expertise recommendations. Yimam-Seid et al. wrote that “it is beneficial if expert finding systems are embedded in the day-to-day problem solving and information search environments” since expertise finding is a daily occurrence [18, p. 13]. EEL takes this approach by providing the expertise recommendations from within the development environment. This allows developers to work as normal, and if a problem arises, they can request the expertise list without having to switch applications. Furthermore, ER requires the user to enter potentially complicated queries to the system [13] to get recommendations and ExB makes the developer select the module on which they are currently working [15]. Since activity within the IDE provides the context for a developer’s current work, EEL determines the files that the user is interested in by monitoring their work. EEL is therefore able to determine the experts in the area that is currently under investigation automatically, without a user-entered query.
A social network describes relationships between developers built using data mined from the system development (e.g., [9]). These networks often become large. As a result, many tools support queries to prune the network to show the most relevant portion; for instance, enabling the production of a view with experts in a particular area, such as NetExpert [16]. NetExpert provides support for searching for an expert, browsing the social and knowledge networks, as well as ways to initiate communication [16]. NetExpert requires the user to initially create a profile, which is then maintained automatically by using information in documents that the user submits to the system as well as their personal web pages [16]. This social network approach adds complexity for the user, since they must be able to interpret and search the network to extract the information that they want. In contrast, the query needed to determine the experts in EEL is formed behind the scenes automatically, based on the artifacts and tasks on which the developer is working.
Social networks were also used in the Expertise Recommender (ER) to tailor the expertise results for each user [11]. For ER, the social networks were created by hand through information gained directly from the users. The networks created were then used to change the expertise recommendations based on the relationships between users. This means that each user might get different recommendations based on the people with whom they would rather communicate. Explicit social network information is an excellent way to tailor the recommendations but was not used in EEL since it would be a per-project customization that would require extensive analysis of the teams to build. Furthermore, the method used by ER to get the social network has the same problems as profiles, since it requires the networks to be updated as both the project and teams evolve.
Machine learning-based approaches in the area of expertise recommendation have focused on using text categorization techniques to characterize bugs [1] and documents [17]. Anvik et al. describe a system to recommend developers who should fix a bug based on the history of bug fixes for the system and the description of the newly reported bug. A more generalized machine learning-based expertise locator is ExpertiseNet, as described by Song et al. [17]. ExpertiseNet examines files, specifically papers, to dynamically update a user’s expertise profile [17]. Similar to machine learning-based expertise recommenders, EEL relies on past information to form recommendations. In contrast to these approaches, EEL uses a simple frequency-based weighting to form recommendations and does not produce any general model of the activity between developers.
To investigate the coordination requirements of a project, Cataldo et al. [2] introduced an elegant matrix-based solution to finding and investigating these requirements. In their approach, the product of a task dependency matrix and a task assignment matrix is multiplied by the transpose of the task assignment matrix to produce a description of the extent to which people involved in a development share tasks. In comparison, we use the basic matrix framework to consider how work performed on files, irrespective of tasks, can be used to predict expertise as a developer works, as opposed to analyzing post-development whether the communication matches the coordination requirements.
Previous expertise recommenders were validated using human subjects; few have undergone a systematic user evaluation [10]. Expertise Browser (ExB) was deployed in two companies and the type and number of interactions of users with ExB was recorded [15]. Use of the tool was used to infer whether the tool worked well. No information was collected on the accuracy of the ExB recommendations. Expertise Recommender (ER) was validated using a systematic user evaluation [10]. This study had users rank a list of potential experts and then the results were compared to the ranking provided by ER [10]. The validation of EEL presented in this thesis focuses on an automated validation strategy, as initial accuracy numbers are needed prior to inserting the technology into a development environment.
Chapter 3
Emergent Teams Exist
Two recent trends in software development are the use of more agile software development processes [6] and global (distributed) software development [4]. These trends have arisen for many reasons, including the need to ensure appropriate expertise for the development at appropriate times. By using these development techniques, formal teams are replaced by dynamic ones that are created during development and that are constantly changing. The team structure emerges from the activity, and thus we refer to this as the emergent team structure.
Lewin and Regine define an emergent team as “a dynamic way of working together that keeps organisations on the edge” [8]. In an EclipseCon 2005 presentation entitled “the eclipse way: processes that adapt”, Erich Gamma and John Wiegand described the use of dynamic teams during the development of Eclipse; these teams are established to solve cross-component issues and consist of developers from all components affected by the issue.
We believe that emergent teams in software development are not created only explicitly to solve particular issues but, more commonly, that they form implicitly through the action of working in the same area of the system. If a new developer begins work on an area of a software system, they are joining the emergent team consisting of the other developers who have previously worked in that area. The previous developers have the expertise needed to work on the system, expertise that the new developer must gather by interacting with the team. Even if formal teams are defined, developers naturally create, join and leave emergent teams on a daily basis based on the area of the system in which they are currently working.
To show that emergent teams exist and to understand more about their creation and composition, we investigated whether such teams exist on the Eclipse project. We found that, on average, each committer on the Eclipse Java development tools (JDT) component team committed to eight different Eclipse Java projects¹ other than JDT projects² within the past year. Table 3.1 shows the number of projects, other than JDT projects, that each of the Eclipse JDT developers has committed code to within the last year. As this table shows, some developers are stationary (just working on the code within their designated team) while others are much more active across the rest of Eclipse. Furthermore, research on developer social networks by Madey et al. saw a similar trend on SourceForge³. They noted that the “busiest” developer worked on
¹ A Java project is a module that is at the top level of the Eclipse CVS tree.
² Java projects beginning with org.eclipse.jdt.
³ A project hosting site, http://sourceforge.net, verified 12/17/06.
between 17 and 27 projects during the 14-month period that they monitored [9]. A more fine-grained view of the dynamic nature of teams on the Eclipse project is to look at the amount of work developers perform on each project within a given time frame. Figures 3.1, 3.2, 3.3 and 3.4 show the activity of four different developers, two from the Eclipse JDT team and two from the Eclipse Platform team. The graphs consider a six-month period divided into two-week spans. The bar graph for each two-week period represents the total number of commits that the developer made to all of the Eclipse Java projects. Each of the lines represents the number of commits made to each separate project. These graphs show how a developer’s activity on a specific project changes over time and how their area of focus in the system changes as well. This change in activity shows that the developers participate in many different teams, as they are not the only developer contributing to the different parts. This provides evidence that even though a developer belongs to a formal team, their emergent team is changing as they work, showing how teams emerge.
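The bucketing behind these graphs can be sketched as follows. This Python fragment is only an illustration of the two-week binning; the commit log, dates and project names are invented.

```python
from datetime import date

def biweekly_activity(commits, start, periods):
    """Count commits per project in consecutive two-week spans.
    `commits` is a list of (date, project) pairs; spans outside the
    monitored window are ignored."""
    counts = [{} for _ in range(periods)]
    for day, project in commits:
        idx = (day - start).days // 14   # which two-week span this commit falls in
        if 0 <= idx < periods:
            bucket = counts[idx]
            bucket[project] = bucket.get(project, 0) + 1
    return counts

# Hypothetical commit log for one developer over ~6 months (13 spans).
log = [(date(2006, 1, 2), "org.eclipse.jdt.core"),
       (date(2006, 1, 10), "org.eclipse.jdt.core"),
       (date(2006, 1, 20), "org.eclipse.core.runtime")]
activity = biweekly_activity(log, date(2006, 1, 1), 13)
```

Summing each bucket gives the bar heights; the per-project counts give the individual lines.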
Table 3.1: Number of projects each JDT developer has committed to in the last two years other than JDT projects.
Username    Number of Projects
sdimitro    52
dmegert     29
dejan       26
kmoir       20
darins      12
mfaraj      12
darin       11
teicher     11
The goal of the Emergent Expertise Locator (EEL) is to make it easier for a developer to determine with whom to communicate during a programming task. EEL displays a ranked list of other developers with expertise on the set of files that the user of EEL has recently edited or selected—their current change set. To use EEL, a developer accesses a menu on a source file that displays a ranked list of developers along with ways to initiate a communication, as in Figure 4.1. These communication methods may be synchronous (i.e., chat) or asynchronous (i.e., e-mail). This approach aims to minimize the impact of the communication on a developer’s work flow and aims to provide assistance in context; for example, a developer need not switch to an external application to perform the communication, and context about the developer’s current state may be automatically transmitted to the expert with whom communication is begun.
4.1.1 Mechanics
Our approach is based on the mechanism of using matrices to compute coordination requirements introduced by Cataldo et al. [2]. Our approach requires two matrices, the file dependency matrix and the file authorship matrix, and produces a third, the expertise matrix.
1. File Dependency Matrix A cell ij (or ji) in this matrix represents the number of times that the file i and the file j have been modified together¹. Since this matrix is symmetric, EEL only records data in the upper triangle to save space. This matrix is populated by querying a version control system, for each version of a file, for the files that changed with it.
2. File Authorship Matrix A cell ij in this matrix represents the number of times a developer i has modified a file j.
¹ The time duration can be set within EEL. By default, the entire project history is used.
Chapter 4. Approach and Implementation
Figure 4.1: Context menu list of developers showing the multiple methods of communication available.
3. Expertise Matrix This matrix represents the current experts based on the file dependency matrix and the file authorship matrix. A cell ij (or ji) in this matrix specifies the amount of expertise that developer i has relative to developer j. We consider that the higher the number in cell ij, the more of an expert developer j is to i. This matrix is computed using the equation:
C = (F_A F_D) F_A^T     (4.1)

where C is the expertise matrix, F_A is the file authorship matrix and F_D is the file dependency matrix.
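As an illustration of Equation 4.1, the following self-contained Python sketch computes C for a toy setting of two developers and three files. It uses plain lists rather than the matrix package the tool actually uses, and all of the counts are invented.

```python
def matmul(A, B):
    """Plain-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expertise_matrix(FA, FD):
    """C = (FA FD) FA^T: cell ij scores how strongly developer j's
    past file activity overlaps with developer i's."""
    return matmul(matmul(FA, FD), transpose(FA))

# Hypothetical example: 2 developers x 3 files.
# FA[i][k]: number of times developer i modified file k.
FA = [[2, 1, 0],   # developer 0
      [0, 1, 3]]   # developer 1
# FD[k][l]: number of times files k and l changed together (symmetric).
FD = [[2, 1, 0],
      [1, 3, 1],
      [0, 1, 4]]
C = expertise_matrix(FA, FD)
```

Because F_D is symmetric, C is symmetric as well; the off-diagonal cells score each pair of developers against each other, and larger values indicate stronger overlap in file activity.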
The tool that we have built on this basic approach uses a developer’s current change set—the files the developer has recently selected or edited—to suggest an ordered list of developers with whom to communicate. Figure 4.2 shows this tool working within the Jazz Eclipse client. To provide this support, EEL mines information as a developer works. When a developer selects or edits a file, it triggers EEL to access the version control system and mine the related files and authors to populate the matrices. Once the user right-clicks on a file and attempts to collaborate with another developer, EEL calculates the coordination matrix on the fly to ensure up-to-date information. The calculation of the coordination matrix can be time intensive. To mitigate this problem, since we are interested in experts only for the current developer, we modify the expertise matrix calculation to be
v = (R_FA F_D) F_A^T     (4.2)

where v is a vector that represents the experts related to just the current developer, R_FA is the row that corresponds to the current developer in the file authorship matrix, F_D is the file dependency matrix and F_A is the file authorship matrix. By using only the row that corresponds to the current developer, the matrix multiplications are reduced to simple vector calculations. Even though we are only interested in the experts relative to the developer performing the query, the entire file dependency and file authorship matrices must be populated since they are required for the expertise matrix calculation.
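Equation 4.2 can be sketched the same way. In this Python fragment (an illustration with invented matrices, not EEL's Java code), using only the current developer's row of F_A keeps every step a vector-matrix product, so no developer-by-developer matrix is ever materialized.

```python
def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(vi * M[i][j] for i, vi in enumerate(v))
            for j in range(len(M[0]))]

def expert_vector(row_FA, FD, FA):
    """v = (R_FA FD) FA^T, computed with two vector-matrix products.
    Entry j of v scores developer j relative to the current developer."""
    t = vec_mat(row_FA, FD)                  # 1 x files
    FA_T = [list(col) for col in zip(*FA)]   # files x developers
    return vec_mat(t, FA_T)                  # 1 x developers

# Invented toy data: 2 developers x 3 files.
FA = [[2, 1, 0], [0, 1, 3]]
FD = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
v = expert_vector(FA[0], FD, FA)   # experts relative to developer 0
```

For n files and m developers this costs O(n² + nm) multiplications instead of the O(mn² + m²n) of the full three-matrix product, which is what makes the on-the-fly calculation cheap.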
Figure 4.2: EEL in use within Jazz.
Figure 4.3: Architecture of EEL.
4.2 Implementation
EEL is implemented as a Java plug-in for Eclipse since Jazz has an Eclipse client. The core plug-in contains the algorithms and data structures required to determine the potential experts based on the repository data. This core plug-in also handles scheduling the queries to the repository in the background so that they are transparent to the developer. Furthermore, it provides an extension point so that additional repository support can be added easily. The second plug-in is the repository plug-in. This plug-in is related directly to the type of repository that EEL queries for author and related file information. Figure 4.3 shows the architecture of EEL with respect to Eclipse and Jazz. We have developed a repository plug-in for both Jazz and Subversion repositories.
EEL uses the application programming interface (API) provided by Jazz to gather information from the version control system (i.e., change sets and authors) that is needed to build the file authorship and dependency matrices. Since EEL is developed as a client-side plug-in, no changes had to be made to the Jazz server. This means that EEL is personalized based on each user and the files on which they have worked. EEL could have been implemented as a server, but we chose the client-based approach since we wanted to ensure that each developer could personalize the tool to suit their needs. Furthermore, implementing EEL as a server would require additional infrastructure that we did not want to impose on teams.
To support matrix computations in EEL, we used the open-source matrix package Matrix Toolkits for Java². This package provides the ability to create the required matrices as well as perform the necessary calculations to obtain the expertise matrix. To ensure that the final matrix contains enough relevant information to determine the appropriate experts in the area of interest, EEL uses matrices that are 1000 elements square. This choice means that we can track up to 1000 files, enabling a substantial portion of a developer’s work to be used for the recommendation of experts. Even though the matrices are fairly large, they can fill up quickly due to the number of related files per revision of a file. To mitigate this problem, EEL uses a least recently used approach to determine which entries to remove from the matrix once it becomes full, allowing the files that are either related or viewed more often to remain in the matrix longer. To ensure that the files that a developer has worked on (the current change set) remain in the matrix, they are removed only if they occupy 50% of the matrix. The current change set, the files which the developer has selected or edited, is treated differently since the set contains information that directly pertains to the developer’s current work.
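A least-recently-used index cache with the change-set protection described here might look like the following sketch. This is our reconstruction in Python, not EEL's Java code; the class and method names are invented.

```python
from collections import OrderedDict

class FileIndexCache:
    """Map file paths to matrix indices with a fixed capacity,
    evicting the least recently used file when full. Files in the
    current change set are protected from eviction until they alone
    occupy half of the matrix."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.lru = OrderedDict()           # path -> matrix index, oldest first
        self.free = list(range(capacity))  # unused matrix indices
        self.change_set = set()

    def touch(self, path, in_change_set=False):
        """Record an access to `path` and return its matrix index."""
        if in_change_set:
            self.change_set.add(path)
        if path in self.lru:
            self.lru.move_to_end(path)     # mark as most recently used
            return self.lru[path]
        if not self.free:
            self._evict()
        self.lru[path] = self.free.pop()   # recycle a freed index
        return self.lru[path]

    def _evict(self):
        # Change-set files are protected only while the change set
        # occupies no more than 50% of the matrix.
        protect = len(self.change_set) <= self.capacity // 2
        victim = next((p for p in self.lru
                       if not (protect and p in self.change_set)), None)
        if victim is None:                 # everything protected: evict oldest
            victim = next(iter(self.lru))
        self.free.append(self.lru.pop(victim))

cache = FileIndexCache(capacity=3)
for path, in_cs in [("a", False), ("b", True), ("c", False), ("d", False)]:
    cache.touch(path, in_change_set=in_cs)
```

With capacity 3, touching a fourth file evicts "a" (the oldest non-change-set entry) while the change-set file "b" survives.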
Since the files in the change sets provide the basis for the determination of expertise within EEL, it is necessary that they provide accurate information. Ying and colleagues noted that while mining software repositories, change sets containing over 100 files are often not meaningful since they usually correspond to automated modifications, such as formatting the code or changing the licensing [19]. After inspecting the change logs for several projects,³ we noted that this is true of most change sets with over 50 files. With this knowledge, we limit EEL from mining information from change sets with over 50 files. This choice ensures that irrelevant related file data does not pollute the file authorship and dependency matrices.
The time required for EEL to produce a recommendation is dependent on two main factors. The first factor is the speed of the repository from which EEL accesses the author and related file information. This speed is affected by many factors such as network speed, repository size and server load. Generally, these systems can provide the information that EEL requires quickly; therefore, repository speed is not a major factor in the usability of EEL. The second factor is the speed of the calculation of the expertise matrix. Since the expertise matrix is computed when the user opens the menu, it is the main factor in producing a recommendation quickly. On a 2.13 GHz Core 2 Duo system with 2 GB of memory, the calculation of the expertise matrix takes 891 ms using the vector calculation approach.
4.2.1 Extensibility
EEL was designed to be extensible since many different repositories and communication methods exist. Currently, EEL supports only the addition of a repository that is tied to a single communication tool (e.g., Jabber⁴ or IBM Lotus Sametime⁵). This is an issue since two teams using the same repository may use different systems for communication. Each repository needs the communication mechanism handled separately since many systems, unlike Jazz, do not have collaboration support built into them and external tools need to be used.
³ Most notably Eclipse and Gnome Evolution.
⁴ Jabber is an open-source instant messaging system, http://www.jabber.org/, verified 01/08/07.
⁵ Sametime is an enterprise instant messaging and web conference system, http://www-
If EEL is to be employed, it would be beneficial to provide a pluggable communication framework, since not all projects use the same collaboration techniques even if they use the same version control system.
To add support to EEL for a new repository, a single class needs to be created. This class must implement methods for retrieving author and related file information from the repository, given a file in which the developer is currently interested. These methods are called from the core plug-in of EEL when the developer selects or edits a file.
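As a sketch of what such a repository class might look like, the following Python interface is hypothetical: EEL's actual extension point is a Java class, and these method names are invented for illustration of the shape of the contract.

```python
from abc import ABC, abstractmethod

class RepositoryConnector(ABC):
    """Shape of the single class a new repository must provide: given
    a file the developer is working with, return the author and
    related-file history that EEL's core plug-in requests. Method
    names here are invented for illustration."""

    @abstractmethod
    def authors_of(self, path):
        """Return [(author, revision), ...] for the file's history."""

    @abstractmethod
    def related_files(self, path):
        """Return files committed in the same change sets as `path`."""

class InMemoryConnector(RepositoryConnector):
    """Toy backing store standing in for Subversion or CVS access."""

    def __init__(self, history):
        # history: {path: [(author, revision, [related files]), ...]}
        self.history = history

    def authors_of(self, path):
        return [(author, rev) for author, rev, _ in self.history.get(path, [])]

    def related_files(self, path):
        related = set()
        for _, _, files in self.history.get(path, []):
            related.update(files)
        return sorted(related)
```

A Subversion- or CVS-backed implementation would answer the same two queries by consulting the repository's revision log instead of an in-memory dictionary.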
Chapter 5
Validation
Ideally, we would validate EEL by gathering statistics about the accuracy of EEL’s recommendations as developers use the tool as part of their daily work. Such an evaluation requires a moderately-sized, preferably distributed, development team. Engaging such a team in an evaluation is difficult without any proven information about the effectiveness of the technique. To provide initial evaluation information, we have thus chosen to apply the approach to the history of existing open-source systems. We use information about the revisions to files stored in the version control system of a project to drive our approach. We use the communication patterns recorded on bug reports as a partial glimpse into the collaborations that actually occurred between the developers. Because we have only a glimpse into the communication that occurred during the project, the results we provide in this section are essentially a lower bound on the accuracy of the recommendations provided.
For validation purposes, we created an extension to EEL that allowed for the mining of Subversion¹ repositories. Subversion was used for two reasons. First, the early access version of Jazz had limitations that prevented us from importing an active open-source project, limiting the size of project we could use for validation². Second, Subversion retains information about change sets stored in the repository, unlike CVS, which stores single file modifications; we were able to access and create Subversion repositories for a variety of open-source projects, enabling a more thorough evaluation.
5.1 Methodology
Our validation method involved selecting a bug of interest and recreating the development state at that time by considering only source code revisions that were committed before the bug was closed. We used a determination of the files required to fix the bug to populate the matrices and determine the recommendations. We then compared our list of experts to those who had communicated on the bug report, as determined through comments posted to the bug report. Since the communication recorded on a bug report largely discusses the issue underlying the report, the developers involved in this discussion either have
¹ http://subversion.tigris.org/, verified 01/08/07.
² The Jazz system has not yet been released and only limited development data from Jazz itself is becoming available now.
expertise in the area or gain expertise through the discussion.³
To perform this validation, we needed to determine a set of bugs with a sufficient number of recorded comments to infer communication amongst developers and with associated revisions of the source files that “solved” the bug, the resolving change set for the bug. We searched through all of the bugs marked as resolved and fixed for reports with ten or more comments where at least five different developers had recorded comments; Appendix B provides a description of how we determined whether an entered comment represented a developer. For the validation, we retained only the comments provided by developers, discarding the others as they are not relevant to providing a lower bound on the developer communication. We used a standard approach (see Section 5.2) to determine the resolving change set for a bug and ensured that all change sets contained between three and nine files. We chose a range of change set sizes to enable evaluation across a range of situations.
EEL is intended to be used as development proceeds. To mimic development in this validation, we used the following process:
• Create three subsets of the resolving change set
Given the resolving change set for the solved bug, we create three change-set-sized subsets (1/3 of the change set, 2/3 of the files, and the entire change set) to test how well EEL performs in finding experts given less information than what is needed to fix the given bug. We choose the files for each subset randomly with the constraint that at least one file in each subset must not be an initial revision when the bug was fixed, to ensure that we have some history from which EEL can recommend emergent team members. Random subset formation is necessary since we do not know in what order a developer may have modified the files used to solve the bug.
• Partition the comments in the bug into three groups
We partition the comments in the bug into three approximately equal groups based on the date of the comment. The first group has the oldest comments while the last group contains the newest ones, enabling us to mimic how development occurs. The bugs could instead have been partitioned based on a date span, but this would have produced partitions of varying size and could have produced poor results depending on the frequency and speed of communication on the report. For example, with a one-week time period, one bug may have been fixed within a week, placing all of its comments in one partition, while another bug may have taken two years to fix and been commented on only once per month, leaving each comment in its own partition. Also, as this example illustrates, the number of partitions would vary per bug, making it difficult to compare the results.
Furthermore, if a comment partition has no developers communicating within it, the entire bug is discarded. When no developers communicate,
³ Communication on a bug report unrelated to the underlying issue is typically moved into another bug report.
the precision would always be 0% and the recall would be incomputable since it would require dividing by 0 (see below). To rectify this situation, we chose to discard these bugs from our final dataset.
• Apply EEL to each combination of comment partitions and change set subsets
We apply EEL to each of the nine cases resulting from combining the comment partitions with the file revision subsets (see Figure 5.1) and evaluate the precision and recall of the recommendations produced by EEL. Specifically, for each case we:
1. find the last revision of each file (in the change set subset) before the earliest comment in the comment partition,
2. apply EEL to the file revisions in the change set subset obtained in the previous step to produce an ordered list of emergent team members,
3. determine all of the commenters in the bug partition, forming the set of relevant developers (see Appendix B for details on how we determined which comment corresponded to a developer), and
4. compute the precision, representing the percentage of correctly identified team members (5.1), and the recall, representing the percentage of potential team members correctly identified (5.2).
Precision = # Appropriate Recommendations / Total # Recommendations        (5.1)

Recall = # Appropriate Recommendations / # Possibly Relevant Developers    (5.2)

The # Appropriate Recommendations is the number of developers recommended by EEL that commented on a bug report, the Total # Recommendations is the number of recommendations that EEL made, and the # Possibly Relevant Developers is the number of developers that communicated on the bug report.
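The per-case bookkeeping above can be sketched as follows. This Python sketch is illustrative, assuming comments are simple (date, developer) pairs; it implements the comment partitioning and equations (5.1) and (5.2) directly.

```python
def partition_comments(comments, parts=3):
    """Split (date, developer) comments into `parts` roughly equal
    groups ordered oldest to newest, as in the partitioning step."""
    comments = sorted(comments)
    size, rem = divmod(len(comments), parts)
    groups, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < rem else 0)
        groups.append(comments[start:end])
        start = end
    return groups

def precision_recall(recommended, relevant):
    """Equations (5.1) and (5.2): the fraction of recommendations that
    commented on the bug, and the fraction of commenters that were
    recommended. Returns None where the denominator would be zero."""
    appropriate = len(set(recommended) & set(relevant))
    precision = appropriate / len(recommended) if recommended else None
    recall = appropriate / len(relevant) if relevant else None
    return precision, recall
```

Returning None for an empty denominator mirrors the incomputable-recall situation that motivated discarding bugs with empty comment partitions.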
In our validation, we also compare EEL against the “Line 10” rule that was used in the Expertise Recommender (ER) [13]. Under the “Line 10” rule, the last person who modified a file is considered the expert in that file. ER extends this approach to rank all of the developers who have modified the file by their last edit date. If multiple files are selected, the expert lists are computed separately for each file and then intersected to produce the final expert list.
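A minimal sketch of the “Line 10” rule as described above (ranking by last edit date, then intersecting per-file lists) might look like the following; the data layout is an assumption for illustration, not ER's actual implementation.

```python
def line10_experts(histories):
    """The "Line 10" rule as described above: rank each file's authors
    by last edit date (newest first), then intersect the per-file
    lists, preserving the first file's ranking. `histories` maps a
    file path to its [(date, author), ...] revisions."""
    per_file = []
    for revisions in histories.values():
        last_edit = {}
        for date, author in revisions:
            last_edit[author] = max(date, last_edit.get(author, date))
        per_file.append(sorted(last_edit, key=last_edit.get, reverse=True))
    if not per_file:
        return []
    # Intersect all per-file expert lists.
    common = set(per_file[0]).intersection(*map(set, per_file[1:]))
    return [author for author in per_file[0] if author in common]
```

Note that an empty intersection yields an empty expert list, which is exactly how the empty “Line 10” recommendations discussed later in this chapter arise.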
After a preliminary run of the validation on EEL, we noticed that some of the precision and recall values were 0%. Further investigation revealed that the files changed to fix a bug may not have been created prior to the dates
Figure 5.1: The 9 cases for validation. Lines in this diagram represent a combination of comment partition and change set subset for validation purposes.
of earlier comments. This means that EEL was unable to obtain any historical data for the file, since the file did not exist, and was therefore unable to produce a list of experts. This situation can occur even if the file is not new as of that change set: since development is ongoing, the file that needed to be changed may have been added after the bug was commented on but before it was fixed. For these cases, we report an optimistic-case and a pessimistic-case precision and recall. The optimistic-case precision and recall is 100%, which is appropriate since, if the file did not exist at the time, producing no experts is correct. On the other hand, since EEL was unable to produce any experts, we can consider the pessimistic case and assume a precision and recall of 0%. These optimistic and pessimistic values are used only if EEL is unable to produce a list of experts. If EEL is able to produce a list of experts, only one precision and recall value is given, computed using equations 5.1 and 5.2 as described earlier.
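The optimistic/pessimistic bookkeeping can be made concrete with a small helper. This is a hypothetical sketch of the scoring rule described above, not code from EEL's validation harness.

```python
def score_case(recommended, relevant):
    """When no experts can be produced (no history for the files),
    record the optimistic (100%) and pessimistic (0%) bounds;
    otherwise both bounds equal the measured precision and recall."""
    if not recommended:
        return {"optimistic": (1.0, 1.0), "pessimistic": (0.0, 0.0)}
    appropriate = len(set(recommended) & set(relevant))
    precision = appropriate / len(recommended)
    recall = appropriate / len(relevant) if relevant else 0.0
    return {"optimistic": (precision, recall),
            "pessimistic": (precision, recall)}
```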
Another thing that we noticed in the preliminary run of the validation was that the performance of the recommendations of EEL was lower than that of the “Line 10” rule for some projects. Investigation into this problem revealed that many of the developers contributing to these projects committed only a few changes during a small time period and then never committed again. Since EEL considers the entire project history, the inclusion of authors who are no longer active affects EEL’s recommendations. The “Line 10” rule is not impacted by this case since it ranks experts by their last authorship date, ensuring that the most recent authors are recommended first. In contrast, EEL uses all authors
and related files to determine the recommended set of experts, resulting in the potential for an author who was previously highly active on the project, but has not been active recently, to be recommended over a more recently active developer. To ensure EEL provides relevant recommendations, we apply it to only the last twelve months of development history. To be fair in our comparison, we use the same underlying data for the “Line 10” rule. The use of a limited amount of recent data from the project archives has limited effect on the results of applying the “Line 10” rule because this approach already ranks the most recent activity higher.
5.2 Data
We used three existing open-source software projects in our validation: the Eclipse project⁴, Firefox⁵ and Bugzilla⁶. Eclipse is an open-source platform for integrating tools implemented in Java, Firefox is a popular open-source web browser, and Bugzilla is an open-source issue tracking system. As Firefox and Bugzilla are part of the Mozilla Foundation, they use the same version control and issue tracking systems. These projects were chosen because each has a number of developers who have been active in committing code and bugs in its history, and there is a sufficient amount of data to run the validation after we apply the constraints we outlined in the previous section. Furthermore, these projects use the repository types our infrastructure supported: CVS for source control and Bugzilla for issue tracking. These two data sources are used to populate EEL and validate its recommendations.
To form appropriate change sets for Eclipse, we obtained an archive copy of the CVS repository from the Eclipse.org archive site⁷ and imported it into Subversion using cvs2svn. We performed a similar operation on the Mozilla CVS repository⁸. Cvs2svn is a Python script developed alongside Subversion (SVN) by the Tigris.org⁹ community. This script must be run on the system on which the CVS repository resides since it directly reads the RCS files. Cvs2svn makes several passes to prepare the SVN repository and to determine the change sets associated with the CVS files, ensuring that the imported data is robust and correct. Cvs2svn uses a simple algorithm, similar to that described by Mockus et al. [14], that inspects the author and log message for each revision of a file and has a notion of time. Using this information, it inspects all of the revisions of all files, grouping revisions that have the same author and log message and that occur within a five-minute window of time into a single change set. If multiple files are changed by the same author and
⁴ http://www.eclipse.org, verified 01/08/07.
⁵ http://www.mozilla.com/en-US/firefox/, verified 01/08/07.
⁶ http://www.bugzilla.org/, verified 01/08/07.
⁷ Eclipse makes archives of the CVS repository available in compressed form at http://archive.eclipse.org/arch/, verified 01/08/07.
⁸ The Mozilla CVS was obtained using rsync. Rsync is an open-source utility for file
Table 5.1: Bug and change set statistics per project.
                                                           Eclipse  Firefox  Bugzilla
Total # change sets                                         122614   174581    174581
Total # bugs resolved in last 2 years                        10013     4918      2148
Total # bugs with at least 5 developers and 10 comments        354      501       291
Total # criteria fitting bugs with reference in change log     182      283       216
Total # bugs with reference and correct change set size         49       70        81
Total # bugs excluded due to no developers in partition          1        2         0
with the same comment but span more than five minutes, multiple change sets are created; hence a single change made to the system may be split into two or more change sets. Furthermore, this algorithm could cause unrelated files to be grouped as a single change set, but it was noted in the cvs2svn design notes that this will only happen when insufficiently detailed log messages, like “changed doc” or an empty message, are coupled with multiple commits performed quickly [3]. The design notes also state that if a log message is insufficiently detailed, then the change must not be important and there is no real harm if the files are grouped [3].
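The grouping algorithm described above can be sketched as follows. This is an illustrative reimplementation of the described behavior, not cvs2svn's actual code; it measures the five-minute window from the first revision in the current group, which is one plausible reading of the description.

```python
from itertools import groupby

WINDOW = 5 * 60  # five-minute window, in seconds

def group_change_sets(revisions):
    """Group per-file revisions into change sets: same author, same
    log message, and commit times within the window. `revisions` is
    a list of (timestamp, author, message, path) tuples."""
    revisions = sorted(revisions, key=lambda r: (r[1], r[2], r[0]))
    change_sets = []
    for _, revs in groupby(revisions, key=lambda r: (r[1], r[2])):
        current, start = [], None
        for timestamp, _author, _message, path in revs:
            if start is None:
                start = timestamp
            elif timestamp - start > WINDOW:
                # Same author and message but too far apart in time:
                # split into a new change set.
                change_sets.append(current)
                current, start = [], timestamp
            current.append(path)
        if current:
            change_sets.append(current)
    return change_sets
```

The split on the time window is exactly what causes a single logical change to be divided into two or more change sets, as noted in the text.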
In selecting bugs for the validation, we used those marked as closed and fixed within the past two years. These bugs were obtained by downloading the XML version of all bugs fitting these criteria from the respective projects’ Bugzilla databases. Our selection criteria ensure that the fix is fairly recent and that it was actually committed to the repository. If the status of a bug was not closed and fixed, it could still be under development, be a duplicate of another bug with an unknown status, or be marked as not a bug.
Since most open-source projects associate a bug with a single change set, many of them require that the identifying number of the fixed bug be entered into the comment of a commit to the version control system. Using this knowledge, we were able to map a bug to a single commit so that we could recreate the development state needed to fix the bug in question. To perform this mapping, we searched each of the log messages obtained from Subversion (one per revision) for a reference to one of the bugs usable for validation. This was done by creating a log of all of the change sets stored in the version control system, along with the files that were changed, and searching for a string indicating that the change set corresponds to a bug fix, for example, “bug 321”, “fix 321” or just “321”. We then matched logs containing a reference to a bug against the bugs we had determined were appropriate for validation. Any bug that did not have a reference in a log message was discarded from the validation.
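The log-message matching could be sketched as a simple regular-expression scan; the exact patterns the validation used may differ, so treat this as a heuristic illustration.

```python
import re

def referenced_bugs(log_message, candidate_ids):
    """Scan a commit log message for references to known bug numbers,
    matching patterns like 'bug 321', 'fix 321' or a bare '321'."""
    found = set()
    for bug_id in candidate_ids:
        # Word boundaries stop '321' from matching inside '4321'.
        pattern = r"\b(?:bug\s+|fix\s+)?%d\b" % bug_id
        if re.search(pattern, log_message, re.IGNORECASE):
            found.add(bug_id)
    return found
```

Restricting the scan to the set of candidate bug numbers keeps a bare number like “321” from matching arbitrary digits elsewhere in the log.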
Table 5.1 provides some statistics on the size of the data sets that were used. The total number of change sets presented in Table 5.1 for Firefox and Bugzilla is the same since both are contained in the same source code repository.
5.3 Results
Since EEL can produce a varying number of recommendations, we computed the precision and recall for three different sized lists of potential team members, namely three, five and seven recommendations. These lists were obtained by taking the top of the ordered lists produced by EEL and the “Line 10” rule. We varied the number of recommendations to investigate the impact of list size on the performance of EEL.
Figures 5.2, 5.3, 5.4, 5.5, 5.6 and 5.7 present a subset of the optimistic precision and recall values using the Eclipse, Bugzilla and Firefox datasets. The presented results represent the more interesting subset of the data that was collected. This subset presents the results for all of the time frames provided by the bug partitions, but only 1/3 and 2/3 of the files in the change set. Furthermore, the figures present only the results when five developers were recommended. We chose to focus our presentation on these cases, as they represent a developer looking for expertise prior to a problem being fixed and because the recommendation list is of a reasonable size for a developer to consider. Appendix A provides both the optimistic and pessimistic results of all 27 test cases for each of the projects.
The results are presented in box-and-whisker plots. These plots assist in viewing the distribution of the results. The shaded box in the plot represents the second and third quartiles of the data set, whereas the lines extending above and below it, the whiskers, represent the fourth and first quartiles respectively. The large black dot represents the average value of the data and the line represents the median. Any small unshaded circles located above or below the whiskers represent outliers well outside the range of common values. On the results graphs, the y-axis represents the percentage value of the precision or recall. The x-axis separates each of the cases we are interested in. A label on the x-axis that reads T1,1/3 represents the first bug partition (T1) and 1/3 of the files in the subset (1/3).

Table 5.2 shows the overall average optimistic precision for both EEL and the “Line 10” rule for all three of the projects used in the validation. Table 5.3 presents the average optimistic recall for the three tested projects.
Figures 5.2 and 5.3 present the optimistic precision and recall values for Eclipse. The overall average optimistic precision and recall of EEL for this project are 37% and 49% respectively, compared to 28% and 35% for the “Line 10” rule.

Figures 5.4 and 5.5 present the optimistic precision and recall values for Bugzilla. The overall average optimistic precision and recall of EEL are 28% and 38% respectively, compared to 23% and 28% for the “Line 10” rule.

Figures 5.6 and 5.7 present the optimistic precision and recall values for Firefox. The overall average optimistic precision and recall of EEL are 16% and 21% respectively, compared to 13% and 16% for the “Line 10” rule.
In each of the three projects tested, EEL produces higher precision and higher recall than the “Line 10” rule. On average, in 88% of the 27 different test cases for each of the three projects, EEL produced a higher precision and
Table 5.2: Optimistic average precision.
Project    EEL   Line 10
Eclipse    37%   28%
Bugzilla   28%   23%
Firefox    16%   13%
Table 5.3: Optimistic average recall.
Project    EEL   Line 10
Eclipse    49%   35%
Bugzilla   38%   28%
Firefox    21%   16%
recall than the “Line 10” rule. This shows that EEL produces better results than the “Line 10” rule.
It is an open question whether these precision and recall values are sufficient to create an effective tool for recommendations. We are optimistic that an effective tool can be based on this approach because McDonald’s study found that people working on the project generally agreed with the recommendations provided [10]. Knowing that the “Line 10” rule performs well under strict testing, we believe that our results show that EEL provides better expertise recommendations than the “Line 10” rule.
Furthermore, there were some cases where EEL was able to produce a list of experts when the “Line 10” rule recommended an empty list. Recommending an empty list of experts does not help developers find expertise, forcing them to modify the information they are interested in until they find an expert that might be of interest. EEL always produced a recommendation if there was history in the repository for it to use. For Eclipse, 6.5% of the cases we tried resulted in the “Line 10” rule being unable to produce a result when EEL could; for Bugzilla, this occurred 15.0% of the time, and for Firefox, 14.9%. The “Line 10” rule produces an empty recommendation list because it performs an intersection of the authors when there are multiple files in the change set. This situation can occur when the files are relatively new, or when new dependencies have been added within the files that are not reflected in the history of the project.
The difference between the optimistic and pessimistic values gives an insight into the number of times that there was insufficient history for either EEL or the “Line 10” rule to produce a recommendation.
Figure 5.2: Eclipse optimistic precision.
Figure 5.3: Eclipse optimistic recall.
Figure 5.4: Bugzilla optimistic precision.
Figure 5.5: Bugzilla optimistic recall.
Figure 5.6: Firefox optimistic precision.
Figure 5.7: Firefox optimistic recall.
5.4 Threats
Several factors could affect the construct, internal and external validity of our study of EEL.
5.4.1 Construct Validity
Construct validity considers whether the measures used in a study represent the concept being studied. A potential threat to the construct validity of our validation is how we determined the experts to whom we compare the recommendations made by the two approaches. We used the developers who commented on the bug report as the experts for the area of the system in question. However, it is possible that the comments posted to the bug report were not related to the bug or were not technical in content. Either situation would mean that those we consider experts may not actually be experts in that part of the implementation of the system. As we described in Section 5.1, these situations are unlikely given how the bug reporting system is used in practice. In the open-source community, bug reports are used as a collaboration device to track the technical communication surrounding the bug report. For instance, in the open-source communities we studied, if unrelated issues arise in a report, they are generally continued within a new bug report or through other communication means. The result is that the commenters on the bug report are potential experts since they are commenting on the technical aspects of the bug.
Our use of the bug report comment data is a lower bound on the communication that occurred during the development of the system. As a result, we may not have a complete list of the experts, or a ranking of the importance of each of the developers with respect to a bug. A user study in which developers could use their knowledge regarding expertise and provide a more detailed view of the correctness of EEL would have better construct validity. The validation presented here is intended as a preliminary study to show the effectiveness of EEL so that it can be deployed to a development team.
Another potential threat to the construct validity of our study is that the version control system may contain non-technical changes to the system. These kinds of changes would mean that a developer committing to a part of the system made a change that did not require knowledge of the area (i.e., changing the licence agreement that appears at the top of each file). As we previously mentioned, these types of commits are generally large in size, and because of this, EEL ignores change sets with more than 50 related files. This approach helps ensure that these non-technical changes to the system are not used in the expertise recommendation. Furthermore, since EEL uses the frequency of the commits as a factor in recommending experts, if a few of these commits are included, there is a high probability that they will not affect the recommendations.
5.4.2 Internal Validity
Internal validity ensures that there is a causal relationship between the method being studied and the results. The username-to-e-mail mappings could threaten the internal validity of the evaluation of EEL. Mapping between e-mail addresses and usernames is a difficult problem with no easy solution. Since we are comparing the performance of EEL to the performance of the “Line 10” rule, this should not affect the outcome: both methods use the mappings similarly and compare against the same set of potential experts. This means that if one of the mappings is incorrect, it will affect both systems equally, therefore not affecting the comparison between the two.
5.4.3 External Validity
External validity concerns the generalizability, or applicability, of the results to other situations. One potential threat to external validity is that we only considered open-source projects. This could be an issue since the processes used in a corporate environment might differ from those of an open-source community. As the projects we studied involve professional developers and the systems developed are of high quality, we believe that the processes used and the structure of the projects are similar to those in a corporate situation.
Another threat to external validity is the size of the teams we investigated. We believe our approach is better suited to large teams and targeted large teams in the evaluation, with the systems studied having between 200 and 800 committing developers.
Finally, the use of mature projects could be a threat to the generalizability of the validation. We feel that this is not a threat since a new project does not contain the information needed to provide recommendations. There is no easy way to test the validity of a recommendation tool on a new project: the project history is non-existent, so no information can be collected about the experts of the system. Furthermore, a recommender would produce moot results since only a few people have edited the files, meaning that they are the creators of the files and therefore the experts. On a new project, a profile-based expertise recommender would perform best since developers can list their areas of expertise.
Chapter 6
Discussion
In this chapter, we consider both extensions and limitations to EEL, as well as describe the next steps in evaluating the approach.
6.1 Other Sources of Information
There are several data sources other than source revision information that EEL could use to mine for producing the emergent team recommendations, including bug reports and the developer’s own activity. We chose to use revision information because our motivating use case is to recommend team members with expertise on the code. The best information for this comes from developers who have demonstrated work on the code.
Bug report information may be useful for augmenting the source revision information. To do this, a few changes would need to be made to EEL. First, EEL would need to understand the bug that the user is currently working on and have the ability to extract data from the bug tracking system. Second, a mapping between the source control repository’s e-mail and username and the bug tracking system’s e-mail and username would be needed so that the users’ expertise rankings could be unified within EEL. When using a tool like Jazz, this mapping is not needed since names are unified throughout all of the components of the software life cycle. For EEL to properly utilize bug data, it needs dependency information to gather the history of past bug fixes, but this information is sparse and often non-existent since developers do not properly record it. Some tools are being developed to attempt to automatically determine bug similarity for finding duplicate and dependent bugs (e.g., [5]), but these technologies are not yet mature enough to reliably produce correct results for expertise locating.
Alternatively, a developer’s interaction with the system, similar to that collected by Mylar [7], could be helpful in two ways. First, Mylar’s task contexts capture information about which files a developer referred to when performing a task. Files to which a developer referred often may also be areas of the system in which the developer is knowledgeable and for which the developer could be considered a member of the emergent team for that file. Second, Mylar’s task information could be used to maintain EEL’s matrices on a per-task basis. This would mean that when a user changes the task they are working on, they would start with an empty list of experts until they begin investigating the code, ensuring that the list of experts would be directly related to that
task and not the previous one. However, we did not implement this: because the number of related files is large, it would quickly clear the matrices, and therefore the expert list, as the user works. A drawback of this approach is that when a task is started there is no information with which to recommend experts, leading to fewer recommendations early in a task’s life cycle, when a developer may most need a recommendation.
6.2 Using Emergent Team Information
There are some other potential uses for emergent team information besides recommending experts. Emergent team information could also be used to look at churn or to determine the load on a particular developer on a project.
One use of emergent team structure is to examine the churn of a project or component. Churn occurs when an entire team (or most of a team) continues to change over time. This means that the team is always composed of new developers who must learn the area of the system, resulting in a lack of experts for that part of the system. A project with little churn normally means that the development team is stable and retains knowledge of the project. A manager could use this information to determine problem areas in a project and redistribute developers to fix the problem. By periodically determining the emergent team for an area of a system, one could compare the current team with previous team snapshots to determine the size or frequency of change on the emergent team. If a large number of developers on a team leave or if the team changes frequently, it could indicate a problem.
Emergent team structure could also be used to determine the load on a developer. A manager could use this information to reduce the load on one developer by reassigning work to others. To determine the load of a developer, the emergent teams of different areas of a project could be analyzed. If a developer belongs to a larger number of teams than other developers, they probably carry a higher load and are key to the success of the project.
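Counting team memberships in this way is straightforward; the following sketch (with illustrative data, not taken from the thesis) shows one possible form:

```python
from collections import Counter

def developer_load(emergent_teams):
    """Count how many emergent teams each developer belongs to.
    `emergent_teams` maps an area of the system to its set of developers."""
    load = Counter()
    for members in emergent_teams.values():
        load.update(members)
    return load

teams = {
    "ui":      {"alice", "bob"},
    "core":    {"alice", "carol"},
    "network": {"alice", "bob"},
}
# alice appears in all three teams, suggesting a comparatively high load.
print(developer_load(teams).most_common(1))  # [('alice', 3)]
```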
6.3 Future Evaluation
The next step in evaluating EEL is to deploy the tool into an active development project. Ideally, this project would have a relatively large code base (e.g., one million lines of code), follow some agile practices, and communicate primarily through electronic means. We would like the team to follow some agile practices so that team membership is dynamic and EEL can be useful, rather than merely recommending developers the team already knows to be experts. It would be beneficial if the communication were done through electronic means so that we could track and analyze the communication that was initiated through EEL.
The focus of this evaluation should be on both the correctness of EEL’s recommendations and the users’ experience. Data should be collected on an ongoing basis when the tool is deployed. Information about how often EEL is used to obtain a recommendation, along with how often communication is initiated from EEL, could be collected and analyzed. To further assess accuracy, a more formal study could be performed along the lines of the one performed by McDonald [10]. That study involved having users rank a list of developers who could be experts on the system in question. The rankings provided by the users are then compared to the tool’s recommendations to determine the accuracy of the tool. This second study could supplement the results collected from the deployment of EEL and provide statistical evidence for EEL’s performance.
Second, the experience of using EEL should be evaluated. The goal of EEL is to display the expert recommendations, and to let communication be initiated, in the simplest way possible. This evaluation could run in parallel with the correctness study when EEL is deployed. Users could compare the usability of EEL to their previous methods of expertise location and communication tools. Furthermore, the evaluation could involve a separate study that compares the current way EEL provides the list of experts against another approach, such as manually configured buddy lists.
Finally, since we discovered that the performance of EEL is highly dependent on the amount of history data that it uses, it would be beneficial to investigate this further. This would require testing EEL using different lengths of time to examine in the software repository. We chose twelve months to increase the recency of the data being used, but a longer or shorter period may produce better results. We recognize that there is a high probability that this factor depends on the project. When inspecting the Eclipse results, using the entire history or the last twelve months produces similar results with EEL, whereas with the Mozilla projects there was a large change in the results. It would be beneficial if there were an automated way to determine this value on a per-project basis so that the customization could be made available in EEL.
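Sweeping the history-window length could be automated along the following lines; the commit representation and dates are illustrative, not taken from the thesis data:

```python
from datetime import datetime, timedelta

def commits_in_window(commits, end, months):
    """Filter a commit log to the last `months` months before `end`.
    Each commit is a (timestamp, author, file) tuple; a month is
    approximated as 30 days for simplicity."""
    start = end - timedelta(days=30 * months)
    return [c for c in commits if start <= c[0] <= end]

end = datetime(2006, 12, 31)
commits = [
    (datetime(2006, 11, 1), "alice", "Foo.java"),
    (datetime(2005, 3, 15), "bob", "Foo.java"),
]
# Rebuilding EEL's matrices from each window and measuring precision/recall
# would show how sensitive a given project is to the window length.
for months in (6, 12, 24):
    print(months, len(commits_in_window(commits, end, months)))
```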
6.4 Limitations
A limitation of EEL’s approach is that it is unable to easily work with many traditional version control systems such as CVS and RCS. This limitation arises because these systems maintain commit information on a per-file basis and therefore record no information about the other files that changed along with a given file. This means that we are limited to newer version control systems, such as Subversion and Jazz, that support atomic commits across a number of files. Tools and methods exist for extracting change sets from CVS, but it is infeasible to run them every time data is mined for a file. As an alternative, an external tool could be run periodically to extract this information, but this is an intensive operation and would therefore create the need for a server-based approach, which we are attempting to avoid. Many open-source projects still use CVS, but Subversion is gaining popularity and some major open-source projects (such as Apache and Gnome) have either migrated or have plans to migrate to
this new repository system in the near future.

Another limitation of EEL is that if a new developer who is already an expert is added to the team, there is no support to ensure that this person is correctly recommended. If a client-server approach were used, a simple skew or replacement value could be added to augment the recommendations so that the new developer is recommended. This method could also be used if a developer leaves the team. To solve this within EEL, the ability to personalize the recommendations could be added. One personalization could be the ability to substitute an expert recommended by EEL with a different expert specified by the user. This could be done to replace a developer who left the project with a new hire. Another use of this type of personalization would be to augment the recommendations based on the social structure of the team: a developer may prefer to talk to an expert they know over another member with similar knowledge. Another way to address this limitation would be to weight recent development activity more heavily than older information. A developer who has worked on a file recently could then be rated higher even if they are new to the team. To implement this mechanism, an in-depth experiment would need to be performed to determine whether it is feasible.
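The recency-weighting idea could take many forms; one common choice, sketched here purely as an illustration (the half-life value is an arbitrary assumption, not something EEL uses), is exponential decay of change counts:

```python
from datetime import datetime

def weighted_expertise(changes, now, half_life_days=90.0):
    """Score each author by exponentially decayed change counts, so a
    newcomer who worked on a file recently can outrank an inactive
    long-time author. `changes` is a list of (author, timestamp) pairs;
    the 90-day half-life is an illustrative guess."""
    scores = {}
    for author, when in changes:
        age = (now - when).days
        scores[author] = scores.get(author, 0.0) + 0.5 ** (age / half_life_days)
    return sorted(scores, key=scores.get, reverse=True)

now = datetime(2007, 1, 1)
changes = [
    ("veteran", datetime(2005, 1, 1)),    # three old changes...
    ("veteran", datetime(2005, 2, 1)),
    ("veteran", datetime(2005, 3, 1)),
    ("newcomer", datetime(2006, 12, 20)), # ...versus one recent change
]
print(weighted_expertise(changes, now))  # ['newcomer', 'veteran']
```

With a 90-day half-life, one change from two weeks ago outweighs three changes from two years ago, which is exactly the behaviour the paragraph above asks for.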
Chapter 7
Summary
To build successful complex software systems, software developers must collaborate with each other at all stages of the software life-cycle. Current development tools facilitate this collaboration by integrating communication tools, such as chat and screen sharing, within the development environment. Integrating these tools has made it easier for developers to communicate, but as the composition of software teams becomes increasingly dynamic, it may not be straightforward for a developer to keep up-to-date a description of colleagues on the many teams in which she may work. This leaves the developer to determine with whom she should collaborate in a particular situation, forcing her to have some knowledge of who has expertise on particular parts of the system.
EEL mitigates these problems by determining the composition of the team automatically so that developers do not need to spend time configuring membership lists for the many teams to which they may belong. This is done by combining the context from which the developer initiates communication with the project history to produce a recommendation of experts related to the area in which the developer is currently working.
Using an automated validation and historical data from three different open-source projects, we found that EEL produces higher precision and higher recall than the “Line 10” rule. The results are promising, but given the limited availability of expertise information, EEL still needs further validation with a team of developers.
Bibliography
[1] John Anvik, Lyndon Hiew, and Gail C. Murphy. Who should fix this bug? In Proceedings of the 28th International Conference on Software Engineering, pages 361–370, 2006.

[2] Marcelo Cataldo, Patrick Wagstrom, James Herbsleb, and Kathleen Carley. Identification of coordination requirements: Implications for the design of collaboration and awareness tools. In Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work, pages 353–362, 2006.

[3] cvs2svn Community. How cvs2svn works. http://cvs2svn.tigris.org/svn/cvs2svn/trunk/design-notes.txt, 2006.

[4] James D. Herbsleb and Deependra Moitra. Global software development. IEEE Software, 18(2):16–20, Mar/Apr 2001.

[5] Lyndon Hiew. Assisted detection of duplicate bug reports. Master’s thesis, University of British Columbia, 2006.

[6] Jim Highsmith and Alistair Cockburn. Agile software development: The business of innovation. IEEE Computer, 34(9):120–127, Sept 2001.

[7] Mik Kersten and Gail C. Murphy. Using task context to improve programmer productivity. In Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 1–11, 2006.

[8] Roger Lewin and Birute Regine. Complexity and business success. http://www.psych.lse.ac.uk/complexity/Seminars/1999/report99oct.htm, 1999.

[9] Greg Madey, Vincent Freeh, and Renee Tynan. The open source software development phenomenon: An analysis based on social network theory. In Americas Conference on Information Systems (AMCIS 2002), pages 1806–1813, 2002.

[10] David W. McDonald. Evaluating expertise recommendations. In GROUP ’01: Proceedings of the 2001 International ACM SIGGROUP Conference on Supporting Group Work, pages 214–223, 2001.
[11] David W. McDonald. Recommending collaboration with social networks: A comparative evaluation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 593–600, 2003.

[12] David W. McDonald and Mark Ackerman. Just talk to me: A field study of expertise location. In CSCW ’98: Proceedings of the 1998 ACM Conference on Computer Supported Cooperative Work, pages 315–324, 1998.

[13] David W. McDonald and Mark S. Ackerman. Expertise Recommender: A flexible recommendation system and architecture. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work, pages 231–240, 2000.

[14] Audris Mockus, Roy T. Fielding, and James D. Herbsleb. Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3):1–38, 2002.

[15] Audris Mockus and James D. Herbsleb. Expertise Browser: A quantitative approach to identifying expertise. In ICSE ’02: Proceedings of the 24th International Conference on Software Engineering, pages 503–512, 2002.

[16] Ramon Sanguesa and Josep M. Pujol. NetExpert: A multiagent system for expertise location. In International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Organizational Memories and Knowledge Management, pages 85–93, 2001.

[17] Xiaodan Song, Belle L. Tseng, Ching-Yung Lin, and Ming-Ting Sun. ExpertiseNet: Relational and evolutionary expert modeling. In Proceedings of the Tenth International Conference on User Modeling, pages 99–108, 2005.

[18] Dawit Yimam-Seid and Alfred Kobsa. Expert finding systems for organizations: Problem and domain analysis and the DEMOIR approach. Journal of Organizational Computing and Electronic Commerce, 13(1):1–24, 2003.

[19] Annie T.T. Ying, Gail C. Murphy, Raymond Ng, and Mark C. Chu-Carroll. Predicting source code changes by mining change history. IEEE Transactions on Software Engineering, 30(9):574–586, Sept. 2004.
Appendix A
Complete Results
Figures A.1 through A.12 present detailed results of the validation of EEL. Each figure presents 27 different cases that were tested during the validation of EEL. Each test case represents a unique combination of the bug comment partition, the size of the change set used to make a recommendation, and the number of recommendations made. Table A.1 describes the combinations for each of the test cases. Section 5.3 has a detailed description of how to read the box-and-whisker diagrams presented here.
Table A.1: Mapping of test case number to unique combination of bug comment partition, size of change set and number of recommendations.
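The 27 test cases arise from crossing three levels of each of the three factors. The factor levels below are placeholders only; the actual values are those listed in Table A.1:

```python
from itertools import product

# Illustrative factor levels (three of each), not the values from Table A.1.
partitions       = ("partition-1", "partition-2", "partition-3")
change_set_sizes = (1, 5, 10)
num_recommended  = (1, 3, 5)

# Crossing three 3-level factors yields the 3 x 3 x 3 = 27 test cases.
test_cases = list(product(partitions, change_set_sizes, num_recommended))
print(len(test_cases))  # 27
```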
Appendix B

Name Mapping Method

In the validation of EEL we needed to determine the set of developers working on the system. At first, this step would seem to be simple: match the e-mail addresses used in the bug repository to the usernames from the source code repository. We needed to map e-mail addresses to usernames since the validation required that we compare information between two systems with disjoint username schemes. Bugzilla requires users to use their e-mail address as their login name, whereas the source code repository has an independent set of usernames that are normally not based on e-mail addresses. Unfortunately, as described elsewhere [1], this mapping is non-trivial. To produce an initial mapping, we applied a longest-substring matching algorithm to the usernames and e-mail addresses. After attempting to automatically determine the mapping, the entire list was inspected and completed by hand. This approach produced a unique one-to-one mapping for many username and e-mail address pairs, but three situations remained: 1) no mapping could be determined, 2) a single username mapped to multiple e-mail addresses (one-to-many), and 3) multiple usernames mapped to one or more e-mail addresses (many-to-many).
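A longest-substring first pass of this kind could be sketched as follows. This is only one plausible reading of the approach (the thesis notes that hand inspection was still required), and the helper names and sample addresses are illustrative:

```python
from difflib import SequenceMatcher

def best_email_match(username, emails):
    """Propose the e-mail address sharing the longest common substring
    with a CVS username. A first-pass heuristic only; ambiguous or wrong
    matches still need hand inspection."""
    def longest_common(a, b):
        m = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
        return m.size
    return max(emails, key=lambda e: longest_common(username.lower(), e.lower()))

emails = ["deboer@ca.ibm.com", "akiezun@mit.edu", "kcornell@ca.ibm.com"]
print(best_email_match("kcornell", emails))  # kcornell@ca.ibm.com
```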
The first case occurred with only a small subset of names. After some investigation, we found that these users had not been active for at least three years and decided that it was acceptable, for our situation, to discard them from the recommended list of experts. We discarded this information since we were unable to validate the correctness of a recommendation if one of these users was suggested by EEL.
The second case occurred when users changed their e-mail address or used a different address depending on the parameters of the bug report (i.e., the product). This case did not have to be handled specially since EEL works with usernames rather than e-mail addresses; multiple e-mail addresses simply map to a single username.
The third case exists because both projects changed their CVS username scheme after the project was started. Eclipse was originally developed by IBM; when this transition occurred, many of the developers continued to work on the product but were given new e-mail addresses and usernames that followed the Eclipse standard. Mozilla decided to change the CVS usernames to be the e-mail address of the committer, since many of the changes were submitted as patches and this was the easiest way to track who had made each change. To solve the problem of the many-to-many mapping, we chose one of the usernames to be the “master” username. This username is the one to which all of the e-mail addresses associated with that username are mapped. Since the other usernames still existed within the software repository, we created a second mapping that mapped each username to the “master” username if one existed. The master username is used as if it were a single username mapped to multiple e-mail addresses. In this way, we ensured that the information mined from the software repository was consistent with the information that we were using from the bug reports and that we did not lose any pertinent information.

Table B.1: E-mail to CVS username mapping statistics per project.

                                       Eclipse          Mozilla
1-1 Username to E-Mail Mappings    142 ( 82.1%)    493 ( 62.1%)
1-N Username to E-Mail Mappings      3 (  1.7%)    103 ( 13.0%)
N-M Username to E-Mail Mappings     18 ( 10.4%)    178 ( 22.4%)
Usernames With No Mappings          10 (  5.8%)     20 (  2.5%)
Total Number of Usernames          173 (100.0%)    794 (100.0%)
Table B.1 provides the statistics of the username to e-mail mappings for each project. Appendix C provides a detailed list of all of the mappings used for the validation of EEL for each of the projects.
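The two-level resolution through the master username could be expressed as a simple pair of dictionary lookups. The mappings shown here are invented examples, not entries from the actual tables:

```python
def resolve(username, master_of, email_of):
    """Resolve a CVS username through the 'master' mapping (if any),
    then to the set of e-mail addresses recorded for that master."""
    master = master_of.get(username, username)  # fall back to itself
    return master, email_of.get(master, set())

# Hypothetical example: an old IBM-era username aliased to its
# post-transition Eclipse username, which owns two e-mail addresses.
master_of = {"old_ibm_name": "eclipse_name"}
email_of = {"eclipse_name": {"dev@eclipse.example", "dev@ibm.example"}}

print(resolve("old_ibm_name", master_of, email_of)[0])  # eclipse_name
```

Because every lookup funnels through the master username, commits made under either username accumulate under one identity, which is the consistency property the appendix describes.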
Appendix C
CVS to Bugzilla Name Mappings
C.1 Eclipse
The following tables provide a detailed listing of the name mappings that were used in the validation of EEL for Eclipse. Table C.1 contains all of the unique one-to-one mappings of username to e-mail address. Table C.2 contains the mappings where a single username mapped to multiple e-mail addresses, the duplicate usernames, and the usernames that we were unable to map.
Table C.1: Eclipse one-to-one username to e-mail mappings.
Table C.2: Eclipse one-to-many, duplicate and unknown username to e-mail mappings.
C.2 Mozilla
The following tables provide a detailed listing of the name mappings that were used in the validation of EEL for Mozilla. Tables C.3, C.4, C.5, C.6 and C.7 contain all of the unique one-to-one mappings of username to e-mail address. Tables C.8, C.9 and C.10 contain the mappings where a single username mapped to multiple e-mail addresses. Tables C.11 and C.12 contain the duplicate usernames. Finally, Table C.13 lists all of the usernames that we were unable to map.