
LUND UNIVERSITY

PO Box 117, 221 00 Lund, +46 46-222 00 00

Findability through Traceability - A Realistic Application of Candidate Trace Links?

Borg, Markus

Published in: ENASE 2012 - Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering

2012


Citation for published version (APA): Borg, M. (2012). Findability through Traceability - A Realistic Application of Candidate Trace Links? In ENASE 2012 - Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering (pp. 173-181). SciTePress.

Total number of authors: 1



Findability through Traceability - A Realistic Application of Candidate Trace Links?

Keywords: Traceability; Impact Analysis; Information Seeking; Findability; Human Computer Interaction; Automation

Abstract: Since software development is of a dynamic nature, impact analysis is an inevitable work task. Traceability is known as one factor that supports this task, and several researchers have developed traceability recovery tools that propose trace links in an existing system. However, these semi-automatic tools have not yet proven useful in industrial applications. Based on an established automation model, we analyzed the potential value of such a tool. We based our analysis on a pilot case study of an impact analysis process in a safety-critical development context, and argue that traceability recovery should be considered an investment in findability. Moreover, several risks involved in an increased level of impact analysis automation are already plaguing the state-of-practice work flow. Consequently, deploying a traceability recovery tool involves a lower degree of change than has previously been acknowledged.

1 INTRODUCTION

Change is an inherent characteristic of the evolution of large software systems. Consequently, impact analysis, the process of determining possible effects of proposed software changes, is an inevitable work task. Conducting an impact analysis is often a labor-intensive manual process, avoided unless absolutely necessary [Bohner, 2002]. However, in development projects governed by safety regulations, impact analysis is a fundamental part of the development process, necessary for safety certification of products [IEC, 2003].

The impact analysis work task involves a high degree of information seeking, an increasingly costly activity among knowledge workers in general [Karr-Wisniewski and Lu, 2010]. Previous studies have identified this issue also in software engineering projects [Olsson, 2002, Sabaliauskaite et al., 2010]. As a large software development project constitutes a complex information landscape (i.e., thousands of artifacts such as requirements, source files, test cases and user manuals), analysing change impact is a challenging task. Thus, an important aspect of a development project is the findability it offers, defined as “the degree to which a system or environment supports navigation and retrieval” [Morville, 2005].

One way to support the information seeking is to maintain traceability, defined as “the degree to which artifacts are related” [IEEE Computer Society, 1990]. It is widely recognized as an important factor for efficient software development [Antoniol et al., 2002, Domges and Pohl, 1998]. However, maintaining trace links in an evolving system is a tedious task. To support this activity, several researchers have proposed traceability recovery (i.e., proposing trace links among existing artifacts) based on Information Retrieval (IR) approaches [Antoniol et al., 2002, Marcus and Maletic, 2003]. However, despite numerous related publications during the last decade, success stories in industrial settings are conspicuously few [Borg et al., 2012]. This makes us believe that the general expectations on the approach are too high, and that the tools should be considered from a new perspective.

The typical functionality of traceability recovery tools is to present a ranked list of candidate trace links to the user, who then gets to vet the output [De Lucia et al., 2012]. Since a human is still expected in the process, tools of this kind often claim to provide “semi-automatic” support. As automation is defined as “a device or system that accomplishes (partially or fully) a function that was previously, or conceivably could be, carried out (partially or fully) by a human” [Parasuraman et al., 2000], one might wonder to what level the tools actually automate human work. This leads us to discuss findability around the following research questions:

RQ1 What type and level of automation do state-of-the-art (SoA) traceability recovery tools offer?

RQ2 How would the introduction of a SoA traceability recovery tool change the state-of-practice (SoP) impact analysis work flow?

To tackle these questions, we conducted a pilot case study of an impact analysis process in a safety-critical development context, and assessed how SoA tool support could be applied. To structure our analysis, we applied the SWELL Automation Analysis Framework (SAAF), developed as an extension to an automation model initially proposed by Parasuraman et al. [Parasuraman et al., 2000]. Based on our findings, we argue that proposed traceability recovery tools primarily should be considered as a step towards improved findability, rather than as an attempt to generate a full set of traces.

This paper is organized as follows: Section 2 presents work related to IR-based traceability recovery and automation analyses. Section 3 describes our case study and SAAF. Section 4 reports the outcome of our analysis, and finally Section 5 concludes and outlines future work.

2 RELATED WORK

2.1 IR-based Traceability Recovery

Several researchers have proposed expressing traceability recovery as an IR problem. Most developed traceability recovery tools implement standard IR techniques based on algebraic or probabilistic models [Antoniol et al., 2002, De Lucia et al., 2005, Marcus and Maletic, 2003]. In such tools, the answer to a query is a ranked list of artifact suggestions, sorted by the level of calculated similarity (algebraic models), or probability that they are related (probabilistic models). The ranked list is analogous to the output of web search engines and enterprise search tools. Consequently, search results can be either relevant or non-relevant to the information need of the specific user.
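To illustrate how an algebraic model produces such a ranked list, the following minimal sketch ranks artifacts by TF-IDF weighted cosine similarity to a query. It is not taken from any of the cited tools; the artifact identifiers, the example texts, the smoothed IDF formula, and all function names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumed variant) keeps every weight positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_text, artifacts):
    """Return (artifact_id, similarity) pairs, most similar first."""
    ids = list(artifacts)
    token_lists = [tokenize(query_text)] + [tokenize(artifacts[i]) for i in ids]
    vectors = tfidf_vectors(token_lists)
    query_vector = vectors[0]
    scored = [(i, cosine(query_vector, vec)) for i, vec in zip(ids, vectors[1:])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical artifacts; identifiers and texts are invented for this sketch.
artifacts = {
    "REQ-12": "The controller shall close the valve when pressure exceeds the limit",
    "TC-7": "Verify that the valve closes when the pressure limit is exceeded",
    "UM-3": "How to change the display language in the operator panel",
}
ranking = rank_candidates("Valve does not close at high pressure", artifacts)
for artifact_id, score in ranking:
    print(f"{artifact_id}: {score:.3f}")
```

The sketch omits stemming, so "close" and "closes" are treated as different terms. Artifacts that share no terms with the query receive a similarity of zero and end up at the bottom of the list, where the user is unlikely to inspect them.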

A number of traceability recovery tools were developed as plug-ins. Klock et al. have developed Traceclipse, supporting trace link recovery and management within Eclipse [Klock et al., 2011]. They developed Traceclipse to be expandable, to simplify meeting future feature requests and to easily support other IR models. The functionality of the plug-in was initially evaluated, however only tool output was considered rather than human-tool interaction. Canfora and Cerulo developed Jimpa, another traceability recovery plug-in for Eclipse [Canfora and Cerulo, 2006]. They implemented probabilistic retrieval to establish links between change requests and source code, and evaluated the approach on three open source systems. De Lucia et al. developed their own Document Management System (DocMS), ADAMS, and the IR-based traceability recovery plug-in ReTrace [De Lucia et al., 2005]. Furthermore, they have evaluated their plug-in in studies with student subjects [De Lucia et al., 2009]. Falessi et al. implemented a plug-in, PROUD, to the industrial CASE tool Enterprise Architect [Falessi and Briand, 2009], and evaluated it in a controlled experiment with students and in an industrial case study. We, on the other hand, have proposed developing a traceability recovery plug-in to HP Quality Center [Borg, 2011a, Borg, 2011b]. Developing plug-ins to tools already deployed in industry enables in-vivo studies without introducing additional external tools.

2.2 Automation Analysis

Several taxonomies and frameworks have been developed to support the analysis of automation. A comprehensive overview, however from the viewpoint of manufacturing, was recently presented by Frohm et al. [Frohm et al., 2008]. In their work, they report eight different definitions of “levels of automation”. Sheridan and Verplank developed a 10-level taxonomy for automation levels in 1978 [Sheridan and Verplank, 1978]. Billings studied automation in the context of air-traffic controllers [Billings, 1997]. His work explicitly separated automation and human functions, and defined a continuum of management modes from unassisted control to fully autonomous operations. Parasuraman et al. developed a model incorporating both the type and the level of automation [Parasuraman et al., 2000]. We extend this model by adding two preceding and one subsequent analysis phases, described in detail in Section 3.2. The model by Parasuraman et al. was previously applied in human-computer interaction research to analyze adaptive automation solutions for air-traffic control [Clamann et al., 2002]. However, to the best of our knowledge, it has not been applied to analyze software engineering tools.

Huffman Hayes et al. touched upon the automation questions for traceability recovery in a technical report [Huffman Hayes et al., 2006], in which they base the discussions on their long experience of hands-on traceability activities in industry. Furthermore, they have published several studies on how engineers should work with semi-automatic tool support, including how humans interact with tools in the traceability loop [Huffman Hayes and Dekhtyar, 2005]. However, the automation analysis is not the central part of their publications, which motivated our inquiry.

3 METHOD

To concretize our discussion on automation, we applied our automation analysis on a specific case in a safety-critical development context. An overview of SAAF, the framework used for the automation analysis, is presented in Figure 1. It is based on the model by Parasuraman et al. [Parasuraman et al., 2000] mentioned in Section 2.2. Our understanding of tool support for traceability recovery originates from an extensive literature review, and the outcome of this automation analysis is intended to guide our future tool development efforts.

3.1 Safety-critical Impact Analysis - A Pilot Study

To better understand how traceability recovery can support the impact analysis process, we developed an initial model of the inherent information seeking activity based on our industrial experiences. To validate the model, we presented it to three software engineers from the case company. We communicated primarily via e-mail, and the respondents were selected using convenience sampling. However, to improve generalizability, we selected respondents representing three different development teams from two different departments. Based on the feedback received, we refined the model to the version presented in Section 4.2. Furthermore, we asked the respondents about their views on risks involved in increasing automation in the impact analysis process.

Figure 1: Overview of the automation analysis. The box represents SAAF.

3.2 Phases of SAAF

The first phase, Preliminaries, establishes the focus, i.e., the scope of activities affected as well as effect targets, for the automation effort within a context. Also, the phase clarifies any assumptions made and what is included in the analysis. The three steps of this phase describe: Context of the automation, Scope of the automation, and Effect targets.

Automation change identification, the second phase of SAAF, describes pre- and post-automation task flows. This phase should specify which work tasks are changed, added or removed as a consequence of automation. The three steps of this phase describe: Pre-automation work flow, Post-automation work flow, and Changed/Added/Removed tasks.

Automation classification, the third phase, analyses automation according to the model by Parasuraman et al. [Parasuraman et al., 2000]. The object of the analysis is both the pre- and post-automation work flows. The two steps of this phase comprise analysis of: Types of automation, and Levels of automation (presented in Table 1 and Figure 3).

The final phase, Automation impact analysis, estimates both direct and indirect effects of the increased automation. Since automated solutions typically bring both positive and negative effects, understanding them prior to implementing any changes is essential. Our analysis targets threats involved in the actual automation, and also its cognitive side-effects. The final phase of SAAF also includes a break-even analysis, a parametric assessment of benefits, where the parameter values are selected to equate costs and benefits. It is one of the methods that evolved to quantify benefits of information systems [Sassone, 1988]. Fixed costs and costs dependent on volume are compared to determine the volume at which an automation investment results in neither a profit nor a loss, the so-called breakpoint. We base discussions on the feasibility of traceability recovery for impact analysis on such an initial analysis, and report: Direct effects, Indirect effects, and Existing evidence for effects.

4 RESULTS AND DISCUSSION

This section describes the results from applying SAAF. Every phase is concluded by our final assessment, expressed using the levels presented in Table 1.

4.1 Preliminaries

4.1.1 Description of the context

The analyzed impact analysis process originates from a large multinational company active in the power and automation sector. The development context is safety-critical embedded development in the domain of industrial control systems, governed by IEC 61511 [IEC, 2003]. The number of developers is on the order of hundreds; a project typically runs for 12-18 months and follows an iterative stage-gate project management model. The software is certified to a Safety Integrity Level (SIL) of 2 as defined by IEC 61508 [IEC, 2010], corresponding to a risk reduction factor of 1,000,000-10,000,000 for continuous operation. There are process requirements on the maintenance of traceability information, especially between requirements and test cases. Both requirements and test case descriptions are predominantly specified in English natural language text.

4.1.2 Description of the work task

As specified in IEC 61511 [IEC, 2003], the impact of proposed software changes should be analyzed before implementation. In the studied case, this process is tightly integrated in the Defect Management System (DefMS). The issues in the DefMS, i.e., defect reports and change requests, are administered by a change control board. The board distributes issues to responsible teams for investigation. As part of the investigation, developers are required to perform an impact analysis, and report their results according to a project specific template. The template, developed by internal safety engineers and validated by an external certifying agency, contains between 5 and 20 questions depending on the SIL of the affected software components. In the template, several questions explicitly ask for trace links. The developer is required to specify source code that will be modified (with a class-level granularity), and also which related software artifacts need to be updated to reflect the changes, e.g., requirement specifications, design documentation, test case descriptions, test scripts and user manuals. Furthermore, the report should specify which high-level system requirements cover the involved features, and which test cases should be executed to verify that the changes are correct once implemented in the system. The test case selection should cover both developer-centric functional testing, and system testing conducted by the test organization.

4.1.3 Effect targets

The intention of increased automation is to make the impact analysis faster and more accurate. Also, one engineer explained “It is important to reduce the number of mundane questions to save the effort for the ones requiring thought and analytical abilities”.

4.2 Automation Change Identification (RQ2)

4.2.1 Pre-automation task flow

A major part of the impact analysis involves specifying trace links to related software artifacts. As there rarely are any requirement traceability matrices to consult, the tracing is mainly a poorly supported information seeking activity. If the engineers do not already know which artifacts (if any) are related to the issue, or if they do not know where to find the information, they have to search or browse databases or seek information from colleagues. When this information need arises, a typical first step is to search the DefMS for already solved issues that are similar, as presented in Figure 2. If no such issues are found, one could search the DocMS for relevant project documentation, or ask a colleague for help as a last resort. One engineer stated “I probably search for information in project documentation more than I should, it is very time-consuming and rarely successful”.

Types of automation (in sequence): Information acquisition ⇒ Information analysis ⇒ Decision selection ⇒ Action implementation

Level  Description
10     Autonomous, ignoring human.
9      May inform human.
8      Informs human if asked.
7      Executes automatically, then informs.
6      Allows human time to veto.
5      Executes if human approves.
4      Suggests one alternative.
3      Narrows down selection to a few.
2      Offers complete set of alternatives.
1      No assistance at all.

Table 1: Summary of types and levels of automation [Parasuraman et al., 2000].

Figure 2: Overview of the SoP impact analysis process.

4.2.2 Post-automation task flow

The idea of the traceability recovery tool is to support the two steps of manual database searching by automatically executing search queries. Without human action, we envision the tool to search for both related issues in the DefMS and related documentation in the DocMS. Based on the textual content of the currently analyzed issue, the tool predicts which software artifacts are the most likely to be related. As in the manual work flow, the last resort is to ask a colleague.

4.2.3 Changed/Added/Removed tasks

From the perspective of the engineer, the work task is slightly altered. An automatic search is conducted and the resulting search hits, i.e., candidate trace links, are presented. Thus, an added human subtask is to assess the search hits. If the search result is enough to satisfy the information need, the manual DefMS and DocMS searching are removed subtasks. On the other hand, if the engineer does not consider the search results to provide enough information, the engineer will as before have to seek information by manual queries, or by asking a colleague.

4.3 Automation Classification (RQ1)

4.3.1 Types of automation

Table 1 shows Parasuraman et al.’s types of automation. The first step, Information acquisition, refers to operations supporting human sensory processes, sensing and registration of input data. In the case of traceability recovery, the required information is stored digitally in databases. Software artifacts are typically distributed in separate systems with poor interoperability, a condition that applies also to the case we study. Since the usefulness of an IR-based traceability recovery tool is dependent on which software artifacts it can access and index, plug-in solutions to existing DefMS and DocMS have the advantage of being deployed where the information actually resides.


Information analysis, the second step of an automation solution, deals with the cognitive functions such as working memory and inferential processes [Parasuraman et al., 2000]. Regarding IR-based traceability recovery tools, it is not meaningful to distinguish between acquisition and analysis of information. As the information is accessed, it is also analyzed. This includes the steps of the implemented IR model such as preprocessing (stop word removal, stemming etc.), feature extraction and weighting, and extraction of language models. In the rest of this report, we consider information acquisition and analysis to be one single type of automation.

The third step, Decision selection, comprises the augmentation and replacement of human decision options. Supporting decision selection is the backbone of IR-based traceability recovery tools, since they are intended to present candidate trace links. The tools rank search results, and present them to the user. Furthermore, a number of traceability recovery tools offer the user both filtering and highlighting.

Action implementation, the last automation type, is defined as the actual machine execution of the action choice. For the IR-based traceability recovery plug-in we envision, the action implementation is limited to correctly reporting the outcome in the impact analysis template described in Section 4.1.2. Automating this step is meaningful since it would reduce the risk of human input errors (which are known to exist), e.g., incorrect use of document identifiers, and copy/paste errors.

4.3.2 Levels of automation

Table 1 shows the levels of automation according to Parasuraman et al., from no support at all to a fully autonomous solution. Based on our analysis of types of automation, we studied the corresponding levels of automation, and the risks involved in increasing them.

Regarding Information acquisition and analysis, the SoP activity is to manually input search queries in search tools offered by DefMSs and DocMSs. In such tools, the human activity is at best supported by a set of boolean search operations, e.g., AND, OR, NOT. The SoA traceability recovery tools, on the other hand, automatically enter and execute search queries. The human is not at all involved in this process, although error messages tend to appear if major failures occur. Letting the tools access too much information is a security issue. In many cases, employees have different access rights. Efforts to improve information access in enterprise search solutions are often limited by policy decisions [Tolone et al., 2005].

The SoP activities corresponding to the automation type Decision selection are mainly concerned with vetting various search results. Search strings can be refined until the information need of the engineer is satisfied. SoP search solutions typically return a ranked list of documents and support features such as sorting and filtering. More advanced search solutions also implement features such as query expansion and relevance feedback [Baeza-Yates and Ribeiro-Neto, 2011].

After automatic execution of the search queries, SoA traceability recovery tools do not differ from SoP search solutions. In both cases, the goal is to limit the search space of the engineers, and to at least give them starting points for browsing to the information they are seeking. Obviously, an ideal traceability recovery tool would automatically make decisions with better judgement than a human engineer. Instead, since search results include both relevant and non-relevant results, increasing the automation to levels where the human is not part of the process leads to both false positives and missed trace links. Since the accuracy of SoA traceability recovery is considered low, fully removing the human involvement is currently not feasible [Oliveto et al., 2007, De Lucia et al., 2012]. IR tools balance on the precision-recall trade-off [Baeza-Yates and Ribeiro-Neto, 2011], and a fully automated solution has to be designed with care. Missed links risk underestimating the change impact, which motivates search tools offering high recall. On the other hand, false positives caused by low precision force engineers to spend extra effort. In both cases, incorrect effort estimations are a consequence.
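The precision-recall trade-off can be made concrete with a small sketch that computes both measures when only the top-k entries of a ranked list are considered. The ranked list, the set of correct links, and the function name below are hypothetical and serve only to show how a longer list raises recall at the expense of precision.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall when only the top-k candidates are considered."""
    top_k = ranked_ids[:k]
    hits = sum(1 for artifact_id in top_k if artifact_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Hypothetical tool output and hypothetical set of correct trace links.
ranked = ["TC-7", "REQ-12", "UM-3", "TC-9", "REQ-40"]
relevant = {"TC-7", "REQ-12", "REQ-40"}

for k in (1, 3, 5):
    precision, recall = precision_recall_at_k(ranked, relevant, k)
    print(f"k={k}: precision={precision:.2f}, recall={recall:.2f}")
```

In this invented example, a short list contains only correct links but misses most of them, whereas the full list finds every link while forcing the engineer to vet two false positives.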

The last automation type in the sequence, Action implementation, is currently supported by a template and corresponding guidelines developed by internal safety engineers. The outcome of the impact analysis is reported manually, by entering free text in a document. The tailored traceability recovery plug-in we envision enables simple drag-and-drop operations in a graphical user interface. Both manually typing free-text information in a template and operations in a graphical interface introduce errors due to the human factor, thus we do not consider this a major risk. However, efficient UIs enable engineers to input incorrect information faster. Consequently, an increased level of automation of the last automation type in the sequence might actually increase the number of human errors.

Figure 3: Types and levels of automation for IR-based traceability recovery support in impact analysis. The major risks involved in increasing the levels are from left to right: contravening access policies, incorrect impact analysis, increased number of human errors.

4.4 Automation Impact Analysis (RQ2)

4.4.1 Direct effects

There are two main hypotheses about the effects of increasing the level of automation by deploying a traceability recovery plug-in. First, engineers will on average finish the impact analysis work task faster. Second, the correctness of the engineers’ impact analyses will be higher.

4.4.2 Indirect effects

Besides risks already mentioned, there is a risk that additional tool support, if it is well received, would make engineers too confident in their output. Such overconfidence might cause engineers, especially under stress, to hastily accept tool output as final answers. Another risk of deploying efficient search support is that communication between developers might decrease, as people might rely more on tools.

4.4.3 Evidence of effects

There are few evaluations of deploying IR-based traceability recovery tools in complete software development projects, and only one of them was conducted in an industrial setting. Li et al. conducted a case study on impact analysis in a five-person project running in a Chinese company for 30 weeks [Li et al., 2008]. They concluded it to be a feasible approach, and that it helped engineers finish the tracing tasks faster. De Lucia et al. conducted another case study, however in a university setting [De Lucia et al., 2009]. By studying seven student development projects, they found that deploying their IR-based traceability recovery plug-in improved the maintenance of traceability information, as more trace links were discovered.

Apart from case studies, several controlled experiments have concluded that tools implementing IR-based traceability recovery can be beneficial. For similar tasks related to establishing trace links, both Huffman Hayes et al. [Huffman Hayes et al., 2007] and De Lucia et al. [De Lucia et al., 2009] concluded that working with tool support is better than working manually, and that it improves accuracy and/or efficiency. De Lucia et al. also found that inexperienced developers benefit most from such tool support [De Lucia et al., 2009]. On the contrary, an experiment by Falessi et al. concluded that letting student subjects work with their IR-based traceability recovery tool did not lead to any significant advantages [Falessi and Briand, 2009]. Also, in contrast to findings by De Lucia et al. regarding impact of experience, they found the tool support more useful when used by a requirements analyst in an Italian company.

The usefulness of IR solutions is known to depend on the context, the users, and the task they are meant to support, which is also indicated by the conclusions from previous studies. Consequently, the applicability of deploying a tailored plug-in in the safety-critical context of our case study is uncertain. However, primarily considering it as an investment in findability (i.e., moving beyond manual keyword searching to automatically suggesting search results), we expect the information seeking to be more efficient. Figure 4 presents a break-even analysis, assuming that the tool support would make an engineer complete an impact analysis faster. The main costs of an automated solution would be initial, mainly related to development and deployment of the tool and to user training, as the tool maintenance is expected to be negligible. The tracing costs would then increase linearly with the number of tracing tasks performed. For the manual tracing process, the tracing cost is expected to decrease as the number of tasks grows, due to human learning effects. Thus, for a traceability recovery plug-in to be a meaningful investment, we would expect two aspects from the development context. First, the information landscape must be challenging enough to make tracing a time-consuming task (high slope of manual curve). Second, the tracing task must be common enough to be considered an issue (n high enough for curves to intersect). Whether the studied case fulfils these aspects requires further study.

Figure 4: Initial break-even analysis for automated impact analysis.
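The break-even reasoning above can be illustrated with a small numerical sketch. It is not taken from the study: the function names, the cost figures, and the geometric learning model (each manual task costs a little less than the previous one, down to a floor) are all assumptions chosen only to show how the two conditions, a steep manual curve and a sufficiently high n, interact.

```python
from typing import Optional


def manual_task_cost(k: int, first_task_cost: float, floor_cost: float,
                     decay: float) -> float:
    """Cost of the k-th manual tracing task (k starts at 1).

    Assumed learning model: the cost shrinks geometrically from
    first_task_cost towards floor_cost as more tasks are performed.
    """
    return floor_cost + (first_task_cost - floor_cost) * decay ** (k - 1)


def break_even(setup_cost: float, cost_per_task: float,
               first_task_cost: float, floor_cost: float, decay: float,
               max_tasks: int = 10_000) -> Optional[int]:
    """Smallest n at which cumulative tool-supported cost <= manual cost.

    Tool-supported cost: setup_cost (development, deployment, training)
    plus a constant cost_per_task, i.e. a linear curve.
    Returns None if the curves do not intersect within max_tasks.
    """
    manual_total = 0.0
    for n in range(1, max_tasks + 1):
        manual_total += manual_task_cost(n, first_task_cost, floor_cost, decay)
        automated_total = setup_cost + cost_per_task * n
        if automated_total <= manual_total:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical figures in arbitrary cost units (e.g. engineer hours).
    n = break_even(setup_cost=100, cost_per_task=1,
                   first_task_cost=8, floor_cost=3, decay=0.8)
    print("break-even after", n, "tracing tasks")
```

Under this model, lowering the manual floor cost to the level of the tool-supported cost per task makes the function return None, which corresponds to an information landscape that is not challenging enough for the investment to pay off.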

5 CONCLUSIONS AND FUTURE WORK

We have presented a structured analysis of how a SoA IR-based traceability recovery tool could support findability in an impact analysis work task. Also, we argue that the tools should be judged as search tools rather than as traceability miners, and thus should be evaluated accordingly. The rest of our contribution is threefold. First, we presented a model of the information seeking activity involved in an impact analysis work task in a safety-critical development context. Second, we found that deploying a SoA traceability recovery tool, with a limited risk, can increase the level of automation, but not for the automation type defined as decision selection. Third, based on an initial break-even analysis, we claim that investing in automated traceability recovery is worth the effort only if both (a) the information landscape is challenging, and (b) the artifact tracing is a frequent work task.

Future work includes implementing an IR-based traceability recovery plug-in to HP Quality Center and evaluating it in an industrial context, based on established methodology for user evaluations in information retrieval. Furthermore, a deeper case study on impact analysis processes in safety-critical software engineering should be conducted. Such a study should be dominated by qualitative analysis, since quantitative analyses have dominated the field of IR-based traceability recovery.

Acknowledgement

This study was done as part of the SWELL¹ course on Automated Verification and Validation, where we have studied and developed models for analyzing levels of automation in V&V. Thanks go to Rickard Torkar, Saïd Assar, and Per Runeson for review comments.

REFERENCES

[Antoniol et al., 2002] Antoniol, G., Canfora, G., Casazza, G., De Lucia, A., and Merlo, E. (2002). Recovering traceability links between code and documentation. In Trans. on Software Engineering, volume 28, pages 970–983.

[Baeza-Yates and Ribeiro-Neto, 2011] Baeza-Yates, R. and Ribeiro-Neto, B. (2011). Modern Information Retrieval: The Concepts and Technology behind Search. Addison-Wesley.

[Billings, 1997] Billings, C. (1997). Aviation Automation: The Search for a Human-Centered Approach. Lawrence Erlbaum Associates, New Jersey.

[Bohner, 2002] Bohner, S. (2002). Software change impacts - an evolving perspective. In Proceedings of the International Conference on Software Maintenance, pages 263–272.

[Borg, 2011a] Borg, M. (2011a). In vivo evaluation of large-scale IR-based traceability recovery. In Proceedings of the European Conference on Software Maintenance and Reengineering, pages 365–368.

[Borg, 2011b] Borg, M. (2011b). IR-based traceability recovery as a plugin - an industrial case study. In Fourth BCS-IRSG Symposium on Future Directions in Information Access.

[Borg et al., 2012] Borg, M., Wnuk, K., and Pfahl, D. (2012). Industrial comparability of student artifacts in traceability recovery research - an exploratory survey. In Proceedings of the 16th European Conference on Software Maintenance and Reengineering.

[Canfora and Cerulo, 2006] Canfora, G. and Cerulo, L. (2006). Fine grained indexing of software repositories to support impact analysis. In Proceedings of the International Workshop on Mining Software Repositories, pages 105–111.

[Clamann et al., 2002] Clamann, M., Wright, M., and Kaber, D. (2002). Comparison of performance effects of adaptive automation applied to various stages of human-machine system information processing. In Proc. of the Ann. Meeting of the Human Factors and Ergonomics Soc., pages 342–346.

¹ swell.se

[De Lucia et al., 2005] De Lucia, A., Fasano, F., Oliveto, R., and Tortora, G. (2005). ADAMS re-trace: A traceability recovery tool. In Proc. of the 9th European Conference on Software Maintenance and Reengineering, pages 32–41.

[De Lucia et al., 2012] De Lucia, A., Marcus, A., Oliveto, R., and Poshyvanyk, D. (2012). Information retrieval methods for automated traceability recovery. In Cleland-Huang, J., Gotel, O., and Zisman, A., editors, Software and Systems Traceability, pages 71–98. Springer, London.

[De Lucia et al., 2009] De Lucia, A., Oliveto, R., and Tortora, G. (2009). Assessing IR-based traceability recovery tools through controlled experiments. Empirical Software Engineering, 14(1):57–92.

[Domges and Pohl, 1998] Domges, R. and Pohl, K. (1998). Adapting traceability environments to project-specific needs. Communications of the ACM, 41(12):54–62.

[Falessi and Briand, 2009] Falessi, D. and Briand, L. (2009). The impact of automated support for linking equivalent requirements based on similarity measures. Technical report, Simula.

[Frohm et al., 2008] Frohm, J., Lindstrom, V., Winroth, M., and Stahre, M. (2008). Levels of automation in manufacturing. Ergonomia, page 29.

[Huffman Hayes and Dekhtyar, 2005] Huffman Hayes, J. and Dekhtyar, A. (2005). Humans in the traceability loop: can’t live with ’em, can’t live without ’em. In Proceedings of the 3rd International Workshop on Traceability in Emerging Forms of Software Engineering, pages 20–23.

[Huffman Hayes et al., 2006] Huffman Hayes, J., Dekhtyar, A., and Sundaram, S. (2006). Advances in dynamic generation of traceability links: Two steps closer to full automation? Technical report, University of Kentucky.

[Huffman Hayes et al., 2007] Huffman Hayes, J., Dekhtyar, A., Sundaram, S., Holbrook, A., Vadlamudi, S., and April, A. (2007). REquirements TRacing on target (RETRO): improving software maintenance through traceability recovery. Innovations in Systems and Software Engineering, 3(3):193–202.

[IEC, 2003] IEC (2003). IEC 61511-1 ed 1.0, safety instrumented systems for the process industry sector.

[IEC, 2010] IEC (2010). IEC 61508 ed 2.0, Electrical/Electronic/Programmable electronic safety-related systems.

[IEEE Computer Society, 1990] IEEE Computer Society (1990). 610.12-1990 IEEE standard glossary of software engineering terminology. Technical report.

[Karr-Wisniewski and Lu, 2010] Karr-Wisniewski, P. and Lu, Y. (2010). When more is too much: Operationalizing technology overload and exploring its impact on knowledge worker productivity. Computers in Human Behavior, 26(5):1061–1072.

[Klock et al., 2011] Klock, S., Gethers, M., Dit, B., and Poshyvanyk, D. (2011). Traceclipse: an eclipse plug-in for traceability link recovery and management. In Proceedings of the 6th International Workshop on Traceability in Emerging Forms of Software Engineering, pages 24–30.

[Li et al., 2008] Li, Y., Li, J., Yang, Y., and Li, M. (2008). Requirement-centric traceability for change impact analysis: A case study. In International Conference on Software Process, pages 100–111.

[Marcus and Maletic, 2003] Marcus, A. and Maletic, J. (2003). Recovering documentation-to-source-code traceability links using latent semantic indexing. In Proc. of the Int’l Conference on Software Engineering, pages 125–135.

[Morville, 2005] Morville, P. (2005). Ambient Findability: What We Find Changes Who We Become. O’Reilly Media.

[Oliveto et al., 2007] Oliveto, R., Antoniol, G., Marcus, A., and Hayes, J. (2007). Software artefact traceability: the Never-Ending challenge. In Software Maintenance, 2007. ICSM 2007. IEEE International Conference on, pages 485–488.

[Olsson, 2002] Olsson, T. (2002). Software Information Management in Requirements and Test Documentation. Licentiate thesis, Lund University.

[Parasuraman et al., 2000] Parasuraman, R., Sheridan, T., and Wickens, C. (2000). A model for types and levels of human interaction with automation. Transactions on Systems, Man and Cybernetics, 30(3):286–297.

[Sabaliauskaite et al., 2010] Sabaliauskaite, G., Loconsole, A., Engström, E., Unterkalmsteiner, M., Regnell, B., Runeson, P., Gorschek, T., and Feldt, R. (2010). Challenges in aligning requirements engineering and verification in a Large-Scale industrial context. In Requirements Engineering: Foundation for Software Quality, pages 128–142.

[Sassone, 1988] Sassone, P. (1988). Cost benefit analysis of information systems: a survey of methodologies. In Proceedings of the Conference on Office Information Systems, pages 126–133.

[Sheridan and Verplank, 1978] Sheridan, T. and Verplank, W. (1978). Human and computer control of undersea teleoperators. Technical report, MIT Man-Machine Systems Laboratory, Cambridge, MA, United States.

[Tolone et al., 2005] Tolone, W., Ahn, G., Pai, T., and Hong, S. (2005). Access control in collaborative systems. ACM Computing Surveys, 37(1):29–41.