Elysium Technologies Private Limited
Singapore | Madurai | Chennai | Trichy | Coimbatore | Cochin | Ramnad | Pondicherry | Trivandrum | Salem | Erode | Tirunelveli
http://www.elysiumtechnologies.com, [email protected]
Final Year IEEE Project 2013-2014 - Software Engineering Project Title and Abstract
…advice to builders of software engineering tools. Although hypotheses are important in debugging, a
theory of navigation adds more practical value to our understanding of how programmers debug.
Therefore, in this paper, we reconsider how people go about debugging in large collections of source code
using a modern programming environment. We present an information foraging theory of debugging that
treats programmer navigation during debugging as being analogous to a predator following scent to find
prey in the wild. The theory proposes that constructs of scent and topology provide enough information to
describe and predict programmer navigation during debugging, without reference to mental states such as
hypotheses. We investigate the scope of our theory through an empirical study of 10 professional
programmers debugging a real-world open source program. We found that the programmers'
verbalizations far more often concerned scent-following than hypotheses. To evaluate the predictiveness
of our theory, we created an executable model that predicted programmer navigation behavior more
accurately than comparable models that did not consider information scent. Finally, we discuss the
implications of our results for enhancing software engineering tools.
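As a rough illustration of the scent construct, the sketch below ranks candidate code locations by lexical similarity between a bug report and location identifiers. It is a toy stand-in for the paper's executable model; the tokenizer, the cosine measure, and all names are our own illustrative choices, not the authors'.

    # Minimal sketch of information-scent ranking, assuming scent can be
    # approximated by lexical overlap between a bug report and code locations.
    import math
    import re
    from collections import Counter

    def words(text):
        """Split camelCase/snake_case identifiers and prose into lowercase words."""
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)
        return [p.lower() for p in parts]

    def cosine(a, b):
        ca, cb = Counter(a), Counter(b)
        dot = sum(ca[w] * cb[w] for w in ca)
        na = math.sqrt(sum(v * v for v in ca.values()))
        nb = math.sqrt(sum(v * v for v in cb.values()))
        return dot / (na * nb) if na and nb else 0.0

    def rank_by_scent(bug_report, locations):
        """Rank navigable code locations by scent with respect to the bug report."""
        goal = words(bug_report)
        return sorted(locations, key=lambda loc: -cosine(goal, words(loc)))

    print(rank_by_scent("null pointer when saving playlist",
                        ["PlaylistWriter.save", "AudioMixer.volume", "Playlist.addTrack"]))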
ETPL
SE-004
Empirical Principles and an Industrial Case Study in Retrieving Equivalent
Requirements via Natural Language Processing Techniques
Abstract: Though very important in software engineering, linking artifacts of the same type (clone
detection) or different types (traceability recovery) is extremely tedious, error-prone, and effort-intensive.
Past research focused on supporting analysts with techniques based on Natural Language Processing
(NLP) to identify candidate links. Because many NLP techniques exist and their performance varies
according to context, it is crucial to define and use reliable evaluation procedures. The aim of this paper is
to propose a set of seven principles for evaluating the performance of NLP techniques in identifying
equivalent requirements. In this paper, we conjecture, and verify, that NLP techniques perform on a given
dataset according to both their ability and the odds of identifying equivalent requirements correctly by chance. For
instance, when the odds of identifying equivalent requirements are very high, then it is reasonable to
expect that NLP techniques will result in good performance. Our key idea is to measure this random
factor of the specific dataset(s) in use and then adjust the observed performance accordingly. To support
the application of the principles we report their practical application to a case study that evaluates the
performance of a large number of NLP techniques for identifying equivalent requirements in the context
of an Italian company in the defense and aerospace domain. The current application context is the
evaluation of NLP techniques to identify equivalent requirements. However, most of the proposed
principles seem applicable to evaluating any estimation technique aimed at supporting a binary decision
(e.g., equivalent/nonequivalent), with the estimate in the range [0,1] (e.g., the similarity provided by the
NLP), when the dataset(s) is used as a benchmark (i.e., testbed), independently of the type of estimator
(i.e., requirements text) and of the estimation method (e.g., NLP).
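For readers unfamiliar with the setup, the hedged sketch below scores requirement pairs with a simple term-overlap similarity and shows one way a chance level could be factored out of an observed accuracy. The specific adjustment (a kappa-style correction) is our illustration, not one of the paper's seven principles.

    # Illustrative only: a toy similarity over requirement texts, plus a
    # kappa-style chance correction standing in for "adjusting observed
    # performance for the random factor of the dataset".
    def jaccard(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    def chance_adjusted_accuracy(observed, chance):
        """How far above the dataset's chance level does a technique perform?"""
        return (observed - chance) / (1.0 - chance) if chance < 1.0 else 0.0

    reqs = ["the system shall log every login attempt",
            "every login attempt shall be logged by the system",
            "the UI shall use the corporate color scheme"]
    pairs = [(i, j, jaccard(reqs[i], reqs[j]))
             for i in range(len(reqs)) for j in range(i + 1, len(reqs))]
    print(pairs)
    print(chance_adjusted_accuracy(observed=0.80, chance=0.50))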
ETPL
SE-005
Assessing the Effectiveness of Sequence Diagrams in the Comprehension of Functional
Requirements: Results from a Family of Five Experiments
Abstract: Modeling is a fundamental activity within the requirements engineering process and concerns
the construction of abstract descriptions of requirements that are amenable to interpretation and
validation. The choice of a modeling technique is critical whenever it is necessary to discuss the
interpretation and validation of requirements. This is particularly true in the case of functional
requirements and stakeholders with divergent goals and different backgrounds and experience. This paper
presents the results of a family of experiments conducted with students and professionals to investigate
whether the comprehension of functional requirements is influenced by the use of dynamic models that
are represented by means of the UML sequence diagrams. The family contains five experiments
performed in different locations and with 112 participants of different abilities and levels of experience
with UML. The results show that sequence diagrams improve the comprehension of the modeled
functional requirements in the case of high ability and more experienced participants.
ETPL
SE-006 Using Dependency Structures for Prioritization of Functional Test Suites
Abstract: Test case prioritization is the process of ordering the execution of test cases to achieve a certain
goal, such as increasing the rate of fault detection. Increasing the rate of fault detection can provide earlier
feedback to system developers, improving fault fixing activity and, ultimately, software delivery. Many
existing test case prioritization techniques consider that tests can be run in any order. However, due to
functional dependencies that may exist between some test cases (that is, one test case must be executed
before another), this is often not the case. In this paper, we present a family of test case prioritization
techniques that use the dependency information from a test suite to prioritize that test suite. The nature of
the techniques preserves the dependencies in the test ordering. The hypothesis of this work is that
dependencies between tests are representative of interactions in the system under test, and executing
complex interactions earlier is likely to increase the fault detection rate, compared to arbitrary test
orderings. Empirical evaluations on six systems built toward industry use demonstrate that these
techniques increase the rate of fault detection compared to the rates achieved by the untreated order,
random orders, and test suites ordered using existing "coarse-grained" techniques based on function
coverage.
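To make the dependency-preserving idea concrete, here is a small sketch: a topological ordering that, among the tests currently eligible to run, prefers those with more dependants as a proxy for complex interactions. The heuristic and all names are illustrative; the paper's family of techniques may weight tests differently.

    # Sketch of dependency-preserving prioritization: never violate a
    # (before, after) dependency, and among ready tests schedule the one
    # with the most dependants first.
    import heapq
    from collections import defaultdict

    def prioritize(tests, deps):
        """deps: list of (before, after) pairs; returns a dependency-respecting order."""
        children = defaultdict(list)
        indegree = {t: 0 for t in tests}
        for before, after in deps:
            children[before].append(after)
            indegree[after] += 1
        # heapq is a min-heap, so negate the dependant count for max-first order
        ready = [(-len(children[t]), t) for t in tests if indegree[t] == 0]
        heapq.heapify(ready)
        order = []
        while ready:
            _, t = heapq.heappop(ready)
            order.append(t)
            for c in children[t]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    heapq.heappush(ready, (-len(children[c]), c))
        return order

    print(prioritize(["t1", "t2", "t3", "t4"], [("t1", "t2"), ("t1", "t3"), ("t3", "t4")]))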
ETPL
SE-007
A Second Replicated Quantitative Analysis of Fault Distributions in Complex Software
Systems
Abstract: Background: Software engineering is searching for general principles that apply across contexts,
for example, to help guide software quality assurance. Fenton and Ohlsson presented such observations
on fault distributions, which have been replicated once. Objectives: We aimed to replicate their study
again to assess the robustness of the findings in a new environment, five years later. Method: We
conducted a literal replication, collecting defect data from five consecutive releases of a large software
system in the telecommunications domain, and conducted the same analysis as in the original study.
Results: The replication confirms that faults are unevenly distributed over modules and that fault
proneness distributions persist over test phases. Size measures are not useful as predictors of fault
proneness, while fault densities are of the same order of magnitude across releases and contexts.
Conclusions: This replication confirms that the uneven distribution of defects motivates uneven
distribution of quality assurance efforts, although predictors for such distribution of efforts are not…

…human resource allocation to simplify the model. To develop a flexible and effective model for software
project planning, this paper develops a novel approach with an event-based scheduler (EBS) and an ant
colony optimization (ACO) algorithm. The proposed approach represents a plan by a task list and a
planned employee allocation matrix. In this way, both the issues of task scheduling and employee
allocation can be taken into account. In the EBS, the beginning time of the project, the time when
resources are released from finished tasks, and the time when employees join or leave the project are
regarded as events. The basic idea of the EBS is to adjust the allocation of employees at events and keep
the allocation unchanged at nonevents. With this strategy, the proposed method enables the modeling of
resource conflict and task preemption and preserves the flexibility in human resource allocation. To solve
the planning problem, an ACO algorithm is further designed. Experimental results on 83 instances
demonstrate that the proposed method is very promising.
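The core EBS idea, that allocations change only at events, can be sketched in a few lines. The greedy assignment below is purely illustrative; the actual approach searches over task lists and allocation matrices with ACO.

    # Very small sketch of the event-based scheduler idea: allocations are
    # revised only at events (project start, task completion, staff joining or
    # leaving); between events the allocation stays fixed. The ACO search is
    # omitted here.
    def simulate(tasks, events):
        """tasks: {name: remaining_effort}; events: sorted [(time, staff_available)].
        Greedily assigns all available staff to the first unfinished task."""
        schedule = []
        for (t0, staff), (t1, _) in zip(events, events[1:]):
            for name in tasks:
                if tasks[name] > 0:
                    work = min(tasks[name], staff * (t1 - t0))
                    tasks[name] -= work
                    schedule.append((t0, t1, name, staff))
                    break  # allocation is fixed until the next event
        return schedule

    print(simulate({"design": 10, "coding": 20},
                   [(0, 2), (5, 4), (10, 4), (20, 0)]))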
ETPL
SE-011
Coordination Breakdowns and Their Impact on Development Productivity and
Software Failures
Abstract: The success of software development projects depends on carefully coordinating the effort of
many individuals across the multiple stages of the development process. In software engineering,
modularization is the traditional technique intended to reduce the interdependencies among modules that
constitute a system. Reducing technical dependencies, the theory argues, results in a reduction of work
dependencies between teams developing interdependent modules. Although that research stream has been
quite influential, it considers a static view of the problem of coordination in engineering activities.
Building on a dynamic view of coordination, we studied the relationship between socio-technical
congruence and software quality and development productivity. In order to investigate the generality of
our findings, our analyses were performed on two large-scale projects from two companies with distinct
characteristics in terms of product and process maturity. Our results revealed that the gaps between
coordination requirements and the actual coordination activities carried out by the developers
significantly increased software failures. Our analyses also showed that higher levels of congruence are
associated with improved development productivity. Finally, our results showed the congruence between
dependencies and coordinative actions is critical both in mature development settings and in novel
and dynamic development contexts.
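Socio-technical congruence is commonly computed from assignment, dependency, and coordination matrices; a minimal sketch in that spirit (with made-up matrices) follows. The exact measures used in the paper may differ.

    # Classic congruence computation (in the spirit of Cataldo et al.):
    # coordination requirements CR = TA · TD · TAᵀ; congruence = |CR ∧ CA| / |CR|.
    import numpy as np

    TA = np.array([[1, 0],     # developer x task assignments
                   [0, 1],
                   [1, 1]])
    TD = np.array([[0, 1],     # task x task technical dependencies
                   [1, 0]])
    CA = np.array([[0, 1, 1],  # developer x developer actual coordination
                   [1, 0, 0],
                   [1, 0, 0]])

    CR = (TA @ TD @ TA.T) > 0      # who *should* coordinate
    np.fill_diagonal(CR, False)    # self-coordination is not required
    needed = CR.sum()
    met = (CR & (CA > 0)).sum()
    print("congruence:", met / needed if needed else 1.0)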
ETPL
SE-012 Amorphous Slicing of Extended Finite State Machines
Abstract: Slicing is useful for many software engineering applications and has been widely studied for
three decades, but there has been comparatively little work on slicing extended finite state machines
(EFSMs). This paper introduces a set of dependence-based EFSM slicing algorithms and an
accompanying tool. We demonstrate that our algorithms are suitable for dependence-based slicing. We
use our tool to conduct experiments on 10 EFSMs, including benchmarks and industrial EFSMs. Ours is
the first empirical study of dependence-based program slicing for EFSMs. Compared to the only
previously published dependence-based algorithm, our average slice is smaller 40 percent of the time and
larger only 10 percent of the time, with an average slice size of 35 percent for termination insensitive…

…validity of log-based failure analysis. This paper analyzes the limitations of current logging mechanisms
and proposes a rule-based approach to make logs effective to analyze software failures. The approach
leverages artifacts produced at system design time and puts forth a set of rules to formalize the placement
of the logging instructions within the source code. The validity of the approach, with respect to traditional
logging mechanisms, is shown by means of around 12,500 software fault injection experiments into real-
world systems.
ETPL
SE-021 Local versus Global Lessons for Defect Prediction and Effort Estimation
Abstract: Existing research is unclear on how to generate lessons learned for defect prediction and effort
estimation. Should we seek lessons that are global to multiple projects or just local to particular projects?
This paper aims to comparatively evaluate local versus global lessons learned for effort estimation and
defect prediction. We applied automated clustering tools to effort and defect datasets from the PROMISE
repository. Rule learners generated lessons learned from all the data, from local projects, or just from each
cluster. The results indicate that the lessons learned after combining small parts of different data sources
(i.e., the clusters) were superior to either generalizations formed over all the data or local lessons formed
from particular projects. We conclude that when researchers attempt to draw lessons from some historical
data source, they should 1) ignore any existing local divisions into multiple sources, 2) cluster across all
available data, then 3) restrict the learning of lessons to the clusters from other sources that are nearest to
the test data.
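A minimal sketch of the compared setups, assuming scikit-learn is available and substituting a decision tree for the paper's rule learner, with synthetic data standing in for the PROMISE datasets:

    # Global model over all data vs. models per cluster found across all data.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                                # e.g., size/complexity metrics
    y = X[:, 0] * 2 + (X[:, 1] > 0) * 5 + rng.normal(size=200)   # e.g., effort
    project = rng.integers(0, 4, size=200)                       # original data-source labels

    global_model = DecisionTreeRegressor(max_depth=3).fit(X, y)

    cluster = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
    cluster_models = {c: DecisionTreeRegressor(max_depth=3).fit(X[cluster == c], y[cluster == c])
                      for c in set(cluster)}
    # Lessons are then learned per cluster (ignoring project boundaries) and
    # applied to test data from the nearest cluster.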
ETPL
SE-022 Centroidal Voronoi Tessellations: A New Approach to Random Testing
Abstract: Although Random Testing (RT) is low cost and straightforward, its effectiveness is not
satisfactory. To increase the effectiveness of RT, researchers have developed Adaptive Random Testing
(ART) and Quasi-Random Testing (QRT) methods which attempt to maximize the test case coverage of
the input domain. This paper proposes the use of Centroidal Voronoi Tessellations (CVT) to address this
problem. Accordingly, a test case generation method, namely, Random Border CVT (RBCVT), is
proposed, which can enhance the previous RT methods to improve their coverage of the input space. Test
cases generated by the other methods act as the input to the RBCVT algorithm, and the output is an
improved set of test cases. Therefore, RBCVT is not an independent method but is considered an add-
on to the previous methods. An extensive simulation study and a mutant-based software testing
investigation have been performed to demonstrate the effectiveness of RBCVT against the ART and QRT
methods. Results from the experimental frameworks demonstrate that RBCVT outperforms previous
methods. In addition, a novel search algorithm has been incorporated into RBCVT reducing the order of
computational complexity of the new approach. To further analyze the RBCVT method, randomness
analysis was undertaken, demonstrating that RBCVT has the same characteristics as ART methods in this…
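The CVT construction at the heart of RBCVT can be approximated with Lloyd's algorithm; a small numpy sketch follows. RBCVT's random-border treatment and its complexity-reducing search structure are not reproduced here.

    # Centroidal Voronoi tessellation via Lloyd's algorithm on the unit square:
    # repeatedly move each point to the centroid of its Voronoi region,
    # estimated from a dense random sample of the domain.
    import numpy as np

    def cvt(points, samples=20000, iterations=10, seed=0):
        rng = np.random.default_rng(seed)
        pts = np.array(points, dtype=float)
        for _ in range(iterations):
            s = rng.random((samples, 2))
            d = ((s[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
            owner = d.argmin(axis=1)          # Voronoi region of each sample
            for k in range(len(pts)):
                region = s[owner == k]
                if len(region):
                    pts[k] = region.mean(axis=0)
        return pts

    initial = np.random.default_rng(1).random((10, 2))   # e.g., random test cases
    print(cvt(initial))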
…multiple applications on a single platform. We propose a compositional approach to formal specification
and schedulability analysis of real-time applications running under a Time Division Multiplexing (TDM)
global scheduler and preemptive Fixed Priority (FP) local schedulers, according to the ARINC-653
standard. As a characterizing trait, each application is made of periodic, sporadic, and jittering tasks with
offsets, jitters, and nondeterministic execution times, encompassing intra-application synchronizations
through semaphores and mailboxes and interapplication communications among periodic tasks through
message passing. The approach leverages the assumption of a TDM partitioning to enable compositional
design and analysis based on the model of preemptive Time Petri Nets (pTPNs), which is expressly
extended with a concept of Required Interface (RI) that specifies the embedding environment of an
application through sequencing and timing constraints. This enables exact verification of intra-application
constraints and approximate but safe verification of interapplication constraints. Experimentation
illustrates results and validates their applicability on two challenging workloads in the field of safety-
critical avionic systems.
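As a crude intuition for why TDM partitioning enables compositional reasoning, note that an application confined to its slice effectively sees a fraction of the processor. The sketch below scales the classic Liu-Layland utilization bound by that fraction; this is only a rough first-cut check, not the exact pTPN-based verification developed in the paper.

    # Rough heuristic check, not the paper's analysis: scale the fixed-priority
    # utilization bound by the application's share of the TDM major frame.
    def partition_ok(tasks, slice_ms, major_frame_ms):
        """tasks: [(wcet_ms, period_ms)]; returns a rough utilization verdict."""
        n = len(tasks)
        u = sum(c / t for c, t in tasks)
        share = slice_ms / major_frame_ms
        bound = share * n * (2 ** (1 / n) - 1)   # scaled FP bound (approximate)
        return u <= bound

    print(partition_ok([(2, 20), (5, 100)], slice_ms=25, major_frame_ms=50))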
ETPL
SE-026
Language-Independent and Automated Software Composition: The FeatureHouse
Experience
Abstract: Superimposition is a composition technique that has been applied successfully in many areas of
software development. Although superimposition is a general-purpose concept, it has been (re)invented
and implemented individually for various kinds of software artifacts. We unify languages and tools that
rely on superimposition by using the language-independent model of feature structure trees (FSTs). On
the basis of the FST model, we propose a general approach to the composition of software artifacts
written in different languages. Furthermore, we offer a supporting framework and tool chain, called
FEATUREHOUSE. We use attribute grammars to automate the integration of additional languages. In
particular, we have integrated Java, C#, C, Haskell, Alloy, and JavaCC. A substantial number of case
studies demonstrate the practicality and scalability of our approach and reveal insights into the properties
that a language must have in order to be ready for superimposition. We discuss perspectives of our
approach and demonstrate how we extended FEATUREHOUSE with support for XML languages (in
particular, XHTML, XMI/UML, and Ant) and alternative composition approaches (in particular, aspect
weaving). Rounding off our previous work, we provide here a holistic view of the FEATUREHOUSE
approach based on rich experience with numerous languages and case studies and reflections on several
years of research.
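Superimposition on FSTs is essentially a recursive merge of equally named and typed nodes; the sketch below illustrates the principle with a trivial leaf rule (the later feature overrides). FEATUREHOUSE's language-specific composition rules are far richer.

    # Minimal feature-structure-tree superimposition: inner nodes with the same
    # name and type are merged recursively; leaves are composed by a pluggable
    # rule (here: override).
    class FST:
        def __init__(self, name, type_, children=None, body=None):
            self.name, self.type, self.body = name, type_, body
            self.children = children or []

    def superimpose(a, b):
        if (a.name, a.type) != (b.name, b.type):
            raise ValueError("only equally named/typed nodes are composed")
        if not a.children and not b.children:    # leaf: apply composition rule
            return FST(a.name, a.type, body=b.body if b.body is not None else a.body)
        merged, seen = [], {}
        for child in a.children + b.children:
            key = (child.name, child.type)
            if key in seen:
                merged[seen[key]] = superimpose(merged[seen[key]], child)
            else:
                seen[key] = len(merged)
                merged.append(child)
        return FST(a.name, a.type, children=merged)

    base = FST("App", "class", [FST("run", "method", body="base run")])
    feature = FST("App", "class", [FST("run", "method", body="extended run"),
                                   FST("log", "method", body="log")])
    composed = superimpose(base, feature)
    print([(c.name, c.body) for c in composed.children])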
ETPL
SE-027 A Machine Learning Approach to Software Requirements Prioritization
Abstract: Deciding which, among a set of requirements, are to be considered first and in which order is a
strategic process in software development. This task is commonly referred to as requirements
prioritization. This paper describes a requirements prioritization method called Case-Based Ranking
(CBRank), which combines project stakeholders' preferences with requirements ordering approximations
computed through machine learning techniques, bringing promising advantages. First, the human effort to
input preference information can be reduced, while preserving the accuracy of the final ranking estimates.
Second, domain knowledge encoded as partial order relations defined over the requirement attributes can
be exploited, thus supporting an adaptive elicitation process. The techniques on which CBRank rests and the
associated prioritization process are detailed. Empirical evaluations of properties of CBRank are
performed on simulated data and compared with a state-of-the-art prioritization method, providing
evidence of the method's ability to manage the tradeoff between elicitation effort and
ranking accuracy and to exploit domain knowledge. A case study on a real software project complements
these experimental measurements. Finally, a positioning of CBRank with respect to state-of-the-art
requirements prioritization methods is proposed, together with a discussion of benefits and limits of the
method.
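As a toy illustration of combining elicited and machine-predicted pairwise preferences, the sketch below flattens both into weighted "wins" and reads a ranking off a Borda-style count. CBRank's actual learning and aggregation are more sophisticated; the weighting is our own assumption.

    # Illustrative only: human pairs count double, predicted pairs once;
    # requirements are ranked by total wins.
    from collections import Counter

    def rank(requirements, human_prefs, predicted_prefs):
        """prefs are (winner, loser) pairs."""
        wins = Counter()
        for w, _ in human_prefs:
            wins[w] += 2
        for w, _ in predicted_prefs:
            wins[w] += 1
        return sorted(requirements, key=lambda r: -wins[r])

    reqs = ["login", "export", "dark-mode"]
    print(rank(reqs, human_prefs=[("login", "export")],
               predicted_prefs=[("export", "dark-mode"), ("login", "dark-mode")]))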
ETPL
SE-028
The Effects of Test-Driven Development on External Quality and Productivity: A
Meta-Analysis
Abstract: This paper provides a systematic meta-analysis of 27 studies that investigate the impact of Test-
Driven Development (TDD) on external code quality and productivity. The results indicate that, in
general, TDD has a small positive effect on quality but little to no discernible effect on productivity.
However, subgroup analysis has found both the quality improvement and the productivity drop to be
much larger in industrial studies in comparison with academic studies. A larger drop of productivity was
found in studies where the difference in test effort between the TDD and the control group's process was
significant. A larger improvement in quality was also found in the academic studies when the difference
in test effort is substantial; however, no conclusion could be derived regarding the industrial studies due
to the lack of data. Finally, the influence of developer experience and task size as moderator variables was
investigated, and a statistically significant positive correlation was found between task size and the
magnitude of the improvement in quality.
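The core aggregation step of such a meta-analysis is an inverse-variance weighted mean of per-study effect sizes; a worked sketch with made-up numbers follows (the paper pools 27 real studies).

    # Fixed-effect meta-analysis: pool effect sizes with inverse-variance
    # weights and report a 95% confidence interval. Numbers are synthetic.
    import math

    def fixed_effect(effects_and_variances):
        weights = [1.0 / v for _, v in effects_and_variances]
        pooled = sum(w * e for (e, _), w in zip(effects_and_variances, weights)) / sum(weights)
        se = math.sqrt(1.0 / sum(weights))
        return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

    studies = [(0.30, 0.04), (0.10, 0.02), (0.45, 0.09)]  # (effect size, variance)
    print(fixed_effect(studies))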
ETPL
SE-029 Verifying Linearizability via Optimized Refinement Checking
Abstract: Linearizability is an important correctness criterion for implementations of concurrent objects.
Automatic checking of linearizability is challenging because it requires checking that: 1) all executions
of concurrent operations are serializable, and 2) the serialized executions are correct with respect to the
sequential semantics. In this work, we describe a method to automatically check linearizability based on
refinement relations from abstract specifications to concrete implementations. The method does not
require that linearization points in the implementations be given, which is often difficult or impossible.
However, the method takes advantage of linearization points if they are given. The method is based on
refinement checking of finite-state systems specified as concurrent processes with shared variables. To
tackle state space explosion, we develop and apply symmetry reduction, dynamic partial order reduction,
and a combination of both for refinement checking. We have built the method into the PAT model
checker, and used PAT to automatically check a variety of implementations of concurrent objects,
including the first algorithm for scalable nonzero indicators. Our system is able to find all known and…

…Bayes classifier. Furthermore, it is found that the aspects of comprehensibility and predictive performance
need to be balanced out, and also the development context is an item which should be taken into account
during model selection.
ETPL
SE-033 On Fault Representativeness of Software Fault Injection
Abstract: The injection of software faults in software components to assess the impact of these faults on
other components or on the system as a whole, allowing the evaluation of fault tolerance, is relatively new
compared to decades of research on hardware fault injection. This paper presents an extensive
experimental study (more than 3.8 million individual experiments in three real systems) to evaluate the
representativeness of faults injected by a state-of-the-art approach (G-SWFIT). Results show that a
significant share (up to 72 percent) of injected faults cannot be considered representative of residual
software faults as they are consistently detected by regression tests, and that the representativeness of
injected faults is affected by the fault location within the system, resulting in different distributions of
representative/nonrepresentative faults across files and functions. Therefore, we propose a new approach
to refine the faultload by removing faults that are not representative of residual software faults. This
filtering is essential to assure meaningful results and to reduce the cost (in terms of number of faults) of
software fault injection campaigns in complex software. The proposed approach is based on classification
algorithms, is fully automatic, and can be used for improving fault representativeness of existing software
fault injection approaches.
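The filtering idea lends itself to a short sketch, assuming scikit-learn: learn from fault features whether an injected fault is consistently detected by regression tests, and discard those predicted to be. Features and data below are synthetic; the classification algorithms and features in the paper differ.

    # Keep only injected faults predicted to behave like residual faults
    # (i.e., NOT consistently caught by regression tests).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.random((500, 4))        # e.g., fault type, location depth, coverage...
    y = (X[:, 0] + X[:, 2] > 1.0)   # 1 = caught by regression tests (synthetic)

    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X[:400], y[:400])
    candidates = X[400:]
    keep = candidates[~clf.predict(candidates).astype(bool)]   # representative faults
    print(f"kept {len(keep)} of {len(candidates)} injected faults")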
ETPL
SE-034
A Decentralized Self-Adaptation Mechanism for Service-Based Applications in the
Cloud
Abstract: Cloud computing, with its promise of (almost) unlimited computation, storage, and bandwidth,
is increasingly becoming the infrastructure of choice for many organizations. As cloud offerings mature,
service-based applications need to dynamically recompose themselves to self-adapt to changing QoS
requirements. In this paper, we present a decentralized mechanism for such self-adaptation, using market-
based heuristics. We use a continuous double-auction to allow applications to decide which services to
choose, among the many on offer. We view an application as a multi-agent system and the cloud as a
marketplace where many such applications self-adapt. We show through a simulation study that our
mechanism is effective for the individual application as well as from the collective perspective of all
applications adapting at the same time.
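A continuous double auction can be captured in a few lines: bids and asks sit in an order book, and a trade clears whenever the best bid meets the best ask. The sketch below uses a midpoint clearing price; the paper's market rules and QoS dimensions are richer, and all names here are illustrative.

    # Tiny continuous double auction: applications post bids for a service,
    # providers post asks; matching happens whenever best bid >= best ask.
    import heapq

    class CDA:
        def __init__(self):
            self.bids, self.asks = [], []   # max-heap (negated prices), min-heap

        def bid(self, price, app):
            heapq.heappush(self.bids, (-price, app))
            return self._match()

        def ask(self, price, provider):
            heapq.heappush(self.asks, (price, provider))
            return self._match()

        def _match(self):
            trades = []
            while self.bids and self.asks and -self.bids[0][0] >= self.asks[0][0]:
                (bp, app), (ap, provider) = heapq.heappop(self.bids), heapq.heappop(self.asks)
                # -bp is the bid price, so (ap - bp) / 2 is the midpoint price
                trades.append((app, provider, (ap - bp) / 2))
            return trades

    market = CDA()
    market.ask(price=5.0, provider="storage-A")
    print(market.bid(price=7.0, app="photo-app"))   # clears at midpoint 6.0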
ETPL
SE-035 Coverage Estimation in Model Checking with Bitstate Hashing
Abstract: Explicit-state model checking, which is conducted by state-space search, has difficulty in
exploring a satisfactory portion of the state space because of its memory requirements. Though bitstate hashing achieves
memory efficiency, it cannot guarantee complete verification. Thus, it is desirable to provide a reliability
indicator such as a coverage estimate. However, the existing approaches for coverage estimation are not
very accurate when a verification run covers a small portion of state space. This mainly stems from the
lack of information that reflects characteristics of models. Therefore, we propose coverage estimation
methods using a growth curve that approximates the increase in reached states as the Bloom filter is enlarged.
Our approaches improve estimation accuracy by leveraging the statistics from multiple verification runs.
Coverage is estimated by fitting the growth curve to these statistics. Experimental results confirm the
validity of the proposed growth curve and the applicability of our approaches to practical models. In fact,
for practical models, our approaches outperformed the conventional ones when the actual coverage is
relatively low.
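The estimation idea can be sketched as a curve-fitting exercise, assuming scipy is available: fit a saturating growth curve to reached-state counts from runs with increasing Bloom-filter sizes, and read the asymptote off as the estimated true state count. The specific curve and the numbers below are illustrative choices, not necessarily the paper's.

    # Fit a saturating growth curve to reached-state statistics from several
    # bitstate runs; the asymptote S estimates the total number of states.
    import numpy as np
    from scipy.optimize import curve_fit

    def growth(m, S, k):
        """Reached states as a function of filter size m, saturating at S."""
        return S * (1 - np.exp(-k * m))

    filter_sizes = np.array([1e6, 2e6, 4e6, 8e6, 16e6])
    reached = np.array([610_000, 810_000, 930_000, 985_000, 999_000])

    (S, k), _ = curve_fit(growth, filter_sizes, reached, p0=(1e6, 1e-6))
    coverage_estimate = reached[-1] / S
    print(f"estimated total states ~ {S:,.0f}; coverage ~ {coverage_estimate:.1%}")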
ETPL
SE-036 Synthesizing Modal Transition Systems from Triggered Scenarios
Abstract: Synthesis of operational behavior models from scenario-based specifications has been
extensively studied. The focus has been mainly on either existential or universal interpretations. One
noteworthy exception is Live Sequence Charts (LSCs), which provides expressive constructs for
conditional universal scenarios and some limited support for nonconditional existential scenarios. In this
paper, we propose a scenario-based language that supports both existential and universal interpretations
for conditional scenarios. Existing model synthesis techniques use traditional two-valued behavior
models, such as Labeled Transition Systems. These are not sufficiently expressive to accommodate
specification languages with both existential and universal scenarios. We therefore shift the target of
synthesis to Modal Transition Systems (MTS), an extension of Labeled Transition Systems that can
distinguish between required, unknown, and proscribed behavior to capture the semantics of existential
and universal scenarios. Modal Transition Systems support elaboration of behavior models through
refinement, which complements an incremental elicitation process suitable for specifying behavior with
scenario-based notations. The synthesis algorithm that we define constructs a Modal Transition System
that uses refinement to characterize all the Labeled Transition Systems models that satisfy a mixed,
conditional existential and universal scenario-based specification. We show how this combination of
scenario language, synthesis, and Modal Transition Systems supports behavior model elaboration.
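A modal transition system only needs a three-way labeling of transitions. The sketch below encodes "required" versus "maybe" transitions and checks the naive per-state refinement rule for deterministic systems that share state names, which is a simplification of the paper's setting, not its synthesis algorithm.

    # An implementation LTS must include every required transition and must
    # use only transitions the MTS marks as possible (required or maybe).
    def implements(mts, lts, states):
        """mts: {state: {action: 'required' | 'maybe'}}; lts: {state: set(actions)}."""
        for s in states:
            required = {a for a, m in mts.get(s, {}).items() if m == "required"}
            possible = set(mts.get(s, {}))
            actual = lts.get(s, set())
            if not required <= actual or not actual <= possible:
                return False
        return True

    mts = {"idle": {"start": "required", "log": "maybe"}, "busy": {"stop": "required"}}
    good = {"idle": {"start"}, "busy": {"stop"}}
    bad = {"idle": {"shutdown"}, "busy": {"stop"}}
    print(implements(mts, good, ["idle", "busy"]), implements(mts, bad, ["idle", "busy"]))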
ETPL
SE-037 Proactive and Reactive Runtime Service Discovery: A Framework and Its Evaluation
Abstract: Identifying services during the execution of service-based applications, in order to replace
services in them that are no longer available and/or fail to satisfy certain requirements, is an important
issue. In this paper, we present a framework to support runtime service discovery. This framework can
execute service discovery queries in pull and push mode. In pull mode, it executes queries when a need
for finding a replacement service arises. In push mode, queries are subscribed to the framework to be
executed proactively and, in parallel with the operation of the application, to identify adequate services
that could be used if the need for replacing a service arises. Hence, the proactive (push) mode of query
execution makes it more likely to avoid interruptions in the operation of service-based applications when
a service in them needs to be replaced at runtime. In both modes of query execution, the identification of
services relies on distance-based matching of structural, behavioral, quality, and contextual characteristics
of services and applications. A prototype implementation of the framework has been developed and an…