
Thesis no: MSSE-2016-14

Performance, Scalability, and Reliability (PSR) challenges, metrics and tools for web testing

A Case Study

Akshay Kumar Magapu
Nikhil Yarlagadda

Faculty of Computing
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden


This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Author(s):
Akshay Kumar Magapu
E-mail: [email protected]

Nikhil Yarlagadda
E-mail: [email protected]

External advisor:
Saket Rustagi
Project Manager
Ericsson India Global Services Pvt. Ltd.
Gurgaon, India

University advisor:
Michael Unterkalmsteiner
Department of Software Engineering

Faculty of Computing
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden
Internet : www.bth.se
Phone : +46 455 38 50 00
Fax : +46 455 38 50 57


Abstract

Context. Testing of web applications is an important task, as it ensures the functionality and quality of web applications. The quality of a web application is assessed through non-functional testing. There are many quality attributes, such as performance, scalability, reliability, usability, accessibility and security. Among these, performance, scalability and reliability (PSR) are the most important and most commonly considered attributes in practice. However, very few empirical studies have been conducted on these three attributes.

Objectives. The purpose of this study is to identify the metrics and tools that are available for testing these three attributes, and to identify the challenges faced while testing them, both in the literature and in practice.

Methods. In this research, a systematic mapping study was conducted to collect information regarding the metrics, tools, challenges and mitigations related to the PSR attributes. The required information was gathered by searching five scientific databases. We also conducted a case study to identify the metrics, tools and challenges of the PSR attributes in practice. The case study was conducted at Ericsson, India, where eight subjects were interviewed. Four subjects working in other companies (in India) were also interviewed in order to validate the results obtained from the case company. In addition, a few documents from previous projects at the case company were collected for data triangulation.

Results. A total of 69 metrics, 54 tools and 18 challenges were identified from the systematic mapping study, and 30 metrics, 18 tools and 13 challenges were identified from the interviews. Data were also collected through documents, from which 16 metrics, 4 tools and 3 challenges were identified. Based on the analysis of these data, we formed a consolidated list of tools, metrics and challenges.

Conclusions. We found that the metrics available in the literature overlap with the metrics used in practice. However, the tools found in the literature overlap only to some extent with those used in practice. The main reason for this deviation is the limitations identified in the existing tools, which led the case company to develop its own in-house tool.


We also found that the challenges partially overlap between the state of the art and practice. We were unable to collect mitigations for all of these challenges from the literature, and hence further research is needed. Among the PSR attributes, most of the literature is available on the performance attribute, and most of the interviewees were most comfortable answering questions related to performance. Thus, we conclude that there is a lack of empirical research related to the scalability and reliability attributes. Our research deals with the PSR attributes in particular, and there is scope for further research in this area: it can be extended to other quality attributes and carried out on a larger scale (considering a greater number of companies).

Keywords: Web applications, Web testing, Performance, Scalability, Reliability, Quality.


Acknowledgments

We would like to thank our supervisor Michael Unterkalmsteiner for his tremendous and quick support whenever needed. We also thank Ericsson for providing us with the opportunity to conduct the case study, and the interviewees from other organizations for participating in the interviews. Special credit goes to our family and friends for the support that helped us complete this thesis.

The authors


Contents

Abstract

Acknowledgments

1 Introduction
1.1 Web testing
1.1.1 Functional testing
1.1.2 Non-functional testing
1.2 Problem statement
1.3 Thesis structure

2 Background and Related Work
2.1 Web applications
2.2 Web testing
2.2.1 Functional testing
2.2.2 Non-functional testing
2.3 Selected attributes
2.4 Research scope
2.5 Related work
2.5.1 Literature related to metrics
2.5.2 Literature related to tools
2.5.3 Literature related to challenges
2.5.4 Research gap

3 Method
3.1 Research purpose
3.1.1 Objectives
3.2 Research questions
3.2.1 Motivation
3.3 Research method
3.3.1 Systematic mapping study
3.3.2 Case study
3.4 Data analysis
3.4.1 Familiarizing yourself with the data
3.4.2 Generating initial codes
3.4.3 Searching for themes
3.4.4 Reviewing themes
3.4.5 Defining and naming themes
3.4.6 Producing the report
3.5 Validity threats
3.5.1 Construct validity
3.5.2 Internal validity
3.5.3 External validity
3.5.4 Reliability

4 Results and Analysis
4.1 Facet 1: Metrics for testing PSR attributes
4.1.1 Systematic mapping study
4.1.2 Interviews and documents
4.1.3 Criteria for selection of metrics
4.2 Facet 2: Tools for testing PSR attributes
4.2.1 Systematic mapping study
4.2.2 Interviews and documents
4.2.3 Tool drawbacks and improvements
4.3 Facet 3: Challenges faced by software testers
4.3.1 Systematic mapping study
4.3.2 Interviews and documents
4.3.3 Do mitigations available in literature mitigate challenges in practice?
4.4 Facet 4: Important attribute among PSR
4.4.1 Interviews

5 Discussion
5.1 Metrics for testing PSR attributes of web applications
5.2 Tools for testing PSR attributes of web applications
5.3 Challenges in PSR testing of web applications
5.4 Most important attribute among PSR
5.5 Implications

6 Conclusions and Future Work
6.1 Research questions and answers
6.1.1 RQ 1: Metrics used for testing the PSR attributes
6.1.2 RQ 2: Tools used for testing the PSR attributes
6.1.3 RQ 3: Challenges identified while testing the PSR attributes
6.1.4 RQ 4: Important attribute among PSR
6.2 Conclusion
6.3 Research contribution
6.4 Future work

Appendices

A Systematic maps
B SMS overview
C List of metrics
D List of tools
E List of challenges
F Interview questions
F.1 Technical Questions
F.1.1 Tools
F.1.2 Metrics
F.1.3 Challenges
F.1.4 General
G MTC and IA identified between case company and other companies
G.1 Metrics
G.2 Tools
G.3 Challenges
G.4 Important attribute
H Consent form


List of Figures

1.1 Types of testing
1.2 Thesis structure

2.1 Requirements classification
2.2 Types in functional testing
2.3 Types in non-functional testing
2.4 Research scope

3.1 Systematic mapping study process
3.2 Case study process steps
3.3 Pyramid model for interview questions
3.4 Steps for thematic analysis
3.5 Themes formed in Nvivo tool for interviews

4.1 Number of sources addressing the research attributes
4.2 Thematic map for metrics from SMS
4.3 Thematic map for metrics from interviews
4.4 Thematic map for metrics from documents
4.5 Thematic map for tools from SMS
4.6 Thematic map for tools from interviews
4.7 Types of tools obtained from interviews
4.8 Thematic map for tools from documents
4.9 Thematic map for challenges from SMS
4.10 Number of articles addressed each theme from SMS
4.11 Thematic map for challenges from interviews
4.12 Number of interviewees addressed the themes
4.13 Thematic map for challenges from documents
4.14 Thematic map for important attribute from interviews

5.1 Overlap and differences in metrics among all data sources
5.2 Overlap and differences in tools among all data sources
5.3 Overlap and differences in challenges among all data sources
5.4 Overlap and differences in challenge areas among all data sources
5.5 Overlap and differences in metrics between state of art and state of practice
5.6 Overlap and differences in tools between state of art and state of practice
5.7 Overlap and differences in challenges between state of art and state of practice

A.1 Research parameters vs research attributes in SMS
A.2 Research methods vs research attributes in SMS
A.3 Research methods vs research parameters in SMS


List of Tables

3.1 Keywords used for search string formulation
3.2 Search strings used for selection of literature
3.3 Search results before and after applying exclusion criteria
3.4 Initial search selection results
3.5 Search results after removing duplicate articles in each database
3.6 Data extraction form
3.7 Details of interviewee
3.8 Overview of selected companies
3.9 Research questions and their respective data collection technique

4.1 Performance metrics
4.2 Scalability metrics
4.3 Reliability metrics
4.4 Metrics obtained from documents
4.5 Identified PSR tools
4.6 Commercial tools obtained from interviews
4.7 Frameworks obtained from interviews
4.8 Monitoring tools obtained from interviews
4.9 Open source tools obtained from interviews
4.10 Tools obtained from documents

B.1 SMS overview

C.1 Metrics description

E.1 List of challenges

G.1 Identified metrics between case company and other companies
G.2 Identified tools between case company and other companies
G.3 Identified challenges between case company and other companies
G.4 Identified important attribute between case company and other companies


List of acronyms and abbreviations

GUI Graphical User Interface

IA Important Attribute

MTBF Mean Time Between Failures

MTC Metrics Tools Challenges

MTTF Mean Time To Failure

MTTR Mean Time To Repair

PS Performance Scalability

PSR Performance Scalability Reliability

SMS Systematic Mapping Study

SR Scalability Reliability

UI User Interface

WWW World Wide Web


Chapter 1

Introduction

1.1 Web testing

The internet has evolved considerably in recent years, and the number of users depending on internet resources is increasing exponentially. The internet acts as a medium between web applications and users. Web applications have become very popular and attract many users because of their flexibility and ease of access from anywhere. Because of this popularity, many software companies use web applications to provide their services directly to users. They are used in various fields such as education, entertainment, business, manufacturing, cooperative work, scientific simulation, and communication in order to satisfy the requirements of users [1]. The complexity of web applications is increasing day by day in order to satisfy these requirements. At the same time, organizations are deploying web applications into the market without proper testing because of time pressure and early releases [2]. Some web applications fail to satisfy the needs of users because of their poor quality. Unsatisfied users leave such web sites with a negative impression, which leads to a loss of users, sales and business opportunities. A recent study [3] stated that, because of the poor quality of a website, some of the site's users stopped buying the product from the website, while other users stopped using the product entirely; because of the poor quality of the website, the organization lost the majority of its customers. Hence, users are the ultimate judges of the success or failure of web applications. Users mainly evaluate a website in terms of availability, reliability, response time, cost and accuracy [4]. Satisfying users is therefore one of the important challenges, and companies need to keep these criteria in mind while developing web applications. Withstanding competition from rival products in the market is also an important factor to be considered.

User satisfaction and defeating competing products in the market are two major challenges that need to be addressed by software companies while developing web applications. This can be achieved by producing and delivering a quality-assured web application to the market, which in turn requires subjecting the web application to various tests during the software testing phase of the software development life cycle [5]. The testing phase is a significant activity which ensures software quality and reliability. There are two types of testing that can be done for web applications, represented in figure 1.1 and discussed next.

Figure 1.1: Types of testing

1.1.1 Functional testing

Functional testing mainly concentrates on validating the functional requirements of web applications. These requirements are focused on the activities and interactions the software shall support in order to fulfill the users' requirements. Functional testing consists of three approaches: white box testing, black box testing and grey box testing [6]. These approaches are further discussed in section 2.2.1.

1.1.2 Non-functional testing

Non-functional testing mainly concentrates on validating the non-functional requirements of web applications. Even though functional testing is important, quality testing is a must in a competitive market [7]. The core functionality is important, but without quality, the functionality of the system is of little interest to users, so the focus of research has shifted more towards testing non-functional parameters [1]. Testing non-functional attributes mainly depends on the runtime environment of an application.

Recently, the focus of researchers and industry has shifted towards non-functional testing of web applications. Numerous methods are being developed to test quality attributes effectively in web testing [3]. Web applications with high quality attract the attention of customers, so the demand for developing high-quality web applications is increasing in the current market [8]. Various testing parameters can be considered to measure the quality of web applications, including GUI testing, usability testing, PSR, safety, security and many more [9]. Users rate the response time as an important factor: if a web application fails to respond within eight seconds, then 30% of the users leave the application [10, 11]. For example, in the case of a banking system, the security, performance and reliability factors are important [2].

The performance, scalability and reliability attributes are selected for our research as these are important attributes which are commonly used for testing non-functional requirements in many systems [1, 2]. PSR attributes are perceived as a critical factor by users, and hence these attributes can judge the quality of web applications [2]. A lot of research has been done on these three attributes for software systems, i.e. traditional software [11]. However, PSR testing of web applications has received more attention recently with the rise in popularity of web applications.

1.2 Problem statement

Web testing has gained popularity in the last few years, but there are few empirical studies that focus on the tools and metrics used for testing the PSR quality attributes [1, 12]. The challenges faced while testing web applications are also not well known, as only a limited number of studies concentrate on challenges regarding the PSR attributes [1, 13, 14]. In order to gain knowledge in this area and to understand how to perform such testing without problems, there is a need for further research. To address this problem, we conduct a study that provides a list of metrics, tools and challenges (MTC) for practitioners and researchers to build on. This research investigates and analyzes metrics, selection criteria for metrics, tools, drawbacks of tools from the software testers' perspective, and challenges, both from the literature and from practice. Mitigation strategies available in the literature for these challenges are also analyzed. An empirical study is conducted at Ericsson, India to investigate the practical challenges faced by testers and to identify the tools and metrics used by software testers for web application testing.

1.3 Thesis structure

This section gives a general idea of the chapters in this thesis. The thesis consists of six chapters, each with its own purpose. The main focus and details of each chapter are provided below and shown in figure 1.2.


Figure 1.2: Thesis structure

• Chapter 2 focuses on the topics involved in the research, providing basic information to the reader, and also explains some of the previous work carried out in the studied area.

• Chapter 3 presents the research methodology chosen for conducting the research. It also describes the implementation of the research method and the research operation.

• Chapter 4 provides the results obtained from the research methods and their analysis.

• Chapter 5 discusses the significant information retrieved from the analysis.

• Chapter 6 summarizes and concludes the present research and presents the future work that can be carried out in later stages.


Chapter 2

Background and Related Work

This chapter describes the background of the proposed research, the selection of attributes, the research scope, and related work. Section 2.1 describes the concept of web applications. Section 2.2 introduces web testing. Section 2.3 concentrates on the attributes selected from the several quality attributes for this research. Section 2.4 focuses on the scope of the research. Section 2.5 discusses previous research on testing the PSR attributes (related to MTC).

2.1 Web applications

The World Wide Web (WWW) has grown exponentially in the past twenty-two years. Web applications have evolved from static applications to dynamic and distributed applications [15]. Nowadays web applications can be accessed from desktops to smart phones [16]. The growth in the number of users and devices has led to a phenomenal increase in the use of web applications. The evolving nature of web applications and the services they offer have made them a fundamental part of users' lives. The main difference between web applications and traditional applications is that they can be accessed from any device and from anywhere without the need to install any module, which explains their success [17, 11].

Web applications provide services in many areas such as e-commerce, e-learning, e-business, entertainment, socializing and many more [14, 1]. People have started using web applications as a medium for day-to-day communication, and in recent years they have become a part of their lives [6]. Due to the vast growth of the web, many companies and businesses rely on web applications, and the structure and ease of use of these applications can decide the success or failure of an enterprise. As observed by Hugo Menino et al. and Yu Qi et al. [17, 18], the accuracy and performance of web applications are considered factors in deciding the success of enterprises. Many technologies have been introduced for developing web applications, each with its own pros and cons; based on suitability and requirements, an appropriate technology is selected. According to Nikfard [6], applications are classified into different classes based on their context and the information they provide:

• Deliverable

• Customizable

• Transactional

• Acceptable

• Interactive

• Service-oriented

• Accessible

• Data warehouse

• Automatic

Web applications run on the server side, while their interface is accessed through the client side. A web application can be seen as a collection of files: the user creates a request, the request is sent to the server, and the server processes the request and generates a response that is returned to the client. The communication between client and server is executed in this way. Web applications are hosted on web servers, and the requests generated by clients are handled by the hosting servers.
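
To make the request/response cycle described above concrete, the following minimal sketch (in Python, purely illustrative and not taken from the thesis) starts a tiny local server and lets a client send a request and read the generated response; the host, port choice and message are arbitrary assumptions.

```python
# Minimal sketch of the client/server request-response cycle described above.
# Illustrative only; real web applications are hosted on dedicated web servers.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):                              # server processes the client's request
        body = b"Hello from the server side"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                     # response generated for the client

if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()

    url = f"http://127.0.0.1:{server.server_address[1]}/"
    with request.urlopen(url) as resp:             # client sends the request
        print(resp.status, resp.read().decode())   # client reads the server's response
    server.shutdown()
```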

So far we have considered only the basics of web applications; the complex part is developing and testing them. Web development is a tedious task, as even a minor error can cause chaos in the entire web application. Hence, web applications need to be developed with security in mind, so that attackers cannot access confidential data. Web applications have other disadvantages as well, but discussing them is out of the scope of this thesis. Web applications have developed to the point where, although they are generally accessed over an internet connection, nowadays they can also be accessed offline.

Web applications are retrieved or viewed using a client-side browser. There are many browsers available in the market, and web applications are not supported by all of them. Web applications can, however, be accessed from any platform. One major reason web applications attract many users is the fact that they can be accessed without installing additional software [19].

As web applications are valued by both customers and organizations, proper testing needs to be done prior to deployment in the live environment [19]. Web testing and its types are discussed in section 2.2.


2.2 Web testing

Testing is the technique used to ensure the functionality and quality of software. Testing of web applications is called web testing, and it is a branch of software testing. The aim of web testing is to find application failures. Generally, a failure is due to the presence of faults in the application. According to Di Lucca and Fasolino [14], faults can be caused by the running environment and by the interaction between client and server. According to Arora and Sinha [20], web testing is different from traditional software testing, since web applications may undergo many maintenance changes and even daily updates. Web applications do not have a specific or default number of users: the number of users accessing a web application can increase or decrease at any given point in time. It is very difficult to find existing errors in web applications because of their multi-tier architecture [21]. Web applications are dynamic in nature, so their complexity is increasing in order to fulfill numerous requirements [22]. To maintain the functionality and quality of web applications, they need to be tested. There are two types of requirements for building web applications, as represented in figure 2.1.

Figure 2.1: Requirements classification

• Functional requirements: The functional requirements are related to the application itself; the functionalities of the application are described through these requirements. To test the functionality of a web application, the requirements are validated against the application. If the requirements are satisfied, the application is said to have passed the test. Functional testing of web applications is further discussed in section 2.2.1.

• Non-functional requirements: The non-functional requirements are more related to the environment of the application and help to improve its quality. These requirements are verified and validated through non-functional testing, which is further discussed in section 2.2.2.


2.2.1 Functional testing

Functional testing is used to validate the functional requirements against the application; the faults in a web application are identified through this testing. Functional testing validates the flow of the application to check whether it progresses in the desired manner. It validates web links and their faults, such as broken links and page errors, and covers validation of inputs, forms, links and breadcrumbs, dynamic verification, and content testing. It also validates the existing fields in the web application, for example when data is entered in an incorrect format. Thus functional testing verifies the whole application in order to check whether it works without faults. Functional testing of a web application generally depends on four aspects: testing levels, test strategies, test models and the testing process [6]. It consists of three test approaches: white box testing, black box testing and grey box testing.

White box testing mainly focuses on testing the structure of the application and the percentage of code covered during testing. Black box testing is related to testing the application's behavior, in which test cases are written to test the functionality of the application. Grey box testing is a combination of white box and black box testing: both the environment and the functionality of the application are tested. Grey box testing is more suitable for testing web applications, as it identifies faults and failures existing in the environment of the web application as well as in the flow of the application [6].

Functional testing of web applications consists of six sub-types, as shown in figure 2.2.

Figure 2.2: Types in functional testing


2.2.1.1 Smoke testing

When developers write the code for an application, an initial test is performed, termed smoke testing. It is carried out to verify whether the written code works. If it fails, the developer knows that there is a fault in the written code and makes sure to rectify it [14, 6].

2.2.1.2 Unit testing

A certain function or unit of code is tested after it is developed. The testing concentrates mainly on that specific function and does not test the other features depending on it. This kind of testing is termed unit testing [14, 6].

2.2.1.3 Regression testing

Testing new functional code together with the previously implemented or modified code is called regression testing. This testing is performed when there is a small change or addition to the existing code, to verify whether the modified or added code introduces any faults into the application [14, 6].

2.2.1.4 Integration testing

After certain functionalities are developed, they are integrated together and integration testing is performed. It ensures that the application works as expected even after all the individual components are integrated [14, 6].

2.2.1.5 System testing

System testing is used to find defects in the entire web application. Generally, system testing is conducted using three approaches: black box, white box and grey box. The black box approach is used to identify failures in user functions and the external behavior of the application. The white box approach is used to identify defects related to incorrect links between pages, and grey box testing is used to identify defects related to the application's navigation structure as well as its external behavior [14, 6].

2.2.1.6 Acceptance testing

Acceptance testing is performed to ensure that the user requirements and business requirements are satisfied by the developed application and that it is ready to be deployed and used in the live environment. This testing is conducted according to acceptance criteria prepared prior to testing, so it validates whether the developed application meets the acceptance criteria [23].

Functional testing can be done both manually and in an automated way. A test plan is designed describing how to perform the tests; later, test cases and test suites are prepared to test the web application. For web applications there are also capture-and-replay tools, which are used to capture (record) the functionality and replay (retest) it [7]. Using a capture-and-replay tool, many types of scenarios can be tested, and through repeated testing it can be ensured that the application works as expected in the live environment.
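
As an illustration of automated functional testing of the kind described above, the sketch below uses Selenium WebDriver (assumed to be installed together with a matching browser driver); the URL, element names and expected text are hypothetical placeholders, not values taken from the thesis.

```python
# Hedged sketch of an automated functional test with Selenium WebDriver.
# The login page URL, element names and expected text are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login():
    driver = webdriver.Chrome()                    # assumes Chrome and its driver are available
    try:
        driver.get("http://example.com/login")     # hypothetical page under test
        driver.find_element(By.NAME, "username").send_keys("testuser")
        driver.find_element(By.NAME, "password").send_keys("secret")
        driver.find_element(By.ID, "submit").click()
        assert "Welcome" in driver.page_source     # assumed expected behaviour of the page
    finally:
        driver.quit()

if __name__ == "__main__":
    test_login()
```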

2.2.2 Non-functional testing

Ensuring only the functionality of a web application is insufficient in the present competitive market. As in every field, the quality of web applications is a major concern, so validating the non-functional requirements is necessary. According to Hossain and Nikfard [3, 6], there are seven non-functional testing types for web applications, which ensure the quality of a web application. To perform testing on these attributes, benchmarks and test strategies need to be defined.

The different attributes in non-functional testing of web applications are asshown in figure 2.3.

Figure 2.3: Types in non-functional testing

2.2.2.1 Performance testing

Performance testing is performed to determine the application's performance in terms of response time, availability, etc. Response time is defined as the time taken to receive a response from the server after a user submits a request to the application. To assess the performance of a web application in the live environment, a set-up with virtual users is generated, simulating the behavior of real users through scenarios that perform certain operations, in order to measure the performance attribute. As web applications are very dynamic in nature, performance testing is a continuous process, and the application's performance needs to be analyzed through activity logs [14, 6]. The important subtypes of performance testing are discussed next.
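
As a minimal illustration of the response time metric defined above, the following Python sketch (not from the thesis; the URL is a placeholder) times a single request/response round trip:

```python
# Minimal sketch: measuring the response time of a single request (hypothetical URL).
import time
from urllib import request

def response_time(url: str) -> float:
    """Return the elapsed time in seconds for one request/response round trip."""
    start = time.perf_counter()
    with request.urlopen(url) as resp:
        resp.read()                       # wait until the full response is received
    return time.perf_counter() - start

if __name__ == "__main__":
    t = response_time("http://example.com/")   # hypothetical application under test
    print(f"Response time: {t:.3f} s")
```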

2.2.2.1.1 Load testing

Load testing is defined as the ability of the system to handle a maximum amount of workload without any significant degradation in performance [12]. It is performed under a minimum configuration and maximum activity levels to examine the time taken to complete a set of actions. A set of users is simulated and tested to obtain various scenarios. Load testing shows how much load the application can withstand while responding to every request it receives. It helps in identifying bottlenecks and failures of the system; the failures recognized through load testing are due to faults in the running environment conditions [14, 6].
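
The following sketch illustrates the idea of load testing with simulated virtual users, assuming Python's standard library and a hypothetical URL; real load tests would normally use a dedicated tool such as JMeter or Grinder.

```python
# Minimal load-test sketch: N concurrent virtual users each issue one request,
# and the average and maximum response times are reported.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib import request

URL = "http://example.com/"   # hypothetical application under test

def one_request(_):
    start = time.perf_counter()
    with request.urlopen(URL) as resp:
        resp.read()
    return time.perf_counter() - start

def run_load(virtual_users: int):
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        times = list(pool.map(one_request, range(virtual_users)))
    return sum(times) / len(times), max(times)

if __name__ == "__main__":
    for users in (1, 10, 50):                  # increasing activity levels
        avg, worst = run_load(users)
        print(f"{users:3d} users: avg {avg:.3f} s, max {worst:.3f} s")
```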

2.2.2.1.2 Stress testing

Stress testing is carried out to verify whether the application can withstand a load pushed beyond its expected limit. It helps in identifying bottlenecks of the system such as memory leakage, transactional problems, resource locking, hardware limitations and bandwidth limits [24]. Most of the failures or errors detected through stress testing are due to faults in the running environment conditions, such as hardware and software [14, 6, 25].

2.2.2.2 Scalability testing

Scalability testing is defined as the flexibility of a system to deal with changes caused by an increase in load without deviating from predefined objectives [1]. It is performed to validate the balancing of load across resources when a certain load is reached. Hardware resources are added and tested to determine the change in response time and the effect of adding the resource to the application. Failures detected through scalability testing are due to faults in the running environment and hardware resources. Scalability can be implemented in two ways: vertical and horizontal scalability [21].

2.2.2.2.1 Vertical scalability

Vertical scalability is achieved by adding extra resources, such as memory or processors, to an existing server; it is also known as scaling up. Vertical scalability has both positive and negative impacts on the system [21]. The positive impact is that it increases performance and system manageability, as resources such as memory and processors are added. The negative impact is that it decreases the availability and reliability of the system, as load balancing may become difficult among a greater number of resources.

2.2.2.2.2 Horizontal scalability

Horizontal scalability, also known as scaling out, is obtained by adding extra servers to the system. Like vertical scalability, it has both positive and negative impacts on the system [21]. The positive impact is that it improves availability, reliability and performance, as more servers are added and, if one of them fails, the others can continue to work. The negative impact is that it reduces the manageability of the system, as managing a larger number of servers becomes more difficult.
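
As an illustrative (not thesis-prescribed) way of quantifying horizontal scalability, one can compare the throughput achieved after adding servers with the ideal linear gain; the figures used below are invented for the example.

```python
# Illustrative calculation: horizontal scaling efficiency, i.e. how much of the
# ideal (linear) throughput gain is realised when servers are added.
def scaling_efficiency(base_throughput: float, scaled_throughput: float,
                       base_servers: int, scaled_servers: int) -> float:
    """Return achieved speedup divided by the ideal linear speedup."""
    speedup = scaled_throughput / base_throughput
    ideal = scaled_servers / base_servers
    return speedup / ideal

if __name__ == "__main__":
    # Example: one server handles 200 requests/s, three servers handle 510 requests/s.
    eff = scaling_efficiency(200.0, 510.0, 1, 3)
    print(f"Scaling efficiency: {eff:.2f}")   # 0.85 -> 85% of linear scaling
```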

2.2.2.3 Usability testing

Usability testing is performed to measure the usability of the application in terms of ease of use, content, navigability, colour and look, etc. Usability testing is necessary for web applications in order to determine how easy the application is to use; based on the results obtained, the application can be improved further. Failures found in usability testing are due to faults identified in the application [14, 6].

2.2.2.4 Compatibility testing

Compatibility testing is carried out to validate the execution of the application in different environments and to identify the environments in which the application fails to execute. Not all web applications are capable of running in every browser, so a test strategy is defined prior to testing which details the set of browsers that have to be tested. Failures found through this testing are due to faults in the application and in the running environment [14, 6].

2.2.2.5 Security testing

Security testing is carried out to validate how secure the application is from intruders or hackers who may steal confidential information. Security testing is a challenging task in web testing: even when proper care is taken, attack vectors may still exist in the application, and intruders can use security flaws as a medium to access confidential information. Security testing therefore has to be done with the utmost care. Failures found through this testing may be due to faults in the application and in the running environment [14, 6].

2.2.2.6 Accessibility testing

Accessibility testing is performed to validate that the content of the application is accessible even on low-configuration systems and to check whether the content can be retrieved by physically handicapped people. Accessibility is a necessary attribute of an application that is accessed by many users. Failures detected through accessibility testing are due to faults in the application and in the running environment [14, 6].


2.2.2.7 Reliability testing

Reliability testing is carried out to validate the application based on how long it can stay up and the time taken to recover if it fails. Reliability testing is necessary for web applications: it provides information on how long the application can be available and how much time it requires to recover from a failure. The failures identified in this testing are mainly related to environment problems [14, 6].
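
A small illustrative calculation (assumed here, not taken from the thesis) shows how the reliability-related measures listed in the acronyms (MTTF, MTTR and MTBF) combine into a steady-state availability figure; the uptime and downtime values are invented for the example.

```python
# Illustrative sketch: reliability figures from observed uptime/downtime periods,
# using the standard MTTF, MTTR and MTBF definitions plus steady-state availability.
def reliability_figures(uptimes_h, downtimes_h):
    """uptimes_h: hours of operation before each failure; downtimes_h: hours to recover."""
    mttf = sum(uptimes_h) / len(uptimes_h)      # Mean Time To Failure
    mttr = sum(downtimes_h) / len(downtimes_h)  # Mean Time To Repair
    mtbf = mttf + mttr                          # Mean Time Between Failures
    availability = mttf / mtbf                  # fraction of time the application is up
    return mttf, mttr, mtbf, availability

if __name__ == "__main__":
    mttf, mttr, mtbf, avail = reliability_figures([700, 650, 720], [2.0, 1.5, 2.5])
    print(f"MTTF {mttf:.1f} h, MTTR {mttr:.1f} h, MTBF {mtbf:.1f} h, availability {avail:.4f}")
```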

2.3 Selected attributes

The focus of this thesis is on non-functional testing of web applications. Selecting all non-functional attributes would make the scope of the study too broad, so we selected three attributes for the research: Performance, Scalability and Reliability (PSR). These three attributes are collectively called quality factors. An overview of the three attributes is provided in section 2.2.2. The motivation for selecting these attributes is provided below.

The important attributes among the quality criteria are provided by Svensson et al. [26], who conducted interviews in 11 different software companies. Their results are as follows: usability, performance, reliability, stability and safety are the top five most important attributes compared to the other quality attributes. Even though usability is the top quality attribute, we did not consider it, as a lot of research has already been done on this attribute [27, 28].

The importance of the performance, scalability and reliability attributes is highlighted by Iyer et al. [13], who investigated the issues in testing methods for PSR. Smith and Williams [29] described the need for quality of service in web applications and how it is obtained by focusing on the scalability and performance attributes; the importance of scalability and its role in achieving user satisfaction is explained clearly. Data analyzed from the literature describe scalability as a part of the performance attribute [30]. As mentioned by Svensson et al. [26], the second and third attributes are performance and reliability. These are considered in our research because only a small amount of research has been done so far on testing these attributes collectively in web applications [13, 31, 32, 33]. Based on the above reasons, we narrowed the broad scope by restricting our thesis to three quality attributes: Performance, Scalability, and Reliability (PSR).

One of the main reasons for selecting the PSR attributes is that the case company focuses most on these three attributes. The remaining two attributes, stability and safety, are not considered in our research because the case company does not consider them while testing web applications; it focuses on performance, scalability, reliability, usability and compatibility. According to [27, 28], the usability attribute has already received a lot of research, and hence we did not focus on it, whereas compatibility is more related to the browser and operating system, so it differs from PSR. Furthermore, Svensson et al. [26] list the top five quality attributes, and compatibility is not one of them. For these reasons, we focus mainly on the PSR attributes in this study [34].

2.4 Research scope

This section illustrates the scope of the research, which consists of three areas, depicted diagrammatically in figure 2.4. The three areas are testing, software, and quality attributes. As testing is a broad area, we limited our scope to web testing and further narrowed it down to quality testing of web applications with respect to the PSR attributes.

Figure 2.4: Research scope


2.5 Related work

This section discusses the work that has been carried out over the past years, explains the gap in this area and describes how the identified gap is filled. It contains four subsections, which cover related work on metrics, tools and challenges, and describe the identified gap.

2.5.1 Literature related to metrics

Dogan et al. [35] focused on identifying the tools and metrics available for testing web applications. They identified a few metrics, based on criteria such as cost and effectiveness, for testing web applications in general.

Kanij et al. [32] explained some of the metrics considered while testing the performance attribute. They addressed two metrics for the performance attribute and explained the need for further research to find related metrics.

Xiaokai Xia et al. [36] proposed a model to evaluate and analyze the performance attribute of web applications. The proposed model mainly concentrates on finding issues that go unidentified during the testing phase; to identify them, the authors considered a set of metrics for building the model, which mainly concerns the performance attribute.

Deepak Dagar and Amit Gupta [37] mainly focused on addressing the types, tools and methods used for testing web applications. As part of the research, they also explained some of the metrics related to performance testing.

Rina and Sanjay Tyagi [38] compared testing tools in terms of a number of metrics. In their research, they evaluated the testing tools using these parameters by conducting an experiment for choosing the appropriate testing tool. This research mainly deals with the performance attribute.

R. Manjula et al. [39] mainly concentrated on the reliability attribute. They explained the need for reliable web applications and proposed a reliability-based approach to evaluate web applications. As part of this research, they also mentioned some of the reliability parameters used for building the approach.

2.5.2 Literature related to tools

P. Lazarevski et al. [12] conducted a case study to evaluate the performance testing tool Grinder. They also explained some other tools, mainly related to performance testing and performance monitoring. The research is limited to AJAX-based web applications, and tools relevant to other types of web applications are not considered.

Wang and Du [40] proposed a framework integrating the functionalities of tools like JMeter and Selenium. The framework mainly concentrates on addressing different types of testing such as UI testing, backend testing and load testing. They also mentioned some tools related to performance testing, but did not address the other quality attributes.

Hamed and Kafri [23] compared web applications built with two different technologies, Java and .NET, by using performance testing tools and performance metrics. They evaluated both technologies in terms of response time and throughput, and found that Java web applications perform better than .NET web applications.

Arora and Bali [41] conducted research on the automated tools available for performance testing. As part of the research, they conducted a literature survey and identified 18 different automated tools for performance testing of web applications.

Garousi and Mesbah [42] conducted a mapping study to identify the tools available for testing web applications. Dogan et al. [35] also focused on identifying tools for testing web applications by conducting a systematic literature review. They identified a few tools, along with information on each tool's availability, and mainly concentrated on the performance attribute.

Arora and Sinha [20] stated the need for testing web applications and also focused on two different techniques, state-based and invariant-based testing. They mainly focused on web testing and provided information related to tools for both functional and non-functional attributes of web applications, but this information is not clear.

Rina and Tyagi [38] compared some of the performance testing tools in terms of several metrics. In their research, they evaluated the testing tools by conducting an experiment in order to select the most suitable tool for testing.

2.5.3 Literature related to challenges

P. Lazarevski et al. [12] explained some tools mainly related to performance testing and performance monitoring. Along with these tools, they also addressed the drawbacks and limitations of the selected tools. This research mainly addressed issues related to the performance attribute.

Iyer et al. [13] explained the process of conducting web testing and the issues in this process when dealing with quality attributes. They focused mainly on the quality attributes performance, scalability and reliability, limited their research to finding issues related to testing methods, and explained the need for further research on PSR.

Junzan Zhou et al. [43] explained testing methods and some of the traditional testing tools available for performance testing. They also mentioned some challenges related to performance testing tools.

Arora and Sinha [41] stated the need for testing web applications and discussed some of the available tools and methods. They mainly focused on web testing and presented some challenges related to the functionality of web applications.

From the above sections, literature covering the PSR attributes as a whole is provided by only one article, which focuses on issues in testing methods [13], whereas the remaining articles discuss tools, metrics and challenges separately. From the literature, we noticed that only a few authors concentrated specifically on the tools, metrics and challenges of PSR. Research related to PSR testing of web applications is new to the field of software engineering, as very few studies exist. We came across one such article [13], which mainly focuses on the issues of testing methods for PSR attributes. To the best of our knowledge, no research so far deals with the tools, metrics and challenges of the PSR attributes together.

2.5.4 Research gap

The quality attributes play a key role in web testing. In order to deploy a web application on the server, a testing process is conducted in which a set of metrics is used to validate the web application. The metrics used for testing are not fixed for web applications; they vary from organization to organization, i.e. from small to large. Organizations face problems with the selection of metrics for testing due to constraints such as resources, cost, and time, and there is little research on how software testers in organizations select metrics [32]. Likewise, the general issues and challenges (related to tools, development, metrics, and time) faced by software testers while testing the PSR attributes of web applications are not well known. Some of the challenges related to tools stem from their existing drawbacks; these drawbacks are not explicitly documented in the literature, the new features testers need when using tools to test the quality attributes are not clearly known, and the existing testing tools [44] that support the quality attributes need to be identified. Hence, there is a need for further research [13, 41, 14]. All these identified issues represent a gap in the research on web testing of quality attributes (PSR) and provide the motivation for our research. We mainly focus on identifying the challenges (related to tools, development, metrics, and time) faced by software testers while testing the PSR attributes of web applications, as well as finding the tools and metrics used by software testers for testing the PSR attributes [34].


Chapter 3

Method

This chapter focuses on the purpose of the research and the process carried out in achieving it. It consists of five sections, structured as follows:

• Section 3.1 describes the purpose of the research

• Section 3.2 provides the research questions selected for addressing the aim

• Section 3.3 focuses on the research method selected for answering the research questions and the techniques used for collecting the data

• Section 3.4 explains the method used for analyzing the collected data

• Section 3.5 discusses the validity threats.

3.1 Research purpose

The purpose of this research is to identify the challenges faced by software testers while testing the performance, scalability and reliability attributes of web applications, and also to identify the tools and metrics available for testing the PSR attributes of web applications.

3.1.1 Objectives

For achieving the purpose of this research, six objectives were identified. Theyare as follows:

• O1: To identify the common metrics that need to be considered by software testers in general while testing the PSR attributes of web applications.

• O2: To identify a list of tools available to software testers for testing the PSR attributes, and also to find the tools used by software testers in practice.


• O3: To identify drawbacks in the tools used by software testers and also the improvements suggested by them.

• O4: To identify the list of challenges faced and the mitigations used by software testers while testing the PSR attributes of web applications.

• O5: To analyze whether the identified mitigations are useful for software testers to address the challenges faced while testing the PSR attributes in practice.

• O6: To identify the most important attribute for software testers among the three selected attributes (PSR).

3.2 Research questions

In order to achieve the goals of the research, we have framed the following research questions.

• RQ1: What metrics exist for testing PSR attributes of web applications?

• RQ1.1: What metrics are suggested in the literature for testing PSR attributes?

• RQ1.2: What metrics are used by software testers in practice for testing PSR attributes?

• RQ1.3: Why are particular metrics used or not used by software testers?

• RQ2: What tools exist for testing PSR attributes of web applications, and what drawbacks have been observed in these tools in practice?

• RQ2.1: What tools are suggested in the literature for testing PSR attributes?

• RQ2.2: What tools are used by software testers in practice for testing PSR attributes?

• RQ2.3: What are the drawbacks of the tools used by software testers in practice, and what improvements do they suggest?

• RQ3: What challenges do software testers face while testing PSR attributes of web applications, and what mitigation strategies are available in the literature?

• RQ3.1: What challenges do software testers face, and what mitigation strategies are available in the literature for testing PSR attributes?


• RQ3.2: What challenges do software testers face in practice while testing PSR attributes?

• RQ3.3: Can the existing measures from the literature solve the challenges faced by software testers in practice?

• RQ4: Which attribute among PSR is considered the most important by software testers in practice?

3.2.1 Motivation

The main motivation for formulating the research questions is as follows [34]:

• RQ1: The main reason for framing this RQ is to identify the metrics available for testing PSR attributes. Kanij et al. [32] discussed a few metrics and the need to consider other metrics while testing, but only little research has been carried out in this area. This research question helps to identify the metrics that are used in practice and reported in the literature for PSR.

• RQ2: There are many tools available for testing web applications, but a clear description of the availability, type of tool, language support, metrics, and supported platform for each tool is not provided in the literature for PSR attributes. Drawbacks in tools are identified through the experience of using them, so it is also possible to collect new information regarding drawbacks that is not available in the literature. Through this RQ, we provide a clear description of the tools and their characteristics, which is helpful for software testers or practitioners when selecting tools, and we also report the drawbacks observed by software testers.

• RQ3: Software testers may face challenges while testing web applications. In our study, the testing process at Ericsson consumes considerable time. Hence, this RQ helps in identifying the challenges faced by software testers while testing and aims to provide mitigations based on the literature.

• RQ4: The most important attribute among PSR is identified through this research question. This helps software testers when there is a need to deliver the product early and little time is available for testing: the most important attribute can be tested first, and the other attributes can be tested later based on the available resources and time.

3.3 Research method

This section focuses on the methodology used for carrying out the research. In the software engineering discipline, there are mainly four research methods available [45]:


• Experiment

• Survey

• Case study

• Action research

Each research method has benefits and liabilities. The research method was selected based on its suitability and flexibility. The motivation for this selection and for the rejection of the other methods is explained below.

Experiment: An experiment can be used to identify the cause-and-effect relationship between selected variables. These variables are of two types, dependent and independent; the effect on the dependent variables can be identified by changing the independent variables in a controlled experiment. Experiments are generally conducted in a specific or controlled environment, so repeating the same effect is difficult, i.e., creating a similar environment is very hard and the results cannot easily be generalized. As our research questions are not geared towards identifying cause-effect relationships, an experiment is not an appropriate method for our study.

Survey: A survey is a method to obtain generalized data from many respondents, possibly globally. The data can be collected from a sample drawn from a large population. A survey is not an appropriate research method for our study because of the time schedule, as it takes a long time to collect responses from various respondents, and in our case there is an opportunity to conduct the study in a real environment.

Case study: A case study is a method to better understand a phenomenon and explore the research area; it is conducted in real-world settings and therefore offers a high degree of realism [45]. As our research focuses on identifying the challenges faced by software testers while testing the non-functional attributes (PSR), the data should be collected from subjects who have experienced these challenges while testing. Since we have access to the resources and the method suits our research questions, we opted for a case study as our research method.

Action research: Action research is a method to investigate and improve a process in the research area. It is a type of case study in which the researcher investigates and changes the process, whereas a case study is purely observational [45]. As our research is focused on exploration of the topic, we did not opt for action research.

3.3.1 Systematic mapping study

A systematic mapping study (SMS) is a secondary research method which provides an overview of a topic and shows the frequency of publications in the research area. In this research, we selected a systematic mapping study to gather data regarding the available tools, metrics, and challenges related to testing PSR attributes in web applications. The study also helped us design prompts for the interviews, i.e., it helped us stay on track during the interviews by maintaining bullet points that are important for the research and the main data to be collected in terms of the research goal. An overview of the SMS is provided in Appendix B.

The systematic mapping study is conducted by following the guidelines provided by Petersen et al. [46]. The steps carried out in the systematic mapping study are shown in figure 3.1, and a description of each step is provided in the next subsections.

Figure 3.1: Systematic mapping study process

3.3.1.1 Design of systematic mapping study protocol

A protocol is designed in order to conduct the systematic mapping study. This protocol consists of the following sections.


3.3.1.1.1 Selection of keywords  To form a search string, we first need to identify the keywords to be included. We extracted keywords from the research questions, and their synonyms were also considered. The identified keywords, based on the RQs, are shown in table 3.1.

Table 3.1: Keywords used for search string formulation

Category Id | Category | Keywords | Research Questions
C1 | Web application | web application, web applications, website, websites, web site, web sites, webapplication, webapplications | RQ1, RQ2, RQ3
C2 | Quality attributes | performance, scalability, reliability, reliable, scalable | RQ1, RQ2, RQ3
C3 | Testing | testing, verify, verification, validate, validation | RQ1, RQ2, RQ3
C4 | Tool | tool, tools, framework | RQ2
C5 | Metric | metric, metrics, measures, evaluate, evaluation | RQ1
C6 | Challenges | challenges, challenge, mitigations, strategy, strategies | RQ3

3.3.1.1.2 Formation of search strings

3.3.1.1.2.1 Boolean operators for search  For searching the literature, two kinds of operators are used: AND and OR. The operator AND connects keywords that must all appear in the literature search, i.e., the obtained results should contain all of the mentioned keywords. The operator OR requires at least one of the connected words to appear, but not necessarily every word, i.e., the results obtained from the search contain at least one of the keywords referred to in the search. These operators act as connectors between the formed keyword sets.

3.3.1.1.2.2 Formation of string  We combine the Boolean operators and keyword sets to form the search strings, for example (set1) AND (set2). A formed search string is:

• ((Web testing) OR (Web application testing) OR (Website testing)) AND ((Assessment) OR (Assess) OR (Evaluate) OR (Evaluating) OR (Measuring) OR (Measure) OR (Metrics) OR (Metric) OR (Web metrics)) AND ((Quality) OR (Attributes) OR (Performance) OR (Scalability) OR (Scalable) OR (Reliable) OR (Reliability)) AND ((Web applications) OR (Website) OR (Web application) OR (Webpage)).
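As an aside, the mechanical part of this step can be automated. The following Python sketch is our own illustration, not part of the original study protocol; the keyword lists are abbreviated examples rather than the full sets from table 3.1. It joins each keyword set with OR and connects the sets with AND, producing a string of the form shown above:

    # Illustrative sketch: build a Boolean search string from keyword sets.
    # The keyword sets below are abbreviated examples, not the full lists of table 3.1.
    keyword_sets = [
        ["web application", "web applications", "website", "web site"],
        ["performance", "scalability", "reliability"],
        ["testing", "verify", "verification", "validate", "validation"],
        ["tool", "tools", "framework"],
    ]

    def or_group(keywords):
        # Join the synonyms of one category with OR; quote multi-word terms.
        terms = [f'"{kw}"' if " " in kw else kw for kw in keywords]
        return "(" + " OR ".join(terms) + ")"

    # Categories are connected with AND so that every category must occur in a hit.
    search_string = " AND ".join(or_group(group) for group in keyword_sets)
    print(search_string)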


3.3.1.1.3 Selection of scientific library databases  Based on popularity and relevance, the following databases were selected for searching the articles:

• INSPEC

• IEEE

• SCOPUS

• ACM

• WILEY

3.3.1.1.4 Study selection criteria  The criteria for selecting and rejecting the literature are provided in this section.

3.3.1.1.4.1 Inclusion criteria  The inclusion criteria consist of the factors we considered for accepting or selecting the literature.

• Literature regarding web application testing only.

• Literature which is peer-reviewed.

• Literature which is in English only.

• Literature published from 2000 to 2016.

• Literature which addresses the PSR attributes in combination, i.e., along with other attributes.

• Literature which focuses on an individual attribute (i.e., only one of the PSR attributes) is also considered.

3.3.1.1.4.2 Exclusion criteria  The exclusion criteria consist of the factors we considered to reject the literature.

• Literature whose abstract is unrelated to the topic.

• Literature which cannot be accessed or is unavailable.


3.3.1.1.5 Formulation of search strings  Based on the keywords provided in section 3.3.1.1.1, search strings were formed for each database; they are provided in table 3.2.

Table 3.2: Search strings used for selection of literature

Database | Category | Search string
INSPEC | Tools | ("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (tool* OR framework)
INSPEC | Challenges | ("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (challenge* OR mitigations OR strategy OR strategies)
INSPEC | Metrics | ("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (metric* OR measures OR evaluate OR evaluation)
IEEE | Tools | ("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (tool* OR framework)
IEEE | Challenges | (("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (challenge* OR mitigations OR strategy OR strategies))
IEEE | Metrics | ("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (metric* OR measures OR evaluate OR evaluation)
SCOPUS | Tools | TITLE-ABS-KEY(("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (tool* OR framework))
SCOPUS | Challenges | TITLE-ABS-KEY(("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (challenge* OR mitigations OR strategy OR strategies))
SCOPUS | Metrics | TITLE-ABS-KEY(("web application" OR "web applications" OR "web sites" OR "web site" OR website* OR webapplication*) AND (performance OR reliab* OR scalab*) AND (testing OR verify OR verification OR validation OR validate) AND (metric* OR measures OR evaluate OR evaluation))
ACM | Tools | (+("web site" "website" "web application" "webapplication") +(performance reliability scalability scalable reliable) +(testing verify verification validation validate) +(tool framework))
ACM | Challenges | (+("web site" "website" "web application" "webapplication") +(performance reliability scalability "quality attribute" "quality requirement") +(testing verify verification validation validate) +(challenge strategy mitigation))
ACM | Metrics | (+("web site" "website" "web application" "webapplication") +(performance reliability scalability "quality attribute" "quality requirement") +(testing verify verification validation validate) +(metric measure evaluate evaluation))
WILEY | Tools | "web applications" OR "web application" OR "web site" OR "web sites" OR website* OR webapplication* in Abstract AND performance OR scalab* OR reliab* in Abstract AND testing OR verification OR verify OR validation OR validate in Abstract AND tool* OR framework in Abstract
WILEY | Challenges | "web applications" OR "web application" OR "web site" OR "web sites" OR website* OR webapplication* in Abstract AND performance OR scalab* OR reliab* in Abstract AND testing OR verification OR verify OR validation OR validate in Abstract AND challenge* OR mitigations OR strategy OR strategies in Abstract
WILEY | Metrics | "web applications" OR "web application" OR "web site" OR "web sites" OR website* OR webapplication* in Abstract AND performance OR scalab* OR reliab* in Abstract AND testing OR verification OR verify OR validation OR validate in Abstract AND metric* OR measures OR evaluate OR evaluation in Abstract

3.3.1.1.6 Execution of search strings and applying exclusion criteria  Initially, the designed search strings are executed, and the exclusion criteria are applied to the results obtained from the databases. The exclusion criteria mainly concern the year, document type, content type, language, and subject area. The search results obtained before and after applying the exclusion criteria are shown in table 3.3.

3.3.1.1.6.1 Initial article selection and exclusion of duplicate articles  The relevant literature is selected by reading the abstract: if the abstract is relevant to the research, the article is selected; otherwise it is rejected. The initial search results and the articles remaining after removal of duplicate articles are shown in table 3.4 and table 3.5.


Table 3.3: Search results before and after applying exclusion criteria

Database | Category | Search results after execution | After applying exclusion criteria
INSPEC | Tools | 377 | 352
INSPEC | Challenges | 154 | 147
INSPEC | Metrics | 407 | 380
IEEE | Tools | 377 | 368
IEEE | Challenges | 162 | 155
IEEE | Metrics | 583 | 564
SCOPUS | Tools | 801 | 397
SCOPUS | Challenges | 368 | 165
SCOPUS | Metrics | 869 | 412
ACM | Tools | 521 | 481
ACM | Challenges | 481 | 447
ACM | Metrics | 446 | 407
WILEY | Tools | 80 | 75
WILEY | Challenges | 50 | 41
WILEY | Metrics | 126 | 108

Table 3.4: Initial search selection results

Database | Category | Total search results | Selected articles
INSPEC | Tools | 352 | 98
INSPEC | Challenges | 147 | 42
INSPEC | Metrics | 380 | 66
IEEE | Tools | 368 | 76
IEEE | Challenges | 155 | 23
IEEE | Metrics | 564 | 52
SCOPUS | Tools | 397 | 74
SCOPUS | Challenges | 165 | 27
SCOPUS | Metrics | 412 | 47
ACM | Tools | 481 | 8
ACM | Challenges | 447 | 14
ACM | Metrics | 407 | 1
WILEY | Tools | 75 | 2
WILEY | Challenges | 41 | 3
WILEY | Metrics | 108 | 6


Table 3.5: Search results after removing duplicate articles in each database

Database | Total number of articles | Repeated articles | Remaining articles
INSPEC | 206 | 36 | 170
IEEE | 151 | 24 | 127
SCOPUS | 148 | 28 | 120
ACM | 23 | 0 | 23
WILEY | 11 | 0 | 11
Total | 539 | 88 | 451

A total of 451 articles are obtained from all the databases. These 451 articles still include duplicates across databases; for example, articles obtained from the INSPEC database overlap to some extent with articles obtained from the IEEE database. After merging the articles obtained from the five databases, a total of 134 duplicate articles are identified. After removing all duplicate articles, a total of 317 articles remain.

3.3.1.1.7 Selected literature for systematic mapping study  After studying the introduction and conclusion of all 317 articles, we selected 97 articles which are relevant to our study. The articles were selected by both authors with cross-verification in order to avoid missing relevant articles. Verification was done during the article selection process, i.e., at the time of screening and search string execution. At the time of search string execution, one author executed the framed search strings while the second author rechecked and re-executed them to verify whether the obtained results were similar in both cases. During screening, both authors discussed whether the articles were relevant to the study.

3.3.1.1.8 Study and assess the articles  The selected articles are studied thoroughly in order to identify the tools, metrics, and challenges for the PSR attributes. Based on this specification, the articles are assessed and the important data is highlighted while reading.

3.3.1.1.9 Data extraction strategy  For data extraction, we use the template shown in table 3.6. The template consists of data fields, each with a data key and value pair. An overview of the SMS is provided in Appendix B.


Table 3.6: Data extraction form

Data key | Value | Research question

General
Study ID | Integer |
Article title | Name of article |
Year of publication | Article published date |
Author name | Name of the authors |
Publication venue | Domain in which article published |
Research method | Method used by authors in the study |

Process
Metrics | Which metrics are mentioned in the article? Which metrics are used for evaluation in the article? Which metrics are described in the article? | RQ 1
Attributes | Which attributes are mentioned? Which attributes are described in the article? | RQ 1, RQ 2, RQ 3
Tools | Which tools are mentioned in the article? Which tools are used and evaluated in the article? Which tools are given a brief description? | RQ 2
Challenges | What challenges are mentioned or described in the article? Which challenges does the article focus on most? | RQ 3

The data is extracted by both authors for each article separately, and the extracted data is cross-checked between the authors to ensure the accuracy of the data extraction. For most articles, the data extracted by the two authors overlapped to a large extent; where the extracted data differed, the reasons for the difference were discussed until both authors reached a conclusion. The data is extracted from our selected primary studies. The mapping of research parameters, research attributes, and research methods in the SMS is provided in Appendix A.

3.3.2 Case study

The methodology we followed for conducting the research is explained in this section. In order to carry out the case study, we followed the guidelines provided by Runeson and Höst [45]. There are mainly five steps involved in the case study, and they are represented in figure 3.2.

Figure 3.2: Case study process steps

3.3.2.1 Case study design

A case study involves two important concepts: the case and the unit of analysis. In our research, the selected case is a typical large-scale company, Ericsson, India, and the units of analysis are mainly the software testers in projects working on web applications. The reasons for selecting this case are its availability to the researchers and its suitability to the research. The case can also be replicated to add validity to the research.

3.3.2.2 Protocol preparation

In order to collect data, a protocol was prepared prior to data collection. The protocol mainly consists of two sections: data collection and data analysis. The data collection section deals with the methods used to collect data, and the analysis section deals with the method used for analyzing the collected data. These two sections are explained in sections 3.3.2.3 and 3.4.


3.3.2.3 Data collection

Data collection for a case study can mainly be done at three levels [45]: first degree, second degree, and third degree.

• First degree: The researcher collects the required data from the subjects directly by interacting with them.

• Second degree: The researcher collects the raw data from the subjects without interacting with them; the raw data is collected through monitoring and observing the subjects.

• Third degree: The researcher collects the information from previously available artifacts.

In our research, we collected data at two levels as part of data triangulation, which helps to validate the data obtained from the two sources. The two levels are first degree and third degree: for the first degree we opted for interviews, and for the third degree we opted for analysis of project documentation, i.e., test reports.

3.3.2.3.1 Interviews  In order to conduct interviews, the first step is to select the appropriate type of interview to be carried out. We selected semi-structured interviews for our research, as they allow a dialogue through which a discussion can take place during the interview. The population for the interviews was selected based on convenience sampling, which provides a way to select subjects based on their availability and convenience to the researchers [47]. After the type of interview was fixed, we designed a questionnaire consisting of two types of questions. The first type is demographic questions, which are as follows:

• Age

• Qualification

• Experience

• Number of projects worked

• Testing experience

The second type is technical questions, which relate to the research questions. The technical questions are divided into four parts:

• The general questions related to web testing and quality attributes

• The questions related to tools and the subjects' experience of using the tools


• The questions related to metrics and the subjects' experience of measuring the metrics

• The questions related to challenges in terms of tools, time, metrics, and development

A consent form was prepared in order to ensure confidentiality and anonymity for the interviewees. Before questioning the subjects, the consent form is given to them to read, and the interview starts only if they agree to the terms. Two copies of the consent form are maintained for each interview: one for the subject and one for the interviewer. The consent form contains the contact details of the researchers and the supervisor, so that the subject can contact them if they have anything to share. The consent form is provided in Appendix H.

In our research, we mainly selected software testers as interviewees, as the research is related to web testing. A list of testers was collected, and they were notified by e-mail to schedule the interviews.

For the interviews, a beginning script and an ending script were prepared. Each interview is conducted by introducing the research topic and the case study, then asking the basic demographic questions, and then moving to the technical questions. Following the suggestions by Runeson and Höst [45], we used a pyramid model for the interview session, in which the session begins with specific questions (the demographic questions) followed by open questions (the technical questions), as shown in figure 3.3; the questions are provided in Appendix F.

Figure 3.3: Pyramid model for interview questions

Each interview is audio recorded and conducted in a private room where only two persons, the researcher and the subject, are present. After the interview, the audio recording is listened to again and transcribed into notes for analysis. The transcribed notes are provided to the respective subject for review, so that they can inform us of any mistakes; this strengthens the validity of the recorded data. A total of 12 interviewees are selected in this case, of which eight are from the case company and four are from three other organizations. All three organizations are software companies in India, the interviewees are mainly experienced in web testing, and the number of interviewees from each company is provided in table 3.8. The details of the 12 interviewees are given in table 3.7; interviewees one to eight are from the case company and the remaining four (nine to 12) are from the other organizations.

Table 3.7: Details of interviewees

Interviewee ID | Qualification | Role | Number of projects | Experience in testing (years) | Interview duration (minutes)
1 | B.Tech in CSE | Test quality architect | 8 | 5.9 | 34
2 | BE in ECE | Senior solution integrator | 5 | 8.5 | 40
3 | MS in software systems | Verification specialist | 3 | 10.5 | 38
4 | B.Tech | Senior QA | 10 | 6+ | 34
5 | B.Tech | Senior solution integrator | 8 | 7.5 | 36
6 | MCA | Senior solution integrator | 2 | 5.8 | 50
7 | B.Tech in CSE | Verification specialist | 6 | 10 | 43
8 | B.Tech in CSE | Senior QA | 3 | 7 | 32
9 | B.Sc in computer science | Tester specialist | 6 | 8 | 45
10 | MCA | Senior tester | 5 | 7.4 | 35
11 | BE in mechanical | Tester specialist | 6 | 9+ | 36
12 | B.Tech | Software tester | 2 | 3 | 38


Table 3.8: Overview of selected companies

Company | Domain | Number of interviewees
Case company | Telecom | 8
Company 1 | E-commerce | 2
Company 2 | E-commerce | 1
Company 3 | Retail | 1

3.3.2.3.2 Documentation  Along with the interviews, we selected documentation as another data source for collecting information useful for our study. The available documents of previous projects from the case company were collected based on convenience sampling [47], which gives the flexibility to collect the documents that are available to the researchers. The collected documents contain information related to the non-functional testing process carried out in previous projects of the case company. It is not possible to obtain all information from the interviews; there is a chance of missing information during the interview process, i.e., the interviewee might not provide all the information or might not remember it. A total of 18 documents from previous projects were collected for data triangulation; they are also used to identify information that was not addressed in the interviews and to validate the results obtained from the interviews. The documents mainly consist of test reports addressing scalability and performance testing, and they help in identifying what work has actually been carried out in the company.

The selected data collection techniques are used to answer the research questions and to fulfill the objectives, as presented in table 3.9. In table 3.9, 'X' marks the data collection technique used.

Table 3.9: Research questions and their respective data collection techniques

Research question | Sub research question | SMS | Interviews | Documents | Objective fulfilled
RQ 1 | RQ 1.1 | X | | | O1
RQ 1 | RQ 1.2 | | X | X | O1
RQ 1 | RQ 1.3 | | X | | O1
RQ 2 | RQ 2.1 | X | | | O2
RQ 2 | RQ 2.2 | | X | X | O2
RQ 2 | RQ 2.3 | | X | | O3
RQ 3 | RQ 3.1 | X | | | O4
RQ 3 | RQ 3.2 | | X | X | O4
RQ 3 | RQ 3.3 | X | X | | O5
RQ 4 | - | | X | | O6


3.4 Data analysis

The data collected from the systematic mapping study and the interviews are analyzed using thematic analysis. Thematic analysis is a method by which useful data can be identified and analyzed, and themes can be observed and reported. Thematic analysis was selected as it is mainly used for reporting the reality, meanings, and experience of participants [48].

There are other qualitative data analysis techniques, such as content analysis and grounded theory, but we selected thematic analysis because it identifies important data from a large data corpus and provides a way to analyze data collected at different times and in different situations, so data analysis can be improved through thematic analysis [49].

According to Braun and Clarke [49], there are six phases in thematic analysis, as represented in figure 3.4; the way we approached these six phases is described next.

Figure 3.4: Steps for thematic analysis

3.4.1 Familiarizing yourself with the data

3.4.1.1 Systematic mapping study

In this phase, the data extracted while reading the literature is given a further reading. The highlighted data is cross-studied by both authors to get an initial idea.


3.4.1.2 Interview

In this phase, the recorded interviews are listened to first, and we transcribed the data into documents by playing the recorded audio at a very low speed. During the transcription process, the field notes are also checked to verify that the data is transcribed accurately. After transcribing the data, the authors thoroughly read the extracted data to get an initial idea. The transcribed documents are imported into the NVivo tool, a qualitative analysis tool in which we can read and analyze the data more easily. A repeated reading is done before generating the initial codes.

3.4.2 Generating initial codes

3.4.2.1 Systematic mapping study

In this phase, the data collected from the literature is imported into the NVivo tool. This tool is used for analyzing, coding, and visualizing the data; it is a qualitative analysis tool mainly used for data analysis, as it presents the data in a simple and organized way. Open coding and closed coding are used for categorizing the data.

Open coding focuses on identifying codes during and after the analysis; these codes are formed based on ideas that evolve during the analysis process. In closed coding, a set of codes is formed before the analysis; these codes are based on the framed research questions and the aim of the research.

For example, in open coding we coded different categories of tools from the data collected in the interviews. These categories are based on the collected data and were not defined before data collection: commercial, open source, internal, framework, freeware or trial, and monitoring. So open coding helps to identify codes from the data. In closed coding, for example, the main aim of RQ 1.1 is to identify the metrics that are available in the literature; based on this research question we identified three codes, performance, scalability, and reliability, before the analysis.

The categorization basically depends on the research questions: for RQ 1.1, RQ 2.1, RQ 3.1, and RQ 3.3 we used the literature, so the classification of codes initially depends on these RQs. During the coding process, a set of new codes is also obtained from the literature that differs from the initial codes. The obtained codes are used in the implementation of the data extraction strategy described in section 3.3.1.1.9.

3.4.2.2 Interview

As said above, the same tool is used to analyze the data obtained from the interviews. Initially, the transcribed documents are imported into the tool and the data is coded based on RQ 1.2, RQ 1.3, RQ 2.2, RQ 2.3, RQ 3.2, RQ 3.3, and RQ 4. The codes are formed based on these RQs, and from reading the transcriptions we also created other codes that were not initially formed. A total of 12 transcriptions are used for analyzing the data.

3.4.3 Searching for themes

3.4.3.1 Systematic mapping study

After the initial codes are defined, a set of themes is obtained. The obtained themes are further divided into subthemes, which simplifies the process of classification. Data that does not relate to the existing themes is formed into new themes.

3.4.3.2 Interview

After the documents are imported into the tool, an initial node is created whenever a sentence or paragraph is found to be important while reading. If data relevant to a created node is found during later reading, it is coded into that node. The tool provides a feature to code a selected sentence or paragraph, and we used this feature to include the coded data in a node. In this way, all the interviews are analyzed and arranged into codes.

As the above step is an initial generation of codes, the nodes are rearranged into useful themes after thorough analysis of the collected data. Extra information collected from the interviews is also given its own themes. The themes are classified according to the four research questions, and each theme is further divided into sub-themes; these sub-themes help in classifying the information at a lower level. The themes are organized with the tool, which makes the process easy to conduct.

3.4.4 Reviewing themes

3.4.4.1 Systematic mapping study and interview

The themes obtained in the previous step are reviewed by the authors. We compared the coded data with the data corpus to check whether the formed themes are relevant to the topic. The formed themes are rechecked to make sure that all the coded data is available under relevant themes. Based on this, some themes are added, and some are removed or merged into one theme when data is found that is not relevant to a theme. Later, the themes are used for generating the thematic maps.


3.4.5 Defining and naming themes

3.4.5.1 Systematic mapping study and interview

Choosing an appropriate name for a theme is a challenging task, as the name of the theme itself should explain the content it covers. We initially tried several candidate names for each theme and settled on one name once we were satisfied with it, using plain names that make it easy to see what each theme contains. We used brainstorming between the authors, applied several times, until we found a proper name for each theme; all the theme names are provided in the results, section 4.

3.4.6 Producing the report

3.4.6.1 Systematic mapping study and interview

The analyzed data and the results derived from it need to be reported properly, to present the results accurately without including unnecessary data. The results are presented in an easy-to-follow way, organized by RQs and methods. The documented results are provided in section 4.

The basic thematic structure for the interviews is shown in figure 3.5. This structure is obtained from the NVivo tool.

3.5 Validity threats

The results of the research should be unbiased and trustworthy; only then can the research be said to have been conducted in an ethical way [45]. In order to establish confidence in the conclusions of the research, validity threats need to be addressed and mitigated. Runeson and Höst [45] discussed four types of threats which need to be addressed:

3.5.1 Construct validity

Construct validity concerns whether what the researchers have in mind is actually what is investigated, in relation to the research questions [45].

The protocol prepared for the SMS was sent to the supervisor and implemented after approval; the search strings were reviewed four times. The data was analyzed by the researchers, and for validation the analyzed data was cross-checked by the two researchers involved in this study. A limitation is that some relevant articles could not be included in the selection because they were not available for download.

The interview protocol was prepared, and a mock interview was conducted prior to the main interviews to validate the questionnaire. Based on the feedback from the mock interview, some modifications were made. The questionnaire was sent to the supervisor, and after approval the interviews were conducted. To obtain accurate data, the subjects were given a consent form prior to the interview, which ensures the anonymity and confidentiality of the subject. We prepared an interview script containing basic definitions of the PSR attributes, so that the interviewee would not be confused by the terms used in the questionnaire; this ensures that the interviewer and interviewee are on the same track. To reduce errors in data analysis, a tool was used for analyzing the data and interpreting the relations in the data corpus.

3.5.2 Internal validity

Internal validity concerns unknown factors affecting the studied factors: a third factor may cause changes without the researcher being aware that the changes are caused by it.

Improper selection of literature can be a threat, so in our research the selection of literature was done by thoroughly reading the abstracts and later the main sections of the articles. The paper selection was done by both authors, so the chance of missing relevant literature is low. Both authors performed the selection process individually and cross-checked with each other; articles on which the authors disagreed were discussed and selected only if both authors were satisfied. The selection of databases is a limitation of the study: it is not possible to search all technical databases, so only databases that are popular and known to the researchers were considered.

The selection of interviewees may affect the collected data, so subjects with previous experience in web testing were selected based on convenience sampling, and the selected respondents have at least three years of experience; the chance of questions being misinterpreted is therefore very low. Since our study concerns the testing of three attributes (PSR), there is a chance of confusing one attribute with another, so proper care was taken to ensure that the interviewee was considering only PSR and no other attributes.

We also used another data source, documentation. Documentation is a third degree method, which helps validate the data collected from the interviews.

3.5.3 External validity

External validity is related to the generalization of the results. According to Runeson and Höst [45], external validity concerns “to what extent it is possible to generalize the findings, and to what extent the findings are of interest to other people outside the investigated case.”


The SMS was conducted by preparing a protocol, and the search strings were formed prior to the literature search. These search strings are consistent across all databases, and the selection of literature followed the inclusion and exclusion criteria. To the best of our knowledge, we selected the relevant literature for the study, but there may be a chance of missing important literature due to unavailability, which is a limitation of our study. We can say that the results from the SMS can be generalized, as we covered all the relevant literature excluding the unavailable articles. The results may also be of interest because a systematic mapping study mainly reveals the state of the art in the research area; further researchers can identify the areas where very little research has been done and focus on them.

The interviews were conducted with subjects in different roles, and additional interviews were conducted in other organizations in order to validate the results beyond the studied company. As we conducted only a small number of interviews with other organizations, we consider these interviews a sanity check of whether the results from the case company are typical, so the results of this study are only partially generalizable. Even though a case study is a qualitative process, replication of the results is quite difficult, as the same environment and situations are hard to reproduce. However, according to Runeson and Höst [45], a qualitative study is more about exploring the area and is less concerned with replication of findings. The details of the interviewees are given in tables 3.7 and 3.8.

3.5.4 Reliability

Reliability is related to the extent to which the research depends on the researchers' perspective. For research to be reliable, it should not be affected by the researchers' knowledge, and the data should be analyzed as it was obtained.

The systematic mapping study was conducted according to the initially prepared protocol, so the researchers did not include any new details that were not in the protocol, and the authors' perspective therefore does not affect the results of the systematic mapping study. Even so, there may be a chance of missing some data, which is a limitation of our research.

The data collected from the interviews was in the form of audio recordings, and from these recordings we transcribed the data into documents. The transcribed documents were sent to the interviewees to verify whether there were any errors or mistakes in the transcription; almost all interviewees replied with minor to no errors, so the data was not misinterpreted. As said above, a mock interview was conducted prior to the interviews to validate the questionnaire, and the data analysis method was applied by studying the guidelines provided by Braun and Clarke [49]. The results derived from this research can therefore be replicated, which ensures reliability in the research.


Figure 3.5: Themes formed in Nvivo tool for interviews


Chapter 4
Results and Analysis

This chapter presents the results obtained from the systematic mapping study and the case study. The collected data is analyzed using thematic analysis, following the guidelines provided by Braun and Clarke [49]. The data required for our research was gathered from three sources.

• Systematic mapping study: For the systematic mapping study, a total of 97 articles are selected, of which 85 articles focus on performance, 25 on scalability, and 27 on reliability. Based on our research questions, three facets are formed in the systematic mapping study: metrics, tools, and challenges.

• Interviews: A total of 12 interviewees are selected, eight of them from Ericsson and the remaining four from the other three organizations; the latter are included in order to validate the results collected from Ericsson. Out of these 12 interviewees, six addressed reliability, ten addressed scalability, and all 12 addressed the performance attribute.

• Documents: Documents available from previous projects are also collected. The documents are collected only from the case organization, as we were unable to retrieve documents from other organizations due to confidentiality and insufficient contacts. From the case organization, a total of 18 documents from previous projects were collected. All the collected documents focus only on the performance and scalability attributes; we were unable to collect documents that focus on reliability, as they were not available to us due to confidentiality. Out of these 18 documents, 14 concentrate on performance and 6 on scalability.

The data collected from the three sources, i.e., SMS, interviews, and documents, is presented in sections 4.1, 4.2, and 4.3. The number of sources addressing the research attributes is shown in figure 4.1.


Figure 4.1: Number of sources addressing the research attributes

4.1 Facet 1: Metrics for testing PSR attributes

In this section, we present the results for our first research question, using the data collected from the systematic mapping study, the interviews, and the documents. The section is structured into three subsections: the first presents the metrics obtained from the systematic mapping study, answering RQ1.1; the second presents the metrics obtained from the interviews and documents, answering RQ1.2; and the third presents the criteria for selecting metrics, answering RQ1.3.

Generally, metrics are the measures used for measuring attributes of software entities [50]. For this study, the metrics related to PSR are collected: a total of 69 metrics are identified from the SMS, 30 from the interviews, and 16 from the documents. In total, 115 metrics are identified from all the sources, and a description of each metric is provided as a list in Appendix C.

4.1.1 Systematic mapping study

This section addresses the data collected from the systematic mapping study. Out of 97 articles, 80 deal with metrics related to PSR. The metrics are classified into themes, and based on these themes the metrics are collected from the systematic mapping study. The thematic map for metrics is provided in figure 4.2.

Figure 4.2: Thematic map for metrics from SMS

According to the thematic map, the metrics are classified into three themes or schemas, as shown in figure 4.2; each schema is described below with its list of metrics. A total of 39 metrics are categorized under performance, 17 under scalability, and 13 under reliability. From this study, we also identified the two most mentioned metrics for each attribute by calculating the percentage of their occurrence across the 80 articles.

4.1.1.1 Performance

The performance schema relates to the performance attribute, in which the performance of the web application is measured using metrics; the metrics available in the literature are collected and analyzed. We observe that response time (80%) and throughput (57%) are the most covered metrics in the literature.
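To make the relationship between these two metrics concrete, the following short Python sketch (our own illustration, not taken from the reviewed studies; the target URL and request count are hypothetical) measures the response time of individual requests and derives throughput as completed requests per unit time:

    # Illustrative sketch: measuring response time and throughput for a web page.
    # The target URL and the number of requests are hypothetical example values.
    import time
    import urllib.request

    TARGET_URL = "http://localhost:8080/"   # hypothetical application under test
    REQUESTS = 20

    response_times = []
    run_start = time.time()
    for _ in range(REQUESTS):
        start = time.time()
        with urllib.request.urlopen(TARGET_URL, timeout=10) as response:
            response.read()                  # wait for the full response body
        response_times.append(time.time() - start)
    elapsed = time.time() - run_start

    print(f"average response time: {sum(response_times) / len(response_times):.3f} s")
    print(f"throughput: {REQUESTS / elapsed:.2f} requests/s")

A production load test would of course issue many concurrent requests, but the two reported quantities are the same.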

We also observed that more articles focus on performance-related metrics than on the scalability and reliability attributes, indicating that the performance of web applications has received more attention than scalability and reliability. The list of performance metrics is provided in table 4.1.

Table 4.1: Performance metrics

ID | Performance metric | Frequency count
1 | Response time | 63
2 | Throughput | 45
3 | Number of concurrent users | 23
4 | CPU utilization | 19
5 | Number of hits per sec | 14
6 | Memory utilization | 13
7 | Disk I/O (access) | 8
8 | Latency | 8
9 | Think time | 7
10 | Elapsed time (disk) | 6
11 | Processor time | 5
12 | Roundtrip time | 5
13 | Number of transactions per sec (HTTP) | 4
14 | Number of HTTP requests | 3
15 | Load time | 3
16 | Cache hit ratio | 3
17 | Hit value | 3
18 | Session length | 3
19 | Capacity | 2
20 | Cache hit | 2
21 | Disk utilization | 2
22 | Network traffic (bandwidth) | 2
23 | Requests in bytes per sec | 2
24 | Disk space | 1
25 | Hit ratio | 1
26 | Page load time and request time | 1
27 | Availability | 1
28 | Number of connections per sec (user) | 1
29 | Session time | 1
30 | Transaction time | 1
31 | Connect time | 1
32 | Request rate | 1
33 | Total page size | 1
34 | Total page download time | 1
35 | First byte time | 1
36 | DNS lookup time | 1
37 | Cache memory usage | 1
38 | Number of successful virtual users | 1
39 | Available memory | 1

4.1.1.2 Scalability

The scalability schema relates to the scalability attribute, in which the scalability of the web application is measured using scalability metrics; the metrics available in the literature are collected and analyzed. According to Guitart et al. [21], scalability is considered a sub-part of performance. Scalability testing checks whether the system is capable of handling heavy load without any performance degradation, and such degradation can be overcome by adding additional resources. From the analysis, we observed that scalability is mainly used for measuring the performance of the system after the addition of resources; since it measures the performance of the system, some metrics overlap between scalability and performance.

From the analysis, we observe that scalability is the second most mentioned attribute in the selected literature. We also observed that response time (80%) and throughput (57%) are the most covered metrics in the literature. The list of metrics identified for scalability is provided in table 4.2.

Table 4.2: Scalability metrics

ID | Scalability metric | Frequency count
1 | Response time | 63
2 | Throughput | 45
3 | Number of concurrent users | 23
4 | CPU utilization | 19
5 | Number of hits per sec | 14
6 | Memory utilization | 13
7 | Disk I/O (access) | 8
8 | Latency | 8
9 | Number of connections per sec (user) | 6
10 | Disk queue length (request) | 5
11 | Number of transactions per sec (HTTP) | 4
12 | Disk space | 1
13 | CPU model | 1
14 | CPU clock | 1
15 | Number of cores | 1
16 | Max. CPU steal | 1
17 | Available memory | 1
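Since the reviewed literature treats scalability largely as how performance behaves when resources are added, a common way to summarize such measurements (a standard formulation, not one proposed in the reviewed articles) is as speedup and scaling efficiency over a baseline configuration:

    \[
      S(N) = \frac{X(N)}{X(1)}, \qquad E(N) = \frac{S(N)}{N}
    \]

where $X(N)$ is the throughput measured with $N$ resource units (e.g., servers), $S(N)$ is the speedup over a single unit, and an efficiency $E(N)$ close to 1 indicates near-linear scaling.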

4.1.1.3 Reliability

The reliability schema relates to the reliability attribute, in which the reliability of the web application is measured using reliability metrics; the metrics available in the literature are collected and analyzed. We observed that MTBF (10%) and number of errors (10%) are the most covered reliability metrics in the literature.

From the analysis, we observe that reliability is the least mentioned attribute in the selected literature: the number of articles addressing reliability is smaller than for the other two attributes. The list of metrics identified for the reliability attribute is provided in table 4.3.


Table 4.3: Reliability metrics

ID | Reliability metric | Frequency count
1 | MTBF | 8
2 | Number of errors | 8
3 | Number of sessions | 5
4 | Failure rate (request) | 4
5 | MTTF | 3
6 | MTTR | 2
7 | Errors percentage | 2
8 | Error ratio | 2
9 | Number of connection errors | 2
10 | Number of timeouts | 2
11 | Successful or failed hits | 1
12 | Number of deadlocks | 1
13 | Rate of successfully completed requests (goodput) | 1

4.1.2 Interviews and documents

The data obtained from the interviews is used to answer research question RQ1.2, i.e., which metrics are used by software testers in practice for testing PSR attributes. The metrics are classified into themes, and based on these themes the metrics are collected from the interviews. The thematic map for metrics is provided in figure 4.3.

Figure 4.3: Thematic map for metrics from interviews

The metrics theme shown in figure 4.3 is classified into three sub-themes: performance, scalability, and reliability. The performance theme contains data coded from 12 interviewees, the scalability theme from 8 interviewees, and the reliability theme from 4 interviewees.


4.1.2.1 Performance

All interviewees (12 out of 12) reported metrics for the performance attribute, and all of them are familiar with the metrics they use in performance testing. They also mentioned some metrics which we did not find in the literature, such as rendezvous point, queue percentage, and ramp-up and ramp-down time.

• Ramp-up and ramp-down time: ramp-up gradually increases the load on the server to find the breaking point, and ramp-down gradually decreases the load in order to recover from the ramp-up (a minimal sketch follows these definitions).

• Rendezvous point: a point where all emulated virtual users wait until every expected user has been emulated, and then all virtual users send their requests at the same time.

• Queue percentage: the percentage of the work queue size currently in use.
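To make these notions concrete, the following Python sketch (our own illustration, not taken from the interviews or from any particular testing tool) emulates a handful of virtual users: the staggered thread starts act as the ramp-up, and a barrier acts as the rendezvous point. The target URL, user count, and timing values are hypothetical:

    # Illustrative sketch: ramp-up and a rendezvous point with plain Python threads.
    # The target URL, user count, and ramp-up period are made-up example values.
    import threading
    import time
    import urllib.request

    TARGET_URL = "http://localhost:8080/"   # hypothetical application under test
    VIRTUAL_USERS = 10
    RAMP_UP_SECONDS = 5                     # users are started gradually over this period

    # The barrier is the rendezvous point: every virtual user waits here until all
    # of them have been emulated, and then they all send their requests together.
    rendezvous = threading.Barrier(VIRTUAL_USERS)

    def virtual_user(user_id: int) -> None:
        rendezvous.wait()                   # rendezvous point
        start = time.time()
        try:
            with urllib.request.urlopen(TARGET_URL, timeout=10) as response:
                response.read()
            print(f"user {user_id}: response time {time.time() - start:.3f} s")
        except OSError as error:
            print(f"user {user_id}: request failed ({error})")

    threads = []
    for i in range(VIRTUAL_USERS):
        thread = threading.Thread(target=virtual_user, args=(i,))
        thread.start()
        threads.append(thread)
        time.sleep(RAMP_UP_SECONDS / VIRTUAL_USERS)   # ramp-up: stagger user starts

    for thread in threads:
        thread.join()

Ramp-down would be the mirror image: the load is reduced step by step instead of stopping all users at once.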

The metrics used in practice for measuring the performance attribute are given below.

• Number of transactions per sec

• CPU utilization

• Memory utilization

• Processor time

• Throughput

• Disk I/O

• Number of hits per sec

• Number of requests per sec

• Number of concurrent users

• Network usage

• Server requests and response

• Speed

• Response time

• Rendezvous point

• Transactions pass and fail criteria

• Ramp-up time and ramp-down time


• Error percentage

• Queue percentage

• Bandwidth

• Network latency

4.1.2.2 Scalability

Eight out of 12 interviewees mentioned metrics they use for measuring scalability in practice. One interviewee stated that scalability is the main attribute for the studied company, so they have focused more on scalability, as their domain is telecom. The scalability metrics obtained from the interviews are given below.

• Load distribution

• Throughput

• Number of concurrent users

• CPU utilization

• Response time

• Memory utilization

• Disk I/O

• Number of requests per sec

4.1.2.3 Reliability

Four out of 12 interviewees mentioned metrics they use for measuring the reliability attribute. Some of the interviewees are not familiar with reliability testing, so they did not provide any information related to its metrics. The majority of the interviewees provided the same type of metrics for reliability. The metrics used in practice for measuring reliability are given below.

4.1.2.3.1 MTBF  MTBF is the mean time between failures, defined as the time gap between one identified failure and the next. Four out of 12 interviewees provided this metric. The main reason mentioned by the interviewees is that the developed application should be fault tolerant, so to make sure that the application is reliable, it is tested under different scenarios.
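For reference, the usual way to compute this metric (a standard reliability formulation rather than something reported by the interviewees) relates it to MTTR and a steady-state availability estimate:

    \[
      \mathrm{MTBF} = \frac{\text{total operational time}}{\text{number of failures}}, \qquad
      \mathrm{MTTR} = \frac{\text{total repair time}}{\text{number of failures}}, \qquad
      \text{Availability} \approx \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
    \]

For example, a web application observed for 600 hours of operation with three failures has an MTBF of 200 hours.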


One of the interviewees stated that “whatever is the scenario the application should work; the fault tolerance is a must in the developed application.” – Test specialist

Another interviewee mentioned that “There should be very less amount of errors or zero error in order to be web application reliable, if the application is life critical then it should definitely be reliable without any error chance.” – Senior solution integrator

4.1.2.3.2 Number of failures  The number of failures an application has experienced indicates its reliability. The tolerance for failures depends on the type of application: for a normal application that only provides basic information it is not a big issue, but if the number of failures is high in an e-commerce type of web application, the business will suffer. The number of failures metric therefore provides information about how many times an application has failed.

4.1.2.4 Summary

The interviews were conducted at both the case company and other organizations; we find that the metrics provided by the eight interviewees from the case study overlapped with the metrics provided by the interviewees from other companies. The metrics obtained from the case company and the other companies are provided in Appendix G. The interviews with other companies were conducted in order to check whether the results from the case company are typical.

4.1.2.5 Documents

The metrics which were collected from the previous project test reports are provided here. The documents were analyzed and the metrics for the PSR attributes were distinguished. The collected documents contain metrics for performance and scalability: among the 18 documents, 14 are related to performance and six to scalability. All the identified metrics of the PS attributes from the documents overlapped with the metrics identified from the interviews. We were unable to collect reliability-related documents, as they were not available to us. Metrics are classified into themes, and based on these themes the metrics were collected from the documents. The thematic map for metrics is provided in figure 4.4. A total of 16 metrics were obtained from the documents, out of which nine belong to performance and seven to scalability. The identified metrics are given in table 4.4.


Figure 4.4: Thematic map for metrics from documents

Table 4.4: Metrics obtained from documents

S. No  Performance                      Scalability
1      Number of transactions per sec   Load distribution
2      CPU utilization                  CPU utilization
3      Number of bytes per sec          Memory utilization
4      Number of sessions               Response time
5      Execution time                   Throughput
6      Response time                    Number of threads
7      Number of bytes per sec          Number of users
8      Latency
9      Memory utilization

4.1.3 Criteria for selection of metrics

A total of 12 interviews were conducted to identify the criteria for the selection of metrics in practice. All interviewees described the criteria in a similar way: metric selection mainly depends on the type of web application being tested, metric dependencies, and customer and market requirements. One of the interviewees stated, "Metric selection mainly depends on customer and market requirements; as customers may use different configuration settings and hardware, we need to test on their settings and provide figures (outputs or measured values) to them showing that the given application works on the provided settings." - Verification specialist.


4.2 Facet 2: Tools for testing PSR attributes

In this section, we present the results of our second research question by using the data collected from the systematic mapping study, interviews and documents. Tools are used to evaluate web applications against selected criteria; they simplify the testing process and help the testers to solve complex problems without much strain. The tools can be of two types: manual and automated. In our study, the tools related to the PSR attributes were collected from different data sources: 54 from the SMS, 18 from interviews and four from documents. A total of 76 tools were identified from all the sources, and the information about the availability, language support, platform support, developer, URL, source type and quality attribute for each tool is provided as a list in Appendix D.

This section is structured into three subsections as follows: the first subsection consists of tools obtained from the systematic mapping study, which answers RQ2.1; the second subsection consists of tools obtained from both the interviews and the documents, which answers RQ2.2; and the third subsection consists of the drawbacks and improvements identified in the tools from practice, which answers RQ2.3.

4.2.1 Systematic mapping study

This section mainly addresses the data collected from the systematic mapping study. Out of 97 articles, 76 articles deal with tools related to PSR. The tools are classified into themes, and based on these themes the tools are collected from the systematic mapping study. The thematic map for tools is provided in figure 4.5. So in our research, the tools obtained from the systematic mapping study are

Figure 4.5: Thematic map for tools from SMS

classified into different schemes. They are performance, scalability and reliability


as shown in the figure 4.5. This classification provided an interesting outcome: the tools related to scalability and reliability are mentioned far less often in the articles. A total of 53 tools were identified for the performance attribute, 23 tools for scalability and five tools for reliability. Based on the analysis, we identified that tools related to performance can also be used to measure the scalability attribute, because scalability testing is performed by adding additional resources and measuring the performance deviations in the system after adding a resource. Reliability, in contrast, relies mainly on Markov chain models, and only five tools were identified from the literature. From this study, we also identified the two most mentioned tools for each attribute by calculating the percentage of their occurrence in the 76 articles. The list of all tools obtained from the SMS, along with the frequency count, is provided in table 4.5.
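The frequency percentages quoted in the following subsections are simply the number of articles mentioning a tool divided by the 76 tool-related articles. The short sketch below shows the calculation; the mention counts used here are made-up placeholders, not the actual counts from our data extraction sheet.

```python
TOTAL_ARTICLES = 76  # articles from the SMS that deal with PSR tools

# Hypothetical mention counts; the real counts are in our extraction sheet.
mention_counts = {"Tool A": 19, "Tool B": 7, "Tool C": 2}

for tool, count in mention_counts.items():
    percentage = 100 * count / TOTAL_ARTICLES
    print(f"{tool}: mentioned in {count} articles ({percentage:.0f}%)")
```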

4.2.1.1 Performance

The performance scheme covers the tools which can be used to perform testing to measure performance. The performance tools are subdivided into load testing tools, stress testing tools, and performance monitoring and profiling tools. The most mentioned tools in the literature are Apache JMeter (25%) and LoadRunner (31%) based on the frequency count. The tools related to performance are listed in table 4.5.

4.2.1.2 Scalability

The scalability scheme covers the tools which can be used to perform testing in order to calculate the performance measures after adding additional resources. The scalability tools are subdivided into monitoring tools and scalability tools. The most mentioned tools in the literature are WebLOAD (9%) and Silk Performer (5%) based on the frequency count. The tools related to scalability are listed in table 4.5.

4.2.1.3 Reliability

The reliability scheme covers the tools which can be used to perform testing to calculate error and fault measures under certain criteria, such as the number of users, number of hits, or number of sessions. The reliability tools are mentioned fewer times compared to the other two attributes. The most mentioned tool in the literature is TestComplete (3%) based on the frequency count. The tools related to reliability are listed in table 4.5.


Table 4.5: Identified PSR tools

ID  Tool                                      Attribute
1   Apache JMeter™                            Performance (load testing)
2   LoadRunner                                Performance (load testing)
3   WebKing                                   Performance (load testing), Reliability
4   iPerf                                     Performance
5   Tsung                                     Performance (load, stress testing), Scalability
6   WAPT                                      Performance (load, stress)
7   openSTA                                   Performance (load, stress)
8   SOAtest                                   Reliability, Performance (load, stress)
9   Microsoft Web Application Stress Tool     Performance (stress)
10  httperf                                   Performance (load)
11  The Grinder                               Performance (load)
12  WebLOAD                                   Performance (load, stress), Scalability
13  Silk Performer                            Performance (load, stress), Scalability
14  Webserver Stress Tool                     Performance (load, stress)
15  QAload                                    Performance (load, stress), Scalability
16  Wireshark                                 Performance
17  Firebug                                   Performance (web page performance analysis)
18  Oprofile                                  Performance (performance counter monitoring/profiling tool)
19  Xenoprof                                  Performance (performance counter monitoring/profiling tool)
20  SoapUI                                    Performance (load testing)
21  CloudTest                                 Performance (load testing), Scalability
22  collectl                                  Performance (monitoring tool)
23  ApacheBench                               Performance (load testing)
24  TestComplete                              Performance, Scalability, Reliability
25  MBPeT (a performance testing tool)        Performance, Scalability
26  collectd                                  Performance (load testing)
27  Cacti                                     Performance
28  FastStats Log File Analyzer               Reliability, Performance (load, stress)
29  Rational TestManager                      Performance
30  Pylot                                     Performance, Scalability
31  loadstorm                                 Performance (load testing)
32  Rational Performance Tester               Performance
33  Testmaker                                 Performance, Scalability
34  Siege                                     Performance (load testing)
35  LOADIMPACT                                Performance (load testing)
36  Advanced Web Monitoring Scripting (KITE)  Performance (monitoring)
37  Visual Studio                             Performance (load, stress testing)
38  testoptimal                               Performance (load testing)
39  WebSurge                                  Performance (load, stress)
40  Application Center Test                   Performance (stress, load), Scalability
41  e-TEST suite                              Performance, Reliability
42  Watir-webdriver                           Performance
43  Selenium WebDriver                        Performance, Scalability
44  AppPerfect Load Test                      Performance (load, stress)
45  Yslow                                     Performance analysis
46  BrowserMob                                Performance (load), Scalability
47  NeoLoad                                   Performance (load, stress)
48  perf                                      Performance (monitoring)
49  Blazemeter                                Performance (load)
50  Zabbix                                    Scalability (monitoring tool)
51  Nagios                                    Scalability (monitoring tool)
52  Opsview                                   Scalability (monitoring tool)
53  HyperHQ                                   Scalability (monitoring tool)
54  HP QuickTest Professional                 Performance (load)

4.2.2 Interviews and documents

The data obtained from the interviews is used to answer research question RQ2.2, i.e. which tools are used in practice for web testing by software testers. The tools are classified into themes, and based on these themes the tools are collected from the interviews. The thematic map for tools is provided in figure 4.6.


Figure 4.6: Thematic map for tools from interviews

The tools theme, as shown in figure 4.6, is classified into seven sub-themes: commercial, frameworks, freeware or trial, internal tool, monitoring tool, open source and simulators. The commercial theme contains the data coded from nine interviewees, the frameworks theme contains the coded data of two interviewees, the freeware or trial theme contains the coded data of three interviewees, the internal tool theme contains the coded data of six interviewees, the monitoring tool theme contains the coded data of five interviewees, the open source theme contains the coded data of 11 interviewees, and the simulators theme contains the coded data of two interviewees. A clear representation of these sub-themes and the number of interviewees for each sub-theme is provided using the bar chart shown in figure 4.7.

Figure 4.7: Types of tools obtained from interviews

4.2.2.1 Commercial

The commercial theme consists of the tools which are licensed products. Nine out of 12 interviewees provided the list of commercial tools they have used from their


experience. Commercial tools are used because the company has provided the testers with licensed software. The selection of tools is not in the hands of the software testers; it depends on the company. One of the reasons we identified from the interviews for selecting specific tools (while leaving out others) is customer requirements and company needs. The list of commercial tools for testing performance, scalability and reliability provided in the interviews is presented in table 4.6.

Table 4.6: Commercial tools obtained from interviews

S. No  Tool name                                      Number of interviewees mentioned
1      HP LoadRunner (old name: Mercury LoadRunner)   6
2      VMware vCenter                                 1
3      QuickTest Professional                         4
4      HP Quality Center                              2
5      IBM RPT                                        1
6      Silk Performer                                 1
7      Sahi Pro                                       1

4.2.2.2 Frameworks

Two out of 12 interviewees provided data about scalability tools: they use clustering frameworks to enable high scalability in their applications. These clustering frameworks make it possible to identify the node which is going to fail and also make the application fault tolerant. The details provided regarding clustering frameworks are listed in table 4.7.

Table 4.7: Frameworks obtained from interviews

S. No  Tool name               Number of interviewees mentioned
1      AKKA clustering         2
2      Zookeeper clustering    1
3      Oracle RAC clustering   1

4.2.2.3 Freeware / trial

Three out of 12 interviewees mentioned freeware or trial-version tools for testing the non-functional attribute performance. All three interviewees mentioned the same tool, which is available both in a standard version


and in a pro version for trial. The tool is SOAP UI, with which a tester can generate load and view the requests and responses exchanged with the application.

4.2.2.4 Internal

Six out of 12 interviewees mentioned a tool or framework used in Ericsson for non-functional testing. One of the interviewees stated that “Ericsson prefers making its own tools rather than depending on others, so now we are working on a performance benchmark tool for non-functional testing.” – Test specialist.

The reason for developing and preferring internal tools over commercial tools in the case company is the flexibility to update them and support the needed requirements. One of the interviewees mentioned that “the commercial tools do not satisfy all the required needs of the company, whereas in the case of an internal tool we can develop the tool based on our own requirements.” – Verification specialist.

4.2.2.5 Monitoring

Five out of 12 interviewees mentioned monitoring tools, which are used to monitor the usage of the network and data and the utilization of memory and CPU. These tools are generally used in scalability and performance testing. The tools are provided in table 4.8.

Table 4.8: Monitoring tools obtained from interviews

S. No  Tool name          Number of interviewees mentioned
1      M1 - Monitor One   2
2      Wireshark          1

4.2.2.6 Open source

Eleven out of 12 interviewees mentioned open source tools; they currently use the JMeter tool in the company. Open source tools are freely available online and do not need to be purchased, so many companies encourage their use. The tools provided in the interviews are listed in table 4.9.


Table 4.9: Open source tools obtained from interviews

S. No  Tool name       Number of interviewees mentioned
1      Apache JMeter   11
2      Selenium        3
3      Ixia            1

4.2.2.7 Simulators

Two out of 12 interviewees mentioned the usage of simulators in the performance testing of web applications. The simulators generate the load and play the scenario to measure the performance. The simulator mentioned by the interviewees is the DMI simulator, which generates load on request. One of the interviewees stated that “if you want to create a lot of devices in general, send a lot of traps to them and test the load on the given system, you can use simulators.”

4.2.2.8 Summary

The interviews were conducted at both the case company and other organizations; we find that the tools generally used by the case company and the other organizations are Apache JMeter and LoadRunner. The tools obtained from the case company and the other companies are provided in Appendix G. So the data related to tools collected from the interviewees is general rather than typical only of the case company.

4.2.2.9 Documents

Through the analysis of test reports of previous projects, we collected some tools, and these overlap with the tools provided by the interviewees. The tools are classified into themes, and the thematic map for tools is provided in figure 4.8. A total of four tools were identified from the available 18 documents and are provided in table 4.10. These documents are mainly performance and scalability test reports.


Figure 4.8: Thematic map for tools from documents

The tools identified from the documents are listed in table 4.10.

Table 4.10: Tools obtained from documents

S. No  Tool name
1      Apache JMeter
2      SOAP UI
3      Jconsole
4      Pure load enterprize

4.2.3 Tool drawbacks and improvements

Seven out of 12 interviewees described drawbacks in the tools they used. The most commonly mentioned drawbacks concern the JMeter tool. Five drawbacks were identified from these seven interviewees.

• A limit on the number of virtual users in the JMeter tool.

• JMeter makes the interaction between the systems very complex during the simulation process.

• Fixing an identified bug in a commercial tool is not possible, as it is not the company's own proprietary tool.

• JMeter also fails to cope when the number of hits from the virtual users increases, which leads to deadlocks.

• The LoadRunner tool fails to work properly on low-configuration systems, as it is a high-end tool.


The case company developed an internal tool to overcome these drawbacks. The interviewees did not mention any specific improvements; they only stated that the tools can be improved by overcoming these limitations or drawbacks.

4.3 Facet 3: Challenges faced by software testers

In this section, we present the results of our third research question RQ3 by using the data collected from the systematic mapping study, interviews and documents. Generally, challenges are the issues or problems faced. In this study, all the challenges related to PSR are collected and analyzed. A total of 18 challenges related to metrics, development, user and tools were identified from the SMS, whereas from the interviews a total of 13 challenges related to tools, development, time, metrics, and network were identified. Only three challenges, related to metrics, were observed in the documents.

This section is structured into three subsections: the first subsection consists of challenges obtained from the systematic mapping study, which answers RQ3.1; the second subsection consists of challenges obtained from both the interviews and the documents, which answers RQ3.2; and the third subsection examines whether any mitigations from the literature mitigate the challenges identified in practice.

4.3.1 Systematic mapping study

This section mainly addresses the data collected from the systematic mapping study. Out of 97 articles, 33 articles deal with challenges related to PSR. The challenges are classified into themes, and the thematic map for challenges in the systematic mapping study is provided in figure 4.9.

Figure 4.9: Thematic map for challenges from SMS

In our research the challenges faced by software testers are collected from the SMS and classified into four schemas: metrics, user, development, and


tools. The metrics theme contains the data coded from six articles, the user theme contains the data coded from three articles, the tools theme contains the data coded from 11 articles, and the development theme contains the coded data from 10 articles. A clear representation of these sub-themes and the number of articles for each sub-theme is provided using the bar chart shown in figure 4.10.

Figure 4.10: Number of articles addressed each theme from SMS

4.3.1.1 User

User schema consists of user-based challenges and issues which depend upon the user behavior. The data coded from the three articles represent three different challenges related to the user. So from the literature we identified three challenges related to the user schema. Each challenge is described below.

4.3.1.1.1 Challenge 1 According to Abbors et al. [51], the behavior of the user needs to be simulated based on the real environment. If the simulated user behavior does not match the real user behavior, it may cause faults when deployed in the live environment which were not observed during the testing phase. So the challenge is to know how to simulate real user behavior.

4.3.1.1.2 Challenge 2 According to Arkels and Makaroff [52], the challenge is to know whether removing the identified bottlenecks improves the overall performance of the system or merely causes another bottleneck due to different user actions.


4.3.1.1.3 Challenge 3 According to Gao et al. [53], user satisfaction depends mainly on the performance of the application. The challenge is to find how users react to different response times and what actions are performed by the users in relation to server responses.

Mitigation: To find the user reactions, a framework was developed by the authors [53] that monitors and retrieves user patterns from the web logs and generates performance test cases.
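The sketch below illustrates the general idea of such a framework under simplifying assumptions: it parses a web access log (a hypothetical, space-separated format), counts how often each URL is requested, and turns the observed paths into weights for generating load-test scenarios. It is not the authors' implementation [53], only an outline of the technique.

```python
from collections import Counter

def extract_user_patterns(log_lines):
    """Count requested URLs in a (hypothetical) space-separated access log."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:          # expected format: "<timestamp> <url> ..."
            counts[parts[1]] += 1
    return counts

def build_load_profile(counts):
    """Weight each URL by its share of observed traffic."""
    total = sum(counts.values())
    return {url: hits / total for url, hits in counts.items()}

log = [
    "10:00:01 /home",
    "10:00:02 /search",
    "10:00:03 /home",
    "10:00:04 /checkout",
]
profile = build_load_profile(extract_user_patterns(log))
print(profile)   # {'/home': 0.5, '/search': 0.25, '/checkout': 0.25}
```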

4.3.1.2 Tools

Tools schema consists of challenges related to tools and their drawbacks. The data coded from the eleven articles are mapped into five different challenges. So from the literature we identified five challenges related to the tools schema. Each challenge is described below.

4.3.1.2.1 Challenge 1 As already mentioned in the user theme, imposing real user behavior is a challenge [54]. Shojaee et al. [55] found two different ways to simulate real user behavior: one way is to randomize the user data by replacing it with the recorded user input, and the other is to provide the inputs manually. Neither of the identified ways is efficient; the first approach does not really simulate the user behavior, whereas the other approach is a difficult task.

4.3.1.2.2 Challenge 2 This challenge is related to the tool environment, i.e. an improper tool environment may also cause problems in testing [16].

Mitigation: In order to overcome the challenge posed by the environment, the following factors need to be handled properly:

• Installation of tool

• Tool setup

• Flexibility of the tool to perform the test

4.3.1.2.3 Challenge 3 This challenge is related to the Apache JMeter tool; many testers find it hard to create a larger number of virtual users, as JMeter only supports a limited number of virtual users [56]. To overcome this challenge, JMeter provides a distributed setup which consists of a master and slave servers. However, the distributed setup raises other challenges, such as script configuration issues when executing in distributed mode.

Another challenge regarding the JMeter tool is the generation of test scripts [56, 12]. There is a need for support of external plugins through which test scripts can be generated. Even though there is a plugin named badboy, the


compatibility between JMeter and badboy is not efficient, so the challenge needs to be mitigated.

Kiran et al. [57] stated that “JMeter script does not capture all the dynamic values, such as SAML Request, Relay State, Signature Algorithm, Authorization State, Cookie Time, Persistent ID (PID), JSession ID and Shibboleth, generated using single sign-on mechanism of Unified Authentication Platform.” Another challenge of the JMeter tool is the inability to record test cases. Along with that, it also provides confusing charts and unclear terminology.

4.3.1.2.4 Challenge 4 According to Quan et al. [58], most of the tools available for testing the performance quality attribute only support the creation of simple test case scenarios. These scenarios may not be sufficient to determine the transaction time and the number of simultaneous users, and they also make it difficult to identify the bottlenecks existing in the application.

4.3.1.2.5 Challenge 5 Tools which use random user sessions and log-file based sessions for simulating virtual users are not able to provide a realistic workload.

Mitigation: Xu et al. [54] provide a configuration file for each virtual user based on a continuous Markov chain. It provides the information regarding the visiting paths, stay times, and visiting moments.
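A minimal sketch of this idea is shown below, assuming a hypothetical two-page application: each virtual user's visiting path is sampled from a Markov transition matrix and the stay time on each page is drawn from an exponential distribution, which together approximate the per-user configuration (visiting paths, stay times, visiting moments) described by Xu et al. [54]. It is an illustration of the technique, not their implementation.

```python
import random

# Hypothetical transition probabilities between pages (each row sums to 1).
TRANSITIONS = {
    "home":   {"home": 0.1, "search": 0.6, "exit": 0.3},
    "search": {"home": 0.3, "search": 0.4, "exit": 0.3},
}
MEAN_STAY_SECONDS = {"home": 5.0, "search": 8.0}

def generate_session(start="home"):
    """Sample one virtual user's visiting path and stay times."""
    page, session = start, []
    while page != "exit":
        # Exponentially distributed stay time on the current page.
        stay = random.expovariate(1.0 / MEAN_STAY_SECONDS[page])
        session.append((page, round(stay, 1)))
        pages, probs = zip(*TRANSITIONS[page].items())
        page = random.choices(pages, weights=probs)[0]
    return session

print(generate_session())  # e.g. [('home', 3.2), ('search', 11.7)]
```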

4.3.1.3 Metric

Metric schema consists of challenges related to metrics while performing the testing. The data coded from the six articles are mapped into six different challenges. So from the literature we identified six challenges related to the metrics schema. Each challenge is described below.

4.3.1.3.1 Challenge 1 According to Shams et al. [59], identifying the existing dependencies between requests, i.e. taking previous requests into account while validating the present request, is one of the main challenges in performance testing.

4.3.1.3.2 Challenge 2 Jiang et al. [60] stated that the selection of parameters and the criteria for testing is an important issue in performance testing. The selection of metrics is a challenge: several parameters are available, but selecting a suitable set of them is a difficult task.

4.3.1.3.3 Challenge 3 Nikfard et al. [6] observed that most of the challenges faced during performance testing are related to faults identified in the running environment, i.e. caused by poorly deployed resources.


4.3.1.3.4 Challenge 4 Guitart et al. [21] stated that during scalability testing the main challenge is related to resources such as CPU, server, memory, and disk.

Mitigation: This challenge can be mitigated by identifying the type of resource required; adding that particular resource and measuring its effect helps in reducing the challenge.

4.3.1.3.5 Challenge 5 Zhou et al. [61] stated that problems related to CPU bottlenecks and I/O bottlenecks can be avoided by controlling the virtual users.

Another challenge is related to network delay and server computation ability, which cannot be controlled manually. These issues generally arise from the network connection and the server processor.

4.3.1.3.6 Challenge 6 Specifying the load test parameters, such as the generation of forms and the recognition of the returned pages, is a major challenge according to Lutteroth et al. [62].

Mitigation: These challenges can be overcome by adding the specifications to a form-specific model.

4.3.1.4 Development

Development schema consists of challenges related to the development area, coding errors, etc. The data coded from the 10 articles are mapped into four different challenges. So from the literature we identified four challenges related to the development schema. Each challenge is described below.

4.3.1.4.1 Challenge 1 Testing involves scenarios to be tested. In the case of a large application there may be 100 scenarios for the login page alone, so handling a large number of scenarios is a challenge. A mitigation for this challenge is provided by Talib et al. [63].

Mitigation: A metric-based test case partitioning algorithm is used to generate the test cases; it produces three equivalence classes and reduces the number of test cases.

4.3.1.4.2 Challenge 2 Sometimes errors in the application are not found in the development environment, and the application may not work properly in another environment due to faults [64]. The challenge is to reveal the actual error before the end user encounters it.

4.3.1.4.3 Challenge 3 If a site becomes popular, the load that was considered during development is no longer sufficient [3], so it is a challenge to know how many users may hit the site at the same time.


4.3.1.4.4 Challenge 4 This development challenge concerns unnecessary sleep statements and the garbage collection heap. The garbage collection heap may cause socket errors, which leads to a decrease in the number of hits, and it may also decrease the server response as the number of virtual users increases [24].

4.3.2 Interviews and documents

The challenges faced by software testers were obtained from the interviews. The challenges were analyzed using thematic analysis, and the thematic map for the challenges is represented in figure 4.11.

Figure 4.11: Thematic map for challenges from interviews

The challenges obtained from the interviews are classified into five themes, as shown in figure 4.11: metrics, network, development, time and tools. The metrics theme contains the data coded from two interviewees, the network theme contains the data coded from three interviewees, the development theme contains the data coded from eight interviewees, the tools theme contains the coded data from seven interviewees, and the time theme contains the data coded from five interviewees. A clear representation of these sub-themes and the number of interviewees for each sub-theme is provided using the bar chart shown in figure 4.12.


Figure 4.12: Number of interviewees addressed the themes

4.3.2.1 Development

In this section the challenges related to development, such as developing a script or code for test cases, are provided. Eight out of 12 interviewees addressed the challenges related to development.

The majority of these challenges are related to script issues. In general, testing can be performed using scripts. The software testers face challenges while developing the testing scripts, and some of the script-related challenges are provided below.

Regarding unclear or changing non-functional requirements, one of the interviewees stated that “As we are working in agile, the requirements sometimes are never clear and there may be a sudden change in requirements or from another module team, so the automated scripts sometimes need to be reworked.” - Senior software engineer.

Another interviewee stated that “Sometimes in LoadRunner while creating scenarios, we have to capture browser request while building scripts. While building those scripts sometimes the parameters which are captured are not exactly what we want. To achieve this, we need to add additional addins to capture those details because browsers do not support by default. Then there are several parameters which are changing the values dynamically. For one request it will be one value and for other request there will be other value, so that part needs to be captured. And it is very hard to know it, if you don’t see each and every request while building the scripts.” – Verification specialist.


Other challenges are related to technology expertise. Some of the challenges identified in this area are provided below.

One of the challenges is that it is difficult to learn scripts and commands of other technologies. One of the interviewees stated that “Unix commands are needed in order to use some tools so it is difficult for me, and also the database scripting is difficult for me but other guys might be able to do it.” - Senior solution integrator.

Regarding the challenge of simulator development, one of the interviewees stated that “Sometimes developing a simulator could be something challenging, because if I talk about myself I am not very good in Java and most of the simulators are being built in Java. So of course we need the developers’ help sometimes for the simulated environment.” – Senior test engineer.

Another challenge is related to testability and test automation; it may arise during testing. One of the interviewees stated that “Only at the code level that we face challenges but not as far as I like, yes there are certain areas which are not testable or which are to be manually testable.” – Test quality architect.

A challenge related to reliability may be due to a developer fault. One of the interviewees stated that “If there is some issue with reliability, then the developer might have forgotten to put a condition check. For example, if the users are limited to 1000, then what if a 1001st user arrives? A validation check for more than 1000 users needs to be there. Developers are missing these logics while developing the code.” – Test specialist.

4.3.2.2 Metrics

Two out of 12 interviewees mentioned challenges related to metrics. Some of the challenges they addressed are provided below. One of the interviewees stated that “metrics might cause the tests to be rerun, and also a change in benchmarks may be a huge challenge as all the tests need to be rerun again to check the quality of the application. Re-testing the whole process takes so much time.” - Verification specialist

Another interviewee stated that “Suppose a maximum number of users for an application is provided, and the tests are run by generating load. If there is any change in the number of users, then the load needs to be regenerated for all the users. In this case, understanding the breakpoint is a challenge.” - Software tester.

4.3.2.3 Network

Three out of 12 interviewees mentioned challenges related to the network. The major challenge faced by the software testers is network loss, which prolongs the tester's work further.

One of the interviewees stated that “In case of performance testing, challenges are related to network issues; suppose I have to test a game, so it depends on many factors of that internal network or internet. Sometimes a game can work


easily even on an old system with an old line of 512 kbps if the network loss is not too much. If you are getting 100% network, it will work. Same game may not work on 10 mbps line because 75% will be network loss if there are network issues.” – Test specialist.

4.3.2.4 Time

Five out of 12 interviewees mentioned challenges related to time. The time provided for testing is not sufficient for the software testers, which is a major challenge we observed. The challenges provided by the interviewees are as follows.

One of the interviewees stated that “The development team may have the chance to complete their work late, whereas the testers need to complete their job within the given time. Sometimes we cannot say a certain amount of time is sufficient for testing; there may be issues which we don’t know.” - Test quality architect.

Another interviewee stated that “Time for testing is never sufficient; although you plan a lot of things, like you plan two weeks for development work and one week for testing, the real thing is development takes 2 to 2.5 weeks and sometimes only 1 day is available for testing, which is not sufficient.” - Test specialist.

Some of the interviewees mentioned that they have become used to the little amount of time available for testing. But it is still a challenge which needs to be fixed, as it can be mitigated only if the developers finish their work on time.

4.3.2.5 Tool

Seven out of 12 interviewees mentioned that they face challenges related to tools. The data extracted from these seven interviewees provide three different challenges, and they are provided below.

Some challenges relate to the lack of functionality provided by tools and the lack of knowledge in scripting. One of the interviewees stated that “So in JMeter it is not easy to simulate another system, that is the main problem. Now if I need to use JMeter, I have to modify it in such a way that I first simulate the required system, or I need to use the simulator.” The interviewee further stated that “if something like this arises, then because there are proprietary tools like internal tools in Ericsson we can update it, but updating open source tools is not in our hand.” – Test engineer.

Another interviewee stated that “The JMeter tool fails when you hit simultaneously thousand users or two thousand users, thread locks created may respond to the request but other request may be in deadlock state.” – Senior solution integrator.

Regarding challenges related to system configuration, one of the interviewees stated that “Tools like LoadRunner are quite heavy tools; if we are normally


running on a Pentium 2 or 4 GB RAM then it is quite hard for the tool to run on these machines.” – Test specialist.

4.3.2.6 Summary

The interviews were conducted at both the case company and other organizations; we find that the challenges identified from the other organizations almost completely overlap with the challenges identified in the case company. The challenges obtained from the case company and the other organizations are provided in Appendix G.

4.3.2.7 Documents

The challenges analysed from the previous project test reports are provided below. The thematic map for the challenges is provided in figure 4.13.

Figure 4.13: Thematic map for challenges from documents

The challenges obtained from the documents are classified into one schema, metrics, as shown in figure 4.13; three challenges related to metrics were identified.

One of the challenges concerns the slow response of the server, which is caused by a large amount of data being requested. The solution to this challenge is session groups, i.e. fewer requests with a smaller amount of data at a time.


Another challenge is related to the limited number of user connections; the test needs to wait for some time until all the required data files are loaded. This is mitigated by loading the files once and caching them, so that repeated use does not have to wait for the files.
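The mitigation amounts to memoizing the expensive file load. A minimal sketch, assuming the data files are local and small enough to keep in memory, is shown below; the file name is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def load_data_file(path: str) -> bytes:
    # Loaded only once per path; later calls return the cached content,
    # so repeated use does not wait for the file again.
    with open(path, "rb") as f:
        return f.read()

data = load_data_file("test_data.bin")   # first call: reads from disk
data = load_data_file("test_data.bin")   # second call: served from the cache
```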

Another challenge while testing is the large amount of time it takes to know whether a test case passes or not, as test data needs to be exchanged between the client and the server. In order to mitigate this challenge, the case company used a temporary solution, i.e. installing a local database.

4.3.3 Do mitigations available in the literature mitigate challenges in practice?

This section answers RQ3.3. First, the mitigations available for the challenges in the SMS were identified. Then the challenges from practice were identified through interviews and documents. We analyzed the data by comparing the challenges identified from the state of the art and the state of practice and noticed that there are no proper mitigation strategies available to mitigate the challenges identified from practice.

4.4 Facet 4: Important attribute among PSR

In order to answer research question RQ4, we opted for interviews as the source of data collection.

4.4.1 Interviews

The data obtained from the interviews are used to answer research question RQ4. The data collected from the interviews was analysed to identify the most important attribute among PSR in web testing. Based on the collected data, we formed three themes: all are important, application based, and priority order based. The thematic map for the important attribute among PSR is represented in figure 4.14.

All the interviewees answered this question. The all are important theme contains the data coded from three interviewees, the application based theme contains the coded data of six interviewees, and the priority order theme consists of the coded data from three interviewees.


Figure 4.14: Thematic map for important attribute from interviews

4.4.1.1 All are important

Three out of 12 interviewees mentioned that all three attributes are equally important. Regardless of the application, all three attributes, i.e. PSR, need to be tested to remain competitive in the market. By testing all three PSR attributes, we can reduce the chances of the product failing in the live environment.

One of the interviewees stated that “I think all of them are interdependent and each one has its own importance and all the three should go hand in hand. Because if you take performance is good and if errors keep coming like number of users logged in are more. In this case, speed is good but if errors keep forming up then the reliability is not there in that. If it is the case, then it is a problem. If scalability is more like number of users it can accommodate, if more users it can accommodate then the speed will be low. It will not be good; suppose if we load a page it takes 10-15 seconds then users usually get bored and they get irritated first of all. So it is a compromise of all those, I think opting should be there and three of them should be at the optimum level.” - Software tester.

4.4.1.2 Application based

Six out of 12 interviewees mentioned that the importance of an attribute is defined based on the application being tested. Sometimes it is not necessary to test scalability for a website which has a small number of users. So the answer depends upon the application type.

In the words of one interviewee, “Actually it depends on application, normal application mainly goes with performance, no need of scalability. Suppose if a web application has n number of users as a requirement then the application needs to be delivered with a capability of n+1 users. For a banking application reliability is the main aspect, then performance; they already have a certain number of known users.


There will be no sudden raise in the number of users. So you need not go with performance testing always, whereas the reliability is most important as the application needs to provide the service to all the users without any fail. For a simple web application, performance is most important and for the telecom domain, scalability is very important. So it depends on application.” – Test specialist.

One of the interviewees stated that “the selection of attribute mainly depends on the application type and also on the requirements provided by the customer and the market.” – Verification specialist.

4.4.1.3 Priority order

Three out of 12 interviewees mentioned the attributes in an order, i.e. prioritized their importance. By analysing the data, we found that reliability and performance interchange between first and second place, but the scalability attribute is always ranked last. Of these three interviewees, reliability is given first place by two interviewees and second place by one interviewee, whereas performance is given second place by two interviewees and first place by one interviewee. Scalability remains in last place for all three interviewees.

One of the interviewees stated that “Reliability is the most important attribute as it is the basic thing the system should be able to do; if it is not doing that then I think it is a complete fail. Followed by performance and then scalability.” – Senior software engineer.

4.4.1.4 Summary

The interviews were conducted at both the case company and other organizations; we find that the most important attribute varies with each interviewee regardless of whether they are from the case company or another organization, so this particular question is answered mostly from the interviewee's perspective. The important attributes mentioned by the interviewees of the case company and the other companies are provided in Appendix G.


Chapter 5

Discussion

This chapter discusses the findings presented in chapter 4 by relating them to the results obtained from all the sources, namely the systematic mapping study, interviews, and documents. The structure of this chapter is as follows:

• Section 5.1 discusses the metrics existing for the PSR attributes.

• Section 5.2 discusses the tools existing for the PSR attributes.

• Section 5.3 discusses the challenges related to the PSR attributes.

• Section 5.4 discusses the most important attribute among the PSR attributes.

• Section 5.5 discusses the implications.

5.1 Metrics for testing PSR attributes of web applications

The metrics used for testing the PSR attributes of web applications are collected from the systematic mapping study, interviews and documents. The existing metrics are provided in the results section 4.1.

From the available list of metrics, we have observed that response time and throughput are the most commonly mentioned metrics from all data sources, such as interviews, documents and the systematic mapping study. The overlap and differences in metrics from the three data sources are provided in figure 5.1.

Figure 5.1 provides information about the overlapping metrics among all the data sources and also shows the remaining metrics that were identified in each data source. The ellipses in the figure represent the data sources, the rectangles represent the remaining metrics identified from each data source, the rounded rectangle represents the metrics that overlap among all the data sources, and the connections between them are represented by arrows. For example, the metrics collected from the SMS are obtained by combining the overlapping metrics (i.e. the metrics in the rounded rectangle) and the metrics identified only from the SMS (i.e. the metrics in the rectangle).


Figure 5.1: Overlap and differences in metrics among all data sources

In our systematic mapping study, we identified response time and throughput as the most commonly mentioned metrics in every article that focused on metrics. Almost all the interviewees also addressed these two metrics during the interview process, and most of the documents related to performance and scalability focused on these two metrics as well. Observing the data extracted from all the different data sources, we noticed that response time and throughput are commonly used metrics. According to the authors of [7, 61, 65, 33], the response time and throughput metrics are commonly tested in both performance and scalability testing, which supports our results.

Through the interviews, we came to learn about some metrics which are not particularly mentioned in the SMS, such as ramp-up time, ramp-down time and rendezvous point; the descriptions of these metrics are provided in Appendix C. The ramp-up and ramp-down time metrics are related to manual configuration settings and are only mentioned in one article [7].

From the interviews we observed that MTBF and number of failures are the most commonly mentioned metrics for reliability, whereas in the systematic mapping study we observed 13 metrics related to reliability, of which MTBF and number of errors are the most commonly mentioned. According to Subraya and Subrahmanya [4], fault tolerance and recoverability are calculated by using


MTBF, MTTR and number of failures. From both the interviews and the SMS, we noticed that MTBF is a commonly used metric for reliability.

We observed that not all the metrics available in the literature are considered for testing a web application. We identified from the interviews that the selection of metrics for testing an application also depends on other metrics. One of the interviewees stated that “the selection of performance metrics is interlinked with other performance metrics. Hence, all the interlinked metrics must be considered while testing the web application”. In general, the selection of metrics for testing the PSR attributes is based on criteria such as customer requirements, market requirements, metric dependencies and application type. This observation is further supported by the literature. Jiang et al. [60] stated that the selection of metrics is based on the type of application to be tested, and they also specified some metrics which are common to any type of application. According to Xia et al. [36], all the performance metrics are interlinked with other performance metrics.

Finally, we observed that the number of articles addressing performance metrics is larger in the SMS. In the case of the interviews, we observed that all interviewees provided the required information about the performance attribute more easily than for the other attributes, scalability and reliability. Not only the interviews but also the collected documents focused mainly on performance. From this, we noticed that far more research has been carried out on the performance attribute than on the other two attributes.

5.2 Tools for testing PSR attributes of web applications

The tools that exist for testing the PSR attributes of a web application are retrieved by conducting the systematic mapping study, interviews, and documents. The list of identified tools from all the data sources is presented in the result section 4.2.

From the interviewees, we observed that Apache JMeter and LoadRunner are the most commonly used tools for testing web applications in practice. In addition, from the systematic mapping study we found that most of the literature mentions LoadRunner and Apache JMeter as tools available for testing web applications. Also, from the documents we can see that JMeter is the most commonly mentioned tool. From all these data sources, we observed JMeter as the most commonly used tool for testing web applications. The overlap and differences in tools from all the data sources are provided in figure 5.2. The findings in this study regarding tools are also supported by the literature. Xia et al. [36] stated that Apache JMeter is preferred by software testers as it provides a lot of functionality even though it


is open source.

Figure 5.2: Overlap and differences in tools among all data sources

Figure 5.2 provides information about the overlapping tools among all the data sources and also shows the remaining tools that were identified in each data source. The ellipses in the figure represent the data sources, the rectangles represent the remaining tools identified from each data source, the rounded rectangle represents the tools that overlap among all the data sources, and the connections between them are represented by arrows. For example, the tools collected from the SMS are obtained by combining the overlapping tools (i.e. the tools in the rounded rectangle) and the tools identified only from the SMS (i.e. the tools in the rectangle).

We observed that, after JMeter, LoadRunner is the most commonly addressed tool in both the SMS and the interviews. One of the interviewees stated that “LoadRunner is the perfect tool for performance testing of web applications”. As the LoadRunner tool is a licensed product, the case company prefers not to use it. One of the interviewees gave the reason for excluding the LoadRunner tool as “Ericsson prefers to use open source tools or internal tools for testing web applications” - Test specialist.

From the above findings we have identified JMeter and LoadRunner as the


common tools from the interviews and the SMS. As the interviewees are familiar with these two tools, they addressed some of the drawbacks related to them. For JMeter, drawbacks related to the number of virtual users and parameter scripting were mentioned in the interviews. According to Krizanic [12], JMeter has some disadvantages related to the setup of virtual users, test case recording and improper terminology, which supports the findings from the interviews.

From the SMS and the interviews, we identified that the majority of the tools are related to performance testing. Only a few tools were identified for the scalability and reliability attributes. The identified tools for testing the PSR attributes of web applications are provided in Appendix D. We faced a problem regarding the scalability and reliability attributes, as we did not find many tools related to these two attributes in the SMS, nor in the interviews and documents. Lack of research and lack of knowledge on these two attributes may be the reason why the interviewees faced difficulty while answering the SR-related questions.

Some of the performance tools are also used for testing the scalability of web applications, as scalability mainly focuses on limiting performance failures. In the case of reliability, Markov chain models were identified in the SMS. A Markov model contains all the information regarding the possible states and the transition paths between the states that exist in the system. In reliability analysis, the Markov model holds the information about failures and repairs in the transitions. By using this model, common assumptions about failure rate distributions are avoided, and it is also suitable for an appropriate reliability analysis of web applications. From the interviews we observed that the case company uses an internal tool, which supports the performance and reliability attributes.
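As an illustration of such a model, the sketch below uses the simplest possible case: a two-state (up/down) continuous-time Markov model in which the transitions carry the failure rate and the repair rate. The rates are invented for the example; the closed-form results (MTBF = 1/λ, MTTR = 1/μ, steady-state availability = μ/(λ+μ)) are standard for this model and are not specific to the case company's tool.

```python
# Two-state Markov reliability model: "up" --lambda--> "down" --mu--> "up".
failure_rate = 0.002   # lambda: failures per hour (hypothetical)
repair_rate = 0.5      # mu: repairs per hour (hypothetical)

mtbf = 1.0 / failure_rate                                   # mean time between failures
mttr = 1.0 / repair_rate                                    # mean time to repair
availability = repair_rate / (failure_rate + repair_rate)   # steady-state availability

print(f"MTBF: {mtbf:.0f} h, MTTR: {mttr:.0f} h, availability: {availability:.4f}")
# MTBF: 500 h, MTTR: 2 h, availability: 0.9960
```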

5.3 Challenges in PSR testing of web applications

The challenges in PSR testing of web applications are retrieved from the systematic mapping study, interviews and documents. The challenges identified from the data sources are presented in section 4.3.

The most frequent challenges identified from the SMS and the interviews are mainly related to the tools. Challenges and issues related to the JMeter tool are encountered both in practice and in the SMS. According to Krizanic [12], Apache JMeter has some issues related to the number of virtual users, test case recording, and improper terminology. We also observed that, to overcome the challenges and drawbacks identified in the JMeter tool, Ericsson has started developing its own tool for non-functional testing.

Challenges related to the development of web applications are also identified from both the SMS and the interviews. The identified development challenges concern faults in code, script-related issues, environment-related issues, and the programming skills of the testers. Articles [56, 57] also support the mentioned


challenges, noting that issues related to scripts and the environment are observed frequently in web applications.

Challenges related to time were not identified in the literature, whereas five interviewees mentioned challenges related to time. One of the major time-related challenges faced by the software testers is that the time available for testing is not sufficient to carry out the testing process.

Development methodologies such as agile and kanban are time efficient for projects. One of the interviewees stated that "Many organizations are failing to complete the project on time even though they are using time-efficient methodologies in the project”. So the main reason for the time-related challenge, as obtained from the interviews, is the delay caused in the development phase. The challenge related to time is further supported by authors in the literature. According to Subraya [2], pressure on delivery leads to less time for the testing phase, which results in an improperly tested product.

Another challenge which we noticed from both the SMS and the interviews is related to the metrics. The number of metric-related challenges identified from the SMS is larger when compared to the interviews. The common challenge identified from both the interviews and the SMS is related to the selection of parameters, i.e. metrics. According to Xia et al. [36], all the metrics are interlinked, so the metrics should be selected carefully by considering the type of web application. We also identified some of the challenges related to metrics faced by the case company, as mentioned in section 4.3.2.2. The challenges identified from the documents belong to the metrics category. The challenges that overlap across all the data sources are related to metrics, as shown in figure 5.4. Finally, from all the data sources we observed that more challenges are related to the performance attribute than to the other two attributes.

Figure 5.3 provides information about the overlapping challenges among all the data sources and also shows the remaining challenges that were identified in each data source. The ellipses in the figure represent the data sources, the rectangles represent the remaining challenges identified from each data source, the rounded rectangle represents the challenges that overlap among all the data sources, and the connections between them are represented by arrows. Across all the data sources we did not find any single common challenge.

Figure 5.4 presents the challenge areas that overlap among all the data sources as well as the remaining challenge areas identified in each individual data source. In the figure, ellipses represent the data sources, rectangles represent the remaining challenge areas identified from each data source, the rounded rectangle represents the challenge areas that overlap among all the data sources, and the connections between them are represented by arrows. Across all the data sources, we find that challenges related to metrics are the most common.


Figure 5.3: Overlap and differences in challenges among all data sources

Figure 5.4: Overlap and differences in challenge areas among all data sources


5.4 Most important attribute among PSR

The most important attribute among the PSR attributes is identified by conducting interviews. The data obtained from the interviews about the most important attribute are classified into three schemes, as mentioned in section 4.4.

The responses collected from the interviews vary. Each response generally falls into one of three categories: all are important, application based, or order based. We observed that three interviewees stated that all three attributes are important for testing web applications, six interviewees stated that the selection of attributes depends upon the type of application, and three interviewees prioritized the PSR attributes.

Beyond the interviews, we also found that the type of web application plays a crucial role in the selection of attributes: the attribute may be selected depending on the type of web application. This is supported by several articles; the authors of [4, 11, 60] state that the attribute to consider depends on the type of web application to be tested. The literature therefore supports the responses provided by the interviewees.

The main reason interviewees chose the category all are important is the importance they attach to quality. The order-based category focuses on prioritizing the PSR attributes: based on their experience, these interviewees provided a ranking order among the PSR attributes. This ranking varies from one interviewee to another. The variation in responses may be due to differences in experience, a lack of knowledge, or differences in how they perceive quality.

We also find that, among the three interviewees who gave a priority order, two highlighted reliability as the most important attribute and one pointed to performance. From our observation, performance is the most mentioned attribute in the literature, whereas from the interviews reliability emerged as the most important attribute within the priority-order theme. Hence, the results obtained from the two sources differ. As practitioners feel that reliability is an important attribute, there is more scope for research related to reliability.

5.5 Implications

The overlap and differences identified for metrics from the state of the art and the state of practice are provided in figure 5.5. The number of metrics is higher in the literature than in practice. The reason for this observation is that in practice not all metrics are considered while testing a web application; the metrics are selected based on the type of application and other criteria such as customer and market requirements and metric dependency. The pattern observed in figure 5.5 is that almost all the metrics identified from the state of practice overlap with the state of the art. Along with this, a few metrics are identified from the state of practice that are not available in the state of the art. This difference is due to the in-depth knowledge of the software testers in testing the PSR attributes, and because some metrics are specific to the company's applications.

Figure 5.5: Overlap and differences in metrics between state of the art and state of practice

The overlap and differences identified for tools from the state of the art and the state of practice are provided in figure 5.6. The number of tools is higher in the literature than in practice. The reason for this observation is that in practice not all tools are considered while testing a web application; the tools are selected based on the type of application and the guidelines provided by the company. The pattern observed in figure 5.6 shows a larger difference between the state of the art and practice, as the tools identified from practice are newly available tools on the market, company-specific tools, and tools known to practitioners through experience. The literature does not contain current information regarding tools, so new research is needed to provide data on recent tools. The overlapping tools are the most commonly used tools for testing web applications, as they were mentioned by the largest number of articles and interviewees.


Figure 5.6: Overlap and differences in tools between state of the art and state of practice

The overlap and differences identified for challenges from the state of the art and the state of practice are provided in figure 5.7. The challenges from practice differ from the state of the art because software testers face different challenges while testing, and new challenges may arise depending on the situation and environment that are not commonly reported in the literature. The overlapping challenges arise because the studied case company uses the JMeter tool, and most of the challenges identified from practice are related to this tool. In addition, network issues are a common challenge identified in practice; there are no proper mitigations for this challenge because it mainly depends on the environment and the network strength.

Figure 5.7: Overlap and differences in challenges between state of the art and state of practice

Figures 5.5, 5.6, and 5.7 provide information about the overlapping metrics, tools, and challenges between the state of the art and the state of practice, and also show the remaining metrics, tools, and challenges identified in each. In these figures, ellipses represent the state of the art and the state of practice, rectangles represent the remaining metrics, tools, and challenges identified in each of them, the rounded rectangle represents the overlapping metrics, tools, and challenges between the state of the art and practice, and the connections between them are represented by arrows. For example, the full set of metrics from the state of the art is obtained by combining the overlapping metrics (i.e. the metrics in the rounded rectangle) and the metrics identified only from the SMS (i.e. the metrics in the rectangle).

The information collected in this research will help practitioners to gain knowledge that was previously not known to them. It also acts as a reference for new practitioners in the future.

It also helps researchers to understand the current status of research on the PSR attributes and acts as a reference for carrying out further research in this area.


Chapter 6
Conclusions and Future Work

This chapter mainly focuses on answering the research questions. Along with that, it also presents the conclusions and future work.

6.1 Research questions and answers

This section consists of answers to the research questions mentioned in section 3.2. It is divided into four subsections, each of which answers one of the research questions.

In order to validate the results, we conducted eight interviews at the case company and four interviews at other organizations. Senior members with extensive experience in testing were selected from these organizations. Along with the interviews, documents were also used to collect information that was missing or not captured during the interviews; they were also used for data triangulation.

6.1.1 RQ 1: Metrics used for testing the PSR attributes

The answers to the following sub-questions together answer research question 1.

6.1.1.1 RQ1.1 What metrics are suggested in the literature for testing PSR attributes?

In order to identify the metrics used for testing the PSR attributes of web applications, a systematic mapping study is conducted. The data required for this study are collected from five different databases, and a total of 97 articles are selected. As this question deals mainly with metrics, the articles addressing metrics are used to answer it: a total of 80 articles are selected and analyzed. After the analysis, we identified a total of 69 metrics related to PSR, of which 39 relate to performance, 17 to scalability, and the remaining 13 to reliability. The identified metrics are specified in section 4.1.1. Among these metrics, we identified response time and throughput as the most important and most commonly used metrics for both performance and scalability.
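
As a brief illustration of how these two metrics are typically derived from raw test output (the sample values below are hypothetical and only show the arithmetic):

import statistics

# (request start time in seconds, response time in seconds); hypothetical values
samples = [(0.0, 0.21), (0.4, 0.35), (1.1, 0.18), (1.5, 0.42), (2.2, 0.27)]

durations = sorted(duration for _, duration in samples)
average_response_time = statistics.mean(durations)
p95_index = max(0, int(round(0.95 * len(durations))) - 1)
p95_response_time = durations[p95_index]  # simple 95th-percentile approximation

# Throughput: completed requests divided by the elapsed wall-clock time of the run.
elapsed = max(start + duration for start, duration in samples) - min(start for start, _ in samples)
throughput = len(samples) / elapsed  # requests per second

print(f"average response time: {average_response_time:.3f} s")
print(f"95th percentile:       {p95_response_time:.3f} s")
print(f"throughput:            {throughput:.2f} requests/s")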


6.1.1.2 RQ1.2 What metrics are used by software testers in practice for testing PSR attributes?

In order to identify the metrics used in practice for testing the PSR attributes of web applications, interviews are conducted and documents, i.e. test reports from previous projects, are also collected. A total of 12 interviews and 18 documents are used to gather the data required for answering this question. All 12 interviewees addressed the metrics used in their company for testing the PSR attributes of web applications. After analyzing the data collected from the interviews, a total of 30 metrics are identified, of which 20 relate to performance, eight to scalability, and the remaining two to reliability. The identified metrics are specified in section 4.1.2. From the 18 collected documents, a total of 16 metrics are identified, of which nine relate to performance and seven to scalability; we did not observe any metrics related to reliability. The metrics collected from the documents are listed in section 4.1.2.5. We also found that the metrics collected from the documents overlap with the metrics collected from the interviews. The results from the documents and the interviews together answer this question.

6.1.1.3 RQ1.3 Why are particular metrics used or not used by software testers?

We selected interviews as the medium for answering this question. A total of 12 interviews are conducted, and all interviewees specified the same reason. After analyzing the data collected from the interviews, we found that the selection of metrics is not fixed; it varies depending on the type of web application, customer requirements, market requirements, and metric dependency.

6.1.2 RQ 2: Tools used for testing the PSR attributes

The answer to this research question is obtained by answering the sub-questions below. This research question mainly focuses on identifying the tools used for testing the PSR attributes of web applications. All the tools identified from the systematic mapping study, the interviews, and the documents are combined and presented in Appendix D, along with parameters such as developer, platform support, availability, testing attribute, resource URL, tool type, source, and programming language for each tool.


6.1.2.1 RQ2.1 What tools are suggested in the literature for testing PSR attributes?

A systematic mapping study is used to answer this question. Five different databases are searched to gather the required data. A total of 97 articles are selected, of which 76 address the tools used for testing the PSR attributes. After analyzing these 76 articles, a total of 54 tools are obtained; of these, 46 relate to performance, 23 to scalability, and five to reliability (a tool may address more than one attribute). We observed that JMeter and LoadRunner are commonly used tools for performance testing. All the identified tools are listed in table 4.5 in section 4.2.1.
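
For context, JMeter test plans are usually executed in non-GUI mode for larger load tests. The sketch below simply wraps that standard command line from Python; it assumes a JMeter installation available on the PATH, and the test plan and result file names are hypothetical.

import subprocess

# Standard non-GUI invocation: -n (non-GUI mode), -t (test plan), -l (result log).
command = [
    "jmeter",
    "-n",
    "-t", "plan.jmx",      # test plan to execute (hypothetical file name)
    "-l", "results.jtl",   # file that receives the sample results (hypothetical file name)
]

completed = subprocess.run(command, check=False)
print("JMeter exited with code", completed.returncode)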

6.1.2.2 RQ2.2 What tools are used by the software testers in practice for testing PSR attributes?

This research question identifies the tools currently used by software testers in practice for testing the PSR attributes of web applications. To answer it, we used 12 interviews and 18 documents. The data collected from the 12 interviews contain a total of 18 tools, which are specified in section 4.2.2. From these interviews we noticed that JMeter and LoadRunner are mentioned by most of the interviewees. We also identified the use of internal tools for testing web applications in the case company, as well as some new tools that were not observed in the mapping study. From the 18 documents, a total of four tools are identified; we did not come across any tools used for scalability or reliability in these documents. The identified tools are addressed in section 4.2.2.9.

6.1.2.3 RQ2.3 What are the drawbacks of the tools used by software testers in practice and improvements suggested by them?

It is difficult to identify the current drawbacks of existing tools from the literature alone, since the drawbacks mentioned there might already have been solved, and we cannot know whether they have been. Therefore, to find the drawbacks that currently exist in the tools, it is more appropriate to gather information from the software testers who use the tools for testing the PSR attributes. For this purpose, we used 12 interviews and 18 documents. Of the 12 interviewees, seven specified drawbacks of the tools. The drawbacks specified by the interviewees mainly concern LoadRunner and JMeter, which, according to the data obtained for RQ 2.2, are the most commonly used tools for performance testing. The drawbacks addressed by the interviewees are provided in section 4.2.3.


In order to overcome the drawbacks, the case company developed an internal tool that provides the functionality it needs from JMeter, along with some additional functionality. We also noted that the specified drawbacks could be addressed by removing the limitations that exist in the tools.

From the 18 documents, we did not observe any drawbacks related to the tools.

6.1.3 RQ 3: Challenges identified while testing the PSR attributes

This research question concentrates on identifying the challenges faced by software testers while testing the PSR attributes of web applications. The answer is obtained by answering the sub-questions below, whose answers come from different sources: one from the literature and the other from the interviews and documents.

6.1.3.1 RQ3.1 What are the challenges faced by software testers and what are the mitigation strategies available in literature for testing PSR attributes?

The answer to this question is obtained from the systematic mapping study. A total of 97 articles are selected from five different databases to gather data regarding the challenges faced while testing the PSR attributes. We identified 18 challenges in the 33 articles that address challenges and assigned them to four categories: the user category contains three challenges, the metric category six, the tool category five, and the development category four. All the identified challenges are mentioned in section 4.3.1. The main challenges identified are simulating real user behavior, tools being unable to support the required number of virtual users, network delay, identifying suitable metrics for testing, and handling a large number of scenarios.
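
To make the first of these challenges more concrete, simulating real user behavior usually means inserting randomized "think time" between requests instead of issuing them back to back. The sketch below only illustrates the idea; the URLs and think-time range are hypothetical.

import random
import time
import urllib.request

# A hypothetical user session: the URLs and think-time range only illustrate the idea.
SESSION_STEPS = [
    "http://localhost:8080/",
    "http://localhost:8080/search",
    "http://localhost:8080/item/42",
]
THINK_TIME_RANGE = (1.0, 5.0)  # seconds a user pauses between actions

def run_user_session():
    """Visit each step in order, pausing a random think time between requests."""
    for url in SESSION_STEPS:
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                response.read()
        except OSError as exc:
            print(f"request to {url} failed: {exc}")
        time.sleep(random.uniform(*THINK_TIME_RANGE))

if __name__ == "__main__":
    run_user_session()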

6.1.3.2 RQ3.2: What are the challenges faced by software testers in practice while testing PSR attributes?

As part of answering this question, we conducted 12 interviews. All 12 interviewees mentioned the challenges they face while testing the PSR attributes of web applications, based on their experience from previous and current projects. After analyzing the interview results, a total of 13 challenges are identified; they are listed in section 4.3.2. The main challenges relate to the scripts used for testing, limitations in the tools, and issues related to development, as these are specified in the majority of the interviews. In addition, a total of three challenges are identified from the documents; they are presented in section 4.3.2.6.

6.1.3.3 RQ3.3: Can the existing measures from the literature solve the challenges faced by software testers in practice?

After analyzing all the data collected from the literature and the interviews, we noticed that both sources point to some similar challenges. The mitigations identified from the literature are provided in section 4.3.1. However, we found no proper mitigation strategies in the literature for the challenges identified in practice. Although some authors propose strategies and models, these are not validated empirically and do not constitute proper measures for these challenges. The challenges identified from the interviews therefore need to be addressed by further research.

6.1.4 RQ 4: Important attribute among PSR

Quality attributes are very important for staying competitive in the market. However, because of factors such as early delivery and schedule pressure, web applications are deployed to the market without proper testing. With this question we aimed to find the most important attribute to consider for testing, so that under time pressure a software tester can test the most important attribute first and exclude the remaining ones. To answer it, we conducted a total of 12 interviews, and all 12 interviewees responded. After analyzing the answers, we categorized them into three themes: all are important, application based, and priority order. From these categories we noticed that the priority of the attributes varies and is not fixed. From our observations through the SMS and the interviews, we conclude that the importance of an attribute depends on the type of web application.

6.2 Conclusion

The study was conducted to identify the metrics, tools, and challenges that exist while testing the PSR attributes of web applications. By conducting a systematic mapping study and a case study at Ericsson, we were able to accomplish the objectives of the research. To obtain the required information, we used three different data sources: interviews, the SMS, and documents. The documents and interviews were obtained from the case company; the available documents, i.e. test reports of previous projects, were collected from the case company for data triangulation. The data collected from all the sources were analyzed using thematic analysis with the help of a tool named NVivo, which was used for coding the data.

Based on the obtained results, the existing metrics were identified from all the data sources. We found that most of the literature focuses on the performance attribute rather than the other two attributes, i.e. scalability and reliability. Response time and throughput are the most commonly used and mentioned metrics for both performance and scalability across all the sources. In the case of reliability, we observed that MTBF (mean time between failures) is the most commonly mentioned metric in the interviews and the SMS.
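
For reference, MTBF is commonly computed as the total operating time divided by the number of observed failures; for example, 600 hours of operation with 3 failures gives an MTBF of 200 hours. A minimal sketch of the calculation (the figures are hypothetical):

# MTBF = total operating time / number of observed failures (hypothetical figures).
total_operating_hours = 600.0
number_of_failures = 3

mtbf_hours = total_operating_hours / number_of_failures
print(f"MTBF: {mtbf_hours:.1f} hours")  # prints: MTBF: 200.0 hours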

The tools available for testing the PSR attributes were collected from all the data sources. An interesting finding is that, of all the identified tools, JMeter is the most important and most commonly mentioned open source tool for testing the performance of web applications. From the interviews and the SMS we also identified LoadRunner, a commercial tool, as the next most mentioned after JMeter. These two tools are commonly preferred for performance testing of web applications, whereas for scalability and reliability very few tools are available from either the SMS or the interviews.

The challenges encountered while testing the PSR attributes were also identified from the literature, the interviews, and the documents. The majority of the challenges identified in this study are related to development, metrics, and tools. The development-related challenges faced by the software testers concern scripting issues, the metric-related issues concern dependencies between metrics, and the tool-related issues are mainly due to test case scenarios and the lack of proper input to the tool. To overcome the challenges posed by the JMeter tool, the case company has developed its own tool.

The most important attribute among the PSR attributes was identified from the interviews. We found that the selection of the attribute mainly depends upon the type of application being tested and on customer and market requirements.

Through this study, all the analyzed data regarding the tools, metrics, and challenges were used to generate lists. These lists will help software testers to gain knowledge in this area. We conclude that the PSR attributes are essential quality factors that play a major role in testing web applications and in deciding their quality.

6.3 Research contribution

The study deals with the metrics and tools used by software testers for testing the PSR attributes of web applications, and it also concentrates on identifying the challenges faced by software testers while testing these attributes. As there is a lack of research related to the PSR attributes of web applications, we chose the present study to contribute knowledge to the existing body of knowledge in software engineering.

The study adds to the existing knowledge regarding the metrics and tools available for testing the PSR attributes of web applications, whereas previous studies did not focus on providing information about the tools and metrics already available for this purpose. The contribution of this study will therefore help practitioners and researchers to gain information about the tools and metrics available for testing the PSR attributes of web applications. The identified metrics and tools are provided in sections 4.1 and 4.2, and also as lists in Appendices C and D.

The study also identified the challenges faced by software testers while testing the PSR attributes of web applications. The challenges identified from all the data sources are provided in section 4.3 and as a list in Appendix E. To the best of our knowledge, there are no previous studies on the challenges faced by software testers while testing the PSR attributes of web applications; the PSR attributes of web applications have received little attention in prior studies.

6.4 Future work

The study provides knowledge regarding the metrics, tools, and challenges related to PSR testing of web applications. From the obtained results, it is clear that much more research has been carried out on the performance attribute of web applications, whereas research on the scalability and reliability attributes is scarce. Future research can be carried out in four different directions: studies on scalability and reliability testing of web applications; a systematic literature review to consolidate the information related to performance testing of web applications; research involving a larger number of companies to identify the metrics and tools available for all non-functional attributes; and research investigating the level of difficulty in testing the PSR attributes.


Bibliography

[1] V. Varadharajan. Evaluating the Performance and Scalability of Web Application Systems. Third International Conference on Information Technology and Applications (ICITA'05), 1:111–114, 2005.

[2] B. M. Subraya, S. V. Subrahmanya, J. K. Suresh, and C. Ravi. Pepper: a new model to bridge the gap between user and designer perceptions. In Computer Software and Applications Conference, 2001. COMPSAC 2001. 25th Annual International, pages 483–488, 2001.

[3] Md Safaet Hossain. Performance evaluation web testing for ecommerce websites. In Informatics, Electronics & Vision (ICIEV), 2012 International Conference on, pages 842–846. IEEE, 2012.

[4] B. M. Subraya and S. V. Subrahmanya. Object driven performance testing of web applications. In Quality Software, 2000. Proceedings. First Asia-Pacific Conference on, pages 17–26, 2000.

[5] Amira Ali and Nagwa Badr. Performance testing as a service for web applications. In 2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS), pages 356–361. IEEE, 2015.

[6] Hossein Nikfard, Ibrahim. A Comparative Evaluation of approaches for Web Application Testing. International Journal of Soft Computing and Software Engineering [JSCSE], 3(3):333–341, 2013.

[7] Elder Rodrigues, Maicon Bernardino, Leandro Costa, Avelino Zorzo, and Flavio Oliveira. PLeTsPerf - A Model-Based Performance Testing Tool. 2015 IEEE 8th International Conference on Software Testing, Verification and Validation (ICST), pages 1–8, 2015.

[8] M Pinzger and G Kotsis. AWPS - Simulation based automated web performance analysis and prediction. Proceedings - 7th International Conference on the Quantitative Evaluation of Systems, QEST 2010, (c):191–192, 2010.

[9] Jeff Tian and Li Ma. Web testing for reliability improvement. Advances in Computers, 67:177–224, 2006.

[10] Thanh Nguyen. Using control charts for detecting and understanding performance regressions in large software. Proceedings - IEEE 5th International Conference on Software Testing, Verification and Validation, ICST 2012, pages 491–494, 2012.

[11] Ping Li, Dong Shi, and Jianping Li. Performance test and bottle analysis based on scientific research management platform. 2013 10th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pages 218–221, 2013.

[12] J. Križanić, A. Grgurić, M. Mošmondor, and P. Lazarevski. Load testing and performance monitoring tools in use with ajax based web applications. In MIPRO, 2010 Proceedings of the 33rd International Convention, pages 428–434, May 2010.

[13] Lakshmi S. Iyer, Babita Gupta, and Nakul Johri. Performance, scalability and reliability issues in web applications. Industrial Management & Data Systems, 105(5):561–576, June 2005.

[14] Giuseppe A. Di Lucca and Anna Rita Fasolino. Testing Web-based applications: The state of the art and future trends. Information and Software Technology, 48(12):1172–1186, 2006.

[15] Anna Rita Fasolino, Domenico Amalfitano, and Porfirio Tramontana. Web application testing in fifteen years of WSE. Proceedings of IEEE International Symposium on Web Systems Evolution, WSE, pages 35–38, 2013.

[16] Rizal Suffian, Dhiauddin. Performance testing: Analyzing differences of response time between performance testing tools. In Computer & Information Science (ICCIS), 2012 International Conference on, volume 2, pages 919–923. IEEE, 2012.

[17] Xingen Wang, Bo Zhou, and Wei Li. Model-based load testing of web applications. Journal of the Chinese Institute of Engineers, 36(1):74–86, 2013.

[18] H.M. Aguiar, J.C. Seco, and L. Ferrão. Profiling of real-world web applications. In PADTAD 2010 - International Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging, pages 59–66, 2010.

[19] Amal Ibrahim. Quality Testing. Signal Processing, pages 1071–1076, 2007.

[20] Arora A and Sinha M. Web Application Testing: A Review on Techniques, Tools and State of Art. International Journal of Scientific & Engineering Research, 3(2):1–6, 2012.

[21] Jordi Guitart, Vicenç Beltran, David Carrera, Jordi Torres, and Eduard Ayguadé. Characterizing secure dynamic web applications scalability. Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS, 2005.

[22] Chia Hung Kao, Chun Cheng Lin, and Juei-Nan Chen. Performance Testing Framework for REST-Based Web Applications. 2013 13th International Conference on Quality Software, pages 349–354, 2013.

[23] Hamed O. and Kafri N. Performance testing for web based application architectures (.NET vs. Java EE). 2009 1st International Conference on Networked Digital Technologies, NDT 2009, pages 218–224, 2009.

[24] M. Kalita, S. Khanikar, and T. Bezboruah. Investigation on performance testing and evaluation of PReWebN: a Java technique for implementing web application. IET Software, 5(5):434, 2011.

[25] Kunhua Zhu, Junhui Fu, and Yancui Li. Research the performance testing and performance improvement strategy in web application. In 2010 2nd International Conference on Education Technology and Computer, volume 2, pages 328–332, June 2010.

[26] Richard Berntsson Svensson, Tony Gorschek, Björn Regnell, Richard Torkar, Ali Shahrokni, and Robert Feldt. Quality Requirements in Industrial Practice - An Extended Interview Study at Eleven Companies. IEEE Transactions on Software Engineering, 38(4):923–935, 2012.

[27] Tasha Hollingsed and David G. Novick. Usability inspection methods after 15 years of research and practice. In Proceedings of the 25th Annual ACM International Conference on Design of Communication, SIGDOC '07, pages 249–255. ACM, 2007.

[28] Muhammad Junaid Aamir and Awais Mansoor. Testing web application from usability perspective. In Computer, Control & Communication (IC4), 2013 3rd International Conference on, pages 1–7. IEEE, 2013.

[29] Connie U Smith and Lloyd G Williams. Building responsive and scalable web applications. In Int. CMG Conference, pages 127–138, 2000.

[30] Giovanni Denaro, Andrea Polini, and Wolfgang Emmerich. Early performance testing of distributed software applications. In ACM SIGSOFT Software Engineering Notes, volume 29, pages 94–103. ACM, 2004.

[31] Sangeeta Phogat and Kapil Sharma. A statistical view of software reliability and modeling. In Computing for Sustainable Global Development (INDIACom), 2015 2nd International Conference on, pages 1726–1730. IEEE, 2015.

[32] Tanjila Kanij, Robert Merkel, and John Grundy. Performance assessment metrics for software testers. In 2012 5th International Workshop on Cooperative and Human Aspects of Software Engineering, CHASE 2012 - Proceedings, pages 63–65, 2012.

[33] Niclas Snellman, Adnan Ashraf, and Ivan Porres. Towards Automatic Performance and Scalability Testing of Rich Internet Applications in the Cloud. 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications, pages 161–169, 2011.

[34] Akshay and Nikhil. Thesis project plan. pages 1–15, 2016.

[35] Serdar Doğan, Aysu Betin-Can, and Vahid Garousi. Web application testing: A systematic literature review. Journal of Systems and Software, 91:174–201, 2014.

[36] Xiaokai Xia, Qiuhong Pei, Yongpo Liu, Ji Wu, and Chao Liu. Multi-level logs based web performance evaluation and analysis. ICCASM 2010 - 2010 International Conference on Computer Application and System Modeling, Proceedings, 4(Iccasm):37–41, 2010.

[37] Deepak Dagar and Amit Gupta. Performance testing and evaluation of web applications using wapt pro. International Journal of Innovative Research in Computer and Communication Engineering, 3(7):6965–6975, 2015.

[38] Tyagi Rina. A Comparative Study of Performance Testing Tools. International Journal of Advanced Research in Computer Science and Software Engineering, 3(5):1300–1307, 2013.

[39] R Manjula and Eswar Anand Sriram. Reliability evaluation of web applications from click-stream data. International Journal of Computer Applications, 9(5):23–29, 2010.

[40] Fei Wang and Wencai Du. A test automation framework based on WEB. Proceedings - 2012 IEEE/ACIS 11th International Conference on Computer and Information Science, ICIS 2012, pages 683–687, 2012.

[41] Isha Arora. A Brief Survey on Web Application Performance Testing Tools Literature Review. International Journal of Latest Trends in Engineering and Technology, 5(3):367–375, 2015.

[42] Vahid Garousi, Ali Mesbah, Aysu Betin-Can, and Shabnam Mirshokraie. A systematic mapping study of web application testing. Information and Software Technology, 55(8):1374–1396, 2013.

[43] Junzan Zhou, Shanping Li, Zhen Zhang, and Zhen Ye. Position paper. Proceedings of the 2013 international workshop on Hot topics in cloud services - HotTopiCS '13, (April):55, 2013.

[44] C. Kallepalli and J. Tian. Usage measurement for statistical Web testing and reliability analysis. Proceedings Seventh International Software Metrics Symposium, pages 148–158, 2001.

[45] Per Runeson and Martin Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering, 14(2):131–164, 2009.

[46] Kai Petersen, Robert Feldt, Shahid Mujtaba, and Michael Mattsson. Systematic mapping studies in software engineering. In 12th international conference on evaluation and assessment in software engineering, volume 17, pages 1–10. sn, 2008.

[47] Martin N Marshall. Sampling for qualitative research. Family practice, 13(6):522–526, 1996.

[48] Daniela S. Cruzes and Tore Dyba. Recommended Steps for Thematic Synthesis in Software Engineering. 2011 International Symposium on Empirical Software Engineering and Measurement, (7491):275–284, 2011.

[49] V. Braun and V. Clarke. Using thematic analysis in psychology. Qualitative Research in Psychology, 3(May 2015):77–101, 2006.

[50] Emilia Mendes, Nile Mosley, and Steve Counsell. Web metrics - estimating design and authoring effort. IEEE Multimedia, 8(1):50–57, 2001.

[51] Fredrik Abbors, Tanwir Ahmad, Dragos Truscan, and Ivan Porres. MBPeT: A Model-Based Performance Testing Tool. Fourth International Conference on Advances in System Testing and Validation Lifecycle, (c):1–8, 2012.

[52] A Arkles and D Makaroff. MT-WAVE: Profiling multi-tier web applications. In ICPE'11 - Proceedings of the 2nd Joint WOSP/SIPEW International Conference on Performance Engineering, pages 247–258, 2011.

[53] Wu Gongxin Jinlong Gao, Tiantian. A Reactivity-based Framework of Automated Performance Testing for Web Applications. In 2010 Ninth International Symposium on Distributed Computing and Applications to Business, Engineering and Science, pages 593–597. IEEE, 2010.

[54] L. Xu, W. Zhang, and L. Chen. Modeling users' visiting behaviors for web load testing by continuous time markov chain. In Web Information Systems and Applications Conference (WISA), 2010 7th, pages 59–64, Aug 2010.

[55] Aida Shojaee, Nafiseh Agheli, and Bahareh Hosseini. Cloud-based load testing method for web services with vms management. In 2015 2nd International Conference on Knowledge-Based Engineering and Innovation (KBEI), pages 170–176. IEEE, 2015.

[56] Muhammad Arslan, Usman Qamar, Shoaib Hassan, and Sara Ayub. Automatic performance analysis of cloud based load testing of web-application & its comparison with traditional load testing. In Software Engineering and Service Science (ICSESS), 2015 6th IEEE International Conference on, pages 140–144. IEEE, 2015.

[57] S. Kiran, A. Mohapatra, and R. Swamy. Experiences in performance testing of web applications with unified authentication platform using jmeter. In Technology Management and Emerging Technologies (ISTMET), 2015 International Symposium on, pages 74–78, Aug 2015.

[58] Xiuxia Quan and Lu Lu. Session-based performance test case generation for web applications. In Supply Chain Management and Information Systems (SCMIS), 2010 8th International Conference on, pages 1–7. IEEE, 2010.

[59] Diwakar Krishnamurthy, Mahnaz Shams, and Behrouz H Far. A model-based performance testing toolset for web applications. Engineering Letters, 18(2):92, 2010.

[60] Guangzhu Jiang and Shujuan Jiang. A quick testing model of web performance based on testing flow and its application. In 2009 Sixth Web Information Systems and Applications Conference, pages 57–61. IEEE, 2009.

[61] Junzan Zhou, Bo Zhou, and Shanping Li. LTF: A Model-Based Load Testing Framework for Web Applications. 2014 14th International Conference on Quality Software, pages 154–163, 2014.

[62] Christof Lutteroth and Gerald Weber. Modeling a realistic workload for performance testing. Proceedings - 12th IEEE International Enterprise Distributed Object Computing Conference, EDOC 2008, pages 149–158, 2008.

[63] Manar Abu Talib, Emilia Mendes, and Adel Khelifi. Towards reliable web applications: Iso 19761. In IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society, pages 3144–3148. IEEE, 2012.

[64] Yuta Maezawa, Kazuki Nishiura, Hironori Washizaki, and Shinichi Honiden. Validating ajax applications using a delay-based mutation technique. In Proceedings of the 29th ACM/IEEE international conference on Automated software engineering, pages 491–502. ACM, 2014.

[65] FA Torkey, Arabi Keshk, Taher Hamza, and Amal Ibrahim. A new methodology for web testing. In Information and Communications Technology, 2007. ICICT 2007. ITI 5th International Conference on, pages 77–83. IEEE, 2007.

[66] Yunming Pu, Mingna Xu, Pu Yunming, Xu Mingna, and Xu M Pu Y. Load testing for web applications. 2009 1st International Conference on Information Science and Engineering, ICISE 2009, (1):2954–2957, 2009.

[67] R Thirumalai Selvi and N V Balasubramanian. Performance Measurement of Web Applications Using Automated Tools. I:13–16, 2013.

[68] J.W. Cane. Performance measurements of Web applications. IEEE SoutheastCon, 2003. Proceedings., 55(5):1599–1605, 2003.

[69] Joydeep Mukherjee, Mea Wang, and Diwakar Krishnamurthy. Performance Testing Web Applications on the Cloud. 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation Workshops, pages 363–369, 2014.

[70] M. R. Dhote and G. G. Sarate. Performance testing complexity analysis on ajax-based web applications. IEEE Software, 30(6):70–74, Nov 2013.

[71] Qinglin Wu and Yan Wang. Performance testing and optimization of J2EE-based web applications. 2nd International Workshop on Education Technology and Computer Science, ETCS 2010, 2:681–683, 2010.

[72] Supriya Gupta and Lalitsen Sharma. Performance analysis of internal vs. external security mechanism in web applications. Int. J. Advan. Network Applic, 1(05):314–317, 2010.

[73] R Thirumalai Selvi, Sudha, N V Balasubramanian, and E N G Ia. Performance analysis of proprietary and non-proprietary software. Imecs 2008: International Multiconference of Engineers and Computer Scientists, Vols I and Ii, I:982–984, 2008.

[74] Harry M. Sneed and Shihong Huang. WSDLTest - A tool for testing web services. Proceedings of the Eighth IEEE International Symposium on Web Site Evolution, WSE 2006, pages 14–21, 2006.

[75] Jianfeng Yang, Rui Wang, Zhouhui Deng, and Wensheng Hu. Web software reliability analysis with Yamada exponential testing-effort. ICRMS'2011 - Safety First, Reliability Primary: Proceedings of 2011 9th International Conference on Reliability, Maintainability and Safety, pages 760–765, 2011.

[76] G. Ruffo, R. Schifanella, M. Sereno, and R. Politi. Walty: a tool for evaluating web application performance. In Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. First International Conference on the, pages 332–333, Sept 2004.

[77] Filippo Ricca and Paolo Tonella. Testing processes of web applications. Annals of Software Engineering, 14(1):93–114, 2002.

[78] K. I. Pun and Y. W. Si. Audit trail analysis for traffic intensive web application. In e-Business Engineering, 2009. ICEBE '09. IEEE International Conference on, pages 577–582, Oct 2009.

[79] Filippo Ricca. Analysis, testing and re-structuring of Web applications. IEEE International Conference on Software Maintenance, ICSM, pages 474–478, 2004.

[80] Arlitt Martin Hashemian, Krishnamurthy. Overcoming web server benchmarking challenges in the multi-core era. Proceedings - IEEE 5th International Conference on Software Testing, Verification and Validation, ICST 2012, pages 648–653, 2012.

[81] Mehul Nalin Vora. A Nonintrusive Approach to Estimate Web Server Response Time. International Journal of Computer and Electrical Engineering, 5(1):93–97, 2013.

[82] R. Aganwal, B. Ghosh, S. Banerjee, and S. Kishore Pal. Ensuring website quality: a case study. In Management of Innovation and Technology, 2000. ICMIT 2000. Proceedings of the 2000 IEEE International Conference on, volume 2, pages 664–670, 2000.

[83] J Zinke, J Habenschuß, and B Schnor. Servload: Generating representative workloads for web server benchmarking. Simulation Series, 44(BOOK 12):82–89, 2012.

[84] S. Mungekar and D. Toradmalle. W taas: An architecture of website analysis in a cloud environment. In Next Generation Computing Technologies (NGCT), 2015 1st International Conference on, pages 21–24, Sept 2015.

[85] A. Keshk and A. Ibrahim. Ensuring the Quality Testing of Web Using a New Methodology. 2007 IEEE International Symposium on Signal Processing and Information Technology, 2007.

[86] Hidam Kumarjit Singh and Tulshi Bezboruah. Performance metrics of a customized web application developed for monitoring sensor data. In 2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS), pages 157–162. IEEE, July 2015.

[87] Hugo Saba, Eduardo Manuel, De Freitas Jorge, and Victor Franco Costa. Webteste: a Stress Test Tool. Proceedings of WEBIST 2006 - Second International Conference on Web Information Systems and Technologies, pages 246–249, 2006.

[88] I. Jugo, D. Kermek, and A. Meštrović. Analysis and evaluation of web application performance enhancement techniques. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8541:40–56, 2014.

[89] Wen-Kui Chang and Shing-Kai Hon. Evaluating the Performance of a Web Site via Queuing Theory, pages 63–72. Springer Berlin Heidelberg, 2002.

[90] R. Srinivasa Perumal and P. Dhavachelvan. Performance Analysis of Distributed Web Application: A Key to High Perform Computing Perspective. In 2008 First International Conference on Emerging Trends in Engineering and Technology, pages 1140–1145. IEEE, 2008.

[91] Zhang Huachuan, Xu Jing, and Tian Jie. Research on the parallel algorithm for self-similar network traffic simulation. Proceedings - 2009 2nd IEEE International Conference on Computer Science and Information Technology, ICCSIT 2009, pages 355–359, 2009.

[92] C. Kallepalli and J. Tian. Usage measurement for statistical Web testing and reliability analysis. Proceedings Seventh International Software Metrics Symposium, 27(11):148–158, 2001.

[93] Kaiyu Wang and Naishuo Tian. Performance evaluation of J2EE web applications with queueing networks. Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009, 1:437–440, 2009.

[94] Rizal Suffian, Dhiauddin. Performance testing: Analyzing differences of response time between performance testing tools. In Computer & Information Science (ICCIS), 2012 International Conference on, volume 2, pages 919–923. IEEE, 2012.

[95] Manish Rajendra Dhote and GG Sarate. Performance testing complexity analysis on ajax-based web applications. Software, IEEE, 30(6):70–74, 2013.

[96] Reza NasiriGerdeh, Negin Hosseini, Keyvan RahimiZadeh, and Morteza AnaLoui. Performance analysis of web application in xen-based virtualized environment. In Computer and Knowledge Engineering (ICCKE), 2015 5th International Conference on, pages 256–261. IEEE, 2015.

[97] Vipul Mathur, Preetam Patil, Varsha Apte, and Kannan M Moudgalya. Adaptive admission control for web applications with variable capacity. In Quality of Service, 2009. IWQoS. 17th International Workshop on, pages 1–5. IEEE, 2009.

[98] Ana Cavalli, Stephane Maag, and Gerardo Morales. Regression and performance testing of an e-learning web application: Dotlrn. Proceedings - International Conference on Signal Image Technologies and Internet Based Systems, SITIS 2007, pages 369–376, 2007.

[99] John W Cane. Measuring performance of web applications: empirical techniques and results. In SoutheastCon, 2004. Proceedings. IEEE, pages 261–270. IEEE, 2004.

[100] Elhadi Shakshuki, Chao Chen, Yihai Chen, Huaikou Miao, and Hao Wang. Usage-pattern based statistical web testing and reliability measurement. Procedia Computer Science, 21:140–147, 2013.

[101] Shyaamini B and Senthilkumar M. A novel approach for performance testing on web application services. volume 10, pages 38679–38683. Research India Publications, 2015.

[102] Kai Lei, Yining Ma, and Zhi Tan. Performance comparison and evaluation of web development technologies in php, python, and node.js. In Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on, pages 661–668. IEEE, 2014.

[103] Rigzin Angmo and Mukesh Sharma. Performance evaluation of web based automation testing tools. In Confluence The Next Generation Information Technology Summit (Confluence), 2014 5th International Conference, pages 731–735. IEEE, 2014.

[104] Martti Vasar, Satish Narayana Srirama, and Marlon Dumas. Framework for monitoring and testing web application scalability on the cloud. In Proceedings of the WICSA/ECSA 2012 Companion Volume, WICSA/ECSA '12, pages 53–60. ACM, 2012.

[105] Izzat Alsmadi, Ahmad T. Al-Taani, and Nahed Abu Zaid. Web structural metrics evaluation. Proceedings - 3rd International Conference on Developments in eSystems Engineering, DeSE 2010, pages 225–230, 2010.

[106] Thirumalai Selvi, N. V. Balasubramanian, and P. Sheik Abdul Khader. Quantitative evaluation of frameworks for web applications. International Journal of Computer, Electrical, Automation, Control and Information Engineering, 4(4):708–713, 2010.

[107] Breno Lisi Romano, Gláucia Braga E Silva, Henrique Fernandes De Campos, Ricardo Godoi Vieira, Adilson Marques Da Cunha, Fábio Fagundes Silveira, and Alexandre Carlos Brandão Ramos. Software testing for web-applications non-functional requirements. ITNG 2009 - 6th International Conference on Information Technology: New Generations, pages 1674–1675, 2009.

[108] Elder M. Rodrigues, Rodrigo S. Saad, Flavio M. Oliveira, Leandro T. Costa, Maicon Bernardino, and Avelino F. Zorzo. Evaluating capture and replay and model-based performance testing tools: An empirical comparison. In Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM '14, pages 9:1–9:8. ACM, 2014.

[109] Minzhi Yan, Hailong Sun, Xu Wang, and Xudong Liu. Building a taas platform for web service load testing. In Cluster Computing (CLUSTER), 2012 IEEE International Conference on, pages 576–579. IEEE, 2012.

[110] Xingen Wang, Bo Zhou, and Wei Li. Model based load testing of web applications. Proceedings - International Symposium on Parallel and Distributed Processing with Applications, ISPA 2010, pages 483–490, 2010.

[111] Sara Sprenkle, Holly Esquivel, Barbara Hazelwood, and Lori Pollock. WebVizOR: A visualization tool for applying automated oracles and analyzing test results of web applications. Proceedings - Testing: Academic and Industrial Conference Practice and Research Techniques, TAIC PART 2008, pages 89–93, 2008.

[112] Marc Guillemot and Dierk König. Web testing made easy. In Companion to the 21st ACM SIGPLAN Symposium on Object-oriented Programming Systems, Languages, and Applications, OOPSLA '06, pages 692–693. ACM, 2006.

[113] Harry M Sneed and Shihong Huang. The design and use of wsdl-test: a tool for testing web services. Journal of Software Maintenance and Evolution: Research and Practice, 19(5):297–314, 2007.

[114] Yasuyuki Fujita, Masayuki Murata, and Hideo Miyahara. Performance modeling and evaluation of web server systems. Electronics and Communications in Japan (Part II Electronics), 83(12):12–23, 2000.

[115] Jianhua Hao and Emilia Mendes. Usage-based statistical testing of web applications. Proceedings of the 6th international conference on Web engineering, pages 17–24, 2006.

[116] Sebastian Lehrig, Hendrik Eikerling, and Steffen Becker. Scalability, Elasticity, and Efficiency in Cloud Computing: A Systematic Literature Review of Definitions and Metrics. Proceedings of the 11th International ACM SIGSOFT Conference on Quality of Software Architectures, (MAY):83–92, 2015.

[117] Samer Al-Zain, Derar Eleyan, and Joy Garfield. Automated user interface testing for web applications and TestComplete. Proceedings of the CUBE International Information Technology Conference on - CUBE '12, pages 350–354, 2012.

[118] Tahani Hussain. An Approach to Evaluate the Performance of Web Application Systems. Proceedings of International Conference on Information Integration and Web-based Applications & Services - IIWAS '13, pages 692–696, 2013.

[119] Rubén Casado, Javier Tuya, and Muhammad Younas. Testing the reliability of web services transactions in cooperative applications. Proceedings of the 27th Annual ACM Symposium on Applied Computing - SAC '12, page 743, 2012.

Appendices


Appendix A
Systematic maps

Figure A.1: Research parameters vs research attributes in SMS


Figure A.2: Research methods vs research attributes in SMS

Figure A.3: Research methods vs research parameters in SMS


Appendix B
SMS overview

Table B.1: SMS overview

Author(s) [Ref] | Quality attribute(s) | Facet 1: Metrics | Facet 2: Tools | Facet 3: Challenges
J. Križanić, A. Grgurić, M. Mošmondor, P. Lazarevski [12] | Performance | Provided | Provided | Provided
Fei Wang, Wencai Du [40] | Performance, Scalability | Not provided | Provided | Not provided
Chia Hung Kao, Chun Cheng Lin, Juei-Nan Chen [22] | Performance | Provided | Provided | Provided
Pu Yunming, Xu Mingna [66] | Performance | Provided | Provided | Not provided
Junzan Zhou, Bo Zhou, Shanping Li [61] | Performance | Provided | Provided | Provided
R. Thirumalai Selvi, N. V. Balasubramanian [67] | Performance | Provided | Provided | Not provided
John W. Cane [68] | Performance | Provided | Not provided | Not provided
Osama Hamed, Nedal Kafri [23] | Performance, Reliability, Scalability | Provided | Provided | Not provided
Elder M. Rodrigues, Maicon Bernardino, Leandro T. Costa, Avelino F. Zorzo, Flávio M. Oliveira [7] | Performance | Provided | Provided | Not provided
Joydeep Mukherjee, Mea Wang, Diwakar Krishnamurthy [69] | Performance | Provided | Provided | Not provided
Manish Rajendra Dhote, G.G. Sarate [70] | Performance | Not provided | Provided | Not provided
Amira Ali, Nagwa Badr [5] | Performance, Reliability | Provided | Provided | Not provided
Qinglin Wu, Yan Wang [71] | Performance, Scalability | Provided | Provided | Not provided
Muhammad Dhiauddin Mohamed Suffiani, Fairul Rizal Fahrurazi [16] | Performance | Provided | Not provided | Not provided
Ping Li, Dong Shi, Jianping Li [11] | Performance | Provided | Provided | Provided
Supriya Gupta, Lalitsen Sharma [72] | Performance | Provided | Provided | Not provided
R. Thirumalai Selvi, Sudha, N. V. Balasubramanian [73] | Performance | Provided | Provided | Not provided
Harry M. Sneed, Shihong Huang [74] | Performance, Reliability | Not provided | Provided | Not provided
Thanh H. D. Nguyen [10] | Performance | Provided | Not provided | Not provided
Zao-Bin GAN, Deng-Wen WEI, Vijay Varadharajan [1] | Performance, Scalability | Provided | Provided | Not provided
Jianfeng Yang, Zhouhui Deng, Rui Wang, Wensheng Hu [75] | Reliability | Provided | Not provided | Not provided
G. Ruffo, R. Schifanella, and M. Sereno [76] | Performance, Scalability | Provided | Provided | Not provided
F. A Torkey, Arabi Keshk, Taher Hamza, Amal Ibrahim [65] | Performance, Reliability | Provided | Provided | Provided
F. Ricca, P. Tonella [77] | Performance, Reliability | Provided | Provided | Not provided
Jordi Guitart, Vicenç Beltran, David Carrera, Jordi Torres and Eduard Ayguadé [21] | Performance, Reliability, Scalability | Provided | Provided | Provided
Tiantian Gao, Yujia Ge, Gongxin Wu, and Jinlong Ni [53] | Performance, Scalability, Reliability | Provided | Provided | Provided
Ka-I Pun, Yain-Whar Si [78] | Performance | Provided | Provided | Provided
Filippo Ricca [79] | Reliability | Provided | Not provided | Not provided
P. Nikfard, S. bin Ibrahim, M. Hossein [6] | Performance, Reliability | Provided | Not provided | Provided
Diwakar Krishnamurthy, Mahnaz Shams, Behrouz H. Far [80] | Performance | Provided | Provided | Not provided
Mehul Nalin Vora [81] | Performance, Reliability | Provided | Provided | Not provided
Guangzhu Jiang, Shujuan Jiang [60] | Performance, Reliability | Provided | Provided | Provided
Lakshmi S. Iyer, B. Gupta, N. Johri [13] | Performance, Scalability, Reliability | Not provided | Not provided | Not provided
Niclas Snellman, Adnan Ashraf, Ivan Porres [33] | Performance, Scalability | Provided | Provided | Not provided
R. Aganwal, B. Ghosh, S. Banerjee, S. Kishore Pal [82] | Scalability, Reliability, Performance | Provided | Provided | Not provided
Sandhya Kiran, Akshyansu Mohapatra, Rajashekara Swamy [57] | Performance, Scalability | Provided | Provided | Provided
Jörg Zinke, Jan Habenschuss, Bettina Schnor [83] | Scalability, Performance | Provided | Provided | Not provided
M. Arslan, U. Qamar, S. Hassan, S. Ayub [56] | Performance | Provided | Provided | Provided
B.M. Subraya, S.V. Subrahmanya, J.K. Suresh, C. Ravi [2] | Performance, Scalability, Reliability | Provided | Provided | Not provided
Christof Lutteroth, Gerald Weber [62] | Performance | Provided | Provided | Provided
Kunhua Zhu, Junhui Fu, Yancui Li [25] | Performance, Reliability | Provided | Not provided | Not provided
M. Kalita, S. Khanikar, T. Bezboruah [24] | Scalability, Reliability, Performance | Provided | Provided | Provided
M. Kalita, T. Bezboruah [24] | Scalability, Reliability, Performance | Provided | Provided | Provided
Shraddha Mungekar, Dhanashree Toradmalley [84] | Performance, Scalability | Not provided | Provided | Not provided
B.M. Subraya, S.V. Subrahmanya [4] | Performance | Provided | Provided | Not provided
Arabi Keshk, Amal Ibrahim [85] | Performance, Scalability | Provided | Provided | Not provided
Hidam Kumarjit Singh, Tulshi Bezboruah [86] | Performance, Scalability, Reliability | Provided | Provided | Not provided
Hugo Saba, Eduardo Manuel de Freitas Jorge, Victor Franco Costa [87] | Performance, Reliability | Provided | Provided | Not provided
Igor Jugo, Dragutin Kermek, Ana Meštrović [88] | Performance, Reliability | Provided | Provided | Not provided
G. Ruffo, R. Schifanella, M. Sereno, R. Politi [76] | Performance | Provided | Provided | Not provided
Wen-Kui Chang, Shing-Kai Hon [89] | Performance | Provided | Provided | Not provided
R. Srinivasa Perumal, P. Dhavachelvan [90] | Performance, Reliability | Provided | Provided | Not provided
Weifeng Zhang, Lianjie Chen, Lei Xu [54] | Performance | Provided | Provided | Provided
ZHANG Huachuan, XU Jing, TIAN Jie [91] | Performance, Reliability | Provided | Provided | Provided
Chaitanya Kallepalli, Jeff Tian [92] | Performance, Reliability | Provided | Not provided | Not provided
Md. Safaet Hossain [3] | Performance, Reliability | Provided | Not provided | Provided
Xiaokai Xia, Qiuhong Pei, Yongpo Liu, Ji Wu, Chao Liu [36] | Performance | Provided | Not provided | Not provided
Kaiyu Wang, Naishuo Tian [93] | Performance | Provided | Provided | Not provided
Martin Pinzger, Gabriele Kotsis [8] | Performance | Provided | Not provided | Not provided
Xiuxia Quan, Lu Lu [58] | Performance | Not provided | Provided | Provided
Muhammad Dhiauddin Mohamed Suffiani, Fairul Rizal Fahrurazi [94] | Performance | Provided | Provided | Provided
Manish Rajendra Dhote, G.G. Sarate [95] | Performance, Scalability | Not provided | Provided | Provided
Reza NasiriGerdeh, Negin Hosseini, Keyvan RahimiZadeh, Morteza AnaLoui [96] | Performance | Provided | Provided | Provided
Vipul Mathur, Preetam Patil, Varsha Apte and Kannan M. Moudgalya [97] | Performance | Provided | Not provided | Not provided
A. Shojaee, N. Agheli, B. Hosseini [55] | Performance | Provided | Provided | Provided
Ana Cavalli, Stephane Maag, Gerardo Morales [98] | Performance, Scalability | Provided | Provided | Not provided
John W. Cane [99] | Performance | Not provided | Provided | Not provided
Chao Chen, Yihai Chen, Huaikou Miao, Hao Wang [100] | Reliability | Provided | Provided | Provided
Manar Abu Talib, Emilia Mendes, Adel Khelifi [63] | Reliability | Not provided | Not provided | Provided
Raoufehsadat Hashemian, Diwakar Krishnamurthy, Martin Arlitt [80] | Performance, Scalability | Provided | Provided | Not provided
Mahnaz Shams, Diwakar Krishnamurthy, Behrouz Far [59] | Performance | Provided | Provided | Provided
Ms. B. Shyaamini, Dr. M. Senthilkumar [101] | Performance | Provided | Not provided | Not provided
Kai Lei, Yining Ma, Zhi Tan [102] | Performance, Scalability | Provided | Provided | Not provided
Ms. Rigzin Angmo, Mrs. Monika Sharma [103] | Performance | Not provided | Provided | Not provided
Fredrik Abbors, Tanwir Ahmad, Dragoş Truşcan, Ivan Porres [51]

Performance Provided Provided Provided

Martti Vasar,Satish NarayanaSrirama, MarlonDumas [104]

Scalability, Per-formance

Provided Provided Not provided

Izzat Alsmadi, Ah-mad T. Al-Taani,and Nahed AbuZaid [105]

Performance,Reliability

Provided Not provided Not provided

Thirumalai Selvi,N. V. Balasub-ramanian, andP. Sheik AbdulKhader [106]

Performance Provided Provided Not provided

Page 127: Performance, Scalability, and Reliability (PSR) challenges ...

Appendix B. SMS overview 115

Breno Lisi Romano,Gláucia Bragae Silva,Henrique Fernandesde Campos,Ricardo [107]

scalabilityreliabilityperformance

Provided Provided Not provided

ChaitanyaKallepalli andJeff Tian [44]

Reliability Provided Provided Not provided

Yuta Maezawa,Kazuki Nishiura,Shinichi Honiden,Hironori Washizaki[64]

Reliability Not provided Provided Provided

Elder M. Ro-drigues, Flavio M.Oliveira,. MaiconBernardino, Ro-drigo S. Saad,Leandro T.Costa,AvelinoF. Zorzo [108]

Performance Not provided Provided Not provided

Minzhi Yan, Hai-long Sun, XuWang, Xudong Liu[109]

Performance Provided Provided Provided

M. Kalita1, T.Bezboruah [24]

Performance,Scalability,Reliability

Provided Provided Not provided

Anthony Arkles,Dwight Makaroff[52]

Performance Not provided Provided Provided

Xingen Wang, BoZhou, Wei Li [110]

Performance Provided Provided provided

Sara Sprenkle†,Holly Esquivel,Barbara Hazel-wood, Lori Pollock[111]

NA Not provided Provided provided

Marc Guillemot,Dierk König [112]

Performance, re-liability

Not provided Not provided Not provided

Page 128: Performance, Scalability, and Reliability (PSR) challenges ...

Appendix B. SMS overview 116

Martin Arlitt,Carey Williamson[80]

Performance Provided Provided Not provided

Harry M. Sneed,Shihong Huang[113]

Performance Not provided Provided provided

Yasuyuki,Masayuki Mu-rata and HideoMiyahara [114]

Performance Provided Not provided Not provided

Jianhua Hao,Emilia Mendes[115]

Reliability Provided Provided Not provided

Hugo Menino,Aguiar João CostaSeco, Lúcio Ferrão[18]

Performance Not provided Not provided Not provided

Sebastian Lehrig,Hendrik Eikerling,Steffen Becker [116]

Scalability Provided Not provided Not provided

Samer Al-Zain,Derar Eleyan, JoyGarfield [117]

Performance Provided Provided Not provided

Tahani Hussain[118]

Performance Provided Provided Not provided

Rubén Casado,Javier Tuya,MuhammadYounas [119]

Reliability Provided Not provided Not provided

Page 129: Performance, Scalability, and Reliability (PSR) challenges ...

Appendix C
List of metrics

Table C.1: Metrics description

Metric Name | Metric Description
Response time | Time taken from the request provided by the user until the last character of the response is received [12, 25]
Throughput | Number of requests received per second by a network or server [12, 25]
Number of concurrent users | Total number of users using the application at a given period of time [12]
CPU utilization | Amount of work handled by the CPU in order to execute a task [12]
Disk I/O (access) | NA
Memory utilization | Amount of physical memory (RAM) consumed by a process [12, 25]
Number of transactions per sec (HTTP) | Total number of transactions completed in a second [12]
Resource utilization | Amount of resources utilized by a task
MTBF | Average time between failures of an application [65]
MTTR | Average time required to repair the failed application [65]
Latency | The time taken for sending a packet and receiving the packet sent by the sender
Processor time | Amount of time the CPU is utilized by an application or task
Think time | Time the user pauses between performing tasks
Ramp up and ramp down | Ramp up increases the load on the server to measure the breakpoint, and ramp down decreases the load gradually in order to recover from the ramp up
Disk space | Amount of memory available in the logical disk [12]
Number of hits per sec | The total number of hits on a web server for each second
Number of errors | Total number of errors in an application
Errors percentage | The number of samples failing (percentage of requests with errors)
Error ratio | The number of samples failing divided by the total number of samples passed
MTTF | Average time the application works before it fails [65]
Failure rate (request) | The frequency of failures as per the number of requests [65]
Number of HTTP requests | Total number of requests received by a server in a given unit of time
Capacity | NA
Load time | Time taken to load the web page at the client side
Disk I/O transactions | Read or write transactions per second and bytes per second
Hit ratio | Ratio of the number of cache hits to the number of misses [65]
Page load time and request response time | Time taken to load a web page in seconds; request response time is the time for a single request
Availability | Probability that the application works when required during a period of time
Roundtrip time | The total time between the data sent and the data received
Cache hit | The data required is found in the cache memory [65]
Cache hit ratio | Ratio of the number of cache hits to the number of misses
Network latency | Time taken to transmit one packet of data from source to destination
Physical disk time | Amount of time the read and write requests are executed by the disk
Successful or failed hits | Total number of successful hits and failed hits [65]
Number of connections per sec (user) | Total number of connections requested to the server in a given second
Number of deadlocks | Measure of the frequency of deadlocking in the database
Elapsed time (disk) | Time between the request and response transmission process
Number of sessions | Total number of times users with a unique IP address accessed the application [25]
Number of requests per sec | The total number of requests received by a server in a given second
Hit value | NA
Execution time | Time taken to execute a particular request
Session time | The time between the user entering and leaving the application
Disk queue length (request) | Average number of read and write requests queued in the selected disk [25]
Transaction time | A period of time where a data entry is made
Disk utilization | The usage of disk space by the application
Session length | The total time between the user entering and leaving the application [25]
Successful requests rate | The number of connections that have received the response to their requests
Connect time | The elapsed time the user is connected to the network or application
Number of connection errors, number of timeouts | The number of connections rejected by the server and the number of timeouts at the client side due to the time limit
Request rate | The number of requests made to a server at a given time
Computing power | How fast a machine or server can perform a task
Processing speed | Number of instructions the computer executes per second
Cache memory usage | NA
Number of successful virtual users | Total number of virtual users created to perform load testing
Available memory | The amount of physical memory which is free or not used by any resource
Requests in bytes per sec | Number of bytes transmitted per request from a server [25]
Rate of successfully completed requests (goodput) | The total number of useful information bits per request delivered to a certain destination in a unit of time
Connection time (server) | The time taken for the user to connect with the server
Load distribution | NA
Rendezvous point | Point where all expected users wait until all are emulated, and then all virtual users send requests at one time
Speed | Speed at which the processor executes the instructions received
Queue percentage | Percentage of the work queue size currently in use [25]
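Several of the metrics listed above are plain aggregates over the raw samples that a load-test run produces. As an illustration only, the following Python sketch derives three of them (throughput, average response time and error percentage) from a hypothetical list of samples; the sample layout and field names are assumptions made for this example and do not correspond to the result format of any particular tool.

```python
# Minimal sketch: deriving a few of the metrics listed above from raw
# load-test samples. The sample format is hypothetical, chosen only to
# illustrate the definitions; real tools use their own result formats.
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    timestamp: float   # seconds since the start of the test run
    elapsed_ms: float  # response time of the request, in milliseconds
    success: bool      # whether the request completed without error


def summarize(samples: List[Sample]) -> dict:
    total = len(samples)
    failed = sum(1 for s in samples if not s.success)
    duration = max(s.timestamp for s in samples) - min(s.timestamp for s in samples)
    return {
        # Throughput: requests handled per second over the observed window
        "throughput_rps": total / duration if duration > 0 else float("nan"),
        # Response time: mean time from request until the response is received
        "avg_response_ms": sum(s.elapsed_ms for s in samples) / total,
        # Error percentage: share of samples that failed
        "error_pct": 100.0 * failed / total,
    }


if __name__ == "__main__":
    demo = [Sample(0.0, 120.0, True), Sample(0.5, 180.0, True), Sample(1.0, 950.0, False)]
    print(summarize(demo))
```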


Appendix D
List of tools

TID | Developer | Tool Name | Platform Support | Programming Language | Tool Type | Source | Reference Link | Availability | Attribute
1 | Apache | Apache JMeter™ | Cross-platform | Java | Manual | Open Source | http://jmeter.apache.org/ | Available | Performance (Load testing)
2 | HP | LoadRunner | Windows, Linux | C | Automated | Commercial | http://www8.hp.com/us/en/software-solutions/loadrunner-load-testing/index.html?jumpid=va_uwxy6ce9tr | Available | Performance (Load testing)
3 | Parasoft | WebKing | Windows, Linux, Solaris | Java | Automated | Commercial | https://www.parasoft.com/press/simplifies-functional-testing-in-webking/ | Not Available | Performance (Load testing), Reliability
4 | ESnet / Lawrence Berkeley National Laboratory | iPerf | Cross-platform | C | Manual | Open Source | https://iperf.fr/ | Available | Performance
5 | Ericsson | Tsung | Cross-platform | Erlang | Manual | Open Source | http://tsung.erlang-projects.org/ | Available | Performance (Load testing, Stress), Scalability
6 | Softlogica | WAPT | Windows | NA | Automated | Freeware | http://www.loadtestingtool.com/index.shtml | Available | Performance (Load, Stress)
7 | Cyrano | openSTA | Windows | C++ | Manual | Open Source | http://opensta.org/ | Available | Performance (Load, Stress)
8 | Parasoft | SOAtest | Cross-platform | NA | Manual and Automated | Commercial | https://www.parasoft.com/product/soatest/ | Available | Reliability, Performance (Load, Stress)
9 | Microsoft | Microsoft Web Application Stress Tool | Windows | NA | Manual and Automated | Freeware | http://www.microsoft.com/downloads/details.aspx?FamilyID=e2c0585a-062a-439e-a67d-75a89aa36495&DisplayLang=en | Not Available | Performance (Stress)
10 | HP | httperf | Linux, Windows | C | Manual | Open Source | http://www.labs.hpe.com/research/linux/httperf/ | Available | Performance (Load)
11 | Paco Gomez | The Grinder | Independent | Python or Jython | Manual | Open Source | http://grinder.sourceforge.net/ | Available | Performance (Load)
12 | RADVIEW | WebLOAD | Linux, Windows | C++ | Automated | Freeware | http://www.radview.com/ | Available | Performance (Load, Stress), Scalability
13 | Micro Focus International | Silk Performer | Windows | NA | Automated | Freeware | http://www.borland.com/en-GB/Products/Software-Testing/Performance-Testing/Silk-Performer | Available | Performance (Load, Stress), Scalability
14 | Paessler AG | Webserver Stress Tool | Windows | NA | Automated | Freeware | https://www.paessler.com/tools/webstress/features | Available | Performance (Load, Stress)
15 | Micro Focus International | QAload | Windows | NA | Automated | Freeware | http://www.borland.com/en-GB/Products/Other-Borland-products/Qaload | Not Available | Performance (Load, Stress), Scalability
16 | The Wireshark team | Wireshark | Cross-platform | C, C++ | Manual | Open Source | https://www.wireshark.org/ | Available | Performance
17 | Firebug Working Group | Firebug | Cross-platform | JavaScript, XUL, CSS | Automated | Open Source | http://getfirebug.com/ | Available | Performance (web page performance analysis)
18 | John Levon | Oprofile | Cross-platform | C | Automated | Open Source | http://oprofile.sourceforge.net/news/ | Available | Performance (performance counter monitoring/profiling tool)
19 | HP & Open source | Xenoprof | Linux | NA | Automated | Open Source | http://xenoprof.sourceforge.net/ | Available | Performance (performance counter monitoring/profiling tool)
20 | Open source (standard version), SmartBear Software (Pro version) | SoapUI | Cross-platform | NA | Automated | Open Source & Commercial | https://www.soapui.org/ , https://sourceforge.net/projects/soapui/ | Available | Performance (Load testing)
21 | SOASTA | CloudTest | Cross-platform | NA | Automated | Commercial | http://www.soasta.com/load-testing/ | Available | Performance (Load testing), Scalability
22 | Mark Seger | collectl | Linux | NA | Manual | Open Source | https://sourceforge.net/projects/collectl/ | Available | Performance (monitoring tool)
23 | Apache | ApacheBench | Cross-platform | NA | Automated | Open Source | https://httpd.apache.org/docs/2.4/programs/ab.html | Available | Performance (Load testing)
24 | SmartBear Software | TestComplete | Windows | NA | Automated | Commercial | https://smartbear.com/product/testcomplete/overview/ | Available | Performance, Scalability and Reliability
25 | Tanwir Ahmad | MBPeT: A performance testing tool | NA | NA | Manual | NA | Not available | Not Available | Performance and Scalability
26 | Florian Forster | collectd | Unix-like | C | Manual | Open Source | http://collectd.org/ | Available | Performance (Load testing)
27 | The Cacti Group, Inc. | Cacti | Cross-platform | PHP, MySQL | Manual | Open Source | http://www.cacti.net/ | Available | Performance
28 | Mach5 | FastStats Log File Analyzer | Cross-platform | NA | Automated | Commercial | https://www.mach5.com/index.php | Available | Performance, Scalability and Reliability (log-based analysis tool)
29 | IBM | Rational TestManager | Windows, Linux | NA | Manual and Automated | Commercial | Not available | Not Available | Performance
30 | Corey Goldberg | Pylot | NA | Python | Manual and Automated | Open Source | http://www.pylot.org/ | Available | Performance and Scalability
31 | CustomerCentrix | loadstorm | Cross-platform | NA | Automated and Manual | Commercial | http://loadstorm.com/ | Available | Performance (Load testing)
32 | Rational Software | Rational Performance Tester | Windows, Linux | NA | Automated | Commercial | http://www-03.ibm.com/software/products/en/performance | Available | Performance
33 | Pushtotest | Testmaker | Cross-platform | NA | Automated and Manual | Commercial, Open Source, Trial | http://www.pushtotest.com/intrototm.html | Available | Performance and Scalability
34 | Armstrong World Industries | Siege | Cross-platform | NA | Manual | Open Source | https://www.joedog.org/siege-home | Available | Performance (Load testing)
36 | LOADIMPACT AB | LOADIMPACT | Cross-platform | NA | Automated | Commercial, Trial | https://loadimpact.com/ | Available | Performance (Load testing)
37 | Dynatrace | Advanced Web Monitoring Scripting (KITE) | Windows | NA | Automated | Freeware | http://www.keynote.com/solutions/monitoring/web-monitoring-scripting-tool | Available | Performance monitoring
38 | Microsoft | Visual Studio | Windows | C++, C# | Manual, Automated | Commercial, Trial | https://www.visualstudio.com/en-us/features/testing-tools-vs.aspx | Available | Performance (Load testing, Stress testing)
39 | testoptimal | testoptimal | Windows, Linux | NA | Automated | Commercial, Trial | http://testoptimal.com/ | Available | Performance (Load testing)
40 | Westwind | WebSurge | Windows | NA | Manual | Open Source | https://websurge.west-wind.com/ | Available | Performance (Load, Stress)
41 | Microsoft | Application Center Test | Windows | NA | Automated | Freeware | https://msdn.microsoft.com/en-us/library/aa287410(v=vs.71).aspx | Available | Performance (Stress, Load), Scalability
42 | EMPIRIX | e-TEST suite | Windows, Linux | NA | Automated | Commercial | http://www.empirix.com/ | Not Available | Performance, Reliability
43 | Watir | Watir-webdriver | Cross-platform | Ruby | Automated | Open Source | https://watir.com/ | Available | Performance
44 | SeleniumHQ | Selenium WebDriver | Cross-platform | Java | Automated, Manual | Freeware | http://docs.seleniumhq.org/ | Available | Performance, Scalability
45 | AppPerfect Corporation | AppPerfect Load Test | Cross-platform | NA | Automated | Commercial | http://www.appperfect.com/index.html | Available | Performance (Load, Stress)
46 | Yahoo | Yslow | Cross-platform | NA | Manual | Open Source | http://yslow.org/ | Available | Performance analysis
47 | neustar | BrowserMob | Cross-platform | NA | Automated | Commercial, Trial | NA | Not Available | Performance (Load), Scalability
48 | Neotys | NeoLoad | Cross-platform | Java | Automated | Commercial, Trial | http://www.neotys.com/ | Available | Performance (Load, Stress)
49 | Brendan Gregg | perf | Linux | C | Manual | NA | https://perf.wiki.kernel.org/index.php/Main_Page | Available | Performance monitoring
50 | Alon Girmonsky | Blazemeter | Cross-platform | NA | Automated | Open Source | https://www.blazemeter.com/ | Available | Performance (Load)
51 | Zabbix Company | Zabbix | Cross-platform | C, PHP, Java | Automated | Open Source | http://www.zabbix.com/ | Available | Scalability (monitoring tool)
52 | Ethan Galstad | Nagios | Cross-platform | C | Automated | Open Source | https://www.nagios.org/ | Available | Scalability (monitoring tool)
53 | Opsview Limited | Opsview | Linux, Solaris | Perl, C, ExJS | Automated | Trial, Commercial | https://www.opsview.com/ | Available | Scalability (monitoring tool)
54 | Atlassian | HyperHQ | Cross-platform | NA | Automated | Open Source | https://github.com/hyperic/hq | Available | Scalability (monitoring tool)
55 | HP, HP Software Division | HP QuickTest Professional | Windows | NA | Automated | Commercial | http://www8.hp.com/us/en/software-solutions/unified-functional-automated-testing/index.html | Available | Performance (Load)
56 | Oracle Corporation | Jconsole | Windows NT, OS X, Linux, Solaris | Java | Automated | Open Source | http://docs.oracle.com/javase/8/docs/technotes/guides/management/jconsole.html | Available | Performance
57 | PureLoad Software Group AB | PureLoad Enterprise | Cross-platform | NA | Automated | Commercial, Trial | http://www.pureload.com/products-pureload | Available | Performance (Load)
58 | Ixia | IxExplorer | Windows | NA | Automated | Commercial, Trial | NA | NA | Performance (Load)
59 | HP / Mercury Interactive | HP Quality Center | Microsoft Windows or Linux | NA | Automated | Proprietary | http://www8.hp.com/us/en/software-solutions/website-testing-stormrunner-load/index.html | Available | Performance (Load)
60 | IBM | IBM RPT | Windows, Linux | NA | Automated | Commercial | http://www-03.ibm.com/software/products/en/performance | Available | Performance (Load)
61 | Tyto software | Sahi pro | Cross-platform | Java and JavaScript | Automated | Commercial | http://sahipro.com/ | Available | Performance (Load)
62 | Vmware | Vmware Vcenter | Cross-platform | NA | Automated | Commercial | http://www.vmware.com/in/products/converter | Available | Performance and Scalability
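Many of the open-source load generators listed above are driven from the command line and print the kind of metrics described in Appendix C in their textual report. As a rough illustration only, the sketch below launches ApacheBench (TID 23) from Python and extracts the reported request rate; it assumes the ab binary is installed and on the PATH, and the target URL and load levels are placeholders.

```python
# Illustrative sketch only: driving ApacheBench (TID 23 above) from Python
# and pulling the reported request rate out of its text report. Assumes the
# `ab` binary is installed; the URL and load levels are placeholders.
import re
import subprocess


def run_ab(url: str, requests: int = 100, concurrency: int = 10) -> float:
    """Run ApacheBench and return the reported requests per second."""
    result = subprocess.run(
        ["ab", "-n", str(requests), "-c", str(concurrency), url],
        capture_output=True, text=True, check=True,
    )
    # ab prints a line such as "Requests per second:    123.45 [#/sec] (mean)"
    match = re.search(r"Requests per second:\s+([\d.]+)", result.stdout)
    if match is None:
        raise RuntimeError("could not find the request rate in ab output")
    return float(match.group(1))


if __name__ == "__main__":
    print(run_ab("http://localhost:8080/", requests=200, concurrency=20))
```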


Appendix E
List of challenges

Table E.1: List of challenges

User
• Simulating the real user behavior for testing.
• Improving an identified bottleneck can improve the overall performance of the system, or it can cause another bottleneck due to different user actions.
• Knowing how users react to different response times and what actions the users perform in relation to server responses.

Tools
• Improper environment of tools, i.e. system configuration, tool installation, tool setup and flexibility to perform tests.
• Creating a larger number of virtual users, as JMeter only supports a limited number of virtual users.
• The JMeter tool does not support the generation of test scripts.
• JMeter scripts do not capture all the dynamic values, such as SAML Request, Relay State, Signature Algorithm, Authorization State, Cookie Time, Persistent ID (PID), JSession ID and Shibboleth, generated using the single sign-on mechanism of the Unified Authentication Platform.
• In JMeter it is not easy to simulate another system.
• The JMeter tool is unable to record test cases and it provides very confusing charts.
• Some of the tools available for testing the quality attributes only support the creation of simple test case scenarios; these may not be sufficient to determine the transaction time and the number of simultaneous users, and they also make it difficult to identify the bottlenecks existing in the application.
• Tools that use random user sessions and log-file-based sessions for simulating virtual users do not provide a realistic workload.

Metrics
• Identifying the existing dependencies between requests.
• Selecting the parameters and the criteria for testing is an important issue in performance testing.
• Scalability testing related to resources such as CPU, server, memory and disk.
• Challenges related to the network connection and the server processor.
• Specifying the load test parameters, such as generation of forms and recognition of the returned pages.

Development
• Handling a large number of test scenarios.
• Knowing how many users may hit the site at the same time, i.e. loading and tuning challenges.
• Challenges related to code, i.e. unnecessary sleep statements and loops.
• Challenges related to the enhancement of test scripts.
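Several of the tool-related challenges above concern virtual users, i.e. concurrently emulated clients. The sketch below shows the idea in its simplest form: each thread plays one virtual user that repeatedly sends a request, records the response time and pauses for a think time. This is a conceptual illustration under assumed parameters (URL, number of users, think time), not a description of how JMeter or any other tool in Appendix D is implemented; a thread-per-user design of this kind also hints at why a single load generator can only sustain a limited number of virtual users, which is one of the challenges reported above.

```python
# Conceptual sketch of "virtual users": each worker thread repeatedly issues
# a request, waits a think time and records the response time. Illustration
# only; URL, user count and durations are placeholders.
import threading
import time
import urllib.request

RESULTS = []                 # (virtual user id, response time in seconds)
LOCK = threading.Lock()


def virtual_user(user_id: int, url: str, iterations: int, think_time: float) -> None:
    for _ in range(iterations):
        start = time.time()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                resp.read()
            elapsed = time.time() - start
            with LOCK:
                RESULTS.append((user_id, elapsed))
        except Exception:
            with LOCK:
                RESULTS.append((user_id, float("nan")))  # failed request
        time.sleep(think_time)  # think time between user actions


def run_load(url: str, users: int = 10, iterations: int = 5, think_time: float = 1.0) -> None:
    threads = [
        threading.Thread(target=virtual_user, args=(i, url, iterations, think_time))
        for i in range(users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    run_load("http://localhost:8080/", users=5, iterations=3, think_time=0.5)
    print(f"collected {len(RESULTS)} samples")
```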


Appendix F
Interview questions

F.1 Technical Questions

• Do you have any previous experience in Web testing? If so, how many years?

• What is your experience with PSR testing?

• What are the subtypes of testing that you would perform in order to conduct the Performance, Reliability, and Scalability testing?

F.1.1 Tools

F.1.1.1 Performance

• What are the tools used in your current company for conducting the performance testing of a web application?

• Are there any other tools that you have worked with for conducting performance testing?

• What are the additional tools that you would specify for performance testing?

• What are the main reasons for considering the specific tools among existing tools?

• What are the difficulties that you have faced while working with these tools?

• What are the drawbacks that you noticed in the specified tools?

• What are the suggestions/improvements that you would suggest?

F.1.1.2 Scalability

• What are the tools used in your current company for conducting the scalability testing of a web application?


• Are there any other tools that you have worked with for conducting scalability testing?

• What are the additional tools that you would specify for scalability testing?

• What are the main reasons for considering the specific tools among existing tools?

• What are the difficulties that you have faced while working with these tools?

• What are the drawbacks that you noticed in the specified tools?

• What are the suggestions/improvements that you would suggest?

F.1.1.3 Reliability

• What are the tools used in your current company for conducting the reliability testing of a web application?

• Are there any other tools that you have worked with for conducting reliability testing?

• What are the additional tools that you would specify for reliability testing?

• What are the main reasons for considering the specific tools among existing tools?

• What are the difficulties that you have faced while working with these tools?

• What are the drawbacks that you noticed in the specified tools?

• What are the suggestions/improvements that you would suggest?

F.1.2 Metrics

F.1.2.1 Performance

• Which metrics are considered for conducting performance testing in your company?

• What other metrics do you know for conducting performance testing?

• What are the reasons for considering only the specified metrics?

• What are the reasons for excluding the other metrics?

• Which is the most important metric among the specified metrics that need to be considered while testing? What is the reason?

• Which is the least important metric among the specified metrics? What is the reason?


F.1.2.2 Scalability

• Which metrics are considered for conducting scalability testing in your company?

• What other metrics do you know for conducting scalability testing?

• What are the reasons for considering only the specified metrics?

• What are the reasons for excluding the other metrics?

• Which is the most important metric among the specified metrics that need to be considered while testing? What is the reason?

• Which is the least important metric among the specified metrics? What is the reason?

F.1.2.3 Reliability

• Which metrics are considered for conducting reliability testing in your company?

• What other metrics do you know for conducting reliability testing?

• What are the reasons for considering only the specified metrics?

• What are the reasons for excluding the other metrics?

• Which is the most important metric among the specified metrics that need to be considered while testing? What is the reason?

• Which is the least important metric among the specified metrics? What is the reason?

F.1.2.4 Other

• Do the tools specified earlier address all the specified metrics?

• If not, how are these metrics handled? Are any tailored tools used?

F.1.3 Challenges

• What are the challenges faced while testing these PSR attributes?

• What are the causes for facing the specified challenges?

• Is your company able to address all the challenges identified during the testing process?


• What are the mitigation strategies that are employed by your company for overcoming the identified challenges?

• Are there any challenges that your company was unable to address?

F.1.4 General

• Do you think that testing the PSR attributes is necessary for web applications?

• Which is the most important attribute among the PSR attributes? What is the reason?

• Which is the least important attribute among the PSR attributes? What is the reason?

• Do you have any more suggestions regarding the PSR attributes that would help our research?


Appendix G
MTC and IA identified between case company and other companies

G.1 Metrics

Table G.1: Identified metrics between case company and other companies
Case company: Number of transactions per sec, CPU utilization, Memory utilization, Processor time, Throughput, Disk I/O, Number of hits per sec, Number of requests per sec, Number of concurrent users, Network usage, Server requests and response, Speed, Response time, Rendezvous point, Transactions pass and fail criteria, Ramp up time and ramp down time, Error percentage, Queue percentage, Bandwidth, Network latency, Load distribution, MTBF, Number of failures.
Other companies: Number of transactions per sec, CPU utilization, Memory utilization, Processor time, Throughput, Disk I/O, Number of hits per sec, Number of requests per sec, Number of concurrent users, Network usage, Speed, Response time, Error percentage, Number of failures, Bandwidth, Transactions pass and fail criteria.


G.2 Tools

Table G.2: Identified tools between case company and other companies
Case company: HP LoadRunner, VMware Vcenter, QualityTest Professional, HP Quality Center, Silk Performer, AKKA clustering, Zookeeper clustering, Oracle RAC clustering, M1 - Monitor One, Wireshark, Apache JMeter, Selenium, Ixia.
Other companies: HP LoadRunner, IBM RPT, Sahi pro, Apache JMeter, Selenium, QualityTest Professional.

G.3 Challenges

Table G.3: Identified challenges between case company and other companies
Case company: Limited number of virtual users in the JMeter tool; Technology expertise challenge; Script related issues (capturing browser requests); Metric related issues; Insufficient time; Network related issues.
Other companies: Limited number of virtual users in the JMeter tool; Insufficient time; Compatibility issues; Metric related issues.

G.4 Important attribute

Table G.4: Identified important attribute between case company and other companies
(Application based | Priority order | All are important)
Case company: 4 | 2 | 2
Other companies: 2 | 1 | 1


Appendix H
Consent form

The following is a consent form for the research project "Performance, Scalability, and Reliability (PSR) challenges, metrics and tools for web testing: A Case Study", carried out by Akshay Kumar Magapu and Nikhil Yarlagadda from Blekinge Tekniska Högskola (BTH), Karlskrona, Sweden. Before the interview can start, the investigator and the interviewee should sign two copies of this form. The interviewee will be given one copy of the signed form.

Consent for Participation in Interview Research

1. I volunteer to participate in a research project conducted by the students carrying out this research. I understand that the project is designed to gather information about Performance, Scalability, and Reliability (PSR) challenges, metrics and tools for web testing. I will be one of the members being interviewed for this research.

2. My participation in this project is voluntary.

3. I understand that most interviewees will find the discussion interesting and thought-provoking. If, however, I feel uncomfortable in any way during the interview session, I have the right to decline to answer any question or to end the interview.

4. Participation involves being interviewed by researchers from BTH. The interview will last approximately 30-45 minutes. Notes will be written during the interview. An audio recording of the interview and subsequent dialogue will be made. If I do not want to be taped, I will not be able to participate in the study.

5. I understand that the researcher will not identify me by name in any reports using information obtained from this interview, and that my confidentiality as a participant in this study will remain secure.

6. Employees from my company will neither be present at the interview nor have access to raw notes or transcripts. This precaution will prevent my individual comments from having any negative repercussions.

7. I have read and understand the explanation provided to me. I have had all my questions answered to my satisfaction, and I voluntarily agree to participate in this study.

8. I have been given a copy of this consent form.

____________________________ ________________________

My Signature Date

____________________________ ________________________

My Printed Name Signature of the Investigator

For further information, please contact:

Akshay Kumar Magapu Nikhil Yarlagadda

[email protected] [email protected]

Supervisor:

Michael Unterkalmsteiner

Postdoc at Blekinge Tekniska Högskola (BTH)

[email protected]
