Empirical Evaluations of Regression Test Selection Techniques: A Systematic Review
Emelie Engström, Mats Skoglund, Per Runeson
Lund University
Presenter: Shyam Rajendran
Professor: Darko Marinov
Agenda
• Introduction to Regression Testing
• Our Questions about Regression Testing Research
• Our Application of the Systematic Review
• The Results
• Our Experiences
• Future Work
More importantly, why study this paper?
Regression Testing
• Retesting of software
• after a change
• verify behavior
• of parts that used to work
Research Topics on Regression Testing
• Change Impact Analysis
• Regression Testing on specific applications, e.g. databases or GUIs
• Test Automation / GUI Testing
• Test Process Enhancement
• Development of Techniques for Regression Testing
• Methods for Evaluating Regression Testing Techniques
Our Research Questions
1. Which techniques for regression test selection in the literature have been evaluated empirically?
2. Can these techniques be classified, and if so, how?
3. Are there significant differences between these techniques that can be established using empirical evidence?
4. Can technique A be shown to be superior to technique B, based on empirical evidence?
As recommended by Kitchenham guidelines…
Systematic reviews should aim to present a fair evaluation of a research topic by using a trustworthy, rigorous, and auditable methodology
Electronic Sources
• Inspec (<www.theiet.org/publishing/inspec/>)
• Compendex (<www.engineeringvillage2.org>)
• ACM Digital Library (<portal.acm.org>)
• IEEE eXplore (<ieeexplore.ieee.org>)
• ScienceDirect (<www.sciencedirect.com>)
• Springer LNCS (<www.springer.com/lncs>)
• Web of Science (<www.isiknowledge.com>)
Search Criteria
• 1969-2006
• Keywords: <regression> and <test or testing> and <software>
• Only papers in English:
  • Sorry, can’t read other languages
• No grey literature
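The search criteria above can be sketched as a simple filter. This is only an illustration: each database applies its own query syntax and matching rules, and the function name, the record titles, and the years below are all hypothetical.

```python
import re


def matches_search(text: str, year: int) -> bool:
    """Return True if a record satisfies the keyword and year criteria."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_keywords = (
        "regression" in words
        and ("test" in words or "testing" in words)
        and "software" in words
    )
    return has_keywords and 1969 <= year <= 2006


# Hypothetical records: (title or abstract text, publication year).
records = [
    ("Regression testing of database software", 2003),
    ("Software for statistical regression testing", 1998),
    ("Statistical regression for software cost models", 2001),
    ("A safe regression test selection technique for software", 2010),
]

hits = [record for record in records if matches_search(*record)]
for title, year in hits:
    print(year, title)
```

The second record passes the keyword filter although it is about statistical regression, which is the kind of false positive that manual filtering has to remove.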
Manual Filtering!
Results – Study Selection Procedure
Exclusion based on titles
#2923
Stage 1: Exclusion of duplicates and irrelevant papers
Stage 2: Exclusion of papers not presenting empirical research or not focusing on regression test selection
Ex: Papers on software for statistical regression testing
Ex: Test suite maintenance or test automation
Stage 3: Based on full text analysis and …
Iterative fashion among the researchers
Final inclusion
• Is a specific regression test selection method evaluated?
  • Paper excluded if it presents a regression testing technique but evaluates it from another point of view
• Are the metrics and the results reported in the studies relevant for a comparison of methods?
  • Paper excluded if it presents the ability to predict fault-prone code but not the cost of RTS or the effectiveness of fault detection
• Is data collected in a sufficiently rigorous manner?
  • Paper excluded if conclusions are drawn based on a subset of the components analyzed
But in general, more inclusive than exclusive
Agenda
• Introduction to Regression Testing
• Our Questions about Regression Testing Research
• Our Application of the Systematic Review
• The Results
• Our Experiences
1. Which techniques exist in the literature?
• 32 unique regression test selection techniques and
• 5 reference techniques
  • Retest-all
  • Intuitive, experience-based selection
  • Random (25, 50, 75)
have been evaluated empirically
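A minimal sketch of two of the reference techniques, assuming that "Random (25, 50, 75)" means selecting a random 25%, 50%, or 75% of the test suite. The function names, the fixed seed, and the 20-test suite are hypothetical; the intuitive, experience-based baseline is not modeled.

```python
import random


def retest_all(tests):
    """Baseline: rerun every test in the suite."""
    return list(tests)


def random_selection(tests, percent, seed=0):
    """Baseline: pick the given percentage of tests uniformly at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    k = round(len(tests) * percent / 100)
    return rng.sample(list(tests), k)


suite = [f"t{i}" for i in range(1, 21)]  # hypothetical suite of 20 tests

baselines = {"retest-all": retest_all(suite)}
for percent in (25, 50, 75):
    baselines[f"random-{percent}"] = random_selection(suite, percent)

sizes = {name: len(selection) for name, selection in baselines.items()}
print(sizes)
```

A selection technique is then judged against these baselines: it should cost less than retest-all and detect more faults than a random selection of similar size.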
1. Which techniques exist in the literature?
Number of identified techniques is relatively high compared to the number of studies
32 techniques in 38 studies covering 28 papers!
18 techniques appear in one study!
Represented in no. of studies    Number of techniques
12                               1
7                                1
6                                0
5                                1
4                                1
3                                3
2                                7
1                                18
Total                            32
Empirically Studied Relations Between Techniques
2. How can these Techniques be Classified?
• No commonly accepted classification scheme was identified.
• Information in the selected papers was used to identify important properties assigned to the techniques.
Safe / Unsafe
• Source Code Text
• Intermediate code/machine code
  • for Virtual machine
• Specific Data Format
• UML/Metadata input
• Programming language paradigm: OOP or Procedural
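To illustrate the safe/unsafe distinction: a safe technique selects every test that could reveal a fault caused by the change, while an unsafe one may drop some of them to save cost. The sketch below is one simple coverage-based way to aim for safety, not a technique taken from the reviewed papers. The function name, test names, and entity names are hypothetical.

```python
def select_modification_traversing(coverage, changed):
    """Select every test whose recorded coverage touches a changed entity.

    Assumption: this is safe only if the coverage data is complete and
    every change maps to one of the listed entities.
    """
    return sorted(test for test, covered in coverage.items() if covered & changed)


# Hypothetical coverage map: test name -> code entities it executes.
coverage = {
    "test_login": {"auth.check", "db.read"},
    "test_report": {"report.render", "db.read"},
    "test_export": {"report.render", "io.write"},
}
changed = {"report.render"}

selected = select_modification_traversing(coverage, changed)
print(selected)

# An unsafe variant trades fault detection for cost, e.g. by capping the size.
unsafe_selected = selected[:1]
print(unsafe_selected)
```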
Threats to validity
• Construct validity
  • Terminology
  • Might miss other relevant studies!
• Reliability
  • Can the data collection be repeated by others?
  • Issues with research databases
    • Non-deterministic search results
• Internal Validity
  • Analysis of data
    • mostly qualitative
• External Validity
  • Can the study be generalized to a full industry context?
3-4. The Empirical Evidence
• 38 studies (23 experiments and 15 case studies)
• Half of the experiments are conducted on the same set of small programs, referred to as the Siemens programs.
• Few large-scale real-life evaluations
Experiment: Deliberate introduction of an intervention
Case Study: An empirical inquiry within a real-life context
3-4. The Empirical Evidence
Evaluated Metrics                                      Number   %
Cost reduction
  Test suite reduction                                 29       76
  Test execution time                                  7        18
  Test selection time                                  5        13
  Total time                                           16       42
  Precision (omission of non-fault revealing tests)    1        3
Ability to detect faults
  Relative fault detection effectiveness               5        13
  Absolute fault detection effectiveness               8        21
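The % column appears to give each count as a share of the 38 studies. A quick check under that assumption (the dictionary keys are shortened labels, not the authors' wording):

```python
TOTAL_STUDIES = 38  # 23 experiments + 15 case studies

# Number of studies reporting each metric, as listed on the slide.
counts = {
    "test suite reduction": 29,
    "test execution time": 7,
    "test selection time": 5,
    "total time": 16,
    "precision": 1,
    "relative fault detection effectiveness": 5,
    "absolute fault detection effectiveness": 8,
}

percentages = {
    metric: round(100 * n / TOTAL_STUDIES) for metric, n in counts.items()
}
for metric, pct in percentages.items():
    print(f"{metric}: {counts[metric]} studies, {pct}%")

# The counts add up to more than 38, so studies report more than one metric.
total_reports = sum(counts.values())
print(total_reports)
```

The rounded values reproduce the column, and the percentages do not sum to 100 because a single study can evaluate several metrics.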
Challenge?
Inclusiveness, Precision, Efficiency, Generality
Fault detection and precision
Space and time requirements
Theoretical Reasoning
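The slide names inclusiveness and precision without defining them. A minimal sketch, assuming the usual definitions from the regression test selection literature: inclusiveness is the share of modification-revealing tests that are selected, and precision is the share of non-modification-revealing tests that are omitted. The function names and test identifiers are hypothetical, and treating an empty denominator as 100% is an assumed convention.

```python
def inclusiveness(selected, revealing):
    """Percentage of modification-revealing tests that were selected."""
    if not revealing:
        return 100.0  # assumed convention when nothing can be missed
    return 100.0 * len(selected & revealing) / len(revealing)


def precision(suite, selected, revealing):
    """Percentage of non-modification-revealing tests that were omitted."""
    non_revealing = suite - revealing
    if not non_revealing:
        return 100.0  # assumed convention when nothing can be wasted
    omitted = non_revealing - selected
    return 100.0 * len(omitted) / len(non_revealing)


suite = {f"t{i}" for i in range(1, 11)}  # hypothetical suite of 10 tests
revealing = {"t1", "t2"}                 # tests that expose the change
selected = {"t1", "t2", "t3", "t4"}      # what some technique picked

print(inclusiveness(selected, revealing))     # 100.0
print(precision(suite, selected, revealing))  # 75.0
```

Under these definitions a safe technique is one with 100% inclusiveness, and retest-all is safe but has 0% precision whenever some test is not modification-revealing.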
Techniques studied in detail
T2: Most efficient in reducing time and/or test cases. “Unsafe!”
T6: Safe. Takes too long!
But later found to run well! This shows the importance of the regression testing context.
Experiences
• Varying empirical quality
• Few replications
• Benchmarking problem: what criteria define a good Regression Test Selection technique?
• Different measurements/metrics
• Different reporting of evaluation contexts
• No clear definition of what constitutes a technique
• Great variance in uniqueness of the techniques in the papers (novel or variant)
• No clear difference between the specification of a technique and its implementation
• Different levels of abstraction
What did the study aim to achieve?
• Most techniques are not evaluated.
• Cannot make a decision based on research alone!
But
• Can the existing literature on RTS techniques provide a basis for selecting an RTS method for a given system?
Conclusions: A Recap
• 32 empirically evaluated techniques for regression test selection were identified.
• Which may be classified according to:
  • Input needed,
  • Type of code or programming paradigm,
  • Safe/unsafe.
• The empirical basis for differences between the techniques is not very strong
• and hence there is no basis for selecting one superior technique.
Future work
• Agree on what is considered a Regression Test Selection Technique
• Encourage systematic replications
• Agree on which variation factors in the study context are important to report
Thank you!