Efficient Time-Aware Prioritization with Knapsack Solvers
Sara Alspaugh, Kristen R. Walcott, Mary Lou Soffa (University of Virginia)
Michael Belanich, Gregory M. Kapfhammer (Allegheny College)
ACM WEASEL Tech, November 5, 2007
Dec 31, 2015
Test Suite Prioritization
Testing occurs throughout software development life cycle
  Challenge: time consuming and costly
Prioritization: reordering the test suite
  Goal: find errors sooner in testing
  Doesn’t consider the overall time budget
Alternative: time-aware prioritization
  Goal 1: find errors sooner in testing
  Goal 2: execute within time constraint
Motivating Example
Original test suite with fault information
  T2: 1 fault, 2 min.
  T4: 6 faults, 2 min.
  T3: 2 faults, 2 min.
  T1: 4 faults, 2 min.
Assume:
  - Same execution time
  - Unique faults found
Prioritized test suite
  T4: 6 faults, 2 min.
  T1: 4 faults, 2 min.
  T2: 1 fault, 2 min.
  T3: 2 faults, 2 min.
Testing time budget: 4 minutes
The Knapsack Problem for Time-Aware Prioritization
Maximize: $\sum_{i=1}^{n} c_i \cdot x_i$, where $c_i$ is the code coverage of test $i$ and $x_i$ is either 0 or 1.
Subject to the constraint: $\sum_{i=1}^{n} t_i \cdot x_i \le t_{max}$,
where $t_i$ is the execution time of test $i$ and $t_{max}$ is the time budget.
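The formulation above can be checked directly on small inputs. The following is a minimal sketch, not one of the solvers evaluated in the talk: it enumerates every 0/1 assignment of the x_i and keeps the best one that fits the budget. The function name `best_selection` and the sample numbers are assumptions made for illustration.

```python
from itertools import product

def best_selection(coverage, time, t_max):
    """Exhaustively search x in {0,1}^n to maximize sum(c_i * x_i)
    subject to sum(t_i * x_i) <= t_max. Only practical for small n."""
    n = len(coverage)
    best_value, best_x = 0, (0,) * n
    for x in product((0, 1), repeat=n):
        total_time = sum(t * xi for t, xi in zip(time, x))
        if total_time > t_max:
            continue  # violates the time constraint
        value = sum(c * xi for c, xi in zip(coverage, x))
        if value > best_value:
            best_value, best_x = value, x
    return best_value, best_x

# Hypothetical input: four tests of 2 minutes each and a 4-minute budget.
value, x = best_selection([4, 1, 2, 5], [2, 2, 2, 2], 4)
```

Exhaustive search grows as 2^n, which is why approximate or specialized solvers are of interest for realistic test suites.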
The Knapsack Problem for Time-Aware Prioritization
  T2: 1 line, 2 min.
  T4: 5 lines, 2 min.
  T3: 2 lines, 2 min.
  T1: 4 lines, 2 min.
Time Budget: 4 min.
Total Value:      0       5       9
Space Remaining:  4 min.  2 min.  0 min.
Assume test cases cover unique requirements.
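The running totals on this slide can be reproduced with a short sketch. It assumes the knapsack is filled in decreasing order of value (the slide does not say which solver produced the sequence), and the name `fill_knapsack` is hypothetical.

```python
def fill_knapsack(tests, budget):
    """tests: list of (name, value, minutes). Adds tests in decreasing
    order of value while they still fit, recording the running
    (total value, space remaining) pairs."""
    remaining = budget
    total = 0
    chosen = []
    trace = [(total, remaining)]
    for name, value, minutes in sorted(tests, key=lambda t: t[1], reverse=True):
        if minutes <= remaining:
            chosen.append(name)
            total += value
            remaining -= minutes
            trace.append((total, remaining))
    return chosen, trace

tests = [("T2", 1, 2), ("T4", 5, 2), ("T3", 2, 2), ("T1", 4, 2)]
chosen, trace = fill_knapsack(tests, 4)
```

Here `trace` holds the (total value, space remaining) pairs after each step, matching the 0, 5, 9 and 4, 2, 0 sequences shown on the slide.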
The Extended Knapsack Problem
Value of each test case depends on test cases already in prioritization
Test cases may cover same requirements
  T2: 1 line, 2 min.
  T4: 5 lines, 2 min.
  T3: 2 lines, 2 min.
  T1: 4 lines, 2 min.
Time Budget: 4 min.
UPDATE (after T4 is added): T1: 0 lines, 2 min.
Total Value:      0       5       7
Space Remaining:  4 min.  2 min.  0 min.
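The UPDATE step can be illustrated by re-valuing each remaining test by the lines it would newly cover. This is a simple greedy sketch under assumed line sets, not the genetic algorithm used by the overlap-aware solver in the talk; the function name and the concrete line numbers are hypothetical and chosen only so that T1 overlaps T4 completely.

```python
def prioritize_with_overlap(coverage_sets, minutes, budget):
    """Greedy selection where a test's value is the number of lines it
    covers that no already-selected test covers."""
    covered = set()
    order = []
    remaining = budget
    candidates = dict(coverage_sets)
    while candidates:
        # UPDATE: recompute each candidate's value against current coverage,
        # considering only tests that still fit in the remaining budget.
        gains = {name: len(lines - covered)
                 for name, lines in candidates.items()
                 if minutes[name] <= remaining}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break  # nothing left adds new coverage
        order.append(best)
        covered |= candidates.pop(best)
        remaining -= minutes[best]
    return order, len(covered)

# Hypothetical line sets consistent with the slide's counts.
coverage_sets = {
    "T2": {8},
    "T4": {1, 2, 3, 4, 5},
    "T3": {6, 7},
    "T1": {1, 2, 3, 4},
}
minutes = {"T2": 2, "T4": 2, "T3": 2, "T1": 2}
order, total = prioritize_with_overlap(coverage_sets, minutes, 4)
```

With these sets, T1 drops to zero new lines once T4 is chosen, so T3 takes the second slot and the total is 7 rather than 9.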
Goals and Challenges
Evaluate traditional and extended knapsack solvers for use in time-aware prioritization
  Effectiveness
    Coverage-based metrics
  Efficiency
    Time overhead
    Memory overhead
How does overlapping code coverage affect results of traditional techniques?
Is the cost of extended knapsack algorithms worthwhile?
The Knapsack Solvers
Random: select test cases at random
Greedy by Ratio: order by coverage/time
Greedy by Value: order by coverage
Greedy by Weight: order by time
Dynamic Programming: break problem into sub-problems; use sub-problem results for main solution
Generalized Tabular: use large tables to store sub-problem solutions
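The three greedy solvers differ only in their sort key. The sketch below assumes that higher ratio and higher coverage come first, and that Greedy by Weight places shorter tests first; the slide says only "order by time", so that direction is an assumption. The helper name `greedy_order` and the sample tests are hypothetical.

```python
def greedy_order(tests, strategy):
    """tests: dict mapping test name -> (coverage, time).
    Returns test names ordered according to the chosen greedy strategy."""
    keys = {
        "ratio": lambda name: tests[name][0] / tests[name][1],  # coverage per unit time
        "value": lambda name: tests[name][0],                   # coverage only
        "weight": lambda name: -tests[name][1],                 # assumed: shortest first
    }
    return sorted(tests, key=keys[strategy], reverse=True)

# Hypothetical tests: name -> (coverage, execution time)
tests = {"A": (10, 4), "B": (9, 3), "C": (2, 1)}
by_ratio = greedy_order(tests, "ratio")
by_value = greedy_order(tests, "value")
by_weight = greedy_order(tests, "weight")
```

On this input the three strategies produce three different orderings, which gives some intuition for why their effectiveness can vary across test suites.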
The Knapsack Solvers (continued)
Core: compute optimal fractional solution then exchange items until optimal integral solution found
Overlap-Aware: uses a genetic algorithm to solve the extended knapsack problem for time-aware prioritization
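For the Dynamic Programming solver listed earlier, the sub-problem idea can be sketched with the textbook 0/1 knapsack table. This is the standard formulation, assumed here for illustration rather than taken from the authors' implementation, and it requires integer execution times. The name `knapsack_dp` is hypothetical.

```python
def knapsack_dp(coverage, time, t_max):
    """Textbook 0/1 knapsack. best[i][w] is the maximum coverage achievable
    using only the first i tests within a time budget of w.
    Assumes positive integer execution times."""
    n = len(coverage)
    best = [[0] * (t_max + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c, t = coverage[i - 1], time[i - 1]
        for w in range(t_max + 1):
            best[i][w] = best[i - 1][w]            # skip test i
            if t <= w and best[i - 1][w - t] + c > best[i][w]:
                best[i][w] = best[i - 1][w - t] + c  # take test i
    # Walk the table backwards to recover which tests were taken.
    chosen, w = [], t_max
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= time[i - 1]
    return best[n][t_max], sorted(chosen)

value, chosen = knapsack_dp([4, 1, 2, 5], [2, 2, 2, 2], 4)
```

The table has (n + 1) × (t_max + 1) entries, so memory grows with both the number of tests and the resolution of the time budget.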
The Scaling Heuristic
Order the test cases by their coverage-to-execution-time ratio such that:
$\frac{c_1}{t_1} \ge \frac{c_2}{t_2} \ge \dots \ge \frac{c_n}{t_n}$
If $c_1 \times \left\lfloor \frac{t_{max}}{t_1} \right\rfloor \ge c_2 \times \left( \frac{t_{max}}{t_2} \right)$, then it is possible to find an optimal solution that includes $T_1$.
Check the inequality for each test case $T_i$ until it no longer holds at some $T_x$, $x \in [1, n]$.
$\langle T_1, \dots, T_{x-1} \rangle$ belong in the final prioritization.
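One possible reading of the heuristic is sketched below. The slide states the inequality only for T1 and T2; comparing each test with its successor in ratio order, and shrinking the budget by the time of tests already fixed, are assumptions made for this sketch. The name `scaling_prefix` is hypothetical.

```python
from math import floor

def scaling_prefix(tests, t_max):
    """tests: list of (name, coverage, time).
    Returns the prefix of tests fixed by the inequality and the budget left.
    Assumption: each test is compared with its successor in ratio order,
    using the budget remaining after earlier tests were fixed."""
    ordered = sorted(tests, key=lambda t: t[1] / t[2], reverse=True)
    fixed = []
    remaining = t_max
    for (name, c_i, t_i), (_, c_next, t_next) in zip(ordered, ordered[1:]):
        if t_i > remaining:
            break  # test no longer fits in the budget
        if c_i * floor(remaining / t_i) >= c_next * (remaining / t_next):
            fixed.append(name)
            remaining -= t_i
        else:
            break  # inequality no longer holds
    return fixed, remaining

# Hypothetical input: (name, coverage, minutes) with a 4-minute budget.
fixed, remaining = scaling_prefix([("A", 10, 2), ("B", 3, 2), ("C", 1, 2)], 4)
```

The tests in `fixed` can be placed in the prioritization up front, leaving a smaller knapsack instance (the remaining tests and `remaining` budget) for the chosen solver.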
Implementation Details
[Figure: architecture diagram. Components: Coverage Calculator, Knapsack Solver, Test Transformer. Labels: Program Under Test (P), Test Suite (T), New Test Suite (T′).]
Knapsack Solver Parameters
  1. Selected Solver
  2. Reduction Preference
  3. Knapsack Size
Evaluation Metrics
Code coverage: Percentage of requirements executed when prioritization is run
  Basic block coverage used
Coverage preservation: Proportion of code covered by prioritization versus code covered by entire original test suite
Order-aware coverage: Considers both the order in which test cases execute and the overall code coverage
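The first two metrics can be sketched as set computations. The representation of requirements as integer IDs, the function names, and the sample data are assumptions; order-aware coverage is left out because the slides do not give its formula.

```python
def code_coverage(selected, coverage_sets, all_requirements):
    """Fraction of all requirements executed by the selected tests."""
    covered = set().union(*(coverage_sets[t] for t in selected))
    return len(covered) / len(all_requirements)

def coverage_preservation(selected, coverage_sets):
    """Coverage of the selected tests relative to the coverage
    achieved by the entire original test suite."""
    covered_by_selected = set().union(*(coverage_sets[t] for t in selected))
    covered_by_suite = set().union(*coverage_sets.values())
    return len(covered_by_selected) / len(covered_by_suite)

# Hypothetical data: 10 requirements, of which the full suite covers 8.
coverage_sets = {
    "T1": {1, 2, 3, 4},
    "T2": {8},
    "T3": {6, 7},
    "T4": {1, 2, 3, 4, 5},
}
all_requirements = set(range(1, 11))
cc = code_coverage(["T4", "T3"], coverage_sets, all_requirements)
cp = coverage_preservation(["T4", "T3"], coverage_sets)
```

Coverage preservation is never lower than code coverage for the same selection, since its denominator excludes requirements that no test in the original suite reaches.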
Experiment Design
Goals of experiment:
  Measure efficiency of algorithms and scaling in terms of time and space overhead
  Measure effectiveness of algorithms and scaling in terms of three coverage-based metrics
Case studies: JDepend, Gradebook
Knapsack Size: 25, 50, and 75% of execution time of original test suite
Summary of Experimental Results
Prioritizer Effectiveness:
  Overlap-aware solver had highest overall coverage for each time limit
  Greedy by Value solver good for Gradebook
  All Greedy solvers good for JDepend
Prioritizer Efficiency:
  All algorithms took small amount of time and memory except for Dynamic Programming, Generalized Tabular, and Core
  Overlap-aware solver required hours to run
  Generalized Tabular had prohibitively large memory requirements
  Scaling heuristic reduced overhead in some cases
Conclusions
Most sophisticated algorithm not necessarily most effective or most efficient
Trade-off: effectiveness versus efficiency
  Efficiency or effectiveness most important?
  If effectiveness: overlap-aware prioritizer
  If efficiency: low-overhead prioritizer
Prioritizer choice depends on test suite nature
  Time versus coverage of each test case
  Coverage overlap between test cases
Future Research
Use larger case studies with bigger test suites
Use case studies written in other languages
Evaluate other knapsack solvers such as branch-and-bound and parallel solvers
Incorporate other metrics such as APFD
Use synthetically generated test suites
Questions?
Thank you!
http://www.cs.virginia.edu/walcott/weasel.html
Case Study Applications
                        Gradebook   JDepend
Classes                 5           22
Functions               73          305
NCSS                    591         1808
Test Cases              28          53
Test Suite Exec. Time   7.008 s     5.468 s