The Effects of Duration-based Moving Windows with Estimation by Analogy
Sousuke Amasaki* Chris Lokan†
Okayama Prefectural University*
UNSW Canberra†
Mensura 2015 in Kraków, Poland
In Mensura 2012, we focused on Moving Windows for Effort Estimation
with Estimation by Analogy
Project data for training the EbA effort estimation model
A target project to be estimated
Drop old projects (they may be useless); retain recent ones
Window size
A new target
past → future
Conclusion in Mensura 2012 paper
• MW could improve accuracy with EbA
• Weaker effects with EbA than with Linear Regression
Window policies matter
• MW was examined with LR and two policies [IST2014]*
  • Fixed-size: retain N projects in a window
  • Fixed-duration: retain projects within N months
• Results show a difference in accuracy improvement between the policies
* C. Lokan, E. Mendes. Investigating the use of duration-based moving windows to improve software effort prediction: A replicated study, Information and Software Technology 56(9), pp. 1063–1075, 2014.
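The two window policies can be sketched as follows. This is a minimal illustration, not the implementation used in the studies: the `Project` record, its field names, the choice of completion date as the reference point, and the 30.44-day month approximation are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Project:
    """Hypothetical project record; field names are assumptions."""
    name: str
    completed: date   # when the project finished and its effort became known
    effort: float     # actual effort, e.g. in person-hours


def fixed_size_window(history, target_start, n):
    """Retain the N most recently completed projects before the target starts."""
    past = sorted((p for p in history if p.completed < target_start),
                  key=lambda p: p.completed)
    return past[-n:] if n > 0 else []


def fixed_duration_window(history, target_start, months):
    """Retain projects completed within `months` months before the target starts."""
    # Assumption: a month is approximated as 30.44 days.
    horizon_days = months * 30.44
    return [p for p in history
            if 0 < (target_start - p.completed).days <= horizon_days]


history = [
    Project("A", date(2010, 1, 15), 1200.0),
    Project("B", date(2011, 6, 1), 800.0),
    Project("C", date(2012, 3, 10), 1500.0),
    Project("D", date(2012, 11, 20), 950.0),
]
target_start = date(2013, 1, 1)

# Same history, different policies, different training sets.
size_window = fixed_size_window(history, target_start, n=2)
duration_window = fixed_duration_window(history, target_start, months=6)
```

With this toy history, the fixed-size window of 2 projects keeps C and D, while the 6-month fixed-duration window keeps only D: a fixed-size window always holds the same number of projects, whereas a fixed-duration window grows and shrinks with how many projects finished recently.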
Today’s talk is about Duration-based Moving Windows
past future
Fixed-size (Mensura 2012)
Fixed-duration
EbA with pre-selected features (Mensura 2012)
EbA with on-time feature selection (more realistic)
Research Questions
RQ1. Reconfirmation of Mensura 2012 results
Is there a difference in the accuracy of estimates between EbA with pre- and on-time feature selection using fixed-size windows?
RQ2. Evaluation of Fixed-Duration Windows
Is there a difference in the accuracy of estimates with and without MW, using the revised EbA and fixed-duration windows?
RQ3. Comparison between window policies
How do these results compare with results based on fixed-size windows?
The revised EbA
Mensura 2012
• Select features on the basis of the whole dataset
  • Wrapper approach
• Use the simple mean for estimation
Unrealistic: uses future projects
This study
• Select features for every new target project
  • Lasso, to reduce computation costs
• Use the inverse rank weighted mean (IRWM) for estimation
Contributes to estimation accuracy
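The difference between the simple mean and IRWM can be illustrated with a small sketch. It assumes the common definition of IRWM, in which the closest of k analogies gets weight k, the next k-1, and so on down to 1. The Euclidean distance, the single numeric feature, and all function names are assumptions for illustration; feature selection and the handling of categorical predictors are left out.

```python
import math


def euclidean(a, b):
    """Distance between two equally long numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def simple_mean(efforts):
    """Unweighted mean of the neighbours' efforts."""
    return sum(efforts) / len(efforts)


def irwm(efforts_by_rank):
    """Inverse rank weighted mean; efforts must be ordered closest first."""
    k = len(efforts_by_rank)
    weights = range(k, 0, -1)          # k, k-1, ..., 1
    total = k * (k + 1) / 2            # sum of the weights
    return sum(w * e for w, e in zip(weights, efforts_by_rank)) / total


def estimate_by_analogy(past, target_features, k, combine=irwm):
    """past: list of (feature_vector, effort) pairs from finished projects."""
    ranked = sorted(past, key=lambda fe: euclidean(fe[0], target_features))
    neighbours = [effort for _, effort in ranked[:k]]
    return combine(neighbours)


# Hypothetical past projects: ([size in FP], effort).
past = [([100.0], 1000.0), ([200.0], 2200.0), ([400.0], 3900.0), ([800.0], 9000.0)]
estimate_irwm = estimate_by_analogy(past, [210.0], k=3)
estimate_mean = estimate_by_analogy(past, [210.0], k=3, combine=simple_mean)
```

Here the three nearest neighbours, closest first, have efforts 2200, 1000 and 3900. IRWM pulls the estimate toward the closest analogy, while the simple mean treats all three equally.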
Dataset
Properties
• Data quality rated A or B by ISBSG
• Size measured with IFPUG 4.0 or later
• Known actual effort
• Not web projects
• 228 projects
Candidate predictors
• Unadjusted FP
• Language types
• Development types
• Platform types
• Domain sector types
Same as in Mensura 2012
Experiments
Comparisons between:
• Mensura 2012 EbA vs. the revised EbA (for RQ1)
• Growing Portfolio (use all past projects) vs. Moving Windows (for RQ2, RQ3)
Window sizes:
• From 12 to 84 months (fixed-duration)
• From 20 to 120 projects (fixed-size)
Performance trend analysis: preference and statistical significance
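The preference part of such a comparison can be sketched as below. The slides do not define how the differences are normalized, so expressing the difference as a percentage of the growing-portfolio mean is an assumption, as are the function names and the toy numbers. Statistical significance testing is omitted.

```python
def absolute_error(actual, estimate):
    """Absolute error of one estimate."""
    return abs(actual - estimate)


def mre(actual, estimate):
    """Magnitude of relative error of one estimate."""
    return abs(actual - estimate) / actual


def mean(values):
    return sum(values) / len(values)


def percent_difference(window_errors, portfolio_errors):
    """Difference in mean error as a percentage of the growing-portfolio mean.

    Assumed sign convention: positive values favour the growing portfolio,
    negative values favour moving windows.
    """
    gp = mean(portfolio_errors)
    return 100.0 * (mean(window_errors) - gp) / gp


# Hypothetical actual efforts and estimates for three target projects.
actuals = [1000.0, 2000.0, 4000.0]
gp_estimates = [1200.0, 1500.0, 4400.0]   # trained on all past projects
mw_estimates = [1100.0, 1700.0, 4300.0]   # trained on a moving window

gp_ae = [absolute_error(a, e) for a, e in zip(actuals, gp_estimates)]
mw_ae = [absolute_error(a, e) for a, e in zip(actuals, mw_estimates)]
gp_mre = [mre(a, e) for a, e in zip(actuals, gp_estimates)]
mw_mre = [mre(a, e) for a, e in zip(actuals, mw_estimates)]

diff_mae = percent_difference(mw_ae, gp_ae)
diff_mre = percent_difference(mw_mre, gp_mre)
```

In this toy case both differences are negative, which under the assumed convention favours moving windows. Repeating the calculation for each window size gives one point per size on a trend plot.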
Results: fixed-size windows with the revised EbA
[Figure: differences in mean AE (%) and in mean MRE (%) plotted against window size (number of projects, 20 to 120). Panels: (a) Differences in mean MAE, (b) Differences in mean MRE]
Fig. 1: Results with Fixed-size Window, modified EbA with k = 5
Figure 1 and Table 2 revealed characteristics of moving windows compared to the growing portfolio:
– With windows of up to 60 projects, MAE showed no significant preference for any approach. The line starts below zero and quickly goes above zero (favoring the growing portfolio), but the difference was not significant as shown in Fig. 1(a). MRE showed a similar trend, except that moving windows were
• GP was advantageous at smaller window sizes, but not significantly
• MW became significantly advantageous at medium window sizes
Number of neighbors: k = 5
Results: comparisons between the old and the revised EbA
Fig. 1 (repeated from the previous slide): Results with Fixed-size Window, modified EbA with k = 5
Number of neighbors: k = 5
• Trends were the same, but effective sizes and ranges were different
• The best k moved from k=2 (Mensura 2012) to k=5
• The improvement by MW was clearer in statistical significance
Results: fixed-duration windows with the revised EbA
[Figure: differences in mean AE (%) and in mean MRE (%) plotted against window size (calendar months, 20 to 80). Panels: (a) Differences in mean MAE, (b) Differences in mean MRE]
Fig. 2: Results with Fixed-duration Windows, EbA with k = 5
growing portfolio are larger with EbA than with LR, and the range of durations for which windows are advantageous is narrower with EbA than with LR. The difference in advantageous window sizes and their number between EbA and LR were reported in [4]. These observations were common between this study and [4].
• GP was advantageous at smaller window sizes, but not significantly
• MW became significantly advantageous at medium window sizes
• Fewer window sizes were significant than with fixed-size windows
Number of neighbors: k = 5
Results: comparison to the past study [IST2014]
Fig. 2 (repeated from the previous slide): Results with Fixed-duration Windows, EbA with k = 5
Number of neighbors: k = 5
• The overall trend was the same in the two studies
• Fixed-size windows were more effective than fixed-duration windows
• The effective window size became larger and its range narrower
Answers to RQs
RQ1. Reconfirmation of Mensura 2012 results
The change in estimation method made a difference, improving the accuracy of estimates.
RQ2. Evaluation of Fixed-Duration Windows
Fixed-duration windows can make a difference and are effective in improving estimation accuracy.
RQ3. Comparison between window policies
Both the fixed-size and fixed-duration window policies can lead to significantly better estimation accuracy, but fixed-size windows made a clearer difference.
Practical implications
1. Moving Windows is effective
This and past studies showed its effectiveness with two major effort estimation methods, LR and EbA.
2. Fixed-size policy looks better for estimation
This and past studies showed a clearer difference when using fixed-size windows. Practitioners may need to rethink how they regard reference projects.
3. Effective window sizes might differ even among practitioners
EbA resembles practitioners’ thinking. The fact that different EbA options resulted in different window ranges partly explains the differences among practitioners.
Threats to Validity
Dataset
• The results were based only on the ISBSG dataset
• It is difficult to generalize the results
EbA
• Limited to specific options
• More accurate or more realistic settings may exist
Conclusion
• Fixed-duration windows work with EbA
  • Under a more realistic situation
• The results brought some practical implications
  • e.g., the fixed-size policy is more suitable
Future Work
• Exploration of EbA options
• Additional experiments on other datasets
We welcome questions!
Contact info:
Sousuke Amasaki: [email protected]‐pu.ac.jp
Chris Lokan: [email protected]