
The Effects of Duration-based Moving Windows with Estimation by Analogy

Sousuke Amasaki* Chris Lokan†

Okayama Prefectural University*

UNSW Canberra†

Mensura 2015 in Krakow, Poland

In Mensura 2012, we focused on Moving Windows for Effort Estimation with Estimation by Analogy


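[Diagram: a timeline from past to future. Project data inside the window are retained for training the EbA effort estimation model for a target project to be estimated. Old projects are dropped, since they may be useless. The window, of a given window size, moves forward with each new target.]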

Conclusion in Mensura 2012 paper
• MW could improve accuracy with EbA
• Weaker effects with EbA than with Linear Regression

Window policies matter


• MW was examined with LR and two policies [IST2014]*
  • Fixed-size
    • Retain N projects in a window
  • Fixed-duration
    • Retain projects within N months
• Results show the difference in accuracy improvement

* C. Lokan, E. Mendes. Investigating the use of duration-based moving windows to improve software effort prediction: A replicated study, Information and Software Technology 56(9), pp. 1063–1075, 2014.
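The two window policies can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Project` record, its field names, the use of completion dates to order projects, and the 30.44-day approximation of a month are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Project:
    name: str        # hypothetical identifier
    completed: date  # assumption: completion date orders the projects
    effort: float    # actual effort, e.g. in person-hours


def past_projects(history, target_start):
    """Projects completed before the target starts, oldest first."""
    done = [p for p in history if p.completed < target_start]
    return sorted(done, key=lambda p: p.completed)


def fixed_size_window(history, target_start, n):
    """Fixed-size policy: retain the N most recently completed projects."""
    past = past_projects(history, target_start)
    return past[-n:] if n > 0 else []


def fixed_duration_window(history, target_start, months):
    """Fixed-duration policy: retain projects completed within the last N months."""
    # Assumption: a month is approximated as 30.44 days.
    cutoff = target_start - timedelta(days=round(months * 30.44))
    return [p for p in past_projects(history, target_start) if p.completed >= cutoff]


history = [
    Project("A", date(2010, 1, 15), 1200.0),
    Project("B", date(2011, 6, 1), 800.0),
    Project("C", date(2012, 3, 10), 1500.0),
    Project("D", date(2012, 11, 20), 950.0),
]
target_start = date(2013, 1, 1)

size_window = [p.name for p in fixed_size_window(history, target_start, 2)]
duration_window = [p.name for p in fixed_duration_window(history, target_start, 24)]
```

With this toy history, the fixed-size window of 2 projects holds C and D, while the 24-month window holds B, C and D. A fixed-size window always contains the same number of projects, whereas the number of projects in a fixed-duration window depends on how many were completed in that period.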


Today's talk is about Duration-based Moving Windows


[Diagram: a timeline from past to future, contrasting the window policies and the EbA variants]

• Fixed-size (Mensura 2012)
• Fixed-duration

• EbA with pre-selected features (Mensura 2012)
• EbA with on-time feature selection (for reality)

Research Questions


RQ1. Reconfirmation of Mensura 2012 results

Is there a difference in the accuracy of estimates between EbA with pre- and on-time selection using fixed-size windows?

RQ2. Evaluation of Fixed-Duration Windows

Is there a difference in the accuracy of estimates with and without MW, using the revised EbA and fixed-duration windows?

RQ3. Comparison between window policies

How do these results compare with results based on fixed-size windows?

The revised EbA


Mensura 2012
• Select features on the basis of the whole dataset
  • Wrapper approach
• Use simple mean for estimation
→ Unrealistic, because it uses future projects

This study
• Select features for every new target project
  • Lasso, to reduce computation costs
• Use inverse rank weighted mean (IRWM) for estimation
→ Contributes to estimation accuracy
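The difference between the two estimation rules can be sketched as follows. This is only an illustration under stated assumptions: analogies are found by Euclidean distance over numeric features that are already scaled, the data layout and sample values are hypothetical, and feature selection (wrapper or Lasso) is left out entirely. IRWM is taken here in its usual form, where the closest of k analogies gets weight k, the next k-1, and so on down to 1.

```python
import math


def nearest_analogies(target, past, k):
    """Return the efforts of the k past projects closest to the target.

    `target` is a feature vector and `past` is a list of (features, effort)
    pairs. Euclidean distance on pre-scaled features is an assumption.
    """
    ranked = sorted(past, key=lambda fe: math.dist(target, fe[0]))
    return [effort for _, effort in ranked[:k]]


def simple_mean(efforts):
    """Unweighted mean of the analogies' efforts."""
    return sum(efforts) / len(efforts)


def inverse_rank_weighted_mean(efforts):
    """IRWM over efforts ordered from closest to farthest analogy."""
    k = len(efforts)
    weights = range(k, 0, -1)  # k for the closest analogy, 1 for the farthest
    return sum(w * e for w, e in zip(weights, efforts)) / sum(weights)


# Hypothetical past projects: (scaled features, actual effort)
past = [
    ((0.10, 0.20), 100.0),
    ((0.15, 0.25), 200.0),
    ((0.40, 0.50), 400.0),
    ((0.90, 0.90), 900.0),
]
target = (0.12, 0.20)

efforts = nearest_analogies(target, past, k=3)
mean_estimate = simple_mean(efforts)
irwm_estimate = inverse_rank_weighted_mean(efforts)
```

Here the three closest analogies have efforts 100, 200 and 400. The simple mean is 700/3, while IRWM gives (3·100 + 2·200 + 1·400)/6 = 1100/6, pulling the estimate toward the closest analogy.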

Dataset


Properties
• High quality: rated A or B by ISBSG
• Size measured with IFPUG 4.0 or later
• Known actual effort
• Not web projects
• 228 projects

Candidate predictors
• Unadjusted FP
• Language type
• Development type
• Platform type
• Domain sector type

Same as in Mensura 2012

Experiments


Comparisons between:
• Mensura 2012 EbA vs. the revised EbA (for RQ1)
• Growing Portfolio (use all past projects) vs. Moving Windows (for RQ2, RQ3)

Each comparison is assessed by preference and by statistical significance.

Performance trend analysis
• From 12 to 84 months (fixed-duration)
• From 20 to 120 projects (fixed-size)
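A comparison between the growing portfolio and a moving window could be computed along these lines. This is a sketch under assumptions, not the procedure used in the study: absolute error is taken as |actual - estimate|, MRE as absolute error divided by actual effort, and the difference is expressed as a percentage relative to the growing portfolio, so that a positive value favors the growing portfolio. All numbers are hypothetical, and the significance test is omitted.

```python
def absolute_errors(actuals, estimates):
    """Absolute error per project: |actual - estimate|."""
    return [abs(a - e) for a, e in zip(actuals, estimates)]


def relative_errors(actuals, estimates):
    """Magnitude of relative error per project: |actual - estimate| / actual."""
    return [abs(a - e) / a for a, e in zip(actuals, estimates)]


def mean(values):
    return sum(values) / len(values)


def percent_difference(window_mean, portfolio_mean):
    """Assumed convention: positive favors the growing portfolio."""
    return 100.0 * (window_mean - portfolio_mean) / portfolio_mean


# Hypothetical actual efforts and estimates for three target projects
actuals = [1000.0, 500.0, 2000.0]
gp_estimates = [1300.0, 400.0, 1500.0]  # growing portfolio
mw_estimates = [1100.0, 450.0, 1700.0]  # moving window

gp_mae = mean(absolute_errors(actuals, gp_estimates))
mw_mae = mean(absolute_errors(actuals, mw_estimates))
mae_diff = percent_difference(mw_mae, gp_mae)

gp_mmre = mean(relative_errors(actuals, gp_estimates))
mw_mmre = mean(relative_errors(actuals, mw_estimates))
mmre_diff = percent_difference(mw_mmre, gp_mmre)
```

Repeating this for each window size (20 to 120 projects, or 12 to 84 months) would give one point per size, which is the kind of trend the result plots show.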

Results: fixed-size windows with the revised EbA


Fig. 1: Results with Fixed-size Window, modified EbA with k = 5. (a) Differences in mean MAE, (b) Differences in mean MRE. x-axis: window size (number of projects), 20 to 120; y-axes: differences in mean AE (%) and differences in mean MRE (%).

Figure 1 and Table 2 revealed characteristics of moving windows compared to the growing portfolio:

- With windows of up to 60 projects, MAE showed no significant preference for any approach. The line starts below zero and quickly goes above zero (favoring the growing portfolio), but the difference was not significant, as shown in Fig. 1(a). MRE showed a similar trend, except that moving windows were …




• GP was advantageous at smaller window sizes, but not significantly
• MW became significantly advantageous at medium window sizes

Number of neighbors: k = 5

Results: comparisons between the old and the revised EbA



[Figure: two panels plotting differences in mean AE (%) and differences in mean MRE (%) against window size, with x-axis ticks from 20 to 120]






• Trends were the same, but effective sizes and ranges were different
• The best k moved from k=2 (Mensura 2012) to k=5
• The improvement by MW was clearer in statistical significance

Results: fixed-duration windows with the revised EbA



Fig. 2: Results with Fixed-duration Windows, EbA with k = 5. x-axis: window size (calendar months), 20 to 80; y-axes: differences in mean AE (%) and differences in mean MRE (%).

… growing portfolio are larger with EbA than with LR, and the range of durations for which windows are advantageous is narrower with EbA than with LR. The differences in advantageous window sizes and their number between EbA and LR were reported in [4]. These observations were common between this study and [4].




• GP was advantageous at smaller window sizes, but not significantly
• MW became significantly advantageous at medium window sizes
• Fewer window sizes were significant than with fixed-size windows


Results: comparison to the past study [IST2014]



[Figure: two panels plotting differences in mean AE (%) and differences in mean MRE (%) against window size, with x-axis ticks from 20 to 80]





• The overall trend was the same between the two studies
• Fixed-size windows were more effective than fixed-duration windows
• The effective window size became larger and its range became narrower

Answers to RQs


RQ1. Reconfirmation of Mensura 2012 results

The change in estimation method made a difference, improving the accuracy of estimates.

RQ2. Evaluation of Fixed-Duration Windows

Fixed-duration windows can make a difference and are effective in improving estimation accuracy.

RQ3. Comparison between window policies

Both the fixed-size and fixed-duration window policies can lead to significantly better estimation accuracy, but fixed-size windows made a clearer difference.

Practical implications


1. Moving Windows is effective

This and past studies showed its effectiveness with major effort estimation methods, LR and EbA.

2. Fixed-size policy looks better for estimation

This and past studies showed a clearer difference when using fixed-size windows. Practitioners may need to rethink how they choose reference projects.

3. Effective window sizes might be different even among practitioners

EbA resembles practitioners' thinking. The fact that different options resulted in different window ranges partly explains the difference among practitioners.

Threats to Validity


Dataset
• The results were based only on the ISBSG dataset
  • It is difficult to generalize the results

EbA
• Limited to specific options
  • There may be more accurate or more realistic settings

Conclusion

• Fixed-duration windows work with EbA
  • Under a more realistic situation
• The results brought some practical implications
  • e.g., the fixed-size policy is more suitable

Future Work

• Exploration of EbA options
• Additional experiments on other datasets


We welcome questions!

Contact info:
Sousuke Amasaki: [email protected]-pu.ac.jp
Chris Lokan: [email protected]