
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Estimation Models"

Ye Yang, Lang Xie, Zhimin He, Qi Li, Vu Nguyen, Barry Boehm and Ricardo Valerdi.
Page 1: Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Estimation Models"

Institute of Software, Chinese Academy of Sciences

Local Bias and its Impacts on the Performance of Parametric Estimation Models

Ye Yang, Lang Xie, Zhimin He (ISCAS)

Qi Li, Vu Nguyen, Barry Boehm (USC)

Ricardo Valerdi (MIT/Univ. of Arizona)

Sep. 21, 2011

Promise 2011, Banff, Canada

Outline

• Background
• Research questions
• Measuring local bias
• Measuring the impacts of local bias
• Handling local bias
• Conclusions and future work

Background

• Continuously calibrated and validated parametric models are necessary for realistic software estimates.

[Figure: roles involved with a parametric model: model user, model maintainer, model researcher]

Background (Cont.)

• Typical parametric models are calibrated over a broad range of industry data.
• Local calibration is advocated to improve accuracy over the default model calibration.
• Pros and cons of local calibration (local tuning)
  • Pros: better model performance
  • Cons: the locally tuned model is less likely to remain fully compliant with the general model

Background (Cont.)

• The evolution cycle of a parametric model

[Figure: evolution cycle of a parametric model, with stages Model Building, Model Calibration, Model Localization, and Model Usage; labels: general assumptions, underlying model, calibration data, local data, local assumptions, historical data, model updates]

• Mismatches between “general assumptions” and “local assumptions”
  • The resultant tuning variance has attracted increasing research attention
  • Counter-intuitive calibration results
  • Challenges in using unbalanced datasets to develop and evaluate the general model

Example: COCOMO II model

• COCOMO II model:

$$Effort = A \times Size^{\,B + 0.01 \times \sum_{i=1}^{5} SF_i} \times \prod_{j=1}^{17} EM_j$$

• Range of local tuning parameters:
  • Yang and Clark (CII database experience): 1 <= A <= 4
  • Menzies: (2.2 <= A <= 9.18) ∧ (0.88 <= B <= 1.09)

[Figure: plot with axes Ln_effort and Ln_Size]
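The COCOMO II effort equation can be written as a small function. This is a minimal sketch, not an official implementation: the function name `cocomo2_effort` and its argument names are made up for illustration, the defaults A = 2.94 and B = 0.91 are the values quoted later in the deck, and the inputs in the usage line are hypothetical (all scale factors zero, all effort multipliers one), not calibrated ratings.

```python
import math

A_DEFAULT = 2.94  # default constant A quoted in the deck
B_DEFAULT = 0.91  # default constant B quoted in the deck


def cocomo2_effort(size_ksloc, scale_factors, effort_multipliers,
                   a=A_DEFAULT, b=B_DEFAULT):
    """Effort = A * Size^(B + 0.01 * sum(SF)) * prod(EM)."""
    if len(scale_factors) != 5:
        raise ValueError("COCOMO II uses 5 scale factors")
    if len(effort_multipliers) != 17:
        raise ValueError("this form of COCOMO II uses 17 effort multipliers")
    exponent = b + 0.01 * sum(scale_factors)
    return a * size_ksloc ** exponent * math.prod(effort_multipliers)


# Hypothetical inputs: zero scale factors and unit multipliers reduce the
# model to Effort = A * Size^B, the part that local calibration tunes.
baseline = cocomo2_effort(100.0, [0.0] * 5, [1.0] * 17)
```

Keeping A and B as keyword arguments makes it easy to compare the default model with a locally tuned one by passing A' and B' for the same project description.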

Research questions

• Research questions:
  • Is there a way to measure the local bias?
  • As historical data accumulates from multiple companies, how will the associated local bias impact the performance of the general parametric estimation model?
  • Are there any correlation patterns between local bias and model performance variation?
• Assumptions:
  • The general parametric model has a structure similar to that of COCOMO II.
  • In the model localization stage, constant A and constant B are tuned with local data.
  • In the model usage stage, the locally calibrated A and B are used for project estimation.

Local Bias Definition

• Local bias: the degree of deviation between a local model and the general model.
• In the context of the CII (COCOMO II) model:

$$localbias = \left| \ln\frac{Effort'}{Effort} \right| = \left| \ln\frac{A'}{A} + (B' - B)\ln(Size) \right|$$

where A’ and B’ are the model parameters calibrated from the local data of each organization, A and B are the default constant values of the COCOMO II model (A=2.94, B=0.91), and Size is fixed at a standard size of 100 KLOC to normalize local bias.
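The step from the effort ratio to the expression in A, B, and Size can be spelled out. This is a short derivation sketch, assuming (as the deck's assumptions state) that only A and B are tuned locally, so the scale factors SF_i and effort multipliers EM_j are identical in the local and general models and cancel in the ratio:

```latex
\begin{aligned}
Effort' &= A' \cdot Size^{\,B' + 0.01\sum_{i=1}^{5} SF_i} \cdot \prod_{j=1}^{17} EM_j \\
Effort  &= A  \cdot Size^{\,B  + 0.01\sum_{i=1}^{5} SF_i} \cdot \prod_{j=1}^{17} EM_j \\
\frac{Effort'}{Effort} &= \frac{A'}{A} \cdot Size^{\,B' - B} \\
\ln\frac{Effort'}{Effort} &= \ln\frac{A'}{A} + (B' - B)\ln(Size)
\end{aligned}
```

Taking the absolute value of the last line gives the local bias, which is why it depends only on the two tuned constants and the chosen standard size.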

Summary of Dataset

[Table/figure: dataset summary; labels: CII 2000 Subset, After2000 Subset, CII 2010 Dataset]

Analysis procedure

• Break the After2000 subset into 10 subsets, grouped by Organization_ID.
• Conduct representative local calibration to produce A’ and B’ for each subset.
• Calculate local bias and compare among groups.

[Figure: the CII 2010 Dataset consists of the CII 2000 Subset (default constants A, B) and the After2000 Subset; the After2000 Subset is grouped by Organization_ID into Subset1, Subset2, ..., Subsetn, giving (A1’, B1’), (A2’, B2’), ..., (An’, Bn’) and local_bias1, local_bias2, ..., local_biasn]
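The procedure above could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' calibration method: the deck does not say how A’ and B’ are fitted, so ordinary least squares on log-transformed data is assumed; scale factors and effort multipliers are treated as nominal, so the local model reduces to Effort = A’ * Size^B’; and the record layout `(org_id, size_kloc, effort)` and all function names are hypothetical.

```python
import math
from collections import defaultdict

A_DEFAULT, B_DEFAULT = 2.94, 0.91  # default constants quoted in the deck
STANDARD_SIZE_KLOC = 100.0         # standard size used to normalize local bias


def calibrate(projects):
    """Fit ln(effort) = ln(A') + B' * ln(size) by least squares (assumed method).

    Needs at least two projects with different sizes.
    """
    xs = [math.log(size) for size, _ in projects]
    ys = [math.log(effort) for _, effort in projects]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b_local = sxy / sxx
    a_local = math.exp(mean_y - b_local * mean_x)
    return a_local, b_local


def local_bias(a_local, b_local, a=A_DEFAULT, b=B_DEFAULT,
               size=STANDARD_SIZE_KLOC):
    """|ln(A'/A) + (B' - B) * ln(Size)| at the standard size."""
    return abs(math.log(a_local / a) + (b_local - b) * math.log(size))


def bias_by_organization(records):
    """Group (org_id, size, effort) records, calibrate each group, score it."""
    groups = defaultdict(list)
    for org_id, size, effort in records:
        groups[org_id].append((size, effort))
    return {org: local_bias(*calibrate(projs)) for org, projs in groups.items()}


# Hypothetical data: org1 follows the default model exactly, org2 always
# needs twice the effort, so its local bias should come out as ln(2).
sizes = (10.0, 50.0, 200.0)
records = [("org1", s, 2.94 * s ** 0.91) for s in sizes]
records += [("org2", s, 5.88 * s ** 0.91) for s in sizes]
biases = bias_by_organization(records)
```

A real calibration would first remove the contribution of each project's scale factors and effort multipliers from the observed effort before fitting; that step is omitted here to keep the sketch short.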

Measuring local bias - Results

[Table: parameters of local models]

[Figure: local bias of each group]

• Local A and B differ in each group, indicating that local calibration introduces local bias.
• Local bias varies across groups, ranging from 0.06 to 2.25.
  • E.g., in group 9, the ratio between the local model’s estimates and the CII model’s estimates is as large as EXP(2.25)=9.49, considering a normal project size of 100 KSLOC.
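Because local bias is the absolute log of the effort ratio, exponentiating it recovers the multiplicative gap between the two models. A small check of the arithmetic for the two extremes reported above (the helper name `bias_to_ratio` is made up for this example):

```python
import math


def bias_to_ratio(local_bias):
    """Multiplicative gap between local and general estimates for a given bias."""
    return math.exp(local_bias)


for bias in (0.06, 2.25):  # smallest and largest local bias reported
    print(f"local bias {bias:.2f} -> ratio {bias_to_ratio(bias):.2f}")
# prints:
# local bias 0.06 -> ratio 1.06
# local bias 2.25 -> ratio 9.49
```

Since the definition takes an absolute value, the ratio alone does not say which of the two models produces the larger estimate.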

Measuring the impacts of local bias

• Performance assessment
  • Basic performance indicators: MMRE (mean MRE), stdMRE (standard deviation of MRE, i.e. its variability)
  • Assessment procedure:
    1. Split the data set into a training set and a test set.
    2. Tune the model parameters on the training set.
    3. Evaluate model performance on the test set, recording MMRE and stdMRE.
    4. Repeat the above steps 2000 times, giving 2000 (MMRE, stdMRE) pairs.
    5. Summarize the pairs as Average MMRE, Range of MMRE, Average stdMRE, and Range of stdMRE.
  • Average MMRE, Range of MMRE, Average stdMRE, and Range of stdMRE are used to assess the performance of an estimation model.
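The assessment loop could look roughly like the sketch below. Several details are assumptions because the deck does not give them: the split ratio (two thirds for training here), the tuning method (least squares on log-transformed data for a reduced model Effort = A * Size^B), the use of the population standard deviation for stdMRE, and "range" taken as maximum minus minimum. MRE uses the usual definition |actual - estimate| / actual. All names are hypothetical.

```python
import math
import random
import statistics


def fit_log_linear(train):
    """Fit ln(effort) = ln(A) + B * ln(size); needs at least two distinct sizes."""
    xs = [math.log(size) for size, _ in train]
    ys = [math.log(effort) for _, effort in train]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return math.exp(mean_y - b * mean_x), b


def mre(actual, estimate):
    """Magnitude of relative error for one project."""
    return abs(actual - estimate) / actual


def assess(projects, repeats=2000, train_fraction=2 / 3, seed=0):
    """Repeated random holdout; returns the four summary indicators."""
    n_train = int(len(projects) * train_fraction)
    if n_train < 2 or n_train >= len(projects):
        raise ValueError("need at least 2 training and 1 test project")
    rng = random.Random(seed)
    mmres, std_mres = [], []
    for _ in range(repeats):
        shuffled = list(projects)
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        a, b = fit_log_linear(train)
        mres = [mre(effort, a * size ** b) for size, effort in test]
        mmres.append(statistics.mean(mres))
        std_mres.append(statistics.pstdev(mres))
    return {
        "avg_mmre": statistics.mean(mmres),
        "range_mmre": max(mmres) - min(mmres),
        "avg_std_mre": statistics.mean(std_mres),
        "range_std_mre": max(std_mres) - min(std_mres),
    }
```

Fixing the random seed keeps the 2000 splits reproducible, which matters when the ranges of MMRE and stdMRE are themselves the quantities being compared across data sets.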

Analysis procedure

• First, for each group ss_i in the After2000 subset:
  1. Combine ss_i with the CII 2000 data set to produce a new data set ds_i;
  2. Assess model performance on data set ds_i and record the values of the performance indicators.
• Then conduct correlation analysis between local bias and model performance.

[Figure: each subset SS1, SS2, ... is combined with the CII 2000 subset to obtain its performance, which is paired with the subset's local bias; the pairs feed into the correlation analysis]

Results

Model performance

          CII 2000   CII 2010
MMRE      0.3478     0.4063
StdMRE    0.3261     0.3401

• Model performance decreases as new subsets are introduced.
• This reflects the uncertainty inherent in model performance when adding just a small group of new data points to the CII 2000 baseline dataset.

Measuring the impacts of local bias (cont.)

• Spearman correlation coefficients between local bias and model performance:

[Table: Spearman correlation coefficients]

• At the significance level of p < 0.05, the range of stdMRE is significantly positively correlated with local bias and with local_bias*num (local bias multiplied by the number of data points in the group). Both the average stdMRE and the average MMRE are significantly positively correlated with local_bias*num.
• Range of stdMRE reflects the uncertainty of model performance. Hence, the bigger the local bias is, the weaker the performance is.
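For reference, a Spearman coefficient is the Pearson correlation of the ranks of the two series. The sketch below is a plain-Python illustration (tied values receive their average rank); it does not compute the p-value that the significance statement relies on, and the function names are made up for the example. The input series in the usage line are arbitrary demonstration values, not results from the study.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for a constant series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation coefficient of two equal-length series."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Arbitrary demonstration values, one tie in the second series.
rho = spearman([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
```

In the study's setting, one series would hold the local bias (or local_bias*num) of each group and the other a performance indicator such as the range of stdMRE for the corresponding combined data set.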

Discussions

• Two types of measures
  • Local bias: useful to bridge the potential gaps between the “model building” stage and the “model localization” stage
  • Performance measures: the range and average of MMRE and stdMRE are easy to produce, reflecting a certain profile of the bias’s influence
• Two components drive the decreased model performance: the degree of local bias and the number of data points associated with each additional group

Implications to Parametric Model Calibration

• Previous approaches
  • Data pre-processing: reducing factors, removing outliers, etc.
  • Regression-based approaches: variants of standard linear regression, incorporating a priori knowledge
  • Machine learning approaches mainly focus on optimizing model accuracy
• Need to pay attention to balancing accuracy and stability

Threats to Validity

• Other sources of bias? Chronological bias, influences of new technologies, etc.
• Other performance indicators? PRED, MRE, etc.
• Other parametric models?

Ongoing work on handling local bias

• Assumption: a local historical data set with higher local bias presents a more different pattern for cost estimation, and it should be assigned a lower weight when used for model calibration.
• Constraints for the weight distribution function Weight = F(LocalBias):
  • IF LocalBias = 0, THEN Weight = 1;
  • IF LocalBias → +∞, THEN Weight → 0;
  • F should be a decreasing function on the interval [0, +∞).
• Three functions:

$$F_1:\ Weight = \frac{1}{1 + LocalBias}$$

$$F_2:\ Weight = \frac{1}{1 + \ln(1 + LocalBias)}$$

$$F_3:\ Weight = \frac{1}{e^{LocalBias}}$$
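Candidate weight functions that meet the three constraints can be tried out in a few lines. The forms below are reconstructions and should be read as assumptions, in particular the argument of the logarithm in the second one, which is taken as 1 + LocalBias because that choice gives a weight of exactly 1 at zero bias. Function names are made up for the example.

```python
import math


def weight_inverse(local_bias):
    """Weight = 1 / (1 + LocalBias)."""
    return 1.0 / (1.0 + local_bias)


def weight_inverse_log(local_bias):
    """Weight = 1 / (1 + ln(1 + LocalBias)); the log argument is an assumption."""
    return 1.0 / (1.0 + math.log1p(local_bias))


def weight_exponential(local_bias):
    """Weight = 1 / e^LocalBias."""
    return math.exp(-local_bias)


CANDIDATES = (weight_inverse, weight_inverse_log, weight_exponential)

# Compare how quickly each candidate discounts a group, using the smallest
# and largest local bias values reported earlier in the deck.
for f in CANDIDATES:
    print(f.__name__, round(f(0.06), 3), round(f(2.25), 3))
```

All three start at 1 and decrease toward 0, but at very different rates: the exponential form discounts high-bias groups most aggressively, while the logarithmic form is the most lenient.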

Conclusions

• Providing a definition for consistently understanding and measuring local bias;
• The impact assessment and correlation analysis verify that local bias can be harmful to general model performance;
• Offering insights to ease parametric model evolution by identifying and avoiding local bias early on, in the data collection stage;
• A better approach to handling local bias is needed, e.g. employing a machine learning approach to learn local bias and to learn how to improve the model structure to counteract the bias.

Thank you!

Contact:

Ye Yang ([email protected])