• The need for timely, correct fixes, and tracking.
• Two approaches, MTTR and Age; Tradeoffs.
• Log-transform of data, form of the distribution
• Multiplicative factors lead to the lognormal
• Transformation from rates to age
• Comparison of models
• Implications for management
Mullen Gokhale PROMISE 2008
Problem definition
• Our problem was to characterize and improve software defect repair times in order to improve both the reliability of released networking products and the time-to-market of products under development.
• Repair time runs from the date the defect record is created until the defect is repaired in at least one version.
• Neither the interval before the defect is recorded nor the interval until the fix is distributed is included.
One approach: Mean Time To Repair, MTTR (not today!)
• Little’s Law:
• average wait time = queue length / service rate
• Similar to days of accounts receivable or days of inventory; well understood by management and used as a goal at Cisco
• Both unfixed and recent fixes affect the result
• Integrate both queue length and service rate over 90 days
• Ordinarily track all dispositions, not just fixes
• Suitable for comparing products, teams, etc.
• Retrospective trending can be done using historical data
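The Little's Law calculation above can be illustrated with a minimal sketch. It assumes two daily series over the same window (open-defect counts and resolutions per day); the function name `mttr_days` and the sample figures are hypothetical and are not taken from the talk.

```python
def mttr_days(open_counts, resolved_per_day):
    """Little's Law estimate: average wait = average queue length / average service rate.

    open_counts: number of unresolved defects, sampled once per day over the window.
    resolved_per_day: number of defects resolved on each of the same days.
    """
    if not open_counts or len(open_counts) != len(resolved_per_day):
        raise ValueError("need two non-empty daily series of equal length")
    avg_queue = sum(open_counts) / len(open_counts)
    avg_rate = sum(resolved_per_day) / len(resolved_per_day)
    if avg_rate == 0:
        return float("inf")  # nothing was resolved, so the wait is unbounded
    return avg_queue / avg_rate


# Hypothetical 90-day window: a steady backlog of 120 open defects, 3 resolved per day.
window = 90
print(mttr_days([120] * window, [3] * window))  # 40.0
```

Averaging both series over the whole window is what lets unfixed defects and recent fixes both influence the result.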
Second approach: Measuring age at fix
• Closed bugs: age is interval from creation to fix
• Open bugs: age is from creation to present
• Not studied here; distribution may differ from Closed.
• Average age of open or average age of closed can be erratic if there are outliers
• Controlling variability depends on preventing outliers.
• Data collection: pick a product and a range of time during which > 1000 defects were fixed. Determine the age of each defect at the time it was fixed.
• We included only defects for which there was a fix, not other dispositions.
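As a rough illustration of the data collection step, the sketch below computes the age at fix for a handful of closed defects and summarizes it. The record layout, the dates, and the nearest-rank percentile rule are assumptions made for this example; a real sample would hold more than 1000 fixed defects.

```python
import math
import statistics
from datetime import date


def age_at_fix_days(created, fixed):
    """Age of a closed defect: interval from creation to fix, in days."""
    return (fixed - created).days


def summarize(ages):
    """Median, mean, sample standard deviation, and nearest-rank 85th percentile."""
    ordered = sorted(ages)
    rank = max(1, math.ceil(0.85 * len(ordered)))
    return {
        "median": statistics.median(ordered),
        "mean": statistics.fmean(ordered),
        "stdev": statistics.stdev(ordered),
        "p85": ordered[rank - 1],
    }


# Hypothetical (created, fixed) pairs; only defects with a fix are included.
records = [
    (date(2007, 1, 1), date(2007, 1, 4)),
    (date(2007, 1, 10), date(2007, 2, 9)),
    (date(2007, 2, 1), date(2007, 3, 10)),
    (date(2007, 3, 1), date(2007, 7, 18)),
    (date(2007, 1, 1), date(2008, 1, 1)),
]
ages = [age_at_fix_days(created, fixed) for created, fixed in records]
summary = summarize(ages)
print(ages, summary)
```

Even in this tiny sample a single old defect pulls the mean well above the median, which is the outlier sensitivity noted above.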
One year, Severities 1-3, Linear plot
• Very skewed distribution
• Median 37 days
• Mean 81
• Std. dev. 147
• 85%-ile 139
[Figure: Percent bugs fixed in under N days (y-axis, 0% to 100%) versus N days (x-axis, linear scale); series: Bug Sev 1, Bug Sev 2, Bug Sev 3]
One year, Severities 1-3, Log plot
• Same chart but Log scale.
• Log chart shows distinct S curve
• Lower counts for S1 yield relatively greater fluctuations.
• Severe bugs (S1, S2) get faster service except for tail.
[Figure: Percent bugs fixed in under N days (y-axis, 0% to 100%) versus ln N days (x-axis, log scale); series: Bug Sev 1, Bug Sev 2, Bug Sev 3]
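A curve of this kind can be tabulated with a short sketch like the one below, which evaluates "percent fixed in under N days" per severity at thresholds evenly spaced in ln N. The (severity, age) pairs are invented for illustration and the helper name `percent_fixed_under` is hypothetical.

```python
import math
from collections import defaultdict


def percent_fixed_under(ages, n_days):
    """Percent of fixed defects whose age at fix is strictly less than n_days."""
    if not ages:
        raise ValueError("no defects in this group")
    return 100.0 * sum(1 for age in ages if age < n_days) / len(ages)


# Hypothetical (severity, age at fix in days) pairs.
fixes = [
    (1, 2), (1, 10), (1, 300),
    (2, 5), (2, 20), (2, 40), (2, 200),
    (3, 15), (3, 60), (3, 90), (3, 250),
]
ages_by_severity = defaultdict(list)
for severity, age in fixes:
    ages_by_severity[severity].append(age)

# Thresholds evenly spaced in ln N, matching the log-scale x-axis.
thresholds = [math.exp(k) for k in range(1, 7)]
for severity in sorted(ages_by_severity):
    curve = [percent_fixed_under(ages_by_severity[severity], n) for n in thresholds]
    print(f"Sev {severity}:", [round(p) for p in curve])
```

With few defects in a group each fix moves the curve by a large step, which is why the lower S1 counts show relatively greater fluctuations.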
Lognormal provides excellent fit
• N >> 1000. In this case the distribution of Age at Fix is visually indistinguishable from the fitted Lognormal.
• The lognormal is the most commonly used distribution in maintainability analysis because it is considered representative of the distribution of most repair times. MIL-HDBK-470A 4.4.3.1
• Note for later: the fitted lognormal is slightly lower at the left edge
[Figure: Cumulative percent bugs fixed as a function of age at fix (MRV). Two panels, one plotted against ln Age at Fix (MRV) and one against Age at Fix (MRV); y-axis: Cumulative % Bugs (0 to 100); series: Num Defects, Fitted Lognormal]
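One way to produce such a comparison is sketched below: estimate the lognormal parameters from the log of the ages, then measure the largest gap between the empirical and fitted cumulative curves. This is an illustrative approach, not necessarily the fitting procedure used for the chart; the data here are synthetic draws from Python's `random.lognormvariate`, and the names `fit_lognormal` and `max_cdf_gap` are hypothetical.

```python
import math
import random
import statistics


def fit_lognormal(ages):
    """Estimate (mu, sigma) as the mean and standard deviation of ln(age).

    Assumes every age is strictly positive; zero-day ages would need separate handling.
    """
    logs = [math.log(age) for age in ages]
    return statistics.fmean(logs), statistics.pstdev(logs)


def lognormal_cdf(x, mu, sigma):
    """Cumulative probability of a lognormal with log-scale parameters mu and sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def max_cdf_gap(ages, mu, sigma):
    """Largest vertical distance between the empirical and fitted cumulative curves."""
    ordered = sorted(ages)
    n = len(ordered)
    gap = 0.0
    for i, age in enumerate(ordered, start=1):
        fitted = lognormal_cdf(age, mu, sigma)
        gap = max(gap, abs(i / n - fitted), abs((i - 1) / n - fitted))
    return gap


# Synthetic stand-in for real ages: 5000 draws with assumed mu = 3.0, sigma = 1.5.
rng = random.Random(2008)
synthetic_ages = [rng.lognormvariate(3.0, 1.5) for _ in range(5000)]
mu_hat, sigma_hat = fit_lognormal(synthetic_ages)
gap = max_cdf_gap(synthetic_ages, mu_hat, sigma_hat)
print(f"mu={mu_hat:.2f} sigma={sigma_hat:.2f} max gap={gap:.3f}")
```

On real defect data the gap would indicate where the fit departs from the observations, for example at the left edge of the curve.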
Relationship between the mean and variance of Log(age) and of the age itself
• Mean (Log(age)) = $\mu$
• Variance (Log(age)) = $\sigma^2$
• Median (Log(age)) = $\mu$
• Median (age) = $\exp(\mu)$
• Mean (age) = $\exp(\mu + \sigma^2/2)$
• Variance (age) = $\exp(2\mu + \sigma^2)\,(\exp(\sigma^2) - 1)$
Example values
$\mu$     $\sigma$   mean   stdev
2.15   1.70    31     77
2.26   1.69    34    103
2.35   1.66    37    128
3.17   1.52    65    126
3.30   1.50    73    140
3.46   1.47    81    147
3.0    1.5     62    180
3.0    1.6     72    250
3.0    1.7     85    351
2.5    1.6     44    151
3.0    1.6     72    250
3.5    1.6    119    411
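The standard lognormal relationships between the log-scale parameters (called `mu` and `sigma` here) and the moments of the age itself can be checked numerically. The sketch below is only a calculator for those textbook formulas; the function name is hypothetical. The parameter pairs are the round-number rows of the example values (mu of 2.5, 3.0, and 3.5).

```python
import math


def lognormal_moments(mu, sigma):
    """Return (median, mean, stdev) of a lognormal given the mean and stdev of ln(age)."""
    var_log = sigma ** 2
    median = math.exp(mu)
    mean = math.exp(mu + var_log / 2.0)
    variance = math.exp(2.0 * mu + var_log) * (math.exp(var_log) - 1.0)
    return median, mean, math.sqrt(variance)


for mu, sigma in [(3.0, 1.5), (3.0, 1.6), (3.0, 1.7), (2.5, 1.6), (3.5, 1.6)]:
    median, mean, stdev = lognormal_moments(mu, sigma)
    print(f"mu={mu} sigma={sigma} median={median:.0f} mean={mean:.0f} stdev={stdev:.0f}")
```

A small change in sigma moves the standard deviation far more than it moves the mean, so the spread of the logs is the quantity to watch when controlling variability.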
Why might the Ages be Lognormal?
• The Lognormal can be generated when a random variable is the product of other random variables, just as a Normal distribution can be generated by summing random variables.
• Informally, the conditions are that the constituent random factors be substantially independent, that no one variable dominate the others, and that there be a large number of factors.
• We propose a hypothetical model of the defect repair process including realistic multiplicative factors and approximating the mathematical conditions.
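The multiplicative argument can be illustrated with a small simulation: multiply several independent random factors and compare the shape of the result with the shape of its logarithm. The choice of seven factors, each drawn uniformly between 0.5 and 2.0, is an arbitrary assumption for this sketch and does not represent the factors proposed in the talk.

```python
import math
import random
import statistics


def simulate_repair_times(n_defects, n_factors=7, seed=0):
    """Each simulated repair time is the product of n_factors independent random factors."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_defects):
        t = 1.0
        for _ in range(n_factors):
            t *= rng.uniform(0.5, 2.0)  # hypothetical factor: halves to doubles the time
        times.append(t)
    return times


def skewness(values):
    """Third standardized moment; near zero for a symmetric distribution."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / stdev) ** 3 for v in values])


times = simulate_repair_times(20000)
log_times = [math.log(t) for t in times]
print(f"skewness of times:     {skewness(times):.2f}")
print(f"skewness of ln(times): {skewness(log_times):.2f}")
```

The raw products are strongly right-skewed while their logarithms are close to symmetric, because taking logs turns the product into a sum, which the Central Limit Theorem drives toward a Normal.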
Seven hypothetical factors affecting resolution time
Drawn from experience and COCOMO
References
• MIL-HDBK-470A, Designing and Developing Maintainable Products and Systems, Section 4.4.3.1, Lognormal Distribution, Aug 1997. (Lognormal is representative of most repair times.)
• R. Mullen, Lognormal Distribution of Software Failure Rates: Origin and Evidence, ISSRE 1998. (Re Central Limit Theorem and lognormal.)
• R. Mullen and S. Gokhale, Software Defect Rediscoveries: A Discrete Lognormal Model, ISSRE 2005. (Further references to lognormal in software.)
• B. Schroeder and G. Gibson, A Large-Scale Study of Failures in High-Performance-Computing Systems, CMU-PDL-05-112, Dec 2005. Later in DSN 2006. (Lognormal provides best fit for repair times.)