Statistical distributions of software metrics: do they matter?
Israel Herraiz, Technical University of Madrid (UPM)
[email protected]
Grab these slides from http://slideshare.net/herraiz/statistical-distributions-of-metrics
Presentation for the Seminar on Open Source Evolution 2013 http://informatique.umons.ac.be/genlog/SOS-Evol/SOS-Evol2013.html
Probability of finding defects
Defects density (only pre-release defects)
Using Number of Methods and number of pre-release defects per LOC
[Figure: two panels, "Below xmin" and "Above xmin"]
Below xmin: Avg. Dens. = .2685
Above xmin: Avg. Dens. = .4565
* Data obtained from "Predicting Defects for Eclipse", PROMISE 2007
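The comparison on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout `(metric_value, defects, loc)`, the function name `density_by_threshold`, and the sample values are all hypothetical, and the slides do not say whether the average is taken over per-file densities or as total defects over total LOC. The sketch assumes the former.

```python
from statistics import mean

def density_by_threshold(files, xmin):
    """Split files at xmin on the metric and average defect density per side.

    files: iterable of (metric_value, defects, loc) tuples (hypothetical layout).
    Returns (avg_density_below, avg_density_above); None for an empty side.
    Assumption: the average is the mean of per-file densities.
    """
    below, above = [], []
    for metric, defects, loc in files:
        if loc <= 0:
            continue  # density is undefined for files with no lines of code
        side = above if metric >= xmin else below
        side.append(defects / loc)
    return (mean(below) if below else None,
            mean(above) if above else None)

# Made-up illustrative values, not data from the slides.
sample = [(3, 0, 100), (5, 1, 200), (40, 4, 400), (60, 6, 300)]
below_avg, above_avg = density_by_threshold(sample, xmin=10)
```

With real data, `files` would come from a file-level defect dataset and `xmin` from the fitted distribution of the metric.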
Probability of finding defects
Defects density (only post-release defects)
Using Number of Methods and number of post-release defects per LOC
[Figure: two panels, "Below xmin" and "Above xmin"]
Below xmin: Avg. Dens. = .1437
Above xmin: Avg. Dens. = .2690
Probability of finding defects
Defects density (pre + post-release defects)
Using CYCLO/SLOC and number of total defects per LOC
[Figure: log-scale plots, including Pr(X ≥ x) versus x; panels "Below xmin" and "Above xmin"]
Below xmin: Avg. Dens. = .3335 (>9000 files)
Above xmin: Avg. Dens. = .7747 (364 files)
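The quantity Pr(X ≥ x) on the plot axis is the complementary cumulative distribution. A minimal sketch of how it could be computed empirically from a list of metric values follows; the function name `empirical_ccdf` and the sample values are hypothetical and not taken from the slides.

```python
def empirical_ccdf(values):
    """Return the sorted distinct values x and the fraction of observations >= x."""
    data = sorted(values)
    n = len(data)
    xs, probs = [], []
    for i, x in enumerate(data):
        # Record each distinct value once, at its first position in sorted order.
        if i == 0 or x != data[i - 1]:
            xs.append(x)
            probs.append((n - i) / n)  # n - i observations are >= x
    return xs, probs

# Made-up illustrative values, not data from the slides.
xs, probs = empirical_ccdf([1, 1, 2, 4, 4, 8])
```

Plotting `probs` against `xs` on log-log axes gives a curve of the kind shown in the figure; a roughly straight tail is what one would expect from a power law.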
1 Some background
2 Statistical properties of software metrics
3 Evidence of impact on quality
4 Summary of findings and further work
Summary and further work
Summary of preliminary findings
Some metrics have a transition from lognormal to power law
Clear relation between normalized metrics and defects density
Although the threshold might not be perfect (e.g., you might find a high defects density in a file below the threshold), it greatly reduces the search space for potentially problematic files
Further work
Verify in more projects
Do you have defects data at the file level?
Find explanation for the transition and its influence on quality
How do the statistical parameters change over time? Do defects evolve accordingly?
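For the power-law side of the transition mentioned in the summary, one statistical parameter that could be tracked over time is the tail exponent. The sketch below uses the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / xmin)), over the values at or above xmin. The slides do not state which estimator was used, so this choice, the function name `power_law_alpha`, and the sample values are assumptions.

```python
import math

def power_law_alpha(values, xmin):
    """Estimate the power-law exponent of the tail x >= xmin.

    Uses the continuous maximum-likelihood estimator (an assumption; the
    slides do not say how the exponent was fitted).
    """
    if xmin <= 0:
        raise ValueError("xmin must be positive")
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one tail value above xmin")
    return 1 + len(tail) / log_sum

# Made-up illustrative values; 0.5 lies below xmin and is ignored.
alpha = power_law_alpha([0.5, 1.0, math.e, math.e ** 2], xmin=1.0)
```

Repeating the estimate on successive releases would give one way to look at how the parameter changes over time.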