Information Theory in Software Metrics (Assessment and Issues) Steve Counsell, (Brunel University and CREST)

Mar 28, 2015

Transcript
Page 1: Information Theory in Software Metrics (Assessment and Issues) Steve Counsell, (Brunel University and CREST)

Information Theory in Software Metrics (Assessment and Issues)

Steve Counsell

(Brunel University and CREST)

Page 2: Introduction

- Coupling:
  - Well understood
  - Excessive coupling should be avoided
  - Empirically (in excess) has been associated with fault-proneness, in C++ at least
- The Coupling Between Objects (CBO) metric of Chidamber and Kemerer has dominated the area
  - Simple count of the number of unique classes to which any single class is coupled (in whatever way)
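The CBO count described above can be sketched in a few lines. This is a minimal illustration, not Chidamber and Kemerer's tooling: the `uses` mapping, the class names, and the decision to count a coupling in both directions are all assumptions made for the example.

```python
from collections import defaultdict

def cbo(uses):
    """Return {class: CBO} given {class: set of classes it references}.

    Assumption: a reference couples both classes, whichever one makes it.
    """
    coupled = defaultdict(set)
    for cls, targets in uses.items():
        coupled[cls]  # make sure every class appears, even with no couplings
        for other in targets:
            if other != cls:              # self-references are not counted
                coupled[cls].add(other)
                coupled[other].add(cls)
    return {cls: len(others) for cls, others in coupled.items()}

# Hypothetical four-class system, for illustration only
uses = {
    "Order": {"Customer", "Product"},
    "Invoice": {"Order", "Customer"},
    "Customer": set(),
    "Product": {"Product"},
}
result = cbo(uses)
print(result)
```

Each class's value is simply the size of its set of distinct coupled classes, which is why CBO says nothing about how those connections are arranged.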

Page 3: Introduction (cont.)

- Theoretical properties also well understood
  - Coupling of a modular system is non-negative
  - Merging two modules can’t increase system coupling
- Based on a modular system being comprised of nodes and ‘edges’ connecting those nodes

Page 4: Information Theoretic metrics (for coupling)

- Pioneered by Allen and Khoshgoftaar (A&K)
  - First appeared based on Allen’s PhD work, c.1996
  - METRICS paper in 1999
- At the time created a bit of a stir
  - Metrics community re-think
  - Could be applied to both OO and procedural
  - Appealed to the cross-disciplinary ethos

Page 5: Roadmap

- Explain A&K’s metric for system coupling
  - Based on a modular system graph
- Demonstrate its usefulness and drawbacks
- Identify open issues
  - Research paths in evaluating/modifying the metric
  - Other applications

Page 6: Explaining A&K’s coupling

Page 7: A modular system

Source: Allen and Khoshgoftaar, 1999

Page 8: Inter-module coupling

Source: Allen and Khoshgoftaar, 1999

Page 9: Part I

Page 10: Representation

Source: Allen and Khoshgoftaar, 1999

Page 11: Entropy

- The average information per node
- Always non-negative
- Defined as:
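The formula image from this slide did not survive in the transcript. A standard Shannon-entropy form consistent with the surrounding slides would be the following, where the symbols are assumptions: $p_l$ is taken to be the proportion of nodes whose row in the nodes-by-edges table shows the $l$-th distinct pattern, and $n_S$ is the number of distinct patterns in the graph $S$.

```latex
H(S) = \sum_{l=1}^{n_S} p_l \left( -\log_2 p_l \right)
```

Because $0 < p_l \le 1$, every term is non-negative, which matches the slide's statement that entropy is always non-negative.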

Page 12: Entropy (cont.)

- All logs base 2
- Unit of measure is a bit
- Graph selected has entropy H(S) of 2.46

Page 13: Part II

Page 14: Sub-graph analysis

- Consider the subgraph S_i consisting of all the nodes in S and the edges of S that have the i-th node as an end point
  - Disconnected nodes included in the sub-graph
- Calculate the same probability distribution as we did previously

Page 15: For node 2

Node  Edge 1  Edge 4
   0       0       0
   1       1       0
   2       1       1
   3       0       0
   4       0       0
   5       0       0
   6       0       0
   7       0       0
   8       0       0
   9       0       0
  10       0       0
  11       0       1
  12       0       0
  13       0       0
  14       0       0

Source: Allen and Khoshgoftaar, 1999
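The node 2 table can be reproduced and its entropy computed with a short sketch. It assumes the probability distribution is taken over distinct row patterns (each node's row of 0/1 entries), which is how the table reads; the function names are hypothetical, and only the two edges visible in the table are encoded, since the rest of the graph is not recoverable from the transcript.

```python
from collections import Counter
from math import log2

def pattern_entropy(rows):
    """Shannon entropy in bits of the distribution of distinct row patterns."""
    counts = Counter(rows)
    total = len(rows)
    return sum((c / total) * log2(total / c) for c in counts.values())

def subgraph_rows(num_nodes, edges, i):
    """Rows of the nodes-by-edges table for S_i: every node is kept,
    but only the edges that have node i as an end point."""
    kept = [e for e in edges if i in e]
    return [tuple(int(node in e) for e in kept) for node in range(num_nodes)]

# From the table: edge 1 joins nodes 1 and 2, edge 4 joins nodes 2 and 11.
edges_touching_2 = [(1, 2), (2, 11)]
rows = subgraph_rows(15, edges_touching_2, 2)
h_s2 = pattern_entropy(rows)
print(round(h_s2, 2))  # 1.04
```

Twelve of the fifteen nodes share the all-zero pattern and the other three patterns occur once each, so H(S_2) comes to roughly 1.04 bits under these assumptions. A node with no edges yields a single pattern and therefore zero entropy.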

Page 16: Entropy (for distribution of node labels)

- Defined as:

Page 17: Entropy (cont.)

- Gives a total entropy value, summed over H(S_i) for i = 0..14, of 6.28

Page 18: Part III

Page 19: Ethos of the coupling metric

- The entropy of the modular system taken as a whole is less than or equal to the sum of entropies of the individual components
  - H(S) <= sum H(S_i)
- The difference between these values represents the true coupling relationships or ‘excess entropy’

Page 20: Excess entropy C(S)

C(S) = 6.28 – 2.46 = 3.82

Where:
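The definition that followed "Where:" was an image and is missing from the transcript. Based on the preceding slides (the sum of sub-graph entropies less the entropy of the whole graph), a consistent reconstruction would be the following; the index range $i = 0, \dots, n$ with $n = 14$ is inferred from the node numbering and is an assumption.

```latex
C(S) = \sum_{i=0}^{n} H(S_i) - H(S) = 6.28 - 2.46 = 3.82
```

Since $H(S) \le \sum_i H(S_i)$, this excess entropy is never negative.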

Page 21: Coupling in a modular system (MS)

Coupling(MS) = (n+1) C(S)

= 15 * 3.82 = 57.28
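The arithmetic on the last two slides can be checked with a few lines. The inputs are the rounded figures quoted in the slides; the variable names are illustrative only.

```python
h_system = 2.46      # H(S), entropy of the whole graph (from the slides)
h_subgraphs = 6.28   # sum of H(S_i) for i = 0..14 (from the slides)
n = 14               # nodes are numbered 0..14, so n + 1 = 15

excess = h_subgraphs - h_system   # C(S), the excess entropy
coupling = (n + 1) * excess       # Coupling(MS)
print(round(excess, 2), round(coupling, 2))
```

With these rounded inputs the product comes out at about 57.30 rather than the 57.28 on the slide; the small gap is presumably because the slide multiplied unrounded entropy values.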

Page 22: Assessment of the metric

Page 23

‘A metric sensitive to patterns of connections. This is attractive, because software engineers recognize patterns as well’ (Allen and Khoshgoftaar, 1999)

Page 24

Coupling(MS):

a.  2.76    f. 26.83
b.  8.00    g. 30.83
c. 16.00    h. 34.83
d. 17.32    i. 22.04
e. 24.07    j. 27.78

Source: Allen and Khoshgoftaar, 1999

Page 25

Coupling (CBO):

a.  2    f.  8
b.  4    g. 10
c.  6    h. 12
d.  6    i.  8
e.  8    j.  8

Source: Allen and Khoshgoftaar, 1999

Coupling(MS):

a.  2.76    f. 26.83
b.  8.00    g. 30.83
c. 16.00    h. 34.83
d. 17.32    i. 22.04
e. 24.07    j. 27.78
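One way to read the two lists side by side is to group the graphs by CBO value and see how Coupling(MS) orders the graphs within each group. The sketch below uses only the values listed on the slides; the grouping approach itself is an illustration, not part of the original analysis.

```python
cbo = {"a": 2, "b": 4, "c": 6, "d": 6, "e": 8,
       "f": 8, "g": 10, "h": 12, "i": 8, "j": 8}
coupling_ms = {"a": 2.76, "b": 8.00, "c": 16.00, "d": 17.32, "e": 24.07,
               "f": 26.83, "g": 30.83, "h": 34.83, "i": 22.04, "j": 27.78}

# Group graphs that share the same CBO value
groups = {}
for graph, value in cbo.items():
    groups.setdefault(value, []).append(graph)

# Within each group, order the graphs by Coupling(MS)
for value in sorted(groups):
    ordered = sorted(groups[value], key=coupling_ms.get)
    detail = ", ".join(f"{g}={coupling_ms[g]:.2f}" for g in ordered)
    print(f"CBO {value:>2}: {detail}")
```

Graphs e, f, i and j all have a CBO of 8, yet their Coupling(MS) values range from 22.04 to 27.78, which illustrates the claim that the entropy-based metric is sensitive to the pattern of connections and not only to their number.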

Page 26: Comparison with CBO

[Chart: Coupling(MS) and CBO plotted for graphs 1 to 10; vertical axis from 0 to 40]

Page 27: Issues

- Computes system coupling
  - Most coupling studies use a class coupling basis
  - Need a ‘class-based’ entropy measure (NHD)
- Comparison between i. and j.
  - Suggests that i. is ‘better’ than j.
  - OO people might disagree with an inheritance structure being ‘better’
  - Maintaining the root node would be highly problematic
- Do developers really look for patterns?
- Does not take into account the ‘type’ of coupling
- Cannot be gleaned from a UML class diagram

Page 28: Potential studies

- Fault analysis
  - Which of the two correlates more with faults
- Larger-scale study
- The effect of refactoring on the values of both Coupling(MS) and CBO
- Hamming distance for coupling?
- A final word on cohesion...

Page 29: Cohesion

- A key advantage of CBO, and the reason for its popularity, is that there is no argument about its interpretation (the same is true, to some extent, of Coupling(MS)); it is an objective measure
- The same cannot be said about cohesion, because it is subjective

Page 30: Thanks for listening