Modeling History to Understand Software Evolution. PhD Defense. Tudor Gîrba. Supervisors: Stéphane Ducasse, Oscar Nierstrasz

Girba Phd Presentation 2005-11-14

Jun 23, 2015

Tudor Girba

I used this set of slides for my PhD defense.
Transcript
Page 1: Girba Phd Presentation 2005-11-14

Modeling History to Understand Software Evolution

PhD Defense

Tudor Gîrba

Supervisors: Stéphane Ducasse, Oscar Nierstrasz

Page 2: Girba Phd Presentation 2005-11-14

© Tudor Gîrba /47

Context: Reverse engineering is creating high-level views of the system

[Figure labels: Forward Engineering, Reverse Engineering, Time, Requirements Analysis, Design, Implementation]

Page 4: Girba Phd Presentation 2005-11-14

Context: History holds useful information for reverse engineering

The doctor always looks at my health file

Historical information is useful, but it is hidden among huge amounts of data

The more data, the more techniques are needed to analyze it

Version 1, Version 2, Version 3, …, Version n

N versions means N times more data

Page 5: Girba Phd Presentation 2005-11-14

Context: Many techniques were developed

Evolution patterns, trend analysis, co-change analysis, authors analysis, …

[Lanza, Ducasse '02] [Lehman et al. '01] [Gall et al. '03] [Eick et al. '02]

Problem: Current approaches rely on ad-hoc models or on too specific meta-models

Research question: How can we build a generic meta-model?

Page 8: Girba Phd Presentation 2005-11-14

Overview

Hismo: Version, History

Applications:
Historical measurements
Yesterday's Weather
History-based Detection Strategies
Hierarchy evolution
Co-change patterns
Ownership Map

Hismo: Modeling History

Page 10: Girba Phd Presentation 2005-11-14

Example: Evolution Matrix reveals different evolution patterns

[Figure: polymetric view [Lanza, Ducasse '02]. Labels: Class, NOM, NOA, versions]

Patterns: Pulsar Class, Idle Class, White Dwarf Class, Supernova Class

Thesis: Evolution needs to be modeled as a first class entity

Page 12: Girba Phd Presentation 2005-11-14

Solution: History encapsulates and characterizes the evolution

[Figure: the versions of each class are grouped into a ClassHistory: Pulsar Class History, Idle Class History, White Dwarf Class History, Supernova Class History]

ClassHistory: isPulsar, isIdle, …

Page 13: Girba Phd Presentation 2005-11-14

Hismo: The history meta-model

[Diagram entities: SystemHistory, SystemVersion, ClassHistory, ClassVersion]

Page 16: Girba Phd Presentation 2005-11-14

… but what about relationships?

[Diagram entities: SystemHistory, SystemVersion, ClassHistory, ClassVersion, InheritanceHistory, InheritanceVersion]

Page 18: Girba Phd Presentation 2005-11-14

Hismo is obtained by transforming the structural meta-model

[Diagram labels: History, Version]
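The History and Version entities named on these slides can be pictured with a small data model. This is only a sketch under assumptions: the class layout, the `rank` and `snapshot` fields, and the `ClassSnapshot` type are hypothetical names chosen for illustration and are not taken from Hismo itself.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

S = TypeVar("S")  # type of the structural snapshot (hypothetical)


@dataclass
class Version(Generic[S]):
    """One snapshot of an entity, placed in time by its rank."""
    rank: int    # position of the snapshot in the history (assumed field)
    snapshot: S  # the structural entity as it was in that version


@dataclass
class History(Generic[S]):
    """An ordered sequence of versions of the same entity."""
    name: str
    versions: List[Version[S]] = field(default_factory=list)

    def add(self, snapshot: S) -> None:
        # Ranks start at 1 and follow insertion order.
        self.versions.append(Version(len(self.versions) + 1, snapshot))

    def first(self) -> Version[S]:
        return self.versions[0]

    def last(self) -> Version[S]:
        return self.versions[-1]


@dataclass
class ClassSnapshot:
    """Structural class entity; only the number of methods is kept here."""
    nom: int


# A class history built from five hypothetical snapshots.
class_history = History[ClassSnapshot]("C")
for nom in [1, 5, 3, 4, 4]:
    class_history.add(ClassSnapshot(nom))
```

Keeping the history as its own object gives a natural place for measurements and predicates that only make sense over several versions.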

Page 19: Girba Phd Presentation 2005-11-14

Application: History measurements

Page 21: Girba Phd Presentation 2005-11-14

Problem: History holds useful information hidden among large amounts of data

[Figure: grid of per-version measurement values for several classes]

How much was a class changed?
When was a class changed?
…

Page 22: Girba Phd Presentation 2005-11-14

History can be measured: How much was a class changed?

Evolution of Number of Methods:

$$ENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)|$$

Example, for a class C with NOM values 1, 5, 3, 4, 4 over five versions:

$$ENOM(C) = 4 + 2 + 1 + 0 = 7$$
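The ENOM sum can be checked with a few lines of code. This is a minimal sketch; the function name `enom` and the list-based representation of a history are assumptions made for illustration.

```python
from typing import Sequence


def enom(nom: Sequence[int]) -> int:
    """Sum of absolute differences in number of methods between
    consecutive versions (versions i = 2..n)."""
    return sum(abs(nom[i] - nom[i - 1]) for i in range(1, len(nom)))


# NOM values of the example class over five versions.
nom_of_c = [1, 5, 3, 4, 4]
print(enom(nom_of_c))  # 4 + 2 + 1 + 0 = 7
```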

Page 23: Girba Phd Presentation 2005-11-14

History can be measured: When was a class changed?

Latest Evolution of Number of Methods:

$$LENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{i-n}$$

Earliest Evolution of Number of Methods:

$$EENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{2-i}$$

Example, for a class C with NOM values 1, 5, 3, 4, 4 (n = 5):

$$LENOM(C) = 4 \cdot 2^{-3} + 2 \cdot 2^{-2} + 1 \cdot 2^{-1} + 0 \cdot 2^{0} = 1.5$$

$$EENOM(C) = 4 \cdot 2^{0} + 2 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} = 5.25$$
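The two weighted sums can be evaluated the same way. The sketch below applies the formulas term by term to the NOM values 1, 5, 3, 4, 4; evaluating the four terms as written gives 1.5 for LENOM and 5.25 for EENOM. The function names are assumptions made for illustration.

```python
from typing import Sequence


def lenom(nom: Sequence[int]) -> float:
    """Latest evolution: the change into version i is weighted by 2^(i-n),
    so the most recent change counts fully and older ones fade."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (i - n)
               for i in range(2, n + 1))


def eenom(nom: Sequence[int]) -> float:
    """Earliest evolution: the change into version i is weighted by 2^(2-i),
    so the first change counts fully and later ones fade."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (2 - i)
               for i in range(2, n + 1))


nom_of_c = [1, 5, 3, 4, 4]
print(lenom(nom_of_c))  # 0.5 + 0.5 + 0.5 + 0 = 1.5
print(eenom(nom_of_c))  # 4 + 1 + 0.25 + 0 = 5.25
```

A class whose EENOM is much larger than its LENOM changed mostly early in its history, and the reverse indicates late changes.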

Page 24: Girba Phd Presentation 2005-11-14

History measurements compress aspects of the evolution into numbers

[Figure: grid of per-version values for classes A to E]

Class  ENOM  LENOM  EENOM
A      7     3.37   3.25
B      7     5.75   1.37
C      3     1      2
D      0     0      0
E      7     1.5    5.25

Patterns shown: late changer, dead stable, early changer, balanced changer

Page 26: Girba Phd Presentation 2005-11-14

Many measurements can be defined at different levels of abstraction …

Evolution, Latest/Earliest Evolution, Stability, Historical Max/Min, Historical Average, Growth Trend, …

of

Number of Methods, Number of Statements, Cyclomatic Complexity, Lines of Code, Number of Classes, Number of Modules, …

… but measurements are a means, not a goal

Page 27: Girba Phd Presentation 2005-11-14

Application: Yesterday's Weather

Page 29: Girba Phd Presentation 2005-11-14

Common wisdom: The recently changed parts are likely to change in the near future

[Mens, Demeyer '01]

Is the common wisdom relevant?

Yesterday's Weather metaphor:
It expresses the chances of having the same weather today as we had yesterday
It is location specific: Sahara 90%, Switzerland 30%

Page 30: Girba Phd Presentation 2005-11-14

Yesterday's Weather: For each given version we check the common wisdom

[Figure: timeline with past versions, present version, and future versions. Labels: Past Late Changers, Future Early Changers, hit]

YesterdayWeatherHit(present):
    past := histories.topLENOM(start, present)
    future := histories.topEENOM(present, end)
    past.intersectWith(future).notEmpty()
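The hit check in the pseudocode above can be sketched as runnable code. Several details are assumptions, since the slide does not define them: "top" is taken to mean the k histories with the highest non-zero value, the present version belongs to both the past and the future window, and the sample histories are hypothetical.

```python
from typing import Callable, Dict, List, Set


def lenom(nom: List[int]) -> float:
    """Latest evolution: recent changes weigh more (weight 2^(i-n))."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (i - n)
               for i in range(2, n + 1))


def eenom(nom: List[int]) -> float:
    """Earliest evolution: early changes weigh more (weight 2^(2-i))."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (2 - i)
               for i in range(2, n + 1))


def top(histories: Dict[str, List[int]],
        measure: Callable[[List[int]], float], k: int) -> Set[str]:
    """Names of the k histories with the highest non-zero measure (assumed meaning of 'top')."""
    ranked = sorted(histories, key=lambda name: measure(histories[name]), reverse=True)
    return {name for name in ranked[:k] if measure(histories[name]) > 0}


def yesterday_weather_hit(histories: Dict[str, List[int]], present: int, k: int = 1) -> bool:
    """True if a top past late changer is also a top future early changer.
    `present` is a 1-based version index."""
    past = {name: nom[:present] for name, nom in histories.items()}
    future = {name: nom[present - 1:] for name, nom in histories.items()}
    return bool(top(past, lenom, k) & top(future, eenom, k))


# Hypothetical NOM values per class over five versions.
histories = {"A": [2, 4, 3, 5, 7], "B": [2, 2, 3, 4, 9], "D": [2, 2, 2, 2, 2]}
print(yesterday_weather_hit(histories, present=3, k=2))
```

With k = 1 the top past late changer (A) and the top future early changer (B) differ, so there is no hit; widening the selection to k = 2 makes the two sets overlap.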

Page 34: Girba Phd Presentation 2005-11-14

Overall Yesterday's Weather shows the localization of changes in time

YW = 7 hits / 8 possible hits = 87%
YW = 3 hits / 8 possible hits = 37%

Case studies:
40 versions of CodeCrawler (180 classes): 100%
40 versions of Jun (700 classes): 79%
40 versions of JBoss (4000 classes): 53%
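The overall value is the share of checked versions that produced a hit. A minimal sketch, assuming the per-version results are available as a list of booleans (the function name is hypothetical):

```python
from typing import List


def yesterday_weather(hits: List[bool]) -> float:
    """Fraction of checked versions for which the common wisdom held."""
    return sum(hits) / len(hits)


# The two examples on the slide: 7 and 3 hits out of 8 possible hits.
print(yesterday_weather([True] * 7 + [False]))      # 0.875
print(yesterday_weather([True] * 3 + [False] * 5))  # 0.375
```

The slide reports these two ratios as 87% and 37%, which corresponds to dropping the decimals.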

Page 36: Girba Phd Presentation 2005-11-14

Application: History-based Detection Strategies

Page 38: Girba Phd Presentation 2005-11-14

Context: Detection Strategies detect design flaws based on measurements

[Marinescu '04]

Example: God Class, a maintainability problem because it encapsulates a lot of knowledge

[Diagram: the conditions Class ATFD > 40, Class WMC > 75, Class TCC < 0.2, and Class NOA > 20 are combined with AND, AND, and OR into God Class]
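A detection strategy of this kind is a boolean expression over metric thresholds. The sketch below uses the four thresholds from the slide. The way they are combined is an assumption, because the transcript only lists the operators AND, AND, OR without the diagram: ATFD > 40 AND (WMC > 75 OR (TCC < 0.2 AND NOA > 20)). The `ClassMetrics` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    atfd: int   # access to foreign data
    wmc: int    # weighted method count
    tcc: float  # tight class cohesion
    noa: int    # number of attributes


def is_god_class(m: ClassMetrics) -> bool:
    """Assumed composition of the thresholds shown on the slide."""
    return m.atfd > 40 and (m.wmc > 75 or (m.tcc < 0.2 and m.noa > 20))


print(is_god_class(ClassMetrics(atfd=50, wmc=80, tcc=0.5, noa=5)))
```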

Page 39: Girba Phd Presentation 2005-11-14

History-based Detection Strategies take evolution into account

Example: a Stable God Class is not necessarily a bad one

Stable God Class = (last version of the History is a God Class) AND (Stability of the History > 95%)

Case study: 5 out of 24 God Classes in Jun were stable and harmless
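The history-based rule combines a check on the last version with a measurement over the whole history. In this sketch the stability measure is an assumption, since the slides do not define it: it is taken as the share of version transitions in which the number of methods did not change. The function names are hypothetical.

```python
from typing import Sequence


def stability(nom: Sequence[int]) -> float:
    """Share of version transitions without a change in number of methods
    (assumed definition of stability)."""
    transitions = len(nom) - 1
    if transitions < 1:
        return 1.0  # a single version has had no opportunity to change
    unchanged = sum(1 for i in range(1, len(nom)) if nom[i] == nom[i - 1])
    return unchanged / transitions


def is_stable_god_class(last_is_god_class: bool, nom: Sequence[int],
                        threshold: float = 0.95) -> bool:
    """God Class in the last version AND stability above the threshold."""
    return last_is_god_class and stability(nom) > threshold


# Hypothetical history: the class never changed over ten versions.
print(is_stable_god_class(True, [30] * 10))
```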

Page 41: Girba Phd Presentation 2005-11-14

Application: Characterizing the evolution of class hierarchies

Page 43: Girba Phd Presentation 2005-11-14

Context: Given the evolution of a hierarchy …

[Figure: successive versions of a class hierarchy over time, with classes A, B, C, D, and E]

B is stable
C was removed
E is newborn
A is persistent
D inherited from C and then from A
…

Page 44: Girba Phd Presentation 2005-11-14

… but useful information is hidden among large amounts of data

How did the hierarchies evolve?

Page 45: Girba Phd Presentation 2005-11-14

Hierarchy Evolution Complexity View characterizes class hierarchy histories

[Figure: hierarchy view of classes A, B, C, D, and E. Legend: ClassHistory (ENOM, ENOS, Age, Removed); InheritanceHistory (Age, Removed)]

B is stable
C was removed
E is newborn
A is persistent
D inherited from C and then from A
…

Page 46: Girba Phd Presentation 2005-11-14

Case study: Class hierarchies in Jun reveal evolution patterns

Old, Stable, Balanced, Reliable inheritance
Persistent, Unbalanced, Stable, Reliable inheritance
Old, Unstable, Unbalanced, Unreliable inheritance
Young, Unstable root, Reliable inheritance
Newborn

Page 47: Girba Phd Presentation 2005-11-14

Application: Detecting co-change patterns

Page 49: Girba Phd Presentation 2005-11-14

Context: Repeated co-changes reveal hidden dependencies

[Gall et al. '98]

[Figure: changes of entities A, B, C, D, and E across versions v1 to v6]

Can we identify co-change patterns like Parallel Inheritance, Shotgun Surgery, …?

Page 50: Girba Phd Presentation 2005-11-14

Formal Concept Analysis (FCA) finds elements that have properties in common

[Ganter, Wille '99]

[Figure: incidence table of elements A, B, C, D, E against properties P1 to P6]

Concepts found by FCA:
(A,D,E) (P2)
(A,D) (P2,P6)
(A,B,C,D) (P6)
(A,B,C,D,E) ()
(D,E) (P2,P4)
(A,B,C) (P5,P6)
(A) (P2,P5,P6)
(D) (P2,P4,P6)
(C) (P3,P5,P6)
() (P1,P2,P3,P4,P5,P6)

To use FCA, we need to map our interests onto elements and properties
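The concept list can be reproduced with a brute-force closure computation. This is a sketch, not an efficient FCA algorithm. The incidence table is an assumption: it is not in the transcript and was reconstructed here from the listed concepts (for example, A has P2, P5, and P6 because the concept with extent (A) has that intent).

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Incidence relation reconstructed from the listed concepts (assumption).
CONTEXT: Dict[str, Set[str]] = {
    "A": {"P2", "P5", "P6"},
    "B": {"P5", "P6"},
    "C": {"P3", "P5", "P6"},
    "D": {"P2", "P4", "P6"},
    "E": {"P2", "P4"},
}
PROPERTIES: Set[str] = {"P1", "P2", "P3", "P4", "P5", "P6"}


def intent(elements: Iterable[str]) -> Set[str]:
    """Properties shared by all given elements (all properties for no elements)."""
    shared = set(PROPERTIES)
    for element in elements:
        shared &= CONTEXT[element]
    return shared


def extent(properties: Set[str]) -> Set[str]:
    """Elements that have all the given properties."""
    return {e for e, props in CONTEXT.items() if properties <= props}


def concepts() -> Set[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """All (extent, intent) pairs, found by closing every subset of elements."""
    found = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            props = intent(subset)
            found.add((frozenset(extent(props)), frozenset(props)))
    return found


for ext, props in sorted(concepts(), key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(props))
```

Enumerating every subset is exponential in the number of elements, which is fine for five elements but would need a dedicated algorithm for real histories.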

Page 52: Girba Phd Presentation 2005-11-14

We use FCA to identify entities that co-changed repeatedly

Elements = Histories
Properties = "changed in version X"

[Figure: changes of entities A, B, C, D, and E across versions v1 to v6]

Concepts found by FCA:
(A,D,E) (v2)
(A,D) (v2,v6)
(A,B,C,D) (v6)
(A,B,C,D,E) ()
(D,E) (v2,v4)
(A,B,C) (v5,v6)
(A) (v2,v5,v6)
(D) (v2,v4,v6)
(C) (v3,v5,v6)
() (v1,v2,v3,v4,v5,v6)

Page 53: Girba Phd Presentation 2005-11-14

Example: Parallel inheritance denotes children added in several hierarchies

Elements = ClassHistories
Properties = "changed number of children in version X"

[Figure: class A across versions v1 to v6, with 0, 1, 1, 1, 2, and 4 children]

Case study: JBoss
ServiceMBeanSupport, JBossTestCase
EJBLocalHome, EJBLocalObject
[Figure labels: 9 versions, 14 versions]
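The mapping from a class history to FCA properties can be sketched directly from the example: class A has 0, 1, 1, 1, 2, 4 children in v1 to v6, so its number of children changed in v2, v5, and v6. The function name and the second history `X` are hypothetical.

```python
from typing import Sequence, Set


def changed_versions(children: Sequence[int]) -> Set[str]:
    """Versions in which the number of children differs from the previous version."""
    return {f"v{i + 1}" for i in range(1, len(children))
            if children[i] != children[i - 1]}


a = changed_versions([0, 1, 1, 1, 2, 4])
print(sorted(a))  # ['v2', 'v5', 'v6']

# Hypothetical second class history; shared versions are co-change candidates.
x = changed_versions([2, 3, 3, 3, 3, 5])
print(sorted(a & x))  # ['v2', 'v6']
```

Feeding these property sets for all class histories into FCA yields groups of classes that gained or lost children in the same versions.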

Page 55: Girba Phd Presentation 2005-11-14

Application: Ownership map

Page 57: Girba Phd Presentation 2005-11-14

Context: The code history might tell you what happened, but not why it happened

[Figure: files plotted over time [Rysselberghe, Demeyer '04]. Case study: Outsight]

Who is responsible for this?

Page 59: Girba Phd Presentation 2005-11-14

We color the lines to show which author owned which files in which period

[Figure: File History A and File History B. Labels: green author large commit, green author ownership, blue author small commit]

Page 60: Girba Phd Presentation 2005-11-14

The commit history shows what happened

Page 61: Girba Phd Presentation 2005-11-14

Ownership Map shows which author owned which files in which period

Page 62: Girba Phd Presentation 2005-11-14

We cluster the file histories to favor colored blocks inside each module

We use the Hausdorff distance between the commit timestamps:

$$d(A, B) = \sum_{a \in A} \min^2 \{ |a - b| : b \in B \}$$

[Figure: commit timestamps of file histories A and B]
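The distance formula can be evaluated directly on two lists of commit timestamps. This is a minimal sketch of the formula as written on the slide, reading min² as the square of the minimum; the function name and the sample timestamps are assumptions.

```python
from typing import Sequence


def history_distance(a_times: Sequence[float], b_times: Sequence[float]) -> float:
    """For each commit time in A, take the squared distance to the closest
    commit time in B, and sum these values."""
    return sum(min(abs(a - b) for b in b_times) ** 2 for a in a_times)


# Hypothetical commit timestamps (for example, days since project start).
file_a = [1, 5]
file_b = [2, 9]
print(history_distance(file_a, file_b))  # 1**2 + 3**2 = 10
print(history_distance(file_b, file_a))  # 1**2 + 4**2 = 17
```

As the two calls show, the measure as written is not symmetric, so a clustering step has to pick a direction or combine both.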

Page 63: Girba Phd Presentation 2005-11-14

Ownership Map on alphabetically ordered files is not very useful, but …

Page 64: Girba Phd Presentation 2005-11-14

The ordered Ownership Map reveals developer patterns

Patterns: Dialogue, Monologue, Edit, Takeover, Familiarization

Page 66: Girba Phd Presentation 2005-11-14

Implementation: Both Hismo and its applications are implemented in one single infrastructure

Page 68: Girba Phd Presentation 2005-11-14

Implementation: All tools are integrated into Moose

Moose: integration mechanism, model repository, extensible meta-model

Tools: Van, CodeCrawler, Chronia, ConAn

Page 69: Girba Phd Presentation 2005-11-14

Conclusion: Hismo offers a uniform way of expressing evolution analyses

Questions?