Software Reengineering & Evolution
Serge Demeyer, Stéphane Ducasse, Oscar Nierstrasz
http://scg.unibe.ch/download/oorp/

Dec 26, 2015

Transcript
Page 1: Software Reengineering & Evolution Serge Demeyer Stéphane Ducasse Oscar Nierstrasz

Software Reengineering & Evolution

Serge Demeyer, Stéphane Ducasse, Oscar Nierstrasz

http://scg.unibe.ch/download/oorp/


© S. Demeyer, S. Ducasse, O. Nierstrasz Software Reengineering and Evolution.2

Schedule
1. Introduction: There are OO legacy systems too!
2. Reverse Engineering: How to understand your code
3. Visualization: Scalable approach
4. Restructuring: How to refactor your code
5. Code Duplication: The most typical problems
6. Software Evolution: Learn from the past
7. Conclusion

Goals
We will try to convince you:
• Yes, Virginia, there are object-oriented legacy systems too!
• Reverse engineering and reengineering are essential activities in the lifecycle of any successful software system (and especially OO ones!).
• There is a large set of lightweight tools and techniques to help you with reengineering.
• Despite these tools and techniques, people must do the job, and they represent the most valuable resource.

What is a Legacy System?
"legacy": A sum of money, or a specified article, given to another by will; anything handed down by an ancestor or predecessor. (Oxford English Dictionary)

A legacy system is a piece of software that:
• you have inherited, and
• is valuable to you.

Typical problems with legacy systems:
• original developers not available
• outdated development methods used
• extensive patches and modifications have been made
• missing or outdated documentation

So, further evolution and development may be prohibitively expensive.

Software Maintenance - Cost
Relative cost of fixing mistakes: requirement x 1, design x 5, coding x 10, testing x 20, delivery x 200
Relative maintenance effort: between 50% and 75% of global effort is spent on "maintenance"!

Solution?
• Better requirements engineering?
• Better software methods & tools (database schemas, CASE-tools, objects, components, …)?

Continuous Development
• 17.4% Corrective (fixing reported errors)
• 18.2% Adaptive (new platforms or OS)
• 60.3% Perfective (new functionality)
• 4.1% Other
data from [Lien78a]

The bulk of the maintenance cost is due to new functionality: even with better requirements, it is hard to predict new functions.

Modern Methods & Tools?
[Glas98a] quoting empirical study from Sasa Dekleva (1992)
• Modern methods(*) lead to more reliable software
• Modern methods lead to less frequent software repair
• and ...
• Modern methods lead to more total maintenance time

Contradiction? No!
• modern methods make it easier to change ... this capacity is used to enhance functionality!

(*) process-oriented structured methods, information engineering, data-oriented methods, prototyping, CASE-tools (not OO!)

Lehman's Laws
A classic study by Lehman and Belady [Lehm85a] identified several "laws" of system change.

Continuing change
• A program that is used in a real-world environment must change, or become progressively less useful in that environment.

Increasing complexity
• As a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure.

Those laws are still applicable…

What about Objects?
Object-oriented legacy systems
• = successful OO systems whose architecture and design no longer respond to changing requirements

Compared to traditional legacy systems
• The symptoms and the source of the problems are the same
• The technical details and solutions may differ

OO techniques promise better
• flexibility
• reusability
• maintainability
• …
but they do not come for free

What about Components?

Components are very brittle …
After a while one inevitably resorts to glue :)

Soccer Field Metaphor
• Assume 10 lines of code = 40 tiles of 1 x 1 cm
• 12.5 million lines of code: 40 soccer fields

Imagine 400 developers concurrently moving tiles around on 40 soccer fields

© A. van Deursen. A. van Deursen, De software-evolutieparadox, Intreerede TU Delft, 23 feb 2005

How to deal with Legacy?
New or changing requirements will gradually degrade the original design … unless extra development effort is spent to adapt the structure

New functionality: hack it in?
• duplicated code
• complex conditionals
• abusive inheritance
• large classes/methods
Take a loan on your software; pay back via reengineering (technical debt)

First …
• refactor
• restructure
• reengineer
Investment for the future, paid back during maintenance

Common Symptoms
Lack of knowledge
• obsolete or no documentation
• departure of the original developers or users
• disappearance of inside knowledge about the system
• limited understanding of entire system
• missing tests

Process symptoms
• too long to turn things over to production
• need for constant bug fixes
• maintenance dependencies
• difficulties separating products
• simple changes take too long

Code symptoms
• duplicated code
• code smells
• big build times

The Reengineering Life-Cycle
[Figure: the reengineering life-cycle across the levels Requirements, Designs and Code]
(0) requirement analysis
(1) model capture
(2) problem detection
(3) problem resolution
(4) program transformation

• people centric
• lightweight

A Map of Reengineering Patterns
• Tests: Your Life Insurance
• Detailed Model Capture
• Initial Understanding
• First Contact
• Setting Direction
• Migration Strategies
• Detecting Duplicated Code
• Redistribute Responsibilities
• Transform Conditionals to Polymorphism

2. Reverse Engineering
• What and Why
• First Contact: Interview during Demo
• Initial Understanding: Analyze the Persistent Data
• Detailed Model Capture: Look for the Contracts

What and Why?
Definition
Reverse Engineering is the process of analysing a subject system to identify the system's components and their interrelationships and create representations of the system in another form or at a higher level of abstraction. (Chikofsky & Cross, '90)

Motivation
Understanding other people's code (cf. newcomers in the team, code reviewing, original developers left, ...)

Generating UML diagrams is NOT reverse engineering ... but it is a valuable support tool

The Reengineering Life-Cycle
[Figure: the reengineering life-cycle, steps (0) requirement analysis to (4) program transformation, across Requirements, Designs and Code]

(0) req. analysis and (1) model capture issues
• scale
• speed
• accuracy
• politics

First Contact
Feasibility assessment (one week time)

System experts
• Talk with developers: Chat with the Maintainers
• Talk with end users: Interview during Demo
• Talk about it
• Verify what you hear

Software System
• Read it: Read All the Code in One Hour
• Read about it: Skim the Documentation
• Compile it: Do a Mock Installation

First Project Plan
Use standard templates, including:
• project scope (see "Setting Direction")
• opportunities, e.g., skilled maintainers, readable source-code, documentation
• risks, e.g., absent test-suites, missing libraries, …
 - record likelihood (unlikely, possible, likely) & impact (high, moderate, low) for causing problems
• go/no-go decision
• activities (fish-eye view)

Interview during Demo
Problem: What are the typical usage scenarios?
Solution: Ask the user!
• ... however
 - Which user?
 - Users complain
 - What should you ask?
• Solution: interview during demo
 - select several users
 - demo puts a user in a positive mindset
 - demo steers the interview

Initial Understanding
• Top down: Speculate about Design (recover design)
• Bottom up: Analyze the Persistent Data (recover database); Study the Exceptional Entities (identify problems)
• Goal: understand higher-level model

Analyze the Persistent Data
Problem: Which objects represent valuable data?
Solution: Analyze the database schema
• Prepare Model
 - tables → classes; columns → attributes
 - candidate keys (naming conventions + unique indices)
 - foreign keys (column types + naming conventions + view declarations + join clauses)
• Incorporate Inheritance
 - one to one; rolled down; rolled up
• Incorporate Associations
 - association classes (e.g. many-to-many associations)
 - qualified associations
• Verification
 - data samples + SQL statements
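The "Prepare Model" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, the function name `prepare_model` is made up for this sketch, and it relies on declared primary and foreign keys, whereas a real legacy schema often forces you to fall back on naming conventions, view declarations and join clauses as listed above.

```python
import sqlite3


def prepare_model(conn):
    """Derive a first-cut class model from a database schema.

    Each table becomes a candidate class, each column an attribute,
    and each declared foreign key a candidate association.
    """
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, default, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        fkeys = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        model[table] = {
            "attributes": [(c[1], c[2]) for c in columns],
            "keys": [c[1] for c in columns if c[5] > 0],
            "associations": [(fk[3], fk[2]) for fk in fkeys],
        }
    return model


# Hypothetical two-table schema, loosely based on the Person/Patient example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id CHAR(5) PRIMARY KEY, name CHAR(40), address CHAR(60));
    CREATE TABLE Patient (id CHAR(5) PRIMARY KEY REFERENCES Person(id),
                          insuranceID CHAR(7), insurance CHAR(5));
""")
model = prepare_model(conn)
print(model["Patient"]["associations"])
```

A foreign key that is also the primary key, as in `Patient.id` here, is a hint that the table may encode a subclass rather than a plain association.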

Example: One To One
Tables:
• Person (id: char(5), name: char(40), address: char(60))
• Patient (id: char(5), insuranceID: char(7), insurance: char(5))
• Salesman (id: char(5), company: char(40))
Classes (Patient and Salesman inherit from Person):
• Person (id: char(5), name: char(40), address: char(60))
• Patient (id: char(5), insuranceID: char(7), insurance: char(5))
• Salesman (id: char(5), company: char(40))

Example: Rolled Down
Tables:
• Patient (id: char(5), name: char(40), address: char(60), insuranceID: char(7), insurance: char(5))
• Salesman (id: char(5), name: char(40), address: char(60), company: char(40))
Classes (Patient and Salesman inherit from Person):
• Person (id: char(5), name: char(40), address: char(60))
• Patient (id: char(5), insuranceID: char(7), insurance: char(5))
• Salesman (id: char(5), company: char(40))

Example: Rolled Up
Table:
• Person (id: char(5), name: char(40), address: char(60), insuranceID: char(7) «optional», insurance: char(5) «optional», company: char(40) «optional»)
Classes (Patient and Salesman inherit from Person):
• Person (id: char(5), name: char(40), address: char(60))
• Patient (id: char(5), insuranceID: char(7), insurance: char(5))
• Salesman (id: char(5), company: char(40))
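As a rough illustration of the rolled-up case, the sketch below recovers the class of a row from the optional columns that are filled in. The rule "insuranceID present means Patient, company present means Salesman" is an assumption made for this example, as are the function name and the sample rows; a real schema may use a discriminator column or allow both to be set.

```python
from dataclasses import dataclass


@dataclass
class Person:
    id: str
    name: str
    address: str


@dataclass
class Patient(Person):
    insurance_id: str
    insurance: str


@dataclass
class Salesman(Person):
    company: str


def from_rolled_up_row(row):
    """Pick the subclass from which optional columns are filled in."""
    base = (row["id"], row["name"], row["address"])
    if row.get("insuranceID") is not None:
        return Patient(*base, row["insuranceID"], row.get("insurance"))
    if row.get("company") is not None:
        return Salesman(*base, row["company"])
    return Person(*base)


# Hypothetical rows from a rolled-up Person table.
rows = [
    {"id": "00001", "name": "Ann", "address": "Bern",
     "insuranceID": "1234567", "insurance": "AB123", "company": None},
    {"id": "00002", "name": "Bob", "address": "Antwerp",
     "insuranceID": None, "insurance": None, "company": "Acme"},
]
people = [from_rolled_up_row(r) for r in rows]
print([type(p).__name__ for p in people])
```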

Example: Qualified Association
Tables:
• Patient (id: char(5), …)
• Treatment (patientID: char(5), date: date, nr: integer, comment: varchar(255))
Classes:
• Patient (id: char(5), …), with operations addTreatment(d, n, t) and lookupTreatment(d, n)
• Treatment (comment: Text)
• Association Patient 1 to 1 Treatment, qualified by date: Date and nr: Integer
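One way to read the qualified association is as a lookup table owned by Patient and keyed by the qualifier (date, nr). The sketch below assumes exactly that; the method names follow the slide's addTreatment and lookupTreatment in Python naming style, and the sample data is invented.

```python
from datetime import date


class Treatment:
    def __init__(self, comment):
        self.comment = comment


class Patient:
    def __init__(self, patient_id):
        self.id = patient_id
        # Qualifier (date, nr) -> Treatment: at most one treatment per key.
        self._treatments = {}

    def add_treatment(self, d, n, t):
        self._treatments[(d, n)] = Treatment(t)

    def lookup_treatment(self, d, n):
        return self._treatments.get((d, n))


patient = Patient("00001")
patient.add_treatment(date(2005, 2, 23), 1, "first check-up")
found = patient.lookup_treatment(date(2005, 2, 23), 1)
print(found.comment)
```

The columns patientID, date and nr of the Treatment table disappear from the Treatment class: the first becomes the owning object, the other two become the dictionary key.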

Initial Understanding (revisited)
• Top down: Speculate about Design (recover design)
• Bottom up: Analyze the Persistent Data (recover database); Study the Exceptional Entities (identify problems)
• Goal: understand higher-level model
ITERATION

3. Software Visualization
• Introduction: The Reengineering life-cycle
• Examples
• Lightweight Approaches: CodeCrawler
• Dynamic Analysis
• Conclusion

The Reengineering Life-cycle
[Figure: the reengineering life-cycle, steps (0) requirement analysis to (4) program transformation, across Requirements, Designs and Code]

(2) problem detection issues
• Tool support
• Scalability
• Efficiency

Visualising Hierarchies
• Euclidean cones
 - Pros: more info than 2D
 - Cons: lack of depth, navigation
• Hyperbolic trees
 - Pros: good focus, dynamic
 - Cons: copyright

Bottom Up Visualisation
[Figure: all program entities and relations pass through a filter]

A lightweight approach
• A combination of metrics and software visualization
• Visualize software using colored rectangles for the entities and edges for the relationships
• Render up to five metrics on one node:
 - Size (1+2): width and height
 - Color (3): color tone
 - Position (4+5): X and Y coordinates

System Complexity View
• Nodes: Classes
• Edges: Inheritance Relationships
• Width: Number of attributes
• Height: Number of methods
• Color: Number of lines of code
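The metric-to-shape mapping of the System Complexity View is simple enough to sketch. The code below is not CodeCrawler's implementation; it is a minimal illustration in which the scale factor, the minimum box size, the 0 to 255 grey scale and the sample classes are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    name: str
    attributes: int  # number of attributes -> width
    methods: int     # number of methods -> height
    lines: int       # lines of code -> color tone


def system_complexity_nodes(classes, scale=4, min_size=4):
    """Map each class to a rectangle: width, height and grey tone."""
    max_lines = max((c.lines for c in classes), default=0) or 1
    nodes = {}
    for c in classes:
        nodes[c.name] = {
            "width": min_size + scale * c.attributes,
            "height": min_size + scale * c.methods,
            # 0 = white, 255 = black: more lines of code gives a darker box
            "shade": round(255 * c.lines / max_lines),
        }
    return nodes


# Hypothetical metrics for two classes.
nodes = system_complexity_nodes([
    ClassMetrics("Parser", attributes=3, methods=40, lines=1200),
    ClassMetrics("Token", attributes=5, methods=4, lines=60),
])
print(nodes["Parser"])
```

Tall, narrow, dark boxes (many methods, few attributes, much code) stand out at a glance, which is the point of putting several metrics on one node.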

Inheritance Classification View
• Boxes: Classes
• Edges: Inheritance
• Width: Number of Methods Added
• Height: Number of Methods Overridden
• Color: Number of Methods Extended

Data Storage Class Detection View
• Boxes: Classes
• Width: Number of Methods
• Height: Lines of Code
• Color: Lines of Code

Industrial Validation
• Nokia (C++, 1.2 MLOC, >2300 classes)
• Nokia (C++/Java, 120 kLOC, >400 classes)
• MGeniX (Smalltalk, 600 kLOC, >2100 classes)
• Bedag (COBOL, 40 kLOC)
• ...

Personal experience: 2-3 days to get something
Used by developers + consultants

[Figure: COBOL call graph]

Program Dynamics
• Simple
• Reproducible
• Scales well

Frequency Spectrum
• Visualization of similarities in event traces
• Eliminate similarities

Key Concept Identification
• Extract run-time coupling
• Apply datamining ("google")
• Experiment with documented open-source cases (Ant, JMeter): recall ± 90%, precision ± 60%

Class                  IC_CC' + web-mining   Ant docs
Project                √                     √
UnknownElement         √                     √
Task                   √                     √
Main                   √                     √
IntrospectionHelper    √                     √
ProjectHelper          √                     √
RuntimeConfigurable    √                     √
Target                 √                     √
ElementHandler         √                     √
TaskContainer          ×                     √
Recall (%)             90                    -
Precision (%)          60                    -
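The recall figure can be checked against the table: the Ant documentation names ten key classes and the technique finds nine of them. Precision cannot be recomputed from the table alone, because the classes that were reported but are not in the documentation are not listed; the six extra names below are purely hypothetical placeholders, chosen so that the arithmetic matches the 60% on the slide.

```python
def recall(found, relevant):
    """Fraction of the relevant classes that the technique found."""
    return len(found & relevant) / len(relevant)


def precision(found, relevant):
    """Fraction of the reported classes that are actually relevant."""
    return len(found & relevant) / len(found)


# Classes marked in the "Ant docs" column of the table.
documented = {
    "Project", "UnknownElement", "Task", "Main", "IntrospectionHelper",
    "ProjectHelper", "RuntimeConfigurable", "Target", "ElementHandler",
    "TaskContainer",
}

# Classes the technique found (every documented class except TaskContainer)
# plus six HYPOTHETICAL false positives; their real names are not in the table.
reported = (documented - {"TaskContainer"}) | {f"Other{i}" for i in range(1, 7)}

print(round(100 * recall(reported, documented)))     # 9 of 10 documented classes
print(round(100 * precision(reported, documented)))  # 9 of 15 reported classes
```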

Replication

T. Eisenbarth, R. Koschke, and D. Simon. Locating features in source code. IEEE Transactions on Software Engineering, 29(3):210–224, March 2003.

Replication is not supported, industrial cases are rare, …. In order to help the discipline mature, we think that more systematic empirical evaluation is needed. [Tonella et al., in Empirical Software Engineering]

Pilot Study: ATM Simulation
Assumptions
• Feature: invoked from the outside
• Map: scenario-feature map exists
• Recompile: recompile or instrumentation possible
• Isolate: system can run in isolation (prevent noise)
• Manual: perform dynamic analysis without help (i.e. no operator)
• Generic: no limit to granularity of computational unit

[Figures: scenario-feature map; concept lattice]


Case Study: Portfolio Management

2nd iteration

3rd iteration

4. Restructuring
Most common situations
• Transform Conditionals to Polymorphism
 - Transform Self Type Checks
 - Transform Provider Type Checks
• Redistribute Responsibilities
 - Move Behaviour Close to Data
 - Eliminate Navigation Code
 - Split up God Class
• Empirical Validation

Transform Conditionals to Polymorphism
• Test self type: Transform Self Type Checks
• Test provider type: Transform Client Type Checks
• Test object state: Factor Out State
• Factor Out Strategy
• Test null values: Introduce Null Object
• Test external attribute: Transform Conditionals into Registration

Example: Transform Conditional

Transform Self Type Checks

    class Message {
    private:
        int type_;
        void* data;
        ...
        void send (Channel* ch) {
            switch (type_) {
            case TEXT : {
                ch->nextPutAll(data);
                break;
            }
            case ACTION : {
                ch->doAction(data);
                ...

Transform Client Type Checks

    void makeCalls (Telephone* phoneArray[]) {
        for (Telephone *p = phoneArray; p; p++) {
            switch (p->phoneType()) {
            case TELEPHONE::POTS : {
                POTSPhone* potsp = (POTSPhone*)p;
                potsp->tourne();
                potsp->call();
                ...
            case TELEPHONE::ISDN : {
                ISDNPhone* isdnp = (ISDNPhone*)p;
                isdnp->initLine();
                isdnp->connect();
                ...

Transform Self Type Check
Before: Client1 and Client2 call Message.send(), which switches on its own type:

    switch (type_) {
    case TEXT : {
        ch->nextPutAll(data);
        break;
    }
    case ACTION : {
        ch->doAction(data);
        ...

After: Client1 and Client2 still call Message.send(); the subclasses TextMessage and ActionMessage each provide their own send().
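The target structure can be sketched in Python as follows. The class names come from the slide; the Channel stand-in, which only records what it is asked to do, is an assumption added so that the sketch runs on its own.

```python
from abc import ABC, abstractmethod


class Channel:
    """Stand-in for the slide's Channel; it only records the calls it gets."""

    def __init__(self):
        self.log = []

    def next_put_all(self, data):
        self.log.append(("text", data))

    def do_action(self, data):
        self.log.append(("action", data))


class Message(ABC):
    def __init__(self, data):
        self.data = data

    @abstractmethod
    def send(self, channel):
        """Each subclass holds one former branch of the switch."""


class TextMessage(Message):
    def send(self, channel):
        channel.next_put_all(self.data)


class ActionMessage(Message):
    def send(self, channel):
        channel.do_action(self.data)


channel = Channel()
for message in (TextMessage("hello"), ActionMessage("beep")):
    message.send(channel)  # clients no longer care about the message type
print(channel.log)
```

Adding a new kind of message now means adding a subclass, not editing a switch statement inside Message.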

Transform Client Type Check
Before: TelephoneBox.makeCall() tests the type of each Telephone (POTSPhone, ISDNPhone, ...).
After: Telephone declares makeCall(); POTSPhone, ISDNPhone, ... each implement makeCall(), and TelephoneBox.makeCall() calls it.
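A minimal Python sketch of the "after" situation, assuming that the phone-specific steps from the C++ example (tourne/call, initLine/connect) simply return a label so that the result can be inspected:

```python
class Telephone:
    def make_call(self):
        raise NotImplementedError


class POTSPhone(Telephone):
    def tourne(self):
        return "tourne"

    def call(self):
        return "call"

    def make_call(self):
        # The former POTS branch of the client's switch now lives here.
        return [self.tourne(), self.call()]


class ISDNPhone(Telephone):
    def init_line(self):
        return "initLine"

    def connect(self):
        return "connect"

    def make_call(self):
        # The former ISDN branch of the client's switch now lives here.
        return [self.init_line(), self.connect()]


class TelephoneBox:
    def __init__(self, phones):
        self.phones = phones

    def make_calls(self):
        # No type test and no cast: every provider answers make_call().
        return [phone.make_call() for phone in self.phones]


box = TelephoneBox([POTSPhone(), ISDNPhone()])
print(box.make_calls())
```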

Redistribute Responsibilities
• Data containers: Move Behaviour Close to Data
• Chains of data containers: Eliminate Navigation Code
• Monster client of data containers: Split Up God Class

Move Behavior Close to Data (example 1/2)

Employee
    +telephoneNrs
    +name(): String
    +address(): String

Payroll
    +printEmployeeLabel()

    System.out.println(currentEmployee.name());
    System.out.println(currentEmployee.address());
    for (int i=0; i < currentEmployee.telephoneNumbers.length; i++) {
        System.out.print(currentEmployee.telephoneNumbers[i]);
        System.out.print(" ");
    }
    System.out.println("");

TelephoneGuide
    +printEmployeeTelephones()

    … for … System.out.print(" -- "); …

Move Behavior Close to Data (example 2/2)

Employee
    - telephoneNrs
    - name(): String
    - address(): String
    +printLabel(String)

    public void printLabel (String separator) {
        System.out.println(_name);
        System.out.println(_address);
        for (int i=0; i < telephoneNumbers.length; i++) {
            System.out.print(telephoneNumbers[i]);
            System.out.print(separator);
        }
        System.out.println("");
    }

Payroll
    +printEmployeeLabel()

    … emp.printLabel(" "); …

TelephoneGuide
    +printEmployeeTelephones()

    … emp.printLabel(" -- "); …
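The same move can be sketched in Python. The split into `label` (builds the text) and `print_label` (prints it) is a choice made for this sketch so that the output can be checked; the sample employee is invented.

```python
class Employee:
    def __init__(self, name, address, telephone_numbers):
        self._name = name
        self._address = address
        self._telephone_numbers = list(telephone_numbers)

    def label(self, separator):
        """Build the label next to the data it needs."""
        phones = "".join(n + separator for n in self._telephone_numbers)
        return f"{self._name}\n{self._address}\n{phones}\n"

    def print_label(self, separator):
        print(self.label(separator), end="")


class Payroll:
    def print_employee_label(self, emp):
        emp.print_label(" ")


class TelephoneGuide:
    def print_employee_telephones(self, emp):
        emp.print_label(" -- ")


emp = Employee("Ann", "Bern", ["111", "222"])
Payroll().print_employee_label(emp)
TelephoneGuide().print_employee_telephones(emp)
```

Both clients shrink to one call, and the employee's fields no longer need to be public.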

Eliminate Navigation Code

Step 0: Car navigates through Engine to Carburetor
• Car (-engine, +increaseSpeed()): … engine.carburetor.fuelValveOpen = true
• Engine (+carburetor)
• Carburetor (+fuelValveOpen)

Step 1: Engine hides its carburetor
• Car (-engine, +increaseSpeed()): … engine.speedUp()
• Engine (-carburetor, +speedUp()): carburetor.fuelValveOpen = true
• Carburetor (+fuelValveOpen)

Step 2: Carburetor hides its valve
• Car (-engine, +increaseSpeed())
• Engine (-carburetor, +speedUp()): carburetor.openFuelValve()
• Carburetor (-fuelValveOpen, +openFuelValve()): fuelValveOpen = true
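The end state of the refactoring, where each object only talks to its direct collaborator, could look like this in Python. The read-only `fuel_valve_open` property is an addition for this sketch so that the effect can be observed.

```python
class Carburetor:
    def __init__(self):
        self._fuel_valve_open = False

    def open_fuel_valve(self):
        self._fuel_valve_open = True

    @property
    def fuel_valve_open(self):
        return self._fuel_valve_open


class Engine:
    def __init__(self):
        self._carburetor = Carburetor()

    def speed_up(self):
        # Engine is the only class that knows a carburetor exists.
        self._carburetor.open_fuel_valve()


class Car:
    def __init__(self):
        self._engine = Engine()

    def increase_speed(self):
        # Before: self.engine.carburetor.fuel_valve_open = True
        self._engine.speed_up()


car = Car()
car.increase_speed()
```

If the engine is later replaced by one without a carburetor, only Engine changes; Car is untouched.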

Split Up God Class
Problem: How to break up a class which monopolizes control?
Solution: Incrementally eliminate navigation code
• Detection:
 - measuring size
 - class names containing Manager, System, Root, Controller
 - the class that all maintainers are avoiding
• How:
 - move behaviour close to data + eliminate navigation code
 - remove or deprecate façade
• However:
 - If God Class is stable, then don't split: shield client classes from the god class
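The first two detection hints (size and suspicious names) are easy to automate; the third (the class everybody avoids) is not. The sketch below is a rough heuristic under stated assumptions: size is measured in lines of code, "large" means more than three times the mean, and the sample system is invented. It produces candidates to inspect, not a verdict.

```python
SUSPICIOUS_WORDS = ("Manager", "System", "Root", "Controller")


def god_class_candidates(classes, size_factor=3.0):
    """classes maps class name -> lines of code.

    Returns (name, reasons) pairs for classes that are much larger than
    average or whose name contains one of the suspicious words.
    """
    if not classes:
        return []
    mean = sum(classes.values()) / len(classes)
    candidates = []
    for name, loc in classes.items():
        reasons = []
        if loc > size_factor * mean:
            reasons.append("size")
        if any(word in name for word in SUSPICIOUS_WORDS):
            reasons.append("name")
        if reasons:
            candidates.append((name, reasons))
    return candidates


# Hypothetical system: class name -> lines of code.
sizes = {"OrderManager": 4200, "Invoice": 300, "Customer": 250,
         "Address": 90, "RootNode": 120}
print(god_class_candidates(sizes))
```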

Split Up God Class: 5 variants
Mail client filters incoming mail
• A: Controller
• B: Controller, Filter1, Filter2 (A to B: extract behavioral class)
• C: Controller, Filter1, Filter2, MailHeader (B to C: extract data class)
• D: Controller, Filter1, Filter2, MailHeader, FilterAction (C to D: extract behavioral class)
• E: Controller, Filter1, Filter2, MailHeader, FilterAction, NameValuePair (D to E: extract data class)

Empirical Validation
• Controlled experiment with 63 last-year master-level students (CS and ICT)

Independent variables
• Experimental task
• Institution
• God class decomposition

Dependent variables
• Time
• Accuracy

Interpretation of Results
• "Optimal decomposition" differs with respect to training
 - Computer science: preference towards C-E
 - ICT-electronics: preference towards A-C
• Advanced OO training can induce a preference towards particular styles of decomposition
 - Consistent with [Arisholm et al. 2004]

"Good" design is in the eye of the beholder

5. Code Duplication
a.k.a. Software Cloning, Copy&Paste Programming

• Code Duplication
 - What is it?
 - Why is it harmful?
• Detecting Code Duplication
• Approaches
• A Lightweight Approach
• Visualization (dotplots)
• Duploc
• Recent trends

The Reengineering Life-Cycle
[Figure: the reengineering life-cycle, steps (0) requirement analysis to (3) problem resolution, across Requirements, Designs and Code]

(2) Problem detection issues
• Scale
• Unknown a priori

Code is Copied
Small Example from the Mozilla Distribution (Milestone 9)
Extract from /dom/src/base/nsLocation.cpp

    NS_IMETHODIMP
    LocationImpl::GetPathname(nsString
    {
      nsAutoString href;
      nsIURI *url;
      nsresult result = NS_OK;

      result = GetHref(href);
      if (NS_OK == result) {
    #ifndef NECKO
        result = NS_NewURL(&url, href);
    #else
        result = NS_NewURI(&url, href);
    #endif // NECKO
        if (NS_OK == result) {
    #ifdef NECKO
          char* file;
          result = url->GetPath(&file);
    #else
          const char* file;
          result = url->GetFile(&file);
    #endif
          if (result == NS_OK) {
            aPathname.SetString(file);
    #ifdef NECKO
            nsCRT::free(file);
    #endif
          }
          NS_IF_RELEASE(url);
        }
      }

      return result;
    }

    NS_IMETHODIMP
    LocationImpl::SetPathname(const nsString
    {
      nsAutoString href;
      nsIURI *url;
      nsresult result = NS_OK;

      result = GetHref(href);
      if (NS_OK == result) {
    #ifndef NECKO
        result = NS_NewURL(&url, href);
    #else
        result = NS_NewURI(&url, href);
    #endif // NECKO
        if (NS_OK == result) {
          char *buf = aPathname.ToNewCString();
    #ifdef NECKO
          url->SetPath(buf);
    #else
          url->SetFile(buf);
    #endif
          SetURL(url);
          delete[] buf;
          NS_RELEASE(url);
        }
      }

      return result;
    }

    NS_IMETHODIMP
    LocationImpl::GetPort(nsString& aPort)
    {
      nsAutoString href;
      nsIURI *url;
      nsresult result = NS_OK;

      result = GetHref(href);
      if (NS_OK == result) {
    #ifndef NECKO
        result = NS_NewURL(&url, href);
    #else
        result = NS_NewURI(&url, href);
    #endif // NECKO
        if (NS_OK == result) {
          aPort.SetLength(0);
    #ifdef NECKO
          PRInt32 port;
          (void)url->GetPort(&port);
    #else
          PRUint32 port;
          (void)url->GetHostPort(&port);
    #endif
          if (-1 != port) {
            aPort.Append(port, 10);
          }
          NS_RELEASE(url);
        }
      }

      return result;
    }

How Much Code is Duplicated?
Usual estimates: 8 to 12% in normal industrial code
15 to 25% is already a lot!

Case Study        LOC       Duplication without comments   Duplication with comments
gcc               460'000   8.7%                           5.6%
Database Server   245'000   36.4%                          23.3%
Payroll           40'000    59.3%                          25.4%
Message Board     6'500     29.4%                          17.4%

Copied Code Problems
• General negative effect: code bloat
• Negative effects on software maintenance
 - Copied defects
 - Changes take double, triple, quadruple, ... work
 - Dead code
 - Add to the cognitive load of future maintainers
• Copying as additional source of defects
 - Errors in the systematic renaming produce unintended aliasing
• Metaphorically speaking:
 - Software Aging, "hardening of the arteries"
 - "Software Entropy" increases: even small design changes become very difficult to effect

Code Duplication Detection
Nontrivial problem:
• No a priori knowledge about which code has been copied
• How to find all clone pairs among all possible pairs of segments?

• Lexical equivalence: Type I
• Syntactical equivalence: Type II (& Type III)
• Semantic equivalence: Type IV

General Schema of Detection Process
Source Code → (Transformation) → Transformed Code → (Comparison) → Duplication Data

Author       Level         Transformed Code        Comparison Technique
[John94a]    Lexical       Substrings              String-Matching
[Duca99a]    Lexical       Normalized Strings      String-Matching
[Bake95a]    Syntactical   Parameterized Strings   String-Matching
[Mayr96a]    Syntactical   Metric Tuples           Discrete comparison
[Kont97a]    Syntactical   Metric Tuples           Euclidean distance
[Baxt98a]    Syntactical   AST                     Tree-Matching


Simple Detection Approach (i)

• Assumption:
  • Code segments are just copied and changed at a few places
• Code Transformation Step
  • remove white space, comments
  • remove lines that contain uninteresting code elements (e.g., just ‘else’ or ‘}’)

Before transformation:

    …
    //assign same fastid as container
    fastid = NULL;
    const char* fidptr = get_fastid();
    if(fidptr != NULL) {
        int l = strlen(fidptr);
        fastid = new char[ l + 1 ];

After transformation:

    …
    fastid=NULL;
    constchar*fidptr=get_fastid();
    if(fidptr!=NULL){
    intl=strlen(fidptr);
    fastid=newchar[l+1];
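The transformation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used in the case study: the function name `transform` and the `UNINTERESTING` noise set are hypothetical, and comment markers inside string literals are not handled.

```python
import re

# Assumed noise set: lines that carry no information once white space is gone.
UNINTERESTING = {"", "else", "{", "}", ";"}

def transform(source: str) -> list[tuple[int, str]]:
    """Return (original line number, normalized text) for each interesting line."""
    # Blank out /* ... */ comments but keep their newlines,
    # so that the reported line numbers still match the original file.
    source = re.sub(r"/\*.*?\*/",
                    lambda m: "\n" * m.group(0).count("\n"),
                    source, flags=re.S)
    result = []
    for number, line in enumerate(source.splitlines(), start=1):
        line = re.sub(r"//.*", "", line)   # drop // comments
        line = re.sub(r"\s+", "", line)    # drop all white space
        if line not in UNINTERESTING:      # drop uninteresting lines
            result.append((number, line))
    return result
```

Keeping the original line number next to each normalized line is what later allows a match to be reported as a location in the source file.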


Simple Detection Approach (ii)

• Code Comparison Step
  • Line-based comparison (Assumption: layout did not change during copying)
  • Compare each line with each other line.
  • Reduce search space by hashing:
    1. Preprocessing: compute the hash value for each line
    2. Actual comparison: compare all lines in the same hash bucket
• Evaluation of the Approach
  • Advantages: simple, language independent
  • Disadvantages: difficult interpretation
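The comparison step can be sketched as follows, assuming the lines have already been normalized. The function name `find_clones` and the `window` parameter are hypothetical. A Python dictionary does the hashing: each key is hashed into a bucket, and only keys that land in the same bucket are compared for equality.

```python
from collections import defaultdict

def find_clones(lines, window=1):
    """lines: list of (file, line number, normalized text), in file order.

    Returns {chunk text: [(file, first line number), ...]} for every chunk
    of `window` consecutive lines that occurs more than once.
    """
    buckets = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = lines[i:i + window]
        if len({file for file, _, _ in chunk}) != 1:
            continue  # a window must not span two files
        key = "\n".join(text for _, _, text in chunk)
        buckets[key].append((chunk[0][0], chunk[0][1]))
    # Keep only the buckets that hold at least two locations.
    return {key: locs for key, locs in buckets.items() if len(locs) > 1}
```

A larger `window` reports fewer but longer matches, which reduces noise from short lines that recur by coincidence.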


A Perl script for C++ (i)

    # Configuration
    $equivalenceClassMinimalSize = 1;
    $slidingWindowSize = 5;
    $removeKeywords = 0;
    @keywords = qw(if then else);
    $keywordsRegExp = join '|', @keywords;

    @unwantedLines = qw( else return return; { } ; );
    push @unwantedLines, @keywords;

    # Main loop
    while (<>) {
        chomp;
        $totalLines++;

        # remove comments of type /* */
        my $codeOnly = '';
        while (($inComment && m|\*/|) || (!$inComment && m|/\*|)) {
            unless ($inComment) { $codeOnly .= $` }
            $inComment = !$inComment;
            $_ = $';
        }
        $codeOnly .= $_ unless $inComment;
        $_ = $codeOnly;

        s|//.*$||;                                  # remove comments of type //
        s/\s+//g;                                   # remove white space
        s/$keywordsRegExp//og if $removeKeywords;   # remove keywords


A Perl script for C++ (ii)

        $codeLines++;
        push @currentLines, $_;
        push @currentLineNos, $.;
        if ($slidingWindowSize < @currentLines) {
            shift @currentLines;
            shift @currentLineNos;
        }
        #print STDERR "Line $totalLines >$_<\n";
        my $lineToBeCompared = join '', @currentLines;
        my $lineNumbersCompared = "<$ARGV>";   # append the name of the file
        $lineNumbersCompared .= join '/', @currentLineNos;
        #print STDERR "$lineNumbersCompared\n";
        if ($bucketRef = $eqLines{$lineToBeCompared}) {
            push @$bucketRef, $lineNumbersCompared;
        } else {
            $eqLines{$lineToBeCompared} = [ $lineNumbersCompared ];
        }
        if (eof) { close ARGV }   # Reset linenumber-count for next file
    }

• Handles multiple files
• Removes comments and white spaces
• Controls noise (if, {, )
• Granularity (number of lines)
• Possible to remove keywords


Output Sample

Lines:
    create_property(pd,pnImplObjects,stReference,false,*iImplObjects);
    create_property(pd,pnElttype,stReference,true,*iEltType);
    create_property(pd,pnMinelt,stInteger,true,*iMinelt);
    create_property(pd,pnMaxelt,stInteger,true,*iMaxelt);
    create_property(pd,pnOwnership,stBool,true,*iOwnership);
Locations:
    </face/typesystem/SCTypesystem.C>6178/6179/6180/6181/6182
    </face/typesystem/SCTypesystem.C>6198/6199/6200/6201/6202

Lines:
    create_property(pd,pnSupertype,stReference,true,*iSupertype);
    create_property(pd,pnImplObjects,stReference,false,*iImplObjects);
    create_property(pd,pnElttype,stReference,true,*iEltType);
    create_property(pd,pnMinelt,stInteger,true,*iMinelt);
    create_property(pd,pnMaxelt,stInteger,true,*iMaxelt);
Locations:
    </face/typesystem/SCTypesystem.C>6177/6178
    </face/typesystem/SCTypesystem.C>6229/6230

Lines = duplicated lines
Locations = file names and line numbers


Visualization of Duplicated Code

• Visualization provides insights into the duplication situation
• A simple version can be implemented in three days
• Scalability issue
• Dotplots: technique from DNA analysis
  • Code is put on the vertical as well as the horizontal axis
  • A match between two elements is a dot in the matrix

[Figure: dotplot patterns for Exact Copies, Copies with Variations, Inserts/Deletes, and Repetitive Code Elements]
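A dotplot is simple to compute once the code has been reduced to a sequence of comparable elements. The sketch below is a minimal text-only version; the names `dotplot` and `render` are hypothetical, and a real tool would draw pixels rather than characters.

```python
def dotplot(elements_x, elements_y):
    """matrix[row][col] is True when elements_y[row] equals elements_x[col]."""
    return [[y == x for x in elements_x] for y in elements_y]

def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )

# A sequence compared with itself: the second half is an exact copy of the first.
sequence = list("abcabc")
picture = render(dotplot(sequence, sequence))
```

Besides the main diagonal, which every self-comparison produces, the exact copy shows up as shorter diagonals parallel to it. The matrix has one cell per pair of elements, which is where the scalability issue comes from.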


Visualization of Copied Code Sequences

All examples are made using Duploc from an industrial case study (1 million LOC C++ system)

Detected Problem:
• File A contains two copies of a piece of code
• File B contains another copy of this code

Possible Solution: Extract Method

[Figure: dotplot with File A and File B on both axes]


Visualization of Repetitive Structures

Detected Problem: 4 object factory clones: a switch statement over a type variable is used to call individual construction code

Possible Solution: Strategy Method


Visualization of Cloned Classes

[Figure: dotplot with Class A and Class B on both axes]

Detected Problem: Class A is an edited copy of Class B (editing & insertion)

Possible Solution: Subclassing …


Visualization of Clone Families

20 classes implementing lists for different data types

[Figure: dotplots, Overview and Detail]

Recent Trends

Clone Detection Inside
• Duplicate Bug Fixes
• Plagiarism
• Licence Infringement & Provenance
• Malware Detection


6. Software Evolution

• Exploiting the Version Control System
  • Visualizing CVS changes
• The Evolution Matrix
• Yesterday's weather

It is not age that turns a piece of software into a legacy system, but the rate at which it has been developed and adapted without being reengineered.

[Demeyer, Ducasse and Nierstrasz: Object-Oriented Reengineering Patterns]


The Reengineering Life-Cycle

[Figure: life-cycle across Requirements, Designs and Code, with the steps (0) requirement analysis, (1) model capture, (2) problem detection, (3) problem resolution]

(2) Problem detection

Issues
• scale


Analyse CVS changes

1) Vertical lines = Frequent Changers

2) Horizontal line = Shotgun Surgery

3) Triangle = Core Reduces

4) Block Shift = Design Change
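The first two patterns can also be found without a picture. The sketch below assumes that a vertical line stands for one file changed in many commits and a horizontal line for one commit touching many files; the function names and both thresholds are hypothetical, and the commit data would have to be extracted from the version control log first.

```python
from collections import Counter

def frequent_changers(commits, min_changes):
    """commits: list of (commit id, list of changed files).

    Returns the files that were changed in at least `min_changes` commits.
    """
    counts = Counter(f for _, files in commits for f in set(files))
    return sorted(f for f, n in counts.items() if n >= min_changes)

def shotgun_commits(commits, min_files):
    """Returns the ids of commits that touched at least `min_files` files."""
    return [cid for cid, files in commits if len(set(files)) >= min_files]
```

Both thresholds depend on the size and history of the system, so they are better tuned by looking at the distribution of the counts than fixed in advance.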


Ownership Map: Developer Activity

• Dialogue
• Monologue
• Edit
• Takeover
• Familiarization

What to (re)test?

Data from Windows Vista and Windows 7

Software components with a high level of ownership will have fewer failures than components with lower top ownership levels.

Software components with many minor contributors will have more failures than software components that have fewer.
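Both findings rest on measures that can be computed from the commit history of a component. The sketch below assumes that ownership is the share of commits made by the most active contributor and that a minor contributor is one whose share falls below a cut-off; the function name `ownership` and the 5% default are assumptions for illustration.

```python
from collections import Counter

def ownership(authors, minor_threshold=0.05):
    """authors: one entry per commit to the component, naming its author.

    Returns (share of the top contributor, sorted list of minor contributors).
    """
    counts = Counter(authors)
    total = sum(counts.values())
    shares = {author: n / total for author, n in counts.items()}
    top_share = max(shares.values())
    # Assumed definition: a minor contributor has a share below the threshold.
    minors = sorted(a for a, share in shares.items() if share < minor_threshold)
    return top_share, minors
```

Components with a low top share or a long list of minor contributors are then the candidates to (re)test first.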


The Evolution Matrix

[Figure: the Evolution Matrix. Horizontal axis: TIME (Versions), from First Version to Last Version. Annotations: Added Classes, Removed Classes, Growth, Stabilisation, Major Leap]
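The data behind such a matrix is easy to assemble. The sketch below assumes one column per version and one row per class, with a size metric (for example, number of methods) in each cell; the function names and the choice of metric are hypothetical.

```python
def evolution_matrix(versions):
    """versions: list of {class name: size}, ordered from first to last version.

    Returns {class name: [size, or None where the class does not exist]}.
    """
    names = sorted({name for version in versions for name in version})
    return {name: [version.get(name) for version in versions] for name in names}

def added_and_removed(versions):
    """For each step between versions: (added classes, removed classes)."""
    steps = []
    for previous, current in zip(versions, versions[1:]):
        added = sorted(current.keys() - previous.keys())
        removed = sorted(previous.keys() - current.keys())
        steps.append((added, removed))
    return steps
```

A step with many added classes corresponds to growth or a major leap, while a run of steps with no additions or removals corresponds to stabilisation.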


Example: MooseFinder (38 Versions)


Test history

[Figure: test history, annotated with: single test, unit tests, integration tests, “… affect unit tests”, phased testing]

System under study = checkstyle

Selenium Tests

Git repositories of XWiki, OpenLMIS and Atlas

© Laurent Christophe (Vrije Universiteit Brussel)

Avoid Magic Constants !!

Recommender Systems

Stack Trace ⇒ link to source code

Description ⇒ text mining

Who should fix it? How long will the fix take?

Misclassified bug reports?


7. Conclusion

1. Introduction: There are OO legacy systems too!
2. Reverse Engineering: How to understand your code
3. Visualization: Scaleable approach
4. Restructuring: How to Refactor Your Code
5. Code Duplication: The most typical problems
6. Software Evolution: Learn from the past
7. Conclusion: Did we convince you?


Goals

We will try to convince you:

• Yes, Virginia, there are object-oriented legacy systems too!
  … actually, that's a sign of health
• Reverse engineering and reengineering are essential activities in the lifecycle of any successful software system. (And especially OO ones!)
  … consequently, do not consider it second-class work
• There is a large set of lightweight tools and techniques to help you with reengineering.
  … check our book, but remember the list is growing
• Despite these tools and techniques, people must do the job and represent the most valuable resource.
  … pick them carefully and reward them properly

Did we convince you?