Documenting the feature implementations mined from the OO source code of a collection of software variants Rafat AL-MSIE’DEEN, Abdelhak-Djamel SERIAI, Marianne HUCHARD, Christelle URTADO and Sylvain VAUTTIER LIRMM / CNRS and Montpellier 2 University - Montpellier - France LGI2P / Ecole des Mines d’Alès - Nîmes - France
Documenting the Mined Feature Implementations from the Object-oriented Source Code of a Collection of Software Product Variants
Documenting the feature implementations mined from the OO source code of a collection of software variants
Rafat AL-MSIE’DEEN, Abdelhak-Djamel SERIAI, Marianne HUCHARD, Christelle URTADO and Sylvain VAUTTIER
LIRMM / CNRS and Montpellier 2 University - Montpellier - France
LGI2P / Ecole des Mines d’Alès - Nîmes - France
Christelle URTADO - LGI2P / ECOLE DES MINES D’ALES
Outline
Context of software product line (reverse) engineering
The big picture of our proposal for software product line reverse engineering
Overview of the feature documentation process
Used techniques (in a nutshell)
Step by step feature documentation process on an example
Conclusion and perspectives
Undisciplined development of software product variants
Software product variants are often developed in an undisciplined manner.
Ad-hoc reuse: copying and modifying previous software code (clone and own).
The result is a set of software products that:
– implicitly share some (but not all) code,
– are hard to understand, maintain and evolve.
Developing new variants still requires effort.
Context
Benefits of software product line engineering
Making shared (common) features and specific (variable) ones explicit is a plus.
There is an abstract model of the developed products: increases understandability.
Products are easier to maintain and evolve: single point of maintenance.
Future products are easier to develop: disciplined reuse.
Software product line engineering
Domain engineering
  A repository for reusable software feature implementations
  A model of valid software products
    – FODA feature model
Application engineering
  A software application development process based on feature selection
Software product line engineering
[Figure: domain engineering yields a feature implementation repository and a FODA feature model of "My product line" (features: Circle, Square, The triangles, The rectangles, Rectangle, Right triangle, Equilateral triangle, Parallelogram; legend: mandatory, optional, alternative (xor), or); application engineering yields the software product variants’ code (applications).]
Software product line (reverse) engineering
[Figure: feature implementation repository and FODA feature model of "My product line" (domain engineering) next to the software product variants’ code (application engineering), linked by "New product derivation" and "Software product line reverse engineering".]
FODA: Feature-oriented domain analysis
Big picture
Software product line reverse engineering process
Automatically builds a « canonical » feature model from:
– the object-oriented source code of software variants,
– the software variant use-cases, if available.
Assumes:
– software code is object-oriented,
– software code statically reflects the features it implements (no pre-compiling, macros, parameterizable code, etc.),
– software code respects best practices relative to naming (names of source code elements are relevant),
– a given feature is always implemented with identical code,
– feature implementations are disjoint,
– features are functional.
Software product line reverse engineering process
[Figure: the three-step process. STEP 1, feature mining, takes the software product variants’ code and produces the common and variable features’ implementations. STEP 2, feature documentation, uses the use cases and their descriptions (use-case-1, use-case-2) to produce named and documented common and variable features (Circle, Square, Rectangle, Equilateral triangle, Parallelogram, Right triangle). STEP 3, feature model building, produces the « canonical » FODA feature model with constraints (XOR, OR) for "My product line".]
Feature documentation
Feature documentation:
– automatically assigns a meaningful name to previously mined feature implementations,
– can be based either on:
    source code analysis (using the most frequent words extracted from the feature’s OO source code elements, i.e. identifiers),
    use-case names, when available (assuming a functional feature to use-case correspondence).
Automatically assigning a use-case name to a feature implementation amounts to automatically identifying the use-case that corresponds to that feature implementation.
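The source-code-based naming option can be illustrated with a minimal Python sketch. It assumes identifiers follow camelCase or snake_case; the identifier list and the helper names `split_identifier` and `suggest_name` are hypothetical and are not taken from the authors' tool.

```python
import re
from collections import Counter

# Hypothetical identifiers taken from one mined feature implementation.
identifiers = ["viewMap", "MapViewer", "loadMapTiles", "zoomMap", "map_center"]


def split_identifier(identifier):
    """Split a camelCase / snake_case identifier into lower-case words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", identifier)
    return [word.lower() for word in re.split(r"[\s_]+", spaced) if word]


def suggest_name(identifiers, top_n=2):
    """Return the top_n most frequent words found in the identifiers."""
    words = Counter()
    for identifier in identifiers:
        words.update(split_identifier(identifier))
    return [word for word, _ in words.most_common(top_n)]


print(suggest_name(identifiers))
```

The most frequent word here is "map", which would be the core of the proposed feature name.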
Overview
Feature documentation process
A three-step process:
1) eliminate insignificant use-cases from the search (to reduce the search space for the next step), by making small groups of use-cases and their corresponding features: hybrid blocks,
2) within each hybrid block, compute the textual similarity between the features’ code and the use-cases,
3) use these similarities to build the input of a clustering technique that associates a use-case (and therefore its name) to each feature implementation.
Using three techniques:
– Relational concept analysis (RCA) for step 1,
– Latent semantic indexing (LSI) for step 2,
– Formal concept analysis (FCA) for step 3.
FCA extracts abstractions (concepts) from a set of objects described by attributes.
A concept is a maximal set of objects (extent) that share a maximal set of attributes (intent).

Binary context
      a1   a2   a3   a4   a5
o1    x
o2         x    x
o3    x         x    x
o4    x         x         x

Concept lattice (extent, intent)
({o1,o2,o3,o4},{})
({o1,o3,o4},{a1})
({o2,o3,o4},{a3})
({o3,o4},{a1,a3})
({o2},{a2,a3})
({o3},{a1,a3,a4})
({o4},{a1,a3,a5})
({},{a1,a2,a3,a4,a5})
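The concepts listed on this slide can be recomputed from the binary context. The sketch below is a naive enumeration (it closes every subset of objects), which is only reasonable for tiny contexts; it is not the algorithm used by the authors' tooling, and the function names are illustrative.

```python
from itertools import combinations

# Binary context from the slide: object -> set of attributes it owns.
context = {
    "o1": {"a1"},
    "o2": {"a2", "a3"},
    "o3": {"a1", "a3", "a4"},
    "o4": {"a1", "a3", "a5"},
}
all_attributes = set().union(*context.values())


def intent_of(objects):
    """Attributes shared by every object (all attributes if no object)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def extent_of(attributes):
    """Objects that own every attribute in the given set."""
    return {obj for obj, owned in context.items() if set(attributes) <= owned}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing each subset of objects."""
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = intent_of(subset)
            concepts.add((frozenset(extent_of(intent)), frozenset(intent)))
    return concepts


concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Running it yields the eight concepts of the lattice, from the top concept (all objects, no shared attribute) to the bottom concept (no object, all attributes).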
Techniques
FCA and RCA in a nutshell
Technique to simplify the reading
  Represent attributes (resp. objects) only where they first appear from the top down (resp. bottom up): simplified intent (resp. extent).
Technique to tame complexity
  Consider a sub-order by removing the concepts that have both an empty simplified intent and an empty simplified extent: the lattice becomes an AOC-poset.
Relational Concept Analysis
  An iterative version of FCA in which objects are described by attributes and relations.
  Generates a relational lattice family.
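As a rough illustration of the AOC-poset idea, the sketch below keeps only the concepts that introduce at least one object or one attribute, i.e. those whose simplified extent or simplified intent is not empty. It reuses the small binary context of the FCA example; the function names are illustrative and the code is a sketch, not the authors' implementation.

```python
# Binary context of the FCA example: object -> set of attributes it owns.
context = {
    "o1": {"a1"},
    "o2": {"a2", "a3"},
    "o3": {"a1", "a3", "a4"},
    "o4": {"a1", "a3", "a5"},
}
all_attributes = set().union(*context.values())


def intent_of(objects):
    """Attributes shared by every object (all attributes if no object)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def extent_of(attributes):
    """Objects that own every attribute in the given set."""
    return {obj for obj, owned in context.items() if set(attributes) <= owned}


def aoc_poset():
    """Collect the object-concepts and attribute-concepts of the context."""
    kept = set()
    for obj in context:  # concept that introduces this object
        intent = intent_of({obj})
        kept.add((frozenset(extent_of(intent)), frozenset(intent)))
    for attribute in all_attributes:  # concept that introduces this attribute
        extent = extent_of({attribute})
        kept.add((frozenset(extent), frozenset(intent_of(extent))))
    return kept


poset = aoc_poset()
for extent, intent in sorted(poset, key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent))
```

On this context, five of the eight concepts remain: the top concept, the bottom concept and ({o3,o4},{a1,a3}) introduce neither an object nor an attribute and are dropped.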
LSI in a nutshell
LSI is an information retrieval technique that computes lexical similarity between documents.
It is based on the occurrence of terms in documents.
The number of dimensions (k) kept in the reduced space is a parameter of this method.
The Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is applied.
Cosine similarity is computed.
Before their analysis, texts are pre-processed:
– stop words, articles, punctuation marks and numbers are filtered out,
– all text is lower-cased,
– text is stemmed (using WordNet API).
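The weighting and similarity steps can be sketched in plain Python as follows. This is only an approximation of the pipeline: the rank-k reduction (singular value decomposition) that makes LSI "latent" and the stemming step are left out, the stop-word list is illustrative, a smoothed IDF variant is assumed, and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; the slides do not give the one actually used.
STOP_WORDS = {"a", "an", "the", "of", "on", "to", "and"}


def preprocess(text):
    """Split camelCase, lower-case, drop numbers, punctuation and stop words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t for t in re.findall(r"[a-z]+", spaced.lower()) if t not in STOP_WORDS]


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per text."""
    token_lists = [preprocess(text) for text in texts]
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    n_docs = len(texts)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical documents (feature code) and query (use-case name).
documents = ["viewMap() showMap zoomMap", "playAudio() audioGuide volume"]
query = "View the map"
*doc_vectors, query_vector = tfidf_vectors(documents + [query])
scores = [cosine(query_vector, doc) for doc in doc_vectors]
print(scores)
```

The query shares the words "view" and "map" with the first document only, so the first score is positive and the second is zero.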
Mobile tourist guide (MTG) software variants
[Figure: the use-case diagrams of the 2nd and 4th MTG software variants]
Step by step
[Table: presence of the previously mined feature implementations in each MTG software variant (part of the relational context family)]
Relational Context Family (RCF)
[Table: use case presence in each MTG software variant (part of the relational context family)]
[Table: use case and feature implementation co-occurrence (part of the relational context family)]
Concept Lattice Family (CLF)
Generating the hybrid blocks from the CLF
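The intent of this step can be approximated without RCA. The sketch below is a deliberate simplification that stands in for the concept lattice family: it groups use-cases and feature implementations that occur in exactly the same set of variants, and keeps only the groups that contain both kinds of elements. The presence data, the variant identifiers and the function name `hybrid_blocks` are hypothetical.

```python
from collections import defaultdict

# Hypothetical presence data (not the values of the MTG case study).
use_case_variants = {
    "View Map": {"v1", "v2", "v3", "v4"},
    "Place Marker": {"v2", "v4"},
    "Show Street View": {"v2", "v4"},
    "Download Map": {"v4"},
}
feature_variants = {
    "Feature_1": {"v1", "v2", "v3", "v4"},
    "Feature_2": {"v2", "v4"},
    "Feature_3": {"v2", "v4"},
    "Feature_4": {"v4"},
}


def hybrid_blocks(use_case_variants, feature_variants):
    """Group use-cases and features that occur in exactly the same variants."""
    blocks = defaultdict(lambda: {"use_cases": [], "features": []})
    for name, variants in use_case_variants.items():
        blocks[frozenset(variants)]["use_cases"].append(name)
    for name, variants in feature_variants.items():
        blocks[frozenset(variants)]["features"].append(name)
    # Keep only the groups that mix use-cases and feature implementations.
    return [block for block in blocks.values() if block["use_cases"] and block["features"]]


blocks = hybrid_blocks(use_case_variants, feature_variants)
for block in blocks:
    print(block)
```

Each resulting block is a small search space: textual similarity only needs to be computed between the use-cases and the features of the same block.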
Constructing a raw corpus from each hybrid block
[Figure: for Hybrid Block_1, the documents are the feature implementations (source code, e.g. public viewMap(){int a = 0; while (a > 5){if (a != 20) {} else {a = 30;}}}) and the queries are the use-cases and their descriptions (e.g. View Map).]
Measuring similarity within a hybrid block based on LSI
Clustering use-cases described by features
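One plausible way to prepare the input of this clustering step is to turn the similarity scores of a hybrid block into a binary context (use-cases described by features) and read the associations from it. In the sketch below the similarity values, the 0.70 threshold and the function names are assumptions made for illustration; building the full concept lattice on the resulting context is omitted.

```python
# Hypothetical cosine similarities inside one hybrid block:
# use-case -> feature implementation -> score.
similarity = {
    "Place Marker": {"Feature_2": 0.93, "Feature_3": 0.12},
    "Show Street View": {"Feature_2": 0.08, "Feature_3": 0.88},
}
THRESHOLD = 0.70  # assumed cut-off, not taken from the slides


def binary_context(similarity, threshold):
    """Describe each use-case by the features whose similarity reaches the threshold."""
    return {
        use_case: {feature for feature, score in row.items() if score >= threshold}
        for use_case, row in similarity.items()
    }


def name_features(context):
    """Invert the context: feature -> use-case names associated with it."""
    names = {}
    for use_case, features in context.items():
        for feature in features:
            names.setdefault(feature, []).append(use_case)
    return names


context = binary_context(similarity, THRESHOLD)
print(name_features(context))
```

With these values each feature implementation ends up associated with exactly one use-case, whose name can then document the feature.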
Conclusion
[Figure: recap of the three-step process (STEP 1 feature mining, STEP 2 feature documentation, STEP 3 feature model building), from the software product variants’ code and the use cases with their descriptions to the « canonical » FODA feature model with constraints.]
Feature documentation uses:
1) RCA to reduce the search space,
2) LSI to compute the similarity between use cases and feature implementations,
3) FCA to build the feature to use-case correspondence (clustering).
Related contributions
– Feature mining (step 1) @ last year’s SEKE
– Tool implementation
– Experiments on 3 real use cases
– Feature model building (and constraint identification)
Improve the techniques
– Search based algorithms as an alternative clustering technique
– Automatically identify junctions between feature implementations
– Build a « more » hierarchical feature model
Experiment at a wider scale.