AN APPROACH TO RECOVER FEATURE MODELS FROM OBJECT-ORIENTED SOURCE CODE

R. Al-Msie’deen, A. Djamel Seriai, M. Huchard, C. Urtado, S. Vauttier, and H. S. Eyal Salman

May 10, 2015


Outline:

INTRODUCTION.

SOFTWARE PRODUCT LINE ENGINEERING (SPLE).

FEATURE AND FEATURE MODEL (FM).

FORMAL CONCEPT ANALYSIS (FCA).

INFORMATION RETRIEVAL (IR) & LATENT SEMANTIC INDEXING (LSI).

MOTIVATIONS.

APPROACH OVERVIEW => THE MAPPING MODEL + FEATURE EXTRACTION PROCESS.

OBJECT-ORIENTED SOURCE CODE VARIATIONS.

OBJECT-ORIENTED BUILDING ELEMENTS (OBE).

COMMONALITY AND VARIATION IDENTIFICATION USING FCA.

ATOMIC BLOCK OF VARIATIONS (FEATURE) IDENTIFICATION USING LSI AND FCA.

RELATED WORK.

CONCLUSION.

FUTURE WORK.

REFERENCES.

Introduction:

Software product variants are a set of similar products developed with the copy-paste-modify technique rather than with a software product line (SPL) strategy.

[Figure: copy, paste, modify.]

Software product variants represent a starting point for building a software product line (SPL) [FIG 08].


Software Product Line [SPL]:

An SPL is "a set of software-intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way" [CLE 01].

[Figure: software product line concepts: Products, Features, Domain, Core Assets, Production Plan.]


Software Product Line Engineering [SPLE]:

SPLE consists of two major steps [CLE 01]:

1. Domain Engineering: Core Assets + Feature Model.

2. Application Engineering: Product Configurations.


Feature and Feature Model:

A feature is a system property that is relevant to some stakeholder and is used to capture commonalities or discriminate [variations] among systems in a family [CZA 00].

Feature models (FMs) are tree-like graphs of features and the relationships among them. In SPLE, FMs are used to represent the commonality and variability of SPL members at different levels of abstraction [POH 10].

[Figure: a feature and a feature model, distinguishing optional features from mandatory features.]


Feature and Feature Model:

[Figure: feature model (FM) of the text editing system (1), rooted at Text Editing System. The legend covers mandatory and optional features, AND, OR, and XOR groups, and require and exclude constraints.]

(1) http://www.lirmm.fr/TextEditingSystemSPL

Formal Concept Analysis (FCA):

FCA is a mathematical method that provides a way to identify "meaningful groupings of objects that have common attributes" [LOE 07].

A formal context is a triple K = (O, A, R) where O and A are sets (objects and attributes, respectively) and R is a binary relation, i.e., R ⊆ O × A.

Galois lattices [BAR 70] and concept lattices [GAN 99] are core structures of a data analysis framework (Formal Concept Analysis, or FCA for short) for extracting an ordered set of concepts from a dataset, called a Formal Context, composed of objects described by attributes.
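
As a minimal sketch of the definition above, the following Python snippet enumerates the formal concepts of a small context K = (O, A, R). The product and class names are hypothetical, and the brute-force enumeration is chosen for readability; it is not presented as the algorithm used by the authors.

```python
from itertools import combinations


def formal_concepts(objects, attributes, relation):
    """Enumerate all (extent, intent) pairs of the context (O, A, R).

    relation is a set of (object, attribute) pairs, i.e. R is a subset of O x A.
    Brute force over object subsets: suitable for toy contexts only.
    """
    def common_attributes(objs):
        # Attributes shared by every object in objs.
        return frozenset(a for a in attributes
                         if all((o, a) in relation for o in objs))

    def common_objects(attrs):
        # Objects that own every attribute in attrs.
        return frozenset(o for o in objects
                         if all((o, a) in relation for a in attrs))

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(sorted(objects), size):
            intent = common_attributes(subset)
            extent = common_objects(intent)  # closure of the subset
            concepts.add((extent, intent))
    return concepts


# Hypothetical toy context: three products described by the classes they contain.
O = {"P1", "P2", "P3"}
A = {"Open", "Close", "Print"}
R = {("P1", "Open"), ("P1", "Close"),
     ("P2", "Open"), ("P2", "Close"), ("P2", "Print"),
     ("P3", "Open")}

concepts = formal_concepts(O, A, R)
```

Each concept pairs a maximal set of products (the extent) with the maximal set of classes they all share (the intent). Ordering the concepts by inclusion of their extents gives the concept lattice.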


Formal Concept Analysis (FCA):

Class (Open) Class (Close) Class (Edit) Class (Print) Class (Select All) Class (Red) Class (Green) Class (Blue) Class (Black)

Product 1 x x x x x x

Product 2 x x x x x

Product 3 x x x x x x

Product 4 x x x x x x

Product 5 x x x x x x

A formal context describing product variants by source code elements.

[Figure: the concept lattice for the formal context in the table above, highlighting the common concept, the concepts shared by two products, and the concepts specific to one product.]


Information Retrieval (IR) & Latent Semantic Indexing (LSI):

INFORMATION RETRIEVAL (IR) has proven useful in many disciplines such as software maintenance and evolution, image extraction, speech recognition, and horizontal search engines like Google. Furthermore, feature location is one of the most common applications of IR in software engineering [DAV 11].

LATENT SEMANTIC INDEXING (LSI) assumes that there are implicit relationships among the words of documents that always appear together, even if they do not share any terms; that is to say, there are latent semantic structures in free text [DAV 11].

The effectiveness of IR methods is measured using IR METRICS: RECALL and PRECISION.
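
The slide names the two metrics without defining them. The sketch below uses the standard IR definitions: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that are retrieved. The function name and the sample element names are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items
    against the set of truly relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: elements returned for a feature vs. its real implementation.
retrieved = {"Class(Copy)", "Class(Paste)", "Class(Open)", "Class(Close)"}
relevant = {"Class(Copy)", "Class(Paste)", "Class(Cut)"}
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved elements are relevant, and two of the three relevant elements were retrieved.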


Latent Semantic Indexing (LSI):

In our work, we use the most widely used threshold for cosine similarity, which is 0.70 [MAR 03].

All information must be normalized before it is suitable as input to LSI. This preprocessing step includes converting all capital letters to lower case, removing stop words (such as numbers), splitting all documents into terms, and performing word stemming.

Similarity between documents is described by a similarity matrix. The similarity is computed using cosine similarity.
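
The preprocessing steps and the similarity matrix can be sketched as follows. This is a simplified illustration under stated assumptions: the stop-word list and the suffix-stripping stemmer are hypothetical stand-ins, and the cosine is computed on raw term-frequency vectors, so the dimensionality reduction that LSI performs (singular value decomposition) is deliberately left out.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; not the list used by the authors.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}


def naive_stem(term):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def preprocess(document):
    """Lower-case, split into terms, drop stop words and numbers, then stem."""
    terms = re.findall(r"[a-z]+|\d+", document.lower())
    return [naive_stem(t) for t in terms
            if t not in STOP_WORDS and not t.isdigit()]


def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between two term-frequency vectors."""
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def similarity_matrix(documents):
    """Pairwise cosine similarity between all preprocessed documents."""
    processed = [preprocess(d) for d in documents]
    return [[cosine_similarity(a, b) for b in processed] for a in processed]


matrix = similarity_matrix(["copy text", "Copy Text", "print page"])
```

In a full LSI pipeline, the term-document matrix would first be projected into a lower-dimensional space, and the cosine would be computed between the projected document vectors.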


Motivations:

Many companies first develop a number of similar software products without explicitly planning for strategic reuse. Once a product is released, if it is SUCCESSFUL and meets market needs, similar products are then developed [JOH 09].

[Figure: INDIVIDUAL SYSTEMS (A B C D E F G; A B V G O J L; A B F D E R X) contrasted with a SOFTWARE PRODUCT LINE (A B C D E F G; A B V G; A B F D E).]


Motivations:

Manually creating a feature model for an existing system is time-consuming, error-prone, and requires substantial effort from a modeler [SHE 11].


Motivations:

REVERSE ENGINEERING an FM from source code aims to improve product maintenance and ease system migration [CHI 90]. The extracted feature model may also lead to the production of new products.

[Figure: REVERSE ENGINEERING FM, from Product Variants (P1, P2) to a Feature Model.]


Motivations:

The general OBJECTIVE of our approach is to EXTRACT an INITIAL FM that models the common and variable features of product variants. IN THIS PAPER, we present the part concerning FEATURE IDENTIFICATION from the OO source code of product variants using FCA and LSI.

We assume in this paper that the product variants use the same vocabulary to name packages, classes, attributes, and methods in their source code.


Motivations:

Reverse engineering a feature model from the source code of a set of product variants makes system features and dependencies explicit and clear.

There is a need to extract feature models, especially from source code, which is the most important source of information and where features and dependencies are hidden.


Approach Overview:

A. The Mapping Model:

To identify features, we rely on a mapping model between these features and object-oriented building elements (OBE).

[Figure: the mapping model, relating Product Variants, Product, Source code, Source code elements (Package, Class, Attribute, Method), Object-oriented source code variations, Source code variation, Block of variations, Atomic block of variations, Feature (Mandatory Feature, Optional Feature), and Feature Model through "has a" and "correspond" associations with 1 and 1..* multiplicities.]


Approach Overview:

For object-oriented source code, the mandatory features are realized by OBEs that are common to all product variants.

The optional features are realized by variable OBEs that appear in some product variants, or in a single product, but not in all product variants.

We consider that a feature corresponds to one and only one set (group) of OBEs. This means that a feature always has the same implementation in all products where it is present.
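
The split between common and variable OBEs can be illustrated with plain set operations. This is only a sketch on hypothetical data; the approach itself obtains the same split from the concept lattice built with FCA.

```python
def split_common_and_variable(products):
    """products maps a product name to the set of OBEs found in its source code.

    Returns (common, variable): OBEs present in every product, and OBEs
    present in at least one product but not in all of them.
    """
    obe_sets = list(products.values())
    if not obe_sets:
        return set(), set()
    common = set.intersection(*obe_sets)
    variable = set.union(*obe_sets) - common
    return common, variable


# Hypothetical product variants described by their classes.
products = {
    "Product 1": {"Class(Open)", "Class(Close)", "Class(Edit)"},
    "Product 2": {"Class(Open)", "Class(Close)", "Class(Print)"},
    "Product 3": {"Class(Open)", "Class(Close)", "Class(Edit)", "Class(Print)"},
}
common, variable = split_common_and_variable(products)
```

In this example, Class(Open) and Class(Close) would realize mandatory features, while Class(Edit) and Class(Print) would realize optional ones.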


Approach Overview:

Because a feature corresponds to one and only one set of OBEs, an optional feature is implemented by the same set of variable OBEs (VOBEs) in all products where it is present.

We define a block of variations (BV) as a set of VOBEs that are always associated (i.e., that are always identified together in all the products in which they appear).

The subsets of VOBEs that belong to a BV and represent one and only one feature are called Atomic Blocks of Variations (ABVs). A BV is composed of a set of ABVs. To determine its various parts (sub-groups), we cluster the closest VOBEs using the similarity measures provided by the LSI method.
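
The definition of a block of variations can be sketched by grouping VOBEs that occur in exactly the same set of products. The data and the function name below are hypothetical, and the grouping is a simplified stand-in for reading the blocks off the concept lattice.

```python
from collections import defaultdict


def blocks_of_variations(products):
    """Group variable OBEs that always appear together.

    products maps a product name to its set of OBEs. The result maps each
    set of products to the VOBEs that occur in exactly those products.
    """
    obe_sets = list(products.values())
    common = set.intersection(*obe_sets) if obe_sets else set()

    # Record, for every variable OBE, the products in which it occurs.
    occurrences = defaultdict(set)
    for name, obes in products.items():
        for obe in obes - common:
            occurrences[obe].add(name)

    # VOBEs with identical occurrence sets form one block of variations.
    blocks = defaultdict(set)
    for obe, where in occurrences.items():
        blocks[frozenset(where)].add(obe)
    return dict(blocks)


# Hypothetical product variants.
products = {
    "P1": {"Open", "Edit", "Copy"},
    "P2": {"Open", "Print"},
    "P3": {"Open", "Edit", "Copy", "Print"},
}
blocks = blocks_of_variations(products)
```

Here Edit and Copy always occur together (in P1 and P3), so they form one block, which may still contain more than one feature; splitting it into atomic blocks is the job of the LSI-based step.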


Approach Overview: An illustrative example:


Approach Overview:

B. Feature extraction process:

The approach that we propose is illustrated in the figure below. The feature extraction process consists of the following steps:

1. The OO source code is analyzed to extract object-oriented building elements (packages, classes, methods, attributes) for all product variants.

2. Commonalities and variations are extracted for all product variants using FCA, which yields the blocks of variations.

3. Blocks of variations are divided into atomic blocks of variations using LSI and FCA. Each atomic block of variations corresponds to one and only one feature.


Object-oriented source code variations:

[Figure: object-oriented source code variations at four levels (1 to 4). Package Variation: Package Set Variation, Package Content Variation. Class Variation: Class Signature Variation, Class Content Variation (Attributes Set Variation, Methods Set Variation). Attribute Variation (Access Level, Data Type, etc.). Method Variation: Signature, Body. Detail labels: Name, Relationship, Access Level (Public, Private, ...), Returned Data Type, Parameters List order & data type, Exception, Local Variable, Invocation, Access.]


Object-oriented source code variations example:

Package Set Variation

Package Content Variation


Object-oriented building elements OBE:

In our case, each product variant PN is abstracted as a set of OBEs as follows:

OBE for PN ={

Package (name);

Class (name, owner);

Attribute (name, owner);

Method (name, owner);

Parameter (name, owner);

Local Variable (name, owner);

Method Invocation (name, accessed in, owner);

Method Exception (name, owner)}.
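
One possible in-memory encoding of this abstraction is sketched below. The OBE class, its field names, and the sample elements are hypothetical; in particular, the meaning given to the "accessed in" field is an assumption, since the slide does not define it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OBE:
    """One object-oriented building element of a product variant."""
    kind: str                          # "Package", "Class", "Attribute", "Method", ...
    name: str
    owner: Optional[str] = None        # enclosing element; a package has none
    accessed_in: Optional[str] = None  # only used for method invocations (assumed meaning)


# Hypothetical product variant PN, abstracted as a set of OBEs.
product_n = {
    OBE("Package", "editor"),
    OBE("Class", "Document", owner="editor"),
    OBE("Attribute", "title", owner="Document"),
    OBE("Method", "save", owner="Document"),
    OBE("Parameter", "path", owner="save"),
    OBE("Local Variable", "buffer", owner="save"),
    OBE("Method Invocation", "write", owner="save", accessed_in="Document"),
    OBE("Method Exception", "IOError", owner="save"),
}
```

Because the dataclass is frozen, OBEs are hashable and compare by value, so the same element found in two product variants is recognized as identical when the variants are compared as sets.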


Commonality and variation identification using FCA:

Formal context describing text editing systems by object-oriented building elements (OBE)

In the formal context, products constitute the rows of the table.

In the formal context, OBEs constitute the columns of the table.


Commonality and variation identification using FCA:

The concept lattice for the formal context of the previous table.

The common block

Block of Variations


Atomic block of variations (feature) identification using LSI and FCA:

To identify the atomic blocks of variations within a block of variations, each of which represents a single feature, we apply LSI and FCA.


Atomic block of variations (feature) identification using LSI and FCA:

In our case, each line in the block of variations serves as both a document and a query.

The Similarity Matrix:

1     0.70  0
0.70  1     0
0     0     1

The Context (Similarity Matrix) for θ = 0.70:

x  x  0
x  x  0
0  0  x
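
The step from the similarity matrix to the formal context can be sketched as a simple thresholding. The matrix values come from the slide; the function names are hypothetical, and grouping identical rows is a simplification of reading the atomic blocks off the concept lattice.

```python
THETA = 0.70  # cosine similarity threshold used in the slides


def to_formal_context(similarity, theta=THETA):
    """Turn a similarity matrix into a binary context.

    A cell is True (an 'x' in the context) when the similarity
    reaches the threshold.
    """
    return [[value >= theta for value in row] for row in similarity]


def group_identical_rows(context):
    """Group document indices whose context rows are identical.

    Simplified stand-in for extracting atomic blocks from the lattice.
    """
    groups = {}
    for index, row in enumerate(context):
        groups.setdefault(tuple(row), []).append(index)
    return list(groups.values())


similarity = [
    [1.0, 0.70, 0.0],
    [0.70, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
context = to_formal_context(similarity)
groups = group_identical_rows(context)
```

With these values, the first two documents end up in the same group and the third stands alone, which corresponds to two candidate atomic blocks for this small matrix.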


Atomic block of variations (feature) identification using LSI and FCA:

The concept lattice shows three atomic blocks of variations extracted from one block of variations.


Related Work:

Ziadi et al. [ZIA 12] propose an automatic approach for feature identification from source code for a set of product variants.

Their approach only investigates products in which the variability is represented in the names of classes, methods, and attributes, without considering product lines in which the variability is mainly represented in the body of methods.


Related Work:

The approach of Ziadi et al. gathers all common features into a single mandatory feature called the base feature.

We use FCA to extract commonalities and variations from product variants, distinguish among the mandatory features by using LSI and FCA based on lexical similarity, and extract all optional features and constraints such as "and" and "require".


Conclusion:

In this paper, we proposed an approach based on FCA and LSI to extract features from the object-oriented source code of software system variants.

FCA can be used to extract the common block and the blocks of variations.

LSI is used with FCA to recover atomic blocks of variations, each of which represents a single feature, using textual similarity.


Future Work:

We will use both textual and semantic similarity to determine each feature implementation from the OO source code more precisely.

We will organize the extracted features as a feature model including all cross-tree constraints and feature group constraints, using the information contained in the concept lattice.

We will integrate our approach with linguistic matching techniques for cases where product variants use different vocabularies to name packages, classes, attributes, and methods.


References:

[ZIA 12] ZIADI T., FRIAS L., DA SILVA M. A. A., ZIANE M., “Feature Identification from the Source Code of Product Variants”, MENS T. CLEVE A. F. R., Ed., Proceedings of the 15th European Conference on Software Maintenance and Reengineering, Los Alamitos, CA, USA, 2012, IEEE, p. 417–422.

[CLE 01] CLEMENTS P. C., NORTHROP L. M., Software product lines: practices and patterns, Addison-Wesley, 2001.

[JOH 09] JOHN I., EISENBARTH M., “A decade of scoping: a survey”, Proceedings of the 13th International Software Product Line Conference, Pittsburgh, PA, USA, 2009, Carnegie Mellon University, p. 31–40.

[FIG 08] FIGUEIREDO E., CACHO N., SANT’ANNA C., MONTEIRO M., KULESZA U., GARCIA A., SOARES S., FERRARI F., KHAN S., CASTOR FILHO F., DANTAS F., “Evolving software product lines with aspects: an empirical study on design stability”, Proceedings of the 30th International Conference on Software Engineering, ICSE ’08, New York, NY, USA, 2008, ACM, p. 261–270.

[CZA 00] CZARNECKI K., EISENECKER U. W., Generative programming: methods, tools, and applications, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 2000.

[POH 10] POHL K., BÖCKLE G., VAN DER LINDEN F. J., Software Product Line Engineering: Foundations, Principles and Techniques, Springer Publishing Company, Incorporated, 1st edition, 2010.

[LOE 07] LOESCH F., PLOEDEREDER E., “Restructuring Variability in Software Product Lines using Concept Analysis of Product Configurations”, KRIKHAAR R. L. VERHOEF C. L. G. A. D., Ed., Proceedings of the 11th European Conference on Software Maintenance and Reengineering, Amsterdam, Netherlands, March 2007, IEEE, p. 159–170.

[GAN 99] GANTER B., WILLE R., Formal Concept Analysis, Mathematical Foundations, Springer-Verlag, 1999.

[BAR 70] BARBUT M., MONJARDET B., Ordre et Classification: Algèbre et combinatoire, vol. 2, Hachette, 1970.

[DAV 11] DAVID B., LAWRIE D., “Information Retrieval Applications in Software Maintenance and Evolution”, in Encyclopedia of Software Engineering, 2011, p. 454–463.

[SHE 11] SHE S., LOTUFO R., BERGER T., WASOWSKI A., CZARNECKI K., “Reverse engineering feature models”, ICSE, 2011, p. 461–470.

[MAR 03] MARCUS A., MALETIC J. I., “Recovering documentation-to-source-code traceability links using latent semantic indexing”, Proceedings of the 25th International Conference on Software Engineering, ICSE ’03, Washington, DC, USA, 2003, IEEE Computer Society, p. 125–135.


Banking systems example: