Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications Department of Electrical Engineering and Information Technologies University of Naples “Federico II”, Italy Domenico Amalfitano Nicola Amatucci Vincenzo De Simone Anna Rita Fasolino Porfirio Tramontana 2nd Workshop on Software Engineering Methods in Spreadsheets Florence, Italy 18th May, 2015
Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Jan 23, 2018

Transcript
Page 1: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Department of Electrical Engineering and Information Technologies

University of Naples “Federico II”, Italy

Domenico Amalfitano

Nicola Amatucci

Vincenzo De Simone

Anna Rita Fasolino

Porfirio Tramontana

2nd Workshop on Software Engineering Methods in Spreadsheets

Florence, Italy 18th May, 2015

Page 2: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Context and Motivations

Context

◦ Reverse Engineering of Excel Spreadsheet Applications

Motivation

◦ Propose techniques and tools to support the comprehension of VBA based Spreadsheet applications

2 of 19SEMS 15 – Florence, Italy – May 18th

Page 3: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Why reverse engineering spreadsheet applications?

Spreadsheets are widely adopted

◦ for different purposes: calculation, storage, collaboration, etc.

◦ in different domains: business, automotive, engineering, science, medical, etc.

Need for their comprehension in different scenarios

◦ individual comprehension

◦ knowledge transfer

◦ re-documentation

◦ maintenance and evolution

◦ migration towards different architectures

Page 4: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Excel Spreadsheet Applications Comprehension Issues

Poor or absent internal and external documentation

No clear distinction between different layers

◦ Data, Business Logic, Presentation

Spreadsheet applications can be complex

◦ Data spread on different sheets

◦ Data dependencies through formulas

◦ Use of VBA code

Enhanced user interface (User Forms, shapes, controls)

User defined functionalities (Macros) and functions

Handling of events related to default or user defined elements

Direct and indirect dependencies through VBA code

Page 5: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Proposed Reverse Engineering Approach

Data Model Reverse Engineering

◦ performed to reconstruct a conceptual model of the data stored in the spreadsheet application.

User Interface and Business Logic Reverse Engineering

◦ performed to comprehend both the structure and the behavior of the User Interface (UI) and the functionalities provided by the application.

Page 6: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Data Model Reverse Engineering

We propose a heuristic-based approach

◦ Based on our experience in an industrial domain [1, 2, 3]

◦ process made of seven sequential steps to infer, with gradual refinements, a UML class diagram of the considered spreadsheet application

◦ in each step, one or more heuristic rules are executed.

Heuristic rules

◦ adapted from rules defined in the literature or

◦ defined by us exploiting some formatting properties typical of spreadsheet applications and analyzing the cells content

1. Amalfitano, D.; Fasolino, A.R.; Maggio, V.; Tramontana, P.; Di Mare, G.; Ferrara, F.; Scala, S., “Migrating legacy spreadsheets-based systems to Web MVC architecture: An industrial case study” - CSMR-WCRE 2014

2. Amalfitano, D.; Fasolino, A.R.; Maggio, V.; Tramontana, P.; De Simone, V., “Reverse Engineering of Data Models from Legacy Spreadsheets-Based Systems: An Industrial Case Study” - SEBD 2014

3. Amalfitano, D.; Fasolino, A.R.; Tramontana, P.; De Simone, V.; Di Mare, G.; Scala, S., “Information Extraction from Legacy Spreadsheet-based Information System - An Experience in the Automotive Context” - DATA 2014

Page 7: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Application of heuristics - Examples

For example, applying one of the rules to the worksheet Sheet1 identifies two separate areas. In this case we introduce a class for each area and a composition relation between these classes and the one related to the sheet.

Sheet1 is composed of Sheet1_Area1 and Sheet1_Area2
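
The slides do not state how the rule decides that two areas are separate. A minimal Python sketch, assuming that an area is a group of non-empty cells connected horizontally or vertically, shows how a sheet could be mapped to one class per area. The function names `find_areas` and `sheet_model` and the sample grid are hypothetical and are not part of the authors' tool.

```python
from collections import deque

def is_empty(value):
    """Treat None and the empty string as an empty cell (assumption)."""
    return value is None or value == ""

def find_areas(grid):
    """Group non-empty cells into areas of 4-connected neighbours.

    `grid` is a list of rows. Returns a list of areas, each a sorted
    list of (row, col) positions.
    """
    seen = set()
    areas = []
    for r in range(len(grid)):
        for c in range(len(grid[r])):
            if (r, c) in seen or is_empty(grid[r][c]):
                continue
            # Breadth-first search over adjacent non-empty cells.
            queue = deque([(r, c)])
            seen.add((r, c))
            area = []
            while queue:
                cr, cc = queue.popleft()
                area.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
                    if inside and (nr, nc) not in seen and not is_empty(grid[nr][nc]):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            areas.append(sorted(area))
    return areas

def sheet_model(sheet_name, grid):
    """Describe a sheet class composed of one class per detected area."""
    areas = find_areas(grid)
    parts = [f"{sheet_name}_Area{i}" for i in range(1, len(areas) + 1)]
    return {"class": sheet_name, "composed_of": parts}

# Hypothetical sheet with two blocks of data separated by empty cells.
sheet1 = [
    ["Name", "Qty", None, None],
    ["Bolt", 10, None, None],
    [None, None, None, None],
    [None, None, None, "Total"],
    [None, None, None, 10],
]
model = sheet_model("Sheet1", sheet1)
print(model)
```

The real heuristic may also rely on formatting properties such as borders or fill colors, which this sketch ignores.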

Page 8: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Application of heuristics - Examples

In this other example, we applied another rule that exploits the identification of merged cells to extract different classes from an area under analysis. As the figure shows, we were able to identify two different classes in Area1 of Sheet1 and an association between them.
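
The slide does not spell out the merged-cell rule. One plausible reading, sketched below in Python purely as an assumption, is that each merged header cell names a class, the column headers underneath it become its attributes, and classes found in the same area are associated with each other. The function name, the input format, and the sample headers are all hypothetical.

```python
def classes_from_merged_headers(merged_headers, column_headers):
    """Derive candidate classes from merged header cells.

    merged_headers: list of (label, first_col, last_col) tuples, one per
        merged cell in the header row of the area.
    column_headers: dict mapping a column index to the header text found
        in the row below the merged cells.
    Returns (classes, associations).
    """
    classes = []
    for label, first_col, last_col in merged_headers:
        attributes = [
            column_headers[col]
            for col in range(first_col, last_col + 1)
            if col in column_headers
        ]
        classes.append({"class": label, "attributes": attributes})
    # Assumption: consecutive classes of the same area are associated.
    associations = [
        (left["class"], right["class"])
        for left, right in zip(classes, classes[1:])
    ]
    return classes, associations

# Hypothetical area: two merged cells spanning columns 0-1 and 2-4.
merged = [("Customer", 0, 1), ("Order", 2, 4)]
headers = {0: "Name", 1: "City", 2: "Date", 3: "Item", 4: "Qty"}
classes, associations = classes_from_merged_headers(merged, headers)
print(classes)
print(associations)
```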

Page 9: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Business Logic Reverse Engineering

Definition of a model that takes into account the main elements of VBA-based spreadsheet applications and their relationships

Introduction of different views of these applications on the basis of the model we presented

Development of a tool to support the comprehension of these kinds of applications, providing extraction and visualization features

Page 10: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Modeling VBA based Spreadsheet Applications - 1

We provided different views of a spreadsheet application: we reported the main elements composing the application and their composition and generalization relationships.

The graphical elements of a spreadsheet application are reported on the left side, whereas the code elements are reported on the right.

Page 11: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Modeling VBA based Spreadsheet Applications - 2

In this other view we reported the main dependencies between the elements composing these applications. In particular, we considered

• call dependencies between procedures

• write/read dependencies between procedures and cells

• open/hide and unload dependencies between procedures and user forms

We also reported which procedure handles the events of a given element.
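
The dependency kinds listed above can be illustrated with a small Python sketch that scans VBA source text with regular expressions. It is not how EXACT extracts dependencies (the slides do not describe that), and it makes several simplifying assumptions: cells are accessed only as `Range("A1")`, calls appear at the start of a line, and any procedure named `Object_Event` is treated as an event handler. All names in the sketch are hypothetical.

```python
import re

PROC_RE = re.compile(r'^\s*(?:Public\s+|Private\s+)?(?:Sub|Function)\s+(\w+)', re.I)
END_RE = re.compile(r'^\s*End\s+(?:Sub|Function)\b', re.I)
WRITE_RE = re.compile(r'^\s*Range\("([^"]+)"\)(?:\.Value)?\s*=', re.I)
RANGE_RE = re.compile(r'Range\("([^"]+)"\)', re.I)
FORM_RE = re.compile(r'\b(\w+)\.(Show|Hide)\b', re.I)
UNLOAD_RE = re.compile(r'^\s*Unload\s+(\w+)', re.I)
CALL_RE = re.compile(r'^\s*(?:Call\s+)?(\w+)', re.I)

def split_procedures(source):
    """Return a dict mapping each procedure name to its body lines."""
    procs, current = {}, None
    for line in source.splitlines():
        start = PROC_RE.match(line)
        if start:
            current = start.group(1)
            procs[current] = []
        elif END_RE.match(line):
            current = None
        elif current is not None:
            procs[current].append(line)
    return procs

def extract_dependencies(source):
    """Return a sorted list of (procedure, kind, target) triples."""
    procs = split_procedures(source)
    deps = set()
    for name, body in procs.items():
        if "_" in name:
            # Assumption: the Object_Event naming marks an event handler.
            element, event = name.rsplit("_", 1)
            deps.add((name, "handles", f"{element}.{event}"))
        for line in body:
            rest = line
            write = WRITE_RE.match(line)
            if write:
                # A Range at the start of an assignment is a write.
                deps.add((name, "writes", write.group(1)))
                rest = line[write.end():]
            for cell in RANGE_RE.findall(rest):
                deps.add((name, "reads", cell))
            for form, action in FORM_RE.findall(line):
                kind = "opens" if action.lower() == "show" else "hides"
                deps.add((name, kind, form))
            unload = UNLOAD_RE.match(line)
            if unload:
                deps.add((name, "unloads", unload.group(1)))
            call = CALL_RE.match(line)
            if call and call.group(1) in procs:
                deps.add((name, "calls", call.group(1)))
    return sorted(deps)

# Hypothetical VBA code used only to exercise the sketch.
SAMPLE_VBA = '''
Sub Main()
    LoadData
    UserForm1.Show
End Sub

Sub LoadData()
    Range("B1").Value = Range("A1").Value * 2
End Sub

Private Sub UserForm_Initialize()
    Range("C1").Value = "ready"
End Sub
'''
deps = extract_dependencies(SAMPLE_VBA)
for dep in deps:
    print(dep)
```

A real extractor would need a proper VBA parser: this sketch misses `Cells(r, c)` access, named ranges, calls nested inside expressions, and the case-insensitivity of VBA identifiers.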

Page 12: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

EXACT – EXcel Application Comprehension Tool

Extraction of data from the spreadsheet application

Abstraction of the extracted data according to the proposed model

Generation of different views

Page 13: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

EXACT – EXcel Application Comprehension Tool – provided views

Structural Views

◦ Elements composing the application and their relationships

◦ Details of an element, shown by clicking on it

User Functionalities Views

◦ List of all the user-defined functionalities present

Cell Dependencies Views

◦ List of potential dependencies between cells through formulas, validation rules and VBA code

Report & Metrics Views

◦ Metrics about the complexity of the application (worksheets, shapes, userforms, modules, procedures, LOCs)
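
The code-related metrics in the last view (modules, procedures, LOCs) can be approximated from exported VBA source. The Python sketch below is an illustration, not the tool's implementation; it assumes that a LOC is any line that is neither blank nor a comment-only line, which the slides do not define. The function name `code_metrics` and the sample modules are hypothetical.

```python
import re

PROC_RE = re.compile(r'^\s*(?:Public\s+|Private\s+)?(?:Sub|Function)\s+\w+', re.I)

def code_metrics(modules):
    """Count modules, procedures and lines of code.

    modules: dict mapping a module name to its VBA source text.
    """
    procedures = 0
    locs = 0
    for source in modules.values():
        for line in source.splitlines():
            stripped = line.strip()
            if not stripped or stripped.startswith("'"):
                continue  # skip blank lines and comment-only lines
            locs += 1
            if PROC_RE.match(line):
                procedures += 1
    return {"modules": len(modules), "procedures": procedures, "LOCs": locs}

# Hypothetical modules used only to exercise the sketch.
sample_modules = {
    "Module1": """' helper
Sub A()
    x = 1
End Sub
""",
    "Sheet1": """Private Sub Worksheet_Change(ByVal Target As Range)

    MsgBox "hi"
End Sub
""",
}
metrics = code_metrics(sample_modules)
print(metrics)
```

Counting worksheets, shapes and user forms would require reading the workbook itself, which is outside the scope of this sketch.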

Page 14: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Visualization features provided by EXACT - 1

In this view, the main structure of the application is shown.

Page 15: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Visualization features provided by EXACT - 2

By clicking on an element, further details on the element ... and a view showing its dependencies are reported

Page 16: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Visualization features provided by EXACT – User Functionalities Views

This view shows all the events defined on an element (Workbook, Active Worksheet and UserForms).

In this example all the events related to the selected User Form are reported

Page 17: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Visualization features provided by EXACT – User Functionalities Views

By clicking on an event (in this case UserForm_Initialize) a new view appears showing the potential dependencies of the event handler.

Page 18: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Future Works

Evaluation of the tool in real business and industrial contexts to support professionals in executing comprehension tasks of VBA spreadsheet applications

Extending the model taking into account other Excel features

Improving the features and the views provided by EXACT

Page 19: Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications

Thanks for your attention

Questions?

Further Information:

http://reverse.dieti.unina.it

@REvERSE_UNINA
