Improving the Quality of Large-Scale Database-Centric Software Systems by Analyzing Database Access Code
Tse-Hsun (Peter) Chen, supervised by Ahmed E. Hassan

ICDE2015PhD - Improving the Quality of Large-Scale Database-Centric Software Systems by Analyzing Database Access Code

Jul 16, 2015

Page 1: ICDE2015PhD - Improving the Quality of Large-Scale Database-Centric Software Systems by Analyzing Database Access Code

Improving the Quality of Large-Scale Database-Centric Software Systems by Analyzing Database Access Code

Tse-Hsun (Peter) Chen

Supervised by Ahmed E. Hassan

Page 2: Common performance problems in database-access code

[Diagram: Code and DBMS]

• Inefficient data access

Page 3: Inefficient data access

Code:
for each userId {
    …
    executeQuery("select … where u.id = userId");
}

SQL:
select … from user where u.id = 1
select … from user where u.id = 2
…
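The cost of the loop above can be illustrated with a minimal sketch that counts round trips. Everything here is an assumption made for illustration: `FakeUserDb` is a hypothetical in-memory stand-in for a DBMS, and `findNameById` and `findNamesByIds` are invented names that model a per-id query and a single batched query. It is not the code analysed in the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Compares the number of queries issued by the two access styles.
class OneByOneDemo {
    // One query per id, as in the loop on the slide.
    static int queriesOneByOne(List<Integer> ids) {
        FakeUserDb db = new FakeUserDb();
        for (int id : ids) {
            db.findNameById(id);
        }
        return db.queriesExecuted;
    }

    // A single query that fetches all ids at once.
    static int queriesBatched(List<Integer> ids) {
        FakeUserDb db = new FakeUserDb();
        db.findNamesByIds(ids);
        return db.queriesExecuted;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3);
        System.out.println("one-by-one queries: " + queriesOneByOne(ids));
        System.out.println("batched queries: " + queriesBatched(ids));
    }
}

// Hypothetical in-memory stand-in for a DBMS; it only counts queries.
class FakeUserDb {
    private final Map<Integer, String> names = new HashMap<>();
    int queriesExecuted = 0;

    FakeUserDb() {
        names.put(1, "alice");
        names.put(2, "bob");
        names.put(3, "carol");
    }

    // Models: select name from user u where u.id = ?
    String findNameById(int id) {
        queriesExecuted++;
        return names.get(id);
    }

    // Models: select name from user u where u.id in (...)
    List<String> findNamesByIds(List<Integer> ids) {
        queriesExecuted++;
        List<String> result = new ArrayList<>();
        for (int id : ids) {
            result.add(names.get(id));
        }
        return result;
    }
}
```

With N ids the loop issues N queries while the batched call issues one, which is why the pattern is invisible when each SQL statement is inspected in isolation on the DBMS side.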

Page 4: Common performance problems in database-access code

[Diagram: Code and DBMS]

• Inefficient data access
• Unneeded data access

Page 5: Unneeded data access

[Diagram: Code and DBMS]

Required in the code: user data
Actual request sent: join of the Team table and the User table

Page 6: Common performance problems in database-access code

[Diagram: Code and DBMS]

• Inefficient data access
• Unneeded data access
• Overly-strict isolation level

Page 7: Overly-strict isolation level

[Diagram: Code and DBMS]

Code: reading only the user name

Request sent:
set read-write transaction
select … from user

Page 8: Cannot find problems by only looking at the DBMS side

Code:
for each userId {
    …
    executeQuery("select … where u.id = userId");
}

DBMS:
select … from user where u.id = 1
select … from user where u.id = 2
…

Page 9: Database accesses are abstracted

[Diagram: Code, abstraction layers, DBMS]

Problems become more complex and frequent after adding abstraction layers

Page 10: Accessing data using ORM incorrectly

A database entity class:
@Entity
class User {
    …
}

Objects:
for each userId {
    User u = findUserById(userId);
    u.getName();
}

SQL:
select … from user where u.id = 1
select … from user where u.id = 2
…

Page 11: Research statement

There are common problems in how developers write database-access code, and these problems are hard to diagnose by only looking at the DBMS. In addition, due to the use of different database abstraction layers, these problems become even harder to find.

By finding problems in the database access code and configuring the abstraction layers correctly, we can significantly improve the performance of database-centric systems.


Page 12: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 13: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 14: Inefficient data access detection framework

[Diagram: source code → inefficient data access detection and ranking framework (detection, ranking) → ranked inefficient data access code, ranked according to performance impact]

Page 15: Inefficient data access detection framework

[Diagram: source code → inefficient data access detection and ranking framework (detection, ranking) → ranked inefficient data access code, ranked according to performance impact]

Focus on one type of inefficient data access: one-by-one processing

Page 16: Detecting one-by-one processing using static analysis

First, find all the functions that read from or write to the DBMS:

class User {
    getUserById() …
    getUserAddress() …
}

Identify the positions of all loops:

for each userId {
    foo(userId)
}

Check whether the loop contains any database-accessing function, directly or through a called function:

foo(userId) {
    getUserById(userId)
}
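The three steps above can be sketched as a reachability check over a call graph. This is a simplified illustration under stated assumptions, not the tool from the talk: the call graph, the set of database-accessing functions, and the loop table are hand-written maps here, whereas a real static analysis would extract them from source code. The names `LoopDbAccessDetector`, `loopOverUserIds`, `loopOverNames`, `bar`, and `formatName` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LoopDbAccessDetector {
    // True if fn is, or transitively calls, a database-accessing function.
    static boolean reachesDbAccess(String fn,
                                   Map<String, List<String>> callGraph,
                                   Set<String> dbFunctions) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(fn);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (!seen.add(current)) {
                continue; // already visited; also guards against recursion
            }
            if (dbFunctions.contains(current)) {
                return true;
            }
            for (String callee : callGraph.getOrDefault(current, Collections.<String>emptyList())) {
                work.push(callee);
            }
        }
        return false;
    }

    // Returns the loops whose bodies reach a database-accessing function.
    static List<String> flagLoops(Map<String, List<String>> loopCalls,
                                  Map<String, List<String>> callGraph,
                                  Set<String> dbFunctions) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, List<String>> loop : loopCalls.entrySet()) {
            for (String called : loop.getValue()) {
                if (reachesDbAccess(called, callGraph, dbFunctions)) {
                    flagged.add(loop.getKey());
                    break;
                }
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Step 1: functions known to access the DBMS (hand-written here).
        Set<String> dbFunctions = new HashSet<>(Arrays.asList("getUserById", "getUserAddress"));
        // Caller -> direct callees.
        Map<String, List<String>> callGraph = new LinkedHashMap<>();
        callGraph.put("foo", Arrays.asList("getUserById"));
        callGraph.put("bar", Arrays.asList("formatName"));
        // Step 2: each loop and the functions called in its body.
        Map<String, List<String>> loopCalls = new LinkedHashMap<>();
        loopCalls.put("loopOverUserIds", Arrays.asList("foo"));
        loopCalls.put("loopOverNames", Arrays.asList("bar"));
        // Step 3: flag loops that reach a database-accessing function.
        System.out.println(flagLoops(loopCalls, callGraph, dbFunctions));
    }
}
```

Following calls transitively matters because, as in the slide, the loop body calls `foo` and only `foo` calls `getUserById`.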

Page 17: Inefficient data access detection framework

[Diagram: source code → inefficient data access detection and ranking framework (detection, ranking) → ranked inefficient data access code, ranked according to performance impact]

Page 18: Inefficient data access detection framework

[Diagram: source code → inefficient data access detection and ranking framework (detection, ranking) → ranked inefficient data access code, ranked according to performance impact]

Page 19: Assessing inefficient data access impact by fixing the problem

Code with inefficient data access:
for u in users {
    update u
}
Execution: execute test suite 30 times → response time

Code without inefficient data access:
users.updateAll()
Execution: execute test suite 30 times → response time after the fix

Result: Avg. % improvement
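A minimal sketch of how such an improvement figure might be computed is shown below. The slide does not give the exact formula, so this assumes the improvement is the relative drop in mean response time between the runs before and after the fix. The class name `ImprovementCalculator` and the sample timings are hypothetical and are not measurements from the study.

```java
class ImprovementCalculator {
    // Arithmetic mean of the measured response times.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Relative reduction in mean response time, as a percentage.
    // Assumed formula: (mean before - mean after) / mean before * 100.
    static double percentImprovement(double[] before, double[] after) {
        double meanBefore = mean(before);
        double meanAfter = mean(after);
        return (meanBefore - meanAfter) / meanBefore * 100.0;
    }

    public static void main(String[] args) {
        // Made-up response times; the talk uses 30 runs per version.
        double[] before = {200.0, 210.0, 190.0};
        double[] after = {100.0, 105.0, 95.0};
        System.out.println(percentImprovement(before, after));
    }
}
```

Repeating the test suite many times, as the slide describes, reduces the influence of run-to-run noise on the two means being compared.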

Page 20: Inefficient data access causes large performance impacts

[Figure: % improvement in response time (y-axis, 0% to 100%) for one-by-one processing]

Page 21: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 22: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 23: Intercepting system executions using byte code instrumentation

[Diagram: Code, ORM, DBMS]

• Intercept called functions in the code → needed data
• Intercept ORM-generated SQL → requested data
• Diff the needed data against the requested data

Page 24: Mapping data access to called functions

User.java:
@Entity
@Table(name = "user")
public class User {

    @Column(name = "name")
    String userName;

    @OneToMany
    List<Team> teams;

    public String getUserName() {
        return userName;
    }

    public void addTeam(Team t) {
        teams.add(t);
    }
}

We apply static analysis to find which database column a function is reading or modifying

Page 25: Identifying unneeded data access from intercepted data

Only the user name is needed in the application logic

Needed in code: user name
Requested: id, name, address, phone number
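The comparison on this slide amounts to a set difference between the requested and the needed columns. The sketch below assumes both sets have already been collected by the interception step and are represented as plain column names. The class `UnneededColumns` and the spelling `phone_number` are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class UnneededColumns {
    // Columns the ORM requested that the application logic never used.
    static Set<String> unneeded(Set<String> requested, Set<String> needed) {
        Set<String> diff = new TreeSet<>(requested); // copy, so inputs stay untouched
        diff.removeAll(needed);
        return diff;
    }

    public static void main(String[] args) {
        Set<String> requested = new TreeSet<>(
                Arrays.asList("id", "name", "address", "phone_number"));
        Set<String> needed = new TreeSet<>(Arrays.asList("name"));
        System.out.println(unneeded(requested, needed));
    }
}
```

For the slide's example, the difference is every requested column except the name, which is the data fetched from the DBMS without being used.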

Page 26: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 27: Overview of the thesis

• Detecting inefficient data access code
• Detecting unneeded data access
• Finding overly-strict isolation level

Status legend: finished work, under submission, future work

Page 28: A real-life example of a transaction abstraction problem

All functions in foo will be executed in transactions:

@Transactional
public class foo {

    int getFooVal() {
        return foo.val;
    }
    …
}

This function (getFooVal) does not need to be executed in a transaction

Where to put the transaction may affect system behavior and performance

Page 29: Plans for finding transaction problems related to abstraction

Bug reports → categories of transaction-related bugs

We plan to empirically study bug reports to find root causes and implement a detection framework


http://petertsehsun.github.io