Memories of Bug-Fixes

Sunghun Kim, Kai Pan, Jim Whitehead
{hunkim, pankai, ejw}@cs.ucsc.edu
University of California, Santa Cruz
Jun 23, 2015

Page 2: Memories of Bug Fixes

What is a bug (Zeller 2006)?

• This pointer, being null, is a bug
► An incorrect program state

• This software crashes; this is a bug
► An incorrect program execution

• This line 11 is buggy
► An incorrect program code

Page 3: Memories of Bug Fixes

Bugs?

// null dereference: o is still null when isGoodDay is false
public void nullDeref() {
    MyObject o = null;
    if (isGoodDay) {
        o = new MyObject("Hi");
    }
    System.out.println(o.toString());
}

Page 5: Memories of Bug Fixes

Bugs?

// stack buffer overrun for sizes greater than 14
void stack_buffer(void* src, int size) {
    char buffer[14];
    memcpy(buffer, src, size);
}

Page 7: Memories of Bug Fixes

Bugs?

if (…) {

setSelectedText("\t");

}

Page 8: Memories of Bug Fixes

Project-Specific Bug Fix Patterns

• Many bug fix patterns are specific to an individual project, and may not match any of the static patterns

• Example from the jEdit project:

JEditTextArea.java at transaction 114
- setSelectedText("\t");
+ insertTab();

JEditTextArea.java at transaction 86
- setSelectedText("\t");
+ insertTab();

Page 9: Memories of Bug Fixes

Bug?

if (requiredProjectRsc.exists() &&

requiredProjectRsc.isOpen()) {

}

Page 10: Memories of Bug Fixes

Project-Specific Bug Fix Patterns

• Example from the Eclipse project:

JavaProject.java, transaction 2024 (“Fix for bug 28434”)
- if (requiredProjectRsc.exists() &&
-     requiredProjectRsc.isOpen()) {
+ if (JavaProject.hasJavaNature(requiredProjectRsc)) {

DeltaProcessor.java, transaction 1945 (“Fix for bug 27499”)
- boolean isOpened=proj.isOpen();
- if (isOpened && this.hasJavaNature(proj))
+ if (JavaProject.hasJavaNature(proj))

Page 11: Memories of Bug Fixes

Horizontal and Vertical Bug Patterns

[Figure: bug patterns along two axes. Horizontal: general bugs (buffer overrun, null dereference). Vertical: project-specific bugs (jEdit example, Eclipse example).]

Page 13: Memories of Bug Fixes

Bug-Fix Memories – Basic Idea

[Figure: patterns are extracted from the bug fix changes in revisions 1 .. n-1 and stored in the memory. The code to examine is then searched for patterns against the memory.]

Page 14: Memories of Bug Fixes

Talk Overview

• Detection of bug fix changes
• Mining vertical bugs
► Abstracting code
• Evaluation
• Conclusions
• Future Work

Page 15: Memories of Bug Fixes

Retrieving Bug Fix Changes

• Software projects today record their development history using Software Configuration Management tools

• As developers make changes, they record a reason along with the change
► In the change log message

• When developers fix a bug in the software, they tend to record log messages with some variation of the words “fixed” or “bug”
► “Fixed null pointer bug”

• It is possible to mine the change history of a software project to uncover these bug-fix changes

• That is, we retrospectively recover those changes that developers have marked as containing a bug fix
► We assume they are not lying
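The keyword heuristic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the regular expression, the function names, and the sample history are assumptions chosen to match "some variation of the words fixed or bug".

```python
import re

# Assumed keyword pattern: variations of "fix" and "bug" as whole words.
FIX_PATTERN = re.compile(r"\b(fix(e[sd])?|bugs?)\b", re.IGNORECASE)


def is_bug_fix(log_message):
    """Return True when the change log message looks like a bug fix."""
    return FIX_PATTERN.search(log_message) is not None


def bug_fix_changes(history):
    """history: iterable of (transaction_id, log_message) pairs."""
    return [tid for tid, message in history if is_bug_fix(message)]


# Hypothetical history; two of the messages are taken from the slides.
history = [
    (1, "Initial import"),
    (2, "Fixed null pointer bug"),
    (3, "Add tab insertion helper"),
    (4, "Fix for bug 28434"),
]
print(bug_fix_changes(history))
```

Matching on whole words keeps a message such as "Update prefix handling" from being counted as a fix, but the approach still trusts developers to label their changes honestly.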

Page 16: Memories of Bug Fixes

Hunks and Hunk Pairs

[Figure: hunk pairs between revision n-1 (has bug hunks) and revision n (has fix hunks). Hunk pair types: modification (deleted hunk and added hunk), addition (empty deleted hunk and added hunk), deletion (deleted hunk and empty added hunk).]
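One way to recover hunk pairs from two revisions of a file is a line-level diff. The sketch below uses Python's standard `difflib.SequenceMatcher`; mapping its opcodes onto the three hunk pair types is an assumption made for illustration, and the sample lines (apart from the jEdit calls) are hypothetical.

```python
import difflib

# Assumed mapping from difflib opcodes to the hunk pair types on the slide.
PAIR_TYPES = {"replace": "modification", "insert": "addition", "delete": "deletion"}


def hunk_pairs(old_lines, new_lines):
    """Return (pair_type, deleted_hunk, added_hunk) for each changed region."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines belong to no hunk
        # An addition has an empty deleted hunk; a deletion an empty added hunk.
        pairs.append((PAIR_TYPES[tag], old_lines[i1:i2], new_lines[j1:j2]))
    return pairs


old = ["a();", 'setSelectedText("\\t");', "b();", "obsolete();", "c();"]
new = ["a();", "insertTab();", "b();", "c();", "added();"]
for pair in hunk_pairs(old, new):
    print(pair)
```

For a bug-fix change, the deleted hunk of each pair is treated as the bug hunk and the added hunk as the fix hunk.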

Page 17: Memories of Bug Fixes

Detecting Vertical Bugs (Patterns)

• Detecting bug patterns
► Saving exact code in bug and fix hunks doesn’t work, since there is rarely an exact match.
► Need a method for abstracting changes to find patterns

• Approach
► Abstract code in each bug fix change
► Save abstracted bug and fix code in a database (the “bug fix memory”)
► Can search existing code to see if it matches a bug fix pattern
► Can suggest code to fix the bug

Page 18: Memories of Bug Fixes

Process for Abstracting Code

• Four step process
► Raw component extraction
  • Parse source code, and burst out individual syntactic elements
► Normalization
  • Substitute type names for variables, string literals, constants (abstract to types)
► Information filtering
  • Remove elements that are too common to yield project-specific patterns
► Diff filtering
  • Remove code components that are common in bug and fix hunks, yielding only code unique to the change

Page 19: Memories of Bug Fixes

Raw Component Extraction

• Step 1: Convert statements inside change hunks so they lie on a single line
► Eliminate whitespace
► Concatenate multi-line statements to one line
► Concatenate conditionals for complex statements (if, while, etc.) to one line

• Step 2: Extract raw components
► Component is a non-leaf node in the syntax tree of a single line
► Bursts out complex statements into constituent parts
  • Each portion of a complex conditional is a separate component
► Additionally, separate out a method call and its parameters
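Step 1 and a small part of Step 2 can be approximated with regular expressions. The sketch below is an assumption-laden simplification: a real implementation would walk the syntax tree produced by a Java parser, whereas this version breaks on statement terminators and splits a condition only at top-level `&&` and `||`. It would mishandle string literals containing `;` or braces, and operators nested inside parentheses. The function names are hypothetical.

```python
import re


def to_single_line_statements(hunk_text):
    """Step 1: collapse whitespace and put each statement on one line."""
    flat = re.sub(r"\s+", " ", hunk_text)
    statements = []
    # Cut after each ';', '{' or '}' so multi-line statements are rejoined.
    for piece in re.findall(r"[^;{}]+[;{}]?|[;{}]", flat):
        statement = piece.strip(" ;{}")
        if statement:
            statements.append(statement)
    return statements


def condition_parts(statement):
    """Part of Step 2: each portion of a complex conditional is a component."""
    match = re.match(r"(?:if|while)\s*\((.*)\)$", statement)
    if match is None:
        return []
    return [part.strip() for part in re.split(r"&&|\|\|", match.group(1))]


hunk = """
if (foo.flag > 5
        && foo.ready()) {
    i=1;
    foo.create("example");
    initiate(6,bar);
}
"""
statements = to_single_line_statements(hunk)
print(statements)
print(condition_parts(statements[0]))
```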

Page 20: Memories of Bug Fixes

Raw Component Extraction Example

• Initial code

if (foo.flag > 5 && foo.ready()) {
    i=1;
    foo.create("example");
    initiate(6,bar);
}

• Extracted Raw Components

foo.flag
foo.flag > 5
foo.ready()
ready()
foo.flag > 5 && foo.ready()
if (foo.flag > 5 && foo.ready())
i=1
"example"
foo.create(.) "example"
create(.) "example"
initiate(,) 6, bar

[Figure: syntax tree of the statement if (foo.flag > 5 && foo.ready()), with nodes if, &&, >, ., foo, flag, 5, and ready().]

Page 21: Memories of Bug Fixes

Normalization

• To further improve the ability to match code, perform abstraction of instances to types
► Replace variable instance with its type
  • Permits matching on type, rather than instance
  • foo.flag >= 5 → Foo.flag >= 5 (type of foo is Foo)
► For literals, insert new component with type
  • i=1 yields int=1 and int=int
► For method calls, replace each parameter with type of parameter
  • Use “*” for unknown types (we only do one-pass parse)
  • initiate(,) 6, bar → initiate(,) int,* (type of bar is unknown)
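The normalization rules above can be illustrated with a small token-level sketch. It assumes a symbol table (`var_types`) is already available from parsing and handles only integer and string literals. The regular expressions and function names are hypothetical, and applying the literal rule to `Foo.flag >= 5` extends the slide's `int=int` example by analogy.

```python
import re

# An identifier that is not a field (after '.') and not a method name (before '(').
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*(?![\w(])")
INT_LITERAL = re.compile(r"\b\d+\b")
STRING_LITERAL = re.compile(r'"[^"]*"')


def replace_variables(component, var_types):
    """Replace each known variable instance with its type."""
    return IDENTIFIER.sub(lambda m: var_types.get(m.group(0), m.group(0)), component)


def replace_literals(component):
    """Replace string and integer literals with their types."""
    return INT_LITERAL.sub("int", STRING_LITERAL.sub("String", component))


def normalize(component, var_types):
    """Return the new components produced by normalization, without duplicates."""
    typed = replace_variables(component, var_types)
    variants = []
    for variant in (typed, replace_literals(typed)):
        if variant != component and variant not in variants:
            variants.append(variant)
    return variants


def normalize_parameters(parameters, var_types):
    """Replace each call parameter with its type, using '*' when unknown."""
    types = []
    for parameter in (p.strip() for p in parameters.split(",")):
        if INT_LITERAL.fullmatch(parameter):
            types.append("int")
        elif STRING_LITERAL.fullmatch(parameter):
            types.append("String")
        else:
            types.append(var_types.get(parameter, "*"))
    return ",".join(types)


print(normalize("i=1", {"i": "int"}))
print(normalize("foo.flag >= 5", {"foo": "Foo"}))
print(normalize_parameters("6, bar", {}))
```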

Page 22: Memories of Bug Fixes

Information Filtering Goal

• After normalization, resulting components are candidates for insertion into database
► Problem: many commonly occurring statement types
  • int=int
► Want to eliminate these, and others that don’t contribute unique information about bug fixes
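The slides do not say how "too common" is decided, so the following sketch assumes a simple frequency threshold: a component is dropped when it appears in more than a given share of all changes. The `max_share` parameter, the function name, and the sample components are assumptions for illustration only.

```python
from collections import Counter


def filter_common(components_per_change, max_share=0.5):
    """Drop components that occur in more than max_share of all changes."""
    change_count = len(components_per_change)
    seen_in = Counter()
    for components in components_per_change:
        seen_in.update(set(components))  # count each change at most once
    too_common = {c for c, n in seen_in.items() if n / change_count > max_share}
    return [
        [c for c in components if c not in too_common]
        for components in components_per_change
    ]


changes = [
    ["int=int", "Foo.flag >= int"],
    ["int=int", "insertTab()"],
    ["int=int", "JavaProject.hasJavaNature(*)"],
]
print(filter_common(changes))
```

Here `int=int` occurs in every change and is removed, while the rarer, more project-specific components survive.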

Page 23: Memories of Bug Fixes

Diff Filtering and Storing Memories

• As a final filtering step, keep only those components that are unique to either bug or fix hunks

► Duplicate components are eliminated, since they do not represent the bug or its fix

• After diff filtering step, store all components into the database (“memory”)

► Components record their transaction, file name, bug or fix hunk, etc.

► Also store initial source code of bug and fix hunks
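Diff filtering amounts to removing the components shared by the bug and fix hunks and then recording what remains. The sketch below keeps the memory as an in-process list of dictionaries, which is an assumption; the slides only say "database". The record fields and function names are hypothetical, and the components are left unnormalized for readability.

```python
def diff_filter(bug_components, fix_components):
    """Keep only the components unique to the bug hunk or to the fix hunk."""
    common = set(bug_components) & set(fix_components)
    bug_only = [c for c in bug_components if c not in common]
    fix_only = [c for c in fix_components if c not in common]
    return bug_only, fix_only


def store(memory, transaction, file_name, bug_components, fix_components,
          bug_source, fix_source):
    """Append one bug fix record to the memory and return it."""
    bug_only, fix_only = diff_filter(bug_components, fix_components)
    record = {
        "transaction": transaction,
        "file": file_name,
        "bug": bug_only,
        "fix": fix_only,
        "bug_source": bug_source,  # initial source code is stored as well
        "fix_source": fix_source,
    }
    memory.append(record)
    return record


memory = []
record = store(
    memory, 2024, "JavaProject.java",
    ["requiredProjectRsc", "requiredProjectRsc.exists()", "requiredProjectRsc.isOpen()"],
    ["requiredProjectRsc", "JavaProject.hasJavaNature(requiredProjectRsc)"],
    "if (requiredProjectRsc.exists() && requiredProjectRsc.isOpen()) {",
    "if (JavaProject.hasJavaNature(requiredProjectRsc)) {",
)
print(record["bug"], record["fix"])
```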

Page 24: Memories of Bug Fixes

Searching the Memory

• The memory database contains extracted adaptive bug and fix patterns for a given project

• Can use this memory to find code that matches bug code in the memory

• Usage scenario
► Developer working in their favorite development environment
► Receives feedback when code they are developing matches a stored bug pattern
► Can also suggest potential fixes from stored bug fix code
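A lookup against the memory might look like the sketch below. The matching rule is an assumption, since the slides do not define it: a stored pattern matches when all of its bug components occur among the components of the code being examined. The record layout and function name are hypothetical; the stored pattern is the jEdit example from earlier in the talk.

```python
def search(memory, code_components):
    """Return the stored records whose bug components all occur in the code."""
    present = set(code_components)
    hits = []
    for record in memory:
        bug = set(record["bug"])
        if bug and bug <= present:  # assumed rule: every bug component matches
            hits.append(record)
    return hits


memory = [
    {
        "transaction": 86,
        "file": "JEditTextArea.java",
        "bug": ['setSelectedText("\\t")'],
        "fix_source": "insertTab();",
    },
]

# Components extracted from the code a developer is currently editing.
code_components = ["if", 'setSelectedText("\\t")']
for hit in search(memory, code_components):
    print("matches bug pattern from transaction", hit["transaction"])
    print("suggested fix:", hit["fix_source"])
```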

Page 25: Memories of Bug Fixes

IDE Integration

[Figure: IDE integration, showing bug detection and fix suggestion.]

Page 26: Memories of Bug Fixes

Evaluation

• We evaluated the memory to determine how well it captures new bug fix changes
► Online learning approach
► Specifically, we create a memory for transactions 1 to n-1
► At transaction n, for bug fix changes we examine whether the bug hunks are found in the memory
  • This is a “half hit”
► If found, we also examine whether the fix hunk is found too
  • This is a “full hit”
► Examined same 5 project histories
  • ArgoUML, Columba, Eclipse, jEdit, Scarab

• This can be viewed as a proxy for how well the approach might work for bug and fix prediction
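The online evaluation loop can be sketched as follows. Two points are assumptions rather than details from the talk: a hunk counts as "found" when at least one of its components is in the memory, and the memory is kept as two plain sets. The change records, including the `insertSpaces()` fix, are hypothetical.

```python
def found(components, memory):
    """Assumed rule: a hunk is found when any of its components is in memory."""
    return any(component in memory for component in components)


def evaluate(changes):
    """changes: dicts in transaction order with keys is_fix, bug, and fix."""
    bug_memory, fix_memory = set(), set()
    fix_changes = half_hits = full_hits = 0
    for change in changes:
        if not change["is_fix"]:
            continue
        fix_changes += 1
        # Test transaction n against the memory built from 1 .. n-1.
        if found(change["bug"], bug_memory):
            half_hits += 1
            if found(change["fix"], fix_memory):
                full_hits += 1
        # Only after testing does transaction n join the memory.
        bug_memory.update(change["bug"])
        fix_memory.update(change["fix"])
    return fix_changes, half_hits, full_hits


tab_bug = {'setSelectedText("\\t")'}
changes = [
    {"is_fix": True, "bug": tab_bug, "fix": {"insertTab()"}},
    {"is_fix": False, "bug": set(), "fix": set()},
    {"is_fix": True, "bug": tab_bug, "fix": {"insertTab()"}},
    {"is_fix": True, "bug": tab_bug, "fix": {"insertSpaces()"}},
]
print(evaluate(changes))
```

The first fix change cannot hit because the memory is still empty; the later ones can, which is what makes the measurement a proxy for prediction.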

Page 27: Memories of Bug Fixes

Half and Full Hit

[Figure: memories of bug and fix hunks are built from transactions 1 .. n-1. For a fix change at transaction n, finding its bug hunk in the memories is a half hit; finding its fix hunk as well is a full hit.]

Page 28: Memories of Bug Fixes

True and False Positives

[Figure: memories are built from transactions 1 .. n-1. A half hit found for a fix change at transaction n is a true positive half hit; a half hit found for a non-fix change at transaction n is a false positive half hit.]
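Given one outcome per change, the two rates can be computed as below. The definitions are assumptions inferred from the figure: the true positive rate is the share of fix changes with a hit, and the false positive rate is the share of non-fix changes with a hit. The outcome data are made up for illustration.

```python
def hit_rates(outcomes):
    """outcomes: (is_fix_change, hit_found) pairs, one per examined change.

    Returns (true_positive_rate, false_positive_rate) in percent.
    """
    fix_total = sum(1 for is_fix, _ in outcomes if is_fix)
    non_fix_total = len(outcomes) - fix_total
    true_positives = sum(1 for is_fix, hit in outcomes if is_fix and hit)
    false_positives = sum(1 for is_fix, hit in outcomes if not is_fix and hit)
    tp_rate = 100.0 * true_positives / fix_total if fix_total else 0.0
    fp_rate = 100.0 * false_positives / non_fix_total if non_fix_total else 0.0
    return tp_rate, fp_rate


# Hypothetical outcomes: four fix changes and four non-fix changes.
outcomes = [
    (True, True), (True, False), (True, False), (True, True),
    (False, True), (False, False), (False, False), (False, False),
]
print(hit_rates(outcomes))
```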

Page 29: Memories of Bug Fixes

True Positive Hit Rates

[Figure: bar chart of true positive hit rate (axis 0 to 45) by project (ArgoUML, Columba, Eclipse, jEdit, Scarab), with full hit and half hit series.]

Page 30: Memories of Bug Fixes

False Positive Hit Rates

[Figure: bar chart of false positive hit rate (axis 0 to 35) by project (ArgoUML, Columba, Eclipse, jEdit, Scarab), with full hit and half hit series.]

Page 31: Memories of Bug Fixes

True Positive and False Positive Full Hit Rates

[Figure: bar chart of hit rate (axis 0 to 18) by project (ArgoUML, Columba, Eclipse, jEdit, Scarab), with TP full hit and FP full hit series.]

Page 32: Memories of Bug Fixes

True Positive and False Positive Full Hit Rates

• Bug fix memories work well
► Captures 19.3%-40.3% of bugs (half-hits)
► But, also captures a lot of non-bug changes (20.8%-32.5%)

Page 37: Memories of Bug Fixes

PMD VS Fix Memories

• PMD is a bug finding tool based on a static syntax checker

• Bugs found by PMD and by fix memories are largely exclusive

[Figure: diagrams of the bugs found by PMD and by fix memories for two projects. Values shown for ArgoUML: 40.3%, 6.5%, and 3%. Values shown for Eclipse: 38.7%, 6.5%, and 2.3%.]

Page 38: Memories of Bug Fixes

Conclusions

• It is now possible to reliably extract bug fix memories from software project evolution data

• Bug fix memories work well
► Captures 19.3%-40.3% of bugs (half-hits)
► But, also captures a lot of non-bug changes (20.8%-32.5%)

• Bugs found by fix memories and by PMD are mostly exclusive
► Our approach complements other bug finding tools

Page 39: Memories of Bug Fixes

Future Work

• Developing other pattern extracting algorithms
► To remove false positives
► AST, Slicing, Control flow, etc.

• Comparing fix memories with more bug finding tools
► FindBugs, JLint, etc.