Duplicate Bug Reports Considered Harmful ... Really?

Nicolas Bettenburg (Saarland University, Queenʼs University), Rahul Premraj (Saarland University, Vrije Uni. Amsterdam), Tom Zimmermann (University of Calgary, Microsoft Research), Sunghun Kim (MIT CSAIL, Hong Kong University)

Dec 18, 2014

Talk given at ICSM 2008 Conference in Beijing, China.
Duplicate bug reports are commonly said to pollute bug reporting systems and to have negative effects on a development team's productivity. Therefore, duplicate bug reports are ignored once identified. The findings of this research show that duplicate reports actually contain extra information that is not present in the original bug reports, and that developers can potentially benefit from this information. We conduct experiments and a case study on ECLIPSE to quantify the amount of extra information. We show that this extra information can be used to enhance techniques related to bug fixing, such as triaging.
Transcript
Page 1: Duplicate Bug Reports Considered Harmful ... Really?

Nicolas Bettenburg, Saarland University, Queenʼs University

Rahul Premraj, Saarland University, Vrije Uni. Amsterdam

Tom Zimmermann, University of Calgary, Microsoft Research

Sunghun Kim, MIT CSAIL, Hong Kong University

Duplicate Bug Reports Considered Harmful ... Really?!

Pages 3-7: Duplicate Bug Reports Considered Harmful ... Really?

Duplicate Bug Reports

[Figure: a bug database. Report A (# 2271) is filed for a BUG; a second report, B (# 3219), is then filed, and the two reports end up grouped together in the database.]

Page 8: Duplicate Bug Reports Considered Harmful ... Really?

What are the reasons for duplicates?

Page 9: Duplicate Bug Reports Considered Harmful ... Really?

Inexperienced Users

Page 10: Duplicate Bug Reports Considered Harmful ... Really?

Poor Search Feature

Page 11: Duplicate Bug Reports Considered Harmful ... Really?

Multiple Failures - One Defect

Page 12: Duplicate Bug Reports Considered Harmful ... Really?

Accidental Resubmission

Page 13: Duplicate Bug Reports Considered Harmful ... Really?

Intentional Resubmission

FIX THAT BUG!

Page 14: Duplicate Bug Reports Considered Harmful ... Really?

ECLIPSE: 20% Duplicates

371 per month

Page 15: Duplicate Bug Reports Considered Harmful ... Really?

Duplicate reports are usually ignored once identified!

Page 16: Duplicate Bug Reports Considered Harmful ... Really?

But wait! Is this really the right thing to do?

Page 17: Duplicate Bug Reports Considered Harmful ... Really?

“Duplicates [...] often add useful information. [It is unfortunate that this information is filed in a new report.]”

Developer, quoted in “What Makes a Good Bug Report?” (to appear in FSE 2008)

Pages 18-19: Duplicate Bug Reports Considered Harmful ... Really?

“Bug duplicates can provide valuable information [...]”

Alan Page, Director of Test Excellence, Microsoft

Page 20: Duplicate Bug Reports Considered Harmful ... Really?

2 EXPERIMENTS

Experiment 1: Do duplicate bug reports contain additional information?

Experiment 2: Can additional information improve bug triaging?

Page 21: Duplicate Bug Reports Considered Harmful ... Really?

Experiment 1: Do duplicate bug reports contain additional information?

Page 22: Duplicate Bug Reports Considered Harmful ... Really?

The infoZilla Tool

Detects and Extracts Structural Information:

• SOURCE CODE

• PATCHES

• STACK TRACES

• SCREENSHOTS

Bug 137808

Summary: Exceptions from createFromString lock-up the editor

Product: [Modeling] EMF

Reporter: Patrick Sodre <[email protected]>

Component: Core

Assignee: Marcelo Paternostro <[email protected]>

Status: VERIFIED FIXED

QA Contact:

Severity: normal

Priority: P3

CC: [email protected]

Version: 2.2

Target Milestone: ---

Hardware: PC

OS: Windows XP

Whiteboard:

Description:

Opened: 2006-04-20 14:25 -0400

As discussed on the newsgroup under the Thread with the same name I am opening

this bug entry. Here is a history of the thread.

-- From Ed Merks

Patrick,

The value is checked before it's applied and can't be applied until it's valid.

But this BigDecimal cases behaves oddly because the exception thrown by

new BigDecimal("badvalue")

has a null message and the property editor relies on returning a non-null

message string to indicate there is an error.

Please open a bugzilla which I'll fix like this:

### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file: /cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006 16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006 11:59:10 -0000
@@ -162,7 +162,8 @@
 }
 catch (Exception exception)
 {
- return exception.getMessage();
+ String message = exception.getMessage();
+ return message == null ? exception.toString() : message;
 }
 }
 Diagnostic diagnostic =
 Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);

Patrick Sodre wrote:

Hi,

It seems that if the user inputs an invalid parameter that gets created from

"createFromString" the Editor locks-up until the user explicitly calls "restore

Default Value".

Is this the expected behavior or could something better be done? For

instance if an exception is thrown restore the value back to what it was before

after displaying a pop-up error message.

I understand that for DataTypes defined by the user he/she should take care

of catching the exceptions but for the default ones like BigInteger/BigDecimal

I think the EMF runtime could do some of the grunt work...

If you think this is something worth pursuing I could post an entry in

Bugzilla.

Regards,

Patrick Sodre

Below is the stack trace that I got from the Editor...

java.lang.NumberFormatException
    at java.math.BigDecimal.<init>(BigDecimal.java:368)
    at java.math.BigDecimal.<init>(BigDecimal.java:647)
    at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
    at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
    at org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
    at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
    at org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
    at org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
    at

------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------

The fix has been committed to CVS. Thanks for reporting this problem.

Extracting Structural Information from Bug Reports, MSR 2008
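To make the idea of detecting structural information concrete, here is a minimal Python sketch that flags stack traces and patches in the text of a report. It is not infoZilla's implementation: the regular expressions, the function name `detect_elements`, and the shortened sample text are assumptions made for illustration only.

```python
import re

# Hypothetical patterns chosen for this sketch; infoZilla's real rules are not shown in the talk.
STACK_FRAME = re.compile(r"^[ \t]*at[ \t]+[\w.$<>]+\([\w.]+:\d+\)[ \t]*$", re.MULTILINE)
EXCEPTION_LINE = re.compile(r"^[ \t]*[\w.]+(?:Exception|Error)\b.*$", re.MULTILINE)
DIFF_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@", re.MULTILINE)


def detect_elements(text: str) -> dict:
    """Report which kinds of structural elements appear in a bug report's text."""
    frames = STACK_FRAME.findall(text)
    return {
        # A stack trace needs an exception line plus at least one "at ..." frame.
        "stack_trace": bool(EXCEPTION_LINE.search(text)) and len(frames) >= 1,
        "stack_frames": len(frames),
        # A unified-diff hunk header is treated as evidence of a patch.
        "patch": bool(DIFF_HUNK.search(text)),
    }


# Shortened excerpt modelled on Bug 137808 above.
report = """Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:368)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
@@ -162,7 +162,8 @@
- return exception.getMessage();
+ String message = exception.getMessage();
"""

print(detect_elements(report))
```

Screenshots and source code would need other signals (for example attachment types), which this sketch leaves out.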

Pages 23-24: Duplicate Bug Reports Considered Harmful ... Really?

Experimental Setup

[Figure: infoZilla extracts the structural elements from a master report and from its duplicate report. The text of the duplicate is appended to the master report to form an extended report, and the elements of the master report are compared with the elements of the extended report.]
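The comparison in this setup can be sketched in a few lines of Python. This is only an illustration of the counting step, assuming each report has already been reduced to lists of extracted elements; the function name `count_unique_elements`, the dictionary layout, and the sample element strings are hypothetical.

```python
from collections import Counter


def count_unique_elements(reports):
    """Count distinct structural elements per kind across one or more reports."""
    unique = {}
    for report in reports:
        for kind, items in report.items():
            unique.setdefault(kind, set()).update(items)
    return Counter({kind: len(items) for kind, items in unique.items()})


# Hypothetical extracted elements for one master report and one of its duplicates.
master = {
    "stack_trace": ["NumberFormatException@BigDecimal.java:368"],
    "patch": [],
}
duplicate = {
    "stack_trace": [
        "NumberFormatException@BigDecimal.java:368",  # already in the master
        "NullPointerException@Example.java:12",       # new information
    ],
    "patch": ["PropertyDescriptor.java r1.10"],
}

master_counts = count_unique_elements([master])
extended_counts = count_unique_elements([master, duplicate])  # master + duplicate

# Counter subtraction keeps only the kinds where the extended report has more.
added = extended_counts - master_counts
print(dict(added))
```

Because elements are stored in sets, information repeated in the duplicate is not counted twice; only elements that are new to the master show up in `added`.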

Pages 25-29: Duplicate Bug Reports Considered Harmful ... Really?

ECLIPSE

16,511 Master Reports

27,838 Duplicate Reports

Unique elements per report:

[Bar chart: Master vs. Extended reports for Patches, Stacktraces, and Screenshots; y-axis from 0 to 2.5. Bar labels in extraction order: 0.29, 1.42, 1.94, 0.14, 0.50, 1.83.]

Pages 30-31: Duplicate Bug Reports Considered Harmful ... Really?

Experiment 1: Do duplicate bug reports contain additional information?

They do!

Page 32: Duplicate Bug Reports Considered Harmful ... Really?

Experiment 2: Can additional information improve bug triaging?

Pages 33-37: Duplicate Bug Reports Considered Harmful ... Really?

Bug Triage

[Figure: a report for a BUG goes to a developer, and the bug is marked Fixed (✓). As many more BUG reports arrive, a Triager is added to handle them.]

Page 38: Duplicate Bug Reports Considered Harmful ... Really?

Experimental Setup

• Machine learning to predict developers

• Train using master reports

• Train using extended reports

• 10 Runs
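The setup above can be sketched as follows. The slides do not name the learner, so this Python example swaps in a deliberately simple word-count scorer as a stand-in; the function names, developer names, and toy reports are all hypothetical, and the printed numbers illustrate the mechanics only, not the study's results.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]


def train(examples):
    """examples: iterable of (report_text, developer). Returns per-developer word counts."""
    model = defaultdict(Counter)
    for text, developer in examples:
        model[developer].update(tokenize(text))
    return model


def predict_top_k(model, text, k=5):
    """Rank developers by how often their past reports used the new report's words."""
    words = tokenize(text)
    scores = {dev: sum(counts[w] for w in words) for dev, counts in model.items()}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda dev: (-scores[dev], dev))[:k]


def top_k_hit_rate(model, test_set, k=5):
    """Fraction of test reports whose actual developer is among the top k predictions."""
    hits = sum(dev in predict_top_k(model, text, k) for text, dev in test_set)
    return hits / len(test_set)


# Toy data: the extended report is the master text plus the text of a duplicate.
master_train = [
    ("editor locks up on invalid value", "zoe"),
    ("build fails on startup", "bob"),
]
extended_train = [
    ("editor locks up on invalid value numberformatexception in bigdecimal", "zoe"),
    ("build fails on startup", "bob"),
]
test_set = [("numberformatexception thrown in bigdecimal", "zoe")]

print(top_k_hit_rate(train(master_train), test_set, k=1))
print(top_k_hit_rate(train(extended_train), test_set, k=1))
```

The hit rate used here is one plausible reading of the Top-5 evaluation; the exact precision definition behind the chart on the next slide may differ.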

Page 39: Duplicate Bug Reports Considered Harmful ... Really?

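Results for predicting Top-5 developers

[Bar chart: Precision per Run (1 to 10, and All) for Master vs. Extended; y-axis ticks at 43.75, 52.50, 61.25, and 70.00. Bar labels in extraction order: 56, 65, 60, 58, 51, 60, 61, 52, 53, 52, 47, 51, 56, 57, 52, 48, 57, 55, 48, 47, 47, 42.]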

Pages 40-41: Duplicate Bug Reports Considered Harmful ... Really?

Experiment 2: Can additional information improve bug triaging?

They can!

Pages 42-43: Duplicate Bug Reports Considered Harmful ... Really?

Duplicate reports are usually ignored once identified! (X)

Merge Reports
