Searching for Configurations in Clone Evaluation: A Replication Study. C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake. CENTRE FOR RESEARCH ON EVOLUTION, SEARCH AND TESTING, DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY COLLEGE LONDON

Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Apr 13, 2017

Transcript
Page 1: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Searching for Configurations in Clone Evaluation: A Replication Study

C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake

CENTRE FOR RESEARCH ON EVOLUTION, SEARCH AND TESTING, DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY COLLEGE LONDON

Page 2: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Code Clone

Page 3: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Clone Detectors

if (x==0) then y=y+1;

if (check==0) then count=count+1;

$p ($p==0) $p $p=$p+1;

$p ($p==0) $p $p=$p+1;

if_s

if ( cond_e ) then assign_e

if_s

if ( cond_e ) then assign_e

Deckard

CCFinder

Simian

NiCad
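The two fragments on this slide differ only in their identifier names, and the `$p` lines show them becoming identical once every word token is replaced by a placeholder. A minimal Python sketch of that idea follows. The function name `normalise` and the regular expression are illustrative assumptions; this is not how CCFinder or any other tool named on the slide is implemented.

```python
import re

# Assumption: every word that starts with a letter or underscore (keywords
# included) becomes the placeholder "$p", as in the normalised lines on the slide.
WORD = re.compile(r"[A-Za-z_]\w*")


def normalise(fragment: str) -> str:
    """Replace every word token in the fragment with the placeholder $p."""
    return WORD.sub("$p", fragment)


a = normalise("if (x==0) then y=y+1;")
b = normalise("if (check==0) then count=count+1;")
print(a)       # $p ($p==0) $p $p=$p+1;
print(a == b)  # True
```

After normalisation the two fragments compare equal as plain strings, which is what lets a token-based detector report them as a clone pair.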

Page 4: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Oracle Problem in Code Clone

Without a way to establish a ground truth, we cannot know whether code is actually cloned.

Page 5: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Agreement

Page 6: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Parameter Tuning

Page 7: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

EvaClone

T. Wang, M. Harman, Y. Jia, and J. Krinke. Searching for Better Configurations: A Rigorous Approach to Clone Evaluation. In FSE'13.

6 clone detectors: PMD, iClones, ConQAT, Simian, NiCad, CCFinder

8 software projects: weltab, cook, snns, psql, javadoc, ant, jdtcore, swing

15 years

Page 8: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Maximising Agreement

[Figure: clone detectors C, D, N, and S, with the label "Maximise"]

Page 9: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

EvaClone (cont.)

EvaClone favours recall over precision, so more clone candidates are reported.

Page 10: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Replication Study

Page 11: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Fitness Function

$$\text{Fitness} = \frac{1\,x_1 + 2\,x_2 + 3\,x_3 + 4\,x_4}{4 \times (\text{All clone lines})}$$
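As a rough illustration of an agreement fitness of this shape, the Python sketch below weights each cloned line by the number of tools that report it, then divides by the number of tools times the number of distinct cloned lines. This reading of the slide's formula, the function name `agreement_fitness`, and the `(file, line)` representation are assumptions made for illustration; it is not EvaClone's actual code.

```python
from collections import Counter


def agreement_fitness(reports):
    """Agreement score in [0, 1].

    `reports` maps a tool name to the set of (file, line) pairs that the
    tool marks as cloned (a hypothetical representation).
    """
    n_tools = len(reports)
    # votes[line] = number of tools that report this line as cloned
    votes = Counter(line for lines in reports.values() for line in lines)
    if n_tools == 0 or not votes:
        return 0.0
    # Summing the votes equals 1*x1 + 2*x2 + ... when x_i is the number of
    # lines reported by exactly i tools.
    return sum(votes.values()) / (n_tools * len(votes))


reports = {
    "C": {("A.java", 1), ("A.java", 2)},
    "D": {("A.java", 1)},
    "N": {("A.java", 1), ("A.java", 3)},
    "S": {("A.java", 1)},
}
print(agreement_fitness(reports))  # 0.5
```

In the toy input one line is reported by all four tools and two lines by a single tool each, so the score is (4 + 1 + 1) / (4 × 3) = 0.5. A score of 1.0 would mean every tool reports exactly the same lines.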

Page 12: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Replication Study (cont.)

Deckard, CCFinder, Simian, NiCad: 25 parameters

Population size: 100
No. of generations: 100

Crossover: 0.8
Mutation: 0.1
Elitism: 0.25

2 × 10^12
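To show where settings like these plug in, here is a generic generational genetic algorithm in Python using the numbers from the slide as defaults. It is only a sketch under stated assumptions: the names `GASettings` and `evolve` are hypothetical, elitism is read as the fraction of the best individuals copied unchanged, the mutation rate is applied per gene, and selection is a binary tournament. None of this is confirmed as the search implementation used in the study.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GASettings:
    population_size: int = 100
    generations: int = 100
    crossover_rate: float = 0.8
    mutation_rate: float = 0.1   # assumed: applied per gene
    elitism: float = 0.25        # assumed: fraction of best individuals kept


def evolve(fitness, domains, settings=GASettings(), seed=0):
    """Maximise `fitness` over configurations; domains[i] lists gene i's values."""
    rng = random.Random(seed)
    pop = [[rng.choice(d) for d in domains] for _ in range(settings.population_size)]
    n_elite = int(settings.elitism * settings.population_size)

    def pick():
        # Binary tournament: the fitter of two random individuals wins.
        return max(rng.sample(pop, 2), key=fitness)

    for _ in range(settings.generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = [ind[:] for ind in pop[:n_elite]]          # elitism
        while len(next_pop) < settings.population_size:
            a, b = pick(), pick()
            child = a[:]
            if rng.random() < settings.crossover_rate:        # one-point crossover
                cut = rng.randrange(1, len(domains))
                child = a[:cut] + b[cut:]
            child = [rng.choice(d) if rng.random() < settings.mutation_rate else g
                     for g, d in zip(child, domains)]         # per-gene mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


# Toy usage: five integer "parameters" in 0..9, fitness is simply their sum.
best = evolve(lambda ind: sum(ind), [range(10)] * 5)
print(best, sum(best))
```

With a population of 100 over 100 generations, a run like this evaluates on the order of 10^4 configurations, which is why the search is guided by fitness rather than exhaustive.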

Page 13: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

| Ver. | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 1.10 | 2.0.0 | 2.0.44 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SLOC (k) | 5.5 | 6.7 | 6.78 | 6.82 | 7.2 | 7.6 | 8.4 | 8.9 | 10.1 | 12.4 | 17.9 | 22.8 | 23.6 | 25.3 |
| %Inc | N/A | 21% | 2% | 1% | 6% | 5% | 11% | 7% | 13% | 23% | 44% | 28% | 3% | 8% |

Note: two complete libraries (cglib and asm) are embedded in releases 1.5 to 1.9; they were removed before the analysis.
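The %Inc row reads as the relative growth in SLOC from one release to the next. A small Python sketch of that calculation follows; the function name `pct_increase` is an assumption, and because the SLOC figures on the slide are rounded, recomputed percentages can differ from the slide's by about a point.

```python
def pct_increase(prev_sloc: float, curr_sloc: float) -> float:
    """Relative size increase from the previous release, in percent."""
    return (curr_sloc - prev_sloc) / prev_sloc * 100.0


# Release 1.8 (12.4k SLOC) to release 1.9 (17.9k SLOC)
print(round(pct_increase(12.4, 17.9)))  # 44
```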

Page 14: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

RQ1: Optimised Agreement

How do the default parameters perform in terms of clone agreement on each Mockito release compared to the optimised ones?

[Figure: fitness value (y-axis, 0.30 to 0.50) per Mockito release (x-axis, 0.9 to 2.0.44) for three series: Default, EvaClone Highest, EvaClone Lowest]

Comparison of the optimised tool agreement (the highest and the lowest in 20 runs) to the default agreement over 14 Mockito releases

Page 15: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

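RQ2: Stability of Optimised Parameters

Are there noticeable differences in the values of optimised parameters over releases?

DF is the default value; the remaining columns give the optimised value for each release.

| Tool | Parameter | DF | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 1.10 | 2.0.0 | 2.0.44 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCFinder | MinToken | 50 | 10 | 70 | 70 | 70 | 80 | 80 | 80 | 80 | 10 | 10 | 10 | 10 | 10 | 10 |
| CCFinder | TKS | 12 | 10 | 16 | 18 | 19 | 18 | 18 | 19 | 20 | 14 | 17 | 10 | 10 | 10 | 10 |
| Deckard | MinToken | 30 | 30 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 |
| Deckard | Stride | 5 | inf | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 16 | 5 | inf | inf | inf | inf |
| Deckard | Similarity | 0.9 | 0.9 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.95 | 1.0 | 0.9 | 0.9 | 0.9 | 0.9 |
| NiCad | MinLine | 6 | 5 | 7 | 7 | 7 | 6 | 6 | 6 | 7 | 6 | 5 | 5 | 5 | 5 | 5 |
| NiCad | MaxLine | 1K | 200 | 100 | 100 | 400 | 400 | 200 | 200 | 200 | 200 | 100 | 100 | 100 | 200 | 200 |
| NiCad | UPI | 0.3 | 0.3 | 0.0 | 0.1 | 0.0 | 0.0 | 0.1 | 0.1 | 0.0 | 0.3 | 0.1 | 0.3 | 0.3 | 0.3 | 0.3 |
| NiCad | Blind | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| NiCad | Abstract | 0 | 4 | 6 | 6 | 6 | 6 | 5 | 5 | 6 | 6 | 2 | 4 | 4 | 4 | 4 |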
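One simple way to put a number on "stability" is to count how often an optimised value changes from one release to the next. The Python sketch below does that for two parameters, with values copied from the table above in release order (0.9 to 2.0.44). The helper name `count_changes` and this particular measure are assumptions for illustration, not the analysis used in the study.

```python
def count_changes(values):
    """Number of releases whose value differs from the previous release's value."""
    return sum(1 for prev, curr in zip(values, values[1:]) if prev != curr)


# Optimised values per release, 0.9 through 2.0.44.
ccfinder_min_token = [10, 70, 70, 70, 80, 80, 80, 80, 10, 10, 10, 10, 10, 10]
deckard_stride = ["inf", 8, 8, 8, 8, 8, 8, 8, 16, 5, "inf", "inf", "inf", "inf"]

print(count_changes(ccfinder_min_token))  # 3
print(count_changes(deckard_stride))      # 4
print(sorted(set(ccfinder_min_token)))    # [10, 70, 80]
```

A perfectly stable parameter would score 0; neither of these does, and none of the three distinct MinToken values equals CCFinder's default of 50.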

Page 16: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

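RQ2: Stability of Optimised Parameters

| Tool | Parameter | DF | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 1.10 | 2.0.0 | 2.0.44 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Simian | ignoreCurlyBraces | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Simian | ignoreIdentifiers | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Simian | ignoreIdentifierCase | 0 | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreStrings | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreStringCase | 1 | ✱ | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ✱ | 0 | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreNumbers | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreCharacters | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreCharacterCase | 1 | 0 | 0 | ✱ | 1 | 1 | 0 | ✱ | 1 | 1 | ✱ | ✱ | ✱ | ✱ | ✱ |
| Simian | ignoreLiterals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Simian | ignoreSubtypeNames | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Simian | ignoreModifiers | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Simian | ignoreVariableNames | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
| Simian | balanceParentheses | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Simian | balanceSquareBrackets | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| Simian | MinLine | 6 | 5 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 7 | 7 | 5 | 5 | 5 | 5 |

Are there noticeable differences in the values of optimised parameters over releases?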

Page 17: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

RQ3: Clones over Releases

How many clones in Mockito are reported with the highest agreement over releases?

[Figure: Default vs. EvaClone]

Page 18: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Maximising Agreement

[Figure: clone detectors C, D, N, and S, with the label "Maximise"]

Page 19: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Open Challenge

"A better fitness function for EvaClone is needed"

It must not rely only on the number of cloned lines, but also include other aspects:

How often is a line found to be cloned to other places?
Precision vs. recall?
Location of clones

Page 20: Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]

Opt. params vs Def. params
[Figure repeated from page 14: fitness value per Mockito release (0.9 to 2.0.44) for Default, EvaClone Highest, and EvaClone Lowest]

Opt. params are not stable over releases
[Table repeated from page 15: default (DF) and optimised parameters of CCFinder, Deckard, and NiCad for each release]

Fitness func. needs improvements
[Figure repeated from page 17: Default vs. EvaClone]