Mining the Modern Code Review Repositories: A Dataset of People, Process and Product Xin Yang Raula G. Kula Norihiro Yoshida Hajimu Iida May 14–15, 2016. Austin, Texas MSR 2016 data showcase Osaka University Japan Nagoya University Japan NAIST Japan NAIST Japan

MSR 2016 data showcase - Mining Code Review Repositories

Apr 12, 2017

Xin Yang
Transcript
Page 1: MSR 2016 data showcase - Mining Code Review Repositories

Mining the Modern Code Review Repositories: A Dataset of People, Process and Product

Xin Yang Raula G. Kula Norihiro Yoshida Hajimu Iida

May 14–15, 2016. Austin, Texas

MSR 2016 data showcase

Osaka University Japan

Nagoya University Japan

NAIST Japan

NAIST Japan

Page 2: MSR 2016 data showcase - Mining Code Review Repositories

A Code Review Dataset

Code Review

Source Code

Human / Social (anonymized usernames and email addresses)

Page 3: MSR 2016 data showcase - Mining Code Review Repositories

Why did we make this dataset?

*Hamasaki et al., “Who does what during a code review? datasets of OSS peer review repositories”. MSR '13

Our previous work (Hamasaki et al. MSR '13)*

Page 4: MSR 2016 data showcase - Mining Code Review Repositories

Our previous work (Hamasaki et al. MSR '13)*

Why did we make this dataset?

Some feedback: “Hard to query...” “Hard to convert...” “Unable to access the source code...”

*Hamasaki et al., “Who does what during a code review? datasets of OSS peer review repositories”. MSR '13

Page 6: MSR 2016 data showcase - Mining Code Review Repositories

*Hamasaki et al., “Who does what during a code review? datasets of OSS peer review repositories”. MSR '13

Our previous work (Hamasaki et al. MSR '13)*

Some feedback: “Hard to query...” “Hard to convert...” “Unable to access the source code...”

Why did we make this dataset?

★ Easy to query / analyze

★ Easy to export / convert

★ Able to access the source code

Page 7: MSR 2016 data showcase - Mining Code Review Repositories

Modern Code Review (MCR) Key Attributes

Large Codebases

(Repositories)

High Volume Submissions

(Patches)

Large Communities

(Participants)

Page 9: MSR 2016 data showcase - Mining Code Review Repositories

Process

Product

People

The Concept

Page 10: MSR 2016 data showcase - Mining Code Review Repositories

Dataset Statistics (updated to May 2015)

4 years    3 years    7 years    4 years    3 years
611        20         567        111        189
173,749    13,597     63,610     110,172    9,168
5,091      437        3,334      1,437      759

Page 11: MSR 2016 data showcase - Mining Code Review Repositories

Dataset Schema (Check our wiki for details)

Page 12: MSR 2016 data showcase - Mining Code Review Repositories

★ Promote peer review research and link to other research topics

★ Encourage researchers to use this as a benchmark of techniques and different approaches

Our Goals

Page 13: MSR 2016 data showcase - Mining Code Review Repositories

goo.gl/Wi4UoJ

Get Your Copy Now!!!

Page 14: MSR 2016 data showcase - Mining Code Review Repositories

Thanks! Any questions?

Contact: Xin Yang

[email protected]

@seeleather