Improving the Integration Process of Large Software Systems. Yujuan Jiang, Bram Adams. MCIS, Polytechnique Montreal, Canada. Friday, 18 April, 14

Improving the Integration Process of Large Software Systems

Aug 19, 2014



Yujuan Jiang

This is my research proposal for the poster session of the Releng workshop 2014.
Transcript
Page 1: Improving the Integration Process of Large Software Systems

Improving the Integration Process of Large Software Systems

Yujuan Jiang, Bram Adams
MCIS, Polytechnique Montreal, Canada


Friday, 18 April, 14

Page 2: Improving the Integration Process of Large Software Systems

Integration & its Challenges

external library

host project

platform: Java 7

dependency: Java 8


Page 3: Improving the Integration Process of Large Software Systems

I do hold out hope that Google does come around and works to fix their codebase to get it merged upstream to stop the huge blockage that they have now caused in a large number of embedded Linux hardware companies […] But I need the help of the Google developers to make it happen, without them, nothing can change.

http://www.kroah.com/log/linux/android-kernel-problems.html

Greg Kroah-Hartman


Page 4: Improving the Integration Process of Large Software Systems

Our Approach

Understand how integration works.

Analyze why integration fails.

Propose solutions to integration issues.


Page 5: Improving the Integration Process of Large Software Systems


Case study: Linux Kernel

[Diagram labels: contributors; mailing lists linux-usb, linux-scsi and lkml; subsystem maintainers; maintainer Linus Torvalds; linux 3.5. Phases shown: Reviewing, Integration, Staging.]


Page 6: Improving the Integration Process of Large Software Systems

Few Patches Get In! Long Integration Time!

[Chart: accepted/rejected patches per year, 2005 to 2012. Axis label: percentage of patches; axis ticks from 0 to 120000.]

year    % accepted by Linus    % rejected by Linus
2005    28.63                  71.37
2006    28.70                  71.30
2007    27.03                  72.97
2008    32.83                  67.17
2009    32.79                  67.21
2010    33.87                  66.13
2011    33.55                  66.45
2012    30.74                  69.26

33% of patches make it!

[Chart: percentage of accepted patches of each year, 2005 to 2012 (axis ticks from 0 to 80), broken down by integration time: instantly, within_hour, within_day, within_week, within_month, within_quarter, within_half_year, within_year, took_ages.]

Integration requires 1 to 6 months!
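As a rough illustration of how integration delays could be grouped into the chart's legend categories, here is a minimal Python sketch. The bucket labels come from the slide; the cut-off values and the function names `integration_bucket` and `bucket_shares` are assumptions made for this example, not the thresholds or tooling used in the study.

```python
from collections import Counter
from datetime import timedelta

# Labels are taken from the chart legend. The cut-offs are illustrative
# assumptions, not the thresholds used in the study.
BUCKETS = [
    (timedelta(minutes=1), "instantly"),
    (timedelta(hours=1), "within_hour"),
    (timedelta(days=1), "within_day"),
    (timedelta(weeks=1), "within_week"),
    (timedelta(days=30), "within_month"),
    (timedelta(days=91), "within_quarter"),
    (timedelta(days=182), "within_half_year"),
    (timedelta(days=365), "within_year"),
]


def integration_bucket(delay: timedelta) -> str:
    """Return the legend bucket for the delay between posting and merging."""
    for limit, label in BUCKETS:
        if delay <= limit:
            return label
    return "took_ages"


def bucket_shares(delays):
    """Return the percentage of patches that falls into each bucket."""
    counts = Counter(integration_bucket(d) for d in delays)
    total = len(delays)
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Under these assumed cut-offs, a patch merged 45 days after it was posted lands in `within_quarter`, which is inside the 1 to 6 month range highlighted on the slide.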


Page 7: Improving the Integration Process of Large Software Systems

reviewing history

Tracking Evolution Process of the Patch


Page 8: Improving the Integration Process of Large Software Systems

It takes 25% of patches more than 4 weeks to be reviewed!


Table 4: Average value of characteristics in different types of threads (including rejected patches).

group    metric name                  MM        MS        SS
Review   thr_volume                   3.838     6.051     3.533
Review   nr_reviews                   1.046     1.936     1.165
Review   review_time (day)            1.932     3.022     2.271
Review   response_time (day)          0.871     1.131     1.030
Review   first_response_time (day)    0.801     1.215     1.010
Patch    size                         81.660    146.100   25.430
Patch    spread                       2.398     3.811     1.016
Patch    spread_subsys                1.387     1.750     1.003
Other    acceptance                   46.05%    43.97%    23.26%
Other    bug-fix                      49.94%    36.72%    31.10%

Table 5: Time duration (#days) of the super-threads of type MM.

           time duration    # of patch versions
Min.       0                2.000
1st Qu.    2.687            2.000
Median     10.061           2.000
Mean       21.341           3.172
3rd Qu.    32.923           3.000
Max.       107.524          108.000

of MM). Table 5 shows that most of the super-threads consist of only a few patch versions (the mean is 3.172), but can last a long time (the mean is 21.341 days). More than 25% of the patches have a reviewing time that is 4.5 weeks longer than considered by researchers thus far [16].

Qb) What kind of patches undergo multiple patch versions?

Patches evolving across multiple versions are larger and affect more files than those with a single version. The patches from threads of type MM (especially) and MS have higher values for “size”, “spread” and “spread_subsys”, as shown in Table 4. This indicates that the patches undergoing multiple versions tend to be larger and more complex, and hence need more attention before being integrated. Surprisingly, such invasive patches seem to feature especially in single threads, rather than multiple ones. As we will see below, this means that they are still integrated relatively quickly, whereas the patches that need more versions tend to be slightly less invasive. A Kruskal-Wallis test with post-hoc tests verified the significant difference.

Qc) What kind of patches undergo multiple threads?

Kernel developers use multiple threads if too much time has passed since the previous patch version. We compared the time distribution of the interval between two successive threads to that of two successive patch versions within one thread. The result is shown in the boxplots of Figure 10. We can see that the time interval between threads is much longer than that between patch versions. This seems to confirm the intuition that people typically start a new thread when too much time has passed since the last review or version of a patch, whereas they would continue the same thread otherwise. A t-test with the null hypothesis “no difference between the average of both time distributions” obtained a p-value < 2.2e-16, which confirms that the differences are statistically significant.

Figure 10: Boxplot of average time interval (#days) between two successive patch versions/threads of super-threads.

[Boxplot categories: within thread; between 2 successive threads; between 2 successive patch versions. Axis: # of days.]
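The comparison of the two interval distributions can be sketched in Python as follows. The excerpt only says "a t-test"; this sketch assumes Welch's unequal-variance variant, and the function name `welch_t` and the sample values are hypothetical, not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and degrees of freedom for two samples."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical intervals in days, for illustration only.
between_versions = [0.5, 1.0, 2.0, 1.5, 3.0]
between_threads = [10.0, 25.0, 14.0, 40.0, 18.0]
t_stat, dof = welch_t(between_versions, between_threads)
```

A strongly negative `t_stat` would point the same way as the boxplots: intervals between patch versions are shorter than intervals between threads. Turning the statistic into a p-value needs the t distribution, which a statistics library would normally supply.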

Threads of type MM especially consist of bug-fixes. Out of all MM threads, 49.94% are bug-fixes, compared to 36.72% for MS and 31.10% for SS threads. Pairwise Mann-Whitney tests show that MM does not differ significantly from MS, but that MM and MS are significantly different from SS.

On the one hand, this finding seems surprising, since one would expect bug-fixes to be smaller and hence require less discussion. On the other hand, bugs might be risky to fix, and hence require care and thorough reviews.
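For readers unfamiliar with the Mann-Whitney test, the U statistic it is built on can be computed by direct pair counting, as in this minimal Python sketch. The function name `mann_whitney_u` and the sample values are assumptions for illustration; the study's own tooling is not described in this excerpt.

```python
def mann_whitney_u(a, b):
    """Return the U statistic of sample a versus sample b.

    Each pair (x, y) with x from a and y from b scores 1 when x > y,
    one half when x == y, and 0 otherwise.
    """
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Hypothetical per-thread values for two groups, for illustration only.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 5, 4]
u_ab = mann_whitney_u(group_a, group_b)
u_ba = mann_whitney_u(group_b, group_a)
```

The two statistics always add up to `len(group_a) * len(group_b)`. A value of U near either extreme suggests that one group tends to have larger values, while a value near the midpoint suggests no systematic difference.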

Qd) Do reviewers lose interest in multi-version patches?

Patches of MM and MS threads involve more discussion than SS threads. We compared the number of reviews discussing a patch for the three different types of threads. The value of “thr_volume” for MM and MS patches is higher than for SS, which means that if a patch needs to undergo one or more additional versions, reviewers seem to discuss it more and provide more constructive comments to help improve it until it is accepted. The amount of reviewing hence does not suffer from having multiple versions of a patch. A Kruskal-Wallis test with post-hoc tests showed that the three groups are different from each other.

Patches of type MM receive fewer reviews. SS and MS patches receive the most reviews (“nr_reviews”), and hence take more review time as well. Hence, it seems like patches evolving across multiple versions attract fewer reviews. A Kruskal-Wallis test with post-hoc tests showed that the difference is significant. One possible explanation could be that MM patches received the majority of their reviews early on, with later patch versions receiving more focused reviewing.

Reviewers are more eager to review MM threads. Although we did not find a statistically significant difference between the values of the metrics “response_time” and “first_response_time” for SS and MS, we found that MM threads take significantly less time before the first review. Reviewers seem to consider such patches as having a higher priority than other types of super-threads.

Qe) Do multi-version patches have a lower chance of accep-


Page 9: Improving the Integration Process of Large Software Systems


Figure out the Integration Black-box!

[Diagram labels: contributors; mailing lists linux-usb, linux-scsi and lkml; subsystem maintainers; maintainer Linus Torvalds; linux 3.5. Phases shown: Reviewing, Integration, Staging.]


Page 10: Improving the Integration Process of Large Software Systems

Propose Solution to Improve Integration

Does it have external dependencies?

How much effort do I need to invest?

Is this integration worth the effort?

What will this integration change?

Will it cause further risk?

... (the integrator's questions)


Page 11: Improving the Integration Process of Large Software Systems

ISOMO: Integration of Software cOst MOdel


Page 12: Improving the Integration Process of Large Software Systems

ISOMO: Quantify the Cost of Integration

Merge Cost

Update Cost

Maintenance Cost

Removal Cost

ISOMO
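One way to picture the four cost components is as fields of a single record, as in this minimal Python sketch. The slide names the components but gives no formula, so the weighted sum, the class name `IntegrationCost` and the numeric values are all assumptions for illustration and are not the actual ISOMO model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationCost:
    """The four cost components named on the slide, in arbitrary units."""
    merge: float
    update: float
    maintenance: float
    removal: float

    def total(self, weights=(1.0, 1.0, 1.0, 1.0)) -> float:
        # Assumption: costs combine as a weighted sum. The slide does not
        # say how ISOMO aggregates them.
        parts = (self.merge, self.update, self.maintenance, self.removal)
        return sum(w * p for w, p in zip(weights, parts))


# Hypothetical estimate for integrating one external library.
library_cost = IntegrationCost(merge=3.0, update=2.0, maintenance=5.0, removal=1.0)
overall = library_cost.total()
```

Keeping the components separate lets an integrator compare candidates on each dimension before collapsing them into one number.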
