W5-GettingWhatYouMeasure-BouwersEtAl2012.pdf

COMMUNICATIONS OF THE ACM | JULY 2012 | VOL. 55 | NO. 7 | practice

Getting What You Measure
DOI: 10.1145/2209249.2209266
Article development led by queue.acm.org

Four common pitfalls in using software metrics for project management.

BY ERIC BOUWERS, JOOST VISSER, AND ARIE VAN DEURSEN
Illustration by Gary Neill

ARE SOFTWARE METRICS helpful tools or a waste of time? For every developer who treasures these mathematical abstractions of software systems, there is a developer who thinks software metrics are invented just to keep project managers busy. Software metrics can be very powerful tools that help achieve your goals, but it is important to use them correctly, as they also have the power to demotivate project teams and steer development in the wrong direction.

For the past 11 years, the Software Improvement Group has advised hundreds of organizations concerning software development and risk management on the basis of software metrics. We have used software metrics in more than 200 investigations in which we examined a single snapshot of a system. Additionally, we use software metrics to track the ongoing development effort of more than 400 systems. While executing these projects, we have learned some pitfalls to avoid when using software metrics in a project management setting. This article addresses the four most important of these: Metric in a bubble; Treating the metric; One-track metric; and Metrics galore.

Knowing about these pitfalls will help you recognize them and, hopefully, avoid them, which ultimately leads to making your project successful. As a software engineer, your knowledge of these pitfalls helps you understand why project managers want to use software metrics and helps you assist the managers when they are applying metrics in an inefficient manner. As an outside consultant, you need to take the pitfalls into account when presenting advice and proposing actions. Finally, if you are doing research in the area of software metrics, knowing these pitfalls will help place your new metric in the right context when presenting it to practitioners. Before diving into the pitfalls, let's look at why software metrics can be considered a useful tool.

Software Metrics Steer People
"You get what you measure." This phrase definitely applies to software project teams. No matter what you define as a metric, as soon as it is used to evaluate a team, the value of the metric moves toward the desired value. Thus, to reach a particular goal, you can continuously measure properties of the desired goal and plot these measurements in a place visible to the team. Ideally, the desired goal is plotted alongside the current measurement to indicate the distance to the goal.

Imagine a project in which the runtime performance of a particular use case is of critical importance. In this case it helps to create a test in which the execution time of the use case is measured daily. By plotting this daily data point against the desired value, and making sure the team sees this measurement, it becomes clear to everyone whether the desired target is being met or whether the development actions of yesterday are leading the team away from the goal.

Even though it might seem simple, this technique can be applied incorrectly in a number of subtle ways. For example, imagine a situation in which customers are unhappy because they report problems in a product that are not solved in a timely manner. To improve customer satisfaction, the project team tracks the average resolution time for issues in a release, following the reasoning that a lower average resolution time results in higher customer satisfaction.

Unfortunately, reality is not so simple. To start, solving issues faster might lead to unwanted side effects; for example, a quick fix now could result in longer fix times later because of incurred technical debt.
Second, solving an issue within days does not help the customer if these fixes are released only once a year. Finally, customers are undoubtedly more satisfied when no fix is required at all; that is, issues do not end up in the product in the first place.

Thus, using a metric allows you to steer toward a goal, which can be either a high-level business proposition ("the costs of maintaining this system should not exceed $100,000 per year") or more technically oriented ("all pages should load within 10 seconds"). Unfortunately, using metrics can also prevent you from reaching the desired goal, depending on the pitfalls encountered. In the remainder of this article, we discuss some of the pitfalls we frequently encountered and explain how they can be recognized and avoided.

What Does the Metric Mean?
Software metrics can be measured on different views of a software system. This article focuses on metrics calculated on a particular version of the code base of a system, but the pitfalls also apply to metrics calculated on other views.

Assuming the code base contains only the code of the current project, software product metrics establish a ground truth. Calculating only the metrics is not enough, however. Two more actions are needed to interpret the value of the metric: adding context; and establishing the relationship with the goal.

To illustrate these points, we use the LOC (lines of code) metric to provide details about the current size of a project. Even though there are multiple definitions of what constitutes a line of code, such a metric can be used to reason about whether the examined code base is complete or contains extraneous code such as copied-in libraries. To do this, however, the metric should be placed in context, bringing us to our first pitfall.

[Figure 1. The lines of code of a software system from January 2010 to July 2011. Vertical axis: lines of code; horizontal axis: time.]
[Figure 2. Measuring lines of code in two different ways. Vertical axis: lines of code; horizontal axis: time, January 2010 to July 2011.]
[Figure 3. Measuring number of files used. Vertical axis: number of files; horizontal axis: time, January 2010 to July 2011.]

Metric in a bubble. Using a metric without proper interpretation. Recognized by not being able to explain what a given value of a metric means. Can be solved by placing the metric inside a context with respect to a goal.

The usefulness of a single data point of a metric is limited. Knowing that a system is 100,000 LOC is meaningless by itself, since the number alone does not explain if the system is large or small. To be useful, the value of the metric should, for example, be compared against data points taken from the history of the project or from a benchmark of other projects. In the first scenario, you can discover trends that should be explained by external events. For example, the graph in Figure 1 shows the LOC of a software system from January 2010 to July 2011.

The first question that comes to mind here is: Why did the size of the system drop so much in July 2010? If the answer to this question is, "We removed a lot of open source code we copied in earlier," then there is no problem (other than the inclusion of this code in the first place). If the answer is, "We accidentally deleted part of our code base," then it might be wise to introduce a different process of source-code version management. In this case the answer is that an action was scheduled to drastically reduce the amount of configuration needed; given the amount of code that was removed, this action was apparently successful.
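Comparing a metric against the history of the project, as described above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a tool used by the authors: the function name `flag_large_changes`, the 20% threshold, and the monthly LOC values are all hypothetical, and the numbers are not read from Figure 1.

```python
from typing import List, Tuple


def flag_large_changes(
    history: List[Tuple[str, int]], threshold: float = 0.20
) -> List[Tuple[str, float]]:
    """Return (label, relative_change) for each snapshot whose value
    differs from the previous snapshot by more than `threshold`.

    The 20% default is an assumption chosen for illustration.
    """
    flagged = []
    for (_, prev), (label, curr) in zip(history, history[1:]):
        if prev == 0:
            continue  # relative change is undefined, so skip this pair
        change = (curr - prev) / prev
        if abs(change) > threshold:
            flagged.append((label, change))
    return flagged


# Hypothetical monthly snapshots (illustrative numbers only).
loc_history = [
    ("2010-05", 340_000),
    ("2010-06", 350_000),
    ("2010-07", 210_000),
    ("2010-08", 215_000),
]

for label, change in flag_large_changes(loc_history):
    print(f"{label}: {change:+.0%} compared with the previous snapshot")
```

A flagged snapshot is only a prompt to ask what happened at that point in time; the sketch says nothing about whether the change was desirable.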
Note that one of the benefits of placing metrics in context is that it allows you to focus on the important part of the graph. Questions regarding what happened at a certain point in time or why the value significantly deviates from other systems become more important than the specific details about how the metric is measured. Often people, either on purpose or by accident, try to steer a discussion toward "How is this metric measured?" instead of "What do these data points tell me?" In most cases the exact construction of a metric is not important for the conclusion drawn from the data. For example, consider the three plots shown in figures 2 and 3 representing different ways of computing the volume of a system. Figure 2 shows the lines of code counted as every line containing at least one character that is not a comment or white space (blue) and lines of code counted as all new line characters (orange). Figure 3 shows the number of files used.

The trend lines indicate that, even though the scale differs, these volume metrics all show the same events. This means that each of these metrics is a good candidate to compare the volume of a system against other systems. As long as the volume of the other systems is measured in the same manner, the conclusions drawn from the data will be very similar.

The different trend lines bring up a second question: Why does the volume decrease after a period in which the volume increased? The answer can be found in the normal way in which alterations are made to this particular system. When the volume of the system increases, an action is scheduled to determine whether new abstractions are possible, which is usually the case. This type of refactoring can significantly decrease the size of the code base, which results in lower maintenance effort and easier ways to add functionality to the system. Thus, the goal here is to reduce maintenance effort by (among others) keeping the size of the code base relatively small.

In the ideal situation a direct relationship exists between a desired goal (such as reduced maintenance effort) and a metric (such as a small code base).
In some cases this relationship is based on informal reasoning (for example, when the code base of a system is small it is easier to analyze what the system does); in other cases scientific research has shown that the relationship exists. What is important here is that you determine both the nature of the relationship between the metric and the goal (direct/indirect) and the strength of this relationship (informal reasoning/empirically validated).

Thus, a metric in isolation will not help you reach your goal. On the other hand, assigning too much meaning to a metric leads to a different pitfall.

Treating the metric. Making alterations just to improve the value of a metric. Recognized when changes made to the software are purely cosmetic. Can be solved by determining the root cause of the value of a metric.

The most common pitfall is making changes to a system just to improve the value of a metric, instead of trying to reach a particular goal. At this point, the value of the metric has become a goal in itself, instead of a means of reaching a larger goal. This situation leads to refactorings that simply "please the metric," which is a waste of precious resources. You know this has happened when, for example, one developer explains to another developer that a refactoring needs to be done because "the duplication percentage is too high," instead of explaining that multiple copies of a piece of code can cause problems for maintaining the code later on. It is never a problem that the value of a metric is too high or too low: the fact this value is not in line with your goal should be the reason to perform a refactoring.

Consider a project in which the number of parameters for methods is high compared with a benchmark.
When a method has a relatively large number of parameters (for example, more than seven) it can indicate that this method is implementing different functionalities. Splitting the method into smaller methods would make it easier to understand each function separately.

A second problem that could be surfacing through this metric is the lack of a grouping of related data objects. For example, consider a method that takes as parameters a Date object called startDate and another called endDate. The names suggest that these two parameters together form a DatePeriod object in which startDate will need to be before endDate. When multiple methods take these two parameters as input, introducing such a DatePeriod object to make this explicit in the model could be beneficial, reducing both future maintenance effort and the number of parameters being passed to methods.

Sometimes, however, parameters are, for example, moved to the fields of the surrounding class or replaced by a map in which a (String, Object) pair represents the different parameters. Although both strategies reduce the number of parameters inside methods, it is clear that if the goal is to improve readability and reduce future maintenance effort, then these solutions are not helping. It could be that this type of refactoring is done because the developers simply do not understand the goal and thus are treating the symptoms. There are also situations, however, in which these non-goal-oriented refactorings are done to game the system. In both situations it is important to make the developers aware of the underlying goals to ensure that effort is spent wisely.

Thus a metric should never be used as-is, but it should be placed inside a context that enables a meaningful comparison. Additionally, the relationship between the metric and desired property of your goal should be clear; this enables you to use the metric to schedule specific actions that will help reach your goal. Make sure the scheduled actions are targeted toward reaching the underlying goal instead of only improving the value of the metric.

How Many Metrics Do You Need?
Each metric provides a specific viewpoint of your system. Therefore, combining multiple metrics leads to a balanced overview of the current state of your system. The number of metrics to be used leads to two pitfalls; we start with using only a single metric.

One-track metric. Focusing on only a single metric. Recognized by seeing only one (or just a few) metrics on display. Can be solved by adding metrics relevant to the goal.

Using only a single software metric to measure whether you are on track toward your goal reduces that goal to a single dimension (that is, the metric that is currently being measured). A goal is never one-dimensional, however. Software projects experience constant trade-offs between delivering desired functionality and nonfunctional requirements such as security, performance, scalability, and maintainability. Therefore, multiple metrics are necessary to ensure that your goal, including specified trade-offs, is reached. For example, a small code base might be easier to analyze, but if this code base is made of highly complex code, then it can still be difficult to make changes.

In addition to providing a more balanced view of your goal, using multiple metrics also assists you in finding the root cause of a problem. A single metric usually shows only a single symptom, while a combination of metrics can help diagnose the actual disease within a project.

For example, in one project the equals and hashCode methods (those used to implement equality for objects in Java) were among the longest and most complex methods within the system. Additionally, a relatively large percentage of duplication occurred in these methods.
Since they use all the fields of a class, the metrics indicate that multiple classes have a relatively large number of fields that are also duplicated. Based on this observation, we reasoned the duplicated fields form an object that was missing from the model. In this case we advised looking into the model of the system to determine whether extending the model with a new object would be beneficial.

In this example, examining the metrics in isolation would not have led to this conclusion, but by combining several unit-level metrics, we were able to detect a design flaw.

Metrics galore. Focusing on too many metrics. Recognized when the team ignores all metrics. Can be solved by reducing the number of metrics used.

Although using a single metric oversimplifies the goal, using too many metrics makes it difficult (or even impossible) to reach your goal. Apart from making it hard to find the right balance among a large set of metrics, it is not motivating for a team to see that every change they make results in the decline of at least one metric. Additionally, when the value of a metric is far off the desired goal, then a team can start to think, "We will never get there, anyway," and simply ignore the metrics altogether.

For example, there have been multiple projects that deployed a static-analysis tool without critically examining the default configuration. When the tool in question contains, for example, a check that flags the use of a tab character instead of spaces, the first run of the tool can report an enormous number of violations for each check (running into the hundreds of thousands). Without proper interpretation of this number, it is easy to conclude that reaching zero violations cannot be done within any reasonable amount of time (even though some problems can easily be solved by a simple formatting action). Such an incorrect assessment sometimes results in the tool being considered useless by the team, which then decides to ignore the tool.

Fortunately, in other cases the team adapts the configuration to suit the specific situation by limiting the number of checks (for example, by removing checks that measure highly related properties, can be solved automatically, or are not related to the current goals) and instantiating proper default values. By using such a specific configuration, the tool reports a lower number of violations that can be fixed in a reasonable amount of time.

To ensure all violations are fixed eventually, the configuration can be extended to include other types of checks or more strict versions of checks. This will increase the total number of violations found, but when done correctly the number of reported violations does not demotivate the developers too much. This process can be repeated to extend the set of checks slowly toward all desired checks without overwhelming the developers with a large number of violations at once.

Conclusion
Software metrics are useful tools for project managers and developers alike. To benefit from the full potential of metrics, keep the following recommendations in mind:

Attach meaning to each metric by placing it in context and defining the relationship between the metric and your goal, while at the same time avoid making the metric a goal in itself.

Use multiple metrics to track different dimensions of your goal, but avoid demotivating a team by using too many metrics.

If you are already using metrics in your daily work, try to link them to specific goals. If you are not using any metrics at this time but would like to see their effects, we suggest you start small: define a small goal (methods should be simple to understand for new personnel); define a small set of metrics (for example, length and complexity of methods); define a target measurement (at least 90% of the code should be simple); and install a tool that can measure the metric. Communicate both the goal and the trend of the metric to your colleagues and experience the influence of metrics.

Related articles on queue.acm.org

Making a Case for Efficient Supercomputing
Wu-chun Feng
http://queue.acm.org/detail.cfm?id=957772

Power-Efficient Software
Eric Saxe
http://queue.acm.org/detail.cfm?id=1698225

Sifting Through the Software Sandbox: SCM Meets QA
William W. White
http://queue.acm.org/detail.cfm?id=1046945

Eric Bouwers is a software engineer and technical consultant at the Software Improvement Group in Amsterdam, The Netherlands. He is a part-time Ph.D. student at Delft University of Technology. He is interested in how software metrics can assist in quantifying the architectural aspects of software quality.

Joost Visser is head of research at the Software Improvement Group in Amsterdam, The Netherlands, where he is responsible for innovation of tools and services, academic relations, and general research. He also holds a part-time position as professor of large-scale software systems at the Radboud University Nijmegen, The Netherlands.

Arie van Deursen is a full professor in software engineering at Delft University of Technology, The Netherlands, where he leads the Software Engineering Research Group. His research topics include software testing, software architecture, and collaborative software development.

© 2012 ACM 0001-0782/12/07 $15.00
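The "start small" suggestion in this article (simple methods as the goal, method length and complexity as the metrics, and at least 90% simple code as the target) could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the names `MethodMetrics` and `share_of_simple_code`, the thresholds of 15 lines and complexity 5 that define "simple", and the sample measurements. The article does not prescribe any of them.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MethodMetrics:
    name: str
    length: int      # lines of code in the method
    complexity: int  # for example, cyclomatic complexity


def share_of_simple_code(
    methods: Iterable[MethodMetrics],
    max_length: int = 15,     # assumed threshold, not from the article
    max_complexity: int = 5,  # assumed threshold, not from the article
) -> float:
    """Fraction of method lines that live in 'simple' methods."""
    methods = list(methods)
    total = sum(m.length for m in methods)
    if total == 0:
        return 1.0  # nothing measured, so nothing counts as complex
    simple = sum(
        m.length
        for m in methods
        if m.length <= max_length and m.complexity <= max_complexity
    )
    return simple / total


TARGET = 0.90  # "at least 90% of the code should be simple"

# Hypothetical measurements for illustration only.
measured = [
    MethodMetrics("parse", 12, 3),
    MethodMetrics("render", 8, 2),
    MethodMetrics("equals", 40, 11),
]

share = share_of_simple_code(measured)
status = "on target" if share >= TARGET else "below target"
print(f"simple code: {share:.0%} (target {TARGET:.0%}, {status})")
```

Weighting by lines rather than counting methods is a design choice made here so that one very long method cannot hide behind many short ones; a team could equally decide to count methods instead, as long as the trend is measured the same way each time.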
