
Guidelines to False Positive Testing


Copyright © 2016 Anti-Malware Testing Standards Organization, Inc. All rights reserved. No part of this document may be reproduced in any form, in an electronic retrieval system or otherwise, without the prior written consent of the publisher.


Notice and Disclaimer of Liability Concerning the Use of AMTSO Documents

This document is published with the understanding that AMTSO members are supplying this information for general educational purposes only. No professional engineering or any other professional services or advice is being offered hereby. Therefore, you must use your own skill and judgment when reviewing this document and not solely rely on the information provided herein.

AMTSO believes that the information in this document is accurate as of the date of publication although it has not verified its accuracy or determined if there are any errors. Further, such information is subject to change without notice and AMTSO is under no obligation to provide any updates or corrections.

You understand and agree that this document is provided to you exclusively on an as-is basis without any representations or warranties of any kind whether express, implied or statutory. Without limiting the foregoing, AMTSO expressly disclaims all warranties of merchantability, non-infringement, continuous operation, completeness, quality, accuracy and fitness for a particular purpose.

In no event shall AMTSO be liable for any damages or losses of any kind (including, without limitation, any lost profits, lost data or business interruption) arising directly or indirectly out of any use of this document including, without limitation, any direct, indirect, special, incidental, consequential, exemplary and punitive damages regardless of whether any person or entity was advised of the possibility of such damages.

This document is protected by AMTSO’s intellectual property rights and may be additionally protected by the intellectual property rights of others.


Preamble

It is a very challenging problem to measure a security product’s false positive rate and to further characterize the impact of this false positive rate on both consumers and enterprises, in relation to the product’s overall efficacy. The purpose of this document is to point out the most significant issues that we identified during our investigation over the past year to help testers better mitigate these issues during future evaluations. We welcome any suggested solutions to the problems described.

There are different types of False Positives. For the purposes of this document, a false positive is a detection (or notification/alert) on a file or resource which has no malicious payload. There is another relevant area of False Positives regarding dynamic objects such as URLs. These are not addressed in this document, but will be addressed in a future AMTSO document.

Introduction

As most security companies know, False Positives (FPs) can have a larger impact on customers than a product’s protection – and they are also remembered far longer. As more and more security products leverage proactive technologies such as behavior blocking, generic signatures and heuristics to address the expanded threat landscape, the likelihood of FPs has increased dramatically. In addition to harming the reputation of a product, false positives can disrupt operations within a business and cause financial distress to the affected software vendor. While significant FPs occur rarely, the consequences of such a significant false positive can far outweigh the consequences of a false negative.

It is striking that a number of tests do not consider FPs. Even those tests that do evaluate false positives take a simplistic approach. Most testers simply scan a large collection of non-malicious files (often including grey-ware) and then report the number of non-malicious files that each product detected. For example: on the same set of clean files Product A falsely detects 100 files, while Product B falsely detects only 50, ergo Product B has a lower False Positive rate. QED. Why is this simplistic? Is this not the very definition of False Positive and False Positive rate? The problem (and, as we will see, it is quite a complex problem) is that this presumes that all non-malicious files are equally important. But are they?

What Is a False Positive?

A false positive is a detection (or notification/alert) on a file or resource which has no malicious payload. Defining a malicious payload is not always clear cut. There are some gray areas such as Potentially Unwanted Applications, also known as Riskware. For example, a legitimate remote-access client (e.g., a VNC client) might be entirely legitimate if the user knowingly installs it. On the other hand, if a piece of malware surreptitiously installs that same VNC client to use it to obtain access to the victim’s computer, such a program would be unwanted. A detection of such a VNC client in the former case would definitely constitute a false positive, while detection in the latter case could be argued to be legitimate. Thus, the context of an application determines whether or not it is a false positive.

Additionally, some vendors opt to detect key generators or cracks that bypass software piracy checks. While these are not, strictly speaking, malicious, many corporate customers request that they be detected and removed.

How to Determine the Magnitude of a False Positive?

William Blackstone once said of the justice system, “Better that ten guilty persons escape than that one innocent suffer.” The modern equivalent might be, “it is better that 10 malicious files run than that one non-malicious file is detected.” But would the users agree? Given the growth in the threat landscape, users consistently demand better protection – and the only way security vendors know to deliver such improved protection is to deploy more proactive technologies (e.g., heuristics and behavior blocking), which are often subject to higher false positive rates than traditional signatures. While it is unlikely that a security vendor would ever knowingly detect a clean file, one of the costs of this increased protection is a higher chance of False Positives. There is a tradeoff to be weighed, and this is where the concept of Magnitude comes in.

There are a number of different criteria that need to be considered:

1.1 Criticality

We argue that it is important to determine the criticality of each false positive. Not all FPs have the same impact on the user experience.

We recommend that the industry segment false positives into the categories below when conducting a false positive test. Ideally, the software industry should agree upon common metrics for each of these categories (in no particular order):

• System critical: includes false positives that render the computer unusable.

• Network critical: includes false positives that prevent the computer from connecting to the network.

• Browsing critical: includes false positives that prevent the use of a web browser, limiting the user’s access to the internet.

• Business critical: includes false positives on applications or data files which are critical to the operation of a business.

• Core OS, non-critical: includes false positives on core OS files, such as notepad, which are not required for the computer’s basic operation.

• Application critical: includes false positives on 3rd-party applications that render these applications unusable.

• Application non-critical¹: this category of false positive leaves critical elements of an application functional, but with reduced ancillary functionality.

• Data file/Non-executable critical: this includes false positives on documents such as Word, Excel, PDF, and SWF.

• Data file/Non-executable non-critical: includes false positives on temporary files, caches, non-critical settings which don’t impact the operation of the core OS or of system applications.

System Critical

System Critical files are those required for the system to boot up, the user to be able to log in, and still be functional. SVCHost or WinLogon are examples of System Critical files.

Network Critical

Network Critical files are those required for normal network connectivity, such as being able to browse the internet or process email. WinINet.dll is an example of a Network Critical file.

Browsing Critical

Browsing Critical files are those which are required in order to be able to browse the internet. While related to Network Critical, these are specific to browsing. Firefox.exe is an example of a Browsing Critical file.

Business Critical

Business Critical are those applications or data files which are important to business operations. A custom application used for production, or a PDF with important business information would be examples. These will be particularly difficult for a tester to test, as they would not generally be publicly available.

Core OS, Non-Critical

Core OS, non-critical are those files which are part of the operating system, but are not required to boot and log in. Notepad or Calc would be examples of Core OS, non-critical files.

Application Critical

Application Critical are those files required for the operation of a given application. Word.exe is an example of an Application Critical file.

¹ Testers may group application non-critical and application critical FPs together for resource purposes as it can be very time consuming to differentiate.


Application Non-Critical

Application Non-Critical files are those files which belong to a specific application, but are not required for its basic operation. Various types of plugins are examples of Application Non-Critical files.

Data File/Non-Executable Critical

Data File/Non-executable Critical are those user files containing critical information, such as Word documents or mail archives.

Data File/Non-Executable Non-Critical

Data File/Non-executable Non-Critical files are those files which belong to the application, but are not critical to its functions. Caches or templates are examples of Data File/Non-executable Non-Critical files.

Browsing Non-Critical

Browsing Non-Critical files are those which are used by the browsers, but are not integral to their function. Temporary internet files or history URLs are examples.

1.2 Prevalence of an Object

Next to criticality, the prevalence of an object is an important measure to determine what the magnitude of a false positive is. How many users would be impacted by the FP? Those affecting thousands or millions of users are different than those that affect five.

The following should be taken into consideration:

• When possible, false positives should be ranked according to the prevalence of the impacted file; many security vendors now measure prevalence, so testers may wish to query vendors for this data, post-evaluation.

o Many security products submit telemetric data to the vendor. This information should be shared with testers to better allow them to assess the prevalence of FPs. The tester should merge prevalence data from different vendors, since different vendors will have different data based on the size and makeup of their customer base (in most cases, these prevalence statistics are unlikely to overlap). This metric is still problematic, since some files are extremely prevalent yet a false positive on them would have literally no impact. For example, Windows 7 might include a legacy hard disk driver for a long-defunct model of hardware. Even if no users in the Windows 7 user-base used this file, its prevalence would be counted in the tens of millions.

• Prevalence statistics from popular download portals may be used to corroborate prevalence, but should be vetted first. Popular download portals, like download.com, often track the prevalence of hosted downloads. It is important to check with the portal before using these statistics, however; in some cases, the prevalence counts are cumulative (for all versions of an application, rather than the currently posted version of the application). For example, download.com might state that GraphEdit has 1M (cumulative) users, whereas, in fact, only 10 users have downloaded the latest version of the application. A false positive on the latest version of the application would therefore only impact 10 users, whereas it would appear that such a false positive is impacting a million users.
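The merging of vendor prevalence data described above can be sketched as follows. This is only an illustration: it assumes each vendor reports a mapping from file hash to machine count, and it sums the counts on the assumption (stated in the text) that vendor user bases rarely overlap. The hash labels and numbers are hypothetical.

```python
from collections import Counter

def merge_prevalence(vendor_reports):
    """Sum per-hash machine counts across several vendor reports."""
    merged = Counter()
    for report in vendor_reports:
        merged.update(report)  # Counter.update adds counts, it does not overwrite
    return merged

# Hypothetical telemetry: file hash -> number of customer machines.
vendor_a = {"hash-1": 50, "hash-2": 4000}
vendor_b = {"hash-1": 20, "hash-3": 7}

merged = merge_prevalence([vendor_a, vendor_b])
for file_hash, machines in merged.most_common():
    print(file_hash, machines)
```

If user bases do overlap substantially, a plain sum overstates prevalence, so a tester may prefer to report the per-vendor figures side by side.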

1.3 Recoverability

The ramifications of a False Positive are not always the same. For example, having to download a file again from a website might be annoying, but it is quite different from having to use an off-line recovery tool to repair machines that no longer boot.

The following should be taken into consideration when rating the recoverability of a False Positive:

• Permanent destruction: Is the data irreparable, such as the loss of a document or a photograph?

• Off-line recovery: Does the system have to be taken offline in order to recover?

• Recovery from product quarantine/backup: Can the file/data be recovered from the product’s quarantine?

o Including centralized admin recovery: Does this recovery require an administrator to physically access the machine, or can it be recovered remotely?

• Website/download: Can the user download the data again?

1.4 Environment

Testers should take into consideration the intended purpose of the products they are testing. For instance, perimeter defense solutions (such as mail gateways) may have much looser heuristics than desktop solutions. In these instances the False Positive is more a Denial of Service than a true loss of data or an impact on operations. As such, the impact is generally also much less severe.

The following considerations should be made:

• Policy detections vs. core protection detections: If a core protection technology (in its default settings) encounters a false positive, this is different than a false positive due to an administrator-configured blocking policy (which may be intended to block more than just malware).

• How significant is the impact of a false positive: Incorrectly detecting and blocking a legitimate svchost.exe file in email is not nearly as bad as blocking that same critical file on the desktop.

Policy Detections

Detections or blocks which occur as a result of policy should be separated from those which occur by signature. For example, many email clients prevent the user from accessing attachments which are executable. It could be argued that this is a 100% False Positive rate. The difference is that the user has selected this policy himself and it was his choice.


Unlikely Scenarios

Reviewers should take into consideration the conditions under which a False Positive occurs, and whether that condition is likely to happen. For example, consider a security solution which detects an operating system component on an email gateway (presuming it does not detect it on the machine). It is unlikely that a gateway would ever naturally see such files, so such detections should be discounted.

1.5 Response Time

A tester should also consider factoring in the amount of time it took a vendor to fix a particular false positive. Vendors tend to respond very quickly to major false positives. Most also have a mechanism for customers to report potential false positives. In assessing the effectiveness of a security solution it would be useful to measure how quickly the vendor responds to reported false positives.

1.6 Product Context

Many products have different modes of operation. These so-called “paranoid” modes can often be activated by user selection. When users select this mode they are making a conscious choice to increase protection at the greater risk of false positives. Since fewer users would choose to use such a mode, false positives detected in this mode should be rated as less severe – even if on prevalent files. Criticality should be treated the same.

1.7 Other Considerations

Here are a few other considerations which do not fit neatly into the above-mentioned categories. First, FPs often come with higher detection rates. Correlating True Positive (TP) and FP ratios can provide a more accurate reflection of the efficacy of a security solution.
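One way to put TP and FP figures side by side is to compute both as rates for each product on the same test set. The sketch below is illustrative only: the function name, product names and counts are hypothetical, and a real report would also need the weighting factors discussed in this document.

```python
def detection_rates(true_pos, false_neg, false_pos, true_neg):
    """Return (TP rate, FP rate) for one product on one test set."""
    tp_rate = true_pos / (true_pos + false_neg)
    fp_rate = false_pos / (false_pos + true_neg)
    return tp_rate, fp_rate

# Hypothetical results: 10,000 malicious and 1,000 clean samples per product.
products = {
    "Product A": detection_rates(9900, 100, 100, 900),
    "Product B": detection_rates(9000, 1000, 50, 950),
}
for name, (tp_rate, fp_rate) in products.items():
    print(f"{name}: TP rate {tp_rate:.1%}, FP rate {fp_rate:.1%}")
```

In this made-up case Product A detects more malware but also flags twice as many clean files, which is the kind of tradeoff a single FP count hides.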

Testers should take into consideration the version of the program. If an anti-malware product experiences a false positive on v1.7 of a program yet v1.9 is the latest version (and presuming the product in question does not yield a false positive on v1.9), then this should be reported. Of course, just because there is a later version of a program available does not mean that the earlier version is not in use (and in some cases it can be more prevalent than the later version).

Measuring False Positives

Having a false positive on a system-critical file is much worse than on a regular file or resource.

Ideally, there should be a non-linear scale of sorts to rate FPs based on criticality, prevalence, and recoverability.
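As a rough illustration of what such a non-linear scale could look like, the sketch below multiplies a criticality weight and a recoverability weight by the logarithm of prevalence. The weights, the category subsets and the choice of a logarithm are all assumptions made for this example; the document does not prescribe any particular formula.

```python
import math

# Hypothetical weights; agreeing on real values is left to the industry.
CRITICALITY = {
    "system critical": 10.0,
    "application critical": 3.0,
    "data file non-critical": 1.0,
}
RECOVERABILITY = {
    "permanent destruction": 4.0,
    "off-line recovery": 3.0,
    "quarantine restore": 1.5,
    "re-download": 1.0,
}

def fp_magnitude(criticality, prevalence, recoverability):
    """Score one false positive; prevalence is the number of affected machines."""
    scale = math.log10(1 + prevalence)  # dampens very large prevalence counts
    return CRITICALITY[criticality] * RECOVERABILITY[recoverability] * scale

boot_file = fp_magnitude("system critical", 1_000, "off-line recovery")
cache_file = fp_magnitude("data file non-critical", 1_000_000, "re-download")
print(round(boot_file, 1), round(cache_file, 1))
```

With these made-up weights, a system-critical FP on a thousand machines outranks a cache-file FP on a million, which matches the intuition stated above.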

To give an analogy: ‘Falsing’ on highly-critical system files should be viewed in a similar way as missing files from the WildCore.

Additional consideration should be given to the following when doing this testing:


• Check if the detection itself may actually be valid. This specifically applies to RiskWare/PUAs such as mIRC. The same care must be given to confirming the legitimacy of a “clean” sample as is given to a “malicious” one. There have been a number of instances where “clean” files have been infected prior to signing and releasing. The old axiom still holds: trust but verify.

• When dealing with AdWare/RiskWare detections, make sure that detected files are not misclassified. For instance, if the file is detected as AdWare then it is not a false detection. However, if the file is detected as a virus or Trojan then that would be a false detection.

• The vendor should be contacted to make sure that detection was not added intentionally. Vendors do not have uniform policies – particularly regarding “greyware” applications.

• Some vendors may employ contextual policies. For example, the product may not block tftp running as tftp.exe from the Windows directory, but might block the same program running as sldjfsjl.exe from the temporary file directory. Additionally, file name and folder name can both be separate contributing factors in contextual detections.

• The history of a file may play a role – e.g. files installed from CD-ROMs may be treated with less suspicion than files from the Internet or USB drives.

Telemetry

Many security vendors have the ability to collect telemetry from their customers. This can include files, URLs, hashes, and events. This data can be extremely useful in assessing the prevalence of files within a product’s user base. While this data can be strategically important to the vendor, sharing some of it with testers can help determine the files’ prevalence.

Ideally the telemetry shared with a tester includes the following:

• Freshness of file (when was it first seen, when was it last seen?)

• Prevalence of the file (how many customer machines is the file on?)

• Breakdown of prevalence per region (where are these machines located?)

o Ideally the tester would group FPs into countries of origin. For instance, a product may have an FP on programs created in China. This is important.

• Origin of distribution. Does it come with the operating system (or some other very popular application), or is it a specialized utility?

How to Perform False Positive Testing?

Ideally FP testing is performed in a similar fashion to dynamic testing. A stream of fresh clean files should be used to more accurately test FP efficacy. This is because vendors tend to whitelist prevalent clean files quickly, so delays in testing can yield misleading results.

1.1 Static Testing


While AMTSO does not advocate Static Testing, we recognize that these tests will continue to be performed. Given that, there are some basic rules which should be applied:

• Use fresh files that are likely to be in use by real users.

• Context is important. Products are built to protect customers, and some of that involves identifying situations which deviate from the norm. In “normal” situations files have usual names/locations. Additionally, “normal” systems do not have millions of malicious samples.

o Test False Positives specifically. Clean systems should be used for testing False Positives.

o Use files in their “natural” location and name. Similar to above, clean systems should be used.

1.2 Dynamic Testing/Whole Product Testing

Some testers may opt to test for false positives in the same test where they are doing detection testing (i.e. they may intersperse 1000 legitimate files among 10,000 bad files to check for false positives and/or “gaming” of the testing methodology). This is a reasonable approach; however, explicit note should be made of this. Keep in mind that performing FP testing in combination may lead to different results compared to performing individual FP testing. This can be due to a product perhaps switching automatically to a more paranoid mode when malware is detected entering the system (in these cases the product must behave the same in both the real world and the testing environment).

Additionally, there might be a “guilt by association” tendency of some products. If a malicious application also drops some non-malicious files (such as tftp) these might also be detected and removed. This specific context needs to be noted such that the reader can make their own determination as to the usefulness or problem with this approach.

Some security products will take into account the name and location of certain applications in an attempt to discover malicious intent – they conditionally detect based on the context of the detection. When a tester has a directory full of clean files, perhaps named as their hash value, the security product might flag this as the application being in the wrong place or under the wrong name.

For notifications vs. detections the same rules should be maintained between TP and FP testing. If a prompting dialog is presented it must be answered the same regardless of FP or TP testing. This can be complicated by some products which might provide contextual information in order to elicit a more correct response from the user – but how to decide? One way is to capture a number of dialogs and use them to conduct a poll of a number of “typical” users to determine how they would answer those prompts².

² See AMTSO Best Practices for Validation of Samples at www.amtso.org.

Artificial Test Scenarios

Similar to creating new malicious programs for testing³, creating new programs for false positive testing has been considered. However, such artificial scenarios should not be employed. The test should reflect real life scenarios. For further explanation see the Issues Involved in the "Creation" of Samples for Testing document.

Other Considerations for False Positive Testing

Lastly, there are a number of other considerations the tester should account for:

Testing with Other Security Products

In general testers should avoid scanning competing anti-malware products to see if False Positives occur.

One of the main reasons for this is that a scanner may detect the databases of the competitor. This is an edge case with a complicated scenario from both products’ perspectives. Moreover, in the case that one product does not allow coexistence with other security products, this becomes an “artificial” test scenario. Such detections are best reported to all concerned and not included in a test.

Corrupted, Disinfected, or Modified Files

Testers should refrain from having corrupted, (incorrectly) disinfected or otherwise modified files in their FP test set. One exception to this would be under dynamic or whole product testing. Here the product may encounter False Positives on incomplete files (for instance when the browser is downloading a file). In such cases the tester should treat the detection as an FP.

Potentially Unwanted Programs (PUPs)/Riskware

Different vendors may have different policies regarding PUPs or Riskware (useful programs that can – and are – used by malware for nefarious purposes). If such programs are going to be tested, this should be specifically identified and the samples properly verified. This way the reader can make their own evaluation as to how important these detections (or non-detections) are.

Non-Viral Detections

When a detection occurs, the classification of that detection is important. For example, a ServU sample could be detected either as not-a-virus:Riskware.ServU.501 or Trojan.agent.blabla. The first is not a False Positive, it is a correct classification. The second is a False Positive.
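The ServU example can be turned into a small check on the detection name. This is a sketch under loud assumptions: the marker list and the function name are hypothetical, detection naming differs between vendors, and the separators handled here (":", "." and "/") are only those that appear in the two example names or are commonly seen.

```python
import re

# Hypothetical markers; real vendors use differing naming schemes.
GREY_MARKERS = {"not-a-virus", "riskware", "adware", "pua", "pup"}

def is_false_positive(detection_name, sample_is_riskware):
    """Decide whether a detection on a non-malicious sample counts as an FP."""
    tokens = set(re.split(r"[:./]", detection_name.lower()))
    labelled_grey = bool(tokens & GREY_MARKERS)
    if sample_is_riskware:
        # A riskware label on a riskware sample is a correct classification.
        return not labelled_grey
    # Any detection on a sample with no grey-area status is a false positive.
    return True

for name in ("not-a-virus:Riskware.ServU.501", "Trojan.agent.blabla"):
    print(name, is_false_positive(name, sample_is_riskware=True))
```

Matching whole name components rather than substrings avoids treating a name such as a hypothetical "Trojan.Puppet" as a PUP label.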

Case Study: AV-Comparatives’ Vendor Experiment

In March of 2010, AV-Comparatives conducted an experiment with several security companies to determine if it was practical to use the prevalence and criticality information provided by the vendors to assess the impact of a False Positive. The results were quite interesting.

³ See AMTSO Issues Involved in the "Creation" of Samples For Testing at www.amtso.org.


The following email was sent to all the AMTSO members by Andreas Clementi of AV-Comparatives:

All,

At the AMTSO members meeting in Santa Clara I proposed a challenge to the Security Vendors to test the efficacy of some of the proposals for qualifying False Positives. It has been asked that testers classify FPs by prevalence and importance. However, this may be easier said than done. To find out how feasible this will be I propose the following challenge:

Part 1: You will be provided the MD5/SHA1/SHA256 of 11 files. You are to determine the prevalence of these files to assess that portion of the importance ranking.

Part 2: You are to assess the importance of these files to either the Operating System or the Application to which they belong. You should report your results back no later than 20th March 2010 (if you do not answer by then, it will be assumed that you do not take part into this experiment). Your report should include the following:

• Your classification of the importance of this FP (based on Parts 1 & 2 above)

• The number of man hours required to obtain the data for all 11 samples.

Rules:

• Do not consult with other vendors regarding these samples (judge independently)

• Measure the resources taken to perform the classification

• Send your results to [email protected] (DO NOT POST YOUR RESULTS TO THE AMTSO MEMBERS LIST!)

Good luck! And let the experiment begin!

MD5
7bd87ca2644d39fbec5cf98baaa42b5d
b5d963ff2e09514256b9a4e6b9eea6e8
c3510870130e140843513208c7a0e199
407b31366865462ac20d42989c7f03cf
234530ef053c83ce40aa14f440a1ac91
d5ad0d6fbd0a992ad164d9b10c4bb4d3
1f559cbfe3476d1f2d6ff5cec2d0bfe2
5cbc5f0f69d52dff07cd6d93ef1820f4
da70de606de15ae140c3e7d444509550
e7ea288cd0567e78fa98cc27495e4273
19c173f4f57daa49e57fd483be193db3

SHA1
394dc6c68e46e05247453c93fc9f3f24b144bd56
d9b9c4d2e8a7bbccb8e50f9ab0b5659e047cf49f
009a49c4a6255e9d7a3f7ffaf2ab482ea9732d0d
16a5215853099febb53a0784367d4808a8100f76
44d8babb1b3e70ea23e24beec967b67d15a968ec
e07c94ba6efc493d3f383f661a76e3e5924bc846
895d20829e664ef474ac21c769b9b74e130a7ca3
5ceb7a41cbc1e1cf7451ac7b4d2820c2254d14eb
c2f5580d62acc63e3986ede88397ff8f89d3f4d2
ff8f79f7647aaaa06e7d30a8400f5b710171d685
bbcf1d633c2ad645b41d841ee483b89508946e1a

SHA256
ae440c5b00fbd5ea63d3837021cd703beee3289faba7ecb3343c0edc68481864
91f95e504232ca78fad5344ea581164be5162f07be66588c1fe59b2e4df913ec
959b22856900beda29135b23c70db6761e5d44273ee2fd8b66d4f3e1d2449535
daf7883556604fed26de460597775d20f8b133c579b5d318ed08f5c7bcac023b
a7d444ce2db227145a2672bc9087475b16ab1e1fca22a1366e7d434f572d6afc
097d061ffda39c69612cf0102698d588cd92bbe70060c85be39e0223722a8151
195921694a68a154ebfc7…

Regards, Andreas
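For readers who want to match their own copies of files against hash lists like the one in the email, the three digest types can be computed in one pass with Python's standard hashlib module. This is a minimal sketch; the function name and chunk size are arbitrary choices for illustration and are not part of the experiment.

```python
import hashlib

def file_digests(path, chunk_size=65536):
    """Return the MD5, SHA1 and SHA256 hex digests of one file."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling file_digests on a local file returns a dictionary keyed by "md5", "sha1" and "sha256", which can then be compared with the published values.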

Seven vendors took part in this FP experiment. AV-Comparatives was asked to keep the vendor names confidential. Additionally, AV-Comparatives had access to four of the vendors’ clouds to assess prevalence, and those results are included for each test as Cloud A, B, C, and D. This result also contains some of Andreas Clementi’s personal opinions/comments.

The 11 files were all PE files. As most clouds are only able to provide useful data for PE files, it may be even more difficult to get useful data for FPs on non-PE files. Cloud data may vary according to user bases. Some clouds do not collect data on known digitally signed files. Not all vendors have clouds etc. on which they could base their decision, and one vendor also used Google hits as one indication of prevalence and further findings.

Time to Analyze

Here are the results of the time spent (in man hours) by each vendor to research the 11 hashes:

Vendor 1    20 minutes
Vendor 2    30 minutes (only basic data; high level info would take several man-hours)
Vendor 3    35 minutes
Vendor 4    2-3 hours
Vendor 5    2 hours
Vendor 6    30 minutes
Vendor 7    5-6 hours

SAMPLE 1

Name: AreaBluetooth (Proximity Marketing Tool)
URL: http://www.areabluetooth.com
Filename: ABSend.exe
MD5: 7bd87ca2644d39fbec5cf98baaa42b5d
SHA1: 394dc6c68e46e05247453c93fc9f3f24b144bd56
SHA256: ae440c5b00fbd5ea63d3837021cd703beee3289faba7ecb3343c0edc68481864

Users by Cloud⁴    Count                  Correct
Cloud A            around 50              ✔
Cloud B            around 20              ✔
Cloud C            4                      ✔
Cloud D⁵           2 (in last 10 days)    ✔

Download/Sales Stats: Around 8000

Program Description: “AreaBluetooth is a point to point transmission system that relays on the Bluetooth universal protocol. When a bluetooth enabled device (mobile phone, PDA, computer, etc) enters the coverage area, the systems analyzes if there are available contents for the device and then prompts the user for authorization before sending your campaign media files.”

Notes: if the software is not registered (shareware), it works only in 30-minute intervals and sends a banner together with the sent campaigns. This does not happen when the software is registered. Some vendors may therefore consider this program as “Adware”; we do not. It was initially a false alarm of vendor xy, due to which it later got detected as malware by several vendors (while vendor xy fixed the FP in the meantime).

Prevalence/Importance⁶ According to Vendors:

Vendor    Determination                               Correct Prevalence    Correct Importance
1         low prevalence / -                          ✔                     ✔
2         very low prevalence / -                     ✔                     ✔
3         very low prevalence / low importance        ✔                     ✔
4         very low prevalence / low importance        ✔                     ✔
5         low prevalence / low importance (adware)    ✔                     ✔
6         low prevalence / low importance             ✔                     ✔
7         very low prevalence / -                     ✔                     ✔

Conclusion FP#1: data is congruent and correct.

⁴ All clouds are based on the product’s user base. Some clouds primarily measure objects that are actively running.

⁵ Unique occurrences of the files among their users over around 10 days. The numbers are difficult to interpret in their absolute numbers, that’s why the vendor normalized the data to reasonable orders to make them comparable (the real numbers are higher, but pretty much on the same orders; the ratios are maintained). For example, latest IrfanView version would have 72538 occurrences, Firefox 196668, Skype 257501, etc.

⁶ Not all vendors were able to rate the importance of the files.


SAMPLE 2

Name: Brockhaus Multimedial Premium
URL: http://www.brockhaus.de
Filename: cdcops.dll
MD5: b5d963ff2e09514256b9a4e6b9eea6e8
SHA1: d9b9c4d2e8a7bbccb8e50f9ab0b5659e047cf49f
SHA256: 91f95e504232ca78fad5344ea581164be5162f07be66588c1fe59b2e4df913ec

Users by Cloud    Count                    Correct
Cloud A           several thousands        ✔
Cloud B           around 4000              ✔
Cloud C           0
Cloud D           947 (in last 10 days)    ✔

Download/Sales Stats: over 100000

Notes: If cdcops.dll gets quarantined, the program is unusable. If the user tried to run the program without the file, even if the quarantined file then gets restored, the program remains unusable until the user registers the program again with the serial number provided (which I do not find here anymore, which means I lost €100 due to this FP experiment :P).

Prevalence/ImportanceAccordingtoVendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       (unknown)[7] very low prevalence, -
2       medium prevalence, -                                      ✔
3       (unknown), -
4       high prevalence, medium-to-low importance                 ✔
5       (unknown) very low prevalence, very low importance
6       (unknown), -
7       low prevalence, -

Conclusion FP #2: Data is NOT congruent

SAMPLE 3

Name: Eulalyzer
URL: http://www.javacoolsoftware.com
Filename: eulalyzer.exe
MD5: c3510870130e140843513208c7a0e199
SHA1: 009a49c4a6255e9d7a3f7ffaf2ab482ea9732d0d
SHA256: 959b22856900beda29135b23c70db6761e5d44273ee2fd8b66d4f3e1d2449535

[7] For some vendors “unknown/no data” from their cloud corresponds to “zero or very low prevalence”.


Users by Cloud  Count                           Correct
Cloud A         several hundreds                ✔
Cloud B         around 150                      ✔
Cloud C         0
Cloud D         88 (in last 10 days)            ✔

Download/Sales Stats: Unknown, but probably 50000 (apparently 240000 downloads on Majorgeeks; distributed/promoted in many magazines)

Program Description: Eulalyzer analyzes license agreements for interesting words and phrases.

Notes: Download stats are not always a good indicator of prevalence. See the download stats of Eulalyzer according to the following download portals:

Majorgeeks: 239851 (Majorgeeks is one of the main download hosts for Eulalyzer)
CNET: 27212
PCWorld: 11108
Scanwith: 1689
Softpedia: 1231
Freewarefiles: 800
Betanews: 630
Datanews: 251 downloads

Language-specific software may be more popular on some download portals (depending on language/promoted countries/partnerships). Sometimes download sites aggregate download numbers for all versions of a particular piece of software. Also, not everyone who downloads the installer actually installs it. Sometimes people download and install the software, but later on uninstall it. So, at any time, the actual number of people using the software will be less than what is reported on the download site.
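As a rough illustration of how uneven these figures are, the following Python sketch takes the portal numbers listed in the notes above and computes the total, the share held by the largest portal, and the ratio between the largest and smallest portal. The variable names are illustrative and not part of the original experiment.

```python
# Download counts for Eulalyzer, as listed per portal in the notes above.
downloads = {
    "Majorgeeks": 239851,
    "CNET": 27212,
    "PCWorld": 11108,
    "Scanwith": 1689,
    "Softpedia": 1231,
    "Freewarefiles": 800,
    "Betanews": 630,
    "Datanews": 251,
}

total = sum(downloads.values())
largest = max(downloads, key=downloads.get)
smallest = min(downloads, key=downloads.get)

# Share of all listed downloads that the largest portal accounts for.
largest_share = downloads[largest] / total
# How far apart the most and least used portals are.
spread = downloads[largest] / downloads[smallest]

print(f"total listed downloads: {total}")
print(f"{largest} share of listed downloads: {largest_share:.1%}")
print(f"{largest} vs {smallest}: roughly {spread:.0f}x")
```

A tester looking at only one of these portals could therefore be off by several orders of magnitude, before even considering how many downloads turn into active installations.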

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very high prevalence, high importance                     ✔                   ✔
2       very low prevalence, -
3       medium-to-low prevalence, medium importance               ✔
4       low prevalence, low importance, application critical
5       (unknown) very low prevalence, very low importance
6       low prevalence, low importance
7       very low prevalence, -

Conclusion FP #3: Data is NOT congruent

SAMPLE 4


Name: PC Kaufmann (Sage KHK Formulargestalter)
URL: http://www.business-software.at/pckaufmann.html
Filename: formed.exe
MD5: 407b31366865462ac20d42989c7f03cf
SHA1: 16a5215853099febb53a0784367d4808a8100f76
SHA256: daf7883556604fed26de460597775d20f8b133c579b5d318ed08f5c7bcac023b

Users by Cloud  Count                           Correct
Cloud A         around 100                      ✔
Cloud B         around 20                       ✔
Cloud C         0
Cloud D         0 (in last 10 days)

Download/Sales Stats: Over 10000

Note: PC Kaufmann is one of the most well-known ERP systems for SMB in the German-speaking area.

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very low prevalence, -                                    ✔
2       very low prevalence, -                                    ✔
3       (unknown), -                                              ✔
4       very low prevalence, medium-to-low importance             ✔
5       (unknown) very low prevalence, very low importance
6       very low prevalence, very low importance                  ✔
7       very low prevalence, -                                    ✔

Conclusion FP #4: Prevalence OK, Importance NOT OK

SAMPLE 5

Name: IKEA Home Planner Furnish Pro
URL: http://www.ikea.com
Filename: Furnish.exe
MD5: 234530ef053c83ce40aa14f440a1ac91
SHA1: 44d8babb1b3e70ea23e24beec967b67d15a968ec
SHA256: a7d444ce2db227145a2672bc9087475b16ab1e1fca22a1366e7d434f572d6afc

Users by Cloud  Count                           Correct
Cloud A         several hundreds of thousands   ✔
Cloud B         around 45000                    ✔
Cloud C         7
Cloud D         13035 (in last 10 days)         ✔

Download/Sales Stats: Unknown, but supposedly around one million

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very high prevalence, high importance                     ✔                   ✔
2       high prevalence, -                                        ✔
3       (unknown), -
4       high prevalence, medium importance, application critical  ✔                   ✔
5       low prevalence, low importance
6       low prevalence, low importance
7       high prevalence, -                                        ✔

Conclusion FP #5: Data is NOT congruent

SAMPLE 6

Name: 3-WebToGo
URL: http://www.drei.at
Filename: InstallWTGService.exe
MD5: d5ad0d6fbd0a992ad164d9b10c4bb4d3
SHA1: e07c94ba6efc493d3f383f661a76e3e5924bc846
SHA256: 097d061ffda39c69612cf0102698d588cd92bbe70060c85be39e0223722a8151

Users by Cloud  Count                           Correct
Cloud A         thousands                       ✔
Cloud B         0
Cloud C         3
Cloud D         50 (in last 10 days)            ✔

Download/Sales Stats: Over 700000, but effectively only about 150000 in use

Program Description: This program is required for mobile internet access through mobile sticks. Some cloud products may not notice it on the USB stick, as it is usually launched/accessed before an Internet connection is established.

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very high prevalence, very high importance                ✔                   ✔
2       (unknown), -
3       (unknown), -
4       high prevalence, medium-to-low importance                 ✔
5       low prevalence, low importance
6       very low prevalence, very low importance
7       very low prevalence, -

Conclusion FP #6: Data is NOT congruent

SAMPLE 7

Name: Konica Minolta magicolor 2490/2590MF Printer Driver
URL: http://www.konicaminolta.com
Filename: MSDMLT0B.DLL
MD5: 1f559cbfe3476d1f2d6ff5cec2d0bfe2
SHA1: 895d20829e664ef474ac21c769b9b74e130a7ca3
SHA256: 195921694a68a154ebfc763bd83df5e58bcb3726d5a92d4e6026570e6bc9d460

Users by Cloud  Count                           Correct
Cloud A         around 50
Cloud B         0
Cloud C         0
Cloud D         0 (in last 10 days)

Download/Sales Stats: Over 100000

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very low prevalence, -                                    ✔
2       (unknown), -                                              ✔
3       (unknown), -                                              ✔
4       very low prevalence, medium-to-low importance             ✔
5       (unknown) very low prevalence, very low importance
6       (unknown), -                                              ✔
7       very low prevalence, -                                    ✔

Conclusion FP #7: Data is almost congruent

SAMPLE 8

Name: Wood-Online Room Plan
URL: http://www.b2b-wood.eu
Filename: NETShop.exe
MD5: 5cbc5f0f69d52dff07cd6d93ef1820f4
SHA1: 5ceb7a41cbc1e1cf7451ac7b4d2820c2254d14eb
SHA256: 74cb3a3d328a032dac06f90bb1d2da9f541ae612e547ede7e05c3f42a671159e


Users by Cloud  Count                           Correct
Cloud A         around 100                      ✔
Cloud B         0                               ✔
Cloud C         0                               ✔
Cloud D         4 (in last 10 days)             ✔

Download/Sales Stats: Around 1000

Program Description: Wood-Shop Software (software used by business users to order for their customers).

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       low prevalence, -                                         ✔
2       (unknown), -                                              ✔
3       (unknown), -                                              ✔
4       very low prevalence, medium-to-low importance             ✔
5       (unknown) very low prevalence, very low importance
6       (unknown), -                                              ✔
7       very low prevalence, -                                    ✔

Conclusion FP #8: Prevalence OK, Importance NOT OK

SAMPLE 9

Name: Microsoft Windows Server 2008 RTM (Power Management Configuration Panel)
URL: http://www.microsoft.com
Filename: powercfg.cpl
MD5: da70de606de15ae140c3e7d444509550
SHA1: c2f5580d62acc63e3986ede88397ff8f89d3f4d2
SHA256: 37835d920afdf7f398b8f8a8a4675d1a77a73947a9963cfa65189eaede06fbb4

Users by Cloud  Count                           Correct
Cloud A         several hundreds
Cloud B         around 85000                    ✔
Cloud C         ?
Cloud D         56293 (in last 10 days)         ✔

Download/Sales Stats: Unknown, probably several hundreds of thousands

Program Description: Microsoft Windows Server 2008 RTM (Power Management Configuration Panel)

Prevalence/Importance According to Vendors:


Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very low prevalence, -
2       high prevalence, -                                        ✔
3       very low prevalence, very low importance
4       very low prevalence, high importance, OS non-critical
5       low prevalence, high importance                                               ✔
6       low prevalence, low importance
7       high prevalence, -                                        ✔

Conclusion FP #9: Data is NOT congruent

SAMPLE 10

Name: ESET SysInspector
URL: http://www.eset.com
Filename: SysInspector.exe
MD5: e7ea288cd0567e78fa98cc27495e4273
SHA1: ff8f79f7647aaaa06e7d30a8400f5b710171d685
SHA256: 4b4eb0c2dba139738e8806db17bfc0fab62a7ca3dcf8bd94c132cca450a5992c

Users by Cloud  Count                           Correct
Cloud A         several hundreds                ✔
Cloud B         around 1000                     ✔
Cloud C         ?
Cloud D         3469 (in last 10 days)          ✔

Download/Sales Stats: Around 200000[8]

Program Description: System diagnostic tool for Windows systems.

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very high prevalence, -                                   ✔
2       low prevalence, -
3       medium-to-low prevalence, high importance
4       high prevalence, medium importance                        ✔
5       low prevalence, medium importance
6       low prevalence, low importance
7       medium prevalence, -

Conclusion FP #10: Data is NOT congruent

[8] It is to be expected that components which are related to security products might show lower in the clouds of competing programs.

SAMPLE 11

Name: Notebook Hardware Control
URL: http://www.pbus-167.com
Filename: uninst.exe
MD5: 19c173f4f57daa49e57fd483be193db3
SHA1: bbcf1d633c2ad645b41d841ee483b89508946e1a
SHA256: 39c041abfb944625a546683ec94927c05a41bc94b38897bdc2d6e9e192d946e0

Users by Cloud  Count                           Correct
Cloud A         several thousands               ✔
Cloud B         around 650                      ✔
Cloud C         2
Cloud D         1227 (in last 10 days)          ✔

Download/Sales Stats: This version is currently still used on about 13000 notebooks (about 3000 of them still using the paid product); distributed/promoted in several magazines.

Program Description: Uninstaller for NHC; Notebook Hardware Control allows easy control of the hardware components of notebooks.

Notes: Considered as behaving suspiciously by a vendor; a vendor suggested not using it for FP testing.

Prevalence/Importance According to Vendors:

Vendor  Determination                                             Correct Prevalence  Correct Importance
1       very high prevalence, high importance                     ✔
2       low prevalence, -
3       low prevalence, low importance
4       high prevalence, medium-to-low importance                 ✔
5       low prevalence, low importance
6       (unknown), -
7       medium prevalence, -                                      ✔

Conclusion FP #11: Data is NOT congruent

Conclusions from the Experiment

How Do the Clouds Work?


There were four vendors who provided a tool to compute cloud prevalence during this test. Each of these clouds works a little differently. Some of the vendors asked that their description be anonymous. So, in absolutely random order, here is a description of the four clouds.

Cloud W

HTTP-based signature cloud-scanning for PE files that are not file infectors, polymorphic viruses, or script viruses.

We check both on-demand as well as on-access. However, there are certain criteria we use before we check against the cloud: checking locally against whitelist, local signatures and local heuristics. Depending on the results from these local technologies, we will check against the cloud or not.

But we have both a local whitelist (based on digital certificates, for example) as well as a cloud whitelist, so we do check many white files against the cloud as well, both on-access as well as on-demand.

Cloud X

The vendor of Cloud X set up some sort of silent reporting for 10 days, so that every time the files were launched on the systems of their customers, they were reported to the vendor. Real numbers were higher, but the vendor normalized them. For example, installers would run only once and be reported only once, ending up with a much lower reputation.

Cloud Y

"Cloud reporting occurs on execution, opening, copying and is done during both on-demand and on-access scanning. Local whitelist works first and if it hits - this prevents reporting to the cloud.

Local whitelist is frequently and automatically updated - it excludes common clean files.

Therefore, due to local whitelist filtering cloud under-reporting for common items is expected."
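The under-reporting effect described for Cloud Y can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function name simulate_cloud_reports, the event list and the whitelist contents are hypothetical, and the only rule taken from the description is that a local whitelist hit prevents any report to the cloud.

```python
from collections import Counter


def simulate_cloud_reports(events, local_whitelist):
    """Count cloud reports per file, skipping locally whitelisted files.

    events: iterable of file identifiers (e.g. hashes), one per scan event.
    local_whitelist: set of identifiers known locally as common clean files.
    """
    reports = Counter()
    for file_id in events:
        if file_id in local_whitelist:
            # Whitelist hit: the event never reaches the cloud.
            continue
        reports[file_id] += 1
    return reports


# Hypothetical scan events: a very common file and a rare one.
events = ["common_app"] * 5 + ["rare_tool"] * 2
reports = simulate_cloud_reports(events, local_whitelist={"common_app"})
print(reports["common_app"], reports["rare_tool"])
```

In this model the common file is seen five times locally but never appears in the cloud counts, which is why a cloud count of 0 does not by itself mean a file is rare.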

Cloud Z

For users participating we submit the hash of all PE and MSI files which are executed or created on disk. (Note that files which are already present on the system and never execute will not be submitted.) Our prevalence values are an approximate range of the number of submitting, licensed, non-suspicious, unique users of a particular file.
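The description of Cloud Z does not say which hash function is submitted. As a minimal sketch, the following Python function computes the three digests that this document lists for each sample (MD5, SHA1, SHA256) using the standard hashlib module; the function name file_hashes and the chunk size are illustrative choices, not part of any vendor's tooling.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5, SHA1 and SHA256 hex digests of the file at path."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so that large executables are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

A tester could call file_hashes on a sample file and compare the result with the published hashes before querying a vendor's prevalence tool, to make sure the exact same file version is being discussed.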

One thing is clear from these descriptions: none of the clouds reports the presence of every PE file on disk. Most are limited to “active” ones, and even for those, local whitelists will prevent accurate reporting. However, given the different approaches taken by each cloud, when taken in combination they should yield an overall fairly accurate picture of the prevalence of executables that are actually running in the world.

With that in mind, let’s look at how each cloud did.


Cloud    Samples  Correct  Percentage
Cloud A  11       9        81.8%
Cloud B  11       9        81.8%
Cloud C  11       1        9.1%
Cloud D  11       9        81.8%

It appears that the various clouds are working quite well, particularly when taken with the view that they will only cover their customer base. So, how did the vendors do?

Vendor  Samples  Correct Prevalence  Percentage  Correct Importance  Percentage
1       11       9                   81.8%       4                   36.4%
2       11       7                   63.6%       1                   9.1%
3       11       5                   45.5%       1                   9.1%
4       11       9                   81.8%       4                   36.4%
5       11       4                   36.4%       2                   18.2%
6       11       4                   36.4%       1                   9.1%
7       11       7                   63.6%       1                   9.1%
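The percentage columns are simply the number of correct determinations divided by the 11 samples. A small Python sketch, using the per-vendor counts from the table above, shows the arithmetic; the helper name percent_correct and the dictionary layout are illustrative.

```python
SAMPLES = 11


def percent_correct(correct, samples=SAMPLES):
    """Percentage of samples rated correctly, rounded to one decimal place."""
    return round(100 * correct / samples, 1)


# vendor -> (correct prevalence ratings, correct importance ratings),
# taken from the table above.
vendor_correct = {
    1: (9, 4),
    2: (7, 1),
    3: (5, 1),
    4: (9, 4),
    5: (4, 2),
    6: (4, 1),
    7: (7, 1),
}

for vendor, (prevalence_ok, importance_ok) in vendor_correct.items():
    print(
        f"Vendor {vendor}: prevalence {percent_correct(prevalence_ok)}%, "
        f"importance {percent_correct(importance_ok)}%"
    )
```

The same helper applies to the cloud table, since each cloud was also judged on the same 11 samples.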

Pulling it all together, it seems that the best avenue for testers to take is to use the cloud tools provided by the vendors, and to combine that with their own assessment of the importance.

______________________________________________________________________________

This document was adopted by AMTSO on October 22, 2010

