On the Reconstruction of Android Malware Behaviors Aristide Fattori, Kimberly Tam, Salahuddin J. Khan Lorenzo Cavallaro, and Alessandro Reina Technical Report RHUL-MA-2014-1 Feb, 4 2014


On the Reconstruction of Android Malware Behaviors

Aristide Fattori, University of Milan
Kimberly Tam, Royal Holloway University of London
Salahuddin J. Khan, Royal Holloway University of London
Alessandro Reina, University of Milan
Lorenzo Cavallaro, Royal Holloway University of London

ABSTRACT
Today mobile devices and their application marketplaces drive the entire economy of the mobile landscape. For instance, Android platforms alone have produced staggering revenues exceeding 9 billion USD, which unfortunately attracts cybercriminals, with malware now hitting the Android markets at an alarmingly rising pace.

To better understand this slew of threats, we present CopperDroid, an automatic VMI-based dynamic analysis system to reconstruct the behavior of Android malware. Based on the key observation that all interesting behaviors are eventually expressed through system calls, CopperDroid presents a novel unified analysis able to capture both low-level OS-specific and high-level Android-specific behaviors. To this end, CopperDroid presents an automatic system call-centric analysis that faithfully reconstructs events of interest, including IPC and RPC interactions and complex Android objects, to describe the behavior of Android malware regardless of whether it is initiated from Java or native code execution. CopperDroid's analysis generates detailed behavioral profiles that abstract a large stream of low-level (sometimes uninteresting) events into concise high-level semantics, which are well-suited to provide effective insights.

We carried out an extensive evaluation to assess the capabilities and performance of CopperDroid on more than 2,900 Android malware samples. Our experiments show that CopperDroid faithfully describes OS- and Android-specific behaviors and, through the use of a simple yet effective app stimulation technique, successfully triggers and discloses additional behaviors on more than 60% (on average) of the analyzed malware samples, qualitatively improving code coverage of dynamic-based analyses.

1. INTRODUCTION
With more than 1 billion Android-activated devices reported in Sep 2013, mobile platforms have clearly become ubiquitous, with trends showing that such a pace is unlikely to slow down [21].

Application marketplaces, such as Google Play, drive the entire economy of mobile applications. For instance, with more than 1 million installed apps, Google Play has generated revenues exceeding 9 billion USD [20, 22]. Such a wealthy and quite unique ecosystem, with high turnovers and access to sensitive data, has unfortunately spurred an alarming growth in Android malware. Privacy breaches (e.g., access to address book and GPS coordinates) [41], monetization through premium SMS and calls [41], and colluding malware to bypass 2-factor authentication schemes [12] have become real threats. Recent studies also report how easily mobile marketplaces have been abused to host malware or seemingly legitimate applications embedding malicious components [39].

The nature of Android apps makes it hard (or apparently impossible) to rely on traditional system call dynamic malware analysis systems as is. In fact, Android applications are generally written in the Java programming language and executed on top of the Dalvik virtual machine [7], but native code execution is possible, for instance, via JNI. This mixed environment seems to suggest the need to reconstruct, and keep in sync, different semantics through virtual machine introspection (VMI) [17] for both the OS and Dalvik views, as recently shown in [36]: OS-level semantics (e.g., writing to a file, executing a program) would allow one to characterize low-level and native code-induced actions, while Dalvik-level semantics would disclose high-level Android-specific behaviors (e.g., sending an SMS). More recently, Zhang et al. stressed this concept further in VetDroid, a dynamic analysis platform for reconstructing sensitive behaviors in Android apps from a permission use perspective [37]. Zhang et al. point out that traditional system call analysis is ill-suited to characterize the behaviors of Android apps as it misses high-level Android-specific semantics and fails at reconstructing inter-process communications (IPC) and remote procedure call (RPC) interactions, which are essential to understanding Android application behaviors. While this is true in principle, we observe that low-level OS-specific as well as high-level Android-specific behaviors are actually carried out via system call invocations (see Section 3). In fact, Android applications also interact with the system (and other applications) via well-defined system call-initiated IPC and RPC to carry out their tasks. A naive analysis of system calls (with arguments) would therefore miss Android-specific behaviors. The key challenge here is therefore to properly reconstruct such interactions from a single point of observation (system calls), which also requires the ability to dissect complex Android objects automatically.

Inspired by this key observation, we present CopperDroid¹, an approach built on top of QEMU [6] to automatically perform out-of-the-box (VMI-based) dynamic analysis to characterize the behaviors of Android malware. CopperDroid presents a unified analysis that seamlessly captures both low-level OS-specific and high-level Android-specific behaviors. In particular, based on the observation that such behaviors are all achieved through the invocation of system calls, CopperDroid's VMI-based system call-centric analysis faithfully describes Android malware behavior regardless of whether it is initiated from Java, through JNI, or from native code. However, automatically and faithfully reconstructing system call semantics, including IPC and RPC interactions and (complex) Android objects, is a challenging task. In fact, much useful high-level Android-specific information is missing at the system call level. To address this challenge, we introduce the concept of the unmarshalling oracle, which seamlessly recreates complex Android objects to enrich the semantics of the reconstructed OS- and Android-specific behaviors.

A preliminary description of CopperDroid, focused on introducing our preliminary analysis, appeared in our workshop paper [29]¹. Conversely, in this paper, we present our mature research effort, whose contributions can be summarized as follows:

1. We present the design and implementation of a novel and practical oracle-based technique that automatically and seamlessly reconstructs Android-specific objects involved in system call-related IPC and RPC interactions. Our approach avoids manual development efforts and transparently addresses the challenge of dealing with the ever increasing number of complex Android objects introduced in different Android releases. This allows us to perform large-scale, automatic, and faithful analysis of Android malware behaviors.

2. To abstract a number of low-level system calls to high-level semantics (e.g., network communications, file system access) and further enrich our generated behavioral profiles, we automatically build data dependency graphs over sets of observed system calls and perform forward slicing to cluster data-dependent system calls. This gives us the ability to automatically recreate the resources associated with a stream of sliced system calls, which, depending on the resource, can be fed back to CopperDroid, downloaded for further inspection, or analyzed by other systems.

3. To disclose additional malware behaviors, we design and implement a stimulation technique based on the observation that Android applications are inherently user-driven and feature a number of implicit, yet well-defined entry points.

4. We provide a thorough evaluation of CopperDroid's behavioral reconstruction capability on more than 1,200 malware samples from 49 Android malware families as provided by the Android Malware Genome Project [40], about 400 samples over 12 Android malware families from the Contagio project [11], and more than 1,300 samples from McAfee, divided into roughly 115 families. Our experiments show that CopperDroid is able to automatically and faithfully describe the behavior of all these samples. Furthermore, CopperDroid confirms the importance of proper malware stimulation, which allowed us to disclose an average of 28% additional unique behaviors on 60% of malware samples in the first set, 22% on more than 70% of samples in the second set, and 28% on 61% of samples in the last set.

¹ Due to double blind submission requirements, we have used a generic name for our system and removed authors, title, and venue information from our workshop bibentry; the real name and workshop paper information have been provided to the PC co-Chairs.
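The clustering of data-dependent system calls described in contribution 2 can be illustrated with a minimal sketch. It is not CopperDroid's implementation: the trace record format (`name`, `args`, `ret`), the `PRODUCERS` and `CONSUMERS` sets, and the rule that a call depends on whichever call returned its file descriptor are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical trace records: each system call has a name, its arguments,
# and its return value. Only file-descriptor dependencies are modelled.
PRODUCERS = {"open", "socket"}          # return value is a new descriptor
CONSUMERS = {"read", "write", "connect", "sendto", "close"}  # args[0] is an fd


def build_dependencies(trace):
    """Return edges {producer_index: [consumer_index, ...]} linking each
    fd-consuming call to the call that created that descriptor."""
    edges = defaultdict(list)
    live_fd = {}                        # fd -> index of the call that created it
    for i, call in enumerate(trace):
        if call["name"] in CONSUMERS:
            fd = call["args"][0]
            if fd in live_fd:
                edges[live_fd[fd]].append(i)
            if call["name"] == "close":
                live_fd.pop(fd, None)   # the number may be reused later
        if call["name"] in PRODUCERS and call["ret"] >= 0:
            live_fd[call["ret"]] = i
    return edges


def forward_slice(trace, edges, start):
    """Collect every call reachable from `start` along dependency edges."""
    seen, stack = [], [start]
    while stack:
        i = stack.pop()
        if i not in seen:
            seen.append(i)
            stack.extend(edges.get(i, []))
    return [trace[i] for i in sorted(seen)]


def reconstruct_file(calls):
    """Recreate the bytes written to a resource from a slice of calls."""
    return b"".join(c["args"][1] for c in calls if c["name"] == "write")


trace = [
    {"name": "open", "args": ["files/samplefile.txt", 0x20241, 0x180], "ret": 0x1C},
    {"name": "socket", "args": [0xA, 0x1, 0x0], "ret": 0x26},
    {"name": "write", "args": [0x1C, b"Data write", 0xA], "ret": 0xA},
    {"name": "close", "args": [0x1C], "ret": 0},
]
edges = build_dependencies(trace)
file_slice = forward_slice(trace, edges, 0)
print([c["name"] for c in file_slice])
print(reconstruct_file(file_slice))
```

Here the slice rooted at the `open` call gathers the `write` and `close` that use the same descriptor and leaves the unrelated `socket` call out, so the written bytes can be reassembled into the file. A fuller dependency graph would also have to follow descriptors duplicated or passed between processes, which this sketch ignores.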

We believe CopperDroid's unified analysis contributes effectively to improving the state of the art in reconstructing the behavior of Android malware, as outlined in the following.

Recently, Enck et al. proposed TaintDroid in [13], a framework for lightweight taint analysis of Android applications. Although taint analysis can be of great aid when analyzing malware (e.g., detecting information leakage), it has intrinsic limitations [32] and can be easily evaded by malware [10, 31]. Furthermore, TaintDroid is not well-suited for large-scale analysis of Android malware. Despite reporting good performance, TaintDroid components are "in-the-box" and thus extremely vulnerable to detection and tampering.

More recently, Yan and Yin proposed DroidScope, a general-purpose VM-based out-of-the-box framework to build dynamic analysis for the Android environment [36]. Although our system could have been built on top of DroidScope, its source code was not available when we began our development. Furthermore, DroidScope offers basic hooking mechanisms and relies on keeping a synchronized 2-level VMI (at the OS and Dalvik VM semantics), which makes it complex and harder to port to different versions of the Android OS (for instance, VMI-related offsets tend to vary more frequently in the Dalvik VM than in the kernel). In addition, DroidScope's effectiveness has been evaluated on only two Android malware samples, which makes it hard to understand how much of the proposed approach is actually automatic and how much requires manual intervention.

VetDroid [37] presents a framework to construct permission use behaviors, which highlights how applications use permissions to access system resources, and how such resources are utilized by the application. Although an interesting approach, VetDroid requires a quite intrusive modification of the Android system (the Dalvik VM, Binder, and the Linux kernel), which hampers the ability to easily port the system to different Android versions. In addition, VetDroid builds on top of TaintDroid and, therefore, inherits its drawbacks [10, 31]. Conversely, our system's unified analysis does not require complex introspection, but only the collection of the system calls invoked by the processes running on the monitored system (as all the analyses are performed outside the VM). This flexibility allows our system to be loosely tied to a specific Android environment, which enables seamless integration across different Android versions. For instance, we have successfully run CopperDroid on Froyo, Gingerbread, and Jelly Bean with no modification to the Android system.

To overcome the limits of dynamic analysis (e.g., code or path coverage), Anand et al. proposed a concolic-based solution [2] to automatically identify events an application reacts to. Unfortunately, such a solution has two main drawbacks: it is based on instrumentation (i.e., easy to detect) and is extremely time-consuming (i.e., up to hours to exercise a single application). Although an interesting direction to explore further, that approach is ill-suited to perform large-scale malware analysis. As further described in Section 3.2, CopperDroid relies on a simple-yet-effective stimulation technique that is able to improve basic dynamic analysis coverage and discover additional behaviors with low overheads.

Although it required a non-negligible implementation effort, we consider the system we have developed as a mere yet necessary mechanism to carry out our actual contributions, i.e., CopperDroid's VMI-based system call-centric analysis (of which automatic IPC and RPC dissection and the reconstruction of Android-specific objects, and thus behaviors, are a key aspect), malware stimulation, and evaluation on large data sets.

2. THE ANDROID SYSTEM
Android applications are typically written in the Java programming language and then deployed as Android Package archives (APKs). Each application runs in a separate userspace process [3] as an instance of the Dalvik virtual machine (DVM) [7] and usually with a distinct user and group ID. Although isolated within their own sandboxed environment, these applications can interact with other applications and the system through well-defined APIs. Every APK is also considered to be a self-contained application that can be logically decomposed into one or more components. Each component is generally designed to fulfill a specific task (e.g., GUI-related actions, notification receiver) and is invoked either by the user or the OS. Android defines activities, services, content providers, and broadcast receivers.

Activities, services, and broadcast receivers are activated by intents, i.e., asynchronous messages exchanged between individual components to request an action. Activity and service intents specify actions to be performed. Conversely, broadcast receiver intents define the received event and are delivered to the interested receivers.

Android manifests are XML files that must be included in every APK and contain a good deal of interesting information that can provide preliminary insights about an application's maliciousness [42]. A manifest declares application components as well as the set of permissions the application requests, along with the hardware and software features the application uses. In addition, a manifest may include intent filters, i.e., the set of intents the application is willing to handle.

2.1 Binder: IPC and RPC
The Android system implements the principle of least privilege by providing a sandbox for each installed application. Therefore, one process may not manipulate the data of another process and can only access system components if it explicitly requested the related permissions in the manifest. Nevertheless, applications often need a way to communicate with each other and share data, e.g., an application can request the permission to send SMS through the appropriate service.

The Android system relies on Binder as its optimized IPC and RPC mechanism. When IPC is performed from process A to process B, the calling thread in A will wait until the next available thread in the thread pool of B replies with the results. The calling thread returns as soon as it receives such results. The data sent in the transaction is a Parcel, a buffer of flattened data and meta-data information. Dispatching of the message between A and B takes place by means of an ioctl system call handled by the Binder kernel driver.

When a service needs to provide a binding, it must define a client-server interface that allows applications to bind and interact with it; such an interface is called a bound service. If the service is used by other applications or across separate processes and requires multithreading, then the interface is usually defined by means of the Android interface definition language (AIDL). AIDL performs all the work to decompose objects into primitives that can be marshalled across processes. Any kind of request and data exchanged among clients and services goes through Binder, whose thorough analysis therefore allows us to identify Android-specific behaviors (e.g., sending an SMS and accessing private information).

3. COPPERDROID
An unmodified Android system runs on top of a modified Android emulator (the CopperDroid emulator), which is built on top of QEMU [6]. To this end, we have enhanced (i.e., instrumented) the Android emulator to enable system call tracking and support our out-of-the-box system call-centric analyses. It is worth pointing out that all our analyses are executed outside the CopperDroid emulator; we rely on well-known virtual machine introspection (VMI) [17] techniques to bridge the semantic gap between our emulator and the Android OS. Due to space constraints, we refer the reader to our preliminary workshop paper [29] for VMI-related details.

This provides, to an analyst or end user, a transparent environment to automatically perform out-of-the-box dynamic behavioral analysis on any kind of Android application (and, for this work, we are specifically interested in Android malware). To this end, CopperDroid presents a unified analysis to characterize low-level OS-specific and high-level Android-specific behaviors.

In particular, based on the observation that such behaviors are all achieved through the invocation of system calls, CopperDroid's VMI-based system call-centric analysis faithfully describes Android malware behavior whether it is initiated from Java, JNI, or ELF code.

Android applications rely on IPC mechanisms due to both operating system design and the need to interoperate among services. An application that needs, for instance, to send an SMS must perform a remote method invocation on the corresponding service. Any message exchanged between the client and the service goes through the Binder protocol, which is implemented as a kernel driver. Therefore, when an application needs to perform IPC, it has to invoke the appropriate ioctl system call to allow Binder to dispatch the requested action to the corresponding service and vice versa. Android Binder then marshalls and unmarshalls the contents of the IPC message, i.e., a Parcel object, based on the information provided in the AIDL as described in Section 2.

To this end, we provide and implement in CopperDroid a novel technique to perform automatic unmarshalling of any AIDL available on the system. This allows for an easy, human-readable, and error-free representation of the IPC message's contents, which is of paramount importance to describe and understand Android-specific behaviors (e.g., sending an SMS, accessing private information). The low-level OS-related behaviors of Android malware (e.g., writing to a file, creating a process, sending network data) are of course achieved through system calls and therefore intercepted¹ by CopperDroid's unified analysis.

In other words, any representative application behavior is the union of low- and high-level information identified by system calls, parameters, and Binder unmarshalled data. This result highlights the strength of a unified analysis such as our system: mere system call tracking would not provide any behavioral insight if it were not combined with Binder information and automatic reconstruction of (complex) Android objects, as outlined next.

3.1 Tracking System Call Invocations
Tracking system call invocations is at the basis of virtually all dynamic malware behavioral analysis systems [19, 23, 34]. Most (if not all) of such systems implement a form of VMI to track system call invocations on a virtual x86 CPU. Although similar, the ARM architecture presents a few details that may challenge VMI-based tracking of system call invocations and are thus worth outlining.

The ARM ISA provides the swi instruction for invoking system calls, which causes the well-known user-to-kernel transition by triggering a software interrupt (see [29] for further details). To track system call invocations, we instrument QEMU to intercept the execution of the swi instruction. Of course, it is also of paramount importance to detect when a system call is about to return, as that allows us to save its return value, which enriches the analysis with additional semantic information. Therefore, instead of relying on a cumbersome heuristic, the generic approach CopperDroid adopts is to intercept CPU privilege-level transitions. Due to space constraints, we refer the reader to [29] for further technical details.
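As a rough illustration of pairing a system call entry with its return through privilege-level transitions, consider the following sketch. The event stream format (`tid`, `kind`, `regs`, `new_mode`) is hypothetical and stands in for whatever an instrumented emulator would report; the sketch relies only on the ARM EABI convention on Linux that the system call number is passed in r7 and the result is returned in r0, and on the CPSR mode encodings for User and Supervisor mode.

```python
USR_MODE = 0b10000   # CPSR mode field for User mode on ARM
SVC_MODE = 0b10011   # CPSR mode field for Supervisor mode on ARM


def pair_syscalls(events):
    """Match each swi (system call entry) with the next return to user mode
    of the same thread, and attach the return value found in r0."""
    pending = {}     # tid -> (syscall number, arguments) awaiting a return
    completed = []
    for ev in events:
        tid, regs = ev["tid"], ev["regs"]
        if ev["kind"] == "swi":
            # Syscall number in r7; only the first three argument
            # registers are kept here for brevity.
            pending[tid] = (regs["r7"], [regs[f"r{i}"] for i in range(3)])
        elif ev["kind"] == "mode_change" and ev["new_mode"] == USR_MODE:
            if tid in pending:
                number, args = pending.pop(tid)
                completed.append(
                    {"tid": tid, "nr": number, "args": args, "ret": regs["r0"]}
                )
    return completed


# Hypothetical event stream: thread 7 issues write (number 4 on ARM EABI),
# while thread 9 returns to user mode for an unrelated reason.
events = [
    {"tid": 7, "kind": "swi",
     "regs": {"r0": 0x1C, "r1": 0x4000, "r2": 0xA, "r7": 4}},
    {"tid": 9, "kind": "mode_change", "new_mode": USR_MODE, "regs": {"r0": 0}},
    {"tid": 7, "kind": "mode_change", "new_mode": USR_MODE, "regs": {"r0": 0xA}},
]
calls = pair_syscalls(events)
print(calls)
```

Keying the pending table by thread keeps interleaved threads apart; returns to user mode that do not follow a swi of the same thread are ignored.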

3.1.1 Automatic AIDL Unmarshalling
As outlined in Section 2.1, the Android system heavily relies on kernel-implemented IPC and RPC interactions to carry out tasks and (some) permission-related policy enforcement. Therefore, tracking and dissecting the communications that happen over this medium is a key aspect for reconstructing high-level Android-specific behaviors. Although recently explored to enforce user-authorized security policies [35], to the best of our knowledge, CopperDroid is the first approach to carry out a detailed analysis of such communication channels to comprehensively characterize OS- and Android-specific behaviors of malicious applications.

Algorithm 1: Unmarshalling Oracle

Data: Marshalled binder transaction and data types (determined through the AIDL)
Result: Unmarshalled binder transactions
while data is marshalled do
    determine type of marshalled item;
    if type is primitive then
        automatically apply correct parcelable read/create functions;
        append string repr. to results;
    else
        locate CREATOR field for reflection;
        use Java reflection to get class object;
        for every class field do
            if field is primitive then
                append string repr. to results;
            else
                explore field recursively;
                append string repr. to results;
            end
        end
    end
end

¹ The CopperDroid emulator intercepts system calls and extracts their parameters; Binder is a specific case of such.

Let us consider an application that sends an SMS as our running example. From a high-level perspective (e.g., Java methods), sending an SMS roughly corresponds to obtaining a reference to an instance of the class SmsManager, the phone SMS manager, and sending the SMS out by invoking the method sendTextMessage on the instance, with the destination phone number and the text message as the method's arguments (see Figure 1). This corresponds to locating the Binder service isms and remotely invoking its sendText method with proper arguments.

Conversely, from a low-level perspective, the same actions correspond to the sender application invoking two ioctl system calls on /dev/binder: one to locate the service and the other to invoke its method. CopperDroid thoroughly introspects the arguments of each binder-related ioctl system call to reconstruct the remote invocation. This allows us to identify the invoked method and its parameters, and to infer the high-level semantics of the operation. In particular, we focus our analysis on Binder transactions, i.e., IPC operations that actually transfer data (also responsible for RPC). To identify them, CopperDroid parses the memory structures passed as a parameter to the ioctl system call and identifies BC_TRANSACTION and BC_REPLY (see [29]).
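To make the identification of BC_TRANSACTION and BC_REPLY concrete, the sketch below walks a Binder write buffer that has already been copied out of guest memory. It is an illustration under stated assumptions rather than CopperDroid's parser: the 40-byte layout of the transaction record is our reading of the 32-bit Binder driver header, command codes are derived with the standard Linux `_IOW` encoding, and the helper names are ours.

```python
import struct


def _iow(type_char, nr, size):
    """Linux _IOW encoding: direction | payload size | type | number."""
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


def ioc_size(cmd):
    """Payload size encoded in bits 16-29 of an ioctl-style command."""
    return (cmd >> 16) & 0x3FFF


# Assumed 32-bit layout of the transaction record, ten 32-bit fields:
# target, cookie, code, flags, sender_pid, sender_euid,
# data_size, offsets_size, data buffer pointer, offsets pointer.
TXN_FORMAT = "<10I"
TXN_SIZE = struct.calcsize(TXN_FORMAT)
BC_TRANSACTION = _iow("c", 0, TXN_SIZE)
BC_REPLY = _iow("c", 1, TXN_SIZE)


def find_transactions(write_buffer):
    """Return the BC_TRANSACTION and BC_REPLY entries of a write buffer;
    other commands are skipped using the size encoded in the command."""
    found, offset = [], 0
    while offset + 4 <= len(write_buffer):
        (cmd,) = struct.unpack_from("<I", write_buffer, offset)
        offset += 4
        payload = write_buffer[offset:offset + ioc_size(cmd)]
        offset += ioc_size(cmd)
        if cmd in (BC_TRANSACTION, BC_REPLY) and len(payload) == TXN_SIZE:
            fields = struct.unpack(TXN_FORMAT, payload)
            found.append({
                "kind": "BC_TRANSACTION" if cmd == BC_TRANSACTION else "BC_REPLY",
                "target": fields[0],      # handle of the remote service
                "code": fields[2],        # numeric code of the invoked method
                "data_size": fields[6],   # length of the marshalled Parcel
                "data_ptr": fields[8],    # guest address of the Parcel bytes
            })
    return found


# Hypothetical buffer: an unrelated command with a 4-byte payload, followed
# by a transaction on handle 5 invoking method code 4.
other_cmd = _iow("c", 3, 4)
buf = (struct.pack("<II", other_cmd, 0xDEADBEEF)
       + struct.pack("<I", BC_TRANSACTION)
       + struct.pack(TXN_FORMAT, 5, 0, 4, 0, 0, 0, 128, 0, 0x44FC6E17, 0))
print(find_transactions(buf))
```

The `code` and `data_ptr` fields are the pieces later stages need: the former selects the remote method, the latter locates the marshalled Parcel in guest memory.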

However, just intercepting transactions may be of limited use when it comes to understanding Android-specific behaviors. In fact, the raw ioctl-provided Binder data that flows through transactions is in the form of Parcel objects. Moreover, every interface the client and service agree upon has its own set of predefined method signatures and, as the Android framework counts about 300 AIDL interfaces, manual unmarshalling is infeasible and error-prone.


...
SmsManager sms = SmsManager.getDefault();
sms.sendTextMessage("1234", null, "Hello", null, null);
OutputStreamWriter out = new OutputStreamWriter(
    openFileOutput("samplefile.txt", MODE_WORLD_READABLE));
out.write("Data write", 0, 10);
...
String url = "http://www.google.com";
Intent i = new Intent(Intent.ACTION_VIEW);
i.setData(Uri.parse(url));
startActivity(i);
...
(a)

...
ioctl(0x14, BINDER_WRITE_READ, 0x440bec04) = 0
open("files/samplefile.txt", 0x20241, 0x180) = 0x1c
read(0x3f, 0x470bec04, 0xf) = 0xf
...
write(0x1c, "Data write", 0xa) = 0xa
...
ioctl(0x14, BINDER_WRITE_READ, 0x44fc6e17) = 0
ioctl(0x14, BINDER_WRITE_READ, 0x44fd7e10) = 0
ioctl(0x14, BINDER_WRITE_READ, 0x44fe8e27) = 0
ioctl(0x14, BINDER_WRITE_READ, 0x44feae07) = 0
...
socket(0xa, 0x1, 0x0) = 0x26
ioctl(0x14, BINDER_WRITE_READ, 0x44fd7e25) = 0
...
connect(0x26, {'sin_port': 80,
    'in_addr': '::ffff:173.194.41.144', ...},
    0x80) = 0
sendto(0x26, 0x43eadb48, 0x185, 0x0, 0x0,
    0x0) = 0x185
...
(b)

BINDER_TRAN::com.android.internal.telephony.ISms.sendText(
    destAddr = 123456789, srcAddr = None,
    text = "Hello world", sentIntent = [PendingIntent null],
    deliveryIntent = [PendingIntent null])
FS_ACCESS::Creation of "samplefile.txt"
BINDER_TRAN::android.content.Intent {
    class java.lang.String mAction:
        "android.intent.action.VIEW",
    class android.net.Uri mData:
        class android.net.Uri$StringUri {
            class java.lang.String uriString:
                "http://www.google.com",
        }
}
...
NET_ACCESS::HTTP { connect '173.194.41.152', ...
    'GET /?gfe_rd=cr&ei=wHR2U_CcH4LY8gfhroGYDg HTTP/1.1
    ...'
...
(c)

Figure 1: CopperDroid behavior reconstruction analysis: (a) shows a snippet of an Android app written in Java; (b) shows an excerpt of the corresponding system call trace; (c) shows an excerpt of the reconstructed behavioral profile (binder analysis and resource reconstructor). The reconstructed file system resource is made available for download.

To understand the invoked method and the unmarshalling procedure for its parameters, we extended CopperDroid as follows. First, we let it identify the InterfaceToken specified in the payload (see [29] for further details). This is then used to find the AIDL description of the interface, which CopperDroid needs to associate the numeric code with the invoked method and, from there, to understand the types of its parameters. This step is necessary because, even if Parcel methods can create easily unmarshallable streams of bytes (including metadata to associate bytes to types), payloads are often marshalled directly, as only the receiver knows exactly how to unmarshall them. Our solution then modifies the AIDL compiler to automatically generate, for each interface, the signatures of each method. Such signatures are then sent to an unmarshalling oracle, along with the parceled data extracted from a transaction, to determine the values of these objects through automatic unmarshalling of binder data.
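The association between an InterfaceToken, a numeric transaction code, and a method signature can be pictured as a lookup table. The sketch below is an assumption-laden illustration: it uses the fact that AIDL-generated stubs number methods from `IBinder.FIRST_CALL_TRANSACTION` in declaration order, but the abbreviated interface, the placeholder method `someEarlierMethod`, and the method order are hypothetical; the parameter names are taken from Figure 1 (c).

```python
FIRST_CALL_TRANSACTION = 1   # value of android.os.IBinder.FIRST_CALL_TRANSACTION


def index_interface(token, methods):
    """Assign transaction codes to methods in declaration order, as
    AIDL-generated stubs do when no explicit ids are given."""
    return {
        (token, FIRST_CALL_TRANSACTION + i): method
        for i, method in enumerate(methods)
    }


# Hypothetical, abbreviated interface description: the real ISms declares
# more methods, and their order differs across Android releases.
SIGNATURES = index_interface(
    "com.android.internal.telephony.ISms",
    [
        ("someEarlierMethod", []),
        ("sendText", [
            ("destAddr", "String"),
            ("srcAddr", "String"),
            ("text", "String"),
            ("sentIntent", "PendingIntent"),
            ("deliveryIntent", "PendingIntent"),
        ]),
    ],
)


def resolve(token, code):
    """Return (method name, parameter list) for a transaction, or None when
    no AIDL description is available for that interface and code."""
    return SIGNATURES.get((token, code))


name, params = resolve("com.android.internal.telephony.ISms", 2)
print(name, [ptype for _, ptype in params])
```

Because codes depend on declaration order, such a table has to be regenerated for each Android release, which is consistent with deriving signatures from the AIDL compiler rather than maintaining them by hand.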

The unmarshalling oracle is a Java application that runs in a vanilla Android emulator alongside CopperDroid. Its task is to receive the Binder method signature and its arguments, in the form of a marshalled Parcel blob, and to return a custom representation of such objects.

By default, the oracle is queried offline, i.e., after all the execution traces have been collected. The data sent from CopperDroid's emulator to the oracle consists of two sets: 1) the marshalled data derived from binder communication by CopperDroid's binder analysis, and 2) the signature of the method, whose argument types correspond to the sequence of marshalled data. Once it has acquired both, and with the use of Java reflection, the oracle is able to unmarshall the complex serialized Java objects, returning their string representations to CopperDroid for further Android-specific behavioral analysis.

Algorithm 1 outlines the working details of the unmarshalling oracle. The oracle Android application currently unmarshalls binder communication objects in one of two ways, depending on whether the type of data is a primitive type or a class object. While iterating through the list of types and class names, if the type is identified as primitive, the correct read function provided by Parcel is applied.

Unlike primitives, to unmarshall a class instance, the oracle application requires Java reflection to dynamically retrieve a reference to the CREATOR field, before reading the remaining class data. The CREATOR is then invoked to unmarshall the new object and read in its data.

Once either a primitive or class type has been unmarshalled, the oracle creates a string representation of the object by invoking its toString() method, if any, or alternatively by inspecting its fields through reflection to produce as much information as possible. The string representation is then appended to an output string list, and the marshalled data offset is updated to point to the next marshalled item. Additionally, the oracle iterates to the next type or class name on the given list. Figure 1 (c) shows the outcome of the unmarshalling oracle on a test application.
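The control flow of the oracle (read primitives directly, recurse into class fields, append string representations, advance the offset) can be mimicked outside Android with a small stand-in. This sketch is not the Java oracle: a plain dictionary replaces CREATOR lookup through reflection, the class `Contact` is hypothetical, and the encoding of class instances (a presence flag followed by the fields in order) is a simplifying assumption. Only the integer and string encodings follow the usual Parcel conventions.

```python
import struct


class Cursor:
    """Tracks the read offset into a marshalled blob."""

    def __init__(self, data):
        self.data, self.pos = data, 0

    def read_int(self):
        (value,) = struct.unpack_from("<i", self.data, self.pos)
        self.pos += 4
        return value

    def read_string(self):
        # Length in UTF-16 units (-1 for null), the characters, a 2-byte
        # terminator, then padding up to a 4-byte boundary.
        length = self.read_int()
        if length < 0:
            return None
        raw = self.data[self.pos:self.pos + 2 * length]
        self.pos += (2 * (length + 1) + 3) & ~3
        return raw.decode("utf-16-le")


PRIMITIVES = {"int": Cursor.read_int, "String": Cursor.read_string}

# Stand-in for reflection on CREATOR: class name -> ordered field types.
CLASSES = {"Contact": [("name", "String"), ("age", "int")]}


def unmarshal(cursor, type_name):
    """Return a string representation of the next item of the given type."""
    if type_name in PRIMITIVES:
        return repr(PRIMITIVES[type_name](cursor))
    if cursor.read_int() == 0:           # assumed presence flag: 0 means null
        return "[" + type_name + " null]"
    parts = [name + " = " + unmarshal(cursor, ftype)
             for name, ftype in CLASSES[type_name]]
    return type_name + " { " + ", ".join(parts) + " }"


def oracle(blob, signature):
    """Unmarshal a blob according to the ordered list of argument types."""
    cursor = Cursor(blob)
    return [unmarshal(cursor, t) for t in signature]


def pack_string(text):
    """Helper that builds test input in the string encoding read above."""
    body = text.encode("utf-16-le") + b"\x00\x00"
    return struct.pack("<i", len(text)) + body + b"\x00" * (-len(body) % 4)


blob = (pack_string("1234") + struct.pack("<i", -1)
        + struct.pack("<i", 1) + pack_string("Ann") + struct.pack("<i", 30))
print(oracle(blob, ["String", "String", "Contact"]))
```

The signature drives the whole process: without the ordered type list, the same bytes could not be split into a string, a null string, and an object, which is why the method signature has to accompany the marshalled data.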

Unfortunately, not every Binder transaction can be analyzed offline. Indeed, to improve performance, some of the arguments of the remote method are not directly marshalled in the Parcel blob but, rather, in a shared memory area handled by the Android ashmem driver. In this case, the Parcel only contains a reference to locate the object, and the receiving process uses it to identify the shared memory area and retrieve the data from it. In particular, this reference is a file descriptor representing an ashmem region, created and mmap'ed by the sender process to store the marshalled data. The receiving process can then seamlessly mmap the received fd to retrieve data.

With ashmem-based parceling, the oracle cannot work solely offline. Indeed, we need CopperDroid to keep track of created ashmem regions and to retrieve data to be unmarshalled by the oracle. However, to extract relevant data at relevant times, CopperDroid must know the file descriptor passed as a reference in the binder transaction. To avoid using ad-hoc extraction procedures, it needs the support of the oracle itself, which can automatically extract this information. Thus, whenever CopperDroid intercepts a binder transaction that uses ashmem, it sends parceled data to the oracle that, instead of unmarshalling the ashmem-based objects, just returns the file descriptor. With this information, CopperDroid identifies and dumps the corresponding ashmem region and sends it back to the oracle for the final unmarshalling and later analyses.

#  Stimulation Type  Parameters
1  Received SMS      Text, from number
2  Incoming call     From number, duration
3  Location update   Geospatial coordinates
4  Battery status    Amount of battery
5  Phone Reboot      -
6  Keyboard input    Typed text

Table 1: Main stimulations and parameters.

A limitation of our current implementation is that we can only automatically inspect and unmarshal parameters of methods contained in interfaces for which we have the AIDL files. Bound services that do not use AIDL, e.g., ActivityManager, are manually unmarshalled.

3.2 Path Coverage

Although effective, a simple install-then-execute dynamic analysis may miss a number of interesting (malicious) behaviors. This problem has long affected traditional dynamic analysis approaches, as non-exercised paths simply go unanalyzed. If such paths host additional (or the only) malicious behaviors, then any dynamic analysis would fail unless proper, but generally expensive and complex, exploration techniques are adopted [8, 26]. In addition, this problem is exacerbated by the fact that mobile applications are inherently user driven, and interaction with applications is generally necessary to increase coverage. For instance, consider an application that operates as a broadcast receiver for SMS_RECEIVED events. After installation, the application would only react to the reception of an SMS, showing no interesting or additional behaviors otherwise.

Traditional executables have a single entry point, while Android applications may have multiple. Most applications have a main activity, but ancillary activities may be triggered by the system or by other applications, and the execution may reach them without flowing through the main one. To address this coverage problem, CopperDroid implements a novel approach to artificially stimulate the analyzed malware with a number of valid and interesting events based on the malware's Manifest. For example, injecting events such as phone calls and the reception of SMS texts would lead to the execution of the application's registered broadcast receivers. Another example that comes from our experience with Android malware is the BOOT_COMPLETED intent, which many samples use to get executed as soon as the victim system is booted (much like \CurrentVersion\Run registry keys on Windows systems).

The Android emulator offers the possibility to inject a considerable number of artificial events to stimulate a running application. These range from very low-level hardware-related events (e.g., loss of the 3G signal) to higher-level ones (e.g., incoming calls, SMS). CopperDroid could adopt a fuzzing-like stimulation strategy and trigger all the events that could be of interest for the analyses, ignoring information that can be extracted from the target application. Unfortunately, that would be of limited effect because of the underlying Android security model and permission system, which can instead be leveraged to carry out a fine-grained targeted stimulation strategy. To this end, CopperDroid examines each application's Manifest to extract events and permission-related information to drive the malware stimulation. Furthermore, an application could dynamically register a broadcast receiver for custom events at run-time. CopperDroid is able to intercept such operations and add a proper stimulation for the newly registered receiver.
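A Manifest-driven selection of stimuli could look like the sketch below. It assumes the Manifest has already been decoded to textual XML (inside an APK it is stored in a binary form), and the two mapping tables and the stimulus names are illustrative choices made for this example rather than CopperDroid's actual rules. Receivers registered dynamically at run-time are not covered.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical mappings from Manifest facts to stimulations.
ACTION_TO_STIMULUS = {
    "android.provider.Telephony.SMS_RECEIVED": "received_sms",
    "android.intent.action.PHONE_STATE": "incoming_call",
    "android.intent.action.BOOT_COMPLETED": "phone_reboot",
    "android.intent.action.BATTERY_LOW": "battery_status",
}
PERMISSION_TO_STIMULUS = {
    "android.permission.ACCESS_FINE_LOCATION": "location_update",
    "android.permission.RECEIVE_SMS": "received_sms",
}

def plan_stimulation(manifest_xml):
    """Return the sorted stimulus names suggested by a decoded Manifest."""
    root = ET.fromstring(manifest_xml)
    plan = set()
    for perm in root.iter("uses-permission"):
        stimulus = PERMISSION_TO_STIMULUS.get(perm.get(ANDROID_NS + "name"))
        if stimulus:
            plan.add(stimulus)
    for receiver in root.iter("receiver"):
        for action in receiver.iter("action"):
            stimulus = ACTION_TO_STIMULUS.get(action.get(ANDROID_NS + "name"))
            if stimulus:
                plan.add(stimulus)
    return sorted(plan)

SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
  <application>
    <receiver android:name=".SmsReceiver">
      <intent-filter>
        <action android:name="android.provider.Telephony.SMS_RECEIVED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>
"""

plan = plan_stimulation(SAMPLE_MANIFEST)
```

For the sample Manifest, only the location and SMS stimuli are selected; events the application cannot observe are never injected, which is the point of the targeted strategy.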

To perform its custom stimulation, CopperDroid leverages the Android emulator's capabilities to inject a number of artificial events into the emulated system. In particular, it leverages Monkeyrunner, a tool that provides an out-of-the-box API to control an Android device or emulator through the Python programming language [4].

A summary of the main events CopperDroid handles is reported in Table 1, which also shows the parameters that can be customized for each event.
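To make the parameters of these events concrete, the sketch below turns four of them into text commands for the Android emulator console. Note that this swaps in a different injection channel from the one named in the text: the paper relies on Monkeyrunner, whereas this example only formats console commands (sms send, gsm call, geo fix, power capacity). The function name, the stimulus names, and the parameter names are hypothetical.

```python
def console_command(stimulus, **params):
    """Format one emulator console command for a stimulation."""
    if stimulus == "received_sms":
        return f"sms send {params['from_number']} {params['text']}"
    if stimulus == "incoming_call":
        return f"gsm call {params['from_number']}"
    if stimulus == "location_update":
        # The console expects longitude first, then latitude.
        return f"geo fix {params['longitude']} {params['latitude']}"
    if stimulus == "battery_status":
        return f"power capacity {params['percent']}"
    raise ValueError(f"no console command known for {stimulus!r}")

commands = [
    console_command("received_sms", from_number="5551234", text="hello"),
    console_command("incoming_call", from_number="5551234"),
    console_command("location_update", longitude=-0.56, latitude=51.42),
    console_command("battery_status", percent=15),
]
```

The call duration is not expressed by a single command here; one way to model it, as an assumption, would be to issue gsm cancel for the same number after the desired delay.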

3.3 Observed Behaviors

We manually examined the results of CopperDroid's analyses (i.e., system call invocation tracking, Binder analysis, and complex Android object reconstruction) on a number of randomly selected Android malware samples extracted from the sample sets at our disposal [11, 25, 40]. Figure 2 shows the insights of our examination, which allowed us to identify six macro classes of behaviors. Each class contains one or more behavioral models, each of which is defined by a set of actions. Actions are traced through CopperDroid and can belong to any level of behavior abstraction (e.g., OS- and Android-specific behaviors).

Interestingly, some behaviors are well-known and shared with the world of non-mobile malware. Others, such as those under the “Accessing Personal Info.” class, are instead inherently specific to the mobile ecosystem.

Every terminating class in the map corresponds to a behavioral model that can be expressed by an arbitrary number of actions, depending on its specific complexity. The complexity of these elements varies greatly. Some are defined as a single system call, such as execve. Others, such as “SMS Send” or those under “Access Personal Info”, are defined as a set of transactions of the Binder protocol. Yet others are defined as multiple consecutive system calls. For instance, outgoing HTTP traffic is modeled as a graph with a connect system call, followed by an arbitrary number of send-like system calls, whose payload is parsed to detect HTTP messages, possibly interleaved with unrelated non-socket system calls.
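The HTTP model just described can be approximated by a small scan over a system call trace. The sketch below assumes a hypothetical trace format of (name, fd, payload) tuples, a fixed set of send-like calls, and a deliberately crude HTTP check on the first bytes of the payload; it illustrates the shape of the model and is not the matching logic used by CopperDroid.

```python
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ")
SEND_LIKE = {"send", "sendto", "sendmsg", "write"}

def detect_http(trace):
    """Return (fd, request line) pairs for HTTP requests found in a trace.

    trace is an iterable of (syscall_name, fd, payload) tuples in the
    order in which the calls were observed.
    """
    connected = set()  # descriptors on which connect has been seen
    hits = []
    for name, fd, payload in trace:
        if name == "connect":
            connected.add(fd)
        elif name in SEND_LIKE and fd in connected:
            if payload.startswith(HTTP_METHODS):
                request_line = payload.split(b"\r\n", 1)[0]
                hits.append((fd, request_line.decode("ascii", "replace")))
        elif name == "close":
            connected.discard(fd)
        # Any other call is unrelated and leaves the state untouched.
    return hits

trace = [
    ("open", 3, b""),
    ("connect", 5, b""),
    ("read", 3, b""),  # unrelated non-socket call, interleaved
    ("send", 5, b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"),
    ("write", 3, b"GET not-a-socket"),  # fd 3 was never connected
    ("close", 5, b""),
]
hits = detect_http(trace)
```

Only the send on the connected descriptor is reported; the write to the regular file is ignored even though its payload happens to start with an HTTP method.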

Terminating classes do not necessarily correspond to just one of the aforementioned models, but may also contain a set of them. To clarify, consider the examples shown in Figure 3.

Figure 2: Hierarchical map of behaviors.

    execve('pm', ['pm', 'install', '-r', 'New.apk'], ...);

(a) App installation via direct system call (OS-specific behavior).

    Intent intent = new Intent(Intent.ACTION_VIEW);
    intent.setDataAndType(
        Uri.fromFile(new File("/mnt/sdcard/New.apk")),
        "application/vnd.android.package-archive");
    startActivity(intent);

(b) App installation via Binder transaction (e.g., Intent; Android-specific behavior).

Figure 3: App installation via direct system call and Binder transaction.

CopperDroid recognizes the actions triggered by both of these code snippets as belonging to the class “Install APK”, even though they are very different (respectively, a system call and a Binder protocol transaction).
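How two such different actions can end up in the same class is sketched below. The dictionary-based action records and the classify function are hypothetical, introduced only for this example; the matching rules simply mirror the two snippets of Figure 3.

```python
APK_MIME_TYPE = "application/vnd.android.package-archive"

def classify(action):
    """Map one traced action to a behavior class, or None if unknown."""
    if action["kind"] == "syscall" and action["name"] == "execve":
        # OS-specific path: the package manager is executed directly.
        if action["argv"][:2] == ["pm", "install"]:
            return "Install APK"
    if action["kind"] == "binder":
        # Android-specific path: an Intent asks to view an APK archive.
        if (action.get("intent_action") == "android.intent.action.VIEW"
                and action.get("mime_type") == APK_MIME_TYPE):
            return "Install APK"
    return None

via_syscall = {
    "kind": "syscall",
    "name": "execve",
    "argv": ["pm", "install", "-r", "New.apk"],
}
via_binder = {
    "kind": "binder",
    "intent_action": "android.intent.action.VIEW",
    "mime_type": APK_MIME_TYPE,
    "data": "file:///mnt/sdcard/New.apk",
}
```

Both classify(via_syscall) and classify(via_binder) yield the same class label, even though one record comes from the system call stream and the other from a reconstructed Binder transaction.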

3.4 Resource Reconstructor

As shown in Figure 1, CopperDroid is also capable of abstracting a stream of data-dependent (i.e., related) system calls into high-level behaviors (e.g., network communications, file system access). We have given CopperDroid the additional capability of recreating the resources produced by these actions to enable more accurate and in-depth analysis. The purpose of the resource reconstructor is therefore twofold: to map a stream of related low-level events to a more meaningful high-level behavior, and to recreate resources (e.g., files, network communications) associated with an application. To this end, CopperDroid's analysis builds a system call-related data dependency graph and def-use chains. In particular, each observed system call is initially considered as an unconnected node. A forward slicing algorithm then inserts edges for every inferred dependence between two calls. As the slicing proceeds, both nodes and edges are annotated with the system call argument constraints; these annotations are essential in the creation of our def-use chains. Def-use chains, where each call is linked by def-use dependencies, are formed when the value output by one system call (the definition, e.g., open, dup, dup2) is the input value to a following (not necessarily adjacent) system call (the use, e.g., write, writev). Therefore, by building a data dependency graph over the set of observed system calls and performing forward slicing, we can recreate file system-related events and the actual resources (see footnote 2).
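A reduced version of this def-use reasoning for file descriptors can be sketched as follows. The trace format (dictionaries with name, args, and ret) and the function name are assumptions made for this example, and the sketch ignores file offsets, open flags, and argument constraints; it only links each use of a descriptor to the call that defined it and concatenates the written data per opened file.

```python
from collections import defaultdict

def reconstruct_files(trace):
    """Return ([(path, contents), ...], def-use edges) for a syscall trace."""
    fd_resource = {}   # live fd -> index of the open call that created it
    fd_def = {}        # live fd -> index of the call that defined this fd
    paths = {}         # index of open call -> path
    contents = defaultdict(bytearray)
    edges = []         # (definition index, use index)
    for i, call in enumerate(trace):
        name, args, ret = call["name"], call["args"], call["ret"]
        if name == "open" and ret >= 0:
            fd_resource[ret], fd_def[ret], paths[i] = i, i, args[0]
        elif name in ("dup", "dup2") and ret >= 0 and args[0] in fd_resource:
            edges.append((fd_def[args[0]], i))    # dup uses the old fd...
            fd_resource[ret] = fd_resource[args[0]]
            fd_def[ret] = i                       # ...and defines a new one
        elif name == "write" and args[0] in fd_resource:
            edges.append((fd_def[args[0]], i))
            contents[fd_resource[args[0]]] += args[1]
        elif name == "close":
            fd_resource.pop(args[0], None)
            fd_def.pop(args[0], None)
    files = [(paths[r], bytes(contents[r])) for r in sorted(contents)]
    return files, edges

trace = [
    {"name": "open", "args": ("/data/a.txt",), "ret": 3},
    {"name": "write", "args": (3, b"he"), "ret": 2},
    {"name": "dup", "args": (3,), "ret": 4},
    {"name": "getpid", "args": (), "ret": 100},   # unrelated call
    {"name": "write", "args": (4, b"llo"), "ret": 3},
    {"name": "close", "args": (3,), "ret": 0},
    {"name": "open", "args": ("/data/b.txt",), "ret": 3},
    {"name": "write", "args": (3, b"x"), "ret": 1},
]
files, edges = reconstruct_files(trace)
```

Keying each resource by the index of its defining open, rather than by descriptor number or path, keeps the two files apart even though descriptor 3 is reused after the close.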

4. EVALUATION

Our experimental setup is as follows. We ran an unmodified Android image on top of our CopperDroid-enhanced emulator. The system was customized to include personal information, such as contacts, SMS texts, call logs, and pictures, to mimic a real device as closely as possible. Each analyzed malware sample was installed in the emulator and traced until a timeout was reached. At the end of the analysis, a clean execution environment was restored to prevent corruptions and side-effects caused by installing more than one malware sample in the same system. To limit noisy results, each sample was executed and analyzed 6 times: thrice without stimulation and thrice with stimulation; results of single executions were then merged.

As shown in Table 2, our experiments were designed to comply as much as possible with recently presented guidelines for performing experiments on malware [30]. Please note that most of the unmet principles were out of scope for CopperDroid (e.g., no FPs or TPs are reported, as CopperDroid currently does not perform any classification or detection), while a few others simply required additional time (e.g., C.3). However, in contrast to related work (e.g., DroidScope [36], Aurasium [35], and VetDroid [37]), CopperDroid is independent from the Android system under analysis.

We evaluated CopperDroid on two publicly available data sets [11, 40] and an additional one provided by McAfee [25]; the data sets are composed of 1,226 (Genome), 395 (Contagio), and 1,365 (McAfee) samples, for a total of more than 2,900 samples.

To evaluate the effectiveness of CopperDroid's stimulation approach, we proceeded as follows. First, we analyzed all the samples without external stimulation. Then, we performed the stimulation-driven analysis of the same malware sets, as outlined in Section 3.2.

A summary of the effects of the stimulation on the three datasets is presented in Table 3; details of the analysis on the first two datasets were presented in our preliminary workshop work [29], while results on the McAfee data set are reported in Table 7 (see Appendix A).

As Table 3 shows, the results of our analysis on the new McAfee dataset are consistent with our previous and preliminary results (see Table 3 and [29]): 836 out of 1,365 (61%) McAfee samples exhibited additional behaviors (see Section 3.3 for what we consider to be a behavior) and, on average, the number of additional behaviors was roughly 6.5, compared with an average of 22.8 behaviors observed during non-stimulated executions.
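One way to read these numbers is sketched below. The sketch assumes that the runs of a sample are merged by taking the union of the observed behaviors and that the incremental behaviors are those seen only under stimulation; both are assumptions made for illustration, as the merging procedure is not detailed in the text, and the behavior labels are invented. The two percentages are computed from the McAfee figures reported above.

```python
def merge_runs(runs):
    """Merge the behaviors observed across several runs of one sample."""
    merged = set()
    for run in runs:
        merged |= set(run)
    return merged

def incremental(no_stim_runs, stim_runs):
    """Behaviors seen with stimulation but never without it."""
    return merge_runs(stim_runs) - merge_runs(no_stim_runs)

def percent(part, whole):
    return round(100.0 * part / whole, 1)

# Three runs without and three runs with stimulation (invented labels).
no_stim = [{"fs_write", "http"}, {"fs_write"}, {"http", "dns"}]
stim = [{"fs_write", "sms_access"}, {"http", "account_access"}, {"dns"}]
extra = incremental(no_stim, stim)

samples_with_increment = percent(836, 1365)   # share of McAfee samples
average_increment = percent(6.5, 22.8)        # relative average increment
```

With the figures from the text, the first ratio is about 61% and the second about 28.5%, matching the McAfee row of Table 3.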

As we will discuss in Section 5, solutions to improve code coverage may be built on top of symbolic execution [2, 9], for instance, but unfortunately they do not scale well and are ill-suited to perform large scale analyses such as those performed by CopperDroid.

Footnote 2: Our analysis retains deleted files (unlink) and multiple versions of the resources with identical file names. Also, although we focus this discussion on file system-related system calls, a similar argument holds for network-related calls.

Criterion | Ref | OK | Criterion | Ref | OK | Criterion | Ref | OK
Removed Goodware | A.1 | ✓ | Described NAT | B.5 | ✗ | Balanced Families | A.2 | ✗
Interpreted FPs/FNs | B.6 | N/A | Separated Datasets | A.3 | ✓ | Interpreted TPs | B.7 | N/A
Mitigated Artifacts | A.5 | ✓ | Used Many Families | C.1 | ✓ | Higher Privileges | A.4 | ✓
Avoided Overlays | A.6 | ✓ | Real-world FPs/TPs exp. | C.2 | N/A | Removed moot samples | C.1 | ✓
Listed Malware | B.1 | ✓ | Used multiple OSes | C.3 | ✗ | Used multiple OSes | C.3 | ✗
Listed Malware Families | B.2 | ✓ | Described Sampling | B.3 | N/A | Allowed Internet | C.5 | ✓
Mentioned OS | B.4 | ✓ | | | | | |

Table 2: Guidelines when performing malware analysis [30].

Table 4 reports the overall breakdown of the observed behaviors (see Figure 2) on the McAfee dataset. Each row identifies the class of behavior and how many samples over the total exhibited at least one occurrence of such behavior, without and with stimulation, respectively. As can be observed, the two most influenced behavioral classes are Access Personal Information and Make/Alter Call. The first is triggered by a non-negligible number of samples that receive an incoming message sent by CopperDroid's stimulation technique (and exhibit an access to the user's personal information that would otherwise remain hidden). The latter is mostly due to a set of malware that, whenever a call is received, hides its notification from the user. Table 6, instead, gives a more detailed overview of single behavioral subclasses (defined in Section 3.3) and of whether, and how, they are influenced by stimulation.

Lastly, we also ran a number of malware samples with no, selective, and full stimulation. This is to see which individual stimulus induced what amount of incremental behavior, and whether combinations of stimulations are more effective than singular triggers. Again, these stimulations are tailored to each malware sample by analyzing its Manifest to determine which triggers were possible given the permission scheme.

We deliberately chose Android malware samples that in our experiments had the highest, average, and lowest incremental behavior, both percentage-wise and amount-wise. If several families had the same maximum amount of incremental behavior, we chose the one with the highest percentage of incremental behavior, and vice versa. Lastly, we determined the best representative sample from each family based on the amount and diversity of behaviors. The results of various stimulations on these malware samples can be seen in Table 5. With the table we can begin to see correlations between different stimuli and behaviors. As Table 5 shows, our selective stimulations were able to disclose a number of additional behaviors, either previously unseen (e.g., YZHC's SMS stimulation showed access to personal account information) or already observed (e.g., SHBreak showed 113 additional generic executions).

Malware Dataset | Incr. Behav. (Samples) | Avg. Increment | Std. Dev
Genome | 752/1226 (60%) | 2.9/10.3 (28.1%) | 2.4/11.8
Contagio | 289/395 (73%) | 5.2/23.6 (22.0%) | 3.3/19.8
McAfee | 836/1365 (61%) | 6.5/22.8 (28.5%) | 9.5/30.1

Table 3: Summary of stimulation results, per dataset.

Behavior Class | No Stimulation | Stimulation
FS Access | 889/1365 (65.13%) | 912/1365 (66.81%)
Access Pers. Info. | 558/1365 (40.88%) | 903/1365 (66.15%)
Network Access | 457/1365 (33.48%) | 461/1365 (33.77%)
Exec. Ext. App. | 171/1365 (12.52%) | 171/1365 (12.52%)
Send SMS | 38/1365 (2.78%) | 42/1365 (3.08%)
Make/Alter Call | 1/1365 (0.07%) | 55/1365 (4.03%)

Table 4: Overall behavior breakdown of McAfee dataset.

4.1 Performance Evaluation

In this section we present an evaluation of CopperDroid's overhead through a number of different experiments conducted on a GNU/Linux Debian 64-bit system equipped with a 3.30GHz Intel i5 core and 3GB of RAM. Benchmarking a multi-layered system such as Android, in conjunction with a complex technique such as CopperDroid (and in an emulated environment), can be a rather complicated task. For example, traditional benchmarking suites based on measuring I/O operations are affected by the caching mechanisms of emulated environments. On the other hand, CPU-intensive benchmarks are meaningless for measuring the overhead of CopperDroid, as it mainly operates on system calls.

To address such issues, we performed two different benchmarking experiments. The first is a macrobenchmark that tests the overhead introduced by CopperDroid on common Android-specific actions, such as accessing contacts and sending SMS texts. Because such actions are performed via the Binder protocol, these tests give a good evaluation of the overhead caused by CopperDroid's Binder analysis infrastructure. The second set of experiments is a microbenchmark that measures the computational time CopperDroid needs to analyze a subset of interesting system calls.

To execute the first set of benchmarks, we created a test Android application that performs generic tasks, such as sending (SEND_SMS) and reading (SMS) texts, accessing local account information (GET_ACC), and reading all contacts (CONTACTS). We then ran the test application for 100 iterations and collected the average time required to perform these operations under 3 different settings: on an unmodified Android emulator (i.e., without CopperDroid; baseline), on a CopperDroid-enhanced emulator with CopperDroid configured to monitor the test application (the common setting when analyzing a piece of malware; CD (targeted)), and on a CopperDroid-enhanced emulator with CopperDroid configured to track system-wide events (CD (full)). Results are reported in Figure 4 (A). As can be observed, the overhead introduced by the targeted analysis is relatively low: ≈ 26%, ≈ 32%, ≈ 24%, and ≈ 20%, respectively. On the other hand, system-wide analyses increase the overhead considerably (>2x) because of the number of Android components that are concurrently analyzed.

Sample Family | Behavior Class | Behavior Subclass | Behaviors No Stim. | Incr. Behavior Type Stim. | Incr. Behavior SMS Stim. | Incr. Behavior Loc. Stim.
YZHC | Network Access | HTTP | 4 | - | - | N/A
YZHC | Network Access | DNS | 1 | - | - | N/A
YZHC | Exec External App | Generic | 3 | +10 (+433%) | - | N/A
YZHC | Exec External App | Shell | 1 | +3 (+400%) | - | N/A
YZHC | Exec External App | Priv. Escalation | 2 | - | +2 (+100%) | N/A
YZHC | Exec External App | Install APK | 4 | - | - | N/A
YZHC | Access Personal Info | Account | - | - | +1 (⊥) | N/A
YZHC | FS Access | Write | 414 | - | - | N/A
zHash | Network Access | HTTP | 2 | +2 (+100%) | +5 (+350%) | N/A
zHash | Network Access | DNS | - | - | +1 (⊥) | N/A
zHash | Exec External App | Generic | 1 | +12 (+1300%) | +3 (+400%) | N/A
zHash | Exec External App | Shell | 1 | +3 (+400%) | - | N/A
zHash | Exec External App | Priv. Escalation | 4 | - | - | N/A
zHash | Exec External App | Install APK | 4 | - | - | N/A
zHash | Access Personal Info | Account | 2 | - | - | N/A
zHash | FS Access | Write | 163 | - | +255 (+257%) | N/A
SHBreak | Network Access | HTTP | 3 | - | N/A | N/A
SHBreak | Exec External App | Generic | 2 | +113 (+5750%) | N/A | N/A
SHBreak | Exec External App | Shell | 1 | +22 (+2500%) | N/A | N/A
SHBreak | Exec External App | Install APK | 4 | +4 (+100%) | N/A | N/A
SHBreak | FS Access | Write | 195 | +353 (+281%) | N/A | N/A
Droid KungFu | Network Access | HTTP | 13 | - | N/A | -
Droid KungFu | Exec External App | Generic | 1 | +2 (+300%) | N/A | +1 (+200%)
Droid KungFu | Exec External App | Shell | 1 | - | N/A | -
Droid KungFu | Exec External App | Install APK | 4 | - | N/A | -
Droid KungFu | FS Access | Write | 3 | +197 (+6667%) | N/A | +144 (+4800%)
Fladstep | Network Access | HTTP | 15 | - | N/A | N/A
Fladstep | Exec External App | Generic | 3 | +17 (+633%) | N/A | N/A
Fladstep | Exec External App | Shell | 1 | +5 (+500%) | N/A | N/A
Fladstep | Exec External App | Install APK | 4 | - | N/A | N/A
Fladstep | FS Access | Write | 171 | +80 (+47%) | N/A | N/A

Table 5: Incremental behavior induced by various stimuli; N/A means the stimulus was not possible based on the Manifest.

The second set of experiments measures the average time CopperDroid requires to inspect a subset of interesting system calls. This experiment collected more than 150,000 system calls obtained by instructing the system to run applications subjected to arbitrary (and artificial) workloads. As tracking a system call requires intercepting its entry and exit execution points, we report such measures separately, as shown in Figure 4 (B) (the average times are 0.092ms for entry and 0.091ms for exit).

5. RELATED WORK

In this section we cite and compare against the most relevant work that we believe directly relates to CopperDroid.

DroidScope [36] is a framework to create dynamic analysis tools for Android malware that trades off simplicity and efficiency for transparency: as an out-of-the-box approach, it instruments the Android emulator, but it may incur high overhead (for instance, when taint-tracking is enabled). DroidScope leverages a 2-level VMI [17] to gather information about the system and exposes hooks and a set of APIs, which enable the development of plugins to perform both fine- and coarse-grained analyses (e.g., system call, single instruction tracing, and taint tracking). Unlike CopperDroid, DroidScope offers a set of hooks that analyses can build upon to intercept interesting events, and does not perform any behavioral analysis per se. For example, a tool leveraging DroidScope can intercept every system call executed on an Android system, but would still need to do its own VMI to inspect the parameters of each call. Following this principle, CopperDroid could have been built on top of DroidScope, but at the time we implemented it, the DroidScope framework was not publicly available. Moreover, the main focus of our research is not to illustrate how to build a framework or a clever VMI technique for Android systems, but rather to point out how a proper system call-centric analysis (which still includes a thorough analysis of the Binder-related IPC and RPC protocol, as well as automatic and seamless reconstruction of complex Android objects) and stimulation technique can comprehensively expose Android malware behaviors, as shown by our evaluation.

Behavior Class | Subclass | No Stim. | Stim.
Network Access | Generic | 483 | 489
Network Access | HTTP | 309 | 318
Network Access | DNS | 416 | 416
FS Access | Write | 889 | 912
Access Personal Info. | SMS | 32 | 266
Access Personal Info. | Phone | 510 | 559
Access Personal Info. | Accounts | 51 | 672
Access Personal Info. | Location | 143 | 147
Exec. External App. | Generic | 132 | 132
Exec. External App. | Priv. Esc. | 103 | 103
Exec. External App. | Shell | 73 | 73
Exec. External App. | Inst. APK | 8 | 8
Send SMS | - | 38 | 42
Make/Alter Call | - | 1 | 55

Table 6: Behavior breakdown of McAfee samples.

Figure 4: Binder Macrobenchmark (A) and System Call Microbenchmark (B).

Enck et al. present TaintDroid [13], a framework to enable dynamic taint analysis of Android applications. TaintDroid's main goal is to track how sensitive information flows between the system and applications, or between applications, to automatically identify leaks. Because of the complexity of the Android system, TaintDroid relies on different levels of instrumentation to perform its analyses. For example, to propagate taint information through native methods and IPC, TaintDroid patches JNI call bridges and the Binder IPC library. TaintDroid is both extremely effective, as it propagates taint between many different levels, and efficient, as it does so with a very low overhead. Unfortunately, the price to pay is low resiliency and transparency: modifying internal components of Android inevitably exposes TaintDroid to a series of detection and evasion techniques [10, 31, 32]. For instance, even applications with standard privileges can detect TaintDroid's presence by calculating checksums over instrumented and readable components, while applications that can escalate their privileges can go even further, identifying and disabling TaintDroid's hooks and analysis. Moreover, TaintDroid cannot track taintedness of native code. Furthermore, the decision to modify internal components also exposes TaintDroid to the issues deriving from constantly adapting the analysis code to a highly mutable architecture such as the Android OS.

DroidBox is a dynamic in-the-box Android malware analyzer [33] that leverages custom instrumentation of the Android system and kernel to track a sample's behavior through TaintDroid's taint-tracking of sensitive information [13]. Using TaintDroid and instrumenting Android's internal components makes DroidBox prone to the problems of in-the-box analyses: malware can detect and evade the analyses or, worse, even disable them.

Andrubis [1] is an extension to the Anubis dynamic malware analysis system to analyze Android malware [5, 19]. According to its web site, it is mainly built on top of both TaintDroid [13] and DroidBox [33] and it thus shares their weaknesses (mainly due to operating “in-the-box”). In addition, Andrubis does not perform any stimulation-based analysis, limiting its effectiveness in discovering interesting Android-specific behaviors.

DroidMOSS [39] relies on signatures for detecting malware in app markets. Similarly, DroidRanger [42] and JuxtApp [18] identify known mobile malware repackaged in different apps. Although quite successful, signature-based techniques limit detection effectiveness to known malware only. In [14], Enck et al. report on a study of Android permissions found in a large dataset of Google Play apps to understand their security characteristics. Such an understanding is an interesting starting point to bootstrap the design of techniques able to enforce security policies [35] and avoid the installation of apps requesting a dangerous combination [15] or an overprivileged set of permissions [16, 28]. Although these techniques are promising, the peculiarities of Android apps (e.g., a potential combination of Java and native code) allow apps to easily elude policy enforcement or collude to perform malicious actions while maintaining a seemingly legitimate appearance. This clearly calls for continuous research in this direction.

Aurasium [35] is a technique (and a tool) that enables dynamic and fine-grained policy enforcement of Android applications. To intercept relevant events, Aurasium instruments single applications, rather than adopting system-level hooks. Working at the application level, however, exposes Aurasium to easy detection or evasion attacks by malicious Android applications. For example, regular applications can rely on native code to detect and disable hooks in the global offset table even without privilege escalation exploits. Aurasium's authors state that their approach can prevent such attacks by intercepting the dlopen invocations needed to load native libraries. It is however unclear how benign and malicious code can be distinguished, as this policy cannot be lightly delegated to Aurasium's end-users. Conversely, CopperDroid's VMI-based system call-centric analysis is resilient to such evasions.

Google Bouncer [24], as its name suggests, is a service that “bounces” malicious applications off the official Google Play market. Little is known about it, except that it is a QEMU-based dynamic analysis framework. All other information comes from reverse-engineering attempts [27], and it is thus impossible to compare it against our approach.

SmartDroid [38] leverages a hybrid analysis that statically identifies paths that lead to suspicious actions (e.g., accessing sensitive data) and dynamically determines the UI elements that take the execution flow down the paths identified by the static analysis. To this end, the authors instrument both the Android emulator and Android's internal components to infer which UI elements can trigger suspicious behaviors. They furthermore evaluate SmartDroid on a testbed of 7 different malware samples. Unfortunately, SmartDroid is vulnerable to obfuscation and reflection, which make it hard, if not impossible, to statically determine every possible execution path. Conversely, CopperDroid's dynamic analysis is resilient to static obfuscation and reflection attempts.

Anand et al. propose ACTEve [2], an algorithm that leverages concolic execution to automatically generate input events for smartphone applications. ACTEve is fully automatic: it does not require a learning phase (as capture-and-replay approaches do) and uses novel techniques to prevent the path-explosion problem.

VetDroid is a dynamic analysis platform for reconstructing sensitive behaviors in Android apps from a permission use perspective [37]. Zhang et al. point out that traditional system call analysis is ill-suited to characterize the behaviors of Android apps, as it misses high-level Android-specific semantics and fails at reconstructing IPC and RPC interactions. In contrast, we have shown that CopperDroid's unified system call-centric analysis is able to automatically and seamlessly reconstruct IPC and RPC interactions as well as complex Android objects, generating insightful behavioral profiles.

We acknowledge that although CopperDroid's stimulations are appropriate, its approach is a best-effort attempt and could benefit from state-of-the-art techniques. We must however keep in mind the overhead such techniques may introduce. For instance, the aforementioned works [2, 38] do not seem to pay much attention to performance issues. SmartDroid [38] reports no overhead measurements, and the average running time of ACTEve, as reported in [2], falls within the range of hours, which makes it ill-suited to automated large scale analyses.

Availability

CopperDroid is available at http://copperdroid.isg.rhul.ac.uk, where users can submit samples to be analyzed. Analysis results include behavioral profiles (in both HTML and JSON formats, for easy parsing) and ancillary information (e.g., generated network traffic).

6. CONCLUSION

In this paper we proposed CopperDroid, a VM-based dynamic system call-centric analysis and stimulation technique to uniformly, and automatically, reconstruct the behaviors (OS-specific and Android-specific) of Android malware. Contrary to recent claims [36, 37], CopperDroid's automatic and seamless reconstruction of IPC and RPC interactions as well as Android-specific objects (and thus behaviors) opens up the possibility to reconsider rich and unified system call-based approaches as effective techniques to mitigate Android malware threats.

We furthermore evaluated the effectiveness and performance of our analysis on a large data set of more than 2,900 real-world Android malware samples. Results also show how a proper external stimulation can influence the analysis and lead to the discovery of additional behaviors.

7. REFERENCES[1] Andrubis: A tool for analyzing unknown android

applications. http://anubis.iseclab.org/.

[2] S. Anand, M. Naik, H. Yang, and M. Harrold.Automated concolic testing of smartphone apps. InProc. of FSE, 2012.

[3] Android. Android developer reference. http://developer.android.com/reference/packages.html.

[4] Android. Monkeyrunner. http://developer.android.com/tools/help/monkeyrunner_concepts.html.

[5] U. Bayer, C. Kruegel, and E. Kirda. Ttanalyze: A toolfor analyzing malware. In Proc. of EICAR, 2006.

[6] F. Bellard. QEMU, a fast and portable dynamictranslator. In Proc. of USENIX ATC, 2005.

[7] D. Bornstein. Dalvik VM internals. In Google I/O,2008.

[8] D. Brumley, C. Hartwig, Z. Liang, J. Newsome,D. Song, and H. Yin. Automatically identifyingtrigger-based behavior in malware. Botnet Detection,2008.

[9] C. Cadar, D. Dunbar, and D. R. Engler. Klee:Unassisted and automatic generation of high-coveragetests for complex systems programs. In OSDI, 2008.

[10] L. Cavallaro, P. Saxena, and R. Sekar. On the limits ofinformation flow techniques for malware analysis andcontainment. In DIMVA, 2008.

[11] Contagio Mobile. Mila Parkour.http://contagiominidump.blogspot.com.

[12] D. Desai. Malware Analysis Report: Trojan:AndroidOS/Zitmo, Semptember 2011.http://www.kindsight.net/sites/default/files/

android_trojan_zitmo_final_pdf_17585.pdf.

[13] W. Enck, P. Gilbert, B. Chun, L. Cox, J. Jung,P. McDaniel, and A. Sheth. Taintdroid: aninformation-flow tracking system for realtime privacymonitoring on smartphones. In USENIX OSDI, 2010.

[14] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri.A study of android application security. In USENIXSecurity, 2011.

[15] W. Enck, M. Ongtang, and P. McDaniel. Onlightweight mobile phone application certification. InACM CCS, 2009.

[16] A. P. Felt, E. Chin, S. Hanna, D. Song, andD. Wagner. Android permissions demystified. In Proc.of CCS, 2011.

[17] T. Garfinkel and M. Rosenblum. A Virtual MachineIntrospection Based Architecture for IntrusionDetection. In Proc. of NDSS, 2003.

[18] S. Hanna, L. Huang, E. Wu, S. Li, C. Chen, andD. Song. Juxtapp: A scalable system for detectingcode reuse among android applications. In Proc. ofDIMVA, 2012.

[19] Iseclab. Anubis. http://anubis.iseclab.org.

[20] J. Kennedy. Revenues from app downloads to reachus$26bn in 2013, Sep 2013.

[21] R. King. Google readies android ’kitkat’ amid 1 billiondevice activations milestone, Sep 2013.

[22] T. Kuittinen. Google play app revenue rockets to morethan half of ios in august, Sep 2013.

[23] A. Lanzi, D. Balzarotti, C. Kruegel,M. Christodorescu, and E. Kirda. AccessMiner: Using

Page 13: On the Reconstruction of Android Malware Behaviors · More recently, Yan and Yin proposed DroidScope, a general-purpose VM-based out-of-the-box framework to build dy-namic analysis

system-centric models for malware protection. In Proc.of CCS, 2010.

[24] H. Lockheimer. Bouncer. http://googlemobile.blogspot.it/2012/02/android-and-security.html.

[25] McAfee. Mcafee. http://www.mcafee.com.

[26] A. Moser, C. Kruegel, and E. Kirda. Exploring multiple execution paths for malware analysis. In Proc. of the IEEE Symposium on Security and Privacy, 2007.

[27] J. Oberheide and C. Miller. Dissecting the Android’s Bouncer. SummerCon, 2012. http://jon.oberheide.org/files/summercon12-bouncer.pdf.

[28] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, C. Nita-Rotaru, and I. Molloy. Using probabilistic generative models for ranking risks of android apps. In ACM CCS, 2012.

[29] Redacted. Redacted for blind submission. Redacted.

[30] C. Rossow, C. J. Dietrich, C. Grier, C. Kreibich, V. Paxson, N. Pohlmann, H. Bos, and M. van Steen. Prudent practices for designing malware experiments: Status quo and outlook. In IEEE S&P, 2012.

[31] G. Sarwar, O. Mehani, R. Boreli, and M. A. Kaafar. On the effectiveness of dynamic taint analysis for protecting against private information leaks on Android-based devices. In SECRYPT, July 2013.

[32] A. Slowinska and H. Bos. Pointless tainting?: evaluating the practicality of pointer tainting. In W. Schroder-Preikschat, J. Wilkes, and R. Isaacs, editors, EuroSys, pages 61–74. ACM, 2009.

[33] The Honeynet Project. Droidbox. https://code.google.com/p/droidbox/.

[34] C. Willems, T. Holz, and F. Freiling. Toward automated dynamic malware analysis using cwsandbox. IEEE S&P, 2007.

[35] R. Xu, H. Saïdi, and R. Anderson. Aurasium: Practical policy enforcement for android applications. In Proc. of USENIX Security, 2012.

[36] L.-K. Yan and H. Yin. DroidScope: Seamlessly Reconstructing OS and Dalvik Semantic Views for Dynamic Android Malware Analysis. In Proc. of USENIX Security, 2012.

[37] Y. Zhang, M. Yang, B. Xu, Z. Yang, G. Gu, P. Ning, X. S. Wang, and B. Zang. Vetting undesirable behaviors in android apps with permission use analysis. In ACM CCS, 2013.

[38] C. Zheng, S. Zhu, S. Dai, G. Gu, X. Gong, X. Han, and W. Zou. SmartDroid: an automatic system for revealing UI-based trigger conditions in Android applications. In Proc. of SPSM, 2012.

[39] W. Zhou, Y. Zhou, X. Jiang, and P. Ning. Detecting repackaged smartphone applications in third-party android marketplaces. In Proc. of CODASPY, 2012.

[40] Y. Zhou and X. Jiang. Android Malware Genome Project. http://www.malgenomeproject.org/.

[41] Y. Zhou and X. Jiang. Dissecting android malware: Characterization and evolution. In Proc. of the IEEE Symposium on Security and Privacy, 2012.

[42] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang. Hey, you, get off of my market: Detecting malicious apps in official and alternative android markets. In Proc. of NDSS, 2012.

APPENDIX

A. OVERALL RESULTS ON MCAFEE DATASET


Malware         Samples w/       Behavior      Incr. Behavior
Family          Add. Behaviors   w/o Stim.     w/ Stim.
-------------------------------------------------------------
Ackposts        1/1              4             +3 (+75%)
Actrack         1/1              4             +1 (+25%)
AndroidSMS      2/2              0             +1 (⊥)
Anserver        13/21            16.48         +5.2 (+32%)
ApkMon          1/2              49            +1 (+2%)
AppHnd          4/4              37.25         +16.8 (+45%)
AreSpy          1/1              11            +6 (+55%)
Arspam          1/1              3             +2 (+67%)
BackReg         1/1              78            +12 (+15%)
Backscript      2/6              9.67          +19.5 (+202%)
BaseBridge      10/12            4.5           +3.3 (+73%)
Bgyoulu         3/5              17.6          +4 (+23%)
BookFri         1/1              15            +4 (+27%)
Carotap         2/2              4             +3 (+75%)
Coolpaperleek   1/1              55            +4 (+7%)
Crusewin        4/4              6.25          +8.5 (+136%)
Dialer          0/1              1             +0 (+0%)
DiutesEx        23/43            26.58         +8.9 (+33%)
DIYAds          18/18            163.72        +37.6 (+23%)
DougaLeaker     16/16            4             +1.6 (+40%)
Drad            5/5              10.6          +6 (+57%)
Drd.*           30/32            24.74         +7.55 (+31%)
DroidDeluxe     1/1              9             +1 (+11%)
DroidKungFu     63/85            31.02         +6.1 (+20%)
DropDialer      2/11             0             +1.5 (⊥)
Ecobatry        1/1              25            +1 (+4%)
EICAR           0/2              1.5           +0 (+0%)
Enesoluty       1/1              11            +2 (+18%)
EvoRoot         0/1              0             +0 (⊥)
Fake.*          314/677          6.39          +5.69 (+89%)
Fladstep        1/1              176           +80 (+45%)
FlashRec        1/2              8             +3 (+38%)
FndNCll         1/1              36            +2 (+6%)
Foncy           2/2              1             +4 (+400%)
FoncyDropper    1/1              23            +1 (+4%)
FrictSpy        8/9              7.56          +10 (+132%)
Frogonal        2/2              27.5          +2.5 (+9%)
Frutk           1/1              73            +17 (+23%)
FunsBot         2/2              5             +2 (+40%)
Gamex           1/1              11            +2 (+18%)
GamexDropper    1/1              8             +1 (+13%)
Geinimi         11/19            23.68         +12.4 (+52%)
GGeeGame        1/1              62            +6 (+10%)
GoldDream       7/8              31.12         +9.9 (+32%)
GoldenEagle     1/1              0             +7 (⊥)
GoneSixty       11/11            16.64         +5.5 (+33%)
GpsNake         0/1              1             +0 (+0%)
HippoSMS        1/1              16            +4 (+25%)
Hnway           0/1              49            +0 (+0%)
Imlog           5/6              19            +9.2 (+48%)
IMWebViewer     1/1              94            +11 (+12%)
InstBBridge     0/1              9             +0 (+0%)
J               7/13             30.96         +3.65 (+12%)
Jifake          1/5              1             +4 (+400%)
Jmsonez         2/2              11.5          +12 (+104%)
LdBolt          8/8              46.62         +7.8 (+17%)
LoggerKid       4/4              4.5           +2 (+44%)
Logkare         0/1              0             +0 (⊥)
LoveTrp         1/1              5             +6 (+120%)
LVedu           33/56            26.93         +5.2 (+19%)
Maistealer      1/1              8             +1 (+13%)
Malebook        1/1              94            +14 (+15%)
Mania           1/2              0.5           +2 (+400%)
MarketPay       1/1              98            +7 (+7%)
Mob.*           11/11            43.67         +9.75 (+22%)
Moghava         1/1              0             +2 (⊥)
MoneyFone       1/1              0             +3 (⊥)
Nandrobox       1/1              0             +4 (⊥)
Netisend        1/1              8             +4 (+50%)
NickiSpy        2/2              71            +10.5 (+15%)
NotCompatible   0/1              7             +0 (+0%)
Nyearleaker     1/1              23            +5 (+22%)
OneClickFraud   22/22            16.27         +17.2 (+106%)
PdaSpy          1/4              0             +1 (⊥)
PJApps          36/39            27.41         +6.1 (+22%)
Qicsomos        0/1              15            +0 (+0%)
QieTing         1/1              0             +4 (⊥)
QuoteDoor       0/1              6             +0 (+0%)
RecCaller       1/1              2             +4 (+200%)
RootSmart       2/2              17            +9 (+53%)
RuFraud         4/6              4.5           +5 (+111%)
SGSpy           1/1              60            +39 (+65%)
SGSpyAct        0/1              0             +0 (⊥)
ShdBreak        0/1              28            +0 (+0%)
SilentWap       3/3              2             +5 (+250%)
SMS.*           16/21            4.77          +8.59 (+180%)
Sngo            1/1              65            +2 (+3%)
Spitmo          2/2              0             +9 (⊥)
SpyBubb         2/2              25.5          +20 (+78%)
Spytrack        1/1              20            +8 (+40%)
Stamper         1/1              63            +7 (+11%)
SteamyScr       2/2              25.5          +8.5 (+33%)
Steek           15/15            8.4           +2.1 (+25%)
Stiniter        0/1              3             +0 (+0%)
Sumzand         0/3              7             +0 (+0%)
SusetupTool     0/1              0             +0 (⊥)
Sxjspy          1/1              24            +4 (+17%)
TattoHack       1/2              6             +1 (+17%)
Tcent           1/1              0             +17 (⊥)
ToorKing        1/1              37            +6 (+16%)
ToorSatp        3/8              7.5           +1.3 (+17%)
Toplank         6/9              37.44         +6 (+16%)
Twikabot        1/1              0             +12 (⊥)
TypStu          4/6              0.83          +1 (+120%)
UranaiCall      1/1              51            +13 (+25%)
VDLoader        10/10            43.7          +8.8 (+20%)
Vidro           1/1              58            +16 (+28%)
Voldbrk         9/17             48.82         +1.2 (+2%)
WalkTxt         1/1              14            +2 (+14%)
Wapaxy          2/2              0             +9 (⊥)
Woobooleaker    1/1              5             +2 (+40%)
XanitreSpy      9/9              27.11         +5.9 (+22%)
XobSms          1/1              28            +15 (+54%)
YiCha           10/10            21.5          +4.6 (+21%)
Zitmo           3/3              2.67          +5.7 (+213%)
-------------------------------------------------------------
Overall         836/1365         22.78         +6.54 (+28.7%)

Table 7: Results of the stimulation. The first column reports the malware family; the second column reports the number of samples that exhibited additional behaviors over the total number of samples belonging to the same family; the third column reports the average number of observed behaviors without stimulation; and the last column reports the average number of additional behaviors exhibited by stimulated samples and their percentage over non-stimulated behaviors.
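
The percentage in the last column of Table 7 appears to be the average number of additional behaviors divided by the average number of behaviors observed without stimulation. The following Python sketch reproduces that arithmetic for a few rows. The function names are hypothetical, and reading ⊥ as "undefined because the non-stimulated baseline is zero" is an assumption inferred from the rows in which it appears, not something the table states.

```python
from typing import Optional


def percent_increase(baseline: float, additional: float) -> Optional[float]:
    """Additional behaviors as a percentage of the non-stimulated baseline.

    Returns None when the baseline is zero (assumed to correspond to the
    ⊥ symbol in Table 7, since the ratio is undefined).
    """
    if baseline == 0:
        return None
    return 100.0 * additional / baseline


def format_increment(baseline: float, additional: float) -> str:
    """Render a last-column cell in the style used by Table 7."""
    pct = percent_increase(baseline, additional)
    if pct is None:
        return f"+{additional:g} (⊥)"
    return f"+{additional:g} (+{pct:.0f}%)"


if __name__ == "__main__":
    # (family, behaviors w/o stimulation, additional behaviors w/ stimulation)
    rows = [
        ("Ackposts", 4, 3),
        ("Foncy", 1, 4),
        ("GoldenEagle", 0, 7),
        ("Backscript", 9.67, 19.5),
    ]
    for family, baseline, additional in rows:
        print(f"{family:<14}{format_increment(baseline, additional)}")
```

Because the table reports per-family averages rounded to two decimals, recomputing a percentage from the printed values can differ slightly from the published figure for some families.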