MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Feb 08, 2017

Page 1: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Page 2: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Android & Malware

Android market share is growing…
  - In 2016, 85% of smartphone sales

…and so is the interest of cybercriminals
  - Bypassing two-factor authentication
  - Stealing sensitive information, etc.

Page 3: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Current Defenses

Can’t use complex on-device operations
  - Limited battery and memory resources

Google’s centralized analysis
  - Not perfect, after-the-fact
  - Many apps installed outside Play Store

Lots of research in the field! However…
  - Permission-based models are prone to false positives
  - Relying on API calls frequently used by malware needs constant, costly retraining

Page 4: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Our Idea

Rely on the sequence of abstracted calls
1. Sequence captures the behavioral model
2. Abstraction provides resilience to API changes

Intuition: malware uses calls for different actions, and in a different order, than benign apps

E.g., android.media.MediaRecorder is used by any app with permission to record audio
  - Using it only after calls to getRunningTasks(), which makes it possible to record conversations, may suggest maliciousness

Page 5: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Overview


Page 6: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Call Graph Extraction

Based on static analysis
  - Given an APK, extract call graphs

Tools
  - Soot (Java optimization and analysis framework)
  - FlowDroid (ensures contexts & flows are preserved)

Page 7: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models


Page 8: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Call Graph


Page 9: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Overview


Page 10: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Sequence Extraction

Soot gives the sequence of functions that are potentially called by the program, but…

Each execution could take a specific branch of the graph and only execute a subset of the calls

When running the example multiple times…
  - Execute() may be followed by different calls, e.g., getShell() only in the try block, or getShell() + getMessage() in the catch block

Page 11: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Sequence Extraction (cont’d)

We proceed as follows…
1. Identify the set of entry nodes
2. Enumerate the reachable paths
3. Output the set of all paths as the sequences of API calls
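The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the call graph is a hypothetical dictionary from caller to callees, "entry node" is taken to mean a node that nothing else calls, and every method name in the toy graph is made up.

```python
from typing import Dict, List, Set

# Hypothetical call graph: caller -> list of callees (names are illustrative only).
CallGraph = Dict[str, List[str]]


def entry_nodes(graph: CallGraph) -> List[str]:
    """Step 1 (assumption): entry nodes are nodes without incoming edges."""
    called: Set[str] = {callee for callees in graph.values() for callee in callees}
    return sorted(node for node in graph if node not in called)


def enumerate_paths(graph: CallGraph, start: str) -> List[List[str]]:
    """Step 2: list every path from start to a leaf, visiting a node once per path."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        # Skip callees already on this path so recursive calls cannot loop forever.
        successors = [n for n in graph.get(node, []) if n not in path]
        if not successors:
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path)

    walk(start, [])
    return paths


def extract_sequences(graph: CallGraph) -> List[List[str]]:
    """Step 3: output the set of all paths as call sequences."""
    sequences: List[List[str]] = []
    for entry in entry_nodes(graph):
        sequences.extend(enumerate_paths(graph, entry))
    return sequences


# Toy graph: execute() may be followed by either of two calls.
toy_graph: CallGraph = {
    "com.example.Main.onCreate()": ["com.example.Root.execute()"],
    "com.example.Root.execute()": ["getShell()", "getMessage()"],
    "getShell()": [],
    "getMessage()": [],
}
sequences = extract_sequences(toy_graph)
```

Each branch of the graph becomes its own sequence, so the toy graph yields one path ending in getShell() and one ending in getMessage().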

Page 12: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Abstraction

Packages
  - Using the list of 243 packages (as of API level 24) + 95 from the Google API
  - Packages defined by developers → “self-defined”
  - If we can’t tell what its class implements → “obfuscated”

Families
  - 9 families: android, google, java, javax, xml, apache, junit, json, dom
  - Plus self-defined and obfuscated
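A rough Python sketch of the two abstraction modes follows. The nine family names are taken from the slide; everything else is an assumption made for illustration: the prefixes each family is matched against, the four-entry package list (a stand-in for the 243 + 95 packages), and the short-identifier rule used to flag obfuscated calls.

```python
from typing import Tuple

# Prefix -> family. The family names come from the slide; the prefixes
# they are matched against are an assumption made for this sketch.
FAMILY_PREFIXES: Tuple[Tuple[str, str], ...] = (
    ("android.", "android"),
    ("com.google.", "google"),
    ("java.", "java"),
    ("javax.", "javax"),
    ("org.xml.", "xml"),
    ("org.apache.", "apache"),
    ("junit.", "junit"),
    ("org.json.", "json"),
    ("org.w3c.dom.", "dom"),
)

# Tiny illustrative subset; the slide refers to 243 Android + 95 Google packages.
KNOWN_PACKAGES: Tuple[str, ...] = ("android.media", "android.app", "java.lang", "java.io")


def _unknown_label(call: str) -> str:
    """Label a call outside the known lists as obfuscated or self-defined."""
    qualifier = call.split(".")[:-1]  # package and class parts, method name dropped
    # Placeholder heuristic (hypothetical): very short identifiers look mangled.
    if qualifier and all(len(part) <= 2 for part in qualifier):
        return "obfuscated"
    return "self-defined"


def abstract_to_family(call: str) -> str:
    """Family mode: map a fully qualified call to one of the 11 family states."""
    for prefix, family in FAMILY_PREFIXES:
        if call.startswith(prefix):
            return family
    return _unknown_label(call)


def abstract_to_package(call: str) -> str:
    """Package mode: map a call to its known package, else to a fallback label."""
    matches = [pkg for pkg in KNOWN_PACKAGES if call.startswith(pkg + ".")]
    # Prefer the longest (most specific) matching package name.
    return max(matches, key=len) if matches else _unknown_label(call)
```

With these tables, android.media.MediaRecorder.start() abstracts to "android" in family mode and "android.media" in package mode, while a call such as a.b.c() falls through to "obfuscated".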

Page 13: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Example


Page 14: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Overview


Page 15: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Markov Chain

Memoryless models
  - The probability of transitioning from one state to another depends only on the current state

Represented as a set of nodes, each corresponding to a different state, and a set of edges labeled with the probability of transition

The probabilities on all edges leaving any node sum to exactly 1
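In symbols, the two properties above can be written as follows. The notation is ours rather than the slides': $X_t$ is the state at step $t$, $s_i$ and $s_j$ are states, and $p_{ij}$ is the label on the edge from $s_i$ to $s_j$.

```latex
\Pr(X_{t+1} = s_j \mid X_t = s_i, X_{t-1}, \dots, X_0)
  = \Pr(X_{t+1} = s_j \mid X_t = s_i) = p_{ij},
\qquad
\sum_{j} p_{ij} = 1 \quad \text{for every state } s_i .
```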

Page 16: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Markov-chain based modeling

Building the Markov Chains
  - From the sequence of abstracted API calls: each package/family is a state, and each transition is labeled with the probability of moving from one state to another
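One way to build such a chain is sketched below in Python. It assumes the transition probability is estimated as a relative frequency (the number of times one state follows another, divided by the number of transitions leaving the first state); the input sequences are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Chain = Dict[Tuple[str, str], float]


def build_markov_chain(sequences: Iterable[List[str]]) -> Chain:
    """Estimate transition probabilities from sequences of abstracted calls."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # Count every consecutive (source, destination) pair in the sequence.
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1

    chain: Chain = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        for dst, n in outgoing.items():
            # Normalise so the edges leaving each state sum to 1.
            chain[(src, dst)] = n / total
    return chain


# Invented family-mode sequences.
abstracted = [
    ["self-defined", "java", "android"],
    ["self-defined", "java", "java"],
    ["self-defined", "android"],
]
chain = build_markov_chain(abstracted)
```

States that never appear as the source of a transition simply have no outgoing edges in this sketch.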

Page 17: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Feature Extraction

For each app:
  - Feature vector = probabilities of transitioning from one state to another in the Markov chain
  - With families, 11 possible states → 121 possible transitions in each chain
  - With packages, 340 states → 115,600 transitions

Principal Component Analysis (PCA)
  - Standard way to reduce/refine features
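The sketch below shows how a chain could be flattened into a fixed-length vector in family mode. The ordering of the states and the example chain are assumptions made for illustration; transitions that never occur are filled with 0. PCA is not shown.

```python
from typing import Dict, List, Sequence, Tuple

# The 9 families plus self-defined and obfuscated; this ordering is arbitrary.
FAMILY_STATES: List[str] = [
    "android", "google", "java", "javax", "xml", "apache",
    "junit", "json", "dom", "self-defined", "obfuscated",
]


def feature_vector(
    chain: Dict[Tuple[str, str], float],
    states: Sequence[str] = FAMILY_STATES,
) -> List[float]:
    """Flatten a chain into one probability per ordered (source, destination) pair."""
    return [chain.get((src, dst), 0.0) for src in states for dst in states]


# Hypothetical chain for one app.
example_chain = {
    ("self-defined", "java"): 1.0,
    ("java", "android"): 0.5,
    ("java", "java"): 0.5,
}
vec = feature_vector(example_chain)
```

Here `len(vec)` is 11 × 11 = 121; the same function applied to 340 package states would produce 340 × 340 = 115,600 entries, which is where dimensionality reduction becomes useful.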

Page 18: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Overview


Page 19: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Classification

Build a classifier using the extracted features
  - Each app labeled as benign or malware

Can use a few standard algorithms for this task…
  - Random Forests
  - 1-NN, 3-NN
  - SVM
  - Maybe deep learning?
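As an illustration of the simplest option in the list, here is a minimal k-nearest-neighbours sketch in plain Python. It assumes Euclidean distance and majority voting, and the two-dimensional training vectors are toy values, not real transition probabilities or results.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_predict(train: Sequence[Sample], x: Sequence[float], k: int = 1) -> str:
    """Label x by majority vote among its k nearest training samples."""
    # math.dist computes the Euclidean distance (Python 3.8+).
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical values).
train: List[Sample] = [
    ([0.9, 0.1], "benign"),
    ([0.8, 0.2], "benign"),
    ([0.1, 0.9], "malware"),
    ([0.2, 0.8], "malware"),
    ([0.3, 0.7], "malware"),
]

label_1nn = knn_predict(train, [0.85, 0.15], k=1)
label_3nn = knn_predict(train, [0.25, 0.75], k=3)
```

With two classes, an odd k avoids tied votes, which is one reason 1-NN and 3-NN are natural choices.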

Page 20: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Datasets


Page 21: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

How many API calls?


Page 22: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Android/Google family calls?


Page 23: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Evaluation

(1) Accuracy of classification on benign and malicious samples developed around the same time

(2) Robustness to the evolution of malware as well as of the Android framework (using older datasets for training and newer ones for testing, and vice versa)

Page 24: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Same Year

[Figure: results in family mode and package mode]

Page 25: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Training on older samples

[Figure: results in family mode and package mode]

Page 26: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Training on newer samples

[Figure: results in family mode and package mode]

Page 27: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

MaMaDroid vs DroidAPIMiner


Page 28: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Case Studies (2016/newbenign)

False Positives (164 samples)
  - Most of them involve “dangerous permissions”
  - E.g., SMS permissions requested for no clear reason

False Negatives (114 samples)
  - Not actually classified as malware by VirusTotal; might be legitimate
  - Most of them adware

Page 29: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Evasion

Repackaging benign apps
  - Difficult to embed malicious code while keeping a similar Markov chain; vice versa is also hard

Imitating Markov chains
  - Likely ineffective

Obfuscation/Mangling
  - Still captured by the [obfuscated] abstraction

More in the paper…

Page 30: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Limitations

Classification is memory hungry

Soot is buggy, we lose ~4% of the samples

Limits of static-analysis-only methods

Page 31: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Future Work

Further investigate resilience to evasion
  - Focus on repackaged malicious apps
  - Injection of API calls to mess with Markov chains

Enhancements
  - Fine-grained abstractions (e.g., class)
  - Seed with dynamic analysis

Page 32: MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models

Thank you!

Paper to appear at NDSS 2017:
E. Mariconti, L. Onwuzurike, P. Andriotis, E. De Cristofaro, G. Ross, G. Stringhini.
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models
