Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of Computer Science

Jul 19, 2020

Statistical Deobfuscation for Android Applications

Benjamin Bichsel

Veselin Raychev

Petar Tsankov

Martin Vechev

Department of Computer Science

Why De-obfuscate?

Google Play

Android binaries (APKs) (no code available)

Number of APKs on Google Play: 2.4M APKs

[Chart: number of APKs on Google Play by year, 2010 to 2016]

Layout Obfuscation in Android

Names provide key semantic information

package com.example.dbhelper;

class DBHelper extends SQLiteHelper {
  SQLiteDatabase db;

  public DBHelper(Context ctx) {
    db = getWritableDatabase();
  }

  Cursor execSQL(String str) {
    return db.rawQuery(str);
  }
}

Obfuscate

Non-descriptive names

package a.b.c;

class a extends SQLiteHelper {
  SQLiteDatabase b;

  public a(Context ctx) {
    b = getWritableDatabase();
  }

  Cursor c(String str) {
    return b.rawQuery(str);
  }
}

Some names remain
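The renaming shown above can be illustrated with a minimal sketch. It assumes a very simplified obfuscator that hands out short names in order (a, b, c, ...) and leaves a fixed set of framework names untouched. The class `LayoutObfuscatorSketch`, its methods, and the chosen keep set are hypothetical; the slides do not describe how ProGuard actually assigns names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of layout obfuscation; not ProGuard's real algorithm.
final class LayoutObfuscatorSketch {

    /** Returns the index-th short name in the sequence a, b, ..., z, aa, ab, ... */
    static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        for (int i = index; i >= 0; i = i / 26 - 1) {
            sb.insert(0, (char) ('a' + i % 26));
        }
        return sb.toString();
    }

    /** Maps each name to itself if it must be kept, otherwise to the next short name. */
    static Map<String, String> rename(List<String> names, Set<String> keep) {
        Map<String, String> mapping = new LinkedHashMap<>();
        int next = 0;
        for (String name : names) {
            if (!mapping.containsKey(name)) {
                mapping.put(name, keep.contains(name) ? name : shortName(next++));
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Names taken from the slide example; the keep set is an assumption.
        List<String> names = List.of("DBHelper", "SQLiteHelper", "SQLiteDatabase",
                "db", "getWritableDatabase", "execSQL");
        Set<String> keep = Set.of("SQLiteHelper", "SQLiteDatabase", "getWritableDatabase");
        rename(names, keep).forEach((from, to) -> System.out.println(from + " -> " + to));
    }
}
```

With these inputs the application names DBHelper, db, and execSQL become a, b, and c, while the kept framework names map to themselves, which mirrors the "some names remain" observation.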

Security Challenges

Code Inspection

Third-party Library Detection

… many others

Can we reverse layout obfuscation?

Yes, with roughly 80% accuracy!

www.apk-deguard.com

www.apk-deguard.com

Released last week; so far: >5K users, >5 GB of APKs

[Screenshots: Reddit posts/comments and tweets]

How Does DeGuard Work?

DeGuard: System Overview

Learning Phase

Open-source, unobfuscated applications → Static analysis → Semantic representation → Training → Probabilistic model $P(O \mid K)$

Prediction Phase

Obfuscated Code → Static analysis → MAP Inference → Transform → De-obfuscated Code

Obfuscated Code

class a extends SQLiteHelper {
  SQLiteDatabase b;
  public a(Context ctx) {
    b = getWritableDB();
  }
}

De-obfuscated Code

class DBHelper extends SQLiteHelper {
  SQLiteDatabase db;
  public DBHelper(Context ctx) {
    db = getWritableDB();
  }
}

Probabilistic Graphical Models

Probabilistic Graphical Models

class a extends SQLiteHelper {
  SQLiteDatabase b;
  public a(Context ctx) {
    b = getWritableDB();
  }
}

Dependency graph (nodes and labeled edges):

  SQLiteHelper  --extends--   a
  getWritableDB --gets--      b
  a             --field-in--  b

O: Unknown variables (a, b)
K: Known variables (SQLiteHelper, getWritableDB)
f1, f2, …: Feature functions

Features for the edge between SQLiteHelper and a:

      name1           name2       weight
f1    SQLiteHelper    DBUtils     0.3
f2    SQLiteHelper    DBHelper    0.2

Features for the edge between getWritableDB and b:

      name1           name2       weight
f3    getWritableDB   db          0.7
f4    getWritableDB   instance    0.4

Features for the edge between a and b:

      name1           name2       weight
f5    DBUtils         instance    0.5
f6    DBHelper        db          0.4
f7    …               …           …

Graph + features define a probabilistic graphical model:

\[
P(O \mid K) = P(a, b \mid \mathit{SQLiteHelper}, \mathit{getWritableDB})
= \frac{1}{Z} \exp\bigl(0.3 \cdot f_1(\mathit{SQLiteHelper}, a) + 0.2 \cdot f_2(\mathit{SQLiteHelper}, a) + \cdots\bigr)
\]
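For reference, the example above is an instance of a general log-linear form, sketched below. The symbols $w_i$ (feature weights), $n$ (number of features), and the explicit definition of the normalization constant $Z(k)$ over the set $\Omega$ of candidate assignments are notation chosen here for illustration; the slide itself only shows $O$, $K$, $f_i$, and $Z$.

```latex
P(O = o \mid K = k) = \frac{1}{Z(k)} \exp\Bigl(\sum_{i=1}^{n} w_i \, f_i(o, k)\Bigr),
\qquad
Z(k) = \sum_{o' \in \Omega} \exp\Bigl(\sum_{i=1}^{n} w_i \, f_i(o', k)\Bigr)
```

In the example, $w_1 = 0.3$ and $w_2 = 0.2$, and each $f_i$ can be read as an indicator that equals 1 when the two connected nodes carry the names listed in its table row and 0 otherwise.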

Next: How are the weights and features learned?

Learning

Unobfuscated APKs → Static analysis → Dependency graphs → Feature templates → Features (with candidate names) → Train Model

Dependency graphs: actual graphs have >1,000 nodes

Feature templates: 28 templates

[Counts whose labels did not survive extraction: >2,000; >100,000]

Features (with candidate names), before training:

      name1           name2
f1    SQLiteHelper    DBUtils
f2    SQLiteHelper    DBHelper
f3    getWritableDB   db
f4    getWritableDB   instance
f5    DBUtils         instance
f6    DBHelper        db
f7    …               …

Train Model: compute weights that maximize $P(O = o_j \mid K = k_j)$ for all training samples $(o_j, k_j)$

Features after training:

      name1           name2       weight
f1    SQLiteHelper    DBUtils     0.3
f2    SQLiteHelper    DBHelper    0.2
f3    getWritableDB   db          0.7
f4    getWritableDB   instance    0.4
f5    DBUtils         instance    0.5
f6    DBHelper        db          0.4
f7    …               …           …
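Written out, the training criterion stated above corresponds to a maximum conditional likelihood objective, sketched below under the assumption of $m$ independent training samples. The symbols $w$ (weight vector), $m$, and the sample index $j$ are notation introduced here; the slide does not state whether regularization is used or which optimizer computes the weights.

```latex
w^{*}
= \operatorname*{arg\,max}_{w} \prod_{j=1}^{m} P(O = o_j \mid K = k_j;\, w)
= \operatorname*{arg\,max}_{w} \sum_{j=1}^{m} \log P(O = o_j \mid K = k_j;\, w)
```

The second form follows because the logarithm is monotone, so maximizing the product and maximizing the sum of logs select the same weights.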

Prediction Phase

Obfuscated Code

class a extends SQLiteHelper {
  SQLiteDatabase b;
  public a(Context ctx) {
    b = getWritableDB();
  }
}

Static analysis produces the dependency graph:

  SQLiteHelper  --extends--   a
  getWritableDB --gets--      b
  a             --field-in--  b

Features and weights for the edge between SQLiteHelper and a:

name1           name2       weight
SQLiteHelper    DBUtils     0.3
SQLiteHelper    DBHelper    0.2

Features and weights for the edge between getWritableDB and b:

name1           name2       weight
getWritableDB   db          0.7
getWritableDB   instance    0.4

Features and weights for the edge between a and b:

name1           name2       weight
DBUtils         instance    0.5
DBHelper        db          0.4
DBUtils         db          0.2
DBHelper        instance    0.2

MAP Inference

\[
\vec{o} = \operatorname*{arg\,max}_{\vec{o}' \in \Omega} P(O = \vec{o}' \mid K = k)
\]

Candidate assignment o          P(o | k)*
a = DBUtils,  b = instance      1.2
a = DBHelper, b = db            1.3
a = DBUtils,  b = db            1.2
a = DBHelper, b = instance      0.8

*Non-normalized: each score is the sum of the weights of the three features matched by the assignment, e.g. 0.2 + 0.7 + 0.4 = 1.3 for a = DBHelper, b = db.

Prediction Phase: Transform

Dependency graph with the predicted names:

  SQLiteHelper  --extends--   DBHelper
  getWritableDB --gets--      db
  DBHelper      --field-in--  db

Deobfuscated Code

class DBHelper extends SQLiteHelper {
  SQLiteDatabase db;
  public DBHelper(Context ctx) {
    db = getWritableDB();
  }
}

Transform
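The MAP inference step of the prediction phase can be sketched for the two-node example by scoring every candidate assignment and keeping the best one. The class `MapInferenceSketch` and its methods are hypothetical, and the weights are copied from the example tables. Exhaustive enumeration is used only because the example has four candidates; it is not presented as DeGuard's actual inference algorithm, which has to cope with graphs of more than 1,000 nodes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical brute-force MAP inference over the toy example from the slides.
final class MapInferenceSketch {

    // Feature weights keyed by "name1|name2", copied from the example tables.
    static final Map<String, Double> WEIGHTS = new HashMap<>();
    static {
        WEIGHTS.put("SQLiteHelper|DBUtils", 0.3);
        WEIGHTS.put("SQLiteHelper|DBHelper", 0.2);
        WEIGHTS.put("getWritableDB|db", 0.7);
        WEIGHTS.put("getWritableDB|instance", 0.4);
        WEIGHTS.put("DBUtils|instance", 0.5);
        WEIGHTS.put("DBHelper|db", 0.4);
        WEIGHTS.put("DBUtils|db", 0.2);
        WEIGHTS.put("DBHelper|instance", 0.2);
    }

    /** Weight of the feature for this name pair, or 0 if no such feature exists. */
    static double weight(String name1, String name2) {
        return WEIGHTS.getOrDefault(name1 + "|" + name2, 0.0);
    }

    /** Non-normalized score: sum of the weights on the three edges of the graph. */
    static double score(String a, String b) {
        return weight("SQLiteHelper", a) + weight("getWritableDB", b) + weight(a, b);
    }

    /** Returns {a, b} with the highest score among all candidate combinations. */
    static String[] mapAssignment(List<String> candidatesA, List<String> candidatesB) {
        String[] best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String a : candidatesA) {
            for (String b : candidatesB) {
                double s = score(a, b);
                if (s > bestScore) {
                    bestScore = s;
                    best = new String[] {a, b};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] best = mapAssignment(List.of("DBUtils", "DBHelper"), List.of("instance", "db"));
        System.out.println("a = " + best[0] + ", b = " + best[1]); // a = DBHelper, b = db
    }
}
```

Because the normalization constant is the same for every candidate, comparing the non-normalized scores is enough to find the most likely assignment.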

Preserving Semantics

Freely renaming fields/variables/methods may change the program semantics

class A
  int a
  Object b
  void a()

class B extends A
  void b()
  void c(A a)

Syntactic constraints, e.g. fields within a class must have distinct names

Semantic constraints, e.g. method overloads must be preserved

Annotations on the example: "must have distinct names" (shown twice), "must not override method a()"
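The two kinds of constraints can be sketched as simple checks on a proposed renaming. The class `RenamingConstraintsSketch`, its methods, and the representation of methods as signature strings such as "a()" are assumptions made for illustration; the slides do not show how DeGuard encodes or enforces its constraints.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical checks for renaming constraints; not DeGuard's implementation.
final class RenamingConstraintsSketch {

    /** Syntactic constraint: all names in one scope (e.g. the fields of a class) are distinct. */
    static boolean allDistinct(List<String> namesInScope) {
        return new HashSet<>(namesInScope).size() == namesInScope.size();
    }

    /**
     * Semantic constraint: a renamed subclass method must not start overriding a
     * superclass method. Returns true if the new signature collides with a
     * superclass signature although the method did not override anything before.
     */
    static boolean introducesOverride(String newSignature,
                                      Set<String> superclassSignatures,
                                      boolean overrodeBefore) {
        return !overrodeBefore && superclassSignatures.contains(newSignature);
    }

    public static void main(String[] args) {
        // Fields of class A from the example: int a, Object b.
        System.out.println(allDistinct(List.of("a", "b")));   // true
        System.out.println(allDistinct(List.of("a", "a")));   // false

        // Renaming B.b() to a() would override A.a().
        Set<String> methodsOfA = Set.of("a()");
        System.out.println(introducesOverride("a()", methodsOfA, false)); // true
        System.out.println(introducesOverride("b()", methodsOfA, false)); // false
    }
}
```

A candidate assignment produced during inference would only be accepted if checks of this kind pass for every affected scope.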

DeGuard Implementation

www.apk-deguard.com

Static Analysis

• Static analysis framework for Java and Android

Learning and MAP Inference

• Scalable open-source framework for structured prediction
• Open-source: http://nice2predict.org
• Training data: 2K open-source, unobfuscated Android applications

Evaluation

1. Can DeGuard reverse ProGuard?
2. Can DeGuard detect third-party libraries?
3. Is DeGuard useful for malware inspection?

ProGuard Experiment

Source Code → Non-obfuscated APK

Source Code → Obfuscated APK → De-obfuscated APK

De-obfuscated APK =? Non-obfuscated APK

After Obfuscation

[Bar chart: % of program elements (0 to 100) with known names, for Fields, Methods, Classes, Packages, and Total]

Only 13% known names

Can DeGuard reverse ProGuard?

[Stacked bar chart: % of program elements (0 to 100) for Fields, Methods, Classes, Packages, and Total. Legend: Known names; Correctly predicted names (i.e., identical to the original names); Mis-predicted names]

Package names are directly used to predict third-party libraries

1.6% known names

80.6% correct names

80% of the names are identical to the original ones
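The accuracy figure counts a prediction as correct only if it is identical to the original name. A minimal sketch of that metric follows; the class `NameAccuracySketch`, the use of the obfuscated name as the element key, and the sample data are all made up for illustration, and the slides do not say whether known names are excluded from the denominator.

```java
import java.util.Map;

// Hypothetical accuracy computation; the data in main is invented.
final class NameAccuracySketch {

    /**
     * Fraction of program elements whose predicted name equals the original name.
     * Both maps are keyed by an element identifier (here: the obfuscated name).
     */
    static double accuracy(Map<String, String> original, Map<String, String> predicted) {
        if (original.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (Map.Entry<String, String> e : original.entrySet()) {
            if (e.getValue().equals(predicted.get(e.getKey()))) {
                correct++;
            }
        }
        return (double) correct / original.size();
    }

    public static void main(String[] args) {
        Map<String, String> original = Map.of("a", "DBHelper", "b", "db", "c", "execSQL");
        Map<String, String> predicted = Map.of("a", "DBHelper", "b", "db", "c", "query");
        System.out.println(accuracy(original, predicted)); // 2 of 3 names match
    }
}
```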

Can DeGuard Detect Third-Party Libraries?

Source Code + Library Code → ProGuard → Obfuscated APK → De-obfuscated APK → ?

ProGuard obfuscates library package names

Precision: 93.1%
Recall: 91%
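Precision and recall for library detection can be computed from the set of packages reported as libraries and the set of packages that really are libraries. The sketch below uses the standard definitions; the class `LibraryDetectionMetricsSketch` and the package names in `main` are invented sample data, not results from the evaluation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical metric computation; the package sets in main are invented.
final class LibraryDetectionMetricsSketch {

    /** Number of packages that are both predicted and actual libraries. */
    static int truePositives(Set<String> predicted, Set<String> actual) {
        Set<String> common = new HashSet<>(predicted);
        common.retainAll(actual);
        return common.size();
    }

    /** Share of predicted library packages that really are libraries. */
    static double precision(Set<String> predicted, Set<String> actual) {
        return predicted.isEmpty() ? 0.0
                : (double) truePositives(predicted, actual) / predicted.size();
    }

    /** Share of actual library packages that were detected. */
    static double recall(Set<String> predicted, Set<String> actual) {
        return actual.isEmpty() ? 0.0
                : (double) truePositives(predicted, actual) / actual.size();
    }

    public static void main(String[] args) {
        Set<String> predicted = Set.of("com.google.gson", "okhttp3", "com.example.util");
        Set<String> actual = Set.of("com.google.gson", "okhttp3", "org.apache.http");
        System.out.println("precision = " + precision(predicted, actual));
        System.out.println("recall    = " + recall(predicted, actual));
    }
}
```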

Is DeGuard Useful for Malware Inspection?

De-obfuscating samples from the Android Malware Genome Project

Malware Sample

class d {
  String a = System.getProperty(..);
  char[] b;
  byte[] c;
  byte[] a(String) {}
}

De-obfuscated Malware Sample

class Base64 {
  String NL = System.getProperty(..);
  char[] ENC;
  byte[] DEC;
  byte[] decode(String) {}
}

Base64 Decoder

Reveals string decoders

Reveals classes that handle sensitive data (e.g. Location)

Hard to handle heavily-obfuscated code (e.g. reflection)

Summary

Probabilistic Models

High Prediction Accuracy

Try online: www.apk-deguard.com

More info: http://www.srl.inf.ethz.ch/spas