ZOZZLE: FAST AND PRECISE IN-BROWSER JAVASCRIPT

MALWARE DETECTION

Charles Curtsinger

UMass at Amherst

Benjamin Livshits and Benjamin Zorn

Microsoft Research

Christian Seifert

Microsoft

20th USENIX Security Symposium (August, 2011)

ZOZZLE: LOW-OVERHEAD MOSTLY STATIC JAVASCRIPT

MALWARE DETECTION

Charles Curtsinger

UMass at Amherst

Benjamin Livshits and Benjamin Zorn

Microsoft Research

Christian Seifert

Microsoft

Microsoft Research Technical Report (November, 2010)

A Seminar at Advanced Defense Lab

Outline

Introduction
Observation on Offline Nozzle
Design
Experiment
Evaluation

2011/5/24

Introduction

In the last several years, we have seen mass-scale exploitation of memory-based vulnerabilities migrate towards heap spraying attacks.

But many solutions are not lightweight enough to be integrated into a commercial browser.

About Nozzle

The overhead of this runtime technique may be 10% or higher.

This paper is based on our experience using NOZZLE for offline scanning.

Offline scanning is also not as effective against transient malware that appears and disappears frequently.

About Zozzle

ZOZZLE is integrated with the browser’s JavaScript engine to collect and process JavaScript code that is created at runtime.

Our focus in this paper is on creating a very low false positive, low overhead scanner.

Observation on Offline Nozzle

Once we determined that a piece of JavaScript was malicious, we invested considerable effort in examining the code by hand and categorizing it in various ways.

We investigated 169 malware samples.

Distribution of Different Exploit Samples

Transience of Detected Malicious URLs

JavaScript eval Unfolding

Distribution of Context Counts

Design

Training Data Extraction and Labeling

We start by augmenting the JavaScript engine in a browser with a “deobfuscator” that extracts and collects individual fragments of JavaScript.

Implementation: Detours intercepts calls to the Compile function (COlescript::Compile()) in jscript.dll.
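The real deobfuscator hooks a native function with Detours, which cannot be shown in a few lines. As a rough JavaScript-level analogue only, the sketch below wraps a compile-and-run entry point so that every code fragment is recorded before it executes, including fragments generated at runtime. The names `collected` and `hookedEval` are hypothetical, and the Function constructor merely stands in for the engine's compiler.

```javascript
// Code fragments ("contexts") in the order they reached the compiler.
const collected = [];

// Stand-in for a hooked compile function (hypothetical, not the jscript.dll API).
function hookedEval(source) {
  // Record the fragment before it runs, as a hook on Compile would.
  collected.push(String(source));
  // Compile and run the fragment. hookedEval is passed in so that code
  // generated at runtime goes through the same hook.
  return new Function("hookedEval", source)(hookedEval);
}

// The outer fragment builds and runs an inner fragment at runtime.
const inner = "return 1 + 1;";
const outer = "return hookedEval(" + JSON.stringify(inner) + ");";
const result = hookedEval(outer);

console.log(result, collected.length);
```

Both the outer script and the fragment it generates end up in `collected`, which is the property the training pipeline relies on: obfuscated code is observed in its unfolded form.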

Feature Extraction

We create features based on the hierarchical structure of the JavaScript abstract syntax tree (AST).
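One plausible way to turn AST structure into features is to pair the text of a node with the nearest enclosing construct. The sketch below assumes a toy node shape `{ type, text, children }` and a small set of context types; both are illustrative assumptions, not the output of a real parser or the exact feature definition used by ZOZZLE.

```javascript
// Node types treated as "contexts" in this sketch (an assumption).
const CONTEXT_TYPES = new Set(["function", "loop", "if", "try"]);

// Walk a toy AST and collect "context:text" features.
// The node shape { type, text, children } is hypothetical.
function extractFeatures(node, context = "script", out = new Set()) {
  if (node.text !== undefined) {
    out.add(`${context}:${node.text}`);
  }
  // Children of a context node are reported under that node's type.
  const childContext = CONTEXT_TYPES.has(node.type) ? node.type : context;
  for (const child of node.children || []) {
    extractFeatures(child, childContext, out);
  }
  return out;
}

const ast = {
  type: "script",
  children: [
    { type: "var", text: "shellcode" },
    { type: "loop", children: [{ type: "call", text: "unescape" }] },
  ],
};
const features = extractFeatures(ast);
console.log([...features]);
```

Here the same identifier would yield different features depending on where it appears, for example inside a loop versus at the top level of the script.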

Feature Selection

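χ² test

Counts for a candidate feature:

                With feature    Without feature
    malicious   A               C
    benign      B               D

\[ \chi^2 = \frac{(AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)} \]

Threshold: \chi^2 \ge 10.83 (99.9% confidence)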
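The selection step can be sketched as a small scoring function over the four counts A, B, C, D. One assumption is flagged loudly: the ratio as written on the slide cannot exceed 1, so this sketch multiplies by N = A + B + C + D, as in the standard Pearson statistic for a 2x2 table, so that the 10.83 threshold is reachable. The function names and the shape of `counts` are hypothetical.

```javascript
// Chi-squared score for one feature, using the slide's 2x2 counts:
//   A = malicious with the feature,  C = malicious without it
//   B = benign with the feature,     D = benign without it
function chiSquared(A, B, C, D) {
  const n = A + B + C + D; // assumption: standard Pearson scaling by N
  const numerator = n * Math.pow(A * D - B * C, 2);
  const denominator = (A + C) * (B + D) * (A + B) * (C + D);
  // An empty row or column gives a zero denominator; score it as 0.
  return denominator === 0 ? 0 : numerator / denominator;
}

const THRESHOLD = 10.83; // 99.9% confidence, from the slide

// `counts` maps a feature name to its { A, B, C, D } counts (hypothetical shape).
function selectFeatures(counts, threshold = THRESHOLD) {
  return Object.keys(counts).filter((name) => {
    const { A, B, C, D } = counts[name];
    return chiSquared(A, B, C, D) >= threshold;
  });
}

const selected = selectFeatures({
  skewed: { A: 50, B: 5, C: 50, D: 95 },   // far more common in malicious code
  neutral: { A: 10, B: 10, C: 10, D: 10 }, // equally common in both classes
});
console.log(selected);
```

A feature that appears equally often in both classes scores 0 and is dropped, while a feature concentrated in one class passes the threshold.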

Classifier Training

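Naïve Bayesian classifier

Bayes' theorem for a label L_i given features F_1, ..., F_n:

\[ P(L_i \mid F_1, \ldots, F_n) = \frac{P(L_i)\, P(F_1, \ldots, F_n \mid L_i)}{P(F_1, \ldots, F_n)} \]

Chain rule:

\[ P(F_1, \ldots, F_n \mid L_i) = \prod_{k=1}^{n} P(F_k \mid F_1, \ldots, F_{k-1}, L_i) \]

Assume the features to be conditionally independent given the label:

\[ P(L_i \mid F_1, \ldots, F_n) = \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid F_1, \ldots, F_{k-1}, L_i)}{P(F_1, \ldots, F_n)} = \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i)}{P(F_1, \ldots, F_n)} \]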

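Naïve Bayesian classifier

Complexity: linear time

\[ C_{\text{script}} = \arg\max_{L_i} P(L_i \mid F_1, \ldots, F_n) = \arg\max_{L_i} \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i)}{P(F_1, \ldots, F_n)} = \arg\max_{L_i} P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i) \]

where L_i ranges over the possible labels.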
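The decision rule on this slide, the label that maximizes P(L_i) times the product of P(F_k | L_i), can be sketched in a few lines. This is a minimal illustration under stated assumptions: features are treated as present or absent, probabilities are summed in log space to avoid underflow, and Laplace smoothing is added to avoid log(0). The sample shape and the function names are hypothetical.

```javascript
// Estimate P(L) and P(F_k present | L) from labeled samples.
// Each sample is { label, features: Set } (hypothetical shape).
function train(samples, featureNames) {
  const model = {};
  const labels = [...new Set(samples.map((s) => s.label))];
  for (const label of labels) {
    const inClass = samples.filter((s) => s.label === label);
    const probs = {};
    for (const f of featureNames) {
      const present = inClass.filter((s) => s.features.has(f)).length;
      // Laplace smoothing (an added assumption) keeps probabilities off 0 and 1.
      probs[f] = (present + 1) / (inClass.length + 2);
    }
    model[label] = { logPrior: Math.log(inClass.length / samples.length), probs };
  }
  return model;
}

// argmax over labels of log P(L) + sum over k of log P(F_k | L).
// One pass over the features per label, so time is linear in their number.
function classify(model, features) {
  let best = null;
  let bestScore = -Infinity;
  for (const [label, { logPrior, probs }] of Object.entries(model)) {
    let score = logPrior;
    for (const [f, p] of Object.entries(probs)) {
      score += Math.log(features.has(f) ? p : 1 - p);
    }
    if (score > bestScore) {
      bestScore = score;
      best = label;
    }
  }
  return best;
}

const model = train(
  [
    { label: "malicious", features: new Set(["unescape"]) },
    { label: "malicious", features: new Set(["unescape", "spray"]) },
    { label: "benign", features: new Set([]) },
    { label: "benign", features: new Set(["jquery"]) },
  ],
  ["unescape", "spray", "jquery"]
);
console.log(classify(model, new Set(["unescape"])));
```

The denominator P(F_1, ..., F_n) is the same for every label, which is why the sketch never computes it.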

Fast Pattern Matching

Fast Pattern Matching (cont.)

Experiment

Malicious Samples: 919 deobfuscated malicious contexts

Benign Samples: Alexa top 50 URLs, 7,976 contexts

Feature Selection

hand-picked vs. automatically selected

Evaluation

HP xw4600 workstation
Intel Core2 Duo 3.16 GHz
4 GB memory
Windows 7 64-bit Enterprise

Effectiveness

Training Set Size

Feature Set Size

Comparison with Other Techniques

Performance: Context Size

Performance: Feature Set

THANK YOU

JAVASCRIPT OBFUSCATION

I think these are all of them…

unescape("%48%65%6c%6c%6f%57%6f%72%6c%64")

"\u0048\u0065\u006C\u006C\u006F\u0057\u006F\u0072\u006C\u0064"

document.write("alert('1')"); eval("alert(1)");

"H976e246l3l2o19W42o45r7l88d734".replace(/[0-9]/g, "")

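To see that the string encodings above hide the same plain text, the short check below decodes three of them. It assumes the character class in the replace example is meant to be `[0-9]`, so that every digit is stripped; the variable names are hypothetical.

```javascript
// Percent-encoded bytes, decoded with the legacy global unescape().
const viaUnescape = unescape("%48%65%6c%6c%6f%57%6f%72%6c%64");

// Unicode escapes are resolved by the parser itself.
const viaUnicode = "\u0048\u0065\u006C\u006C\u006F\u0057\u006F\u0072\u006C\u0064";

// Junk digits interleaved with the payload, stripped at runtime.
// Assumption: the class is [0-9], so every digit is removed.
const viaReplace = "H976e246l3l2o19W42o45r7l88d734".replace(/[0-9]/g, "");

console.log(viaUnescape, viaUnicode, viaReplace);
```

All three expressions produce the same string, which is why a scanner that sees only the source text, and not the code that eventually reaches the engine, is easy to evade.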
If I want to eval…

<script>
Function("alert('1')")();
setTimeout("alert('1')");
execScript("alert('1')", "javascript");
[].constructor.constructor('alert(1)')();
window["eval"]("alert('1')");
</script>

On the Internet, I found…

<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+

[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]])(+!+[])

</script>

THE END
