Detection of ASCII Malware

Post on 13-Jan-2016

DESCRIPTION

Detection of ASCII Malware. Parbati Kumar Manna, Dr. Sanjay Ranka, Dr. Shigang Chen. Internet Worm and Malware: huge damage potential; infects hundreds of thousands of computers; costs millions of dollars in damage; Melissa, ILOVEYOU, Code Red, Nimda, Slammer, SoBig, MyDoom.

Transcript

Detection of ASCII Malware

Parbati Kumar Manna

Dr. Sanjay Ranka

Dr. Shigang Chen

Internet Worm and Malware

• Huge damage potential
    - Infects hundreds of thousands of computers
    - Costs millions of dollars in damage
    - Melissa, ILOVEYOU, Code Red, Nimda, Slammer, SoBig, MyDoom

• Mostly uses Buffer Overflow

• Propagation is automatic (mostly)

Recent Trends

• Shift in hacker’s mindset

• Malware becoming increasingly evasive and obfuscative

• Emergence of Zero-day worms

• Arrival of Script Kiddies

Motivation for ASCII Attacks

• Prevalence of servers expecting text-only input

• Text-based protocols

• Presumption of text being benign

• Deployment of ASCII filters that let text bypass inspection

IDS Detecting ASCII Attack?

• Disassembly-based IDS
    - All jump instructions are ASCII
    - Higher proportion of branches
    - Exponential disassembly cost
    - High processing overhead for IDS

• Frequency-based IDS
    - PAYL evaded by ASCII worm

Buffer Overflow

Constraints of ASCII Malware

• Opcode Unavailability
    - Shellcode requires binary opcodes
    - Here only xor, and, sub, cmp etc.
    - Must generate opcodes dynamically

• Difficulty in Encryption
    - No backward jump
    - Can’t use same decrypter routine for each encrypted block
    - No one-to-one correspondence between ASCII and binary

[Figure: mapping between ASCII and binary; annotation "may vary"]
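The two constraints above can be illustrated with a small check of which opcode bytes fall inside the text range. This is a minimal sketch, not part of the original slides: the range 0x20 to 0x7E is taken from the later "Multilevel Encryption" slide, the opcode table is a hand-picked sample of well-known one-byte x86 (32-bit) opcodes, and the helper names `is_text_byte` and `rel8_displacement` are made up for illustration.

```python
def is_text_byte(b: int) -> bool:
    """True if b lies in the printable ASCII range 0x20-0x7E."""
    return 0x20 <= b <= 0x7E


# Hand-picked sample of one-byte x86 opcodes (32-bit mode); illustrative only.
OPCODES = {
    "and eax, imm32": 0x25,
    "sub eax, imm32": 0x2D,
    "xor eax, imm32": 0x35,
    "cmp eax, imm32": 0x3D,
    "push eax": 0x50,
    "pop eax": 0x58,
    "jz rel8": 0x74,
    "nop": 0x90,
    "mov eax, imm32": 0xB8,
    "ret": 0xC3,
    "int imm8": 0xCD,
    "call rel32": 0xE8,
}


def rel8_displacement(b: int) -> int:
    """Interpret a byte as the signed 8-bit displacement of a short jump."""
    return b - 256 if b >= 0x80 else b


# Opcodes an ASCII-only payload can use directly.
text_opcodes = sorted(name for name, op in OPCODES.items() if is_text_byte(op))

# Every displacement byte available in text is positive, so short jumps
# encoded in text can only go forward.
displacements = [rel8_displacement(b) for b in range(0x20, 0x7F)]

print("usable in text:", text_opcodes)
print("displacement range:", min(displacements), "to", max(displacements))
```

In this sample, `int`, `call`, `ret` and `mov eax, imm32` are outside the text range, which is why such opcodes have to be generated at run time from instructions like `xor`, `and` and `sub`.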

Creation of ASCII Malware

Buffer Overflow using ASCII

Overflowing a buffer using an ASCII string:

Detection of ASCII Malware

• Opcode Unavailability
    - Dynamic generation of opcodes needs more ASCII instructions for each binary instruction

• Difficulty in Encryption
    - No backward jump means decrypter block for each encrypted block must be hardcoded

Long sequence of contiguous valid instructions likely, hence high MEL

What is this MEL?

Maximum Executable Length

• Indicates maximum length of an execution path
    - Need to disassemble (and execute) from all possible entry points
    - All branching must be considered

• Abstract payload execution
    - Used for binary worms with sled
    - Effectiveness dwindled presently
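As a rough illustration of "disassemble from all possible entry points", the sketch below computes an MEL-like value with a backward pass over the buffer. It is a toy under strong assumptions and not the authors' implementation: `toy_decode` is a hypothetical decoder that treats every byte as a one-byte instruction and marks only the `ins`/`outs` opcodes (0x6C to 0x6F, the characters l, m, n, o) as invalid, and branches are not followed.

```python
from typing import Callable, Tuple

# Toy invalid set: x86 ins/outs opcodes (privileged I/O), i.e. 'l', 'm', 'n', 'o'.
IO_OPCODES = {0x6C, 0x6D, 0x6E, 0x6F}


def toy_decode(buf: bytes, i: int) -> Tuple[int, bool]:
    """Hypothetical decoder: returns (instruction length, is_valid).

    Assumption for illustration: every byte is a 1-byte instruction and
    only I/O opcodes are invalid. A real x86 decoder is far more involved.
    """
    return 1, buf[i] not in IO_OPCODES


def mel(buf: bytes,
        decode: Callable[[bytes, int], Tuple[int, bool]] = toy_decode) -> int:
    """Longest straight-line run of valid instructions over all entry points."""
    n = len(buf)
    # longest[i] = number of valid instructions executed when entering at offset i
    longest = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        length, valid = decode(buf, i)
        if valid and i + length <= n:
            longest[i] = 1 + longest[i + length]
    return max(longest)


print(mel(b"abclxyzzyo"))  # runs "abc" and "xyzzy" are separated by 'l'
```

Filling the table from the end of the buffer means each offset is decoded once, so examining every entry point stays linear in the input size for straight-line code; handling branches would require following both targets.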

Benign Text has Low MEL

• Contains characters that correspond to invalid instructions
    - Privileged Instruction (I/O)
    - Arbitrary Segment Selector
    - More Memory-accessing instructions – may use uninitialized registers

Long sequence of contiguous valid instructions unlikely, hence low MEL

Proposed Solution

Question:

• How long is “long”?

• Find out the maximum length of valid instruction sequence

• If it is long enough, the stream contains malware

Probabilistic Analysis

• Toss a coin n times

• What is the probability that the max distance between two consecutive heads is at most τ?

Head (H) = Invalid Instruction (I)

Tail (T) = Valid Instruction (V)

Coin tosses:   T H T T H T T T T T H T T T
Instructions:  V I V V I V V V V V I V V V
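The coin-toss analogy can be made concrete with a few lines that map the toss sequence from the slide to instruction validity and measure the longest run of valid instructions, the quantity the detector compares against a threshold. This is a minimal sketch; the function name `longest_valid_run` is chosen here for illustration.

```python
def longest_valid_run(seq: str, valid: str = "V") -> int:
    """Length of the longest run of consecutive `valid` symbols in seq."""
    best = run = 0
    for symbol in seq:
        run = run + 1 if symbol == valid else 0
        best = max(best, run)
    return best


tosses = "THTTHTTTTTHTTT"
# Tail (T) -> valid instruction (V), Head (H) -> invalid instruction (I)
instructions = tosses.replace("T", "V").replace("H", "I")

print(instructions)                     # VIVVIVVVVVIVVV
print(longest_valid_run(instructions))  # 5
```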

Probabilistic Analysis

n = number of coin tosses

p = probability of a head

X_i = R.V.s for inter-head distances

X_max = max inter-head distance

C.D.F. of X_max: $\Pr[X_{\max} \le x] = [1 - p(1-p)^{x}]^{n}$

F.P. rate: $1 - \Pr[X_{\max} \le \tau] = 1 - [1 - p(1-p)^{\tau}]^{n}$

Probabilistic Analysis

For a fixed N = k (exactly k invalid instructions)

Probabilistic Analysis

For all possible values of N:

Threshold Calculation

Known: n, p, α (false positive rate)

Unknown: τ (max inter-head distance)

Threshold: $\tau = \dfrac{\log\left(1 - (1-\alpha)^{1/n}\right) - \log p}{\log(1-p)}$
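The threshold follows from setting the false positive rate 1 - [1 - p(1-p)^τ]^n equal to a target value and solving for τ. The sketch below evaluates both directions numerically. The name `alpha` for the target false positive rate and the sample values of n and p are assumptions made for illustration, not figures from the slides.

```python
import math


def false_positive_rate(n: float, p: float, tau: float) -> float:
    """F.P. rate = 1 - [1 - p(1-p)^tau]^n, as on the Probabilistic Analysis slide."""
    return 1.0 - (1.0 - p * (1.0 - p) ** tau) ** n


def threshold(n: float, p: float, alpha: float) -> float:
    """Solve false_positive_rate(n, p, tau) == alpha for tau.

    tau = [log(1 - (1 - alpha)^(1/n)) - log p] / log(1 - p)
    expm1/log1p are used only for numerical stability when alpha/n is tiny.
    """
    one_minus_root = -math.expm1(math.log1p(-alpha) / n)  # 1 - (1-alpha)^(1/n)
    return (math.log(one_minus_root) - math.log(p)) / math.log(1.0 - p)


# Hypothetical sample values: 1000 instructions, 25% invalid, 1% false positives.
tau = threshold(n=1000, p=0.25, alpha=0.01)
print(tau, false_positive_rate(1000, 0.25, tau))
```

Calling `threshold` with a larger n or a smaller p returns a larger τ, so the cut-off has to grow to hold the same false positive rate.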

Independence Assumption

χ² test contingency table

                     Observed                       Expected
                 I2 is valid   I2 is invalid    I2 is valid   I2 is invalid
I1 is valid         8960           2797            8922           2835
I1 is invalid       2797            938            2835            900

• Validity of an instruction is an independent event

• All the Xi’s are independent (while ΣXi = n)
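The expected counts in the contingency table can be reproduced from the observed counts alone, and the χ² statistic computed from them. The sketch below does this in plain Python. The 5% significance level and its critical value of 3.841 for one degree of freedom are a standard choice assumed here, not stated on the slide.

```python
from typing import List, Tuple

# Observed counts from the slide: rows = I1 valid/invalid, cols = I2 valid/invalid.
observed = [[8960, 2797],
            [2797, 938]]


def chi_square(table: List[List[int]]) -> Tuple[float, List[List[float]]]:
    """Return (chi-square statistic, expected counts) under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            expected_row.append(exp)
            statistic += (obs - exp) ** 2 / exp
        expected.append(expected_row)
    return statistic, expected


statistic, expected = chi_square(observed)
CRITICAL_5_PERCENT_DF1 = 3.841  # assumed significance level: 5%, 1 degree of freedom

print([[round(e) for e in row] for row in expected])
print(statistic, statistic < CRITICAL_5_PERCENT_DF1)
```

A statistic below the critical value means the data give no grounds to reject independence at that level, which is what the assumption on this slide relies on.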

Threshold Calculation

With increasing n, we must choose a larger τ to keep the same rate of false positive

Threshold Calculation

With decreasing p, we must choose a larger τ to keep the same rate of false positive

Determine n

$n = \dfrac{C}{E[I]} = \dfrac{\text{number of input characters}}{\text{average instruction size}}$

E[I] = E[Prefix chain length] + E[core instruction length]

Obtained from character frequency of input data
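One way to turn character frequencies into an estimate of n is sketched below. It rests on assumptions that are not in the slides: prefix characters are taken to be the x86 prefix bytes that fall in the text range (segment overrides 0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65 and the size overrides 0x66, 0x67), each character is treated as an independent draw so the expected prefix chain length is q/(1-q), and the expected core instruction length `core_len` is supplied by the caller as a hypothetical parameter.

```python
# x86 prefix bytes that are printable ASCII: & . 6 > d e f g
PREFIX_BYTES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67}


def estimate_n(data: bytes, core_len: float) -> float:
    """Estimate the number of instructions n = C / E[I].

    E[I] = E[prefix chain length] + E[core instruction length], where the
    prefix chain is modelled (assumption) as a geometric run of prefix bytes
    with per-character probability q taken from the data's character frequency.
    """
    if not data:
        return 0.0
    q = sum(b in PREFIX_BYTES for b in data) / len(data)
    if q >= 1.0:
        raise ValueError("input consists only of prefix bytes")
    expected_prefix_len = q / (1.0 - q)
    return len(data) / (expected_prefix_len + core_len)


# Hypothetical input and core length, for illustration only.
print(estimate_n(b"GET /index.html HTTP/1.1", core_len=2.0))
```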

Determine p

Invalid Instructions

1. Privileged instructions

2. Wrong Segment Prefix Selector

3. Un-initialized memory access

Only 1. and 2. can be determined on a standalone basis

Experimental Setup

Implementation

Experimental Setup

• Benign data setup
    - ASCII stream captured from live CISE network using Ethereal

• Malicious data setup
    - Existing framework used to generate ASCII worm by converting binary worms

• Promising experimental results for max valid instruction length
    - Benign: all max values below threshold
    - Malicious: values significantly higher than threshold

Experimental Results (DAWN)

Experimental Results (APE-L)

Contrasting with APE

• Full content examination

• Threshold calculation

• Sled vs. malware

• Exploiting text-specific properties

Multilevel Encryption

Encryption: binary → ASCII → ASCII

Decryption: ASCII → ASCII → binary

Only visible decrypter

Multilevel Encryption

[Figure: text ranges and binary; labels "Text 0x20 – 0x3F", "Text 0x40 – 0x5F", "Text 0x60 – 0x7E", "Binary"]

Questions

Thank you
