Network-Based Detection Methods for Defending Against Zero-day Attacks


April 12, 2007

김익균 (Ikkyun Kim)

Information Security Research Division, Electronics and Telecommunications Research Institute (ETRI)

Contents

Vulnerability & Zero-day Attack

Intrusion Detection

Detection Model

Research Trends : Zero-day Attack Detection

Vulnerabilities

• Half-life of critical vulnerabilities is 21 days
• Half of the most prevalent vulnerabilities are replaced by new vulnerabilities every year
• Lifespan of some vulnerabilities and worms is unlimited
• 80% of worms and automated exploits occur in the first two half-lives

*Source : Gerhard Eschelbeck of Qualys at Blackhat 2004

Laws of Vulnerabilities
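The half-life figure can be read as exponential decay. The sketch below assumes that "half-life" means the share of still-vulnerable systems halves every 21 days; the function name and the sample days are illustrative and do not come from the Qualys data.

```python
def fraction_still_vulnerable(days: float, half_life_days: float = 21.0) -> float:
    """Share of systems still unpatched after `days`, assuming pure exponential decay."""
    return 0.5 ** (days / half_life_days)


# Two half-lives (42 days) is the window in which the slide places
# 80% of worms and automated exploits.
for days in (0, 21, 42, 63):
    print(f"day {days:2d}: {fraction_still_vulnerable(days):.1%} still vulnerable")
```

Under this reading, a quarter of systems are still exposed at the end of the second half-life.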

[Figure: timeline from vulnerability disclosure to worm release]

• July 24, 2002: SQL Server buffer-overflow vulnerability disclosed -> Jan 25, 2003: Slammer worm released (185-day attack)
• Apr 11, 2004: LSASS buffer-overflow vulnerability disclosed -> Apr 30, 2004: Sasser worm released (19-day attack)
• 2006 ??: future worm released (0-day attack)

• A zero-day attack is a computer threat that exploits undisclosed or unpatched computer application vulnerabilities. (defined by Wikipedia)
• May 2005: Zero-day exploits for unknown vulnerabilities in Mozilla Firefox

Zero-day Attacks

Time-gap between vulnerability disclosure and release of a worm that exploits it is decreasing
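The shrinking gap can be checked with simple date arithmetic. This is a small sketch using the disclosure and release dates given on this slide; the dictionary layout and function name are illustrative.

```python
from datetime import date


def exposure_window_days(disclosed: date, worm_released: date) -> int:
    """Days between vulnerability disclosure and release of the worm that exploits it."""
    return (worm_released - disclosed).days


# Dates as given on the slide's timeline.
timeline = {
    "Slammer": (date(2002, 7, 24), date(2003, 1, 25)),
    "Sasser": (date(2004, 4, 11), date(2004, 4, 30)),
}

for worm, (disclosed, released) in timeline.items():
    print(f"{worm}: {exposure_window_days(disclosed, released)}-day attack")
```

A zero-day attack corresponds to a window of 0 days or less: the exploit is in use before any disclosure or patch.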

[Figure: signature-based IDS architecture. Components: Network, Packet Sensor, Packet Parsing, Audit Data, Detection Engine (Pattern Matching, Signature Data Base, Comparison Objects), Expert Manager, Rule Manage, Response Manager, Alert, Response]

Intrusion Detection - I

Misuse Analysis: Signature-based

Signature-based ID

Today: Deep Packet Inspection capability, ASIC-based appliances
• TippingPoint: IPS 5000E
• TopLayer: IPS 5500
• Cisco: IPS4255

By 2006, 75 percent of Global 2000 enterprises will replace or augment their firewall approach with deep packet inspection capabilities

By 2005, enterprises will no longer use software-based application proxy firewalls

Source - Deep Packet Inspection: Next Phase of Firewall Evolution

(21 November 2002, Gartner)


By 2009 the UTM space will be the largest single market.


DPI : High Performance Pattern Matching
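The slides do not name a specific matching algorithm. As one common software approach to matching many signatures in a single pass over a payload, here is a minimal Aho-Corasick sketch. The class name and the signature strings are hypothetical examples, and an ASIC-based appliance would implement matching very differently.

```python
from collections import deque


class MultiPatternMatcher:
    """Aho-Corasick automaton: finds all signatures in one pass over the payload."""

    def __init__(self, signatures: dict[str, bytes]):
        self.goto = [{}]   # per-state transitions: byte value -> next state
        self.fail = [0]    # per-state failure link
        self.out = [[]]    # per-state list of signature names ending here
        for name, pattern in signatures.items():
            self._insert(name, pattern)
        self._build_failure_links()

    def _insert(self, name: str, pattern: bytes) -> None:
        state = 0
        for byte in pattern:
            if byte not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][byte] = len(self.goto) - 1
            state = self.goto[state][byte]
        self.out[state].append(name)

    def _build_failure_links(self) -> None:
        queue = deque(self.goto[0].values())  # depth-1 states fail to the root
        while queue:
            state = queue.popleft()
            for byte, nxt in self.goto[state].items():
                queue.append(nxt)
                fallback = self.fail[state]
                while fallback and byte not in self.goto[fallback]:
                    fallback = self.fail[fallback]
                self.fail[nxt] = self.goto[fallback].get(byte, 0)
                # Inherit matches that end at the failure state (suffix matches).
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def scan(self, payload: bytes) -> list[tuple[int, str]]:
        """Return (offset of last matched byte, signature name) for every match."""
        state, hits = 0, []
        for offset, byte in enumerate(payload):
            while state and byte not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(byte, 0)
            hits.extend((offset, name) for name in self.out[state])
        return hits


# Hypothetical signatures for illustration only.
matcher = MultiPatternMatcher({"shell": b"/bin/sh", "cmd": b"cmd.exe"})
print(matcher.scan(b"GET /scripts/..%255c../winnt/system32/cmd.exe?/c+dir"))
```

The scan cost is linear in the payload length regardless of how many signatures are loaded, which is why automaton-based matching is a frequent choice for deep packet inspection.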

[Figure: anomaly detection architecture. Components: Network, Packet Sensor, Packet Parsing, Audit Data, Anomaly Analysis Engine (Statistical, Data Mining, Neural Net.), Learning Profile, Learning & Comparison Objects, Alert, Report]

Intrusion Detection - II

Anomaly Detection

Current Signature Generation Process
• New worm outbreak
• Report of anomalies from people via phone/email/newsgroup
• Worm trace is captured
• Manual analysis by security experts
• Signature generation

Labor-intensive, Human-mediated

Zero-day Attack Protection

Anomaly Detection + Signature Generation + High Performance FW (IPS)

Control Flow Hijacking Worm Model

[Figure: exploit packet layout. IP header, TCP/UDP header, upper-layer payload; the payload carries the exploit (return address), NOP sled, decryption code, and attack code]

• Epsilon (ε) = Exploit Vector
• Gamma (γ) = Bogus Control Data
• Pi (π) = Payload

* Source [Crandall05]

Epsilon-Gamma-Pi Model

ε-γ-π Model

• Epsilon (ε) = HTTP Header

• Gamma (γ) = Return Address

• Pi (π) = Codered Shellcode

* Source [Crandall05]

CodeRed II Case

Control hijacking - example

[Figure: normal stack vs. smashed stack]

Buffer Overflow

Recent Worm Exploits

Worm Exploits

Worm Polymorphism

[Figure: the worm body randomly generates a new key and corresponding decryptor code for each copy (Mutation A, Mutation B, Mutation C); each mutation decrypts and executes the same body]

To detect an unknown mutation of a known virus, emulate CPU execution of the suspect code until the current sequence of instruction opcodes matches the known sequence for the virus body

Polymorphic Worm

Polymorphic Engine

ADMutate alters each of these elements
• NOP substitution with operationally inert commands
• Shell code encoded by XORing with a randomly generated key
• Return address modulated: least significant byte altered to jump into different parts of the NOPs

[Figure: layout of a mutated exploit. NOP substitutes ("NOP substitute", "Another NOP", "Yet another NOP", "A different NOP", "Here's a NOP"), polymorphic XOR decoder, XOR'ed machine code: execve (/bin/sh), modulated pointer to the NOP substitutes]

Mutation Engine
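The effect of XOR encoding on byte-level signatures can be shown with a harmless placeholder. The sketch below is not ADMutate; it assumes a single-byte key and uses a dummy string in place of real shellcode, and the helper names are illustrative.

```python
import os


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def mutate(body: bytes) -> tuple[int, bytes]:
    """Encode the body under a fresh random non-zero key, as a mutation engine might."""
    key = os.urandom(1)[0] or 1  # key 0 would leave the body unchanged
    return key, xor_bytes(body, key)


body = b"PLACEHOLDER-WORM-BODY"  # dummy stand-in for the encoded shellcode
key_a, mutation_a = mutate(body)
key_b, mutation_b = mutate(body)

print(mutation_a.hex())
print(mutation_b.hex())
# Both mutations decode to the same body, even though their bytes usually differ.
print(xor_bytes(mutation_a, key_a) == xor_bytes(mutation_b, key_b) == body)
```

A fixed byte-string signature for the body no longer matches the encoded copies, which leaves the decoder, the NOP substitutes, and the return address as the remaining places to look for invariants.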

Metamorphic Code - Examples

Original code

    MOV EAX, [X]
    MOV EBX, [Y]
    ADD EAX, EBX
    MOV [X], EAX

Code reordering
• Instructions that are independent are re-ordered

    MOV EBX, [Y]
    MOV EAX, [X]
    ADD EAX, EBX
    MOV [X], EAX

Garbage code insertion
• Instructions are inserted that are semantic no-ops (they do not affect the code and registers, and therefore execution)

    MOV EAX, [X]
    MOV EBX, [Y]
    ADD EAX, EBX
    PUSH ESI
    MOV [X], EAX
    POP ESI

Equivalent code replacement
• Register renaming, or semantically equivalent code

    XOR EAX, EAX
    ADD EAX, [X]
    ADD EAX, [Y]
    MOV [X], EAX

Register reassignment
• Swaps the usage of the registers
• Causes extensive "minor" changes in the code sequence

Zero-Day Attack Detection

Network-based
  Prevalence Model
  • Autograph/Polygraph
  • Earlybird
  Other Type
  • PayL
  • PacketVaccine
  Malicious Code detection
  • SigFree
  • Polymorphic Detection - Network execution
  • Control Flow Graph

Host-based
• MINOS
• DACODA

Research Trends

Prevalence Model – (1)

Key observation: define worm behavior
• Content invariance: portions of a worm are invariant (e.g. the decryption routine)
• Content prevalence: the invariant content appears frequently on the network
• Address dispersion: the distribution of destination addresses is more uniform, so that the worm spreads fast

* Source [Singh04] (Stefan Savage, UCSD)

Two consequences
• Content Prevalence: Rabin fingerprinting of 40-byte substrings, sampled at 1/64; prevalence threshold is 3
• Address Dispersion: threshold of 30 sources and 30 destinations

Packet content examination can be evaded with simple polymorphism

[1] EarlyBird
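The two tests can be combined in a small tracker. This is a simplified sketch, not the EarlyBird implementation: `zlib.crc32` stands in for a Rabin fingerprint, plain dictionaries stand in for the multi-stage hash tables, and the class and parameter names are assumptions. The defaults mirror the values on this slide.

```python
import zlib
from collections import defaultdict


class PrevalenceTracker:
    """Toy content-prevalence and address-dispersion tracker."""

    def __init__(self, substring_len=40, sample_mask=0x3F,
                 prevalence_threshold=3, dispersion_threshold=30):
        self.substring_len = substring_len
        self.sample_mask = sample_mask            # 0x3F keeps roughly 1 in 64 fingerprints
        self.prevalence_threshold = prevalence_threshold
        self.dispersion_threshold = dispersion_threshold
        self.count = defaultdict(int)             # fingerprint -> occurrences seen
        self.sources = defaultdict(set)           # fingerprint -> distinct source addresses
        self.destinations = defaultdict(set)      # fingerprint -> distinct destination addresses

    def observe(self, src: str, dst: str, payload: bytes) -> set[int]:
        """Process one packet; return fingerprints that now pass both thresholds."""
        suspicious = set()
        for i in range(len(payload) - self.substring_len + 1):
            fp = zlib.crc32(payload[i:i + self.substring_len])
            if fp & self.sample_mask:
                continue  # value sampling: skip fingerprints outside the sampled subset
            self.count[fp] += 1
            self.sources[fp].add(src)
            self.destinations[fp].add(dst)
            if (self.count[fp] >= self.prevalence_threshold
                    and len(self.sources[fp]) >= self.dispersion_threshold
                    and len(self.destinations[fp]) >= self.dispersion_threshold):
                suspicious.add(fp)
        return suspicious


# Small thresholds and no sampling so the toy example triggers quickly.
tracker = PrevalenceTracker(sample_mask=0, prevalence_threshold=3, dispersion_threshold=3)
content = bytes(range(40))  # one 40-byte invariant substring
for n in range(3):
    alerts = tracker.observe(f"10.0.0.{n}", f"10.0.1.{n}", content)
print(len(alerts))
```

Because the fingerprint is computed over raw bytes, re-encoding the worm body per copy changes every fingerprint, which is the evasion the slide points out.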

Key Observations
• TCP worms that propagate via scanning
• Worm's payloads share a common substring: the vulnerability exploit part is not easily mutable (not polymorphic)

Step 1: Select suspicious flows using heuristics
• Flows from scanners are suspicious

Step 2: Generate signature using content-prevalence analysis
• All instances of a worm have a common byte pattern specific to the worm
• Content-based Payload Partitioning (COPP)
  • Partition if the Rabin fingerprint of a sliding window matches the breakmark
  • Configurable parameters: content block size (minimum, average, maximum), breakmark, sliding window

* Source [Kim 04]

[2] Autograph
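The COPP step can be sketched as content-defined chunking. In this simplified sketch `zlib.crc32` over each window stands in for a rolling Rabin fingerprint, and all default parameter values are hypothetical rather than Autograph's own.

```python
import zlib


def copp_partition(payload: bytes, window: int = 16, average: int = 64,
                   minimum: int = 32, maximum: int = 1024,
                   breakmark: int = 0x07) -> list[bytes]:
    """Split a payload into variable-length content blocks.

    A block ends where the fingerprint of the last `window` bytes, taken
    modulo `average`, equals `breakmark`, subject to the minimum and maximum
    block sizes. Requires minimum >= window so a window never spans two blocks.
    """
    assert minimum >= window, "minimum block size must cover the sliding window"
    blocks, start = [], 0
    for end in range(1, len(payload) + 1):
        size = end - start
        if size < minimum:
            continue
        fingerprint = zlib.crc32(payload[end - window:end])
        if fingerprint % average == breakmark or size >= maximum:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])  # trailing block may be shorter than minimum
    return blocks


sample = bytes((i * 37 + 11) % 256 for i in range(2000))
print([len(block) for block in copp_partition(sample)])
```

Because boundaries depend on content rather than on byte offsets, the same worm substring tends to fall into the same blocks even when it is shifted within different flows, which is what makes block counts comparable across flows.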

A protocol through which multiple distributed Autograph monitors may share information

[Figure: distributed Autograph monitors. Step 1: select suspicious flows using heuristics; Step 2: generate signature using content prevalence analysis]

* Source [Kim 04]

[2] Autograph


No one substring is specific enough, BUT there are multiple substrings
• Protocol framing
• Value used to overwrite return address
• (Parts of poorly obfuscated code)

Approach: combine the substrings (3 bytes in size)

* Source [Newsome05]

[3] Polygraph
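Combining substrings can be illustrated with a conjunction of tokens: a flow matches only if it contains every token. This is a minimal sketch assuming fixed-length 3-byte tokens and unordered matching; the function names and the sample flows are made up for illustration and are not Polygraph's algorithm.

```python
def common_tokens(samples: list[bytes], token_len: int = 3) -> set[bytes]:
    """Return the token_len-byte substrings that occur in every suspicious sample."""
    def grams(data: bytes) -> set[bytes]:
        return {data[i:i + token_len] for i in range(len(data) - token_len + 1)}

    tokens = grams(samples[0])  # assumes at least one sample
    for sample in samples[1:]:
        tokens &= grams(sample)
    return tokens


def matches(signature: set[bytes], payload: bytes) -> bool:
    """Conjunction signature: every token must appear somewhere in the payload."""
    return all(token in payload for token in signature)


# Two made-up polymorphic instances sharing protocol framing and a return-address value.
samples = [
    b"GET /a1 HTTP/1.1\r\nHost: \xff\xbf\x10Zq",
    b"GET /zz9 HTTP/1.1\r\nHost: \xff\xbf\x10kLm",
]
signature = common_tokens(samples)
print(sorted(signature))
print(matches(signature, samples[0]), matches(signature, b"POST /form HTTP/1.0"))
```

Each token alone (for example the framing bytes) would match benign traffic; requiring all of them together is what restores specificity.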

Summary of Prevalence Model Detection

| | Earlybird | Autograph/Polygraph |
| Traffic | Contents Prevalence -> Address dispersion | Suspicious Flow Selection -> Contents Prevalence |
| Preprocessing | 1/64 string sample | Session reassembly |
| Suspicious Traffic | NA | Port Scan detection; Session Success Rate; Address Dispersion |
| Prevalent Content Extraction | Multi-stage Hash | Longest Common Substring |
| Signature size | 40 bytes | 3-byte tokens |
| Whitelist | Heuristic | Heuristic |
| Demerits | High false positive; sampled traffic; heuristic-based whitelist management; does not handle polymorphic worms | Processing power degraded by session reassembly; high session fail rate due to P2P service |

Trends: No more super-worm outbreaks since 2004

Prevalence Model

Payload Anomaly

PAYL

Compute a "normal profile" of a site's unique content flow, and use this information to detect anomalous data

n-gram
• An n-gram is the sequence of n adjacent byte values in a packet payload
• A sliding window with width n is passed over the whole payload one byte at a time and the frequency of each n-gram is computed
• The frequency count distribution represents a model of the content flow ("statistical centroid")

Compare the similarity between test data and the trained models
• Mahalanobis distance
• If the distance of a test datum is greater than the threshold, the system issues an alert

* Source [Wang05]

[4] PAYL
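For n = 1 the model reduces to a byte-frequency profile. The sketch below assumes a simplified Mahalanobis distance (per-byte absolute deviation divided by the standard deviation plus a smoothing factor `alpha`); the function names, the `alpha` value, and the training payloads are illustrative, and a real deployment would also condition the model on port and payload length.

```python
from collections import Counter


def byte_frequencies(payload: bytes) -> list[float]:
    """Relative frequency of each of the 256 byte values (the 1-gram distribution)."""
    counts = Counter(payload)
    total = len(payload) or 1
    return [counts.get(b, 0) / total for b in range(256)]


def train(payloads: list[bytes]) -> tuple[list[float], list[float]]:
    """Per-byte mean and standard deviation over the training payloads."""
    profiles = [byte_frequencies(p) for p in payloads]
    n = len(profiles)
    mean = [sum(p[b] for p in profiles) / n for b in range(256)]
    std = [(sum((p[b] - mean[b]) ** 2 for p in profiles) / n) ** 0.5
           for b in range(256)]
    return mean, std


def simplified_mahalanobis(freq, mean, std, alpha: float = 0.001) -> float:
    """Sum of |x_i - mean_i| / (std_i + alpha); alpha avoids division by zero."""
    return sum(abs(f - m) / (s + alpha) for f, m, s in zip(freq, mean, std))


normal = [
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    b"GET /about.html HTTP/1.1\r\nHost: example.org\r\n\r\n",
]
mean, std = train(normal)
suspect = b"\x90" * 40 + b"\xff\xbf" * 4  # NOP-like run plus address-like bytes
print(simplified_mahalanobis(byte_frequencies(normal[0]), mean, std))
print(simplified_mahalanobis(byte_frequencies(suspect), mean, std))
```

The suspect payload scores far higher because byte values never seen in training have zero mean and zero deviation, so any occurrence is heavily penalized.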

[Figure: character (byte value) distribution of a normal HTTP request vs. CodeRed II]

Jump Address Detection – (1)

Vaccine Generation: detects anomalous packet payloads, e.g. a byte sequence resembling a jump address, and randomizes the selected contents

Exploit Detection: if the packet carries an exploit attempt, the vaccine should now trigger an exception in the vulnerable program

Vulnerability Diagnosis: correlates the exception with the vaccine to acquire information about the exploit: the corrupted pointer content and its location in the exploit packet

Signature Generation: creates variations of the original exploit to probe the vulnerable program, in an effort to identify the exploit conditions necessary for generating a signature

* Source [X. Wang, CCS2006]

[5] Packet Vaccine

Vaccine Generation
• A key step in most exploits is to inject a jump address to redirect the control flow of a vulnerable program
• Such an address points to
  • the stack or heap in a code-injection attack
  • a global library entry in an existing-code attack

Jump Address Detection – (2)

Approach
• Check every 4-byte sequence (32-bit system) or 8-byte sequence (64-bit system)
• Randomize those which fall in the address range of the potential jump targets in a protected program
• The randomized address should cause an exception: a segmentation fault (SEGV) or an illegal instruction fault (ILL)

* Source [X. Wang, CCS2006]

[5] Packet Vaccine

[Figure: BLASTER worm attack scenario between Attacker and Victim]

1. Probe: connection scan attempt (TCP 135)
2. RPC DCOM request (TCP 135)
   • DCOM object: insufficient bounds checking leads to a buffer overflow
   • RPC Endpoint Mapper listens on 135, 139, 445, 593
3. Shell code: binds port 4444 (listening on TCP 4444)
4. Start TFTP server
5. Remote command attempt
   5.1 tftp <host> GET msblast.exe
6. TFTP download request (UDP 69)
7. Delivery of the main worm body "msblast.exe" (TFTP: UDP 69)
   7.1 start msblast.exe
8. SYN flooding of WindowsUpdate.com

BLASTER Worm

Attack Scenario

Malicious Code Detection Approach – (1)

Windows platforms
• Port 80: Web Service
• Port 111, 137, 138, 139: Remote access services
• Port 1434: MS-SQL Servers
• Port 139, 445: Workstation services

Linux platforms
• Port 80: Apache Webserver
• Port 53: BIND
• Port 161: SNMP
• Port 25: Mail
• Port 1521, 3306, 5432: Database servers

Accept Data Only

Assumptions
• Buffer overflow attacks typically contain executables, whereas legitimate client requests never contain executables in most Internet services
• Therefore, if a packet contains executables, it would be an attack

Malicious Code

Malicious Code Detection

SigFree blocks attacks by detecting the presence of code
• Signature free
• Immunized from most attack-side obfuscation methods
• Generic code-data separation criteria
• Transparency
• Negligible throughput degradation
• Economical deployment with very low maintenance cost

Scope
• Web service (port 80)
• Buffer overflow attacks
  • Actually it's not a BOF detection algorithm; it's an executable code detection algorithm
• Application-level attacks such as data manipulation and SQL injection are out of scope
• IA-32 (Intel)
• Packet based (no reassembly)

Assumption: normal requests do not contain executable code

* Source [Wang06]

[6] SigFree

SigFree Overview

SigFree architecture

[Figure: SigFree architecture, including the extended instruction flow graph over all possible instructions]

• Scheme 1: exploits the OS characteristics of a program (faster)
• Scheme 2: exploits the data flow characteristics of a program (more robust)

* Source [Wang06]

[6] SigFree

SigFree - Limitation

Limitations
• SigFree can't fully handle branch-function based obfuscation
• SigFree can't detect shellcode that is written in alphanumeric form
• SigFree can't detect malicious code that consists of fewer useful instructions than the current threshold of 15
• SigFree can't detect encrypted executable code

* Source [Wang06]

[6] SigFree

Network-Level Execution – (1)

[Figure: emulation pipeline over the input stream (start to end). Stages: byte shifting, disassembly and execution, invalid memory accesses and invalid instructions, decryptor memory read loop with a Mem Read Count; if the count is over the threshold, the stream is decided to be an attack]

• Executes every potential instruction sequence, aiming to identify the execution behavior of polymorphic shellcodes
• Compares their execution profile against the behavior observed to be inherent to polymorphic shellcodes

* Source [Michalls06]

[7] Polymorphic Shellcode Detection

Network-Level Execution – (2)

Pattern 1: During decryption, the decryptor must read the encrypted payload in order to decrypt it.
Criterion 1: the number of payload reads in an execution chain > Payload Read Threshold (PRT)

Pattern 2: A mandatory operation of every polymorphic shellcode is to find its location in memory using some form of "Get PC (%eip)".
Criterion 2: the chain executes some form of "Get PC (%eip)"

[Figure: an execution chain, showing the "Get PC" code and the execution chain for payload reads]

* Source [Michalls06]

[7] Polymorphic Shellcode Detection
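The two criteria can be expressed as a check on the summary an emulator would produce for one execution chain. This sketch does not emulate x86; `ExecutionChain`, the set of GetPC mnemonics, the PRT value, and the choice to require both criteria together are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumption: instructions commonly used to obtain the program counter.
GETPC_MNEMONICS = {"call", "fstenv", "fnstenv"}


@dataclass
class ExecutionChain:
    """Summary a (hypothetical) emulator records for one execution chain."""
    mnemonics: list[str] = field(default_factory=list)  # executed instructions, in order
    payload_reads: int = 0                               # memory reads that hit the input stream


def looks_like_polymorphic_shellcode(chain: ExecutionChain,
                                     payload_read_threshold: int = 32) -> bool:
    """Apply Criterion 1 (payload reads > PRT) and Criterion 2 (GetPC executed)."""
    criterion_1 = chain.payload_reads > payload_read_threshold
    criterion_2 = any(m.lower() in GETPC_MNEMONICS for m in chain.mnemonics)
    return criterion_1 and criterion_2


decryptor_like = ExecutionChain(["jmp", "call", "pop", "xor", "loop"], payload_reads=120)
random_bytes = ExecutionChain(["inc", "push", "and"], payload_reads=2)
print(looks_like_polymorphic_shellcode(decryptor_like))
print(looks_like_polymorphic_shellcode(random_bytes))
```

Random data occasionally decodes to a few valid instructions, but it rarely both obtains the program counter and then reads the input stream many times, which is why the two patterns are checked together here.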

PW Detection : CFG – (1)

• Perform a linear disassembly from the first byte of a stream to extract the machine instructions
• Remove invalid basic blocks (resulting from the disassembly of non-code byte streams)
• A block is invalid
  • if it contains one or more invalid instructions,
  • if it is on a path to an invalid block, or
  • if it ends in a control transfer instruction that jumps into the middle of another instruction

Control Flow Graph Extraction

* Source [Kruegel 05]

[8] Control Flow Graph Extraction

• Linear disassembly of the byte stream
• Nodes: each describes a sequence of instructions without any jumps
• Edges: a jump instruction makes the transition from one node to another

CFG of binary code: a cluster of closely connected nodes
CFG of a random sequence: isolated nodes

[Figure: example CFG. Block 1: MOV A 10; MOV B 10; ADD B; JMP BLOCK2. Block 2: MOV A 15; MOV B 20; MUL B]

CFG Construction

* Source [Kruegel 05]

Robustness to modification
• Junk insertion, register renaming, code transposition, instruction substitution

Uniqueness
• Different executable regions should map to different fingerprints

[8] CFG Construction

Classify instructions into 14 sets
• A 14-bit colour value is associated with each node (1 bit corresponds to 1 class)
• When one or more instructions of a certain class appear in the basic block, the corresponding bit of the basic block's colour value is set to 1
• E.g.
    MOV A, B     00000000000010
    MUL A, 10    00000000000001
    PUSH A       00000000010000
    Node colour: 00000000010011

Append the 14-bit colour value to each node in the adjacency matrix of the subgraph. Concatenate the rows as before to get the new fingerprint.

* Source [Kruegel 05]

[8] Graph Coloring
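The colour computation is a bitwise OR over the classes present in a basic block. In this sketch the class names and bit positions are hypothetical; they are chosen only so that the slide's MOV/MUL/PUSH example comes out the same, and only three of the 14 classes are filled in.

```python
# Hypothetical subset of the 14 instruction classes and their bit positions.
CLASS_BIT = {"arithmetic": 0, "data_transfer": 1, "stack": 4}
MNEMONIC_CLASS = {
    "MUL": "arithmetic", "ADD": "arithmetic",
    "MOV": "data_transfer",
    "PUSH": "stack", "POP": "stack",
}


def node_colour(mnemonics: list[str]) -> int:
    """OR together one bit per instruction class that appears in the basic block."""
    colour = 0
    for mnemonic in mnemonics:
        instruction_class = MNEMONIC_CLASS.get(mnemonic.upper())
        if instruction_class is not None:
            colour |= 1 << CLASS_BIT[instruction_class]
    return colour


block = ["MOV", "MUL", "PUSH"]
print(format(node_colour(block), "014b"))  # 00000000010011
```

Because a class sets its bit at most once, repeating or reordering instructions inside a block does not change the colour, which keeps the fingerprint stable under simple instruction substitution within a class.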

Classification of Malicious Code Detection

Static Analysis: structure of executables
• Analysis without "execution"
• Disassembling -> string analysis, frequency analysis
• Structure of a program is described by its control flow graph (CFG)
• Cannot be used to detect novel malware instances
• Used to recognize obfuscated variants of the same code instance

Dynamic Analysis: behavior of executables
• File is "executed" in a safe environment
  • VMWare, SandBox
• Behavior of a program is the observable effect that it has on its environment
  • RegMon, FileMon, syscall monitoring
• Considers the behavior of a whole class of malware

Static vs. Dynamic Analysis

Payload Analysis

[Figure: payload analysis flow. Known file type (executables, exploits, known files) and unknown file type (data with unknown file type); analysis steps shown: static analysis, dynamic analysis, crypto analysis, find out the "responsible" program (loader)]

Payload Analysis

Host-based Detection – (1)

Tagged architecture that tracks the integrity of every memory word
• Network data is tainted
• Control data (return pointers, function pointers, jump targets, etc.) should not be
• Taint tracking with every instruction
• Great for catching worms

Uses the γ mapping

Implemented a full-system tagging scheme in a virtual machine
• Linux (modified kernel)
  • Tracks integrity in the file system
  • Virtual memory swapping
• Windows (unmodified)
  • Works great as a honeypot for catching worms

* Source [Crandall04]

[9] MINOS
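The tagging idea can be illustrated with a toy model in which every word carries one taint bit. This is a conceptual sketch only: the `Word` type, the propagation rule for `add`, and the `jump` check are simplified assumptions and do not reproduce Minos's actual propagation policy or hardware design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A memory word with a one-bit integrity tag (tainted = low integrity)."""
    value: int
    tainted: bool = False


class ControlDataAttack(Exception):
    """Raised when tainted data is about to be used as a control-transfer target."""


def from_network(value: int) -> Word:
    return Word(value, tainted=True)  # all network data starts out tainted


def add(a: Word, b: Word) -> Word:
    # Simplified propagation: the result is tainted if either operand is.
    return Word((a.value + b.value) & 0xFFFFFFFF, a.tainted or b.tainted)


def jump(target: Word) -> int:
    """Control data (return pointers, function pointers, jump targets) must be untainted."""
    if target.tainted:
        raise ControlDataAttack(f"tainted jump target 0x{target.value:08x}")
    return target.value


saved_return = Word(0x08048ABC)        # written by the program itself
print(hex(jump(saved_return)))

overwritten = from_network(0xBFFFF5C0)  # return pointer overwritten by packet data
try:
    jump(overwritten)
except ControlDataAttack as alert:
    print("alert:", alert)
```

The check fires at the moment of the control transfer, independent of what the payload looks like, which is why this style of detection works as a honeypot for previously unseen worms.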

Host-based Detection – (2)

DAvis malCODe Analyzer
• Discover invariants in the exploit vector (ε)
• Symbolic execution on the system trace during attacks that Minos catches
• Used for an empirical analysis of polymorphism and metamorphism
• Quantify and understand the limits

* Source [Crandall05]

[10] DACODA

Bibliography

1) [Singh04] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In OSDI, 2004.

2) [Kim 04] H.-A. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In USENIX Security Symposium, pages 271-286, 2004.

3) [Newsome05] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and Privacy, May 2005.

4) [Wang05] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous payload-based worm detection and signature generation. In Proceedings of the 14th Usenix Security Symposium, Baltimore, MD, USA, July 31 – August 5, 2005.

5) [X. Wang 06] XiaoFeng Wang, Zhuowei Li, Jun Xu, Michael K. Reiter, Chongkyung Kil, and Jong Youl Choi. Packet Vaccine: Black-box Exploit Detection and Signature Generation. CCS 2006.

6) [Wang06] SigFree: A Signature-free Buffer Overflow Attack Blocker. Usenix Security 2006.

7) [Michalls06] Michalis Polychronakis, Kostas G. Anagnostakis, and Evangelos P. Markatos. Network-Level Polymorphic Shellcode Detection Using Emulation. DIMVA 2006.

8) [Kruegel05] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), September 2005.

9) [Crandall 04] Jedidiah R. Crandall and Frederic T. Chong. Minos: Control Data Attack Prevention Orthogonal to Memory Model. IEEE/ACM International Symposium on Microarchitecture, pages 221-232, IEEE Computer Society, 2004.

10) [Crandall 05] J. R. Crandall, Z. Su, S. F. Wu, and F. T. Chong. On Deriving Unknown Vulnerabilities from Zero-Day Polymorphic and Metamorphic Worm Exploits. ACM CCS, pages 235–248, November 2005.
