Zero-day Attack
April 12, 2007 (ikkim21@etri.re.kr)
Jan 13, 2016
Contents
Vulnerability & Zero-day Attack
Intrusion Detection
Detection Model
Research Trends : Zero-day Attack Detection
Vulnerabilities
• Half-life of critical vulnerabilities is 21 days
• Half of the most prevalent vulnerabilities are replaced by new vulnerabilities every year
• Lifespan of some vulnerabilities and worms is unlimited
• 80% of worms and automated exploits occur in the first two half-lives
*Source : Gerhard Eschelbeck of Qualys at Blackhat 2004
Laws of Vulnerabilities
Zero-day Attacks
• A zero-day attack is a computer threat that exposes undisclosed or unpatched computer application vulnerabilities. (defined by Wikipedia)
• May 2005: Zero-day exploits for unknown vulnerabilities in Mozilla Firefox
[Figure: timeline from vulnerability disclosure to worm release]
• July 24, 2002: SQL Server buffer-overflow vulnerability disclosed; Jan 25, 2003: Slammer worm released (185-day attack)
• Apr 11, 2004: LSASS buffer-overflow vulnerability disclosed; Apr 30, 2004: Sasser worm released (19-day attack)
• 2006 ??: future worm released (0-day attack)
Time-gap between vulnerability disclosure and release of a worm that exploits it is decreasing
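The shrinking time-gap can be checked with simple date arithmetic. The sketch below uses only the disclosure and release dates shown on the timeline; the function name `exposure_window_days` is illustrative and not part of the original slides.

```python
from datetime import date


def exposure_window_days(disclosed, worm_released):
    """Days between vulnerability disclosure and release of a worm exploiting it."""
    return (worm_released - disclosed).days


# Dates taken from the timeline on this slide.
slammer = exposure_window_days(date(2002, 7, 24), date(2003, 1, 25))
sasser = exposure_window_days(date(2004, 4, 11), date(2004, 4, 30))
print(f"Slammer: {slammer} days, Sasser: {sasser} days")
```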
Intrusion Detection - I
Misuse Analysis: Signature-based
[Figure: signature-based intrusion detection architecture. Components: Network, Packet Sensor (Packet Parsing), Audit Data, Detection Engine (Pattern Matching against the Signature Database, Comparison Objects), Alert, Response Manager (Response), Expert Manager (Rule Management)]
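A minimal sketch of the pattern-matching step in a signature-based detection engine follows. The signature names and byte patterns are hypothetical examples chosen for illustration; a production engine would use a multi-pattern matching algorithm rather than a linear scan over the signature database.

```python
# Hypothetical signature database: name -> byte pattern to search for.
SIGNATURES = {
    "example-nop-sled-shell": b"\x90\x90\x90\x90/bin/sh",
    "example-ida-probe": b"GET /default.ida?",
}


def match_signatures(payload, signatures=SIGNATURES):
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


alerts = match_signatures(b"GET /default.ida?NNNNNNNN HTTP/1.0\r\n")
print(alerts)
```

An unknown (zero-day) attack produces no alert here, because no pattern for it exists in the database yet.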
Signature-based ID
Today: Deep Packet Inspection capability, ASIC-based appliances
• TippingPoint: IPS 5000E
• TopLayer: IPS 5500
• Cisco: IPS 4255
By 2006, 75 percent of Global 2000 enterprises will replace or augment their firewall approach with deep packet inspection capabilities
By 2005, enterprises will no longer use software-based application proxy firewalls
Source - Deep Packet Inspection: Next Phase of Firewall Evolution
(21 November 2002, Gartner)
By 2009 the UTM space will be the largest single market.
DPI : High Performance Pattern Matching
Intrusion Detection - II
Anomaly Detection
[Figure: anomaly-based intrusion detection architecture. Components: Network, Packet Sensor (Packet Parsing), Audit Data, Anomaly Analysis Engine (Statistical, Data Mining, Neural Net.), Learning Profile, Learning & Comparison Objects, Alert, Report]
Zero-day Attack Protection
Current Signature Generation Process
• New worm outbreak
• Report of anomalies from people via phone/email/newsgroup
• Worm trace is captured
• Manual analysis by security experts
• Signature generation
Labor-intensive, human-mediated
Anomaly Detection + Signature Generation + High Performance FW (IPS)
Control Flow Hijacking Worm Model
[Figure: worm packet layout: IP header | TCP/UDP header | upper-layer payload containing NOPs, decryption code, attack code, and the exploit (return address)]
• Epsilon (ε) = Exploit Vector
• Gamma (γ) = Bogus Control Data
• Pi (π) = Payload
* Source [Crandall05]
Epsilon-Gamma-Pi Model
ε-γ-π Model
• Epsilon (ε) = HTTP Header
• Gamma (γ) = Return Address
• Pi (π) = CodeRed Shellcode
* Source [Crandall05]
CodeRed II Case
Control hijacking - example
[Figure: a normal stack compared with two smashed stacks]
Buffer Overflow
Recent Worm Exploits
Worm Exploits
Polymorphic Worm
[Figure: a worm body randomly generates a new key and corresponding decryptor code, producing Mutation A, Mutation B, and Mutation C; each mutation decrypts and executes the body]
To detect an unknown mutation of a known virus, emulate CPU execution until the current sequence of instruction opcodes matches the known sequence for the virus body.
Polymorphic Engine
ADMutate alters each of these elements
• NOP substitution with operationally inert commands
• Shell code encoded by XORing with a randomly generated key
• Return address modulated: least significant byte altered to jump into different parts of the NOPs
[Figure: mutated exploit layout: NOP substitutes | polymorphic XOR decoder | XOR'ed machine code: execve(/bin/sh) | modulated pointer to NOP substitutes]
Mutation Engine
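The effect of XOR encoding with a randomly generated key can be illustrated on harmless data. In the sketch below the body is a placeholder string (not real shellcode), and the function names are illustrative; it only shows that each mutation has different bytes while decoding with the same key restores the original.

```python
import os


def xor_encode(body, key):
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in body)


def random_key():
    """Pick a random non-zero single-byte key (a zero key would leave the body unchanged)."""
    key = 0
    while key == 0:
        key = os.urandom(1)[0]
    return key


body = b"EXAMPLE-WORM-BODY"  # harmless placeholder standing in for the worm body
key = random_key()
mutation = xor_encode(body, key)

# The encoded bytes differ from the original, so a byte signature for the
# body no longer matches; the decoder recovers the body with the same key.
print(mutation != body, xor_encode(mutation, key) == body)
```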
Metamorphic Code - Examples
Code reordering
• Instructions that are independent are re-ordered
    Before: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
    After:  MOV EBX, [Y]; MOV EAX, [X]; ADD EAX, EBX; MOV [X], EAX
Garbage code insertion
• Instructions are inserted that are semantic no-ops (they do not affect the code and registers, and therefore execution)
    Before: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
    After:  MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; PUSH ESI; MOV [X], EAX; POP ESI
Equivalent code replacement
• Register renaming, or semantically equivalent code
    Before: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
    After:  XOR EAX, EAX; ADD EAX, [X]; ADD EAX, [Y]; MOV [X], EAX
Register reassignment
• Swaps the usage of the registers
• Causes extensive "minor" changes in the code sequence
Zero-Day Attack Detection
Network-based
• Prevalence Model
  • Autograph/Polygraph
  • Earlybird
• Other Type
  • PAYL
  • PacketVaccine
• Malicious Code Detection
  • SigFree
  • Polymorphic Detection - Network execution
  • Control Flow Graph
Host-based
• MINOS
• DACODA
Research Trends
Prevalence Model – (1)
Key observation: define worm behavior
• Content invariance
  • Portions of a worm are invariant (e.g. the decryption routine)
• Content prevalence
  • Appears frequently on the network
• Address dispersion
  • Distribution of destination addresses is more uniform, to spread fast
* Source [Singh04]
(Stefan Savage, UCSD *)
Two consequences
• Content Prevalence: 1/64 sampled Rabin fingerprinting, 40-byte substrings, prevalence threshold is 3
• Address Dispersion: threshold of 30 sources and 30 destinations
Packet content examination can be evaded with simple polymorphism
[1] EarlyBird
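The two tests above (content prevalence, then address dispersion) can be sketched as follows. This is a simplified illustration, not EarlyBird's implementation: it keys a dictionary on raw substrings instead of sampled Rabin fingerprints and multi-stage hashes, and the function and parameter names are assumptions. The default thresholds mirror the values on the slide.

```python
from collections import defaultdict


def find_suspicious_content(packets, width=40, prevalence_threshold=3,
                            dispersion_threshold=30):
    """packets: iterable of (source, destination, payload bytes).

    Returns the substrings that are both prevalent and widely dispersed.
    """
    counts = defaultdict(int)
    sources = defaultdict(set)
    destinations = defaultdict(set)
    flagged = set()
    for src, dst, payload in packets:
        # Count each distinct substring once per packet.
        substrings = {payload[i:i + width]
                      for i in range(len(payload) - width + 1)}
        for sub in substrings:
            counts[sub] += 1
            sources[sub].add(src)
            destinations[sub].add(dst)
            if (counts[sub] > prevalence_threshold
                    and len(sources[sub]) > dispersion_threshold
                    and len(destinations[sub]) > dispersion_threshold):
                flagged.add(sub)
    return flagged
```

Content repeated between a single pair of hosts is prevalent but not dispersed, so it is not flagged; only content seen from many sources to many destinations passes both tests.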
Prevalence Model – (2)
Key Observations
• TCP worms propagate via scanning
• A worm's payloads share a common substring
  • The vulnerability exploit part is not easily mutable and not polymorphic
Step 1: Select suspicious flows using heuristics
• Flows from scanners are suspicious
Step 2: Generate signature using content-prevalence analysis
• All instances of a worm have a common byte pattern specific to the worm
• Content-based Payload Partitioning (COPP)
  • Partition if the Rabin fingerprint of a sliding window matches the breakmark
  • Configurable parameters: content block size (minimum, average, maximum), breakmark, sliding window
* Source [Kim 04]
[2] Autograph
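Content-based payload partitioning can be sketched as below. The fingerprint is a simple stand-in (an integer modulo a prime), not a true Rabin fingerprint, and the default parameter values are illustrative assumptions rather than Autograph's settings. The sketch assumes `minimum >= window`.

```python
def window_fingerprint(window):
    """Stand-in for a Rabin fingerprint of the sliding window."""
    return int.from_bytes(window, "big") % 1000003


def copp_partition(payload, window=4, average=16, breakmark=7,
                   minimum=8, maximum=64):
    """Split payload into content blocks at content-defined boundaries."""
    blocks = []
    start = 0
    for end in range(1, len(payload) + 1):
        size = end - start
        if size < minimum:
            continue  # never cut a block shorter than the minimum size
        fingerprint = window_fingerprint(payload[end - window:end])
        # Cut when the fingerprint matches the breakmark, or at the maximum size.
        if fingerprint % average == breakmark or size >= maximum:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])  # trailing block may be short
    return blocks
```

Because boundaries depend on the content near the cut rather than on fixed offsets, the same worm bytes tend to yield the same blocks even when the surrounding payload shifts, which is what makes block-level prevalence counting possible.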
Prevalence Model – (2)
A protocol through which multiple distributed Autograph monitors may share information
Step 1: Select suspicious flows using heuristics
Step 2: Generate signature using content-prevalence analysis
* Source [Kim 04]
[2] Autograph
Prevalence Model – (3)
• No one substring is specific enough
• BUT, there are multiple substrings
  • Protocol framing
  • Value used to overwrite the return address
  • (Parts of poorly obfuscated code)
• Approach: combine the substrings (3-byte tokens)
* Source [Newsome05]
[3] Polygraph
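The idea of combining several short substrings into one signature can be sketched as a conjunction of tokens. This is a simplification under stated assumptions: it intersects fixed-length 3-byte grams across the suspicious samples, whereas Polygraph extracts variable-length tokens; the function names and sample payloads are hypothetical.

```python
def common_tokens(samples, token_len=3):
    """Return the token_len-byte substrings that occur in every sample."""
    common = None
    for sample in samples:
        grams = {sample[i:i + token_len]
                 for i in range(len(sample) - token_len + 1)}
        common = grams if common is None else common & grams
    return common or set()


def matches(signature, payload):
    """Conjunction signature: every token must appear somewhere in the payload."""
    return bool(signature) and all(token in payload for token in signature)


# Two hypothetical worm samples that differ in their variable part.
samples = [b"GET /aaa HTTP\xde\xad\xbe", b"GET /zzzzz HTTP\xde\xad\xbe"]
signature = common_tokens(samples)
```

Protocol framing tokens such as `GET` alone would match benign traffic; requiring them together with the return-address bytes is what makes the combined signature specific.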
Summary of Prevalence Model Detection

Item | Earlybird | Autograph/Polygraph
Traffic | Contents Prevalence -> Address Dispersion | Suspicious Flow Selection -> Contents Prevalence
Preprocessing | 1/64 string sample | Session reassembly
Suspicious Traffic | NA | Port scan detection; session success rate; address dispersion
Prevalent Content Extraction | Multi-stage hash | Longest common substring
Signature size | 40 bytes | 3-byte tokens
Whitelist | Heuristic | Heuristic
Demerits | High false positives; sampled traffic; heuristic-based whitelist management; does not handle polymorphic worms | Processing power degraded by session reassembly; high session fail rate due to P2P services
Trends | No more super-worm outbreaks since 2004 | No more super-worm outbreaks since 2004
Prevalence Model
Payload Anomaly: PAYL
• Compute a "normal profile" of a site's unique content flow, and use this information to detect anomalous data
• n-gram
  • An n-gram is the sequence of n adjacent byte values in a packet payload
  • A sliding window with width n is passed over the whole payload one byte at a time and the frequency of each n-gram is computed
  • The frequency count distribution represents a model of the content flow ("statistical centroid")
• Compare the similarity between test data and the trained models
  • Mahalanobis distance
  • If the distance of a test datum is greater than the threshold, the system issues an alert
* Source [Wang05]
[4] PAYL
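A minimal sketch of this profile-and-compare idea for n = 1 follows. It assumes a simplified Mahalanobis distance (per-byte absolute deviation divided by the standard deviation plus a smoothing factor `alpha`); the function names, the value of `alpha`, and the training payloads are illustrative assumptions, and a real deployment would train per port and payload length on far more traffic.

```python
from statistics import mean, pstdev


def byte_frequencies(payload):
    """Relative frequency of each of the 256 byte values (the 1-gram distribution)."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    total = len(payload) or 1
    return [c / total for c in counts]


def train(payloads):
    """Build the model: per-byte mean and standard deviation of the frequencies."""
    columns = list(zip(*(byte_frequencies(p) for p in payloads)))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def distance(payload, model, alpha=0.001):
    """Simplified Mahalanobis distance between a payload and the trained model."""
    means, stds = model
    freqs = byte_frequencies(payload)
    return sum(abs(x - m) / (s + alpha) for x, m, s in zip(freqs, means, stds))


normal = [b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1",
          b"GET /news.html HTTP/1.1"]
model = train(normal)
d_normal = distance(b"GET /home.html HTTP/1.1", model)
d_binary = distance(bytes([0x90]) * 20 + bytes(range(200, 220)), model)
print(d_normal < d_binary)
```

An alert would be raised when the distance exceeds a threshold chosen from the training data.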
[Figure: character distribution of a normal HTTP request compared with the character distribution of CodeRed II]
Jump Address Detection – (1)
Vaccine Generation
• Detection of anomalous packet payloads (a byte sequence resembling a jump address) and randomization of the selected contents
Exploit Detection
• Detects an exploit attempt: it should now trigger an exception in a vulnerable program
Vulnerability Diagnosis
• Correlates the exception with the vaccine to acquire information regarding the exploit: the corrupted pointer content and its location in the exploit packet
Signature Generation
• Creates variations of the original exploit to probe the vulnerable program, in an effort to identify the necessary exploit conditions for generation of a signature
* Source [X. Wang, CCS2006] [5] Packet Vaccine
Jump Address Detection – (2)
Vaccine Generation
• A key step in most exploits is to inject a jump address to redirect the control flow of a vulnerable program
• Such an address points to
  • the stack or heap in a code-injection attack
  • a global library entry in an existing-code attack
Approach
• Check every 4-byte sequence (32-bit system) or 8-byte sequence (64-bit system)
• Randomize those which fall in the address range of the potential jump targets in a protected program
• This should cause an exception: a segmentation fault (SEGV) or an illegal instruction fault (ILL)
* Source [X. Wang, CCS2006] [5] Packet Vaccine
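The approach above can be sketched for the 32-bit case. The address range, the little-endian byte order, and the function name are assumptions made for illustration; the replacement bytes come from an injectable `rng` so the behaviour can be tested deterministically, and a random replacement could by chance still fall inside the range.

```python
import os
import struct


def generate_vaccine(payload, address_ranges, rng=os.urandom):
    """Scramble every 4-byte sequence that looks like a jump address.

    address_ranges: list of (low, high) half-open ranges of potential jump
    targets in the protected program (hypothetical values supplied by the caller).
    Returns the vaccine bytes and the offsets that were randomized.
    """
    data = bytearray(payload)
    offsets = []
    i = 0
    while i + 4 <= len(data):
        value = struct.unpack_from("<I", data, i)[0]  # little-endian 32-bit word
        if any(low <= value < high for low, high in address_ranges):
            data[i:i + 4] = rng(4)
            offsets.append(i)
            i += 4  # skip past the word that was just replaced
        else:
            i += 1
    return bytes(data), offsets
```

The recorded offsets are what later lets the exception be correlated with the position of the corrupted pointer in the packet.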
BLASTER Worm: Attack Scenario
[Figure: message exchange between Attacker and Victim]
1. Probe-Connection Scan Attempt (TCP 135)
2. RPC DCOM Request (TCP 135)
   • DCOM object: insufficient bounds checking (buffer overflow)
   • RPC Endpoint Mapper listens on 139, 135, 445, 593
3. Shell Code: Binding Port 4444 (listening on TCP 4444)
4. Start TFTP Server (msblast.exe)
5. Remote Command Attempt
   5.1 tftp <host> GET msblast.exe
6. TFTP Download Request (UDP 69)
7. Delivery of Main Worm Body "msblast.exe" (TFTP: UDP 69)
   7.1 start msblast.exe
8. SYN Flooding of WindowsUpdate.com
Malicious Code Detection Approach – (1)
Windows platforms
• Port 80: Web service
• Port 111, 137, 138, 139: Remote access services
• Port 1434: MS-SQL servers
• Port 139, 445: Workstation services
Linux platforms
• Port 80: Apache web server
• Port 53: BIND
• Port 161: SNMP
• Port 25: Mail
• Port 1521, 3306, 5432: Database servers
Accept Data Only
Assumptions
• Buffer overflow attacks typically contain executables, whereas legitimate client requests never contain executables in most Internet services
• If a packet contains executables, it would be an attack
Malicious Code
Malicious Code Detection
SigFree blocks attacks by detecting the presence of code
• Signature free
• Immunized from most attack-side obfuscation methods
• Generic code-data separation criteria
• Transparency
• Negligible throughput degradation
• Economical deployment with very low maintenance cost
Scope
• Web service (port 80)
• Buffer overflow attacks
  • Actually, it is not a BOF detection algorithm; it is an executable code detection algorithm
• Application-level attacks such as data manipulation and SQL injection are out of scope
• IA-32 (Intel)
• Packet based (no reassembly)
Assumption: normal requests do not contain executable code
* Source [Wang06][6] SigFree
SigFree Overview
SigFree architecture
• Scheme 1: exploits the OS characteristics of a program (faster)
• Scheme 2: exploits the data flow characteristics of a program (more robust)
Extended instruction flow graph
All Possible instruction
* Source [Wang06]
[6] SigFree
SigFree - Limitation
Limitations
• SigFree can't fully handle branch-function based obfuscation
• SigFree can't detect shellcode that is written in alphanumeric form
• SigFree can't detect malicious code that consists of fewer useful instructions than the current threshold of 15
• SigFree can't detect encrypted executable code
* Source [Wang06]
[6] SigFree
Network-Level Execution – (1)
[Figure: emulation pipeline. The input stream is scanned from start to end with byte shifting; each candidate position is disassembled and executed; invalid memory accesses and invalid instructions end a chain; memory reads in the decryptor's read loop are counted. If the count is over the threshold, the stream is judged an attack.]
* Source [Michalls06]
• Executes every potential instruction sequence, aiming to identify the execution behavior of polymorphic shellcodes
• Compares their execution profile against the behavior observed to be inherent to polymorphic shellcodes
[7] Polymorphic Shellcode Detection
Pattern 1: During decryption, the decryptor must read the encrypted payload in order to decrypt it.
Criterion 1: The number of payload reads in an execution chain exceeds the Payload Read Threshold (PRT).
Pattern 2: A mandatory operation of every polymorphic shellcode is to find its location in memory using some form of "Get PC (%eip)".
Criterion 2: The chain executes some form of "Get PC (%eip)".
Execution chain for payload reads
“Get PC” code
An execution chain
* Source [Michalls06]
Network-Level Execution – (2)
[7] Polymorphic Shellcode Detection
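The two criteria can be combined into a decision over one execution chain, as in the sketch below. The event format (`("getpc", None)` and `("read", address)` tuples), the function name, and the choice to require both criteria together are assumptions for illustration; the emulator that would produce such events is not shown.

```python
def classify_chain(events, payload_start, payload_end, prt):
    """Decide whether one emulated execution chain looks like a polymorphic decryptor.

    events: sequence of (kind, address) tuples recorded during emulation,
            where kind is "getpc" or "read" (hypothetical trace format).
    prt:    Payload Read Threshold.
    """
    seen_getpc = False
    payload_reads = 0
    for kind, address in events:
        if kind == "getpc":
            seen_getpc = True  # Criterion 2: some form of "Get PC" executed
        elif kind == "read" and payload_start <= address < payload_end:
            payload_reads += 1  # Criterion 1: reads that hit the payload itself
    return seen_getpc and payload_reads > prt


decryptor_like = [("getpc", None)] + [("read", 0x1000 + i) for i in range(10)]
print(classify_chain(decryptor_like, 0x1000, 0x2000, prt=5))
```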
PW Detection : CFG – (1)
• Perform a linear disassembly from the first byte of a stream to extract the machine instructions
• Remove invalid basic blocks (resulting from the disassembly of non-code byte streams)
• A block is invalid
  • if it contains one or more invalid instructions,
  • if it is on a path to an invalid block, or
  • if it ends in a control transfer instruction that jumps into the middle of another instruction
* Source [Kruegel 05]
[8] Control Flow Graph Extraction
PW Detection : CFG – (2)
• Linear disassembly of the byte stream
• Nodes: sequences of instructions without any jumps
• Edges: jump instructions making the transition from one node to another
• CFG of binary code: a cluster of closely connected nodes
• CFG of a random sequence: isolated nodes
Example
    Block 1: MOV A 10; MOV B 10; ADD B; JMP BLOCK2
    Block 2: MOV A 15; MOV B 20; MUL B
CFG Construction * Source [Kruegel 05]
• Robustness to modification: junk insertion, register renaming, code transposition, instruction substitution
• Uniqueness: different executable regions should map to different fingerprints
[8] CFG Construction
PW Detection : CFG – (3)
Classify instructions into 14 sets
• A 14-bit colour value is associated with each node (1 bit corresponding to 1 class)
• When one or more instructions of a certain class appear in the basic block, the corresponding bit of the basic block's colour value is set to 1
• E.g.
    MOV A, B     00000000000010
    MUL A, 10    00000000000001
    PUSH A       00000000010000
    Node colour: 00000000010011
• Append the 14-bit colour value to each node in the adjacency matrix of the subgraph
• Concatenate the rows as before to get the new fingerprint
* Source [Kruegel 05][8] Graph Coloring
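The colouring step is a bitwise OR over the classes of the instructions in a basic block. In the sketch below, the mapping from mnemonic to class bit covers only the three instructions in the slide's example and is chosen to reproduce its bit patterns; the full 14-class assignment is not given in the slides, so this mapping is an assumption.

```python
# Hypothetical class bits, chosen to match the example bit patterns above.
INSTRUCTION_CLASS_BIT = {"MUL": 0, "MOV": 1, "PUSH": 4}


def node_colour(mnemonics, width=14):
    """OR together one bit per instruction class present in the basic block."""
    colour = 0
    for mnemonic in mnemonics:
        colour |= 1 << INSTRUCTION_CLASS_BIT[mnemonic]
    return format(colour, f"0{width}b")


print(node_colour(["MOV", "MUL", "PUSH"]))
```

Repeating an instruction of the same class does not change the colour, which is why junk insertion within an already-present class leaves the fingerprint unchanged.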
Classification of Malicious Code Detection
Static Analysis: structure of executables
• Analysis without "execution"
• Disassembling -> string analysis, frequency analysis
• Structure of a program is described by its control flow graph (CFG)
• Cannot be used to detect novel malware instances
• Used to recognize obfuscated variants of the same code instance
Dynamic Analysis: behavior of executables
• File is "executed" in a safe environment
  • VMware, sandbox
• Behavior of a program is the observable effect that it has on its environment
  • RegMon, FileMon, syscall monitoring
• Considers the behavior of a whole class of malware
Static vs. Dynamic Analysis
Payload Analysis
[Figure: payload analysis decision tree. Branches: Known Filetype (executables, exploits, known files) and Unknown Filetype (data with unknown file-type). Analysis steps shown: Static Analysis, Dynamic Analysis, Crypto Analysis, and finding out the "responsible" program (loader).]
Host-based Detection – (1)
• Tagged architecture that tracks the integrity of every memory word
  • Network data is tainted
  • Control data (return pointers, function pointers, jump targets, etc.) should not be
  • Taint tracking with every instruction
• Great for catching worms
  • Uses the γ mapping
• Implemented a full-system tagging scheme in a virtual machine
  • Linux (modified kernel)
    • Tracks integrity in the file system
    • Virtual memory swapping
  • Windows (unmodified)
    • Works great as a honeypot for catching worms
* Source [Crandall04][9] MINOS
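The integrity-tagging rule can be sketched with a toy memory model: every word carries a tag, network data is marked low integrity, tags follow data when it is copied, and using a low-integrity word as control data raises an error. The class and method names are illustrative; Minos itself tracks tags in hardware or a full-system emulator on every instruction, which this sketch does not attempt.

```python
class ControlDataIntegrityError(Exception):
    """Raised when tainted (low-integrity) data is used as a jump target."""


class TaintTrackingMemory:
    def __init__(self):
        self.words = {}             # address -> value
        self.low_integrity = set()  # addresses whose contents are tainted

    def store_network(self, address, value):
        """Data arriving from the network is always tainted."""
        self.words[address] = value
        self.low_integrity.add(address)

    def store_trusted(self, address, value):
        self.words[address] = value
        self.low_integrity.discard(address)

    def copy(self, src, dst):
        """The tag travels with the data."""
        self.words[dst] = self.words[src]
        if src in self.low_integrity:
            self.low_integrity.add(dst)
        else:
            self.low_integrity.discard(dst)

    def jump_target(self, address):
        """Control data (e.g. a return pointer) must not be tainted."""
        if address in self.low_integrity:
            raise ControlDataIntegrityError(hex(address))
        return self.words[address]
```

An overflow that copies network bytes over a saved return pointer is caught at the moment the pointer is used, regardless of what the payload looks like.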
Host-based Detection – (2)
• DAvis malCODe Analyzer
  • Discovers invariants in the exploit vector (ε)
• Symbolic execution on the system trace during attacks that Minos catches
• Used for an empirical analysis of polymorphism and metamorphism
  • Quantify and understand the limits
* Source [Crandall05][10] DACODA
Bibliography
1) [Singh04] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In OSDI, 2004.
2) [Kim 04] H.-A. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In USENIX Security Symposium, pages 271-286, 2004.
3) [Newsome05] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and Privacy, May 2005.
4) [Wang05] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous payload-based worm detection and signature generation. In Proceedings of the 14th USENIX Security Symposium, Baltimore, MD, USA, July 31 – August 5, 2005.
5) [X. Wang 06] XiaoFeng Wang, Zhuowei Li, Jun Xu, Michael K. Reiter, Chongkyung Kil, and Jong Youl Choi. Packet Vaccine: Black-box Exploit Detection and Signature Generation. CCS, 2006.
6) [Wang06] SigFree: A Signature-free Buffer Overflow Attack Blocker. USENIX Security, 2006.
7) [Michalls06] Michalis Polychronakis, Kostas G. Anagnostakis, and Evangelos P. Markatos. Network-Level Polymorphic Shellcode Detection Using Emulation. DIMVA, 2006.
8) [Kruegel05] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), September 2005.
9) [Crandall 04] Jedidiah R. Crandall and Frederic T. Chong. Minos: Control Data Attack Prevention Orthogonal to Memory Model. IEEE/ACM International Symposium on Microarchitecture, pages 221-232, IEEE Computer Society, 2004.
10) [Crandall 05] J. R. Crandall, Z. Su, S. F. Wu, and F. T. Chong. On Deriving Unknown Vulnerabilities from Zero-Day Polymorphic and Metamorphic Worm Exploits. ACM CCS, pages 235–248, November 2005.