Malware Detection Slides courtesy of Mihai Christodorescu
Mar 19, 2016
University of Wisconsin, Madison Mihai Christodorescu – “Behavior-based Malware Detection” 2
The Rising Malware Tide
• Malware is software with unwanted functionality.
  Viruses, trojans, backdoors, bots, adware, spyware, browser hijackers, downloaders, droppers, keyloggers, password stealers, ...
• “Blended” threats
100,000,000 machines are infected. [Vint Cerf, World Economic Forum 2007]
Organized Cyber-Crime
• Boom in online fraud:
  – Spamming
  – Trade in stolen data
  – Financial fraud
  – ID theft
Malware is the tool of the trade.
The Changing Threat Landscape
1995: Hobby malware, for fun
• Show programming prowess
• Single author
2007: Professional malware, for profit
• Collaborative development
• Bug-fix releases, code reuse
Botnets: distributed computing has finally arrived.
[Photos: creator of the Melissa worm; “?”]
Failure of Signature Detectors
Malware detectors still use signatures.
Malware is obfuscated/transformed easily.
Software diversity used successfully by malware.
[Diagram: known malware, new malware 1, and new malware 2 arrive from the Internet at a virus scanner that holds signatures such as:
ac028c0e86009d8edfac0ac075fbe81cfd72ef50b91000f7f15052b90:*:504b03040a0001000800*...*:188420:181779:*:8ad6900f5088cab9356678e43c...3:*:3e3c623e6c696e6b3c2f6...]
Paradigm shift in malware creation, yet no change in malware detection!
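A minimal Python sketch of why byte signatures are brittle. The byte strings and the `scan` helper are made up for illustration; they are not real antivirus signatures or any scanner's API. The "variant" differs from the "known" sample only by one inserted `nop` byte (0x90).

```python
from typing import Iterable

# Hypothetical signature: push 10h; push eax; push edi; call <rel32 opcode>
SIGNATURE = bytes.fromhex("6a105057e8")


def scan(data: bytes, signatures: Iterable[bytes]) -> bool:
    """Report a match if any signature occurs as a contiguous byte substring."""
    return any(sig in data for sig in signatures)


# Illustrative "known" sample containing the signature bytes.
known = bytes.fromhex("6a105057e800000000")
# Same instructions with a single nop (0x90) inserted after push eax.
variant = bytes.fromhex("6a10509057e800000000")

if __name__ == "__main__":
    print("known detected:  ", scan(known, [SIGNATURE]))
    print("variant detected:", scan(variant, [SIGNATURE]))
```

The inserted byte does not change what the code does, but it breaks the contiguous match, so the variant goes undetected until a new signature is added.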
Focus On Behavior
[Chart: new malware and malware families over time, 2001–2006, logarithmic axis from 1 to 100,000. Source: Kaspersky Labs, Symantec.
New malware, in chart order: 8,821; 11,136; 20,731; 31,726; 53,950; 86,876.
Malware families, in chart order: 325; 335; 274; 202 (est.).]
Family = malware with a common code base.
A family is a collection of behaviors.
A behavior can be shared by many families.
Number of families stays constant.
Number of variants grows exponentially.
Main thesis
Detection of obfuscated malware requires a semantic analysis of program behavior.
Program verification provides the techniques necessary to perform malware detection effectively and efficiently.
Specifying Behavior
Byte signatures allow for fast detection.
– But not resilient to obfuscation.
High-level descriptions require expensive detection.
– Resilient to obfuscation.
[Spectrum: Syntactic to Semantic]
“Execution of program M causes the system to reach a state where a copy of M has been sent by email.”
Malspec: Self-Propagation by Email
[Diagram: Connect, then Send]
Netsky.B
    push 10h
    push eax
    push edi
    call connect
    push esi
    push eax
    push [ebp+hMem]
    call wsprintfA
    add esp, 0Ch
    push [ebp+hMem]
    call lstrlenA
    push 0
    push eax
    push [ebp+hMem]
    push ebx
    push eax
    push ecx
    push edi
    call send
Malspec: Self-Propagation by Email
Netsky.B: same code fragment as on the previous slide.
Connect: X := Arg1
Send: Arg1 = X & Arg2 = “EHLO.*”
Malspec = syntactic component + semantic component
Syntactic component describes temporal constraints.
Semantic component describes dependency constraints.
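The two components can be sketched in Python over a recorded trace of calls. This is an illustrative simplification, not the detector described in these slides: the trace format and the function name `matches_email_malspec` are assumptions. The temporal constraint is "a connect happens before the send"; the dependency constraint is "the send uses the value bound to X at the connect, and its second argument matches EHLO.*".

```python
import re
from typing import Any, List, Tuple

# A trace event is (operation name, argument tuple). Hypothetical format.
Event = Tuple[str, Tuple[Any, ...]]


def matches_email_malspec(trace: List[Event]) -> bool:
    """True if some connect(X, ...) is later followed by send(X, "EHLO...")."""
    bound = set()  # values bound to X at a connect seen so far
    for op, args in trace:
        if op == "connect" and args:
            bound.add(args[0])                      # X := Arg1
        elif op == "send" and len(args) >= 2:
            same_value = args[0] in bound           # Arg1 = X
            is_ehlo = isinstance(args[1], str) and re.match(r"EHLO.*", args[1])
            if same_value and is_ehlo:
                return True
    return False


if __name__ == "__main__":
    trace = [
        ("connect", (5, "mail.example")),
        ("wsprintfA", ("buf", "EHLO %s")),   # unrelated calls in between are ignored
        ("lstrlenA", ("buf",)),
        ("send", (5, "EHLO host")),
    ]
    print(matches_email_malspec(trace))
```

Calls that are not part of the malspec are skipped, which is what lets the same specification cover variants that interleave extra code between the connect and the send.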
Building a Real Malspec
“Send Email”
    X:=socket()
    connect(X)
    send(X,“EHLO”)
    send(X,“DATA”)
    send(X,T)
“Read Own Exe. Image”
    S:=process_name()
    Z:=open(S)
    Y:=read(Z)
Building a Real Malspec (continued)
“Send Email” and “Read Own Exe. Image” are joined by a constraint on send(X,T):
    StringEqual(T, Base64(Y))
Malspec Constraints
    X:=socket()
    connect(X)
    send(X,“EHLO”)
    send(X,“DATA”)
    S:=process_name()
    Z:=open(S)
    Y:=read(Z)
    send(X,T), with StringEqual(T, Base64(Y))
Constraint kinds labeled on the diagram:
• Local constraint
• Dependence constraint, e.g., X after socket = X before connect
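A small Python sketch of checking these constraints on concrete observed values. The `Observation` record and its field names are hypothetical, chosen only to mirror the variables X, S, Z, Y, and T in the malspec; a real detector would reason about program expressions rather than single recorded values.

```python
import base64
from dataclasses import dataclass


@dataclass
class Observation:
    """Values seen at each malspec operation in one execution (hypothetical)."""
    socket_ret: int      # X after socket
    connect_arg: int     # X before connect
    send_sock: int       # X at send(X, T)
    send_payload: bytes  # T
    name_ret: str        # S after process_name
    open_arg: str        # S before open
    open_ret: int        # Z after open
    read_arg: int        # Z before read
    read_ret: bytes      # Y


def dependence_ok(o: Observation) -> bool:
    """Each value produced by one operation reaches the next unchanged."""
    return (
        o.socket_ret == o.connect_arg == o.send_sock
        and o.name_ret == o.open_arg
        and o.open_ret == o.read_arg
    )


def payload_ok(o: Observation) -> bool:
    """StringEqual(T, Base64(Y)): the sent data is the Base64 of what was read."""
    return o.send_payload == base64.b64encode(o.read_ret)


def satisfies_malspec(o: Observation) -> bool:
    return dependence_ok(o) and payload_ok(o)
```

For example, an observation whose `send_payload` equals `base64.b64encode(read_ret)` and whose handles line up satisfies both checks; changing the socket value between `socket` and `connect` violates the dependence constraint.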
Automating Malspec Creation: Malspec Mining
[Diagram: malware sample minus benign programs]
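The slide shows only the idea of subtracting benign behavior from a malware sample. One very simplified reading, offered here as an assumption rather than as the mining algorithm itself, is a set difference over dependence edges between operations. The edge representation and the name `mine_malspec` are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (producer operation, consumer operation): hypothetical edge encoding
Edge = Tuple[str, str]


def mine_malspec(malware_edges: Set[Edge], benign_runs: Iterable[Set[Edge]]) -> Set[Edge]:
    """Keep only the dependence edges that no benign program exhibits."""
    seen_in_benign: Set[Edge] = set()
    for edges in benign_runs:
        seen_in_benign |= edges
    return malware_edges - seen_in_benign


if __name__ == "__main__":
    malware = {
        ("socket", "connect"),
        ("connect", "send"),
        ("process_name", "open"),
        ("open", "read"),
        ("read", "send"),
    }
    benign = [
        {("socket", "connect"), ("connect", "send")},  # e.g., a mail client
        {("open", "read")},                            # e.g., a text editor
    ]
    print(sorted(mine_malspec(malware, benign)))
```

In this toy data the surviving edges are the ones that tie reading the program's own image to sending it, which is the part that distinguishes the sample from the benign programs.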
Malspecs Benefits
[Diagram: the self-propagation-by-email malspec]
• Choice of security-sensitive operations
• Constraint-based execution order
• Dependences free of obfuscation artifacts
Expressive enough to describe even obfuscated behavior.
Malspec Detection Strategies
• Static analysis
• Dynamic analysis
• Host-based IDS
• Inline Reference Monitors
[Diagram: the self-propagation-by-email malspec]
Malspecs are independent of detection method.
Detection of Malicious Behavior
[Diagram: a binary file and the self-propagation-by-email malspec are inputs to the malware detector]
Goal: Find a program path that matches the malspec.
Find A Malicious Program Path
[Diagram: the self-propagation-by-email malspec next to the program's interprocedural control-flow graph]
1) Match Malspec Operations
[Diagram: the self-propagation-by-email malspec]
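A sketch of this first step in Python: search a control-flow graph for a path whose nodes contain the malspec operations in order. It assumes a simplified setting that the slides do not specify: the CFG is a plain adjacency map, each node carries at most one operation name, and the malspec is flattened to a linear sequence (the real malspec is a graph with constraints, handled in step 2).

```python
from collections import deque
from typing import Dict, List, Optional


def find_matching_path(
    cfg: Dict[str, List[str]],
    ops: Dict[str, Optional[str]],
    entry: str,
    wanted: List[str],
) -> bool:
    """True if some path from `entry` visits `wanted` as a subsequence of operations."""

    def advance(node: str, i: int) -> int:
        # Consume the next wanted operation if this node performs it.
        if i < len(wanted) and ops.get(node) == wanted[i]:
            return i + 1
        return i

    start = (entry, advance(entry, 0))
    seen = {start}
    queue = deque([start])
    while queue:
        node, i = queue.popleft()
        if i == len(wanted):
            return True
        for succ in cfg.get(node, []):
            state = (succ, advance(succ, i))
            if state not in seen:   # (node, progress) pairs keep loops finite
                seen.add(state)
                queue.append(state)
    return False


if __name__ == "__main__":
    cfg = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    ops = {"a": "socket", "b": "connect", "c": None, "d": "send"}
    print(find_matching_path(cfg, ops, "a", ["socket", "connect", "send"]))
```

Tracking the pair (node, number of operations matched so far) lets the search terminate on graphs with loops while still finding paths that go around a loop before completing the match.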
2) Match Malspec Constraints
[Diagram: the self-propagation-by-email malspec]
Malspec Constraint: Z after open = Z before read
Program Constraint: The program fragment preserves the program expression bound to Z.
Like a semantic def-use constraint.
2) Match Malspec Constraints
Program Constraint: The program fragment preserves the program expression bound to Z.
Semantic nop with respect to E = program fragment preserving an expression E.
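The definition can be made concrete with a toy Python check that runs a fragment on random initial states and compares E before and after. The three-instruction machine (`mov`, `add`, `sub` on 32-bit registers) and the function names are assumptions for illustration. Random execution can only refute: a counterexample is a definite "no", while the absence of one is not a proof.

```python
import random
from typing import Callable, Dict, List, Tuple, Union

State = Dict[str, int]
Instr = Tuple[str, str, Union[str, int]]  # (opcode, destination, source reg or immediate)
MASK = 0xFFFFFFFF  # 32-bit wraparound


def run(fragment: List[Instr], state: State) -> State:
    """Execute the toy fragment on a copy of `state`."""
    s = dict(state)
    for op, dst, src in fragment:
        val = s[src] if isinstance(src, str) else src
        if op == "mov":
            s[dst] = val & MASK
        elif op == "add":
            s[dst] = (s[dst] + val) & MASK
        elif op == "sub":
            s[dst] = (s[dst] - val) & MASK
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return s


def may_be_semantic_nop(
    fragment: List[Instr],
    expr: Callable[[State], int],
    regs: Tuple[str, ...] = ("eax", "ebx", "ecx"),
    trials: int = 200,
    seed: int = 0,
) -> bool:
    """False if a counterexample is found; True only means none was found."""
    rng = random.Random(seed)
    for _ in range(trials):
        s0 = {r: rng.getrandbits(32) for r in regs}
        if expr(run(fragment, s0)) != expr(s0):
            return False
    return True


if __name__ == "__main__":
    junk: List[Instr] = [("add", "ebx", 5), ("sub", "ebx", 5), ("mov", "ecx", "ebx")]
    print(may_be_semantic_nop(junk, lambda s: s["eax"]))
    print(may_be_semantic_nop([("add", "eax", 1)], lambda s: s["eax"]))
```

The `junk` fragment changes `ecx` but leaves `eax` and `ebx` as they were, so it is a candidate semantic nop with respect to either of those registers and not with respect to `ecx`.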
2) Match Malspec Constraints
Program Constraint: The program fragment preserves the program expression bound to Z.
Need an Oracle...
Advances in Decision Procedures
Dramatic improvements in SAT solvers:
– SATO [Zhang, CADE 1997]
– GRASP [Marques-Silva & Sakallah, 1999]
– zChaff [Moskewicz et al., DAC 2001]
– BerkMin [Goldberg & Novikov, DATE 2002]
SAT-based Bounded Model Checking: [Clarke et al., FMSD 2001]
– SAT-specific speedups [Strichman, CHARME 2001]
– Richer logics [Seshia et al., DAC 2003]
A decision procedure can approximate an Oracle.
Using Decision Procedures
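Program Constraint: The program fragment preserves the program expression bound to Z.
[Diagram: the program fragment
    add esp, 0Ch
    push [ebp+hMem]
is encoded as formulas over esp, ebp, hMem, and memory; together with the property P, these are given to a decision procedure, which answers True/False.]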
Semantics-Aware Malware Detector
[Diagram: Binary file → Disassembler (IDA Pro) → CFG constructor → CFG → Graph matching (malspec operations) → Constraint satisfaction (malspec constraints; Simplify, UCLID) → Yes / No]
[Detlefs et al., “Simplify,” 2004] [Lahiri & Seshia, CAV 2004]
Effective Detection
With hard-coded semantic-nop patterns:

                       Commercial AV    SAFE
Known malware          100%             100%
Obfuscated variants    0%               100%

With decision procedures:

Malspec source    Variants detected    # of AV signatures    # of SAMD malspecs
Netsky.B          C,D,O,P,T,W          7                     1
Bagle.I           J,N,O,P,R,Y          7                     1
Semantic-Nop Detection Benefits
Semantic-Nop features:
• Flow sensitivity
• Binding procedure
• Decision procedures
• Rich constraints
Obfuscation resilience:
• Code reordering
• Register renaming
• Junk code
• Code substitution
Detection Performance
Powerful decision procedures are expensive.
• SAFE pattern matching: 1–9 s
• Simplify theorem prover, UCLID bounded model checker: 300–800 s
Idea: Use expensive decision procedures only if cheap decision procedures do not provide a definitive answer.
Stack of Decision Procedures
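[Diagram: a constraint and a program fragment are passed through a stack of decision procedures: random execution, SAFE pattern matching, Simplify theorem prover, UCLID bounded model checker. A level either returns a definitive answer (Yes, No, or Yes/No; e.g., “No, code does not satisfy constraint!”) or passes the query (?) on to the next level.]
Average cost, same decision power.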
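The stacking idea can be sketched in Python as a list of procedures tried from cheapest to most expensive, each returning True, False, or None for "no definitive answer". The three procedures below are toy stand-ins operating on a one-register add/sub fragment; they are assumptions for illustration and do not model SAFE, Simplify, or UCLID.

```python
import random
from typing import Callable, List, Optional, Tuple

Fragment = List[Tuple]  # ("nop",) | ("add", imm) | ("sub", imm): toy encoding
Decider = Callable[[Fragment], Optional[bool]]
MASK = 0xFFFFFFFF


def apply(fragment: Fragment, value: int) -> int:
    """Run the toy fragment on a single 32-bit register value."""
    for ins in fragment:
        if ins[0] == "add":
            value = (value + ins[1]) & MASK
        elif ins[0] == "sub":
            value = (value - ins[1]) & MASK
    return value


def random_execution(fragment: Fragment) -> Optional[bool]:
    """Cheap: can only answer "No" by finding a counterexample."""
    rng = random.Random(0)
    for _ in range(16):
        v = rng.getrandbits(32)
        if apply(fragment, v) != v:
            return False
    return None


def pattern_match(fragment: Fragment) -> Optional[bool]:
    """Cheap: can only answer "Yes" for a hard-coded pattern (all nops)."""
    return True if all(ins == ("nop",) for ins in fragment) else None


def exact_check(fragment: Fragment) -> Optional[bool]:
    """Stand-in for an expensive procedure: decides every query."""
    delta = sum(i[1] if i[0] == "add" else -i[1] for i in fragment if i[0] != "nop")
    return delta & MASK == 0


STACK: List[Tuple[str, Decider]] = [
    ("random", random_execution),
    ("pattern", pattern_match),
    ("exact", exact_check),
]


def decide(fragment: Fragment, stack: List[Tuple[str, Decider]] = STACK):
    """Return (procedure name, answer) from the first procedure that decides."""
    for name, procedure in stack:
        answer = procedure(fragment)
        if answer is not None:
            return name, answer
    return None, None
```

Because the last level decides every query, the stack answers exactly what that level alone would answer; the cheaper levels only reduce how often it has to run.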
Performance Results
Detection times in seconds:

Malware                        Minimum    Average    Maximum
Netsky (B,C,D,O,P,T,W)         60.56      99.57      140.08
Bagle (I,J,N,O,P,R,Y)          36.00      56.41      97.13
Bagle (obfuscated variants)    74.81      140.14     186.50

Test setup: 1 GHz CPU, 1 GB RAM
Comparison:
• Commercial signature-based detector: <1 s
• Decision procedure-based detector: >300 s