Dytan: A Generic Dynamic Taint Analysis Framework (ISSTA 2007)
James Clause, Wanchun (Paul) Li, and Alessandro Orso
College of Computing, Georgia Institute of Technology
Partially supported by: NSF awards CCF-0541080 and CCR-0205422 to Georgia Tech, DHS and US Air Force Contract No. FA8750-05-2-0214
Dynamic tainting applications
• Information policy enforcement: ensure classified information does not leak outside the system
  e.g., Vachharajani et al. 04, McCamant and Ernst 06
• Attack detection / prevention: detect / prevent attacks such as SQL injection, buffer overruns, stack smashing, cross-site scripting
  e.g., Suh et al. 04, Newsome and Song 05, Halfond et al. 06, Kong et al. 06, Qin et al. 06
• Testing: coverage metrics, test data generation heuristics, ...
  e.g., Masri et al. 05, Leek et al. 07
• Data lifetime / scope: track how long sensitive data, such as passwords or account numbers, remain in the application
  e.g., Chow et al. 04
Motivation
[Diagram: several separate ad-hoc taint analysis implementations, each producing its own results]
Motivation
[Diagram: Configuration, Dytan Generic Framework, Custom Dynamic Taint Analysis, Results]
• Flexible
• Easy to use
• Accurate
Outline
✓Motivation & overview
• Framework (Dytan)
  • flexibility
  • ease of use
  • accuracy
• Empirical evaluation
• Conclusions
Framework: flexibility
Configuration
• Taint sources: which data to tag, and how to tag it
• Propagation policy: how tags should be propagated at runtime
• Taint sinks: where and how tags should be checked
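As a rough illustration of the three configurable parts listed on this slide, the sketch below groups them into one object. It is not Dytan's actual configuration format: the class name TaintConfig, every key, and the function name "send" are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TaintConfig:
    """Hypothetical container for the three parts of a taint analysis configuration."""
    sources: list = field(default_factory=list)       # which data to tag, and how to tag it
    propagation: dict = field(default_factory=dict)   # how tags are propagated at runtime
    sinks: list = field(default_factory=list)         # where and how tags are checked


# Example instance (all values are illustrative assumptions).
config = TaintConfig(
    sources=[{"what": "file:a.txt", "how": "single_tag"}],
    propagation={"affecting_data": ["data"], "mapping": "union"},
    sinks=[{"where": "function_entry:send", "what": "parameters", "how": "ensure_absent"}],
)
```

Keeping the three parts separate mirrors the slide: each part can be changed without touching the other two.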
Taint sources
What to tag: identify what program data should be assigned tags
• Variables (local or global)
• Function parameters
• Function return values
• Data from an input stream (network, filesystem, keyboard, ...)
• Specific input stream (141.195.121.134:80, a.txt, ...)
How to tag: describe how tags should be assigned for identified data
• Single tag
• One tag per source
• Multiple tags per source
• ...
Taint sources (example)
What to tag: a.txt
How to tag: single tag
  a.txt data: 1 1 1 1 1 1
What to tag: a.txt
How to tag: multiple tags
  a.txt data: 1 2 3 4 5 ... n
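The two tagging modes shown for a.txt can be sketched as follows. This is a minimal illustration that assumes tags are small integers attached to each byte read from the source; the function names tag_single and tag_per_byte are hypothetical and not part of Dytan.

```python
def tag_single(data: bytes, tag: int = 1) -> list:
    """Single tag: every byte from the source carries the same tag."""
    return [tag] * len(data)


def tag_per_byte(data: bytes) -> list:
    """Multiple tags: each byte from the source gets its own tag, numbered from 1."""
    return list(range(1, len(data) + 1))


contents = b"secret"                 # stands in for the bytes read from a.txt
print(tag_single(contents))          # [1, 1, 1, 1, 1, 1]
print(tag_per_byte(contents))        # [1, 2, 3, 4, 5, 6]
```

A single tag only answers whether data came from a.txt; distinct tags also allow tracing a tainted value back to a specific part of the file.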
Propagation policy
Affecting data: data that affects the outcome of a statement through
• Data dependencies
• Control dependencies
A policy can consider both or only data dependencies
Mapping function: define how tags associated with affecting data should be combined
• Union
• Max
• ...
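The two mapping functions named on this slide can be sketched as plain functions over tag sets. The sketch assumes tags are integers and that each piece of affecting data carries a set of tags; the names union_mapping and max_mapping are illustrative, not Dytan's API.

```python
def union_mapping(tag_sets):
    """Union: the result carries every tag found on any affecting data."""
    result = set()
    for tags in tag_sets:
        result |= set(tags)
    return frozenset(result)


def max_mapping(tag_sets):
    """Max: the result carries only the largest tag found on the affecting data."""
    all_tags = union_mapping(tag_sets)
    return frozenset({max(all_tags)}) if all_tags else frozenset()
```

Max suits cases where tags are ordered levels (for example, sensitivity levels), whereas union preserves the full provenance of a value.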
Propagation policy (example)
if(X) {
  C = A + B;
}
Tags: A has tag 1, B has tag 2, X has tag 3
• Affecting data: data dependence. Mapping function: union. Result: C gets tags 1 2
• Affecting data: data dependence and control dependence. Mapping function: max. Result: C gets tag 3
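The example statement can be worked through in a few lines. This is a sketch under stated assumptions: tags are integers, A, B, and X carry tags 1, 2, and 3 as on the slide, and the helper names (propagate, union, maximum) are hypothetical.

```python
def union(tag_sets):
    """Combine tag sets by keeping every tag."""
    return set().union(*tag_sets)


def maximum(tag_sets):
    """Combine tag sets by keeping only the largest tag."""
    merged = union(tag_sets)
    return {max(merged)} if merged else set()


def propagate(data_tags, control_tags, use_control, mapping):
    """Compute the tags assigned to the result of one statement.

    data_tags:    tag sets of operands the result is data dependent on
    control_tags: tag sets of predicates the statement is control dependent on
    use_control:  whether the policy considers control dependencies
    mapping:      function that combines the affecting tag sets
    """
    affecting = list(data_tags)
    if use_control:
        affecting += list(control_tags)
    return mapping(affecting)


# if (X) { C = A + B; }
tags = {"A": {1}, "B": {2}, "X": {3}}
operands = [tags["A"], tags["B"]]
predicates = [tags["X"]]

print(propagate(operands, predicates, False, union))    # {1, 2}
print(propagate(operands, predicates, True, maximum))   # {3}
```

The tag on X reaches C only when the policy includes control dependencies, which is why the second configuration yields tag 3.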
Taint sinks
Where to check: location in the program to perform a check
• Function entry / exit
• Statement type
• Specific program point
What to check: the data whose tags should be checked
• Variables
• Function parameters
• Function return value
How to check: set of conditions to check and a set of actions to perform if the conditions are not met
• validate presence of tags (exit or log)
• ensure absence of tags (exit or log)
• ...
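A sink check of the kind described here might look like the sketch below. It assumes two conditions ("present", "absent") and two actions ("exit", "log"), matching the slide's bullets; the function check_sink, the exception TaintViolation, and the location string "entry:send" are hypothetical names introduced for illustration.

```python
class TaintViolation(Exception):
    """Raised when a sink check fails and the configured action is 'exit'."""


def check_sink(location, tags, condition, action="log", log=None):
    """Check the tags on data reaching a sink.

    condition: 'present' (tags must exist) or 'absent' (tags must not exist)
    action:    'exit' raises TaintViolation, 'log' appends a message to `log`
    Returns True if the condition holds, False if it failed and was logged.
    """
    if action not in ("exit", "log"):
        raise ValueError(f"unknown action: {action!r}")
    if condition == "present":
        satisfied = bool(tags)
    elif condition == "absent":
        satisfied = not tags
    else:
        raise ValueError(f"unknown condition: {condition!r}")
    if satisfied:
        return True
    message = f"taint check failed at {location}: expected tags {condition}, found {sorted(tags)}"
    if action == "exit":
        raise TaintViolation(message)
    if log is not None:
        log.append(message)
    return False
```

For instance, an attack-detection analysis would typically use the "absent" condition at a sensitive call, so that tainted input reaching it is logged or stops the program.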
Goal: measure the effect of inaccurate propagation policies on analysis results
• Use Dytan to taint program inputs and measure the amount of heap data tainted at program exit
• Compare Dytan against inaccurate policies
  • no implicit operands (no IM)
  • no address generators (no AG)
  • no implicit operands, no address generators (no IM, no AG)
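The measurement described above (the amount of heap data tainted at program exit) can be sketched as a percentage over heap addresses. This is only an illustration: it assumes the analysis keeps a map from address to tag set (called shadow here) and knows which addresses belong to the heap; both names are hypothetical and say nothing about how Dytan stores tags.

```python
def tainted_percentage(shadow, heap_addresses):
    """Return the percentage of heap addresses that carry at least one tag.

    shadow:         dict mapping an address to its set of tags (assumed layout)
    heap_addresses: iterable of addresses that belong to the heap
    """
    heap = list(heap_addresses)
    if not heap:
        return 0.0
    tainted = sum(1 for address in heap if shadow.get(address))
    return 100.0 * tainted / len(heap)
```

Running the same program under two policies and comparing the two percentages shows how much taint a less accurate policy loses.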
RQ2: results
[Bar chart: y-axis from 0% to 100%; subjects: Firefox (1 page), Firefox (3 pages), Gzip; series: Dytan, No IM, No AG, No IM and no AG]
Performance
• Measured for gzip:
  ≈30x for data flow
  ≈50x for data and control flow
• High overhead, but...
  • In line with existing implementations
  • Designed for experimentation
  • Favors flexibility over performance
  • Implementation can be further optimized
Related work
• Existing dynamic tainting approaches [Suh et al. 04, Newsome and Song 05, Halfond et al. 06, Kong et al. 06, ...]
  • Ad-hoc
• Other dynamic taint analysis frameworks [Xu et al. 06 and Lam and Chiueh 06]
  • Focused on security applications
  • Single taint mark
  • No control-flow propagation
  • Operate at the source code level
Conclusions
• Dytan
  • a general framework for dynamic tainting
  • allows for instantiating and experimenting with different dynamic taint analysis approaches