Type-based Taint Analysis for Java Web Applications
Wei Huang, Yao Dong and Ana Milanova
Rensselaer Polytechnic Institute
Feb 24, 2016

Taint Analysis for Java Web Applications
Tracks flows from untrusted sources to sensitive sinks
◦ Such flows can cause SQL injection, cross-site scripting (XSS), and other attacks
Untrusted input → (unsanitized) → Sensitive sinks
SOURCES: ServletRequest.getParameter(), etc.
SINKS: Statement.execute(), etc.

SQL Injection
    HttpServletRequest req = ...;
    Statement stat = ...;
    String user = req.getParameter("user");
    String query = "SELECT * FROM Users WHERE name = " + user;
    stat.execute(query);
Tainted input: "John OR 1=1"
Resulting query: SELECT * FROM Users WHERE name = John OR 1=1

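The effect of the tainted input can be reproduced without a database. The sketch below only mirrors the string concatenation from the slide; the class and method names (`SqlInjectionDemo`, `buildQuery`) are hypothetical and not part of the original talk.

```java
// Illustrative only: shows how concatenating raw input changes the query text.
class SqlInjectionDemo {
    // Builds the query the same way the slide does, by appending unsanitized input.
    static String buildQuery(String user) {
        return "SELECT * FROM Users WHERE name = " + user;
    }

    public static void main(String[] args) {
        // Benign input yields the intended query.
        System.out.println(buildQuery("John"));
        // Malicious input appends a condition that is always true.
        System.out.println(buildQuery("John OR 1=1"));
    }
}
```

The second call produces a WHERE clause that matches every row, which is the flow from source to sink that a taint analysis is meant to report.
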
Work on Taint Analysis
Finding Security Vulnerabilities with Static Analysis [Livshits and Lam, USENIX Security’05]
TAJ [Tripp et al. PLDI’09]
F4F [Sridharan et al. OOPSLA’11]
Andromeda [Tripp et al. FASE’13]
TAJ, F4F and Andromeda are included in AppScan, a commercial tool from IBM

Issues with Existing Work
Dataflow and points-to based approaches have difficulty handling
◦ Reflection
◦ Libraries
◦ Frameworks

Our Type-based Taint Analysis
SFlow: a type system
SFlowInfer: inference tool for SFlow
◦ Takes a Java program where sources are typed tainted and sinks are typed safe
◦ Infers SFlow types for the rest of the variables
◦ If inference succeeds, there are no flows from sources to sinks
◦ If it fails with type errors, there are potential flows
Easily and effectively handles reflection, libraries and frameworks

Inference and Checking Framework
[Figure: framework pipeline. Stages: Unified Typing Rules, Set-Based Solver, Extract Typing, Type Checking. Inputs and intermediate results: Parameters, Instantiated Rules, Program Source, Annotated Libraries, Set-based Solution, Concrete Typing. Instantiations: Immutability (ReIm), Universe Types (UT), Ownership Types (OT), SFlow, AJ, EnerJ, More?]

SFlowInfer
The instantiated inference tool
Detects (or verifies the absence of) information flow violations
[Figure: Java source, annotated libraries, and sources and sinks are the inputs to SFlowInfer, which produces a result]

SQL Injection
    HttpServletRequest req = ...;
    Statement stat = ...;
    tainted String user = req.getParameter("user"); // Source: the return value is tainted
    tainted String query = "SELECT * FROM Users WHERE name = " + user;
    stat.execute(query); // Sink: the parameter is safe. Type error!
Subtyping: safe <: tainted

Contributions
SFlow: a context-sensitive type system for secure information flow
SFlowInfer: an inference algorithm for SFlow
◦ SFlowInfer is an effective taint analysis tool
Implementation and evaluation

Outline
SFlow type system
Inference algorithm for SFlow
Handling of reflection, libraries and frameworks
Implementation and evaluation

SFlow Type Qualifiers
tainted: a variable x is tainted if there is flow from an untrusted source to x
safe: a variable x is safe if there is flow from x to a safe sink
poly: the polymorphic qualifier, which can be instantiated to tainted or safe
Subtyping: safe <: poly <: tainted

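The subtyping chain safe <: poly <: tainted can be modelled as a three-element total order. This is a minimal sketch; the `QualifierLattice` class and `isSubtype` helper are hypothetical names chosen for illustration, not SFlowInfer's API.

```java
class QualifierLattice {
    // Declared in subtype order so that ordinal comparison mirrors safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    // Returns true when sub <: sup in the chain above.
    static boolean isSubtype(Qualifier sub, Qualifier sup) {
        return sub.ordinal() <= sup.ordinal();
    }

    public static void main(String[] args) {
        // A safe value may flow into a tainted variable...
        System.out.println(isSubtype(Qualifier.SAFE, Qualifier.TAINTED));
        // ...but a tainted value may not flow into a safe one.
        System.out.println(isSubtype(Qualifier.TAINTED, Qualifier.SAFE));
    }
}
```

The direction matters: assigning a tainted query to a safe sink parameter would need tainted <: safe, which does not hold, hence the type error.
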
Instantiated Typing Rules for SFlow
(TCALL) [typing rule figure]
Viewpoint adaptation accounts for context sensitivity. qy is the context of adaptation.
Additional constraints…

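The slide does not preserve the definition of viewpoint adaptation. The sketch below assumes the usual form for a polymorphic qualifier, in which poly is replaced by the context qualifier while tainted and safe are left unchanged. That definition and the names `ViewpointAdaptation` and `adapt` are assumptions made for illustration.

```java
class ViewpointAdaptation {
    enum Qualifier { SAFE, POLY, TAINTED }

    // Assumed definition: poly takes on the context's qualifier,
    // while tainted and safe are unaffected by the context.
    static Qualifier adapt(Qualifier context, Qualifier declared) {
        return declared == Qualifier.POLY ? context : declared;
    }

    public static void main(String[] args) {
        // A poly return type behaves as tainted in a tainted context...
        System.out.println(adapt(Qualifier.TAINTED, Qualifier.POLY));
        // ...and as safe in a safe context, which is what gives context sensitivity.
        System.out.println(adapt(Qualifier.SAFE, Qualifier.POLY));
    }
}
```

Under this assumption the same method can be used in one context where it carries tainted data and in another where it carries safe data, without the two uses interfering.
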
Set-based Solver
Set mapping S:
◦ Maps each variable to a set of qualifiers drawn from {tainted, poly, safe}
Iterates over statements s
◦ Removes infeasible qualifiers for each variable in s according to the typing rule
Until it reaches a fixpoint, and outputs
◦ Type errors if one or more variables get assigned the empty set, or
◦ A set-based solution

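The fixpoint iteration can be sketched over plain subtyping constraints. This is a simplified illustration, not SFlowInfer's implementation: it ignores viewpoint adaptation and fields, and the `Subtype` constraint class, the variable names and `solveExample` are hypothetical.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SetBasedSolverSketch {
    enum Qualifier { SAFE, POLY, TAINTED }

    // Hypothetical constraint form: sub <: sup between two named variables.
    static final class Subtype {
        final String sub, sup;
        Subtype(String sub, String sup) { this.sub = sub; this.sup = sup; }
    }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal(); // safe <: poly <: tainted
    }

    // Removes infeasible qualifiers until a fixpoint, or until some set becomes empty.
    static Map<String, EnumSet<Qualifier>> solve(Map<String, EnumSet<Qualifier>> s,
                                                 List<Subtype> constraints) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Subtype c : constraints) {
                EnumSet<Qualifier> lo = s.get(c.sub);
                EnumSet<Qualifier> hi = s.get(c.sup);
                // Keep q in S(sub) only if some p in S(sup) satisfies q <: p.
                if (lo.removeIf(q -> hi.stream().noneMatch(p -> isSubtype(q, p)))) changed = true;
                if (lo.isEmpty()) return s; // type error: no qualifier is feasible
                // Keep p in S(sup) only if some q in S(sub) satisfies q <: p.
                if (hi.removeIf(p -> lo.stream().noneMatch(q -> isSubtype(q, p)))) changed = true;
                if (hi.isEmpty()) return s; // type error: no qualifier is feasible
            }
        }
        return s;
    }

    static boolean hasTypeError(Map<String, EnumSet<Qualifier>> s) {
        return s.values().stream().anyMatch(set -> set.isEmpty());
    }

    // Solves a simplified source-to-sink chain; withSource toggles the tainted source.
    static Map<String, EnumSet<Qualifier>> solveExample(boolean withSource) {
        Map<String, EnumSet<Qualifier>> s = new LinkedHashMap<>();
        s.put("source", withSource ? EnumSet.of(Qualifier.TAINTED) : EnumSet.allOf(Qualifier.class));
        s.put("name", EnumSet.allOf(Qualifier.class));
        s.put("str", EnumSet.allOf(Qualifier.class));
        s.put("sinkParam", EnumSet.of(Qualifier.SAFE));
        return solve(s, Arrays.asList(
                new Subtype("source", "name"),
                new Subtype("name", "str"),
                new Subtype("str", "sinkParam")));
    }

    public static void main(String[] args) {
        System.out.println(solveExample(true));  // the set for str ends up empty
        System.out.println(solveExample(false)); // every set stays non-empty
    }
}
```

With the tainted source in place, the set for `str` is emptied because nothing both above tainted and below safe exists, which corresponds to a reported type error. Without the source, the solver terminates with a non-empty set for every variable.
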
From Stanford Securibench-micro
    StringBuffer buf;
    …
    foo(buf, buf, resp, req);

    void foo(StringBuffer b, StringBuffer b2, ServletResponse resp, ServletRequest req) {
        String name;
        name = req.getParameter(NAME); // source
        b.append(name);
        PrintWriter writer = resp.getWriter();
        String str = b2.toString();
        writer.println(str); // sink, BAD!
    }

Set-based Solver
Each variable starts with the full set; the solver then removes infeasible qualifiers.
    {tainted,poly,safe} StringBuffer buf;
    …
    foo(buf, buf, resp, req);

    void foo({tainted,poly,safe} StringBuffer b, {tainted,poly,safe} StringBuffer b2, ServletResponse resp, ServletRequest req) {
        {tainted,poly,safe} String name;
        name = req.getParameter(NAME); // source
        b.append(name);
        PrintWriter writer = resp.getWriter();
        {tainted,poly,safe} String str = b2.toString();
        writer.println(str); // sink, BAD: flow from source!
    }
Type error! A tainted or poly str cannot be assigned to the safe parameter!

Set-based Solver (Cont’d)
What if the set-based solver terminates without a type error?
Extract the maximal typing from the set-based solution according to the preference ranking tainted > poly > safe
◦ If S(x) = {poly, safe}, the maximal typing types x as poly
Unfortunately, the maximal typing for SFlow does not always type-check

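Extracting the maximal typing amounts to picking, for each variable, the highest-ranked qualifier left in its set. A minimal sketch, with hypothetical names (`MaximalTypingSketch`, `maximal`):

```java
import java.util.Collections;
import java.util.EnumSet;

class MaximalTypingSketch {
    // Declared in increasing preference, so the natural enum order
    // matches the ranking tainted > poly > safe.
    enum Qualifier { SAFE, POLY, TAINTED }

    // Picks the most preferred qualifier from a non-empty set.
    static Qualifier maximal(EnumSet<Qualifier> set) {
        if (set.isEmpty()) {
            throw new IllegalArgumentException("empty set: this variable has a type error");
        }
        return Collections.max(set); // enums compare by declaration order
    }

    public static void main(String[] args) {
        // S(x) = {poly, safe} is typed poly, as on the slide.
        System.out.println(maximal(EnumSet.of(Qualifier.POLY, Qualifier.SAFE)));
    }
}
```

Choosing each variable's maximum independently is what can go wrong: the per-variable choices are individually feasible but need not satisfy the typing rules together.
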
Maximal Typing
    class A {
        String f;
        String get(A this) { return this.f; }
    }
    A y = ...;
    String x = y.get();
    writer.println(x); // sink
Unfortunately, the maximal typing for SFlow does not always type-check!

Maximal Typing (Cont’d)
    class A {
        {poly} String f;
        {poly,safe} String get({poly,safe} this) { return this.f; }
    }
    {tainted,poly,safe} A y = ...;
    {safe} String x = y.get();
    writer.println(x);
✗ The maximal typing extracted from these sets does not type-check.

Method Summary Constraints
Reflect the relations between parameters and return values
Further remove infeasible qualifiers
    String id(String p) {
        String x = p;
        return x;
    }

Method Summary Constraints (Cont’d)
    class A {
        {poly} String f;
        {poly,safe} String get({poly,safe} this) { return this.f; }
    }
    {tainted,poly,safe} A y = ...;
    {safe} String x = y.get();
    writer.println(x);
✔

Reflection, Libraries and Frameworks
Reflective object creation is easy!
There is no need to abstract heap objects!
Flow from x to y is reflected through subtyping x <: y
    X x = (X) Class.forName("str").newInstance();
    x.f = a; // a is a source
    y = x;
    b = y.f; // b is a sink

Reflection, Libraries and Frameworks (Cont’d)
Libraries (JDK, third-party, frameworks)
Unknown library methods are typed poly, poly → poly
    safe l = r.m(r1, r2)
    l = r.m(tainted r1, r2)

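One way to read the poly, poly → poly default is that a call to an unknown library method is only as safe as its least safe input. The sketch below is a simplification that treats the receiver and arguments as already resolved to concrete qualifiers and computes the smallest qualifier the result can take; the names `PolyLibraryCall` and `resultOf` are hypothetical.

```java
class PolyLibraryCall {
    enum Qualifier { SAFE, POLY, TAINTED }

    // For a method typed poly, poly -> poly, every actual must be a subtype of the
    // call's context, and the result takes that context. The least such context is
    // therefore the maximum qualifier among the receiver and the arguments.
    static Qualifier resultOf(Qualifier receiver, Qualifier... args) {
        Qualifier context = receiver;
        for (Qualifier a : args) {
            if (a.ordinal() > context.ordinal()) {
                context = a;
            }
        }
        return context;
    }

    public static void main(String[] args) {
        // All inputs safe: the result may be assigned to a safe variable.
        System.out.println(resultOf(Qualifier.SAFE, Qualifier.SAFE, Qualifier.SAFE));
        // One tainted argument: the result is tainted.
        System.out.println(resultOf(Qualifier.SAFE, Qualifier.TAINTED, Qualifier.SAFE));
    }
}
```

Read in the other direction, requiring a safe result forces the receiver and every argument to be safe, which matches the first call on the slide.
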
Reflection, Libraries and Frameworks (Cont’d)
Frameworks (e.g., Struts, Spring)
◦ Framework classes/interfaces are subclassed/implemented in web application code
Superclass-subclass relation is handled using function subtyping constraints
    UserAction.execute(ActionForm userForm) <: Action.execute(tainted ActionForm form)
    entails form <: userForm // userForm is tainted

Implementation
Built in the inference and checking framework for pluggable types [Huang et al. ECOOP’12]
◦ Instantiated the framework with SFlow
◦ Built on top of the Checker Framework [Papi et al. ISSTA’08, Dietl et al. ICSE’11]
Publicly available at
◦ http://code.google.com/p/type-inference/

Evaluation
DroidBench
◦ A suite of 39 Android apps by [Arzt et al. PLDI’14] for evaluating taint analysis for Android
Java web applications
◦ Stanford Securibench: a suite by Ben Livshits designed for evaluating taint analysis
◦ Other web applications from previous work
◦ 13 web applications comprising 473kLOC

DroidBench [Arzt et al. PLDI’14]

Tool Name               AppScan   Fortify SCA   FlowDroid   SFlowInfer
Correct warning ✔       14        17            26          28
False warning ✖         5         4             4           9
Missed flow             14        11            2           0
Precision ✔/(✔+✖)       74%       81%           86%         76%
Recall ✔/(✔+missed)     50%       61%           93%         100%

SFlowInfer outperforms AppScan and Fortify SCA
FlowDroid [Arzt et al. PLDI’14] is flow-sensitive
◦ DroidBench is designed for flow sensitivity

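The precision and recall rows follow directly from the warning counts. The sketch below recomputes them; the class and method names are hypothetical, and it rounds to the nearest percent, so FlowDroid's 26/30 comes out as 87% rather than the 86% shown in the table.

```java
class DroidBenchMetrics {
    // Rounds a ratio to the nearest whole percent.
    static long percent(int numerator, int denominator) {
        return Math.round(100.0 * numerator / denominator);
    }

    // Precision = correct / (correct + false warnings).
    static long precision(int correct, int falseWarnings) {
        return percent(correct, correct + falseWarnings);
    }

    // Recall = correct / (correct + missed flows).
    static long recall(int correct, int missed) {
        return percent(correct, correct + missed);
    }

    public static void main(String[] args) {
        String[] tools = {"AppScan", "Fortify SCA", "FlowDroid", "SFlowInfer"};
        int[] correct = {14, 17, 26, 28};
        int[] falseWarnings = {5, 4, 4, 9};
        int[] missed = {14, 11, 2, 0};
        for (int i = 0; i < tools.length; i++) {
            System.out.printf("%-12s precision=%d%% recall=%d%%%n",
                    tools[i], precision(correct[i], falseWarnings[i]), recall(correct[i], missed[i]));
        }
    }
}
```

The numbers show the trade-off on this benchmark: SFlowInfer misses no flows, at the cost of more false warnings than FlowDroid.
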
Java Web Applications
We manually examined all type errors
Parameter Manipulation / SQL Injection
◦ 7 benchmarks have no type errors
◦ 66 type errors correspond to true flows
◦ Average false positive rate: 15%
Parameter Manipulation / XSS
◦ 8 benchmarks have no type errors
◦ 143 type errors correspond to true flows
◦ Average false positive rate: 4%

Runtime Performance
SFlowInfer takes less than 3 minutes on all but 2 benchmarks
The largest benchmark, photov (126kLOC), takes 640 seconds
◦ Can be optimized
Maximal heap size is set to 2GB!

Conclusion
A type system for secure information flow
An efficient type inference algorithm
◦ Effective taint analysis tool
Evaluation on 473kLOC
Publicly available at
◦ http://code.google.com/p/type-inference/