Secure Kernel Machines against Evasion Attacks
Paolo Russu, Ambra Demontis, Battista Biggio, Giorgio Fumera, Fabio Roli
[email protected]
Pattern Recognition and Applications Lab, Dept. of Electrical and Electronic Engineering, University of Cagliari, Italy
AISec 2016 – Vienna, Austria – Oct. 28th, 2016
• Gradients of g(x) can be analytically computed in many cases
  – SVMs, neural networks

f(x) = sign(g(x)) = +1 (malicious), −1 (legitimate)

min_{x'} g(x')
s.t. d(x, x') ≤ dmax

Feasible domain: d(x, x') ≤ dmax
[Biggio et al., ECML 2013]
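Under these assumptions the attack reduces to projected gradient descent on g. A minimal sketch (our own illustration with a toy linear discriminant and an l2 distance; `evade` and all names are ours, not the paper's):

```python
import numpy as np

def evade(x, grad_g, d_max, step=0.1, n_iter=200):
    """Minimize g(x') subject to ||x' - x||_2 <= d_max (projected gradient descent)."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv - step * grad_g(x_adv)   # descend on the discriminant g
        delta = x_adv - x
        norm = np.linalg.norm(delta)
        if norm > d_max:                        # project back onto the feasible ball
            x_adv = x + delta * (d_max / norm)
    return x_adv

# toy linear discriminant g(x) = w.x + b; x is classified malicious if g(x) > 0
w, b = np.array([1.0, -2.0]), 0.5
g = lambda z: w @ z + b
x = np.array([2.0, 1.0])                         # g(x) = 0.5 > 0: detected
x_adv = evade(x, grad_g=lambda z: w, d_max=1.0)  # g(x_adv) = 0.5 - sqrt(5) < 0
```

With a linear g the optimum is simply a step of length d_max along −w/‖w‖; the loop form matters for the nonlinear discriminants discussed next.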
http://pralab.diee.unica.it
Computing Descent Directions
Support vector machines:
  g(x) = Σ_i α_i y_i k(x, x_i) + b,    ∇g(x) = Σ_i α_i y_i ∇k(x, x_i)

Neural networks (sigmoid output unit; inputs x_1 … x_d, hidden units δ_1 … δ_m, input weights v, output weights w):
  g(x) = [1 + exp(−Σ_{k=1..m} w_k δ_k(x))]^{−1}
  ∂g(x)/∂x_f = g(x)(1 − g(x)) Σ_{k=1..m} w_k δ_k(x)(1 − δ_k(x)) v_{kf}

RBF kernel gradient: ∇k(x, x_i) = −2γ exp(−γ‖x − x_i‖²)(x − x_i)
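The SVM case is easy to implement directly from the two formulas above. A small numpy sketch (function and variable names are our own):

```python
import numpy as np

def rbf_kernel(x, xi, gamma):
    """k(x, x_i) = exp(-gamma * ||x - x_i||^2)."""
    return np.exp(-gamma * np.sum((x - xi) ** 2))

def svm_score_and_grad(x, support_vectors, alpha_y, b, gamma):
    """g(x) = sum_i alpha_i y_i k(x, x_i) + b and its gradient
    for an RBF-kernel SVM (alpha_y[i] stores alpha_i * y_i)."""
    g, grad = b, np.zeros_like(x)
    for ay, xi in zip(alpha_y, support_vectors):
        k = rbf_kernel(x, xi, gamma)
        g += ay * k
        # nabla k(x, x_i) = -2 * gamma * exp(-gamma ||x - x_i||^2) (x - x_i)
        grad += ay * (-2.0 * gamma * k * (x - xi))
    return g, grad
```

The gradient can be checked against finite differences, which is a useful sanity test before plugging it into the attack loop.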
But… what if the classifier is non-differentiable?
[Biggio et al., ECML 2013]
Evasion of Non-differentiable Classifiers
[Diagram: black-box attack – data sampled from p(X, Y) → surrogate training data → send queries to the targeted classifier f(x), get labels → learn surrogate classifier f'(x)]
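A minimal sketch of this surrogate strategy (the black-box target and the logistic-regression surrogate below are our toy stand-ins, not the actual systems attacked in the paper): query the target for labels, then fit a differentiable surrogate whose gradients can drive the attack.

```python
import numpy as np

def target_label(x):
    """Black-box classifier f(x): we only observe its predicted label."""
    return 1 if x[0] + x[1] > 1.0 else -1

def fit_surrogate(X, y, lr=0.5, n_iter=500):
    """Fit a logistic-regression surrogate f'(x) = w.x + b by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(y * (X @ w + b)))   # sigmoid(-y * score)
        w -= lr * (-(p * y) @ X) / len(X)
        b -= lr * (-(p * y).sum()) / len(X)
    return w, b

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))        # surrogate training data
y = np.array([target_label(x) for x in X])  # labels obtained by querying f
w, b = fit_surrogate(X, y)                  # differentiable surrogate f'
```

The attack then proceeds as in the differentiable case, using ∇ of the surrogate in place of the unavailable gradient of f.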
Dense and Sparse Evasion Attacks
• l2-norm noise corresponds to dense evasion attacks
  – all features are modified by a small amount
• l1-norm noise corresponds to sparse evasion attacks
  – few features are significantly modified
Dense (l2): min_{x'} g(x')  s.t. ‖x − x'‖₂² ≤ dmax
Sparse (l1): min_{x'} g(x')  s.t. ‖x − x'‖₁ ≤ dmax
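In a projected-gradient attack, the two constraints differ only in the projection step. A sketch of both projections (our own code; the l1 case uses the standard sorting-based projection onto the l1 ball):

```python
import numpy as np

def project_l2(delta, d_max):
    """Scale the perturbation back onto the l2 ball of radius d_max."""
    n = np.linalg.norm(delta)
    return delta if n <= d_max else delta * (d_max / n)

def project_l1(delta, d_max):
    """Project the perturbation onto the l1 ball of radius d_max
    (soft-thresholding: only the largest components survive -> sparse noise)."""
    if np.abs(delta).sum() <= d_max:
        return delta.copy()
    u = np.sort(np.abs(delta))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - d_max)[0][-1]
    theta = (css[rho] - d_max) / (rho + 1.0)
    return np.sign(delta) * np.maximum(np.abs(delta) - theta, 0.0)
```

The l2 projection rescales every component (dense noise), while the l1 projection zeroes out all but the largest ones (sparse noise), matching the two attack models above.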
Goal of This Work
• Secure learning against evasion attacks exploits game-theoretical models, robust optimization, multiple classifiers, adversarial training, etc.
• Practical adoption of current secure learning algorithms is hindered by several factors:
  – strong theoretical requirements
  – complexity of implementation
  – scalability issues (computational time and space for training)
Our goal: to develop secure kernel machines that are not computationally more demanding than their non-secure counterparts.
Security of Linear Classifiers
Secure Linear Classifiers
• Intuition in previous work on spam filtering [Kolcz and Teo, CEAS 2007; Biggio et al., IJMLC 2010]
  – the attacker aims to modify few features
  – features assigned the highest absolute weights are modified first
  – heuristic methods to design secure linear classifiers with more evenly-distributed weights
• We now know that the aforementioned attack is sparse (l1-norm constrained)

What do more evenly-distributed weights mean from a theoretical perspective?
Robustness and Regularization [Xu et al., JMLR 2009]
• SVM learning is equivalent to a robust optimization problem
  – the regularizer depends on the noise assumed on the training data!
min_{w,b} ½ wᵀw + C Σ_i max(0, 1 − y_i f(x_i))
    ⇕
min_{w,b} max_{u_i ∈ U} Σ_i max(0, 1 − y_i f(x_i + u_i))
l2-norm regularization is optimal against l2-norm noise!
infinity-norm regularization is optimal against l1-norm noise!
Infinity-norm SVM (I-SVM)
• Infinity-norm regularizer optimal against sparse evasion attacks
min_{w,b} ‖w‖_∞ + C Σ_i max(0, 1 − y_i f(x_i)),   with ‖w‖_∞ = max_{j=1,…,d} |w_j|
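The infinity-norm regularizer is convex but non-smooth, so a plain subgradient method suffices for a prototype. A minimal numpy sketch of I-SVM training (our own illustration, not the solver used in the paper):

```python
import numpy as np

def train_isvm(X, y, C=1.0, lr=0.01, n_iter=2000):
    """Subgradient descent on ||w||_inf + C * sum_i max(0, 1 - y_i (w.x_i + b))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        # subgradient of the hinge terms (only margin violations contribute)
        margins = y * (X @ w + b)
        active = margins < 1.0
        gw = -C * (y[active] @ X[active])
        gb = -C * y[active].sum()
        # subgradient of ||w||_inf: sign of (one of) the largest-|w_j| entries
        if np.any(w != 0):
            j = np.argmax(np.abs(w))
            gw[j] += np.sign(w[j])
        w -= lr * gw
        b -= lr * gb
    return w, b
```

Because the ∞-norm subgradient only penalizes the single largest weight, training tends to equalize the weights, which is exactly the evenly-distributed-weights intuition made formal.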
Cost-sensitive Learning
• Unbalancing the cost of classification errors to account for different levels of noise over the training classes [Katsumata and Takeda, AISTATS 2015]
• Evasion attacks: a higher amount of noise on malicious data
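In the hinge-loss formulation this amounts to a separate cost per class, e.g. a larger C for the malicious class. A minimal numpy sketch (our own illustration, with hypothetical parameter names C_pos / C_neg):

```python
import numpy as np

def train_cost_sensitive_svm(X, y, C_pos=4.0, C_neg=1.0, lam=0.01,
                             lr=0.005, n_iter=4000):
    """Linear SVM with per-class error costs:
    minimize  lam/2 ||w||^2 + sum_i C_{y_i} max(0, 1 - y_i (w.x_i + b)),
    where C_{y_i} = C_pos for malicious (+1) and C_neg for benign (-1) samples."""
    w, b = np.zeros(X.shape[1]), 0.0
    C = np.where(y > 0, C_pos, C_neg)          # per-sample cost
    for _ in range(n_iter):
        margins = y * (X @ w + b)
        active = margins < 1.0                  # samples violating the margin
        gw = lam * w - (C[active] * y[active]) @ X[active]
        gb = -(C[active] * y[active]).sum()
        w -= lr * gw
        b -= lr * gb
    return w, b
```

A larger C_pos penalizes errors on malicious samples more heavily, pushing the boundary away from the malicious class — the effect the slide attributes to the higher noise on malicious data.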
Experiments on MNIST Handwritten Digits
• 8 vs 9, 28x28 images (784 features – grey-level pixel values)
• 500 training samples, 500 test samples, 5 repetitions
• Parameter tuning (max. detection rate at 1% FP)
[Figure: two plots of TP at FP=1% vs. dmax for SVM, cSVM, I-SVM and cI-SVM – left: handwritten digits (dense attack), right: handwritten digits (sparse attack)]
Examples of Manipulated MNIST Digits
[Figure: original sample and manipulated digits.
Sparse evasion attacks (l1-norm constrained): SVM g(x) = −0.216, cSVM g(x) = −0.158, I-SVM g(x) = 0.112, cI-SVM g(x) = 0.148.
Dense evasion attacks (l2-norm constrained): SVM g(x) = 0.213, cSVM g(x) = 0.242, I-SVM g(x) = −0.163, cI-SVM g(x) = −0.018.]
Experiments on Spam Filtering
• 5,000 samples from TREC 07 (spam/ham emails)
• 200 features (words) selected to maximize information gain
• Parameter tuning (max. detection rate at 1% FP)
• Results averaged over 5 repetitions
[Figure: TP at FP=1% vs. dmax for SVM, cSVM, I-SVM and cI-SVM – spam filtering (sparse attack)]
Security of Non-linear Classifiers
Secure Nonlinear Classifiers (Intuition)
Secure Kernel Machines
• Key idea: better enclose the benign data (eliminate blind spots)
  – as in adversarial training / game-theoretical models
• We can achieve a similar effect by properly modifying the SVM parameters (classification costs and kernel parameters)
• Adversary's capability
  – adding up to dmax API calls
  – removing API calls may compromise the embedded malware code
[Diagram: JavaScript code → API reference extraction (API references) → API reference selection (suspicious references) → learning-based classifier → benign / malicious; runtime analysis provides known labels for training]
Experiments on PDF Malware Detection
min_{x'} g(x')
s.t. d(x, x') ≤ dmax
     x ≤ x'   (features can only be incremented, e.g., by adding API calls)

Example API references: eval, isNaN, this.getURL, …
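A sketch of the attack loop under the extra constraint x ≤ x' (our own code, assuming an l1 budget dmax on the number of added features): after each gradient step, clip the sample so features can only grow, then rescale onto the budget.

```python
import numpy as np

def evade_additive(x, grad_g, d_max, step=0.5, n_iter=100):
    """Minimize g(x') s.t. ||x' - x||_1 <= d_max and x <= x'
    (only feature increments are allowed, e.g. adding API calls)."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv - step * grad_g(x_adv)
        x_adv = np.maximum(x_adv, x)           # enforce x <= x': never remove features
        delta = x_adv - x
        s = delta.sum()                        # delta >= 0, so this is ||delta||_1
        if s > d_max:                          # rescale onto the l1 budget
            x_adv = x + delta * (d_max / s)
    return x_adv
```

The rescaling preserves delta ≥ 0, so both constraints hold at every iteration; features whose gradient pushes them downward simply stay at their original value.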
Experiments on PDF Malware Detection
• Lux0R data: benign / malicious PDF files with JavaScript
  – 5,000 training samples, 5,000 test samples, 5 repetitions
  – 100 API calls selected from training data
• Detection rate (TP) at FP=1% vs max. number of added API calls
Conclusions and Future Work
• Classifier security can be significantly improved by properly tuning classifier parameters
  – regularization terms
  – cost-sensitive learning, kernel parameters
Future Work
• Security / complexity comparison against current adversarial approaches
  – adversarial training / game-theoretical models
• More theoretical insights on classifier / feature vulnerability