Pattern Recognition and Applications Group
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
Adversarial pattern classification using multiple classifiers and randomization
Battista Biggio, Giorgio Fumera, Fabio Roli
S+SSPR 2008, Orlando, Florida, December 4th, 2008
[Diagram: physical process → acquisition/measurement → pattern (image, text document, ...) → feature vector (x1, x2, ..., xn) → learning algorithm → classifier; random noise enters at the measurement stage]
Example: OCR
But many security applications, such as spam filtering, do not fit well with the above model:
- noise is not random, but adversarial: errors are malicious
- false negatives are not random; they are crafted to evade the classifier
- training data can be "tainted" by the attacker
- an important property of a classifier is its "hardness of evasion", that is, the effort the attacker must spend to evade it
Standard pattern classification model
Adversarial pattern classification
It's a game with two players: the classifier and the adversary
- The adversary camouflages illegitimate patterns in an adversarial way to evade the classifier
- The classifier should be adversary-aware, handling the adversarial noise and implementing defence strategies
[Diagram: measurement → pattern (e-mail, network packet, fingerprint, ...) → feature vector (x1, x2, ..., xn) → learning algorithm → classifier; adversarial noise replaces the random noise of the standard model]
Example: spam e-mails
Spam message: "CNBC Features MPRG on PowerLunch Today, Price Climbs 74%!" The Motion Picture Group, Symbol: MPRG, Price: $0.33, UP 74%
An example of adversarial classification
Feature weights: buy = 1.0, viagra = 5.0

From: [email protected]
Buy Viagra!

Total score = 6.0 > 5.0 (threshold) → Spam
Spam Filtering: Linear Classifier (1st round)
Note that the popular SpamAssassin filter is essentially a linear classifier. See http://spamassassin.apache.org
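A linear spam filter of this kind can be sketched in a few lines. This is a toy illustration using the slide's weights (buy = 1.0, viagra = 5.0) and threshold (5.0), not SpamAssassin's real rule set; the word-stripping logic is an assumption for the example.

```python
# Toy linear spam scorer: each known word contributes a signed weight,
# and the message is flagged as spam when the total exceeds a threshold.
FEATURE_WEIGHTS = {"buy": 1.0, "viagra": 5.0}
THRESHOLD = 5.0

def score(message: str) -> float:
    # Strip trailing punctuation and look each lowercase word up in the weights.
    return sum(FEATURE_WEIGHTS.get(w.strip("!@.,"), 0.0)
               for w in message.lower().split())

def classify(message: str) -> str:
    return "spam" if score(message) > THRESHOLD else "ham"
```

For the slide's message, score("Buy Viagra!") yields 1.0 + 5.0 = 6.0, which exceeds the threshold, so the message is classified as spam.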
A game in the feature space… (1st round)
[Figure: decision boundary yc(x) separating spam (+) and ham (−) samples in the feature space (X1, X2)]
Feature weights: buy = 1.0, viagra = 5.0

The classifier's weights are learnt using an initial "untainted" training set. See, for example, the case of the SpamAssassin filter: http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron

From: [email protected]
Buy Viagra!
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
An example of adversarial classification
Spammer attacks by adding "good" words… Linear Classifier (2nd round)

Feature weights: buy = 1.0, viagra = 5.0, University = −2.0, Florida = −3.0

From: [email protected]
Buy Viagra!
Florida University
Nanjing

Total score = 1.0 < 5.0 (threshold) → Ham
A game in the feature space… (2nd round)

Feature weights: buy = 1.0, viagra = 5.0, University = −2.0, Florida = −3.0
Adding "good" words is a typical trick used by spammers to evade a filter. The spammer's goal is to modify the mail so that the filter is evaded while the message remains understandable to humans.
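The good-word attack can be demonstrated directly on the toy linear scorer. This sketch uses the slide's weights, including the negative "good" words University and Florida; the punctuation handling is an assumption for the example.

```python
# Good-word attack on a toy linear filter: appending negatively weighted
# words drags the score of a spam message below the decision threshold.
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
THRESHOLD = 5.0

def score(message: str) -> float:
    return sum(WEIGHTS.get(w.strip("!@.,").lower(), 0.0)
               for w in message.split())

original = "Buy Viagra!"                           # scores 6.0: spam
attacked = original + " Florida University Nanjing"  # scores 1.0: slips under the threshold
```

The attacked message keeps its spam payload readable to a human, yet the filter's score drops from 6.0 to 1.0 (1.0 + 5.0 − 3.0 − 2.0), below the 5.0 threshold.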
Spammer attacks by adding "good" words…

[Figure: the camouflaged message crosses the boundary yc(x) into the ham (−) region of the feature space (X1, X2)]

From: [email protected]
Buy Viagra!
Florida University
Nanjing
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Modelling the spammer's attack strategy
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
[Figure: the adversary moves a malicious pattern x across the boundary yc(x) to x' in the feature space (X1, X2)]

The adversary uses a strategy function A(x) to select malicious patterns that can be camouflaged as innocent with minimum cost W(x, x'):

A(x) = argmax_{x' ∈ X} [ U_A(y_c(x'), +) − W(x, x') ]
The adversary's utility is higher when malicious patterns are misclassified:

U_A(−, +) > U_A(+, +)

For spammers, the cost W(x, x') is related to adding words, replacing words, etc. The adversary transforms a malicious pattern x into an innocent-looking pattern x' if the camouflage cost W(x, x') is lower than the utility gain. In spam filtering, the adversary selects spam mails that can be camouflaged as ham with a minimum number of modifications of the mail content.
An example of adversarial classification
Classifier reaction by retraining… Linear Classifier (3rd round)

Feature weights: buy = 1.0, viagra = 5.0, University = −0.3, Florida = −0.3

From: [email protected]
Buy Viagra!
Florida University
Nanjing

Total score = 5.4 > 5.0 (threshold) → Spam
Modelling classifier reaction (3rd round)

Feature weights: buy = 1.0, viagra = 5.0, University = −2.0, Florida = −3.0
[Figure: the retrained boundary yc(x) again assigns the camouflaged message to the spam (+) region of the feature space (X1, X2)]

From: [email protected]
Buy Viagra!
Florida University
Nanjing
Classifier retraining…
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
The classifier is adversary-aware: it takes into account the previous moves of the adversary. In real cases, this means that the filter's user provides the correct labels for mislabelled mails. The classifier constructs a new decision boundary yc(x) if this move yields a utility higher than the cost of extracting features and re-training.
Adversary-aware classifier
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Results reported in this paper showed that classifier performance degrades significantly if the adversarial nature of the task is not taken into account, while an adversary-aware classifier performs significantly better. By anticipating the adversary's strategy, we can defeat it.
"If you know the enemy and know yourself, you need not fear the result of a hundred battles" (Sun Tzu, 500 BC)
Real anti-spam filters should be adversary-aware, which means that they should adapt to and anticipate the adversary's moves: exploiting the feedback of the user, changing their operation, etc.
Beyond classifier retraining…

[Figure: minimum-cost camouflage(s) of a spam message ("BUY VI@GRA!"): the point x with C(x) = + is moved to x' in the C(x) = − region of the feature space (x1, x2)]
Real anti-spam filters can be re-trained using the feedback of users, who can provide correct labels for the mislabelled mails. In the model of Dalvi et al., this corresponds to assuming perfect knowledge of the adversary's strategy function A(x).
Defence strategies in adversarial classification

Beyond retraining, are there other defence strategies that we can implement?
A defence strategy: hiding information by randomization

"Keep the adversary guessing. If your strategy is a mystery, it cannot be counteracted. This gives you a significant advantage" (Sun Tzu, 500 BC)
[Figure: two random realizations y1(x), y2(x) of the boundary yc(x) in the feature space (X1, X2); the adversary moving x to x' wonders: "Am I evading it?"]
An intuitive strategy for making a classifier harder to evade is to hide information about it from the adversary. A possible implementation of this strategy is to introduce some randomness in the placement of the classification boundary.
A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of yc(x); the camouflage A(x) = x' evades y2(x) but does not evade y1(x)]
Consider a randomized classifier yc(x, T), where the random variable is the training set T.

Example: assume that UA(−,+) = 5, UA(+,+) = 0, W(x, x') = 3.

Case 1: the adversary knows the actual boundary y2(x). The adversary's gain from changing x into x' is UA(−,+) − W(x, x') = 5 − 3 = 2 > 0, so the adversary makes the transformation and evades the classifier.

Case 2: two random boundaries with P(y1(x)) = P(y2(x)) = 0.5. The expected gain is [UA(+,+) · P(y1(x)) + UA(−,+) · P(y2(x))] − W(x, x') = [0 · 0.5 + 5 · 0.5] − 3 = 2.5 − 3 < 0, so the adversary does not move, even though such a move would evade y2(x).
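The two cases can be checked numerically. This is a direct transcription of the slide's numbers (UA(−,+) = 5, UA(+,+) = 0, W(x, x') = 3, equiprobable boundaries), nothing more.

```python
# Deterministic vs. randomized boundary: does the camouflage x' pay off?
U_FN = 5.0   # adversary utility when the malicious pattern evades (false negative)
U_TP = 0.0   # adversary utility when it is detected
W = 3.0      # camouflage cost W(x, x')

# Case 1: the adversary knows the deployed boundary y2(x), which x' evades.
gain_deterministic = U_FN - W                 # 5 - 3 = 2 > 0: attack pays off

# Case 2: y1(x) and y2(x) each deployed with probability 0.5; x' evades only y2(x).
expected_gain = 0.5 * U_TP + 0.5 * U_FN - W   # 2.5 - 3 = -0.5 < 0: no attack
```

The rational adversary attacks in Case 1 but stays put in Case 2, which is exactly why randomizing the boundary makes the classifier harder to evade.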
A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of yc(x) in the feature space (X1, X2); the adversary cannot tell whether A(x) = x' evades the deployed boundary]
Why is a randomized classifier harder to evade?
In the Proceedings paper we show that the adversary's strategy A(x) becomes suboptimal: the adversary either does not camouflage malicious patterns that would allow evading the classifier, or camouflages malicious patterns that the classifier already misclassifies.
E_{Y_c}{ A(x) } = argmax_{x' ∈ X} [ E_{Y_c}{ U_A(y_c(x'), +) } − W(x, x') ]

E_{Y_c}{ A(x) } ≠ A(x | y_c(x')) = A_opt(x)
Key points:
- yc(x) becomes a random variable Yc
- The adversary has to compute the expected value of A(x) by averaging over the possible realizations of yc(x)
[Diagram: SpamAssassin architecture — Black/White List, URL Filter, Signature Filter, Header Analysis and Content Analysis modules feed a weighted combiner Σ that assigns the class: legitimate or spam]
Evade-hard MCS with randomization
http://spamassassin.apache.org
The defence strategy based on "randomization" can be implemented in several ways. We implemented it using the multiple classifier approach, by randomising the combination function. For our experiments, we used the SpamAssassin filter, which is basically a linearly weighted combination of classifiers, and randomized the weights by training-set bootstrapping.
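Training-set bootstrapping can be sketched as follows. This is a simplified illustration, not the paper's pipeline: the paper trains a linear SVM on the outputs of SpamAssassin's individual tests, whereas the naive centroid-difference "training" and the class-stratified resampling below are assumptions made so the example stays self-contained.

```python
# Weight randomization by training-set bootstrapping: each bootstrap
# replicate of the training set yields a slightly different weight vector
# for the linear combiner, so the adversary cannot know which one is deployed.
import random

def train_weights(samples):
    """samples: list of (feature_vector, label), label +1 = spam, -1 = ham.
    Toy 'training': difference of per-class feature means."""
    n = len(samples[0][0])
    spam = [x for x, y in samples if y == +1]
    ham = [x for x, y in samples if y == -1]
    mean = lambda vs, i: sum(v[i] for v in vs) / len(vs)
    return [mean(spam, i) - mean(ham, i) for i in range(n)]

def bootstrap_weight_sets(training_set, n_sets=100, seed=0):
    """Resample each class with replacement (stratified, so no replicate
    is missing a class) and train one weight vector per replicate."""
    rng = random.Random(seed)
    spam = [s for s in training_set if s[1] == +1]
    ham = [s for s in training_set if s[1] == -1]
    weight_sets = []
    for _ in range(n_sets):
        replicate = rng.choices(spam, k=len(spam)) + rng.choices(ham, k=len(ham))
        weight_sets.append(train_weights(replicate))
    return weight_sets
```

With n_sets = 100, the adversary faces 100 equiprobable weight vectors instead of one known one, which is the setting of the experiments below.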
Experiments with multiple classifiers and randomization
E-mail data set: TREC 2007
- 75,419 real e-mail messages received between April and July 2007
- 25,220 ham, 50,199 spam
SpamAssassin architecture
Experimental set up: we used the SpamAssassin filter with a weighted sum as combination function (an SVM with linear kernel).

Randomization of the combination function by bootstrap: the adversary "sees" 100 different sets of weights with identical probability.

Key point: the adversary does not know the actual set of weights deployed for combining the multiple classifiers (filtering rules), so it can devise only a suboptimal strategy A(x).
Deterministic vs. randomized combiner:

       FN (%)   U_C    U_A
det    19.55    1.30   0.98
rnd    11.21    1.46   0.56

The average false negative rate decreases from 19.55% to 11.21% when the classifier uses randomization. This is confirmed by the decrease of the adversary's utility and the increase of the classifier's utility.
We assume that the adversary can make any modification that reduces the score of a rule.