Pattern Recognition Letters 135 (2020) 146–152
Steganographic universal adversarial perturbations
Salah Ud Din∗, Naveed Akhtar, Shahzad Younis, Faisal Shafait, Atif Mansoor, Muhammad Shafique

School of Electrical Engineering and Computer Science (SEECS), National University of Sciences & Technology (NUST), H-12, Islamabad 44000, Pakistan

∗ Corresponding author. E-mail address: [email protected] (S. Ud Din).
Article info
Article history:
Received 19 July 2019
Revised 24 March 2020
Accepted 17 April 2020
Available online 25 April 2020
Keywords:
Adversarial attack
Steganography
Deep neural networks
Wavelet transform
Abstract
We propose a steganography based technique to generate adversarial perturbations that fool deep models on any image. The proposed perturbations are computed in a transform domain, where a single secret image embedded in any target image makes any deep model misclassify the target image with high probability. The attack resulting from our perturbation is ideal for the black-box setting, as it does not require any information about the target model. Moreover, being a non-iterative technique, our perturbation estimation remains computationally efficient. The computed perturbations are also imperceptible to humans, while they achieve high fooling ratios for models trained on the large-scale ImageNet dataset. We demonstrate successful fooling of ResNet-50, VGG-16, Inception-V3 and MobileNet-V2, achieving up to 89% fooling of these popular classification models.

© 2020 Elsevier B.V. All rights reserved.
1. Introduction

Deep Neural Networks (DNNs) have demonstrated outstanding performance on many challenging tasks in speech recognition [1], natural language processing [2] and classification [3–5]. Convolutional Neural Network (CNN) based image classification [4] has revolutionized the use of deep learning in computer vision. In the last few years, researchers have been continuously devising deep learning based solutions for many complex tasks in the areas of medical science [6], safety and security [7], and self-driving vehicles [8].

As deep learning is now also being used in security-critical applications, the vulnerability of the state-of-the-art DNNs to adversarial attacks has recently attracted significant interest of researchers [9]. These attacks come in the form of a small perturbation to the input that fools the network into changing its prediction altogether. At the same time, the perturbation remains imperceptible to humans. There are several existing techniques for crafting such perturbations [10–12]. These techniques either cause the network to predict a specific class for the input, i.e. a targeted attack, or make it predict any incorrect output, i.e. a non-targeted attack. In the domain of natural images, Moosavi-Dezfooli et al. [13] computed image-agnostic perturbations that can fool networks on any image, in contrast to the commonly used perturbations that fool
deep models on individual images. However, the perturbations computed by Moosavi-Dezfooli et al. are not completely imperceptible to the human visual system. Moreover, their technique is computationally expensive as it must iterate over a large training dataset to compute the perturbation. Furthermore, it requires complete information about the weights of the target network, hence it is inherently not suitable for black-box settings where this information is not available.

In this paper, we propose another kind of image-agnostic (i.e. universal) perturbation that is computed efficiently and also does not require any network information. Moreover, it remains imperceptible to humans, see Fig. 1. We leverage steganography to compute the desired perturbations, where we hide a secret image inside the image to be classified (i.e. the host image). The perturbation is essentially performed in a transform domain (wavelet transform), as opposed to the existing convention of manipulating images in the pixel domain. We address the key challenge of identifying the appropriate frequency band/component that can be embedded in the host images for fooling the networks while preserving the perturbation imperceptibility. This completely eradicates the need of computing network gradients for the perturbation estimation. Interestingly, we find that a single secret image can be used to fool multiple networks trained on large-scale datasets with high probability. Nevertheless, we analyze multiple secret images for thoroughness. We demonstrate successful fooling of the state-of-the-art ImageNet models, including ResNet-50 [3], Inception-V3 [14], VGG-16 [15] and MobileNet-V2 [16].
Fig. 1. Adversarial examples created with Discrete Wavelet Transform and Singular Value Decomposition based steganography that successfully fool state-of-the-art DNNs. A secret image (left) is embedded in the host images (middle) in the frequency domain to get adversarial examples (right). Labels predicted by ResNet-50 are also indicated.
2. Related work

Adversarial attacks on DNNs provide an opportunity to estimate a network's robustness in adversarial settings before its deployment in the real world. They have recently attracted significant attention of the research community [9]. Szegedy et al. [11] first exposed the susceptibility of the state-of-the-art DNNs to small perturbations to input images that can lead them to incorrect predictions. Since then, several techniques have emerged to generate adversarial perturbations. Goodfellow et al. [17] presented the Fast Gradient Sign Method (FGSM) to efficiently generate adversarial perturbations for a given image using one-step gradient ascent. Instead of one large step, Kurakin et al. [18] proposed to take multiple small steps in an iterative method to compute the gradients until the perturbation achieves the required fooling.

Dong et al. [19] presented an iterative method for computing adversarial perturbations based on momentum that avoids local maxima during the iterative process. They showed that the developed algorithm can also deceive adversarially trained networks. A successful targeted attack using an adversarial patch has been demonstrated by Brown et al. [20], which can deceive a network on a large number of images. In their technique, adversarial patches are crafted under the Expectation Over Transformation framework. The patches are then added/replaced at a specific location in the natural image to make it an adversarial example. The adversarial perturbations generated by Carlini and Wagner [21] are designed to attack the defensive distillation algorithm of Papernot et al. [22]. The generated perturbations are forced to be quasi-imperceptible by restricting their ℓ0, ℓ2 or ℓ∞ norm.

Papernot et al. [10] generated adversarial examples by putting an upper bound on the ℓ0-norm of the perturbation rather than on its ℓ2 or ℓ∞ norm. Their method computes the adversarial saliency map for the network gradients after modifying the pixels of the input image one at a time. The algorithm alters the values of only those pixels that have a greater effect on fooling. Su et al. [12] modified only one pixel of the image to create an adversarial example. Their method uses Differential Evolution [23] to estimate the spatial location and RGB values of the adversarial pixel in the image. Another iterative method of computing adversarial perturbations is the DeepFool algorithm proposed by Moosavi-Dezfooli et al. [24]. Their method is based on linearizing the decision boundaries of the deep neural network around the input image. A small perturbation vector is added to the image at every iteration that takes the perturbed image towards the decision boundary. The perturbation signal at every iteration is accumulated to compute the final perturbation.
All the above mentioned methods can fool deep models on individual images for which the perturbation is computed. Moosavi-Dezfooli et al. [13] proposed universal adversarial perturbations to fool neural networks on any image. These perturbations are generated by restricting their ℓ2 or ℓ∞ norm. Although impressive fooling ratios are achieved by their perturbations, the resulting adversarial patterns become quasi-imperceptible in the images. Moreover, a large amount of training data is required to compute their perturbations. Universal perturbations are also constructed by Khrulkov and Oseledets [25] using a smaller number of images. They obtained the perturbations by taking singular values of the hidden layers' Jacobian matrices. Mopuri et al. [26] computed data independent adversarial perturbations using the fast-feature-fool method.

The universal adversarial perturbations discussed above require training data for computing the manipulative signal for network fooling. The iterative nature of the techniques to compute those perturbations also makes them computationally expensive. For instance, for the perturbations based on network gradient estimates, e.g. [13], GPU based hardware is required to compute a single perturbation in a reasonable time. In contrast to these techniques, the method proposed in this work does not require any training data. Furthermore, no information regarding the targeted network is required. This also makes our adversarial perturbation an ideal choice for black-box attacks. Moreover, requiring no gradient computations makes our technique computationally efficient.
3. Problem formulation

We use the following notation for the formal presentation of the problem. Let x ∈ R^(m×n) be a clean image that is given as an input to a well-trained DNN classifier f(·). We ignore the number of image channels for notational simplicity. The classifier maps the image to its correct label 'ℓ' with a high probability, i.e. P({f(x) : x → ℓ}) → 1. Our goal is to generate a perturbation signal 'η' which, when embedded in the clean input image, satisfies the following constraint:

    P( f(x + η) ≠ f(x) ) ≥ ψ,    (1)

where ψ is referred to as the fooling ratio, defined as:

    ψ = |{ f(x_i + η) ≠ f(x_i) }| / M,   ∀ i ∈ {1, 2, ..., M},    (2)

where M is the total number of samples in our dataset. Notice that, whereas 'η' is shown to be directly added to the images in the above formulation, it is not necessary to treat it as an additive signal during its estimation. This is one of the key differences between our treatment of 'η' and its more common handling in the existing methods, which restrict themselves to the pixel domain and treat 'η' as an additive noise.
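To make the above definition concrete, the following minimal sketch (our illustration, not code from the original work) estimates the fooling ratio ψ of Eq. (2) for an arbitrary classifier; the model and eta_embed callables are hypothetical placeholders.

```python
def fooling_ratio(model, images, eta_embed):
    """Estimate the fooling ratio psi of Eq. (2).

    model     : callable mapping an image to a predicted label.
    images    : sequence of M clean images x_i.
    eta_embed : callable returning the adversarial version of an image.
                It need not be additive; the paper embeds the
                perturbation in a transform domain instead.
    """
    fooled = sum(model(eta_embed(x)) != model(x) for x in images)
    return fooled / len(images)  # psi lies in [0, 1]
```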
Fig. 2. Illustration of the different components of the 2D Wavelet Transform of an image. The LL and HH components respectively contain the low and high frequency information, while the HL and LH components contain both high and low frequency information.
The existing adversarial attacks are directly aimed at pixel manipulation. Although we represent 'η' in Eq. (1) following the common convention, we actually compute the perturbation in a transform domain, which results in a manipulated image that is obtained with the inverse transform. This will be clarified in the next section. Another major difference between our computation of 'η' and its conventional treatment in the literature is in terms of restricting its norm to control the perturbation perceptibility. The use of a transform domain allows us to manipulate the images without paying particular attention to pixel domain norm restrictions. As will be seen shortly, the smooth holistic manipulations resulting from the proposed technique intrinsically result in imperceptible patterns embedded in the adversarial images. Following the common convention, we alternatively refer to the manipulated images as 'adversarial examples' in this paper.
4. Proposed approach

To create an adversarial example out of a natural image, we tap into the advances of Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD) based steganography [27,28]. The key intuition is that DWT has unique characteristics in terms of identifying the frequency regions where external information can be hidden effectively in an image without being easily perceivable [27]. The main concept of our technique is that we hide the low frequency components of a secret image inside the low frequency components of the host image, and also manipulate the low-to-high frequency components of the resulting image with affine transformations of the secret image. The choice of hiding the low frequency components of the secret image and its manipulated low-to-high frequency components is based on the intrinsic properties of these components. In this work, we sometimes refer to the manipulated image as the stego-image to better contextualize our method in the area of steganography. The stego-image/adversarial example is finally used to deceive the classifier.
4.1. No training data and target model required

The popular techniques in the literature to generate universal adversarial perturbations (e.g. [13,29]) require training images that are used to optimize the perturbation signals, generally with respect to a target model to be attacked. This makes the techniques both time consuming and unattractive, because the target model can often be unknown in practice. The technique proposed in this work is agnostic to the target models, and it also does not require any particular training data. All that is required is a suitable secret image that can be embedded in any image to make the latter an adversarial example. This is a highly desirable property for a universal adversarial attack. It is noted that we make a careful selection of the secret image. Further details on the selection of the secret image are provided in Section 5.1.
4.2. 2D Discrete wavelet transform decomposition

We make use of the 2-Dimensional 'Haar' based DWT [30,31] to decompose an image into its four components, denoted LL, LH, HL and HH. The LL-component is obtained by low-pass filtering the image in both the horizontal and vertical directions, leading to a feature map that contains the low frequency information of the source image. The LH-component results from low-pass filtering in the horizontal direction and high-pass filtering in the vertical direction. This order is reversed for the HL-component, while the HH-component uses high-pass filtering in both directions. Each of the resulting feature maps is a component of the original image with its own unique properties. We illustrate the 2D Haar DWT components of an example image in Fig. 2.
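For illustration, the four sub-bands can be obtained with the PyWavelets library; the snippet below is a sketch under that assumption (the paper does not name its implementation), shown for a single image channel.

```python
import numpy as np
import pywt  # PyWavelets

img = np.random.rand(224, 224)  # stand-in for one image channel

# Single-level 2D Haar DWT: approximation plus three detail sub-bands.
# NB: how pywt's (horizontal, vertical) detail bands map onto the
# paper's LH/HL naming is a convention we assume here.
LL, (LH, HL, HH) = pywt.dwt2(img, 'haar')
print(LL.shape)  # (112, 112): each sub-band is half the input size

# The inverse transform recovers the image from the four sub-bands.
rec = pywt.idwt2((LL, (LH, HL, HH)), 'haar')
assert np.allclose(rec, img)
```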
4.3. Singular value decomposition (SVD)

We use the Singular Value Decomposition (SVD) to merge the LL-components of the host and secret images, as explained below. An LL-component contains most of the image information in the low frequency bands, where slight manipulation does not cause drastic changes in perception. Using the SVD we can decompose an image component, say x̂ ∈ R^(m×n), as follows:

    x̂ = U S V^T,    (3)

where U ∈ R^(m×m) and V ∈ R^(n×n) are orthogonal matrices, and S ∈ R^(m×n) is the diagonal matrix of singular values. In this work, we apply SVD to each channel of an RGB image individually for further processing.
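A minimal NumPy sketch of Eq. (3), applied per channel as described (our illustration; function names are ours):

```python
import numpy as np

def svd_channels(component):
    """Decompose each channel of an (H, W, 3) component as in Eq. (3)."""
    factors = []
    for c in range(component.shape[2]):
        # full_matrices=False yields the compact SVD; s holds the
        # singular values, i.e. the diagonal of S in Eq. (3).
        U, s, Vt = np.linalg.svd(component[:, :, c], full_matrices=False)
        factors.append((U, s, Vt))
    return factors

def rebuild(U, s, Vt):
    """Invert Eq. (3): x_hat = U * diag(s) * V^T."""
    return (U * s) @ Vt
```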
4.4. Constructing the adversarial examples

Let x1 be the host image and x2 be the secret image. The aforementioned DWT decomposition is applied to both x1 and x2 to get their DWT components. First, we are interested in x1LL (the LL-component of x1) and x2LL (the LL-component of x2). We apply SVD to these components and fuse their singular values as follows:

    S = (1 − α) S_x1LL + α S_x2LL,    (4)

where S_comp denotes the singular value matrix (see Eq. (3)) of the DWT component in the subscript, and α is a hyper-parameter of our technique. We note that the above fusion of singular values is performed individually for all the channels in our images.

We reconstruct the LL-component of the adversarial image, say y, using the fused singular values in Eq. (4) along with the U and V matrices of x1. Hence, we refer to this component of the adversarial example as y1LL in the text to follow. Notice that the dominant low frequency features in the adversarial image are mainly influenced by the original (i.e. host) image. On the other extreme, we borrow the HH-component of the adversarial image directly from the secret image, i.e. x2HH. Human perception often finds the HH-component of DWT close to white noise, see the DWT HH in Fig. 4. Replacing this component of an image with that of another image does not drastically change the human perception of the original image. However, the high frequency variations in the quantized RGB values significantly distort the image for a DNN.
Fig. 3. Image agnostic perturbations are computed using DWT and SVD based steganography. The host and secret images undergo the 2D Haar wavelet transform to get four frequency components (LL, LH, HL, HH). Each channel of the LL-component of both images is decomposed into singular values, which are then fused to form the LL-component of the desired adversarial example. The LH and HH components of the desired image are borrowed from the host and secret image, respectively. The HL-component is also a fusion of the respective components of the two original images. The inverse transform is computed over these four components to form an adversarial example with imperceptible perturbations.
In order to strengthen the original perception of the host image, we also use the LH-component of the host image as the LH-component of the adversarial image. However, we introduce slight modifications to the HL-component of the host image as follows. We take x2HL and rotate it at four different angles, i.e. −5°, 5°, −10° and 10°. The rotated components are averaged to form x̃2HL and then fused with x1HL under a convex combination as follows:

    y2HL = (1 − β) x1HL + β x̃2HL,    (5)

where β is a hyper-parameter that controls the contribution of each component in the fusion process. Notice that we write the L.H.S. of Eq. (5) as y2HL because of the fusion. It is worth mentioning that whereas we systematically choose the rotation angles for the affine transformations, random rotations in [−10°, 10°] also work equally well. The main intuition behind performing the affine transformations on x2HL is to confuse a DNN with transformations that can naturally occur in images. Other affine transformations can also be explored for the same purpose.
Finally, to reconstruct the adversarial/stego image, we use the inverse DWT. The inverse DWT is executed over the components {y1LL, x1LH, y2HL, x2HH}. The overall process of generating the adversarial image is illustrated graphically in Fig. 3. The resulting adversarial examples have embedded perturbations that are generally very hard to perceive by humans. However, state-of-the-art CNNs get confused in classifying those images, despite showing high confidence on the correct predictions of the host/original images.
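Putting the pieces together, the sketch below is our compact reading of Section 4 for a single channel; it assumes PyWavelets and the fuse_hl helper sketched above, and that the host and secret images have equal size.

```python
import numpy as np
import pywt

def stego_adversarial_channel(x1, x2, alpha=0.1, beta=1.0):
    """Build one channel of the adversarial example from host x1, secret x2."""
    x1_LL, (x1_LH, x1_HL, x1_HH) = pywt.dwt2(x1, 'haar')
    x2_LL, (x2_LH, x2_HL, x2_HH) = pywt.dwt2(x2, 'haar')

    # Eq. (4): fuse the singular values of the two LL sub-bands while
    # keeping the host's U, V factors, so the host's low-frequency
    # structure dominates the result.
    U1, s1, V1t = np.linalg.svd(x1_LL, full_matrices=False)
    _, s2, _ = np.linalg.svd(x2_LL, full_matrices=False)
    y1_LL = (U1 * ((1 - alpha) * s1 + alpha * s2)) @ V1t

    # Eq. (5): fused HL sub-band (fuse_hl as sketched earlier).
    y2_HL = fuse_hl(x1_HL, x2_HL, beta)

    # LH comes from the host, HH directly from the secret image.
    return pywt.idwt2((y1_LL, (x1_LH, y2_HL, x2_HH)), 'haar')
```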
5. Experiments and results

To validate our steganographic universal adversarial attack, we perform extensive experiments with ImageNet 2012 [32]. We select four popular DNNs, i.e. ResNet-50 [3], Inception-V3 [14], VGG-16 [15] and MobileNet-V2 [16], and fool them on the ImageNet validation set. Our choice of the networks is based on their versatility and established performance. Recall that our technique performs non-targeted black-box attacks. As such, no training data is required for our attack, except the secret image. This allows us to use all the images in the validation set of ImageNet as our test data (except the secret image). This is in contrast to the existing methods for computing universal adversarial perturbations, e.g. [13], that consume a large number of images from the dataset for training, and report results only on a smaller subset of the ImageNet validation set.
5.1. Secret image selection

Our technique embeds a secret image in the host image to create adversarial examples. This gives us full freedom to use any image as the secret image. However, it was observed in our experiments that a secret image with more edges is preferable under our scheme. To identify more desirable secret images, we can take advantage of conventional filtering techniques. In this work, we use the Sobel filter to extract edge information from the images to decide their suitability as the secret image. The filter operators are given in Eq. (6):

    G_x = [−1 0 1; −2 0 2; −1 0 1] ⊛ x2  and  G_y = [−1 −2 −1; 0 0 0; 1 2 1] ⊛ x2,    (6)

where ⊛ denotes convolution (rows of each 3×3 kernel are separated by semicolons), and G_x and G_y are the feature maps preserving the vertical and horizontal edge information, respectively.
In order to choose the secret image, we use an empirical threshold on the Edge Pixels (EP) identified by the filter. The EP value counts the number of pixels in the edges extracted by the filter.
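A sketch of this criterion (ours; the paper does not specify its binarization threshold, so the one below is an assumption):

```python
import numpy as np
from scipy.signal import convolve2d

# Sobel kernels of Eq. (6).
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def edge_pixels(gray, edge_thresh=128):
    """Count the Edge Pixels (EP) of a grayscale secret-image candidate."""
    gx = convolve2d(gray, KX, mode='same', boundary='symm')
    gy = convolve2d(gray, KY, mode='same', boundary='symm')
    magnitude = np.hypot(gx, gy)  # gradient magnitude per pixel
    return int(np.sum(magnitude > edge_thresh))  # the EP value
```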
We illustrate the five secret images used in our experiments in Fig. 4. The top row of the figure shows the secret images with their EP values. The bottom row shows the resulting adversarial examples. We have intentionally chosen an original/clean image with a relatively plain background to clearly illustrate the perceptibility of the perturbations. As can be seen, based on the edge information in the secret images, the EP values decrease from Fig. 4 (a) to (e). This also results in reduced perturbation perceptibility in the adversarial examples. However, secret images with smaller EP values also result in adversarial examples that have lower fooling ratios.
In Table 1, we summarize the fooling rates resulting on the ImageNet validation set using the five secret images shown in Fig. 4. We emphasize that the results are on 49,999 ImageNet samples. It can be seen that with the images having EP values around 15K (Fig. 4), the fooling ratios for the state-of-the-art ImageNet models are significant, reaching up to 89% fooling for MobileNet.
Fig. 4. Adversarial examples formed by different secret images. The top row shows the secret images along with their Edge Pixels (EP) values. The bottom row presents the adversarial examples generated using the corresponding secret images. With a larger number of edges in a secret image, the fooling rate increases (see Table 1); however, it also makes the perturbation quasi-imperceptible.
Table 1
Fooling rates (%) using the five secret images illustrated in Fig. 4. The rates are computed for 49,999 images of the ImageNet validation set. We use the hyper-parameter values α = 0.1, β = 1.

Secret image   ResNet-50   VGG-16   Inception-V3   MobileNet-V2
Image 1        84.77       87.19    79.19          89.13
Image 2        82.11       84.96    74.51          86.61
Image 3        71.76       73.88    64.59          77.52
Image 4        42.72       38.65    41.25          48.26
Image 5        42.61       38.45    41.78          48.74
Table 2
Ablation analysis with different steganographic modifications applied to the clean image. Image 3 in Fig. 4 is used as the secret image.

Modification      ResNet-50   VGG-16   Inception-V3   MobileNet-V2
Image embedding   29.94       23.69    28.91          34.51
HH component      30.96       23.70    30.19          36.91
HL component      60.80       61.73    54.25          66.74
All (hybrid)      71.76       73.88    64.59          77.51
Fig. 5. ResNet-50 fooling rate (%) for secret image 4 using different values of α and β. The x-axis values for α are in the range [0, 0.1]; 10× scaling is used for better readability.
These results are especially intriguing because the fooling is achieved using a non-optimization based method. Moreover, the attack is conducted in a true black-box setup, where we have not assumed any information about the target model. A single secret image is able to form adversarial examples that generalize well across networks having varied architectures.
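For context, an evaluation of this kind can be reproduced with off-the-shelf pretrained classifiers; the sketch below (ours, with torchvision weights and standard ImageNet preprocessing as assumptions, and hypothetical file names) counts an image as fooled when its label changes after embedding.

```python
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

net = models.resnet50(pretrained=True).eval()

def predicted_label(path):
    x = preprocess(Image.open(path).convert('RGB')).unsqueeze(0)
    with torch.no_grad():
        return net(x).argmax(dim=1).item()

# 'clean.png' / 'adv.png' are placeholders for a validation image and
# its stego counterpart produced as in Section 4.
fooled = predicted_label('adv.png') != predicted_label('clean.png')
```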
5.2. Discussion

As an ablation study of the overall technique, we modify the method of embedding the secret image in the clean image and analyse the results. As the first instance, we simply embed Image 3 (as shown in Fig. 4) using steganography. The fooling ratios for the four networks are reported against the 'Image embedding' modification in Table 2. In the second experiment, we additionally modify the HH-component of the DWT using the procedure discussed in Section 4. The results of these experiments are reported against the 'HH component' modification in the table. Similarly, the last two rows of the table report the fooling ratios when the additional HL-component modification is performed, and when all the proposed modifications (in Section 4) are performed. It is clear from the table that each of our proposed modifications to the individual components of the DWT adds to the eventual fooling ratio achieved by our technique. Note that we deliberately use Image 3 instead of Image 2 (Fig. 4) in this table to emphasize that this trend remains generic for different secret images used under the proposed technique.
Recall that the proposed approach requires two hyper-parameters that govern the fooling ratio and the perceptibility of the perturbations in the adversarial examples. The first hyper-parameter is α, which decides the strength with which the secret image is embedded in the host image, while the other hyper-parameter is β, which determines the strength of embedding the vertical edges in the host image. The fooling ratios of the ResNet-50 network with different values of α and β are summarized in Fig. 5. As the value of α increases from 0.01 to 0.1, the strength of embedding the secret image in the host images increases and hence the fooling ratio increases. Varying β from 0.1 to 1 in Eq. (5), multiple y_HL components are generated. These HL components are then used to generate adversarial examples, and their fooling prowess is analyzed. Fig. 5 clearly shows that as the value of β increases, the fooling ratio also increases. With an increase in the values of α and β, the fooling ratio increases until a point (α = 0.1 and β = 1) where we get the maximum fooling ratio with imperceptibility. On further increasing these values, the fooling ratio increases further, albeit slightly. Nevertheless, it also makes the perturbations perceptible to the human visual system. The plot is shown for Image 4 in Fig. 4.
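The trends in Fig. 5 can be explored with a simple grid sweep; the sketch below (ours) assumes the Eq. (2) estimator and the embedding function sketched in earlier sections, passed in as callables.

```python
import numpy as np

def sweep(model, images, secret, embed_fn, fooling_ratio):
    """Grid-sweep alpha and beta and record the fooling ratio psi.

    embed_fn(x, secret, alpha, beta) builds the stego image (Section 4
    sketch); fooling_ratio is the Eq. (2) estimator (Section 3 sketch).
    """
    results = {}
    for alpha in np.round(np.linspace(0.01, 0.1, 10), 2):
        for beta in np.round(np.linspace(0.1, 1.0, 10), 1):
            embed = lambda x, a=alpha, b=beta: embed_fn(x, secret, a, b)
            results[(alpha, beta)] = fooling_ratio(model, images, embed)
    return results  # per Fig. 5, psi peaks around alpha=0.1, beta=1
```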
Fig. 6. Random adversarial examples with perturbation norms (ℓ2) close to the mean perturbation norms reported in Table 3. The perturbations remain largely imperceptible to the human visual system.
Table 3
ℓ2-norm of the perturbations for different secret images.

Secret image   ℓ2-norm (min)   ℓ2-norm (max)   ℓ2-norm (mean)
Image 1        11.49           103.96          48.85
Image 2        14.15           103.66          49.42
Image 3        8.56            98.33           34.94
Image 4        4.46            96.74           29.75
Image 5        3.50            95.71           28.13
5.3. Perturbation perceptibility

Adversarial examples are considered more effective when the underlying perturbations to the image remain imperceptible to humans. Since we do not modify the image in the pixel domain, our method naturally results in hard to perceive perturbations. Nevertheless, the resulting adversarial images do differ from the original image in terms of, e.g., brightness and sharpness. To quantify the differences, we summarize the ℓ2-norm of the difference between the original and adversarial images in Table 3. The reported values are for all five secret images shown in Fig. 4. The values are given for 8-bit images with range [0–255]. As can be seen, as the edges in the secret images increase, the difference between the original and adversarial images increases. It should be noted that we report the ℓ2-norm of the difference following the existing conventions. Since our technique essentially modifies the image in the frequency domain, even significant ℓ2-norm perturbations still remain largely imperceptible to humans under our technique. We provide example visualizations to corroborate this claim in Fig. 6, where random examples having perturbation norms close to the mean values of the norms in Table 3 are shown.
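The reported quantity is simply the Euclidean norm of the pixel-wise difference; a one-function sketch (ours):

```python
import numpy as np

def perturbation_l2(clean, adv):
    """l2-norm of the difference between 8-bit clean/adversarial images."""
    diff = adv.astype(np.float64) - clean.astype(np.float64)
    return np.linalg.norm(diff)
```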
6. Conclusion

We developed an adversarial attack on deep learning inspired by steganography. The proposed attack embeds a secret image inside any host image to fool any network on the resulting adversarial example. This doubly-universal attack achieves high fooling rates (~80%) on a variety of state-of-the-art networks under true black-box settings. To perform the attack, we mixed the low frequency information of the secret and the host image, while replacing the high frequency information of the host image with that of the secret image. It is observed that secret images with a larger number of edges are more suitable for the proposed attack. Depending upon the secret image, the resulting adversarial perturbations remain imperceptible to quasi-imperceptible, while maintaining good fooling rates across the networks. The proposed non-optimization attack is performed holistically on the images by computing their discrete wavelet transforms and singular value decompositions.
Declaration of Competing Interest

None.
References

[1] G. Hinton, L. Deng, D. Yu, G. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, B. Kingsbury, et al., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Process. Mag. 29 (2012) 82–97.
[2] I. Sutskever, O. Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, in: Advances in Neural Information Processing Systems, 2014, pp. 3104–3112.
[3] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[4] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[5] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
[6] D. Shen, G. Wu, H.-I. Suk, Deep learning in medical image analysis, Annu. Rev. Biomed. Eng. 19 (2017) 221–248.
[7] M.M. Najafabadi, F. Villanustre, T.M. Khoshgoftaar, N. Seliya, R. Wald, E. Muharemagic, Deep learning applications and challenges in big data analytics, J. Big Data 2 (1) (2015) 1.
[8] S. Ramos, S. Gehrig, P. Pinggera, U. Franke, C. Rother, Detecting unexpected obstacles for self-driving cars: fusing deep learning and geometric modeling, in: Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), IEEE, Los Angeles, CA, 2017, pp. 1025–1032.
[9] N. Akhtar, A. Mian, Threat of adversarial attacks on deep learning in computer vision: a survey, IEEE Access 6 (2018) 14410–14430.
[10] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z.B. Celik, A. Swami, The limitations of deep learning in adversarial settings, in: Proceedings of the 2016 IEEE European Symposium on Security and Privacy, IEEE, Saarbrucken, Germany, 2016, pp. 372–387.
[11] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2013. arXiv preprint arXiv:1312.6199.
[12] J. Su, D.V. Vargas, K. Sakurai, One pixel attack for fooling deep neural networks, IEEE Trans. Evol. Comput. 23 (5) (2019) 828–841.
[13] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, P. Frossard, Universal adversarial perturbations, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1765–1773.
[14] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
[15] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014. arXiv preprint arXiv:1409.1556.
[16] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, Mobilenetv2: inverted residuals and linear bottlenecks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520.
[17] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, 2014. arXiv preprint arXiv:1412.6572.
[18] A. Kurakin, I. Goodfellow, S. Bengio, Adversarial examples in the physical world, 2016. arXiv preprint arXiv:1607.02533.
[19] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, J. Li, Boosting adversarial attacks with momentum, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9185–9193.
[20] T.B. Brown, D. Mané, A. Roy, M. Abadi, J. Gilmer, Adversarial patch, 2017. arXiv preprint arXiv:1712.09665.
[21] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), IEEE, San Jose, CA, 2017, pp. 39–57.
[22] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: 2016 IEEE Symposium on Security and Privacy (SP), IEEE, San Jose, CA, 2016, pp. 582–597.
[23] S. Das, P.N. Suganthan, Differential evolution: a survey of the state-of-the-art, IEEE Trans. Evol. Comput. 15 (1) (2011) 4–31.
[24] S.-M. Moosavi-Dezfooli, A. Fawzi, P. Frossard, Deepfool: a simple and accurate method to fool deep neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574–2582.
[25] V. Khrulkov, I. Oseledets, Art of singular vectors and universal adversarial perturbations, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8562–8570.
[26] K.R. Mopuri, U. Garg, R.V. Babu, Fast feature fool: a data independent approach to universal adversarial perturbations, 2017. arXiv preprint arXiv:1707.05572.
[27] P.-Y. Chen, H.-J. Lin, et al., A DWT based approach for image steganography, International Journal of Applied Science and Engineering 4 (3) (2006) 275–290.
[28] C.-C. Lai, C.-C. Tsai, Digital image watermarking using discrete wavelet transform and singular value decomposition, IEEE Trans. Instrum. Meas. 59 (11) (2010) 3060–3063.
[29] O. Poursaeed, I. Katsman, B. Gao, S. Belongie, Generative adversarial perturbations, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4422–4431.
[30] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis & Machine Intelligence (7) (1989) 674–693.
[31] I. Daubechies, Ten Lectures on Wavelets, 61, SIAM, 1992.
[32] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al., Imagenet large scale visual recognition challenge, Int. J. Comput. Vis. 115 (3) (2015) 211–252.