Stealing Machine Learning Models via Prediction APIs
Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, Thomas Ristenpart
Usenix Security Symposium, Austin, Texas, USA, August 11th, 2016
Machine Learning (ML) Systems

(1) Gather labeled data
x(1), y(1); x(2), y(2); …
Dependent variable y; n-dimensional feature vector x
Data: Bob, Tim, Jake
"The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the right of it. If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds to one of the other faces, select “Not Present”."

Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image above the other images, while here we show them configured horizontally to save space.
[Figure: three bar charts (bars: Softmax, MLP, DAE; x-axis: overall, identified, excluded; y-axis: % correct). (a) Average over all responses. (b) Correct by majority vote of responses. (c) Accuracy with high-performing workers.]

Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images contained the individual corresponding to the label used in the attack. As a control, 10% of the instances used a plain image from the data set rather than one produced by MI-FACE. This allowed us to gauge the baseline ability of the workers at matching faces from the training set. In all cases, the images not corresponding to the attack label were selected at random from the training set. Workers were paid $0.08 for each task that they completed, and given a $0.05 bonus if they answered the question correctly; workers were generally able to provide a response in less than 40 seconds. They were allowed to complete at most three tasks for a given experiment. As a safeguard against random or careless responses, we only allowed workers who have completed at least 1,000 jobs on Mechanical Turk and achieved at least a 95% approval rating to complete the task.
algorithm   time (s)   epochs
Softmax     1.4        5.6
MLP         1298.7     3096.3
DAE         692.5      4728.5

Fig. 12. Attack performance.
1) Performance: We ran the attack for each model on an 8-core Xeon machine with 16G memory. The results are shown in Figure 12. Reconstructing faces out of the softmax model is very efficient, taking only 1.4 seconds on average and requiring 5.6 epochs of gradient descent. MLP takes substantially longer, requiring about 21 minutes to complete and on the order of 3000 epochs of gradient descent. DAE requires less time (about 11 minutes) but a greater number of epochs. This is due to the fact that the search takes place in the latent feature space of the first autoencoder layer. Because this has fewer units than the visible layer of our MLP architecture, each epoch takes less time to complete.

[Figure: reconstructions labeled Target, Softmax, MLP, DAE.]
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, and DAE.
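The reconstruction described above performs gradient ascent in input space to maximize the model's score for a target class. A minimal numpy sketch of that idea on a toy softmax model follows; the weights, dimensions, and hyperparameters here are illustrative stand-ins, not the paper's face-recognition setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_classes = 8, 3
W = rng.normal(size=(n_classes, n_features))  # toy softmax model weights
b = np.zeros(n_classes)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def class_prob(x, target):
    return softmax(W @ x + b)[target]

def invert(target, lr=0.1, epochs=200):
    """Gradient ascent on log P(target | x) with respect to the input x."""
    x = np.zeros(n_features)
    for _ in range(epochs):
        p = softmax(W @ x + b)
        onehot = np.eye(n_classes)[target]
        grad = W.T @ (onehot - p)   # gradient of target log-probability
        x += lr * grad
        x = np.clip(x, -1.0, 1.0)   # keep the input in a valid range
    return x

x_rec = invert(target=1)
```

After enough epochs, `x_rec` is an input the model classifies as the target class with high confidence; for an image model, rendering `x_rec` yields the reconstructed face.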
2) Accuracy results: The main accuracy results are shown in Figure 11. In this figure, overall refers to all correct responses, i.e., the worker selected the image corresponding to the individual targeted in the attack when present, and otherwise selected “Not Present”. Identified refers to instances where the targeted individual was displayed among the test images, and the worker identified the correct image. Excluded refers to instances where the targeted individual was not displayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses, whereas 11b only counts an instance as correct when a majority (at least two out of three) users responded correctly. In both cases, Softmax produced the best reconstructions, yielding 75% overall accuracy and up to an 87% identification
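The majority-vote scoring used for Figure 11b can be sketched as follows; the helper name and the toy responses are hypothetical, chosen only to illustrate the at-least-two-out-of-three rule.

```python
from collections import Counter

def majority_correct(responses, truth):
    """An instance counts as correct only if at least two of the
    three workers who saw it gave the correct answer."""
    top, count = Counter(responses).most_common(1)[0]
    return count >= 2 and top == truth

# Toy instances: (three worker answers, ground truth).
instances = [
    (["Bob", "Bob", "Not Present"], "Bob"),          # 2/3 correct: counted
    (["Tim", "Not Present", "Jake"], "Not Present"), # no majority: not counted
]
accuracy = sum(majority_correct(r, t) for r, t in instances) / len(instances)
```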
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%co
rrect
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%co
rrect
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%correct
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%correct
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%correct
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%co
rrect
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%co
rrect
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses,whereas 11b only counts an instance as correct when amajority (at least two out of three) users responded correctly.In both cases, Softmax produced the best reconstructions,yielding 75% overall accuracy and up to an 87% identifi-
12
The image on the left is a face that was altered by computer processing. It may or may not correspond to one of the faces displayed to the
right of it.
If you believe that it does correspond to one of the other faces, please select the corresponding image. If you do not believe that it corresponds
to one of the other faces, select “Not Present”.
Altered Image
Fig. 10. Task shown to Mechanical Turk workers for reconstruction attack evaluation. The actual tasks shown to workers rendered the “altered” image abovethe other images, while here we show them configured horizontally to save space.
Softmax MLP DAE
overall identified excluded
20
40
60
80
100
%co
rrect
(a) Average over all responses.
overall identified excluded
20
40
60
80
100
(b) Correct by majority vote of responses.
overall identified excluded
20
40
60
80
100
(c) Accuracy with high-performing workers.
Fig. 11. Reconstruction attack results.
In 80% of the experiments, one of the five images containedthe individual corresponding to the label used in the attack.As a control, 10% of the instances used a plain image fromthe data set rather than one produced by MI-FACE. Thisallowed us to gauge the baseline ability of the workers atmatching faces from the training set. In all cases, the imagesnot corresponding to the attack label were selected at randomfrom the training set. Workers were paid $0.08 for each taskthat they completed, and given a $0.05 bonus if they answeredthe question correctly, and workers were generally able toprovide a response in less than 40 seconds. They were allowedto complete at most three tasks for a given experiment. Asa safeguard against random or careless responses, we onlyallowed workers who have completed at least 1,000 jobs onMechanical Turk and achieved at least a 95% approval rating,to complete the task.
algorithm time (s) epochs
Softmax 1.4 5.6MLP 1298.7 3096.3DAE 692.5 4728.5
Fig. 12. Attack performance.
1) Performance: Weran the attack for eachmodel on an 8-coreXeon machine with 16Gmemory. The results areshown in Figure 12.Reconstructing faces out of the softmax model is veryefficient, taking only 1.4 seconds on average and requiring5.6 epochs of gradient descent. MLP takes substantiallylonger, requiring about 21 minutes to complete and on theorder of 3000 epochs of gradient descent. DAE requires lesstime (about 11 minutes) but a greater number of epochs. This
Target Softmax MLP DAE
Fig. 13. Reconstruction of the individual on the left by Softmax, MLP, andDAE.
is due to the fact that the search takes place in the latentfeature space of the first autoencoder layer. Because this hasfewer units than the visible layer of our MLP architecture,each epoch takes less time to complete.
2) Accuracy results: The main accuracy results are shownin Figure 11. In this figure, overall refers to all correctresponses, i.e., the worker selected the image correspondingto the individual targeted in the attack when present, andotherwise selected “Not Present”. Identified refers to instanceswhere the targeted individual was displayed among the testimages, and the worker identified the correct image. Excludedreferes to instances where the targeted individual was notdisplayed, and the worker correctly responded “Not Present”.
Figure 11a gives results averaged over all responses, whereas 11b only counts an instance as correct when a majority (at least two out of three) users responded correctly. In both cases, Softmax produced the best reconstructions, yielding 75% overall accuracy and up to an 87% identifi-
(2) Train ML model f from data: f(x) = y
(3) Use f in some application or publish it for others to use

[Diagram: Training produces Model f; at Prediction time, a face image x yields y ∈ {Bob, Tim, Jake} with a Confidence score, consumed by an Application]
Machine Learning as a Service (MLaaS)
3
$$$ per query

[Diagram: a client sends an input to the Prediction API of Model f (black-box classification); the service owner feeds Data to a Training API]

Goal 1: Rich Prediction APIs
• Highly Available
• High-Precision Results

Goal 2: Model Confidentiality
• Model / Data Monetization
• Sensitive Data
Machine Learning as a Service (MLaaS)
4
Service      | Model types
Amazon       | Logistic regressions
Google       | ??? (announced: logistic regressions, decision trees, neural networks, SVMs)
Microsoft    | Logistic regressions, decision trees, neural networks, SVMs
PredictionIO | Logistic regressions, decision trees, SVMs (white-box)
BigML        | Logistic regressions, decision trees

Sell Datasets, Models, and Prediction Queries to other users ($$$)
Goal: Adversarial client learns close approximation of f using as few queries as possible

Applications:
1) Undermine pay-for-prediction pricing model
2) Facilitate privacy attacks
3) Stepping stone to model evasion [Lowd, Meek 2005] [Srndic, Laskov 2014]
Model Extraction Attacks
5
[Diagram: the Attack client queries Model f with data x, observes f(x), and builds f']

Target: f'(x) = f(x) on ≥ 99.9% of inputs
Goal: Adversarial client learns close approximation of f using as few queries as possible
Model Extraction Attacks (Prior Work)
6
If f(x) is just a class label: learning with membership queries
- Boolean decision trees [Kushilevitz, Mansour 1993]
- Linear models (e.g., binary regression) [Lowd, Meek 2005]
[Diagram: the Attack client queries Model f with data x, observes f(x), and builds f']
Isn’t this “just Machine Learning”? No! Prediction APIs return more information than assumed in prior work and “traditional” ML.
Main Results
7
[Diagram: the Attack client queries Model f with data x, observes f(x), and builds f']

• Logistic Regressions, Neural Networks, Decision Trees, SVMs
• Reverse-engineer model type & features

f'(x) = f(x) on 100% of inputs, using 100s to 1,000s of online queries

Inversion Attack: recover x from f'(x)
Improved Model-Inversion Attacks [Fredrikson et al. 2015]
Model Extraction Example: Logistic Regression
Task: Facial Recognition of two people (binary classification)
8
Model f

[Diagram: face images of Bob and Alice as training Data]

Feature vectors are pixel data, e.g., n = 92 × 112 = 10,304

f(x) = 1 / (1 + e^−(w·x + b))

f maps features to the predicted probability of being “Alice”:
≤ 0.5 ⇒ classify as “Bob”; > 0.5 ⇒ classify as “Alice”

n+1 parameters w, b chosen using the training set to minimize expected error

Generalize to c > 2 classes with multinomial logistic regression: f(x) = [p1, p2, …, pc]; predict label as argmax_i p_i
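As a minimal sketch of the binary model defined above (toy parameter values; a real face model would use the n = 10,304 pixel features):

```python
import numpy as np

def f(x, w, b):
    """Logistic regression: predicted probability that x is "Alice"."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def classify(x, w, b):
    """Threshold the probability at 0.5, as on the slide."""
    return "Alice" if f(x, w, b) > 0.5 else "Bob"

# Toy parameters with n = 3 features instead of 10,304 pixels.
w = np.array([0.5, -1.0, 0.25])
b = 0.1
x = np.array([1.0, 0.2, 0.3])
print(f(x, w, b), classify(x, w, b))  # probability ≈ 0.62 -> "Alice"
```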
Model Extraction Example: Logistic Regression

Goal: Adversarial client learns close approximation of f using as few queries as possible
9
Attack

f(x) = 1 / (1 + e^−(w·x + b))  ⇒  ln( f(x) / (1 − f(x)) ) = w·x + b

A linear equation in the n+1 unknowns w, b
Model f

[Diagram: face images of Bob and Alice as training Data]

Query n+1 random points ⇒ solve a linear system of n+1 equations

f'(x) = f(x) on 100% of inputs
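The attack can be sketched end-to-end in a few lines (a simulated victim model stands in for the prediction API; sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # number of features (10,304 for the face model above)

# The victim's secret logistic regression, hidden behind a prediction API
# that returns the confidence score f(x).
w_true, b_true = rng.normal(size=n), 0.5
def api(X):
    return 1.0 / (1.0 + np.exp(-(X @ w_true + b_true)))

# Attack: each queried confidence value yields one linear equation
#   ln(f(x) / (1 - f(x))) = w·x + b
# so n+1 random queries give a solvable system in the n+1 unknowns (w, b).
X = rng.normal(size=(n + 1, n))
y = api(X)
A = np.hstack([X, np.ones((n + 1, 1))])  # last column multiplies b
wb = np.linalg.solve(A, np.log(y / (1 - y)))
w_ext, b_ext = wb[:-1], wb[-1]
```

Because the API returns exact confidence scores, the system is noiseless and recovery is exact (up to floating point), matching the 100%-agreement claim on the slide.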
Generic Equation-Solving Attacks
10
[Diagram: attacker sends random inputs X to the MLaaS Service and receives outputs Y: confidence values [f1(x), f2(x), …, fc(x)] ∈ [0, 1]^c; Model f has k parameters W]

• Solve the non-linear equation system in the weights W
  - Optimization problem + gradient descent
  - “Noiseless Machine Learning”
• Multinomial Regressions & Deep Neural Networks:
  - > 99.9% agreement between f and f'
  - ≈ 1 query per model parameter of f
  - 100s to 1,000s of queries / seconds to minutes
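A minimal sketch of the optimization variant, assuming a multinomial (softmax) regression victim: queried confidence vectors become exact regression targets, and plain gradient descent on cross-entropy fits an equivalent model (sizes, seed, and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, c = 5, 3  # features, classes

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# Victim: a secret multinomial regression with k = (n+1)*c parameters.
W_true, b_true = rng.normal(size=(n, c)), rng.normal(size=c)
api = lambda X: softmax(X @ W_true + b_true)  # returns confidence vectors

# Attack: query random inputs, then minimize cross-entropy between the
# surrogate's outputs and the queried confidences ("noiseless ML": the
# targets are exact model outputs, not noisy labels).
X = rng.normal(size=(200, n))
Y = api(X)
W, b = np.zeros((n, c)), np.zeros(c)
for _ in range(5000):
    G = (softmax(X @ W + b) - Y) / len(X)  # gradient w.r.t. the logits
    W -= 1.0 * (X.T @ G)
    b -= 1.0 * G.sum(axis=0)

# Prediction agreement between victim and surrogate on the queried inputs.
agree = (softmax(X @ W + b).argmax(1) == Y.argmax(1)).mean()
```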
MLaaS: A Closer Look
11
[Diagram: a client sends x to the Prediction API and receives f(x); Data is uploaded to the Training API]

Prediction API:
- Class labels and confidence scores
- Support for partial inputs

Training API:
- ML Model Type Selection: logistic or linear regression
- Feature Extraction (automated and partially documented)
Online Attack: AWS Machine Learning
12
[Diagram: attacker sends an input, receives a prediction]

Model              | Online Queries | Time (s) | Price ($)
Handwritten Digits |            650 |       70 |      0.07
Adult Census       |          1,485 |      149 |      0.15

Extracted model f' agrees with f on 100% of tested inputs

Model Choice: Logistic Regression (found by “extract-and-test”)
Feature Extraction: Quantile Binning + One-Hot Encoding (reverse-engineered with partial queries and confidence scores)
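AWS's preprocessing pipeline is only partially documented; the following is a hedged sketch of what quantile binning followed by one-hot encoding does to one numeric feature (the function name and bin count are illustrative, not AWS's API):

```python
import numpy as np

def quantile_bin_one_hot(values, n_bins):
    """Map a numeric feature to one-hot indicator columns over quantile
    bins (each bin receives roughly the same number of training values)."""
    values = np.asarray(values, dtype=float)
    # Interior bin edges at the empirical 1/n_bins, 2/n_bins, ... quantiles.
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, values, side="right")
    one_hot = np.zeros((len(values), n_bins))
    one_hot[np.arange(len(values)), bins] = 1.0
    return one_hot

ages = [22, 25, 31, 40, 58, 63]
enc = quantile_bin_one_hot(ages, 3)
print(enc.sum(axis=0))  # each of the 3 bins holds 2 of the 6 values
```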
Application: Model-Inversion Attacks. Infer training data from trained models [Fredrikson et al. 2015]

Training samples of 40 individuals: Data → Multinomial LR Model f
13
Strategy                                | Attack against 1 individual       | Attack against all 40 individuals
                                        | Online Queries | Attack Time      | Online Queries | Attack Time
Black-Box Inversion [Fredrikson et al.] |         20,600 | 24 min           |        800,000 | 16 hours
Extract-and-Invert (our work)           |         41,000 | 10 hours         |         41,000 | 10 hours
Attack recovers image of one individual

[Diagram: the Extraction Attack queries f(x) to build f' with f'(x) = f(x) for > 99.9% of inputs; the Inversion Attack then recovers x from f'(x) white-box, so it can be run locally for all 40 individuals (×40) at the cost of one extraction (×1)]
Extracting a Decision Tree
14
Kushilevitz-Mansour (1992)
• Poly-time algorithm with membership queries only
• Only for Boolean trees, impractical complexity
(Ab)using Confidence Values
• Assumption: all tree leaves have unique confidence values
• Reconstruct tree decisions with “differential testing”
• Online attacks on BigML

Confidence value derived from class distribution in the training set

Inputs x and x' differ in a single feature and reach leaves with confidence values v and v':
Different leaves are reached ⇔ the tree “splits” on this feature
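The differential-testing idea can be sketched as follows (the secret tree and its leaf confidences are invented for illustration; the full attack also recovers split thresholds and recurses over the tree structure):

```python
# Stand-in for the prediction API of a secret decision tree over 3 binary
# features. It splits only on features 0 and 2, and (key assumption) every
# leaf carries a distinct confidence value, so each response identifies
# the leaf that was reached.
def api(x):
    if x[0] == 0:
        return 0.91 if x[2] == 0 else 0.72
    else:
        return 0.55 if x[2] == 0 else 0.13

def split_features(x, n_features):
    """Differential testing: flip one feature at a time; if the confidence
    (i.e., the leaf) changes, the tree splits on that feature."""
    found = []
    for i in range(n_features):
        x_flipped = list(x)
        x_flipped[i] = 1 - x_flipped[i]
        if api(x_flipped) != api(x):
            found.append(i)
    return found

print(split_features([0, 0, 0], 3))  # -> [0, 2]: feature 1 is never tested
```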
Countermeasures: How to prevent extraction?

API Minimization
• Prediction = class label only
• Attacker is back to Learning with Membership Queries

But: Attack on Linear Classifiers [Lowd, Meek 2005]

[Diagram: a linear decision boundary separating “+” and “−” points; the API returns only the prediction f(x) = y, without a Confidence score]
n+1 parameters w, b
f(x) = sign(w·x + b): classify as “+” if w·x + b > 0 and “−” otherwise

1. Find points on the decision boundary (w·x + b = 0)
   - Find a “+” and a “−”
   - Line search between the two points
2. Reconstruct w and b (up to a scaling factor)
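The two steps can be sketched against a simulated label-only API (dimensions, seed, and query counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Victim: f(x) = sign(w·x + b), behind an API that returns the label only.
w_true, b_true = rng.normal(size=n), 0.3
sign_oracle = lambda x: 1 if x @ w_true + b_true > 0 else -1

def boundary_point(x_pos, x_neg, steps=60):
    """Step 1: line (binary) search between a '+' and a '-' point for a
    point on the decision boundary w·x + b = 0."""
    lo, hi = 0.0, 1.0  # x_neg at t=0 is '-', x_pos at t=1 is '+'
    for _ in range(steps):
        mid = (lo + hi) / 2
        if sign_oracle(x_neg + mid * (x_pos - x_neg)) > 0:
            hi = mid
        else:
            lo = mid
    return x_neg + hi * (x_pos - x_neg)

samples = rng.normal(size=(100, n))
pos = next(x for x in samples if sign_oracle(x) > 0)
negs = [x for x in samples if sign_oracle(x) < 0][:n]

# Step 2: each boundary point z satisfies w·z + b = 0. Fixing the unknown
# scale as b = 1 (valid whenever b != 0) leaves n equations w·z = -1.
Z = np.array([boundary_point(pos, neg) for neg in negs])
w_ext = np.linalg.solve(Z, -np.ones(n))
# w_ext recovers w_true / b_true, i.e., (w, b) up to the scaling factor.
```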
Generic Model Retraining Attacks
16
• Extend the Lowd-Meek approach to non-linear models
• Active Learning:
  - Query points close to the “decision boundary”
  - Update f' to fit these points
• Multinomial Regressions, Neural Networks, SVMs:
  - > 99% agreement between f and f'
  - ≈ 100 queries per model parameter of f
  - ≈ 100× less efficient than equation-solving

[Diagram: query more points near the decision boundary]
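A sketch of a retraining attack in its simplest setting, a linear victim queried for labels only (the attacks above extend this to multinomial regressions, neural networks, and SVMs; all sizes, rates, and round counts here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# Victim: a linear classifier behind an API that returns class labels only.
w_true, b_true = rng.normal(size=n), 0.2
api = lambda X: (X @ w_true + b_true > 0).astype(float)

def fit(X, y, iters=2000, lr=0.5):
    """Refit the surrogate f' (a logistic regression) to all queried labels."""
    w, b = np.zeros(n), 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-np.clip(X @ w + b, -30, 30)))
        g = (p - y) / len(X)
        w -= lr * (X.T @ g)
        b -= lr * g.sum()
    return w, b

# Round 0: fit f' on uniformly random queries.
X = rng.normal(size=(50, n)); y = api(X)
w, b = fit(X, y)

# Active learning: repeatedly query the candidates closest to f''s current
# decision boundary, then refit on everything seen so far.
for _ in range(5):
    cand = rng.normal(size=(500, n))
    near = cand[np.argsort(np.abs(cand @ w + b))[:20]]
    X, y = np.vstack([X, near]), np.append(y, api(near))
    w, b = fit(X, y)

# Agreement between f and the extracted f' on fresh inputs.
test = rng.normal(size=(5000, n))
agree = ((test @ w + b > 0) == api(test).astype(bool)).mean()
```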
Conclusion
17
Rich prediction APIs vs. model & data confidentiality
Efficient Model-Extraction Attacks
• Logistic Regressions, Neural Networks, Decision Trees, SVMs
• Reverse-engineering of model type, feature extractors
• Active learning attacks in membership-query setting

Applications
• Sidestep model monetization
• Boost other attacks: privacy breaches, model evasion

Thanks! Find out more: https://github.com/ftramer/Steal-ML