Page 1:

Confidence Metrics for Classification by Deep Neural Networks

March 15, 2019

Adam Oberman with Chris Finlay

Math and Stats, McGill

Page 2:

Challenges for deep learning

“It is not clear that the existing AI paradigm is immediately amenable to any sort of software engineering validation and verification. This is a serious issue, and is a potential roadblock to DoD’s use of these modern AI systems, especially when considering the liability and accountability of using AI.”

JASON report

Page 3:

Fact: the output “probabilities” of neural networks for image classification are not the probabilities that the classification is correct.

Misinterpretation: the output probabilities are not meaningful predictors of classification error.

The fact is correct: unlike other classifiers, e.g. Naive Bayes, the output of the network has no interpretation as a probability.

However, we can still extract useful information from the output, combined with the statistics of the loss on the test set.

Page 4:

Fact: if a neural network generalizes well, and gives correct classifications 95% of the time (say), then we can estimate the probability that a prediction is correct based on p_max.

How? Suppose, for the sake of argument, that we are given, along with a prediction, the value of the loss (but not the correct label).
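As a minimal sketch (our own illustration, not the authors' code), p_max is simply the largest softmax output of the network, computed here from raw logits:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max logit first.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def p_max(logits):
    # The largest output "probability": not the true probability that
    # the classification is correct, but a usable confidence signal.
    return max(softmax(logits))

print(p_max([2.0, 1.0, 0.1]))  # roughly 0.66: a fairly confident prediction
```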

Page 5:

Then we would have an imperfect, but much better, idea of the probability that the prediction is correct.

• For small values of the loss (< 0.8), the prediction is always correct.
• For large values (> 3), it is always incorrect.
• For intermediate values, make a histogram, with the probability correct in each bin.

[Figure: histogram of the probability correct against the value of the loss]
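The binning recipe can be sketched as follows; the thresholds 0.8 and 3 are the illustrative values from the slide, and the function name is our own:

```python
def prob_correct_by_loss_bin(losses, correct, low=0.8, high=3.0, n_bins=10):
    # Empirical P(correct) as a function of the loss on a held-out set.
    # Below `low`, predictions are (empirically) always correct; above
    # `high`, always incorrect; in between, bin the losses and record
    # the fraction correct per bin.
    width = (high - low) / n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for loss, ok in zip(losses, correct):
        if low <= loss < high:
            b = int((loss - low) / width)
            hits[b] += ok
            counts[b] += 1
    # None marks empty bins.
    return [h / c if c else None for h, c in zip(hits, counts)]
```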

Page 6:

We have easy-to-compute metrics which are almost as good as the loss.

Page 7:

How to measure predictive value?

Before presenting the results, we need to explain a way to measure the quality of a metric. This will allow us to compare the predictive value of different metrics on different data sets.

We will show that we can easily compute metrics which give greater than 10X improved confidence.

Moreover, we can define simplified “green, yellow, red” zones, where the confidence is very high, moderate, and very low, respectively.

In the red zone, we value the fact that we have increased confidence in the probability of an error.
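One possible reading of the zones, with hypothetical cut-offs on the confidence metric (the slide does not fix numerical thresholds for the zones themselves):

```python
def confidence_zone(metric_value, green=0.8, red=3.0):
    # Hypothetical cut-offs: below `green` the prediction is almost
    # surely correct; above `red`, almost surely wrong; in between,
    # confidence is only moderate.
    if metric_value < green:
        return "green"
    if metric_value > red:
        return "red"
    return "yellow"

print(confidence_zone(0.2), confidence_zone(1.5), confidence_zone(4.0))
```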

Page 8:

Measuring the odds

Page 9:

Bayes Factor measures the value of information

Whether to bet for or against depends on the new odds.

The Bayes Factor tells us how much our expected winnings increase, if we know the value of the test (and bet correctly).
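A toy sketch of this betting interpretation (our own illustration, with made-up numbers):

```python
def odds(p):
    # Convert a probability into odds in favour.
    return p / (1.0 - p)

def bayes_factor(prior_p, posterior_p):
    # Ratio of posterior odds to prior odds: how much the test
    # result shifted the bet.
    return odds(posterior_p) / odds(prior_p)

# Made-up example: the base rate of a correct prediction is 95%;
# after observing a favourable metric value it rises to 99.5%.
print(bayes_factor(0.95, 0.995))  # about 10.5
```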

Page 10:

Predicting preference for Voice vs Text

Histogram for age:

Page 11:

Value of age information

What is the expected value of knowing the age (without knowing the range in advance)? It is the expected Bayes Ratio.

Page 12:

Less valuable information: where they live

So, compared to the expected value of knowing the age (22), the expected value of knowing the location is low (1.5).

Page 13:

Value of Model Entropy

We can still estimate when the prediction is correct:
• For small values of the entropy (< 0.001), the prediction is always correct.
• For large values (> 1), it is correct less than 20% of the time.
• For intermediate values, make a histogram, with the probability correct in each bin.
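Model entropy here is the Shannon entropy of the softmax output; a minimal sketch:

```python
import math

def model_entropy(probs):
    # Shannon entropy (in nats) of the network's output distribution:
    # near 0 when almost all mass is on one class, large when the
    # mass is spread over many classes.
    return -sum(p * math.log(p) for p in probs if p > 0)

print(model_entropy([0.999, 0.0005, 0.0005]))  # small: confident
print(model_entropy([0.1] * 10))               # log(10), about 2.3: diffuse
```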

Page 14:

Bayes Ratio for Loss, Entropy, U1, U2

Bayes ratio over 20 equal-quantile bins for: loss, entropy, U1, U5. The Bayes ratio is very large in the first 10 and last 3 bins. On the other hand, bin 15 for U5 provides very little value.
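A rough sketch of the per-bin computation (our own code; ties and empty bins are not handled carefully):

```python
def bayes_ratio_per_bin(metric, correct, n_bins=4, eps=1e-6):
    # Sort test examples by the metric, split them into equal-count
    # (quantile) bins, and compare the odds of being correct inside
    # each bin to the prior odds over the whole test set.
    pairs = sorted(zip(metric, correct))
    n = len(pairs)

    def odds(p):
        p = min(max(p, eps), 1 - eps)  # clamp so the odds stay finite
        return p / (1 - p)

    prior = odds(sum(correct) / n)
    ratios = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        acc = sum(ok for _, ok in chunk) / len(chunk)
        ratios.append(odds(acc) / prior)
    return ratios
```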

Page 15:

Expected Bayes Ratio Tables

High Cost

Page 16:

Comparison with other works

• Another confidence metric comes from Bayesian Dropout (Gal and Ghahramani). In this case, inference is run 30 times with different random dropout, and confidence is based on the model variance.
  • Less accurate.
  • Very costly (30X inference cost).

• One can also train an auxiliary neural network to predict whether another network is correct or wrong.
  • We have not compared accuracy with this method.
  • But it is also costly (1X additional inference).

• Compared to these methods, our method is less costly, and we can also provide theoretical guarantees that it works (for small values of U, under the assumptions of generalization).
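For comparison, the MC-dropout baseline can be sketched as follows; `predict_with_dropout` is a hypothetical stand-in for a forward pass with dropout left active:

```python
def mc_dropout_uncertainty(predict_with_dropout, x, n_runs=30):
    # Gal & Ghahramani's approach, schematically: run the network
    # n_runs times with dropout active, and use the spread of the
    # outputs as an uncertainty estimate.  Note the cost: n_runs
    # forward passes per example.
    outputs = [predict_with_dropout(x) for _ in range(n_runs)]
    mean = sum(outputs) / n_runs
    var = sum((o - mean) ** 2 for o in outputs) / n_runs
    return mean, var
```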

Page 17:

Conclusions so far

• A confidence metric which is easy to compute (basically free) and which gives increased confidence in the probability that a prediction is correct.

• This measure can be adapted to Top 5, and can be used to detect increased probability of errors as well.

• On the larger dataset, ImageNet, the metric performs better than it does on CIFAR-10.

Page 18:

Detecting Incorrect Labels

• When we evaluate the uncertainty metric, we find some outliers.

• These turn out to be ambiguous images in the test set.

Page 19:

What about off-manifold data?

Page 20:

Conclusions

• This tool for giving confidence (or uncertainty) to the classifications of neural networks has immediate applications to fields where confidence is valuable.

• It can also be used for:
  • detecting errors in labels
  • detecting off-manifold data, or adversarially perturbed data