
Evaluation of classification performance on small, imbalanced datasets

Kay H. Brodersen1,2, Cheng Soon Ong1, Klaas E. Stephan1,2,3, Joachim M. Buhmann1

1 Department of Computer Science, ETH Zurich, Switzerland
2 Laboratory for Social and Neural Systems Research, University of Zurich, Switzerland
3 Wellcome Trust Centre for Neuroimaging, University College London, United Kingdom

The balanced accuracy


Is the accuracy a faithful performance measure?

[Figure: two confusion-matrix layouts (rows: predicted +/−; columns: actual +/−).]

Assessing classification performance

Setting

Observations x ∈ X with labels y ∈ {−1, +1}, n observations in total.

Classification-based confusion matrix:

                actual +   actual −
  predicted +      TP         FP
  predicted −      FN         TN
  total            P          N

with C := TP + TN correct and I := FP + FN incorrect classifications.

Performance assessment

  Accuracy:           A = (TP + TN) / n
  Balanced accuracy:  B = ½ · (TP / (TP + FN) + TN / (TN + FP))
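The two definitions above can be sketched directly in plain Python; the confusion-matrix counts in the example are hypothetical and chosen to show how a strongly biased classifier on an imbalanced dataset earns a high accuracy but a near-chance balanced accuracy.

```python
# Minimal sketch: accuracy vs. balanced accuracy from the four
# confusion-matrix counts defined above (hypothetical numbers).

def accuracy(tp, fp, fn, tn):
    # A = (TP + TN) / n, with n = TP + FP + FN + TN
    return (tp + tn) / (tp + fp + fn + tn)

def balanced_accuracy(tp, fp, fn, tn):
    # B = (TP / (TP + FN) + TN / (TN + FP)) / 2
    sensitivity = tp / (tp + fn)   # accuracy on the positive class
    specificity = tn / (tn + fp)   # accuracy on the negative class
    return (sensitivity + specificity) / 2

# 95 positives, 5 negatives; the classifier predicts "+" almost always.
tp, fp, fn, tn = 90, 5, 5, 0
print(accuracy(tp, fp, fn, tn))           # 0.9
print(balanced_accuracy(tp, fp, fn, tn))  # ≈ 0.474, i.e. below chance
```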

The posterior distribution of the accuracy

Assuming a flat prior on the interval [0, 1], the posterior of the accuracy follows a Beta distribution:

    A ~ Beta(a, b),   a = C + 1,   b = I + 1

From this we can compute:

  the mean:   (C + 1) / (C + I + 2)
  the mode:   C / (C + I)
  a posterior probability interval:
      [ F_B⁻¹(½α; C + 1, I + 1),  F_B⁻¹(1 − ½α; C + 1, I + 1) ]

with F_B the cumulative distribution function of the Beta density

    p(x; C + 1, I + 1) = x^C (1 − x)^I / B(C + 1, I + 1).
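These three posterior summaries are available directly from `scipy.stats.beta`; a minimal sketch, with hypothetical counts C and I:

```python
# Sketch: Beta posterior of the accuracy under a flat prior,
# with hypothetical counts of correct (C) and incorrect (I) predictions.
from scipy.stats import beta

C, I = 72, 28                       # hypothetical classification counts
a, b = C + 1, I + 1                 # posterior: A ~ Beta(C + 1, I + 1)

post_mean = a / (a + b)             # = (C + 1) / (C + I + 2)
post_mode = (a - 1) / (a + b - 2)   # = C / (C + I)
lo, hi = beta.ppf([0.025, 0.975], a, b)  # central 95% posterior interval

print(post_mean, post_mode, (lo, hi))
```

`beta.ppf` is the inverse CDF F_B⁻¹ from the slide, so `lo` and `hi` are exactly the interval endpoints at α = 0.05.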

The posterior distribution of the balanced accuracy

Assuming a flat prior on the interval [0, 1], the posterior of the balanced accuracy is given by the convolution of two Beta distributions:

    B = ½ (A_P + A_N),   A_P ~ Beta(TP + 1, FN + 1),   A_N ~ Beta(TN + 1, FP + 1)

    p_B(x) = 2 ∫₀¹ p_A(2x − z; TP + 1, FN + 1) · p_A(z; TN + 1, FP + 1) dz

Based on this density, we can compute:

  the mean
  the mode
  a posterior probability interval
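Rather than evaluating the convolution integral, one can approximate the same posterior by Monte Carlo: draw the two class-wise accuracies from their independent Beta posteriors and average them. A sketch, with a hypothetical confusion matrix:

```python
# Sketch: posterior of the balanced accuracy by sampling, using
# B = (A_P + A_N) / 2 with independent Beta posteriors per class.
import numpy as np

rng = np.random.default_rng(0)
TP, FN, TN, FP = 90, 5, 0, 5      # hypothetical confusion matrix

A_P = rng.beta(TP + 1, FN + 1, size=100_000)  # accuracy on positives
A_N = rng.beta(TN + 1, FP + 1, size=100_000)  # accuracy on negatives
B = (A_P + A_N) / 2

print(B.mean())                        # posterior mean of B
print(np.quantile(B, [0.025, 0.975]))  # 95% posterior interval
```

For this matrix the posterior mean of B sits near 0.54 with a wide interval, making explicit how little evidence five negative examples provide.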

Two examples

[Figure: confusion matrices for two examples, annotated with the average accuracy ± 2 standard errors, the mean accuracy with 95% posterior mass, the mean balanced accuracy with 95% posterior mass, and the chance level.]

Example 1: fair overall accuracy, high class imbalance, strong prediction bias
Example 2: high accuracies on both classes, no imbalance, no bias

Posterior densities

[Figure: posterior accuracy and posterior balanced accuracy for an example confusion matrix, showing the mean, median, mode, 95% posterior probability interval, average balanced accuracy, and the chance level.]

Smooth precision-recall curves

Decision values and the binormal assumption

[Figure: histograms of the decision values of negative and positive examples, each overlaid with a fitted normal density (the binormal assumption).]

Empirical and parametric curves

[Figure: empirical vs. true ROC curve (TPR, i.e. recall, over FPR = 1 − specificity) and PR curve (precision over recall).]


Empirical and parametric curves

[Figure: ROC curve (TPR over FPR = 1 − specificity) and PR curve (precision over recall), comparing the empirical estimate and two binormal-based estimates against the true curves.]
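Under the binormal assumption, a smooth ROC curve (and from it a smooth PR curve) follows from fitting one Gaussian to the decision values of each class and sweeping the threshold analytically. A sketch with synthetic decision values; the sample sizes, means, and the class prior `pi` are all illustrative assumptions:

```python
# Sketch: parametric (binormal) ROC and PR curves from decision values.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# Hypothetical decision values (e.g. classifier outputs) per class.
neg = rng.normal(-0.5, 1.0, size=30)   # negative examples
pos = rng.normal(1.0, 1.2, size=30)    # positive examples

# Binormal assumption: each class's decision values are Gaussian.
mu_n, sd_n = neg.mean(), neg.std(ddof=1)
mu_p, sd_p = pos.mean(), pos.std(ddof=1)

# Sweep a decision threshold t over the range of observed values.
t = np.linspace(min(neg.min(), pos.min()) - 3,
                max(neg.max(), pos.max()) + 3, 500)
fpr = norm.sf(t, mu_n, sd_n)   # P(decision value > t | negative)
tpr = norm.sf(t, mu_p, sd_p)   # P(decision value > t | positive)

# A smooth PR curve follows from the same fit and a class prior pi.
pi = 0.5
precision = pi * tpr / np.maximum(pi * tpr + (1 - pi) * fpr, 1e-12)
recall = tpr
```

Because `fpr` and `tpr` come from fitted normal survival functions rather than from the empirical step function, the resulting curves are smooth even with only 30 examples per class.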

The effect of class imbalance on the PR curve

[Figure: RMSE of the estimated average precision (AP) as a function of the fraction of positive examples.]

[Figure: estimated minus true average precision (AP), as a function of the fraction of positive examples, for the empirical estimate and two binormal-based estimates.]

Take-home messages

Don'ts:
- report the average and the standard error of the accuracy across cross-validation folds
- look at empirical ROC or PR curves

Do's:
- report a statistic of the posterior distribution of the balanced accuracy
- compute a smooth ROC or PR curve under parametric assumptions

K.H. Brodersen, C.S. Ong, K.E. Stephan, J.M. Buhmann (2010) The balanced accuracy and its posterior distribution. Proceedings of the 20th International Conference on Pattern Recognition (in press).

K.H. Brodersen, C.S. Ong, K.E. Stephan, J.M. Buhmann (2010) The binormal assumption on precision-recall curves. Proceedings of the 20th International Conference on Pattern Recognition (in press).
