Page 1:

Performance Evaluation in Computer Vision

Kyungnam Kim
Computer Vision Lab, University of Maryland, College Park

Page 2:

Contents

- Error Estimation in Pattern Recognition
  Reference: Jain et al., “Statistical Pattern Recognition: A Review”, IEEE PAMI, 2000 (Section 7, Error Estimation).
- Assessing and Comparing Algorithms
  Reference: Adrian Clark and Christine Clark, “Performance Characterization in Computer Vision: A Tutorial”.
  - Receiver Operating Characteristic (ROC) curve
  - Detection Error Trade-off (DET) curve
  - Confusion Matrix
  - McNemar’s test

http://peipa.essex.ac.uk/benchmark/

Page 3:

Error Estimation in Pattern Recognition

Reference: Jain et al., “Statistical Pattern Recognition: A Review”, IEEE PAMI, 2000 (Section 7, Error Estimation).

- It is very difficult to obtain a closed-form expression for the error rate $P_e$.
- In practice, the error rate must be estimated from the available samples, which are split into training and test sets.
- Error estimate = percentage of misclassified test samples.
- A reliable error estimate requires (1) a large sample size and (2) independent training and test samples.

Page 4:

Error Estimation in Pattern Recognition

- The error estimate (a function of the specific training and test sets used) is a random variable.
- Given a classifier, let $t$ be the number of misclassified test samples out of $n$. The probability density function of $t$ is binomial.
- The maximum-likelihood estimate $\hat{P}_e$ of $P_e$ is $\hat{P}_e = t/n$, with $E(\hat{P}_e) = P_e$ and $\mathrm{Var}(\hat{P}_e) = P_e(1 - P_e)/n$.
- Because $\hat{P}_e$ is a random variable, it comes with a confidence interval, which shrinks as $n$ increases.
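To make these formulas concrete, here is a minimal Python sketch (the counts 12 and 200 are hypothetical) that computes $\hat{P}_e = t/n$ and an approximate 95% confidence interval from the variance above:

```python
import math

def error_estimate(t, n, z=1.96):
    """Maximum-likelihood error estimate t/n with an approximate 95%
    confidence interval (normal approximation to the binomial,
    reasonable for large n)."""
    p = t / n                        # ML estimate of the true error rate
    se = math.sqrt(p * (1 - p) / n)  # from Var = P_e(1 - P_e)/n
    return p, (p - z * se, p + z * se)

# Hypothetical counts: 12 misclassified out of 200 test samples.
p_hat, (lo, hi) = error_estimate(12, 200)
print(f"P_e estimate = {p_hat:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```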

Page 5:

Versions of the cross-validation approach

[Figure: versions of the cross-validation approach, including the ‘leave all in’ scheme, and bootstrap resampling based on the analogy population : sample :: sample : resample.]

http://www.uvm.edu/~dhowell/StatPages/Resampling/Bootstrapping.html
http://www.childrens-mercy.org/stats/ask/bootstrap.asp
http://www.cnr.colostate.edu/class_info/fw663/bootstrap.pdf
http://www.maths.unsw.edu.au/ForStudents/courses/math3811/lecture9.pdf
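As a rough illustration of the bootstrap idea behind these links, here is a minimal Python sketch, assuming 0/1 losses on a hypothetical 50-sample test set; it resamples the test sample with replacement, following the population : sample :: sample : resample analogy:

```python
import random

def bootstrap_ci(errors, n_boot=1000, seed=0):
    """Percentile bootstrap for the mean error rate: the test sample
    stands in for the population, and resamples drawn with replacement
    stand in for fresh samples from it."""
    rng = random.Random(seed)
    n = len(errors)
    means = sorted(
        sum(errors[rng.randrange(n)] for _ in range(n)) / n
        for _ in range(n_boot)
    )
    # 95% percentile interval from the bootstrap distribution.
    return means[int(0.025 * n_boot)], means[int(0.975 * n_boot)]

# Hypothetical 0/1 losses on a 50-sample test set (6 errors).
losses = [1] * 6 + [0] * 44
print(bootstrap_ci(losses))
```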

Page 6:

Error Estimation in Pattern Recognition

- Receiver Operating Characteristic (ROC) curve: detailed later.
- ‘Reject rate’: reject doubtful patterns near the decision boundary (low confidence).
- A well-known reject option is to reject a pattern if its maximum a posteriori probability is below a threshold, as sketched below.
- There is a trade-off between the ‘reject rate’ and the ‘error rate’.
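A minimal Python sketch of this reject option; the posteriors and the 0.8 threshold are hypothetical:

```python
def classify_with_reject(posteriors, threshold=0.8):
    """Return the most probable class, or None (reject) when the maximum
    a posteriori probability falls below the threshold. Raising the
    threshold trades a lower error rate for a higher reject rate."""
    best = max(posteriors, key=posteriors.get)
    return best if posteriors[best] >= threshold else None

# Hypothetical posteriors for a three-class problem.
print(classify_with_reject({"a": 0.55, "b": 0.30, "c": 0.15}))  # None: rejected
print(classify_with_reject({"a": 0.90, "b": 0.07, "c": 0.03}))  # 'a'
```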

Page 7:

Next seminar: Dimensionality Reduction / Manifold Learning?

Page 8:

[Figure: classification method]

Page 11:

Assessing and Comparing Algorithms

Reference: Adrian Clark and Christine Clark, “Performance Characterization in Computer Vision: A Tutorial”. http://peipa.essex.ac.uk/benchmark/tutorials/essex/tutorial.pdf

- Use the same training and test sets. Some standard sets: FERET, PETS.
- Simply seeing which algorithm has the better success rate is not enough; a standard statistical test, McNemar’s test, is required.
- Two types of testing:
  - Technology evaluation: the response of an underlying generic algorithm to factors such as adjustment of its tuning parameters, noisy input data, etc.
  - Application evaluation: how well an algorithm performs a particular task.

Page 12:

Assessing and Comparing Algorithms

Receiver Operating Characteristic (ROC) curve

- FP rate (false positive rate) $= \dfrac{FP}{FP + TN}$
- TP rate (true positive rate) $= \dfrac{TP}{TP + FN}$
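A minimal Python sketch of these definitions (the scores, labels, and thresholds are hypothetical); sweeping the threshold over the classifier’s scores traces out the ROC curve:

```python
def roc_points(scores, labels, thresholds):
    """For each decision threshold, count TP/FP/TN/FN and return the
    (FP rate, TP rate) pair; plotting the pairs traces the ROC curve."""
    points = []
    for th in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < th and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < th and y == 0)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

# Hypothetical classifier scores (higher = more confidently positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(roc_points(scores, labels, [0.25, 0.5, 0.75]))
```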

Page 13:

Assessing and Comparing Algorithms

Detection Error Trade-off (DET) curve

- logarithmic scales on both axes
- more spread out, so curves are easier to distinguish
- close to linear
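A minimal plotting sketch under those conventions, assuming matplotlib and the hypothetical roc_points helper from the ROC sketch above:

```python
import matplotlib.pyplot as plt

def plot_det(points):
    """Plot the DET curve: false negative rate (= 1 - TP rate) against
    false positive rate, with logarithmic scales on both axes, which
    spreads the curve out and often makes it close to linear."""
    eps = 1e-4  # clip zero rates so the log axes stay finite
    fpr = [max(fp, eps) for fp, tp in points]
    fnr = [max(1.0 - tp, eps) for fp, tp in points]
    plt.plot(fpr, fnr, marker="o")
    plt.xscale("log")
    plt.yscale("log")
    plt.xlabel("false positive rate")
    plt.ylabel("false negative rate")
    plt.show()

# plot_det(roc_points(scores, labels, thresholds))  # reusing the earlier sketch
```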

Page 14:

Assessing and Comparing Algorithms

Detection Error Trade-off (DET) curve

- Forensic applications: track down a suspect
- High-security applications: ATMs
- EER (equal error rate); see the sketch below
- Comparisons of algorithms tend to be performed with a specific set of tuning parameter values. (Running them with settings that correspond to the EER is probably the most sensible.)
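A minimal sketch of estimating the EER from sampled (FP rate, TP rate) operating points; the points below are hypothetical:

```python
def equal_error_rate(points):
    """Approximate the EER from sampled (FP rate, TP rate) operating
    points: pick the point where the FP rate and the FN rate
    (= 1 - TP rate) are closest to each other, and average the two."""
    fp, tp = min(points, key=lambda p: abs(p[0] - (1.0 - p[1])))
    return (fp + (1.0 - tp)) / 2.0

# Hypothetical operating points sampled from a ROC curve.
print(equal_error_rate([(0.05, 0.60), (0.10, 0.85), (0.30, 0.95)]))  # ~0.125
```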

Page 15:

Assessing and Comparing Algorithms

Crossing ROC curves

When two ROC curves cross, neither algorithm is uniformly better. Comparisons of algorithms tend to be performed with a specific set of tuning parameter values; running them with settings that correspond to the EER is probably the most sensible.

Page 16:

Assessing and Comparing Algorithms

Confusion Matrices
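A confusion matrix is easiest to see built in code; a minimal Python sketch with hypothetical three-class labels:

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Cell (i, j) counts test samples of true class i that the
    classifier assigned to class j; the diagonal holds the correct
    decisions."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical three-class results.
true = ["cat", "cat", "dog", "dog", "bird", "bird"]
pred = ["cat", "dog", "dog", "dog", "bird", "cat"]
for row in confusion_matrix(true, pred, ["cat", "dog", "bird"]):
    print(row)
```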

Page 17:

Assessing and Comparing Algorithms

McNemar’s test

- An appropriate statistical test must take into account not only the numbers of false positives, etc., but also the number of tests.
- McNemar’s test is a form of chi-square test.

http://www.zephryus.demon.co.uk/geography/resources/fieldwork/stats/chi.html
http://www.isixsigma.com/dictionary/Chi_Square_Test-67.htm

Page 18:

Assessing and Comparing Algorithms

McNemar’s test

- If the number of tests is greater than 30, the central limit theorem applies.
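A minimal sketch of the test, assuming the common continuity-corrected form of the chi-square statistic; the discordant counts are hypothetical:

```python
def mcnemar_statistic(n_a_only, n_b_only):
    """McNemar's test for two classifiers evaluated on the same test set.
    n_a_only: tests only algorithm A got right; n_b_only: tests only
    algorithm B got right. With enough discordant tests (the slide's
    rule of thumb: more than 30, so the central limit theorem applies),
    the continuity-corrected statistic is approximately chi-square with
    one degree of freedom; values above 3.84 indicate a significant
    difference at the 5% level."""
    return (abs(n_a_only - n_b_only) - 1) ** 2 / (n_a_only + n_b_only)

# Hypothetical discordant counts from a shared test set.
print(mcnemar_statistic(25, 10))  # 5.6 > 3.84: significant difference
```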