Sublinear Evolutive Design of Error Correcting Output Codes
Student: Miguel Angel Bautista
Directors: Dr. Sergio Escalera & Dr. Xavier Baró
Outline
• Categorization problems
• Error Correcting Output Codes
• SVMs with Gaussian-RBF kernel
• Genetic optimization
• Experiments & results
• Conclusions
Sublinear Evolutive Design of Error Correcting Output Codes, Miguel Angel Bautista, ACIA Prize, CCIA 2010 10/19/10
Categorization problems
• Humans are involved in classification tasks from their early days.
• Classification is an unavoidable task in intelligent systems.
Multi-class categorization problems
• Real-world problems have more than 2 categories to identify.
• There are several ways to treat multi-class categorization problems.
• Ensemble learning techniques are often used in this type of scenario.
Error Correcting Output Codes (ECOC)
• ECOCs are an ensemble learning methodology that combines dichotomizers (base classifiers) to treat multi-class problems.
[Figure: Base classifier 1 + Base classifier 2 + Base classifier 3, PROBLEM]
ECOC coding
• ECOCs can be represented as matrices whose columns represent the different sub-problems to treat.
• Each column has values that split the categories into two groups.
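As a minimal sketch of the matrix view, the two standard designs named in this deck (One VS One and One VS All) can be built as follows. The function names and the +1 / -1 / 0 symbols are assumptions made for illustration: +1 and -1 mark the two groups of a sub-problem, and 0 marks a class that the sub-problem ignores.

```python
from itertools import combinations

def one_vs_all(n_classes):
    """Rows are classes, columns are sub-problems: class i against the rest."""
    return [[1 if row == col else -1 for col in range(n_classes)]
            for row in range(n_classes)]

def one_vs_one(n_classes):
    """One column per pair of classes; classes outside the pair get 0."""
    pairs = list(combinations(range(n_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(n_classes)]
    for col, (first, second) in enumerate(pairs):
        matrix[first][col] = 1
        matrix[second][col] = -1
    return matrix

ova = one_vs_all(4)   # 4 classes, 4 columns
ovo = one_vs_one(4)   # 4 classes, 6 columns

for name, matrix in (("One VS All", ova), ("One VS One", ovo)):
    print(name)
    for codeword in matrix:
        print(" ".join(f"{value:+d}" for value in codeword))
```

Each row of the printed matrix is the codeword of one class, and each column is the sub-problem handed to one base classifier.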
[Figure: example coding matrices for One VS One and One VS All]
ECOC decoding
• Each sub-problem is trained and the set of predictions is compared to the codewords.
• Various types of decoding are available, based on Euclidean and Hamming distances (only binary codings).
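A minimal sketch of Hamming decoding for a binary coding: the predictions of the base classifiers form a vector, and the class whose codeword is closest to that vector wins. The codewords and the prediction vector below are made-up values for illustration, not data from the experiments.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(predictions, codewords):
    """Return the index of the codeword closest to the predictions."""
    distances = [hamming_distance(predictions, codeword) for codeword in codewords]
    return distances.index(min(distances))

# Hypothetical coding: 4 classes, 5 base classifiers.
codewords = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
]

# Outputs of the 5 base classifiers; the second one disagrees with class 1.
predictions = [1, 0, 1, 0, 0]
predicted_class = decode(predictions, codewords)
print(predicted_class)
```

With redundant columns, as here, one wrong base classifier can still be corrected, which is where the "error correcting" name comes from.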
Base classifier: SVM with an RBF kernel
• Each binary problem is learned by a base classifier.
• SVMs with RBF kernels have shown good performance on this kind of problem.
• This type of SVM needs its parameters (C and gamma) to be optimized.
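To make the role of gamma concrete, here is a small sketch of the Gaussian-RBF kernel, k(x, y) = exp(-gamma * ||x - y||^2). The function name and the sample points are assumptions for illustration. C, the penalty on training errors, belongs to the SVM training objective and is not shown here.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian-RBF similarity between two points of equal dimension."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

x = [1.0, 2.0]
y = [2.0, 4.0]   # squared distance to x is 1 + 4 = 5

# A larger gamma makes the similarity decay faster with distance.
for gamma in (0.01, 0.1, 1.0):
    print(gamma, rbf_kernel(x, y, gamma))
```

Because the kernel response changes this strongly with gamma, a value that suits one binary sub-problem may not suit another, which is why each base classifier is tuned.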
Motivation: complexity in terms of the number of classifiers used.
[Figure: One VS One, One VS All, and theoretical lower bound]
• The number of classifiers needed by state-of-the-art approaches becomes impractical as the number of classes in the problem increases.
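A quick sketch of how the three quantities grow with the number of classes N: One VS One needs N(N-1)/2 classifiers, One VS All needs N, and the lower bound is taken here to be the ⌈log2 N⌉ figure given on the sublinear coding slide. The function name is an assumption, and the class counts (7, 20, 36) are the ones listed for the Computer Vision datasets.

```python
def n_classifiers(n_classes):
    """Number of base classifiers required by each design."""
    return {
        "one_vs_one": n_classes * (n_classes - 1) // 2,
        "one_vs_all": n_classes,
        # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
        "lower_bound": (n_classes - 1).bit_length(),
    }

counts = {n: n_classifiers(n) for n in (7, 20, 36)}
for n, row in counts.items():
    print(n, row)
```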
Global overview
• Sublinear ECOC coding
• Joint genetic optimization of ECOC & base classifier
Sublinear coding
• Define the lowest number of base classifiers needed to discriminate N categories.
• Drawing on information theory, only ⌈log2 N⌉ bits are needed to discriminate N categories.
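One way to see the bound is to give each class its index written in binary, so that N classes get distinct codewords of length ⌈log2 N⌉. The sketch below does exactly that. It is only one possible minimal coding, chosen for illustration, and the function name is an assumption. It is not the optimized matrix that the deck goes on to search for.

```python
def sublinear_codewords(n_classes):
    """Assign each class the binary expansion of its index."""
    n_bits = max(1, (n_classes - 1).bit_length())   # ceil(log2 N), at least 1
    return [[(label >> bit) & 1 for bit in reversed(range(n_bits))]
            for label in range(n_classes)]

codewords = sublinear_codewords(5)   # 5 classes need 3 columns
for label, codeword in enumerate(codewords):
    print(label, codeword)
```

With so few columns there is no redundancy left, so which classes share a bit value in each column matters a great deal for accuracy.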
Genetic algorithms
• Optimization algorithms based on Darwin's theory of evolution.
  • Optimization processes based on the evolution of individuals.
  • Each possible solution is coded into a chromosome.
  • Individuals are evaluated by means of their adaptation to the environment.
• A recommended method when the search space is neither continuous nor differentiable.
Evolutionary optimization I
• Each ECOC individual is seen as a binary vector (chromosome) and evaluated by means of its classification error.
• Standard genetic operators are used: scattered crossover and Gaussian add unit mutation.
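A minimal sketch of these operators on binary chromosomes. Scattered crossover draws a random binary mask and takes each gene from one parent or the other accordingly. The deck does not spell out the mutation operator, so a plain bit-flip mutation is swapped in here as a stand-in for binary genes. The function names, the mutation rate, the seed, and the parent vectors are all assumptions for illustration.

```python
import random

def scattered_crossover(parent_a, parent_b, rng):
    """Pick each gene from parent_a or parent_b according to a random mask."""
    mask = [rng.random() < 0.5 for _ in parent_a]
    return [a if take_a else b
            for a, b, take_a in zip(parent_a, parent_b, mask)]

def bit_flip_mutation(chromosome, rate, rng):
    """Stand-in mutation: flip each bit independently with probability rate."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)                  # fixed seed for reproducibility
parent_a = [0, 1, 1, 0, 0, 0, 1, 1]
parent_b = [1, 1, 1, 0, 1, 0, 1, 1]

child = scattered_crossover(parent_a, parent_b, rng)
mutant = bit_flip_mutation(child, rate=0.1, rng=rng)
print(child)
print(mutant)
```

In a full loop, each resulting chromosome would be reshaped into a coding matrix and scored by the classification error of the ECOC it defines.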
[Figure: example chromosomes 01100011, 1110, 1011]
A) Optimize the SVMs, looking for suitable parameters.
B) Optimize the coding matrix and return to step A.
Evolutionary optimization II
• An inner optimization process is carried out to tune the parameters of the SVMs.
• Once each base classifier is optimized, the Sublinear coding matrix is optimized.
Experiment characteristics
• UCI dataset characteristics.
Experiment characteristics
• Computer Vision datasets.
  • ARFace: 520 x 120, 20 classes.
  • Traffic: 3481 x 100, 36 classes.
  • MPEG: 1400 x 70, 20 classes.
  • Cleafs: 4098 x 65, 7 classes.
Results on UCI problems
• As we can see, the evolutive Sublinear coding performs better than the standard codings.
Results on Computer Vision problems
• In these experiments we can see how evolutionary approaches outperform standard ECOC codings while dramatically decreasing the number of classifiers.
Conclusions
• The Sublinear ECOC represents the lower bound in terms of the number of classifiers.
• The evolutive ECOC optimization obtains results comparable to the standard coding designs (sometimes better) while using far fewer dichotomizers.
• This design is suitable for classification problems with a large number of classes.
Associated scientific publications
• Supervised and Unsupervised ensemble learning and applications 2010, SUEMA-ECML 2010.
• Pattern Recognition Letters Journal (Submitted).
• CVC-RD Workshop, 2010.
Thank you
QUESTIONS?