Multi-objective Evolutionary Approaches for ROC Performance Maximization
Ke Tang
USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Applications (UBRI) School of Computer Science and Technology
University of Science and Technology of China
July 2014 @ USTC
Outline
• Introduction to ROC analysis
• Related works
• A Multi-Objective Evolutionary Approach to ROCCH maximization (CH-MOEA)
• Conclusions
Introduction to ROC Analysis
• Many real-world classification problems are either cost sensitive or have imbalanced class distribution.
• In such situations, a classifier with high classification accuracy might not make sense at all.
• Alternative performance metrics are needed.
• In Big Data era, misclassification cost and class distribution may even change over time.
Introduction to ROC Analysis
• Confusion Matrix
                     Predicted Positive      Predicted Negative
    Actual Positive  True Positive rate      False Negative rate
    Actual Negative  False Positive rate     True Negative rate
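The four rates in the matrix follow directly from the raw counts; a minimal helper (illustrative, not from the slides — the function name and the example counts are made up):

```python
# Confusion-matrix rates for a binary classifier.
# tp, fn, fp, tn are raw counts, not rates.
def roc_rates(tp, fn, fp, tn):
    tpr = tp / (tp + fn)  # True Positive rate (sensitivity)
    fpr = fp / (fp + tn)  # False Positive rate
    fnr = fn / (tp + fn)  # False Negative rate
    tnr = tn / (fp + tn)  # True Negative rate
    return tpr, fpr, fnr, tnr

print(roc_rates(tp=40, fn=10, fp=5, tn=45))  # (0.8, 0.1, 0.2, 0.9)
```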
Introduction to ROC Analysis
• Receiver Operating Characteristic (ROC)
Introduction to ROC Analysis
• ROC Curve: a curve in the ROC space, generated by tuning the threshold of a classifier.
f(x) = w^T x + b
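The threshold sweep can be sketched in a few lines: each threshold t turns the scores f(x) into one hard classifier, i.e., one (FPR, TPR) point. The scores and labels below are made-up illustrative data, not from the talk:

```python
# Sweep the decision threshold over the classifier scores f(x);
# each distinct threshold yields one (FPR, TPR) point of the ROC curve.
def roc_points(scores, labels):
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return [(0.0, 0.0)] + points  # the all-negative classifier starts the curve

scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1,   1,   0,   1,   0]
print(roc_points(scores, labels))
```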
Introduction to ROC Analysis
• From ROC analysis to performance measures
  – Simple version: Area Under the ROC Curve (AUC)
  – Complicated version: ROC Convex Hull (ROCCH)
Introduction to ROC Analysis
• An important characteristic of ROCCH:
Under any target cost and class distribution, the best classifier for those conditions must be a vertex of, or lie on an edge of, the convex hull of all classifiers.
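The reason is that expected misclassification cost is linear in (FPR, TPR), so minimizing it over any classifier set always lands on the ROCCH. A small sketch (the function name, hull vertices, and cost values are illustrative assumptions, not from the slides):

```python
# For costs c_fp, c_fn and class priors p_pos, p_neg, the expected cost
# of operating at (fpr, tpr) is linear, so its minimizer over a finite
# set of classifiers is always attained at a ROCCH vertex.
def best_vertex(hull, c_fp, c_fn, p_pos, p_neg):
    cost = lambda fpr, tpr: c_fp * p_neg * fpr + c_fn * p_pos * (1 - tpr)
    return min(hull, key=lambda v: cost(*v))

hull = [(0.0, 0.0), (0.1, 0.5), (0.5, 0.9), (1.0, 1.0)]
# Equal priors, false negatives 4x as costly: a high-TPR vertex wins.
print(best_vertex(hull, c_fp=1, c_fn=4, p_pos=0.5, p_neg=0.5))  # (0.5, 0.9)
```

Changing the cost ratio or the class priors simply tilts the iso-cost lines, selecting a different vertex — no retraining is needed.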
Related Work
• Both AUC and ROCCH can be used as objective functions for training a classifier/learner.
• When seeking a (soft) classifier with maximum AUC or ROCCH, we actually seek a set of (hard) classifiers, e.g., classifiers with different thresholds.
• More intuitively, we try to find a classifier that is roughly good (robust) and can be easily adapted to different misclassification costs or class distributions.
Related Work
• AUC maximization
  – is (in some circumstances) equivalent to a bipartite ranking problem, and can be addressed with learning-to-rank approaches.
  – Rank-SVM (Joachims, 2005)
  – RankBoost (Freund et al., 2003)
• ROCCH maximization
  – is more challenging than the AUC-maximization problem.
  – can only be tackled with heuristic approaches.
  – PRIE (Fawcett, 2008)
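The ranking view of AUC is easy to see in code: AUC equals the probability that a randomly drawn positive example is scored above a randomly drawn negative one (ties counting one half), which is why maximizing it is a bipartite ranking problem. A minimal illustration with made-up data:

```python
# AUC as the Wilcoxon rank statistic: the fraction of positive/negative
# pairs that the classifier orders correctly (ties count 1/2).
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.7, 0.4, 0.3], [1, 1, 0, 1, 0]))  # 5 of 6 pairs correct
```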
CH-MOEA
• Existing approaches try to obtain a set of homogeneous classifiers, in the sense that the classifiers only differ in their thresholds.
• Question: why must the classifiers be homogeneous?
  – Heterogeneous classifiers might spread better in the ROC space.
  – The distinction between homogeneous and heterogeneous classifiers makes little difference in practical implementation.
CH-MOEA
• Our Target: Train a set of (heterogeneous) classifiers such that the ROCCH is maximized.
• Such a set-based optimization problem could hardly be solved with existing mathematical programming tools.
• Evolutionary Algorithms provide a natural way to search for the desired classifier set.
CH-MOEA
• In particular, multi-objective evolutionary algorithms are off-the-shelf tools for this problem.
  – Maximize TP
  – Minimize FP
CH-MOEA
• General framework of EAs
© Xin Yao

What Is an Evolutionary Algorithm?

(OK, you can open your eyes and wake up now.)

1. Generate the initial population P(0) at random, and set i ← 0;
2. REPEAT
   (a) Evaluate the fitness of each individual in P(i);
   (b) Select parents from P(i) based on their fitness in P(i);
   (c) Generate offspring from the parents using crossover and mutation to form P(i+1);
   (d) i ← i + 1;
3. UNTIL halting criteria are satisfied
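The generic loop above can be made concrete on a toy OneMax problem (maximize the number of 1-bits). All operators and parameters below (binary tournament, one-point crossover, bit-flip mutation, population size) are illustrative choices for the sketch, not those used in CH-MOEA:

```python
import random

random.seed(0)  # reproducible toy run

def evolve(n_bits=20, pop_size=20, generations=50):
    # 1. initial population at random
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):                              # 2. REPEAT
        fitness = [sum(ind) for ind in pop]                   # (a) evaluate

        def select():                                         # (b) binary tournament
            a, b = random.randrange(pop_size), random.randrange(pop_size)
            return pop[a] if fitness[a] >= fitness[b] else pop[b]

        offspring = []
        while len(offspring) < pop_size:                      # (c) crossover + mutation
            p1, p2 = select(), select()
            cut = random.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (random.random() < 1 / n_bits) for bit in child]
            offspring.append(child)
        pop = offspring                                       # (d) next generation
    return max(pop, key=sum)                                  # 3. UNTIL budget exhausted

best = evolve()
print(sum(best))
```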
CH-MOEA
• What is the most famous MOEA so far?
• Probably NSGA-II (Deb et al., 2002), mainly famous for its selection scheme:
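The core of that selection scheme is non-dominated sorting: the population is split into fronts F1, F2, ... by repeatedly peeling off the non-dominated solutions. A minimal sketch (NSGA-II additionally uses crowding distance and elitism, omitted here; both objectives are treated as minimized, e.g., (FP rate, 1 − TP rate)):

```python
# a dominates b iff a is no worse in every objective and not identical
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

# Peel off successive non-dominated fronts (naive O(n^2) per front).
def non_dominated_sort(points):
    fronts, remaining = [], list(points)
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

pts = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.4, 0.4)]
print(non_dominated_sort(pts))
```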
CH-MOEA
• However, direct application of NSGA-II (or any other MOEA) might be inappropriate because:
  – A non-dominated (i.e., Pareto-optimal) solution is not necessarily on the convex hull.
  – The objective space of the problem is essentially discrete (which may cause redundant solutions).
CH-MOEA
• Our approach: Convex Hull-based MOEA (CH-MOEA)
• New features of CH-MOEA:
  – Redundancy elimination
  – A new sorting scheme dedicated to ROCCH maximization
CH-MOEA
• Redundancy Elimination
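The slide only names this feature; a plausible minimal sketch of the idea (my assumption, not necessarily the paper's exact procedure): since the objective space is discrete, many classifiers can map to the same (FPR, TPR) point, and keeping one representative per distinct point removes the redundancy:

```python
# Keep one individual per distinct objective vector (first one wins).
def eliminate_redundancy(population, objectives):
    seen, kept = set(), []
    for ind, obj in zip(population, objectives):
        if obj not in seen:
            seen.add(obj)
            kept.append(ind)
    return kept

pop = ["c1", "c2", "c3", "c4"]                          # hypothetical classifiers
objs = [(0.1, 0.7), (0.1, 0.7), (0.3, 0.9), (0.1, 0.7)]  # their (FPR, TPR)
print(eliminate_redundancy(pop, objs))  # ['c1', 'c3']
```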
CH-MOEA
• New sorting scheme for ROCCH maximization
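A sketch of what convex-hull-based sorting can look like (my reading of the idea, not the paper's exact algorithm): instead of Pareto fronts, peel off successive ROC convex hulls — layer 1 is the upper hull of all points, layer 2 the upper hull of the rest, and so on — and rank individuals by their layer:

```python
# 2D cross product: sign tells the turn direction o -> a -> b.
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

# Upper convex hull of (FPR, TPR) points via a monotone-chain scan.
def upper_hull(points):
    hull = []
    for p in sorted(set(points)):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# Sort points into successive convex-hull layers.
def hull_layers(points):
    layers, remaining = [], list(points)
    while remaining:
        layer = [p for p in remaining if p in upper_hull(remaining)]
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers

pts = [(0.0, 0.0), (0.2, 0.5), (0.4, 0.6), (1.0, 1.0)]
print(hull_layers(pts))
```

Note how (0.4, 0.6), which plain non-dominated sorting would place in the first front, is demoted to the second layer because it is not on the hull.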
CH-MOEA
• CH-MOEA can be combined with any learning model that can be evolved:
  – Neural Network
  – Decision Tree
  – SVM
  – …
• Genetic Programming is adopted in our work, which can be viewed as evolving a decision tree.
CH-MOEA
• Pseudo-code of CH-MOGP
CH-MOEA
• Dataset for empirical studies
CH-MOEA
• Compared methods
CH-MOEA
• CH-MOGP outperformed state-of-the-art MOEAs
CH-MOEA
• CH-MOGP outperformed other non-EA methods.
CH-MOEA
• CH-MOGP outperformed other non-EA methods.
Conclusions
• Cost-sensitive and class-imbalance learning are commonly encountered in the real world.
• ROCCH fits this type of problem very well due to its insensitivity to misclassification costs and class distributions.
• ROCCH maximization is formulated as a special MOP that has not been well addressed by existing MOEAs.
• A new MOEA, namely CH-MOEA, is proposed to tackle this learning problem.
• CH-MOEA could be extended to any machine learning model.
Reference
• P. Wang, M. Emmerich, R. Li, K. Tang, T. Baeck and X. Yao, “Convex Hull-Based Multi-objective Genetic Programming for Maximizing Receiver Operating Characteristic Performance,” IEEE Transactions on Evolutionary Computation, in press (DOI: 10.1109/TEVC.2014.2305671).
• P. Wang, K. Tang, T. Weise, E. P. K. Tsang and X. Yao, “Multiobjective Genetic Programming for Maximizing ROC Performance,” Neurocomputing, 125: 102-118, February 2014.
Collaborators
• Dr. Pu Wang
• Prof. Xin Yao
• Prof. Edward Tsang
• Dr. Thomas Weise
• Dr. Michael Emmerich
• Dr. Rui Li
• Prof. Thomas Baeck