Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning
Yu-An Chung (1), Hsuan-Tien Lin (1), Shao-Wen Yang (2)
(1) Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
(2) Intel Labs, Intel Corporation, USA

Cost-sensitive Classification
• Motivating example: what is the status of a patient — H1N1-infected, cold-infected, or healthy?
• Cost of each kind of mis-prediction (rows: actual class; columns: predicted class):

  actual \ predicted |  H1N1 |  cold | healthy
  H1N1               |     0 |  1000 | very high
  cold               |   100 |     0 | 3000
  healthy            |   100 |    30 | 0

  - Predicting H1N1-infected as healthy: very high cost!
  - Predicting cold-infected as healthy: high cost
  - Predicting correctly: no cost
• Input: a training set S = {(x_n, y_n)}_{n=1}^N and a cost matrix C, where x_n ∈ X, y_n ∈ Y = {1, 2, ..., K}, and C(y, k) is the cost of classifying a class-y example as class k
• Goal: use S and C to train a classifier g: X → Y such that the expected cost C(y, g(x)) on a test example (x, y) is minimal

Our Goal & Contributions

                                           | shallow models (e.g., SVM) | deep learning
  regular (cost-insensitive) classification | well-studied               | popular and ongoing
  cost-sensitive classification             | well-studied               | our work lies here!

• First work to study cost-sensitive deep learning thoroughly:
  1) a novel cost-sensitive loss function for any deep model
  2) a cost-sensitive autoencoder (CAE), equipped with the loss function, for pre-training fully-connected deep models
  3) a combination of 1) and 2) as a complete cost-sensitive deep learning (CSDNN) solution

The Input-to-Cost Regression Network
• Regression network: estimate the costs C(y, k) directly from the input x
• Training the regression network:
  - any end-to-end regression loss (e.g., MSE, as in linear regression) could be applied
  - this work derives a loss function built on top of [Tu and Lin, 2010]: given the training set S = {(x_n, y_n)}_{n=1}^N and C, define

      δ_{n,k} ≡ ln(1 + exp(z_{n,k} · (r_k(x_n) − C(y_n, k)))),  where  z_{n,k} ≡ 2⟦k = y_n⟧ − 1

    and r_k(x_n) denotes the k-th output (estimated cost) of the regression network on x_n
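The smoothed one-sided regression loss above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes 0-indexed labels, takes the network outputs r as a given matrix, and sums δ_{n,k} over a whole mini-batch.

```python
import numpy as np

def csl(r, y, C):
    """Smoothed one-sided regression loss (CSL) over a mini-batch.

    r : (N, K) array of estimated costs r_k(x_n) from the regression network
    y : (N,)   true class labels in {0, ..., K-1}
    C : (K, K) cost matrix, C[y, k] = cost of predicting class k for a class-y example

    Returns sum_n sum_k ln(1 + exp(z_{n,k} * (r[n,k] - C[y_n, k]))),
    where z_{n,k} = +1 if k == y_n (penalize over-estimating the zero cost)
    and -1 otherwise (penalize under-estimating a nonzero cost).
    """
    N, K = r.shape
    c = C[y]  # (N, K): per-example target cost vectors c_n[k] = C(y_n, k)
    z = np.where(np.arange(K)[None, :] == y[:, None], 1.0, -1.0)  # z_{n,k} = 2[[k = y_n]] - 1
    return np.sum(np.log1p(np.exp(z * (r - c))))
```

With perfect estimates (r[n, k] = C[y_n, k]) every term reduces to ln 2 per entry, the minimum the smooth surrogate allows; the one-sided signs z_{n,k} push the network to never under-estimate the cost of a wrong class.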
• Train the regression network by minimizing the derived cost-sensitive loss (CSL) over the training set S:

    L_CSL = Σ_{n=1}^N Σ_{k=1}^K δ_{n,k}

• Prediction: g(x) ≡ argmin_{1≤k≤K} r_k(x)

Cost-sensitive Autoencoder (CAE)
• Autoencoder (AE): pre-trains a fully-connected neural network (FCNN) for regular classification
• Cost-sensitive autoencoder (CAE): pre-trains the deep network for cost-sensitive classification

  Autoencoder (AE)
  - Goal: reconstruct the original input x
  - Reconstruction error measured by the cross-entropy loss L_CE

  Cost-sensitive autoencoder (CAE)
  - Goal: reconstruct both the original input x and the cost information C(y, ·)
  - Mixture of reconstruction errors: L_CAE = (1 − β) · L_CE + β · L_CSL

Cost-aware Experiments
• FCNN: traditional fully-connected neural network for regular classification
• FCNN_CSL: the fully-connected regression network trained with the loss function L_CSL
• CSDNN: the proposed cost-sensitive deep neural network (CAE pre-training + CSL training)

Conclusions
• CSL makes any deep model cost-sensitive (see the paper for a CNN with CSL)
• CSDNN = CAE pre-training + CSL training: both techniques lead to significant improvements
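The prediction rule and the CAE mixture loss can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `predict` and `cae_loss` are hypothetical, the cross-entropy term assumes inputs scaled to [0, 1] (as with sigmoid reconstructions), and `l_csl` stands in for a precomputed L_CSL value on the cost outputs.

```python
import numpy as np

def predict(r):
    """Cost-sensitive prediction g(x): the class with the smallest estimated cost."""
    return np.argmin(r, axis=1)

def cae_loss(x, x_hat, l_csl, beta=0.1):
    """Mixture reconstruction error of the CAE: (1 - beta) * L_CE + beta * L_CSL.

    x     : original inputs, assumed scaled to [0, 1]
    x_hat : reconstructed inputs in (0, 1)
    l_csl : cost-sensitive loss on the reconstructed cost information (scalar)
    beta  : mixing weight in [0, 1]; beta = 0 recovers the ordinary autoencoder
    """
    eps = 1e-12  # numerical safety inside the logarithms
    l_ce = -np.sum(x * np.log(x_hat + eps) + (1 - x) * np.log(1 - x_hat + eps))
    return (1 - beta) * l_ce + beta * l_csl
```

The design point of the mixture is that even at pre-training time the hidden representation is shaped by the cost information, not only by reconstruction fidelity, which is what distinguishes the CAE from a plain AE.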