Feature Selection with Kernel Class Separability. Advisor: 王振興. Institute of Electrical Engineering: N28961523 林哲偉, N26974164 曾信輝, N26974172 吳俐瑩. Date: 2009.01.14


Lei Wang, “Feature selection with kernel class separability,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1534-1546, 2008.


Outline

• Introduction

• Feature Selection

• Feature Selection Criterion

• Characteristic Analysis

• Experimental Results

• Conclusions

• Future work


Introduction

• Classification can often benefit from efficient feature selection.

• A class separability criterion is developed in a high-dimensional kernel space.

• The criterion is applied to a variety of selection modes using different search strategies.


Feature Selection

• Feature selection often consists of a selection criterion and a search strategy.

• In this paper, the author compares five selection criteria and three search strategies.

• The author ran 30 trials for each.
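
The split between a selection criterion and a search strategy can be sketched in a few lines. The slides do not name the three search strategies, so sequential forward selection is used here purely as an assumed illustration; `forward_selection`, `toy_scores`, and `toy_criterion` are hypothetical names, and the toy criterion stands in for any of the five criteria.

```python
def forward_selection(n_features, k, criterion):
    """Greedily pick up to k feature indices that maximize criterion(subset)."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Try adding each unused feature and keep the one that scores best.
        best = max(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-feature scores, used only to make the sketch runnable.
toy_scores = [0.1, 0.9, 0.5, 0.7]


def toy_criterion(subset):
    # Stand-in criterion: a subset scores the sum of its feature scores.
    return sum(toy_scores[f] for f in subset)


chosen = forward_selection(n_features=4, k=2, criterion=toy_criterion)
print(chosen)
```

Because the criterion is passed in as a function, the same search loop can be reused with any selection criterion that scores a feature subset.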


Flow Chart

[Figure: flow chart. Recoverable labels: “30 randomly chosen data”; axis ticks from 10 to 50 in steps of 5.]

Feature Selection Criterion

• Correlation coefficient
– Features with a higher coefficient are considered more relevant
– Cannot handle linearly nonseparable data

• Kolmogorov-Smirnov test
– A lower probability or a higher test value indicates a more discriminative feature
– Needs a sufficient number of samples
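
As a rough illustration of these two criteria, the sketch below scores a single feature against binary class labels. It is a minimal standard-library version written for this summary, not the implementation used in the experiments; the function names and the toy data are assumptions, and the feature is assumed to be non-constant.

```python
import math
from bisect import bisect_right


def pearson_correlation(xs, ys):
    """Pearson correlation between a feature column and the class labels."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for v in a + b:
        cdf_a = bisect_right(a, v) / len(a)  # fraction of a that is <= v
        cdf_b = bisect_right(b, v) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Toy feature whose values differ clearly between the two classes.
feature = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
labels = [0, 0, 0, 1, 1, 1]

r = pearson_correlation(feature, labels)
class0 = [x for x, c in zip(feature, labels) if c == 0]
class1 = [x for x, c in zip(feature, labels) if c == 1]
d = ks_statistic(class0, class1)
print(r, d)
```

Both scores look at one feature at a time, and with only a handful of samples per class the empirical CDFs are coarse, which is one way to read the sample-size caveat above.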


Feature Selection Criterion

• Class separability (non-kernel)
– Simple
– Cannot handle linearly nonseparable data

• Radius-margin bound
– Handles linearly nonseparable data well
– Not computationally efficient

• Kernel class separability
– Performs better than the criteria above
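
The limitation of the non-kernel criterion can be seen on a toy example. The sketch below computes the ratio tr(S_B)/tr(S_W) of between-class to within-class scatter in the input space; the function name and both toy data sets are assumptions made for illustration.

```python
def class_separability(X, y):
    """Return tr(S_B) / tr(S_W) computed directly in the input space."""
    n, d = len(X), len(X[0])
    overall_mean = [sum(row[k] for row in X) / n for k in range(d)]
    tr_sb = 0.0
    tr_sw = 0.0
    for label in sorted(set(y)):
        rows = [X[i] for i in range(n) if y[i] == label]
        mean = [sum(r[k] for r in rows) / len(rows) for k in range(d)]
        # Between-class scatter: class size times squared mean offset.
        tr_sb += len(rows) * sum((mean[k] - overall_mean[k]) ** 2 for k in range(d))
        # Within-class scatter: squared distances of samples to their class mean.
        tr_sw += sum((r[k] - mean[k]) ** 2 for r in rows for k in range(d))
    return tr_sb / tr_sw


# Linearly separable toy data: two clusters far apart along the first axis.
X_sep = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
y_sep = [0, 0, 1, 1]

# Concentric toy data: an inner cluster surrounded by an outer ring.
X_ring = [[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
          [2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
y_ring = [0, 0, 0, 0, 1, 1, 1, 1]

sep_score = class_separability(X_sep, y_sep)
ring_score = class_separability(X_ring, y_ring)
print(sep_score, ring_score)
```

In the concentric case both class means coincide, so the between-class scatter vanishes and the criterion reports no separability even though the classes do not overlap.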


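Characteristic Analysis

• In the “class separability” approach, the criterion is $\mathrm{tr}(S_B)/\mathrm{tr}(S_W)$.

– $\mathrm{tr}(\cdot)$ denotes the trace of a matrix: $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}$.

– $S_W = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (x_{ij} - m_i)(x_{ij} - m_i)^T$, $\quad S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T$.

– Here $c$ is the number of classes, $n_i$ the number of samples in class $i$, $m_i$ the mean of class $i$, and $m$ the overall mean.

• In the “kernel-based class separability” approach, the criterion is $T^{\Phi} = \mathrm{tr}(S_B^{\Phi})/\mathrm{tr}(S_W^{\Phi})$.

– $T^* = \max(T^{\Phi})$

• Using the Gaussian kernel function: $K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$.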
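
The kernel-space traces never require the mapping Φ explicitly. Expanding the squared distances to the class means with the kernel trick gives tr(S_W^Φ) = tr(K) − Σ_i sum(K_ii)/n_i and tr(S_B^Φ) = Σ_i sum(K_ii)/n_i − sum(K)/n, where K is the kernel matrix and K_ii its block for class i. The sketch below follows that standard expansion; it is an illustration under these assumptions rather than the paper's implementation, and the function names, the choice σ = 1, and the toy data are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_class_separability(X, y, sigma=1.0):
    """Return tr(S_B^Phi) / tr(S_W^Phi) using only kernel evaluations."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    total_sum = sum(sum(row) for row in K)
    trace_k = sum(K[i][i] for i in range(n))
    block_term = 0.0
    for label in sorted(set(y)):
        idx = [i for i in range(n) if y[i] == label]
        # Sum of the within-class kernel block, divided by the class size.
        block_term += sum(K[i][j] for i in idx for j in idx) / len(idx)
    tr_sb = block_term - total_sum / n
    tr_sw = trace_k - block_term
    return tr_sb / tr_sw


# Concentric toy data: both class means sit at the origin in input space.
X_ring = [[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
          [2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
y_ring = [0, 0, 0, 0, 1, 1, 1, 1]

kernel_score = kernel_class_separability(X_ring, y_ring, sigma=1.0)
print(kernel_score)
```

On this data the input-space criterion tr(S_B)/tr(S_W) is zero because the class means coincide, while the kernel version is clearly positive. Building K takes n² kernel evaluations, so the cost of this sketch grows quadratically with the number of data points.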


Experimental Results

• Synthetic dataset: 600 data points, 52 features, 2 classes


Implementation


Time Cost

SVM Classifier

• Use the SVM test error to evaluate the significance of the kernel class separability measure (KCSM) and the radius-margin bound (RMB).


Conclusions and Discussions

• In our simulation results, the proposed kernel-based class separability measure is the best choice for feature selection among the five measures compared.

• However, its time cost increases dramatically as the number of data points grows.


Future work


• US Postal Service (USPS) dataset

7291 training samples and 2007 test samples. Each sample is characterized by 256 features.

We will try to apply the method to the USPS dataset for further investigation.
