Feature Selection Robot Image Credit: Viktoriya Sukhanova © 123RF.com
Feature Selection - Georgia Institute of Technology · Results of sequential forward feature selection for classification of a satellite image using 28 features. x-axis shows the

Jun 26, 2020

dariahiddleston
Feature Selection

Feature Selection

• Given a set of n features, the goal of feature selection is to select a subset of d features (d < n) in order to minimize the classification error.

• Why perform feature selection?
– Data interpretation/knowledge discovery (insight into which factors are most representative of your problem)
– Curse of dimensionality (the amount of data needed grows exponentially with the number of features, O(2^n))

• Fundamentally different from dimensionality reduction (to be discussed next time), which is based on feature combinations (i.e., feature extraction).
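As a rough illustration of the exponential growth mentioned above, the sketch below counts the cells of a grid in which every feature is split into a fixed number of bins. The function name `grid_cells` and the choice of two bins per feature are assumptions made for this example; they are not part of the slides.

```python
def grid_cells(n_features, bins_per_feature=2):
    """Number of cells when every feature is discretized into the same number of bins."""
    return bins_per_feature ** n_features

for n in (2, 10, 28):
    # With two bins per feature this is 2**n, matching the O(2^n) growth.
    print(n, grid_cells(n))
```

Covering every cell with even one training sample quickly becomes infeasible as n grows, which is one reason to keep d small.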

Feature Selection vs. Dimensionality Reduction

• Feature Selection
– When classifying novel patterns, only a small number of features need to be computed (i.e., faster classification).
– The measurement units (length, weight, etc.) of the features are preserved.

• Dimensionality Reduction (next time)
– When classifying novel patterns, all features need to be computed.
– The measurement units (length, weight, etc.) of the features are lost.

Feature Selection Steps

• Feature selection is an optimization problem.

– Step 1: Search the space of possible feature subsets.

– Step 2: Pick the subset that is optimal or near-optimal with respect to some objective function.

Feature Selection Steps (cont’d)

• Search strategies
– Optimal
– Heuristic

• Evaluation strategies
– Filter methods
– Wrapper methods

Evaluation Strategies

• Filter Methods
– Evaluation is independent of the classification algorithm.
– The objective function evaluates feature subsets by their information content, typically interclass distance, statistical dependence, or information-theoretic measures.
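A minimal sketch of a filter-style criterion, assuming a two-class problem and scoring each feature on its own with a Fisher-like interclass distance (squared difference of the class means divided by the sum of the class variances). The toy data and the function name `fisher_score` are hypothetical; the slides do not prescribe a specific measure, and a real filter may score whole subsets rather than single features.

```python
from statistics import mean, variance

def fisher_score(X, y, feature):
    """Interclass distance of one feature for labels 0/1 (higher is better)."""
    a = [row[feature] for row, label in zip(X, y) if label == 0]
    b = [row[feature] for row, label in zip(X, y) if label == 1]
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]

scores = [fisher_score(X, y, f) for f in range(2)]
ranking = sorted(range(2), key=lambda f: scores[f], reverse=True)
print(scores, ranking)
```

No classifier is trained here, which is what makes the evaluation independent of the classification algorithm.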

Evaluation Strategies

• Wrapper Methods
– Evaluation uses criteria related to the classification algorithm.
– The objective function is a pattern classifier, which evaluates feature subsets by their predictive accuracy (recognition rate on test data) by statistical resampling or cross-validation.
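A minimal sketch of a wrapper-style criterion, assuming a nearest-centroid classifier and a simple interleaved k-fold split. Both choices, the toy data, and the name `cv_accuracy` are assumptions for illustration; any classifier and resampling scheme could take their place.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def cv_accuracy(X, y, subset, k=3):
    """Cross-validated accuracy of a nearest-centroid classifier on the given features."""
    Xs = [[row[f] for f in subset] for row in X]
    n, correct = len(Xs), 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        # Fit: one centroid per class, using training samples only.
        centroids = {
            label: centroid([Xs[i] for i in train if y[i] == label])
            for label in {y[i] for i in train}
        }
        # Predict: assign each held-out sample to the closest centroid.
        for i in test:
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(Xs[i], centroids[c])),
            )
            correct += int(pred == y[i])
    return correct / n

# Toy data (made up): feature 0 is informative, feature 1 is noise.
X = [[1.0, 9.0], [3.0, 1.0], [1.2, 2.0], [3.2, 8.0], [0.8, 5.0], [2.8, 4.0]]
y = [0, 1, 0, 1, 0, 1]

print(cv_accuracy(X, y, [0]), cv_accuracy(X, y, [1]))
```

Every call retrains the classifier k times, which is why wrapper evaluation is much more expensive than a filter score.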

Filter vs. Wrapper Approaches

Filter vs. Wrapper Approaches

Search Strategies

• Assuming n features, an exhaustive search would require:
– Examining all $\binom{n}{d}$ possible subsets of size d.
– Selecting the subset that performs the best according to the criterion function.

• The number of subsets grows combinatorially, making exhaustive search impractical.

• In practice, heuristics are used to speed up the search, but they cannot guarantee optimality.
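The exhaustive search described above can be sketched as follows. The criterion `toy_criterion` is a made-up stand-in for a real criterion function, and the 28-feature count is borrowed from the satellite-image example only to show how fast the number of subsets grows.

```python
from itertools import combinations
from math import comb

def exhaustive_search(n, d, criterion):
    """Score every size-d subset of features 0..n-1 and return the best one."""
    best_subset, best_score = None, float("-inf")
    for subset in combinations(range(n), d):
        score = criterion(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score

def toy_criterion(subset):
    """Hypothetical criterion: features 1 and 3 are only useful together."""
    s = set(subset)
    return 1.0 if {1, 3} <= s else 0.1 * len(s & {0, 2})

print(exhaustive_search(5, 2, toy_criterion))
print(comb(28, 10))  # number of size-10 subsets of 28 features
```

Each of those subsets would need its own criterion evaluation, so the cost is dominated by the binomial coefficient rather than by the cost of a single evaluation.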

Naïve Search

• Sort the given n features in order of their probability of correct recognition.

• Select the top d features from this sorted list.

• Disadvantage
– Correlation among features is not considered.
– The best pair of features may not even contain the best individual feature.
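The disadvantage above can be seen on a small made-up example. The score table `SCORES` is hypothetical: it stands in for recognition rates measured on individual features and on pairs, with features 0 and 1 assumed to be strongly correlated.

```python
from itertools import combinations

# Hypothetical recognition rates for every subset of size 1 and 2.
SCORES = {
    frozenset({0}): 0.70, frozenset({1}): 0.60, frozenset({2}): 0.55,
    frozenset({0, 1}): 0.72,  # correlated features add little to each other
    frozenset({0, 2}): 0.74,
    frozenset({1, 2}): 0.90,  # complementary features
}

def criterion(subset):
    return SCORES[frozenset(subset)]

def naive_select(n, d, criterion):
    """Rank features individually and keep the top d."""
    ranked = sorted(range(n), key=lambda f: criterion([f]), reverse=True)
    return sorted(ranked[:d])

naive = naive_select(3, 2, criterion)
best_pair = max(combinations(range(3), 2), key=criterion)
print(naive, criterion(naive), best_pair, criterion(best_pair))
```

Here the naive choice keeps features 0 and 1, while the best pair is (1, 2), which does not contain the best individual feature.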

Sequential forward selection (SFS) (heuristic search)

• First, the best single feature is selected (i.e., using some criterion function).

• Then, pairs of features are formed using one of the remaining features and this best feature, and the best pair is selected.

• Next, triplets of features are formed using one of the remaining features and these two best features, and the best triplet is selected.

• This procedure continues until a predefined number of features are selected.

SFS performs best when the optimal subset is small.
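The SFS procedure can be sketched in a few lines. The criterion below is a hypothetical stand-in (additive feature weights with a penalty when the redundant features 0 and 2 are both present); in practice it would be a filter score or a wrapper accuracy.

```python
WEIGHTS = [0.5, 0.4, 0.45, 0.1]  # made-up individual usefulness of each feature

def criterion(subset):
    """Hypothetical criterion: sum of weights, minus a redundancy penalty for {0, 2}."""
    s = set(subset)
    score = sum(WEIGHTS[f] for f in sorted(s))
    if {0, 2} <= s:
        score -= 0.3
    return score

def sfs(n, d, criterion):
    """Greedily grow a subset from the empty set until it holds d features."""
    selected = []
    remaining = set(range(n))
    while len(selected) < d:
        # Try each remaining feature together with everything selected so far.
        best = max(sorted(remaining), key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

print(sfs(4, 2, criterion))
```

Each step evaluates only the remaining features, so the total number of criterion calls is far smaller than in an exhaustive search, at the price of never revisiting an earlier choice.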

Example

Results of sequential forward feature selection for classification of a satellite image using 28 features. The x-axis shows the classification accuracy (%) and the y-axis shows the features added at each iteration (the first iteration is at the bottom). The highest accuracy value is shown with a star.

Sequential backward selection (SBS) (heuristic search)

• First, the criterion function is computed for all n features.

• Then, each feature is deleted one at a time, the criterion function is computed for all subsets with n-1 features, and the worst feature is discarded.

• Next, each feature among the remaining n-1 is deleted one at a time, and the worst feature is discarded to form a subset with n-2 features.

• This procedure continues until a predefined number of features are left.

SBS performs best when the optimal subset is large.
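A matching sketch of SBS, under the same kind of assumption: the criterion is a hypothetical stand-in (additive weights with a redundancy penalty for features 0 and 2), not the one used in the lecture.

```python
WEIGHTS = [0.5, 0.4, 0.45, 0.1]  # made-up individual usefulness of each feature

def criterion(subset):
    """Hypothetical criterion: sum of weights, minus a redundancy penalty for {0, 2}."""
    s = set(subset)
    score = sum(WEIGHTS[f] for f in sorted(s))
    if {0, 2} <= s:
        score -= 0.3
    return score

def sbs(n, d, criterion):
    """Greedily shrink the full feature set until only d features are left."""
    selected = list(range(n))
    while len(selected) > d:
        # The "worst" feature is the one whose removal leaves the best subset.
        worst = max(
            selected,
            key=lambda f: criterion([g for g in selected if g != f]),
        )
        selected.remove(worst)
    return selected

print(sbs(4, 2, criterion))
```

Because the search starts from all n features, the early evaluations involve large subsets, which is costly when the target d is small.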

Example

Results of sequential backward feature selection for classification of a satellite image using 28 features. The x-axis shows the classification accuracy (%) and the y-axis shows the features removed at each iteration (the first iteration is at the top). The highest accuracy value is shown with a star.

Bidirectional Search (BDS)

• BDS applies SFS and SBS simultaneously:
– SFS is performed from the empty set.
– SBS is performed from the full set.

• To guarantee that SFS and SBS converge to the same solution:
– Features already selected by SFS are not removed by SBS.
– Features already removed by SBS are not added by SFS.
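One possible reading of BDS is sketched below. The slides do not give a stopping rule, so this version assumes the two searches alternate single steps and stop when the forward and backward subsets coincide; the criterion is again a hypothetical stand-in.

```python
WEIGHTS = [0.5, 0.4, 0.45, 0.1]  # made-up individual usefulness of each feature

def criterion(subset):
    """Hypothetical criterion: sum of weights, minus a redundancy penalty for {0, 2}."""
    s = set(subset)
    score = sum(WEIGHTS[f] for f in sorted(s))
    if {0, 2} <= s:
        score -= 0.3
    return score

def bds(n, criterion):
    """Alternate SFS and SBS steps until both searches hold the same subset."""
    forward = []               # grown by SFS from the empty set
    backward = list(range(n))  # shrunk by SBS from the full set
    while len(forward) < len(backward):
        # SFS step: only features that SBS has not removed may be added.
        candidates = [f for f in backward if f not in forward]
        best = max(candidates, key=lambda f: criterion(forward + [f]))
        forward.append(best)
        if len(forward) == len(backward):
            break
        # SBS step: only features that SFS has not selected may be removed.
        removable = [f for f in backward if f not in forward]
        worst = max(
            removable,
            key=lambda f: criterion([g for g in backward if g != f]),
        )
        backward.remove(worst)
    return sorted(forward)

print(bds(4, criterion))
```

The two constraints keep the forward subset inside the backward subset at every step, so the sizes meet in the middle and the two sets end up identical.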

Limitations of SFS and SBS

• The main limitation of SFS is that it is unable to remove features that become non-useful after the addition of other features.

• The main limitation of SBS is its inability to reevaluate the usefulness of a feature after it has been discarded.

• We will examine some generalizations of SFS and SBS:
– “Plus-L, minus-R” selection (LRS)
– Sequential floating forward/backward selection (SFFS and SFBS)

“Plus-L, minus-R” selection (LRS)

• A generalization of SFS and SBS
– If L > R, LRS starts from the empty set and:
  • Repeatedly adds L features
  • Repeatedly removes R features
– If L < R, LRS starts from the full set and:
  • Repeatedly removes R features
  • Repeatedly adds L features

Its main limitation is the lack of a theory to help choose the optimal values of L and R.
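A minimal sketch of the L > R case only. To keep the loop simple it assumes that d is a multiple of L - R and that d + R does not exceed n; the criterion is hypothetical and is built so that feature 0 looks best alone but hurts once features 1 and 2 are both present.

```python
WEIGHTS = [0.5, 0.4, 0.35, 0.1]  # made-up individual usefulness of each feature

def criterion(subset):
    """Hypothetical criterion with a synergy between 1 and 2 that feature 0 spoils."""
    s = set(subset)
    score = sum(WEIGHTS[f] for f in sorted(s))
    if {1, 2} <= s:
        score += 0.5
    if {0, 1, 2} <= s:
        score -= 0.6
    return score

def plus_l_minus_r(n, d, L, R, criterion):
    """Plus-L, minus-R selection starting from the empty set (L > R)."""
    if not (L > R >= 0) or d % (L - R) != 0 or d + R > n:
        raise ValueError("sketch needs L > R, d a multiple of L - R, and d + R <= n")
    selected = []
    while len(selected) < d:
        for _ in range(L):  # plus-L: L forward (SFS) steps
            candidates = [f for f in range(n) if f not in selected]
            best = max(candidates, key=lambda f: criterion(selected + [f]))
            selected.append(best)
        for _ in range(R):  # minus-R: R backward (SBS) steps
            worst = max(
                selected,
                key=lambda f: criterion([g for g in selected if g != f]),
            )
            selected.remove(worst)
    return sorted(selected)

print(plus_l_minus_r(4, 2, 2, 1, criterion))
```

On this criterion the backward step is what allows feature 0, picked first, to be dropped again, which plain SFS cannot do.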

Sequential floating forward/backward selection (SFFS and SFBS)

• An extension to LRS:
– Rather than fixing the values of L and R, floating methods determine these values from the data.
– The dimensionality of the subset during the search can be thought to be “floating” up and down.

• Two floating methods:
– Sequential floating forward selection (SFFS)
– Sequential floating backward selection (SFBS)

P. Pudil, J. Novovicova, J. Kittler, Floating search methods in feature selection, Pattern Recognition Lett. 15 (1994) 1119–1125.

Sequential floating forward selection (SFFS)

• Sequential floating forward selection (SFFS) starts from the empty set.

• After each forward step, SFFS performs backward steps as long as the objective function increases.
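A simplified sketch of SFFS. It interprets "as long as the objective function increases" as: accept a backward step only if the reduced subset beats the best subset of that size seen so far, which is one common formulation and is an assumption here. The criterion is a hypothetical stand-in in which feature 0 looks best alone but hurts once features 1 and 2 are both present.

```python
WEIGHTS = [0.5, 0.4, 0.35, 0.1]  # made-up individual usefulness of each feature

def criterion(subset):
    """Hypothetical criterion with a synergy between 1 and 2 that feature 0 spoils."""
    s = set(subset)
    score = sum(WEIGHTS[f] for f in sorted(s))
    if {1, 2} <= s:
        score += 0.5
    if {0, 1, 2} <= s:
        score -= 0.6
    return score

def sffs(n, d, criterion):
    """Floating forward selection: one forward step, then conditional backward steps."""
    selected = []
    best = {}  # subset size -> best criterion value seen at that size
    while len(selected) < d:
        # Forward step: add the most useful remaining feature (as in SFS).
        candidates = [f for f in range(n) if f not in selected]
        added = max(candidates, key=lambda f: criterion(selected + [f]))
        selected = selected + [added]
        size = len(selected)
        best[size] = max(best.get(size, float("-inf")), criterion(selected))
        # Backward steps: remove features while that improves on the best
        # subset of the smaller size found so far.
        while len(selected) > 2:
            reduced = max(
                ([g for g in selected if g != f] for f in selected),
                key=criterion,
            )
            if criterion(reduced) <= best.get(len(reduced), float("-inf")):
                break
            selected = reduced
            best[len(selected)] = criterion(selected)
    return sorted(selected)

print(sffs(4, 3, criterion))
```

With this criterion the search adds features 0, 1, 2, then floats back down by dropping feature 0, and finally adds feature 3. SFBS mirrors the same idea, starting from the full set with the roles of the forward and backward steps swapped.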

Sequential floating backward selection (SFBS)

• Sequential floating backward selection (SFBS) starts from the full set.

• After each backward step, SFBS performs forward steps as long as the objective function increases.

Feature Selection using GAs (randomized search)

[Diagram with blocks: Data, Feature Extraction, Feature Selection (GA), Feature Subset, Classifier]

• GAs provide a simple, general, and powerful framework for feature selection.

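Feature Selection Using GAs (cont’d)

• Binary encoding (one bit per feature, positions 1 to N): 1 means “choose feature” and 0 means “do not choose feature”.

• Fitness evaluation (to be maximized):

$$\text{Fitness} = w_1 \times \text{accuracy} + w_2 \times \#\text{zeros}, \qquad w_1 \gg w_2$$

– accuracy: classification accuracy using a validation set
– #zeros: number of zeros in the encoding, which reflects the number of features (more zeros means fewer selected features)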
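The fitness function can be written directly from the formula. The default weights below (`w1=1e4`, `w2=0.4`) are assumed values chosen only to satisfy w1 >> w2, and the accuracies in the example are invented rather than measured.

```python
def selected_features(chromosome):
    """Decode a binary chromosome into the indices of the chosen features."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]

def fitness(chromosome, accuracy, w1=1e4, w2=0.4):
    """Fitness = w1 * accuracy + w2 * (number of zeros), to be maximized."""
    zeros = chromosome.count(0)
    return w1 * accuracy + w2 * zeros

# Same (invented) validation accuracy: the chromosome with fewer features wins.
small = fitness([1, 0, 1, 0, 0], accuracy=0.90)
large = fitness([1, 1, 1, 0, 1], accuracy=0.90)

# Because w1 >> w2, a slightly higher accuracy outweighs any saving in features.
accurate = fitness([1, 1, 1, 1, 1], accuracy=0.91)
sparse = fitness([1, 0, 0, 0, 0], accuracy=0.90)

print(selected_features([1, 0, 1, 0, 0]), small > large, accurate > sparse)
```

The second term therefore acts only as a tie-breaker between subsets of similar accuracy.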

Feature Selection Summary

• Has the two-fold advantage of providing some interpretation of the data and making the learning problem easier.

• Finding the global optimum is impractical in most situations; rely on heuristics instead (greedy/random search).

• Filtering is fast and general but can pick a large number of features.

• Wrapping considers model bias but is MUCH slower due to training multiple models.