Receiver Operating Characteristic (ROC) Curves
Evaluating a classifier and predictive performance


Transcript
  • Receiver Operating Characteristic (ROC) Curves

    Evaluating a classifier and predictive performance

    [Figure: ROC curve. x-axis: 1-Specificity (i.e. % of true negatives incorrectly declared positive), the false positive rate. y-axis: Sensitivity (% of true positives declared positive), the true positive rate. Both axes run 0.0 to 1.0.]
  • Classification (supervised learning)

    •  In supervised learning, we are interested in building a model to predict the class (or outcome) of a new observation based on observable predictors, using the train-set/test-set framework.

    – Which students are not likely to return for their second year of college?

    – Which animals admitted to a veterinary hospital are likely to survive (or not survive)?

    – Which bank customers are likely to default on a loan?

  • Some classification procedures:

    •  Logistic regression (0-1 response)
    – Choose variables, fit model, calculate the predicted probability p̂
    – p̂ ≥ 0.5: predicted class 1; p̂ < 0.5: predicted class 0

    •  Classification tree
    – Use a 'greedy' algorithm to choose variables and create a decision tree (prediction at tree leaves)

    •  Linear Discriminant Analysis (LDA)
    – Built on multivariate normal (spheroids)
    – Classify a new observation to the nearest centroid

    (A minimal R sketch of all three procedures is given below.)
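    As a concrete illustration, here is a minimal R sketch of the three procedures on hypothetical data; the data frame `d`, response `y`, and predictors `x1`, `x2` are placeholder names, not from the slides, and `rpart`/`MASS` are one standard choice of packages.

    ## Hypothetical training data: 0-1 response y, numeric predictors x1, x2.
    set.seed(1)
    n <- 200
    d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    d$y <- rbinom(n, 1, plogis(-0.5 + 1.5*d$x1 - 1.0*d$x2))

    ## Logistic regression: p-hat >= 0.5 -> class 1, else class 0.
    fit.lr   <- glm(y ~ x1 + x2, family = binomial, data = d)
    p.hat    <- predict(fit.lr, type = "response")
    class.lr <- ifelse(p.hat >= 0.5, 1, 0)

    ## Classification tree (greedy recursive partitioning):
    library(rpart)
    fit.tree   <- rpart(factor(y) ~ x1 + x2, data = d, method = "class")
    class.tree <- predict(fit.tree, type = "class")

    ## Linear discriminant analysis (classify to the nearest centroid
    ## under a common covariance matrix):
    library(MASS)
    fit.lda   <- lda(y ~ x1 + x2, data = d)
    class.lda <- predict(fit.lda)$class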

  • Assumptions:

    •  Logistic regression
    – Predictors are related to the response with a sigmoidal mean structure
    – Conditional Bernoulli given the mean

    •  Classification tree (maybe?)
    – The data can be described by features
    – There exists some tree that isn't too big that can predict reasonably well

    •  Linear discriminant analysis
    – Data follow a multivariate Gaussian
    – Equal covariance matrices

  • Pros/Cons of these classifiers:

    NOTE: In simulation studies, we have found that each of these classifiers performs better than the others when the data were simulated under the given model assumptions.

                          Pros                                         Cons
    Logistic Regression   Interpretation of covariates. More robust    Computationally complex iterative algorithm,
                          to deviations from assumptions than LDA.     and may not converge. Depends on sigmoidal mean
                                                                       structure and conditional binomial distributions.
    Trees                 Fewer assumptions about the data, more       Instability (random forests can help), lack of
                          flexible algorithm, easy interpretation      smoothness, uses a greedy algorithm.
                          of certain characteristics.
    LDA                   Dimensionality reduction; estimation         Not flexible when the data deviate from the
                          under the assumptions uses maximum           multivariate normal assumptions.
                          likelihood.

    (A small simulation sketch along the lines of the NOTE follows.)
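    A minimal sketch of one such simulation check, under assumptions I am supplying (two Gaussian classes with a common covariance, i.e. LDA's assumptions), comparing test-set error rates for LDA and logistic regression; all object names are placeholders.

    ## Simulate two Gaussian classes with equal covariance:
    library(MASS)
    set.seed(2)
    n <- 400
    y <- rbinom(n, 1, 0.5)
    X <- matrix(rnorm(2*n), n, 2) + 1.2 * cbind(y, y)   # mean shift by class
    d <- data.frame(y = y, x1 = X[,1], x2 = X[,2])
    train <- d[1:200, ]; test <- d[201:400, ]

    ## LDA vs. logistic regression, test-set misclassification rates:
    class.lda <- predict(lda(y ~ x1 + x2, data = train), test)$class
    p.lr <- predict(glm(y ~ x1 + x2, family = binomial, data = train),
                    test, type = "response")
    mean(class.lda != test$y)                  # LDA error rate
    mean(ifelse(p.lr >= 0.5, 1, 0) != test$y)  # logistic error rate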

  • What makes a good classifier?

    •  Your predictions are correct.
    – True positives were correctly classified
    – True negatives were correctly classified

    •  Example: Disease diagnostic test
    – Patients are given the diagnostic test
    – Test gives those with the disease a 'positive'
    •  High sensitivity (correctly classifying true positives)
    – Test gives those without the disease a 'negative'
    •  High specificity (correctly classifying true negatives)

  • Misclassification rate

    •  Misclassification
    – The observation belongs to one class, but the model classifies it as a member of a different class.

    •  A classifier that makes no errors would be perfect
    – Not likely to find such a classifier in the real world
    – Noise
    – Other relevant variables not in the available dataset

  • Misclassification rate

    •  We can present the accuracy of the results with a confusion matrix (a worked R version of this arithmetic follows).

                  Predict as 1   Predict as 0
    Actual 1            20              5
    Actual 0            18            957

    •  Overall error rate = (18 + 5)/1000 = 2.3%

    •  NOTE: The predicted values are often calculated through leave-one-out cross-validation in logistic regression.
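    A minimal R sketch reproducing the arithmetic above; the counts come from the slide, and `conf` is a placeholder name.

    ## Confusion matrix from the slide: rows = actual, columns = predicted.
    conf <- matrix(c(20,   5,
                     18, 957),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Actual  = c("1", "0"),
                                   Predict = c("1", "0")))
    n.total  <- sum(conf)                        # 1000
    n.errors <- conf["1", "0"] + conf["0", "1"]  # 5 + 18 = 23
    n.errors / n.total                           # overall error rate = 0.023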

  • Beating the naïve classifier

    •  In this dataset, the 0's were much more prevalent than the 1's.

    •  What if we just classified ALL observations as 0? How well do we do?

    •  Overall error rate = 25/1000 = 2.5%

                  Predict as 1   Predict as 0
    Actual 1            20              5
    Actual 0            18            957

  • Distinguishing between the two types of mistakes to be made

    •  Sensitivity = (# of true positives declared 'positive') / (# of true positives)
       Is your disease diagnostic tool getting the ones you want?

    •  Specificity = (# of true negatives not declared 'positive') / (# of true negatives)
       Is your disease diagnostic tool NOT getting the ones you DON'T want?

    •  For a given classifier, we would like both to be high. (A small R sketch follows.)
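    Continuing the hypothetical `conf` matrix defined in the earlier sketch:

    ## Sensitivity: fraction of true positives (Actual 1) declared positive.
    sens <- conf["1", "1"] / sum(conf["1", ])   # 20/25 = 0.80
    ## Specificity: fraction of true negatives (Actual 0) NOT declared positive.
    spec <- conf["0", "0"] / sum(conf["0", ])   # 957/975 ≈ 0.9815
    c(sensitivity = sens, specificity = spec)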

  • Receiver Operating Characteristic (ROC) curve

    •  To create an ROC curve, we first order the predicted probabilities from highest to lowest.

    •  The highest probabilities are predicted to have the disease (we'll want to classify those as 'disease').

    •  The lowest probabilities are predicted to not have the disease (we'll want to classify those as 'not disease').

    •  Probability = 0.5? Flip of the coin.

  • ROC curve

    •  To create an ROC curve, we start at the highest predicted probability and work our way down the list (i.e. highest prob to lowest prob). We stop at each position C (i.e. potential cutoff) and use it as the classifier (i.e. C or above = 1, below C = 0), and ask "How well does this classifier do? Sensitivity? Specificity?"

          preds
    [1,] 0.9299   ← Most likely to have disease
    [2,] 0.9033
    [3,] 0.8918
    [4,] 0.8851
    [5,] 0.8687
      ...

    For example, if we classify cases with C = 0.8918 and higher as '1' and probabilities less than C = 0.8918 as '0', how well do we do?

  • ROC curve

    •  NOTE: If we start at the top of the list and we don't go very far down the list for C, we'll probably get mostly true positives (i.e. low false positive rate, good), but there are still lots of true positives that we missed (low true positive rate, bad).

          preds
    [1,] 0.9299   ← Most likely to have disease
    [2,] 0.9033
    [3,] 0.8918
    [4,] 0.8851
    [5,] 0.8687
      ...
    (Cases at or above C are classified as '1'; cases below C as '0'.)

  • The ROC curve shows us what happens to sensitivity and specificity as we move our C (threshold for the classifier) down the list.

    C at a high predicted probability (very few cases declared 'disease'): low false positive rate (good), low true positive rate (bad). This is the sensitivity and specificity info from the "first" C threshold.

    C at a low predicted probability (almost all cases declared 'disease'): high false positive rate (bad), high true positive rate (good). This is the sensitivity and specificity info from the "last" C threshold.

    [Figure: ROC curve, false positive rate (x-axis) vs. true positive rate (y-axis), annotated at both ends.]

  • [Figure: ROC curve, false positive rate (x-axis) vs. true positive rate (y-axis), showing the stair-step pattern.]

    Each move down the list of probabilities coincides with either a move up on the ROC curve (correct choice as 'disease') or to the right (incorrect choice when declared 'disease'). Thus, we see a 'stair-step' phenomenon in the ROC curve. For small datasets, this is very apparent.

    We essentially start at the point (0,0). If the first case (highest probability) is correctly classified when declared 'disease', we move vertically up (true positive). If the first case (highest probability) is incorrectly classified as 'disease', we move to the right (false positive).

    As we start moving down the list, we want to move up on the ROC curve, not to the right. (Section 1a near the end of these slides scripts this construction directly in R.)

  • See an animation of ROC curve creation

    http://homepage.stat.uiowa.edu/~rdecook/stat6220/ROC_animated1.html

  • What ROC shape indicates a good classifier?

    •  We want it to go up vertically very quickly
    –  i.e. As we move down the list of predicted probabilities, we're getting all 'diseased' cases and no 'non-diseased'.

    •  We know in the end (when everyone is classified as a 'disease' case) all the negatives will be misclassified and all the positives will be correctly classified. So, we'll always end at (1,1).

  • Comparing classifiers with ROC analysis

    [Figure: several ROC curves, annotated "Best of these", "Worst of these", and "Choosing at random".]

    The AUC, or area under the curve, also compares the classifiers. (A rank-based sketch of the AUC follows.)
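    The AUC also equals the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case; this is a standard equivalence, not stated on the slides. A minimal R sketch, assuming a 0-1 vector `labels` and predicted probabilities `preds`:

    ## AUC as the rank (Wilcoxon) statistic: the proportion of
    ## positive-negative pairs where the positive case is ranked higher.
    auc.rank <- function(labels, preds) {
      r  <- rank(preds)          # ranks of all predicted probabilities
      n1 <- sum(labels == 1)     # number of positives
      n0 <- sum(labels == 0)     # number of negatives
      (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    }

    ## Tiny usage example with made-up values:
    auc.rank(labels = c(1, 0, 1, 0, 0),
             preds  = c(0.9, 0.3, 0.4, 0.5, 0.1))   # 5/6 ≈ 0.833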

  • ROC curve: Provost's Office Client

    •  Example: Predict which students will not return for their second year of college.

    – We use the predicted probabilities from the logistic regression for classification.

    –  If the probability of a case being in class 1 (not retained) is equal to or greater than 0.5, that case is classified as a 1.

    – Any case with an estimated probability of less than 0.5 would be classified as a 0 (retained).

  • Provost's Office Example

    •  Logistic regression variables
    – Stafford loan
    – Live on campus
    – First generation college student
    – RAI
    – Selective program of study
    – High School GPA

    (A minimal sketch of fitting this kind of model is given below.)
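    A minimal sketch of fitting such a model in R; the data frame `retention` and the variable names below are placeholders for illustration, not the actual Provost's Office data.

    ## Hypothetical data frame 'retention' with the 0-1 response NotRetained
    ## and predictors corresponding to the list on the slide:
    fit <- glm(NotRetained ~ StaffordLoan + LiveOnCampus + FirstGen +
                 RAI + SelectiveProgram + HSGPA,
               family = binomial, data = retention)
    preds <- predict(fit, type = "response")     # predicted probabilities
    pred.outcome <- ifelse(preds >= 0.5, 1, 0)   # 0.5 cutoff classification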

  • Provost's Office Example

    •  The predicted probability of not being retained is used to classify each case.

    •  Students with a very high probability are expected to not return.

    •  Students with a very low probability are expected to return in their 2nd year.

  • Provost's Office Example

    [Figure: "ROC from logistic regression classifier" — false positive rate (x-axis, 0.0 to 1.0) vs. true positive rate (y-axis, 0.0 to 1.0).]

    The ROC curve from the given classifier (logistic regression predicted probabilities)… meh.

    *Plot generated from ROCR package in R.

  • Provost's Office Example

    The blue point represents the classifier based on the probability = 0.5 cutoff.

    Sensitivity: 41/859 = 4.77%
    Specificity: 4762/4799 = 99.23%
    Overall Misclassification: 855/5658 = 0.1511

    [Figure: "ROC from logistic regression classifier" with the 0.5-cutoff point marked in blue.]

    *Point calculated by hand and added.

  • Provost's Office Example

    We can also look at accuracy (1 - misclassification rate) vs. the cutoff. The best cutoff for highest accuracy is 0.4623. (Pretty close to our 0.5 threshold.)

    Sensitivity: 66/859 = 7.68%
    Specificity: 4750/4799 = 98.97%
    Overall Misclassification: 842/5658 = 0.1488

    [Figure: accuracy (y-axis) vs. cutoff (x-axis).]

    *Plot generated from ROCR package in R.

  • Provost's Office Example

    The green point represents the optimal classifier based on equal importance of sensitivity and specificity (as the max of "sensitivity + specificity"), which is the prob = 0.191 cutoff.

    Sensitivity: 55.2%
    Specificity: 77.8%
    Overall Misclassification: 1456/5658 = 0.2573

    [Figure: "ROC from logistic regression classifier" with the optimal point marked in green.]

    *Point determined from Epi package.

  • Provost's Office Example

    Optimal cutoffs can be found from software or calculated. (A short sketch of the calculation follows.)

    [Figure: sensitivity and specificity (y-axis, 0.0 to 1.0) vs. cutoff probability (x-axis), with curves for Sensitivity, Specificity, Sens + Spec, and distance from the top left to the ROC curve. Marked cutoffs: max sens + spec cutoff (0.191), min Euclidean distance cutoff (0.146), our 0.5 cutoff, the cutoff where everyone is declared positive, and the cutoff where no one is declared positive.]
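    A minimal sketch of calculating these two optimal cutoffs by hand, reusing the slides' `preds` and `NotRetained` objects; the grid-of-cutoffs approach and helper names are illustrative assumptions. The max of "sensitivity + specificity" is also known as Youden's criterion.

    ## Sensitivity and specificity at every candidate cutoff:
    cutoffs <- sort(unique(preds), decreasing = TRUE)
    sens <- sapply(cutoffs, function(C) mean(preds[NotRetained == 1] >= C))
    spec <- sapply(cutoffs, function(C) mean(preds[NotRetained == 0] <  C))

    ## Max "sensitivity + specificity" cutoff:
    cutoffs[which.max(sens + spec)]

    ## Min Euclidean distance from the ROC curve to the top-left corner (0, 1):
    dist.topleft <- sqrt((1 - spec)^2 + (1 - sens)^2)
    cutoffs[which.min(dist.topleft)]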

  • Comparing classifiers with ROCs

    The curves may visually "look" different, but are they really? If we collected data again, would they look this different?

  • Bootstrapping the ROC

    Can use bootstrapping to create a confidence band for the sensitivity (given specificity) for the ROC curve. (A bare-bones bootstrap sketch follows.)

    [Figure: "ROC curve with 95% CIs" — Specificity (%) on the x-axis (100 down to 0) vs. Sensitivity (%) on the y-axis (0 to 100), with a confidence band. AUC: 71.8% (69.9%–73.7%).]

    *Plot generated from pROC package in R.
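    To make the mechanics concrete, here is a bare-bones sketch of bootstrapping the AUC itself: resample the cases with replacement and recompute the AUC each time. This is an illustrative assumption about the procedure, not the slides' pROC code, and it reuses the hypothetical auc.rank() helper sketched earlier.

    ## Percentile bootstrap CI for the AUC (B resamples of the cases):
    set.seed(3)
    B <- 2000
    auc.boot <- replicate(B, {
      i <- sample(length(NotRetained), replace = TRUE)  # resample cases
      auc.rank(NotRetained[i], preds[i])
    })
    quantile(auc.boot, c(0.025, 0.975))   # 95% percentile interval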

  • Bootstrapping the ROC

    But the bootstrap results don't always look so nice.

    [Figure: "ROC curve with 95% CIs" for a small dataset — Specificity (%) vs. Sensitivity (%), with a very wide confidence band. AUC: 83.0% (60.4%–100.0%). Shown alongside the raw ROC curve for the same data: 1-Specificity (i.e. % of true negatives incorrectly declared positive) vs. Sensitivity (% of true positives declared positive).]

    *Plot generated from pROC package in R.

  • Bootstrapping ROC curves for comparison

    Semi-transparent coloring in R can come in handy:
    # semi-transparent red:  col="#ff000030"
    # semi-transparent blue: col="#0000ff20"

    [Figure: "ROC curve with 95% Bootstrapped CIs" — two overlaid ROC curves with semi-transparent confidence bands, Specificity (%) vs. Sensitivity (%).]

    *Plot generated from pROC package in R.

  • ############################################
    ## 1a) Manually scripting the ROC curve:
    ############################################

    ## 'NotRetained' is the 0-1 response variable.
    ## 'preds' are the leave-one-out predicted probabilities.

    ## Use 0.5 as the cutoff for classifying:
    ## (round() sends a probability of exactly 0.5 to 0;
    ## ifelse(preds >= 0.5, 1, 0) is the explicit version of the rule.)
    pred.outcome = round(preds)

    ## Gather the data for the ROC curve:
    ROC.info.LogReg = data.frame(NotRetained, preds, pred.outcome)

    ## Put in order by predicted probability, largest first:
    ROC.info.LogReg = ROC.info.LogReg[rev(order(ROC.info.LogReg$preds)), ]

    ## A 'positive' will be considered as someone who was not retained.
    (num.true.pos = sum(NotRetained == 1))
    (num.true.neg = sum(NotRetained == 0))

    ## Empty plot, then trace the stair-step ROC with cumulative counts:
    plot(c(0,1), c(0,1), type="n",
         xlab="1-Specificity (i.e. % of true neg's incorrectly declared positive)",
         ylab="Sensitivity (% of true pos's declared positive)",
         main="ROC curve (logistic regression)",
         sub="False Positive Rate")
    x = cumsum(1 - ROC.info.LogReg$NotRetained)/num.true.neg
    y = cumsum(ROC.info.LogReg$NotRetained)/num.true.pos
    lines(x, y)
    abline(0, 1)   # reference line: classifying at random

  • ####################################
    ## 1b) Manually calculating AUC:
    ####################################

    ## Calculate AUC thinking as geometric trapezoid shapes:
    ## http://stats.stackexchange.com/questions/145566/how-to-calculate-area-under-the-curve-auc-or-the-c-statistic-by-hand

    ## Same x,y labeling as previous slide:
    x = cumsum(1 - ROC.info.LogReg$NotRetained)/num.true.neg
    y = cumsum(ROC.info.LogReg$NotRetained)/num.true.pos

    ## Trapezoid rule: average the heights at consecutive x's,
    ## multiply by the step widths, and sum.
    n = length(y)    # number of points on the curve
    height = (y[-1] + y[-n])/2
    width  = diff(x)
    sum(height * width)

  • ##########################################################
    ## 1c) Manually calculating sensitivity, specificity,  ##
    ## misclassification rate for 0.5 threshold:           ##
    ##########################################################

    ## All the needed info is in the following table:
    table(NotRetained, pred.outcome)
    #            pred.outcome
    # NotRetained    0    1
    #           0 4762   37
    #           1  818   41

  • ################################################
    ## 2) ROCR package for creating ROC curve:
    ################################################
    library(ROCR)

    pred = prediction(preds, NotRetained)
    perf = performance(pred, "tpr", "fpr")
    plot(perf, main="ROC from logistic regression classifier")

    (AUC.ROCR = performance(pred, "auc")@y.values[[1]])

    ## Plot accuracy (1-misclassification rate) vs. cutoff:
    acc = performance(pred, "acc")
    (ac.val = max(unlist(acc@y.values)))
    # [1] 0.8511842

    ## Cutoff(s) achieving the maximum accuracy:
    th = unlist(acc@x.values)[unlist(acc@y.values) == ac.val]
    plot(acc)
    abline(v=th, col='grey', lty=2)
    th
    # [1] 0.462324

  • ################################################
    ## 3) pROC package for bootstrapped ROC:
    ################################################
    library(pROC)

    ## 'NotRetained' is the 0-1 response variable.
    ## 'preds' are the leave-one-out predicted probabilities.

    ## Use 0.5 as the cutoff for classifying:
    pred.outcome = round(preds)

    rocobj = plot.roc(NotRetained, preds, main="ROC curve with 95% CIs",
                      percent=TRUE, ci=TRUE, print.auc=TRUE)

    ## Calculate CI of sensitivity at a select set of
    ## specificities and form a 'band' (might take a bit):
    ciobj = ci.se(rocobj, specificities=seq(0, 100, 5))
    plot(ciobj, type="shape", col="#1c61b6AA")   # blue band

    ## Use bootstrap to add CI in both directions at the "best" cutoff:
    plot(ci(rocobj, of="thresholds", thresholds="best"), col="yellow", lwd=2)
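    As an aside, pROC can also report the "best" threshold numerically via coords(); by default the "best" criterion is Youden's index (worth confirming against your pROC version's documentation):

    ## Numeric report of the optimal cutoff and its sensitivity/specificity:
    coords(rocobj, x = "best", best.method = "youden")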

  • ## Overlay Bootstrapped ROC's:
    rocobj = plot.roc(NotRetained, preds,
                      main="ROC curve with 95% Bootstrapped CIs",
                      percent=TRUE, ci=TRUE, print.auc=FALSE)
    ciobj = ci.se(rocobj, specificities=seq(0, 100, 5))
    plot(ciobj, type="shape", col="#ff000030")   # semi-transparent red color

    ## Gather info on second classifier:
    ROC.info.LogReg.2 = data.frame(NotRetained, preds.2, pred.outcome.2)

    ## Overlay the second ROC curve onto the first:
    rocobj2 = plot.roc(ROC.info.LogReg.2$NotRetained, ROC.info.LogReg.2$preds.2,
                       percent=TRUE, ci=TRUE, print.auc=FALSE, add=TRUE)
    ciobj2 = ci.se(rocobj2, specificities=seq(0, 100, 5))
    plot(ciobj2, type="shape", col="#0000ff20")   # semi-transparent blue color

  • Some references

    •  Flach, P.A. (2016). ROC Analysis. In: Encyclopedia of Machine Learning and Data Mining.
       https://research-information.bristol.ac.uk/files/94977288/Peter_Flach_ROC_Analysis.pdf

    •  James, G., Witten, D., Hastie, T. and Tibshirani, R. (2015). An Introduction to Statistical Learning with Applications in R.
       http://www-bcf.usc.edu/~gareth/ISL/ (click 'Download the book PDF')

    •  Fawcett, T. (2006). An Introduction to ROC Analysis. Pattern Recognition Letters 27, pp. 861-874.