Receiver Operating Characteristic (ROC) Curves
Evaluating a classifier and predictive performance


Transcript
  • Receiver Operating Characteristic (ROC) Curves

    Evaluating a classifier and predictive performance

    [Figure: ROC curve. x-axis: 1-Specificity (i.e. % of true negatives incorrectly declared positive), the false positive rate. y-axis: Sensitivity (% of true positives declared positive), the true positive rate. Both axes run 0.0 to 1.0.]
  • Classification (supervised learning)

    •  In supervised learning, we are interested in building a model to predict the class (or outcome) of a new observation based on observable predictors, using the train-set/test-set framework.

    – Which students are not likely to return for their second year of college?

    – Which animals admitted to a veterinary hospital are likely to survive (or not survive)?

    – Which bank customers are likely to default on a loan?

  • Some classification procedures:

    •  Logistic regression (0-1 response)
    – Choose variables, fit model, calculate the predicted probability p̂
    – p̂ ≥ 0.5: predicted class 1; p̂ < 0.5: predicted class 0

    •  Classification tree
    – Use a 'greedy' algorithm to choose variables and create a decision tree (prediction at tree leaves)

    •  Linear Discriminant Analysis (LDA)
    – Built on multivariate normal (spheroids)
    – Classify a new observation to the nearest centroid

    (A minimal R sketch of all three procedures is given below.)
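    As a concrete illustration, here is a minimal R sketch of the three procedures on hypothetical data; the data frame `d`, response `y`, and predictors `x1`, `x2` are placeholder names, not from the slides, and `rpart`/`MASS` are one standard choice of packages.

    ## Hypothetical training data: 0-1 response y, numeric predictors x1, x2.
    set.seed(1)
    n <- 200
    d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    d$y <- rbinom(n, 1, plogis(-0.5 + 1.5*d$x1 - 1.0*d$x2))

    ## Logistic regression: p-hat >= 0.5 -> class 1, else class 0.
    fit.lr   <- glm(y ~ x1 + x2, family = binomial, data = d)
    p.hat    <- predict(fit.lr, type = "response")
    class.lr <- ifelse(p.hat >= 0.5, 1, 0)

    ## Classification tree (greedy recursive partitioning):
    library(rpart)
    fit.tree   <- rpart(factor(y) ~ x1 + x2, data = d, method = "class")
    class.tree <- predict(fit.tree, type = "class")

    ## Linear discriminant analysis (classify to the nearest centroid
    ## under a common covariance matrix):
    library(MASS)
    fit.lda   <- lda(y ~ x1 + x2, data = d)
    class.lda <- predict(fit.lda)$class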

  • Assumptions:

    •  Logistic regression
    – Predictors are related to the response with a sigmoidal mean structure
    – Conditional Bernoulli given the mean

    •  Classification tree (maybe?)
    – The data can be described by features
    – There exists some tree that isn't too big that can predict reasonably well

    •  Linear discriminant analysis
    – Data follow a multivariate Gaussian
    – Equal covariance matrices

  • Pros/Cons of these classifiers:

    NOTE: In simulation studies, we have found that each of these classifiers performs better than the others when the data were simulated under the given model assumptions.

                          Pros                                         Cons
    Logistic Regression   Interpretation of covariates. More robust    Computationally complex iterative algorithm,
                          to deviations from assumptions than LDA.     and may not converge. Depends on sigmoidal mean
                                                                       structure and conditional binomial distributions.
    Trees                 Fewer assumptions about the data, more       Instability (random forests can help), lack of
                          flexible algorithm, easy interpretation      smoothness, uses a greedy algorithm.
                          of certain characteristics.
    LDA                   Dimensionality reduction; estimation         Not flexible when the data deviate from the
                          under the assumptions uses maximum           multivariate normal assumptions.
                          likelihood.

    (A small simulation sketch along the lines of the NOTE follows.)
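    A minimal sketch of one such simulation check, under assumptions I am supplying (two Gaussian classes with a common covariance, i.e. LDA's assumptions), comparing test-set error rates for LDA and logistic regression; all object names are placeholders.

    ## Simulate two Gaussian classes with equal covariance:
    library(MASS)
    set.seed(2)
    n <- 400
    y <- rbinom(n, 1, 0.5)
    X <- matrix(rnorm(2*n), n, 2) + 1.2 * cbind(y, y)   # mean shift by class
    d <- data.frame(y = y, x1 = X[,1], x2 = X[,2])
    train <- d[1:200, ]; test <- d[201:400, ]

    ## LDA vs. logistic regression, test-set misclassification rates:
    class.lda <- predict(lda(y ~ x1 + x2, data = train), test)$class
    p.lr <- predict(glm(y ~ x1 + x2, family = binomial, data = train),
                    test, type = "response")
    mean(class.lda != test$y)                  # LDA error rate
    mean(ifelse(p.lr >= 0.5, 1, 0) != test$y)  # logistic error rate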

  • What makes a good classifier?

    •  Your predictions are correct.
    – True positives were correctly classified
    – True negatives were correctly classified

    •  Example: Disease diagnostic test
    – Patients are given the diagnostic test
    – Test gives those with the disease a 'positive'
    •  High sensitivity (correctly classifying true positives)
    – Test gives those without the disease a 'negative'
    •  High specificity (correctly classifying true negatives)

  • Misclassification rate

    •  Misclassification
    – The observation belongs to one class, but the model classifies it as a member of a different class.

    •  A classifier that makes no errors would be perfect
    – Not likely to find such a classifier in the real world
    – Noise
    – Other relevant variables not in the available dataset

  • Misclassification rate

    •  We can present the accuracy of the results with a confusion matrix (a worked R version of this arithmetic follows).

                  Predict as 1   Predict as 0
    Actual 1            20              5
    Actual 0            18            957

    •  Overall error rate = (18 + 5)/1000 = 2.3%

    •  NOTE: The predicted values are often calculated through leave-one-out cross-validation in logistic regression.
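    A minimal R sketch reproducing the arithmetic above; the counts come from the slide, and `conf` is a placeholder name.

    ## Confusion matrix from the slide: rows = actual, columns = predicted.
    conf <- matrix(c(20,   5,
                     18, 957),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Actual  = c("1", "0"),
                                   Predict = c("1", "0")))
    n.total  <- sum(conf)                        # 1000
    n.errors <- conf["1", "0"] + conf["0", "1"]  # 5 + 18 = 23
    n.errors / n.total                           # overall error rate = 0.023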

  • Beating the naïve classifier

    •  In this dataset, the 0's were much more prevalent than the 1's.

    •  What if we just classified ALL observations as 0? How well do we do?

    •  Overall error rate = 25/1000 = 2.5%

                  Predict as 1   Predict as 0
    Actual 1            20              5
    Actual 0            18            957

  • Distinguishing between the two types of mistakes to be made

    •  Sensitivity = (# of true positives declared 'positive') / (# of true positives)
       Is your disease diagnostic tool getting the ones you want?

    •  Specificity = (# of true negatives not declared 'positive') / (# of true negatives)
       Is your disease diagnostic tool NOT getting the ones you DON'T want?

    •  For a given classifier, we would like both to be high. (A small R sketch follows.)
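    Continuing the hypothetical `conf` matrix defined in the earlier sketch:

    ## Sensitivity: fraction of true positives (Actual 1) declared positive.
    sens <- conf["1", "1"] / sum(conf["1", ])   # 20/25 = 0.80
    ## Specificity: fraction of true negatives (Actual 0) NOT declared positive.
    spec <- conf["0", "0"] / sum(conf["0", ])   # 957/975 ≈ 0.9815
    c(sensitivity = sens, specificity = spec)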

  • Receiver Operating Characteristic (ROC) curve

    •  To create an ROC curve, we first order the predicted probabilities from highest to lowest.

    •  The highest probabilities are predicted to have the disease (we'll want to classify those as 'disease').

    •  The lowest probabilities are predicted to not have the disease (we'll want to classify those as 'not disease').

    •  Probability = 0.5? Flip of the coin.

  • ROC curve

    •  To create an ROC curve, we start at the highest predicted probability and work our way down the list (i.e. highest prob to lowest prob). We stop at each position C (i.e. potential cutoff) and use it as the classifier (i.e. C or above = 1, below C = 0), and ask "How well does this classifier do? Sensitivity? Specificity?"

          preds
    [1,] 0.9299   ← Most likely to have disease
    [2,] 0.9033
    [3,] 0.8918
    [4,] 0.8851
    [5,] 0.8687
      ...

    For example, if we classify cases with C = 0.8918 and higher as '1' and probabilities less than C = 0.8918 as '0', how well do we do?

  • ROC curve

    •  NOTE: If we start at the top of the list and we don't go very far down the list for C, we'll probably get mostly true positives (i.e. low false positive rate, good), but there are still lots of true positives that we missed (low true positive rate, bad).

          preds
    [1,] 0.9299   ← Most likely to have disease
    [2,] 0.9033
    [3,] 0.8918
    [4,] 0.8851
    [5,] 0.8687
      ...
    (Cases at or above C are classified as '1'; cases below C as '0'.)

  • The ROC curve shows us what happens to sensitivity and specificity as we move our C (threshold for the classifier) down the list.

    C at a high predicted probability (very few cases declared 'disease'): low false positive rate (good), low true positive rate (bad). This is the sensitivity and specificity info from the "first" C threshold.

    C at a low predicted probability (almost all cases declared 'disease'): high false positive rate (bad), high true positive rate (good). This is the sensitivity and specificity info from the "last" C threshold.

    [Figure: ROC curve, false positive rate (x-axis) vs. true positive rate (y-axis), annotated at both ends.]

  • [Figure: ROC curve, false positive rate (x-axis) vs. true positive rate (y-axis), showing the stair-step pattern.]

    Each move down the list of probabilities coincides with either a move up on the ROC curve (correct choice as 'disease') or to the right (incorrect choice when declared 'disease'). Thus, we see a 'stair-step' phenomenon in the ROC curve. For small datasets, this is very apparent.

    We essentially start at the point (0,0). If the first case (highest probability) is correctly classified when declared 'disease', we move vertically up (true positive). If the first case (highest probability) is incorrectly classified as 'disease', we move to the right (false positive).

    As we start moving down the list, we want to move up on the ROC curve, not to the right. (Section 1a near the end of these slides scripts this construction directly in R.)

  • See an animation of ROC curve creation

    http://homepage.stat.uiowa.edu/~rdecook/stat6220/ROC_animated1.html

  • What ROC shape indicates a good classifier?

    •  We want it to go up vertically very quickly
    –  i.e. As we move down the list of predicted probabilities, we're getting all 'diseased' cases and no 'non-diseased'.

    •  We know in the end (when everyone is classified as a 'disease' case) all the negatives will be misclassified and all the positives will be correctly classified. So, we'll always end at (1,1).

  • Comparing classifiers with ROC analysis

    [Figure: several ROC curves, annotated "Best of these", "Worst of these", and "Choosing at random".]

    The AUC, or area under the curve, also compares the classifiers. (A rank-based sketch of the AUC follows.)
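    The AUC also equals the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case; this is a standard equivalence, not stated on the slides. A minimal R sketch, assuming a 0-1 vector `labels` and predicted probabilities `preds`:

    ## AUC as the rank (Wilcoxon) statistic: the proportion of
    ## positive-negative pairs where the positive case is ranked higher.
    auc.rank <- function(labels, preds) {
      r  <- rank(preds)          # ranks of all predicted probabilities
      n1 <- sum(labels == 1)     # number of positives
      n0 <- sum(labels == 0)     # number of negatives
      (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    }

    ## Tiny usage example with made-up values:
    auc.rank(labels = c(1, 0, 1, 0, 0),
             preds  = c(0.9, 0.3, 0.4, 0.5, 0.1))   # 5/6 ≈ 0.833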

  • ROC curve: Provost's Office Client

    •  Example: Predict which students will not return for their second year of college.

    – We use the predicted probabilities from the logistic regression for classification.

    –  If the probability of a case being in class 1 (not retained) is equal to or greater than 0.5, that case is classified as a 1.

    – Any case with an estimated probability of less than 0.5 would be classified as a 0 (retained).

  • Provost's Office Example

    •  Logistic regression variables
    – Stafford loan
    – Live on campus
    – First generation college student
    – RAI
    – Selective program of study
    – High School GPA

    (A minimal sketch of fitting this kind of model is given below.)
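    A minimal sketch of fitting such a model in R; the data frame `retention` and the variable names below are placeholders for illustration, not the actual Provost's Office data.

    ## Hypothetical data frame 'retention' with the 0-1 response NotRetained
    ## and predictors corresponding to the list on the slide:
    fit <- glm(NotRetained ~ StaffordLoan + LiveOnCampus + FirstGen +
                 RAI + SelectiveProgram + HSGPA,
               family = binomial, data = retention)
    preds <- predict(fit, type = "response")     # predicted probabilities
    pred.outcome <- ifelse(preds >= 0.5, 1, 0)   # 0.5 cutoff classification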

  • Provost's Office Example

    •  The predicted probability of not being retained is used to classify each case.

    •  Students with a very high probability are expected to not return.

    •  Students with a very low probability are expected to return in their 2nd year.

  • Provost's Office Example

    [Figure: "ROC from logistic regression classifier" — false positive rate (x-axis, 0.0 to 1.0) vs. true positive rate (y-axis, 0.0 to 1.0).]

    The ROC curve from the given classifier (logistic regression predicted probabilities)… meh.

    *Plot generated from ROCR package in R.

  • Provost's Office Example

    The blue point represents the classifier based on the probability = 0.5 cutoff.

    Sensitivity: 41/859 = 4.77%
    Specificity: 4762/4799 = 99.23%
    Overall Misclassification: 855/5658 = 0.1511

    [Figure: "ROC from logistic regression classifier" with the 0.5-cutoff point marked in blue.]

    *Point calculated by hand and added.

  • Provost's Office Example

    We can also look at accuracy (1 - misclassification rate) vs. the cutoff. The best cutoff for highest accuracy is 0.4623. (Pretty close to our 0.5 threshold.)

    Sensitivity: 66/859 = 7.68%
    Specificity: 4750/4799 = 98.97%
    Overall Misclassification: 842/5658 = 0.1488

    [Figure: accuracy (y-axis) vs. cutoff (x-axis).]

    *Plot generated from ROCR package in R.

  • Provost's Office Example

    The green point represents the optimal classifier based on equal importance of sensitivity and specificity (as the max of "sensitivity + specificity"), which is the prob = 0.191 cutoff.

    Sensitivity: 55.2%
    Specificity: 77.8%
    Overall Misclassification: 1456/5658 = 0.2573

    [Figure: "ROC from logistic regression classifier" with the optimal point marked in green.]

    *Point determined from Epi package.

  • Provost's Office Example

    Optimal cutoffs can be found from software or calculated. (A short sketch of the calculation follows.)

    [Figure: sensitivity and specificity (y-axis, 0.0 to 1.0) vs. cutoff probability (x-axis), with curves for Sensitivity, Specificity, Sens + Spec, and distance from the top left to the ROC curve. Marked cutoffs: max sens + spec cutoff (0.191), min Euclidean distance cutoff (0.146), our 0.5 cutoff, the cutoff where everyone is declared positive, and the cutoff where no one is declared positive.]
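    A minimal sketch of calculating these two optimal cutoffs by hand, reusing the slides' `preds` and `NotRetained` objects; the grid-of-cutoffs approach and helper names are illustrative assumptions. The max of "sensitivity + specificity" is also known as Youden's criterion.

    ## Sensitivity and specificity at every candidate cutoff:
    cutoffs <- sort(unique(preds), decreasing = TRUE)
    sens <- sapply(cutoffs, function(C) mean(preds[NotRetained == 1] >= C))
    spec <- sapply(cutoffs, function(C) mean(preds[NotRetained == 0] <  C))

    ## Max "sensitivity + specificity" cutoff:
    cutoffs[which.max(sens + spec)]

    ## Min Euclidean distance from the ROC curve to the top-left corner (0, 1):
    dist.topleft <- sqrt((1 - spec)^2 + (1 - sens)^2)
    cutoffs[which.min(dist.topleft)]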

  • Comparing classifiers with ROCs

    The curves may visually "look" different, but are they really? If we collected data again, would they look this different?

  • Bootstrapping the ROC

    Can use bootstrapping to create a confidence band for the sensitivity (given specificity) for the ROC curve. (A bare-bones bootstrap sketch follows.)

    [Figure: "ROC curve with 95% CIs" — Specificity (%) on the x-axis (100 down to 0) vs. Sensitivity (%) on the y-axis (0 to 100), with a confidence band. AUC: 71.8% (69.9%–73.7%).]

    *Plot generated from pROC package in R.
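    To make the mechanics concrete, here is a bare-bones sketch of bootstrapping the AUC itself: resample the cases with replacement and recompute the AUC each time. This is an illustrative assumption about the procedure, not the slides' pROC code, and it reuses the hypothetical auc.rank() helper sketched earlier.

    ## Percentile bootstrap CI for the AUC (B resamples of the cases):
    set.seed(3)
    B <- 2000
    auc.boot <- replicate(B, {
      i <- sample(length(NotRetained), replace = TRUE)  # resample cases
      auc.rank(NotRetained[i], preds[i])
    })
    quantile(auc.boot, c(0.025, 0.975))   # 95% percentile interval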

  • Bootstrapping the ROC

    But the bootstrap results don't always look so nice.

    [Figure: "ROC curve with 95% CIs" for a small dataset — Specificity (%) vs. Sensitivity (%), with a very wide confidence band. AUC: 83.0% (60.4%–100.0%). Shown alongside the raw ROC curve for the same data: 1-Specificity (i.e. % of true negatives incorrectly declared positive) vs. Sensitivity (% of true positives declared positive).]

    *Plot generated from pROC package in R.

  • Bootstrapping ROC curves for comparison

    Semi-transparent coloring in R can come in handy:
    # semi-transparent red:  col="#ff000030"
    # semi-transparent blue: col="#0000ff20"

    [Figure: "ROC curve with 95% Bootstrapped CIs" — two overlaid ROC curves with semi-transparent confidence bands, Specificity (%) vs. Sensitivity (%).]

    *Plot generated from pROC package in R.

  • ############################################
    ## 1a) Manually scripting the ROC curve:
    ############################################

    ## 'NotRetained' is the 0-1 response variable.
    ## 'preds' are the leave-one-out predicted probabilities.

    ## Use 0.5 as the cutoff for classifying:
    ## (round() sends a probability of exactly 0.5 to 0;
    ## ifelse(preds >= 0.5, 1, 0) is the explicit version of the rule.)
    pred.outcome = round(preds)

    ## Gather the data for the ROC curve:
    ROC.info.LogReg = data.frame(NotRetained, preds, pred.outcome)

    ## Put in order by predicted probability, largest first:
    ROC.info.LogReg = ROC.info.LogReg[rev(order(ROC.info.LogReg$preds)), ]

    ## A 'positive' will be considered as someone who was not retained.
    (num.true.pos = sum(NotRetained == 1))
    (num.true.neg = sum(NotRetained == 0))

    ## Empty plot, then trace the stair-step ROC with cumulative counts:
    plot(c(0,1), c(0,1), type="n",
         xlab="1-Specificity (i.e. % of true neg's incorrectly declared positive)",
         ylab="Sensitivity (% of true pos's declared positive)",
         main="ROC curve (logistic regression)",
         sub="False Positive Rate")
    x = cumsum(1 - ROC.info.LogReg$NotRetained)/num.true.neg
    y = cumsum(ROC.info.LogReg$NotRetained)/num.true.pos
    lines(x, y)
    abline(0, 1)   # reference line: classifying at random

  • ####################################
    ## 1b) Manually calculating AUC:
    ####################################

    ## Calculate AUC thinking as geometric trapezoid shapes:
    ## http://stats.stackexchange.com/questions/145566/how-to-calculate-area-under-the-curve-auc-or-the-c-statistic-by-hand

    ## Same x,y labeling as previous slide:
    x = cumsum(1 - ROC.info.LogReg$NotRetained)/num.true.neg
    y = cumsum(ROC.info.LogReg$NotRetained)/num.true.pos

    ## Trapezoid rule: average the heights at consecutive x's,
    ## multiply by the step widths, and sum.
    n = length(y)    # number of points on the curve
    height = (y[-1] + y[-n])/2
    width  = diff(x)
    sum(height * width)

  • ##########################################################
    ## 1c) Manually calculating sensitivity, specificity,  ##
    ## misclassification rate for 0.5 threshold:           ##
    ##########################################################

    ## All the needed info is in the following table:
    table(NotRetained, pred.outcome)
    #            pred.outcome
    # NotRetained    0    1
    #           0 4762   37
    #           1  818   41

  • ################################################
    ## 2) ROCR package for creating ROC curve:
    ################################################
    library(ROCR)

    pred = prediction(preds, NotRetained)
    perf = performance(pred, "tpr", "fpr")
    plot(perf, main="ROC from logistic regression classifier")

    (AUC.ROCR = performance(pred, "auc")@y.values[[1]])

    ## Plot accuracy (1-misclassification rate) vs. cutoff:
    acc = performance(pred, "acc")
    (ac.val = max(unlist(acc@y.values)))
    # [1] 0.8511842

    ## Cutoff(s) achieving the maximum accuracy:
    th = unlist(acc@x.values)[unlist(acc@y.values) == ac.val]
    plot(acc)
    abline(v=th, col='grey', lty=2)
    th
    # [1] 0.462324

  • ################################################
    ## 3) pROC package for bootstrapped ROC:
    ################################################
    library(pROC)

    ## 'NotRetained' is the 0-1 response variable.
    ## 'preds' are the leave-one-out predicted probabilities.

    ## Use 0.5 as the cutoff for classifying:
    pred.outcome = round(preds)

    rocobj = plot.roc(NotRetained, preds, main="ROC curve with 95% CIs",
                      percent=TRUE, ci=TRUE, print.auc=TRUE)

    ## Calculate CI of sensitivity at a select set of
    ## specificities and form a 'band' (might take a bit):
    ciobj = ci.se(rocobj, specificities=seq(0, 100, 5))
    plot(ciobj, type="shape", col="#1c61b6AA")   # blue band

    ## Use bootstrap to add CI in both directions at the "best" cutoff:
    plot(ci(rocobj, of="thresholds", thresholds="best"), col="yellow", lwd=2)
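    As an aside, pROC can also report the "best" threshold numerically via coords(); by default the "best" criterion is Youden's index (worth confirming against your pROC version's documentation):

    ## Numeric report of the optimal cutoff and its sensitivity/specificity:
    coords(rocobj, x = "best", best.method = "youden")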

  • ## Overlay Bootstrapped ROC's:
    rocobj = plot.roc(NotRetained, preds,
                      main="ROC curve with 95% Bootstrapped CIs",
                      percent=TRUE, ci=TRUE, print.auc=FALSE)
    ciobj = ci.se(rocobj, specificities=seq(0, 100, 5))
    plot(ciobj, type="shape", col="#ff000030")   # semi-transparent red color

    ## Gather info on second classifier:
    ROC.info.LogReg.2 = data.frame(NotRetained, preds.2, pred.outcome.2)

    ## Overlay the second ROC curve onto the first:
    rocobj2 = plot.roc(ROC.info.LogReg.2$NotRetained, ROC.info.LogReg.2$preds.2,
                       percent=TRUE, ci=TRUE, print.auc=FALSE, add=TRUE)
    ciobj2 = ci.se(rocobj2, specificities=seq(0, 100, 5))
    plot(ciobj2, type="shape", col="#0000ff20")   # semi-transparent blue color

  • Some references

    •  Flach, P.A. (2016). ROC Analysis. In: Encyclopedia of Machine Learning and Data Mining.
       https://research-information.bristol.ac.uk/files/94977288/Peter_Flach_ROC_Analysis.pdf

    •  James, G., Witten, D., Hastie, T. and Tibshirani, R. (2015). An Introduction to Statistical Learning with Applications in R.
       http://www-bcf.usc.edu/~gareth/ISL/ (click 'Download the book PDF')

    •  Fawcett, T. (2006). An Introduction to ROC Analysis. Pattern Recognition Letters 27, pp. 861-874.