Logistic Regression With Low Event Rate (Rare Events)

Jun 01, 2018

Tejamoy Ghosh
  • 8/9/2019 Logistic Regression With Low Event Rate (Rare Events)

    Logistic Regression with Low Event Rate (or Rare Events)

    1/28/15 Tejamoy Ghosh Data Science ATG - New Delhi, India

    Contents:

    Problem with logistic regression with low event rate

    Way out

    How to do them in SAS?

    How to do them in R?

    A typical conversation

    Analyst 1: I'm in some trouble, my manager wants me to build a logistic regression model but I have only a 2% event rate in my data. The logistic regression won't be a good choice here; the ML estimate will be biased.

    Analyst 2: Not necessarily. It's the total count rather than the percentage of events that matters. How many cases do you have for the rarer event and how big is your dataset?

    Analyst 1: We've got about 1800 events in a dataset of about 100,000 cases, a less than 2% scenario.

    Analyst 2: Hmm. With these many cases for the rarer event, you can very well use logistic regression. There are methods to address such skewed, or sparse, data situations.

    Analyst 1: Wow. Really? Please tell me more.

    Analyst 2: There are a couple of alternatives. For one, you can use exact logistic regression; this is to be used when the sample size is too small for your usual logistic regression using the regular maximum-likelihood-based estimation. Another option in your scenario is to use the penalized-likelihood estimation method. This second one has the advantage of being computationally less demanding than the exact logistic method.

    What's wrong w/ my regular logistic regression when the event rate is low?

    Low event rate/Rare Event: In the current context, this refers to the scenario where, under a binary outcome space (response/no-response, good/bad, default/no-default, purchase/no-purchase, etc.), one of the two events is far rarer than the other.

    Suppose in a sample of 1000 applicants for a position only 20 are selected; here the event of being selected is the rare event with a low event rate of 2%.

    Suppose, in a sample of 100,000 purchases from an online retailer, about 1800 are returned by the customer; here the event of goods being returned is the rare event with a low event rate of 1.8%.

    Some real life examples: charge backs in credit card transactions; goods returned in online retailing.

    Why is this a problem for logistic regression; it's still binary anyway? The problem here is with the estimation method: the usual maximum-likelihood method is susceptible to "small sample bias", and this bias is strongly dependent on the count (as opposed to percentage) of the rarer of the events.
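    The point that the bias follows the count of rare events rather than their percentage can be checked numerically. The sketch below is an illustration only, not part of the original slides: it assumes the simplest possible model (intercept only, no covariates), where the maximum-likelihood log-odds estimate is log(y / (n - y)) for y events in n cases, and computes its exact bias from the binomial distribution. The function name and the three scenarios are hypothetical.

```python
import math

def mle_logodds_bias(n, p):
    """Exact bias of the ML log-odds estimate in an intercept-only model.

    The ML estimate is log(y / (n - y)) with y ~ Binomial(n, p). It is
    undefined for y = 0 or y = n, so the expectation conditions on 0 < y < n.
    """
    true_logodds = math.log(p / (1 - p))
    total_weight = 0.0
    weighted_sum = 0.0
    for y in range(1, n):
        # Binomial log-probability via log-gamma to avoid huge factorials.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(y + 1)
                   - math.lgamma(n - y + 1)
                   + y * math.log(p) + (n - y) * math.log(1 - p))
        weight = math.exp(log_pmf)
        total_weight += weight
        weighted_sum += weight * math.log(y / (n - y))
    return weighted_sum / total_weight - true_logodds

# Hypothetical scenarios: same rate with different counts, and same count
# with different rates.
scenarios = [
    ("1,000 cases at 2% (about 20 events)", 1000, 0.02),
    ("100,000 cases at 2% (about 2,000 events)", 100000, 0.02),
    ("100 cases at 20% (about 20 events)", 100, 0.20),
]
biases = {label: mle_logodds_bias(n, p) for label, n, p in scenarios}
for label, bias in biases.items():
    print(f"{label}: bias of log-odds estimate = {bias:+.5f}")
```

    Under these assumptions, the two scenarios with about 20 events should show a similar downward bias despite very different event rates, while the scenario with about 2,000 events should show a bias close to zero at the same 2% rate.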

    What's the way out then?

    In case of small sample and/or very unbalanced binary data (when you have just 20 cases in a sample of 1000), "exact logistic regression" is to be used.

    The exact logistic regression approach provides an alternative to the maximum likelihood method for making inferences about the parameters of the logistic regression model.

    The method is based on appropriate exact distributions of sufficient statistics for parameters of interest, and the estimates given by exact logistic regression do not depend on asymptotic results.

    It is useful for analyzing small or unbalanced binary data with covariates.

    This method is usually very computationally intensive.
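    To make "exact distributions of sufficient statistics" concrete, here is a minimal sketch for the simplest case, assuming a single binary covariate. Conditioning on the total number of events (the sufficient statistic for the intercept) leaves the number of events among the "exposed" cases following a hypergeometric distribution when the slope is zero, which is the same calculation as Fisher's exact test. The function name and the data are hypothetical; a real exact logistic fit with several covariates is far more involved.

```python
from math import comb

def exact_conditional_pvalue(events_exposed, n_exposed, events_total, n_total):
    """Two-sided exact p-value for H0: slope = 0 with one binary covariate.

    Given the total number of events, the count of events among exposed
    cases is hypergeometric under H0, so no asymptotic approximation is used.
    """
    n_unexposed = n_total - n_exposed
    lowest = max(0, events_total - n_unexposed)
    highest = min(n_exposed, events_total)
    denom = comb(n_total, events_total)
    probs = {
        k: comb(n_exposed, k) * comb(n_unexposed, events_total - k) / denom
        for k in range(lowest, highest + 1)
    }
    observed = probs[events_exposed]
    # Sum every outcome that is no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(pr for pr in probs.values() if pr <= observed * (1 + 1e-9))

# Hypothetical data: 40 applicants, 12 of them with a referral; 6 selected
# overall, 5 of whom had a referral.
p_value = exact_conditional_pvalue(events_exposed=5, n_exposed=12,
                                   events_total=6, n_total=40)
print(f"exact two-sided p-value: {p_value:.4f}")
```

    The full enumeration over possible outcomes is what makes the approach expensive once there are several covariates and more than a handful of events.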

    If, however, you have a larger count of the rarer of the two events, say, 1000 (even better if it's 2000) in a sample of 100,000 with the same low event rate (1% to 2%), you can use logistic regression; the estimation will have to be done using the "penalized likelihood" method (also called Firth's penalized likelihood approach, after its inventor).

    While we mentioned this method in the context of only the small sample size/rare event scenario, this is a method of addressing issues of separability, small sample sizes, and bias of the parameter estimates.
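    As a rough illustration of what the penalty does, the sketch below applies Firth's idea to an intercept-only model, which is an assumption made here to keep the arithmetic visible and is not how a real multi-covariate fit is coded. Firth's method maximizes the log-likelihood plus half the log-determinant of the Fisher information; with only an intercept, the ordinary score (events - n*p) gains the correction 0.5*(1 - 2p), and the solution works out to p = (events + 0.5) / (n + 1). Function names are hypothetical.

```python
import math

def firth_intercept(events, n, tol=1e-12, max_iter=100):
    """Firth-penalized intercept (log-odds) for an intercept-only model."""
    beta = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-beta))
        # Ordinary score plus Firth's correction term.
        score = (events - n * p) + 0.5 * (1.0 - 2.0 * p)
        # Negative derivative of the modified score with respect to beta.
        curvature = (n + 1) * p * (1.0 - p)
        step = score / curvature
        beta += step  # Newton-Raphson update
        if abs(step) < tol:
            break
    return beta

def to_probability(beta):
    """Convert a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-beta))

for events, n in [(0, 10), (20, 1000), (1800, 100000)]:
    p_firth = to_probability(firth_intercept(events, n))
    print(f"events={events}, n={n}: "
          f"ML rate={events / n:.5f}, Firth rate={p_firth:.5f}")
```

    With zero events the maximum-likelihood log-odds is minus infinity (the separability problem), whereas the penalized estimate stays finite; with 1800 events in 100,000 cases the two estimates are nearly identical.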

    How to do them in SAS

    Exact Logistic SAS code

    Proc Logistic Data = YourRareEventData descending;
    Freq CellCount; /* the CellCount variable is weight vbl here */
    model RareEvent = X1 X2;
    Exact X1 / estimate = both;
    Run;

    You can add other options for what you want to have in your output.

    The option Exact after the model statement and the Freq statement are the key differences here.

    An alternative Event/Trial Syntax:

    Proc Logistic Data = YourRareEventData;
    model RareEvent / CellCount = X1 X2;
    Exact X1 / estimate = both;
    Run;
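    The two syntaxes above expect differently shaped input data. The sketch below, in Python purely for illustration, shows how case-level records could be collapsed into each layout. The records are made up, and the variable roles are an assumption based on the slide: with the Freq statement, each row is one combination of outcome and covariates plus its count; with the events/trials syntax, each row is one covariate combination, where the first variable holds the number of events and the second the number of trials.

```python
from collections import Counter

# Hypothetical case-level records: (RareEvent, X1, X2)
records = [
    (1, 0, 1), (0, 0, 1), (0, 0, 1),
    (0, 1, 0), (1, 1, 0), (1, 1, 0), (0, 1, 0),
]

# Layout for the Freq statement:
# one row per (RareEvent, X1, X2) combination with its cell count.
freq_rows = sorted(
    (y, x1, x2, count) for (y, x1, x2), count in Counter(records).items()
)

# Layout for the events/trials syntax:
# one row per (X1, X2) combination with event count and trial count.
trials = Counter((x1, x2) for _, x1, x2 in records)
events = Counter((x1, x2) for y, x1, x2 in records if y == 1)
events_trials_rows = sorted(
    (x1, x2, events[(x1, x2)], trials[(x1, x2)]) for (x1, x2) in trials
)

print("RareEvent X1 X2 CellCount")
for row in freq_rows:
    print(*row)

print("X1 X2 Events Trials")
for row in events_trials_rows:
    print(*row)
```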

    Penalized Logistic SAS code

    Proc Logistic Data = YourRareEventData;
    class CategoricalVbl1 CategoricalVbl2 / param = ref;
    model RareEvent = CategoricalVbl1 CategoricalVbl2 X1 X2 / firth;
    run;

    You can add other options for what you want to have in your output.

    The option "FIRTH" in the model statement is the key here.

    How to do them in R

    Exact Logistic in R

    Package Required: "elrm"

    This package implements (approximate

    Penalized Logistic in R

    Package: "logistf"

    This package runs Firth's bias reduced logistic regression approach with penalized profile likelihood based confidence intervals for parameter estimates.

    Another package, "penalized", runs penalized generalized linear models and penalized regression models.

    1/28/15 Aru Guha - Indian Institute of Foreign Trade - New Delhi, India

    Data Sciences ATG

    Free Solutions to Challenging Data Problems

    EDUCATION: Econometrics, Statistics, Economics; Vanderbilt, Cincinnati, Indian Statistical Institute, Jawaharlal Nehru University; Research Scholars; Journal Articles

    EXPERIENCE: 01 years combined; Marketing analytics, Risk analytics, Financial analytics, Analytic Solution & Tools Development, Analytics CoE set-up, Advanced Analytics Training

    EXISTING/SERVED CLIENTS: A large Global Beverage company, a small insurance company, a renowned business school, a large Global HR & Compensation Consulting Group, a large Global IT Research group, a third party analytics vendor, a mid sized analytics consulting

    EXPERTISE: Predictive modelling, Segmentation, Market research, Clickstream data analysis, Forecasting, Financial Time Series, Simulation, Bayesian econometrics, Machine Learning Techniques, Decision Trees, SAS, SPSS, R, Octave, Stata, Eviews, Matlab, Maxima, Netlogo

    What we don't do:

    Quick and dirty back of the envelope calculation

    Use jargon presentations with little impact on your problem

    Hide that we are stumped

    What we do:

    FREE analytics help to stuck analysts and consultants

    Customized analytics solutions to institutes and companies

    FREE snapshot to companies considering entering analytics

    Apply analytics in non-traditional areas including films & education

    FREE data analysis help to students, researchers and faculty