Page 1
Company Recommendation for New Graduates via Implicit Feedback Multiple Matrix Factorization with Bayesian Optimization
IEEE Big Data 2016, Washington D.C.
Masahiro Kazama (1), Issei Sato (2), Haruaki Yatabe (3), Tairiku Ogihara (3), Tetsuro Onishi (3), Hiroshi Nakagawa (2)
1. Recruit Technologies Co., Ltd.  2. University of Tokyo  3. Recruit Career Co., Ltd.
Page 2
Outline
• Problem Settings
• Data Description
• Proposed Method
• Experiments
• Results
• Conclusion
Page 3
Problem Setting
• Job hunting by Japanese students follows a unique pattern
• The starting time for job hunting is fixed
• All students apply at the same time
Example: job-hunting schedule of students who graduate in 2015
Start job-hunting activities   Start interviews   Graduate/Join
Dec 1, 2013                    April 1, 2014      April 1, 2015
Page 4
Problem Setting
• Students have to send application sheets to many companies to get a job offer
• Many students spend a lot of time on job-hunting activities. This is a big social problem in Japan
• Many students send application sheets to the popular companies first. Because those companies have a high competition rate, these students often fail to get a job offer
Page 5
Popularity bias
• Browsing concentrates on a few companies
[Figure: number of students (y-axis) per company, ordered by popularity (x-axis); high-browsed companies are the top 20%, low-browsed companies the bottom 80%]
Page 6
Problem Setting
• It is important for students to find a suitable company at an early stage of their job-hunting activities
• It is important to consider not only high-browsed companies but also low-browsed companies
Page 7
Solutions
• We recommend suitable companies to students at an early stage
• We focus on low-browsed companies
Page 8
Data
• Our company (Recruit Co., Ltd.) provides a job recruiting service
• Almost all students use our service
• We have three types of data
  1. Browsing data
  2. Entry data
  3. Student/company information
Page 9
Browsing data
• Browsing data of students on our recruiting service
• Used for training our model
• Period: 2013/12/1 to 2014/3/31
Page 10
Entry data
• Entry data of students on our recruiting service
• Used for evaluating our model
• Period: 2013/12/1 to 2014/3/31
Page 11
Browsing (click) data
Rows: students (j); columns: companies (i)
click   i1   i2   i3   i4
j1       0    4    0   21
j2      71   31    0   18
j3       3    1    2    0
Page 12
Entry data
Rows: students (j); columns: companies (i)
entry   i1   i2   i3   i4
j1       0    1    0    0
j2       0    1    0    1
j3       1    0    1    0
Page 13
Student/Company information
Student: faculty, department, etc.
Company: industry type, location, number of employees
Page 14
Overview
Purpose
• Using browsing data and student/company information, we recommend suitable companies to students
• We focus on low-browsed companies
Solution
• Using browsing data → implicit feedback recommendation
• Low-browsed item recommendation → popularity bias correction
• Hyperparameter search → Bayesian optimization
Page 15
Explicit vs. Implicit
             Explicit feedback                 Implicit feedback
Definition   Data the user explicitly gives    User action data used to infer preference
Example      Amazon 5-star rating              Click log
Pros         Good quality                      Easy to get; large volume
Cons         Difficult to get                  Noisy; popularity bias
Page 16
Popularity bias
• Browsing concentrates on a few companies → high-browsed companies are more likely to be recommended
[Figure: number of students (y-axis) per company, ordered by popularity (x-axis); high-browsed companies are the top 20%, low-browsed companies are the bottom 80%. We want to recommend the low-browsed companies]
Page 17
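Implicit feedback matrix factorization
Reference: Yifan Hu, Yehuda Koren, Chris Volinsky, "Collaborative Filtering for Implicit Feedback Datasets" (2008)
• r_ui: number of clicks by user u on item i (e.g. r_ui = 10)
• p_ui: preference; p_ui = 1 if r_ui > 0, p_ui = 0 if r_ui = 0
• c_ui: confidence, which grows with r_ui
Browsing data (example click counts over companies i1, i2, i3; a student with fewer values browsed fewer companies)
j1: 41
j2: 2
j3: 24, 3, 51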
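As a rough illustration of the formulation in Hu et al. (2008), the sketch below derives the binary preference p_ui and the linear confidence c_ui = 1 + alpha * r_ui from the 3 x 4 example click table shown on the browsing (click) data slide, and evaluates the confidence-weighted squared error without the regularization terms. The value alpha = 40 and all function names are illustrative assumptions, not settings taken from this talk.

```python
# Click counts r_ui from the example browsing table
# (rows: students j1..j3, columns: companies i1..i4).
clicks = [
    [0, 4, 0, 21],
    [71, 31, 0, 18],
    [3, 1, 2, 0],
]

ALPHA = 40.0  # assumed confidence weight (a hyperparameter, not a value from the talk)


def preference(r):
    """Binary preference p_ui: 1 if the student clicked the company at all."""
    return 1 if r > 0 else 0


def confidence(r, alpha=ALPHA):
    """Confidence c_ui = 1 + alpha * r_ui (linear form from Hu et al., 2008)."""
    return 1.0 + alpha * r


def weighted_loss(clicks, scores, alpha=ALPHA):
    """Sum of c_ui * (p_ui - score_ui)^2 over all cells (regularization omitted)."""
    total = 0.0
    for row_r, row_s in zip(clicks, scores):
        for r, s in zip(row_r, row_s):
            total += confidence(r, alpha) * (preference(r) - s) ** 2
    return total


P = [[preference(r) for r in row] for row in clicks]
C = [[confidence(r) for r in row] for row in clicks]
```

In a full model the scores would be inner products of student and company factor vectors; here they are passed in directly so that only the weighting is shown.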
Page 18
Problem
• High-browsed companies are more likely to be recommended
[Figure: number of clicks (y-axis) per company i1 to i12 (x-axis); the high-browsed companies are likely to be recommended, while the low-browsed companies are the ones we want to recommend]
Page 19
Proposed method
• Popularity of company i = number of users who browsed company i
• The confidence c is larger when the company has fewer clicks → low-browsed companies are more likely to be recommended
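The sketch below shows one hypothetical weighting with the property stated above: for the same number of clicks, the confidence is larger when fewer students browsed the company. The form c_ui = 1 + alpha * r_ui / n_i^beta, the toy click matrix, and the function names are all assumptions made for illustration; this is not the formula from the talk.

```python
def popularity(clicks):
    """n_i: number of students who browsed company i (count of r_ui > 0 per column)."""
    n_companies = len(clicks[0])
    return [sum(1 for row in clicks if row[i] > 0) for i in range(n_companies)]


def popularity_aware_confidence(r, n_i, alpha=1.0, beta=1.0):
    """Hypothetical form: c_ui = 1 + alpha * r_ui / n_i ** beta.

    Cells with no clicks keep the baseline confidence of 1.
    """
    if r == 0 or n_i == 0:
        return 1.0
    return 1.0 + alpha * r / (n_i ** beta)


# Toy data (assumed): company 0 is browsed by all four students,
# company 1 by only one student.
toy_clicks = [
    [2, 2],
    [1, 0],
    [3, 0],
    [5, 0],
]
n = popularity(toy_clicks)

# Same click count (2) by the first student, different company popularity.
c_popular = popularity_aware_confidence(toy_clicks[0][0], n[0])
c_niche = popularity_aware_confidence(toy_clicks[0][1], n[1])
```

With beta = 0 this hypothetical form reduces to the unweighted confidence 1 + alpha * r_ui, so beta controls how strongly popular companies are down-weighted.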
Page 20
Proposed method with side information
• Student information
• Company information
Page 21
Hyperparameter search
• Hyperparameters: browsing weights α, β; regularization λ1, λ2, λ3
• When the number of hyperparameters is large, grid search doesn't work well
• Use Bayesian optimization for hyperparameter search
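A quick count shows why grid search scales poorly with five hyperparameters: the number of grid points is the number of values per hyperparameter raised to the fifth power, and each point costs one full model training. The candidate values below are arbitrary placeholders, not values from the talk.

```python
from itertools import product

HYPERPARAMETERS = ["alpha", "beta", "lambda1", "lambda2", "lambda3"]


def grid_size(values_per_parameter, n_parameters=len(HYPERPARAMETERS)):
    """Number of model trainings a full grid search would need."""
    return values_per_parameter ** n_parameters


# Even a coarse grid with three placeholder values per hyperparameter
# already enumerates 3^5 combinations.
coarse_values = [0.1, 1.0, 10.0]
coarse_grid = list(product(coarse_values, repeat=len(HYPERPARAMETERS)))

trainings_coarse = len(coarse_grid)
trainings_fine = grid_size(10)
```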
Page 22
Bayesian optimization
x → y = f(x) → y
Optimization of a black-box function (Mockus, 1978)
→ A Gaussian process is assumed as the distribution over the function f(x)
→ It suggests the next hyperparameter to evaluate
x: hyperparameters α, β, λ1, λ2, λ3
f(x): recall
We want to find the hyperparameters that maximize recall
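The loop described above can be sketched in a few dozen lines. This is a minimal one-dimensional illustration, assuming a zero-mean Gaussian process with a fixed RBF kernel, an expected-improvement acquisition function, and a finite grid of candidates. The function toy_recall is a hypothetical stand-in for "train the model with hyperparameter x and measure recall"; the kernel length scale, noise level, and trial count are likewise assumptions rather than settings from the talk.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    solved = np.linalg.solve(k_oo, k_on)           # K^-1 k*
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_on * solved, axis=0)      # prior variance is 1
    return mean, np.sqrt(np.maximum(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def toy_recall(x):
    """Hypothetical black-box objective with its maximum at x = 0.7."""
    return float(np.exp(-((x - 0.7) ** 2) / 0.02))


def bayesian_optimize(f, n_trials=15):
    """Return the best (x, f(x)) found after n_trials GP-guided evaluations."""
    candidates = np.linspace(0.0, 1.0, 201)
    x_obs = np.array([0.0, 0.5, 1.0])              # small initial design
    y_obs = np.array([f(x) for x in x_obs])
    for _ in range(n_trials):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        ei = expected_improvement(mean, std, y_obs.max())
        x_next = candidates[int(np.argmax(ei))]    # next hyperparameter to evaluate
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    best = int(np.argmax(y_obs))
    return float(x_obs[best]), float(y_obs[best])


best_x, best_y = bayesian_optimize(toy_recall)
```

In the setting of the talk, x would be the five-dimensional vector (α, β, λ1, λ2, λ3) and each call to f would train the factorization model and return validation recall.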
Page 23
Data and Evaluation
Evaluation metric: Recall@100 (low-browsed)
Company    c01  c02  c03  c04  c05  c06  c07  c08  c09  c10
Browsing    10   20    1    8    5   10    3    7   23   13
Entry      ◯ ◯ ◯ ◯
Data split: 60% / 20% / 20%
• Training set (60%): for matrix factorization
• Validation set (20%): for Bayesian optimization (BO)
• Evaluation set (20%)
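A minimal sketch of a Recall@k computation for one student follows. It assumes that the "low-browsed" variant restricts both the ranked candidates and the ground-truth entries to a given set of low-browsed companies; that reading, the function name, and the example scores and entries are assumptions for illustration, not the exact evaluation protocol or data of the talk.

```python
def recall_at_k(scores, entered, k, low_browsed=None):
    """Recall@k for one student.

    scores:      dict mapping company -> predicted score
    entered:     set of companies the student actually applied to (entry data)
    low_browsed: optional set; if given, ranking and ground truth are
                 both restricted to these companies (assumed protocol)
    """
    if low_browsed is not None:
        scores = {c: s for c, s in scores.items() if c in low_browsed}
        entered = entered & low_browsed
    if not entered:
        return None  # recall is undefined when there is nothing to retrieve
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for c in ranked if c in entered)
    return hits / len(entered)


# Hypothetical scores and entries for five companies.
example_scores = {"c01": 0.9, "c02": 0.8, "c03": 0.7, "c04": 0.1, "c05": 0.6}
example_entered = {"c01", "c03", "c04"}

recall_all = recall_at_k(example_scores, example_entered, k=3)
recall_low = recall_at_k(
    example_scores, example_entered, k=2, low_browsed={"c03", "c04", "c05"}
)
```

Averaging this value over all students with at least one relevant entry gives a single number that can serve as the objective f(x) for hyperparameter search.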
Page 24
Results
[Figure: bar chart of recall (axis from 0 to 0.5) for four methods: BO + Hu et al., BO + Fang et al., Proposed, Proposed with side information]
The proposed models achieve better recall
Page 25
Trials of Bayesian Optimization
• Recall improves as the number of trials increases → we can find better hyperparameters
Page 26
Conclusions
• We built a recommendation system that mitigates popularity bias
• Using side information improved recommendation performance for low-browsed companies
• Hyperparameter optimization was performed using Bayesian optimization