
Machine Learning for Search at LinkedIn

Jan 25, 2017


Viet Ha-Thuc
Transcript
Page 1: Machine Learning for Search at LinkedIn

Recruiting Solutions

Machine Learning for Search @ LinkedIn

Viet Ha-Thuc
Search Quality - LinkedIn

Page 2: Machine Learning for Search at LinkedIn


• 200+ countries and territories

• 2+ new members per second

Page 3: Machine Learning for Search at LinkedIn

▪ Dual Roles of Search
  – Enable talent to discover opportunities
  – Help companies search for the right talent

Page 4: Machine Learning for Search at LinkedIn


FLAGSHIP SEARCH

RECRUITER SEARCH

SALES NAVIGATOR

Page 5: Machine Learning for Search at LinkedIn

Unique Nature of LinkedIn Search

▪ Heterogeneous sources: people, jobs, companies, slideshares, members’ posts, groups

▪ Scale

▪ Deep personalization

▪ Support many use cases: hiring, connecting, job seeking, research, sales, etc.

Page 6: Machine Learning for Search at LinkedIn

Overview

[Diagram. Labels: Query; Spell Correction; Query Tagging (Name, Title, Skill); Federated Search; verticals (People, Companies, Jobs); Federated Search Blending]

Page 8: Machine Learning for Search at LinkedIn

Agenda

▪ Introduction

▪ Vertical Ranking
  – People Search by Skills [BigData’15, SIGIR’16]
  – Job Search [KDD’16]

▪ Federation [CIKM’15]

▪ Lessons

Page 9: Machine Learning for Search at LinkedIn

Introduction

▪ Skills
  – 40K+ standardized skills
  – Members get endorsed on skills
  – Represent professional expertise

Page 10: Machine Learning for Search at LinkedIn

Introduction

▪ Unique challenges of LinkedIn expertise search
  – Scale: 400M members x 40K standardized skills
  – Sparsity of skills in profiles
  – Personalization

Page 11: Machine Learning for Search at LinkedIn

Reputation

Information a decision maker uses to make a judgment on an entity with a record (*)

(*) “Building Web Reputation Systems”, Glass and Farmer, 2010

Page 12: Machine Learning for Search at LinkedIn

Skill Reputation Scores [BigData’15]


▪Decision Maker: searcher

▪Record: Professional career

▪Skill reputation: member expertise on a skill

▪Judgment: Hire?

Page 13: Machine Learning for Search at LinkedIn

Estimating Skill Reputation

Inputs: endorsements, profile, browsemap

Supervised learning algorithm: P(expert | member, skill)

Members x Skills matrix (rows are members, columns are skills, "?" = unknown):

  ?    .85  .45
  ?    ?    .35
  ?    .42  ?
  ?    ?    .05

Page 14: Machine Learning for Search at LinkedIn

Estimating Skill Reputation

Inputs: endorsements, profile, browsemap

Members x Skills matrix (rows are members, columns are skills, "?" = unknown):

  ?    .85  .45
  ?    ?    .35
  ?    .42  ?
  ?    ?    .05

Matrix Factorization

Members factor (each row is a representation of a member in latent space):

  0.5  1
  0.7  0
  0    0.6
  0.1  0

Skills factor (each column represents a skill in latent space):

  0.2  0.3  0.5
  0.5  0.7  0.2

Page 15: Machine Learning for Search at LinkedIn

Estimating Skill Reputation

Inputs: endorsements, profile, browsemap

Members x Skills matrix (rows are members, columns are skills, "?" = unknown):

  ?    .85  .45
  ?    ?    .35
  ?    .42  ?
  .02  ?    ?

Members factor:

  0.5  1
  0.7  0
  0    0.6
  0.1  0

Skills factor:

  0.2  0.3  0.5
  0.5  0.7  0.2

Fill in unknown cells in the original matrix (product of the two factors):

  .6   .85  .45
  .14  .21  .35
  .3   .42  .12
  .02  .03  .05
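The fill-in step can be checked numerically. The sketch below only multiplies the two factor matrices shown on the slide; it does not show how the factors are learned, and the function name `reconstruct` is made up for illustration.

```python
# Latent factors copied from the slide:
# 4 members x 2 latent dimensions, and 2 latent dimensions x 3 skills.
member_factors = [
    [0.5, 1.0],
    [0.7, 0.0],
    [0.0, 0.6],
    [0.1, 0.0],
]
skill_factors = [
    [0.2, 0.3, 0.5],
    [0.5, 0.7, 0.2],
]


def reconstruct(members, skills):
    """Return the dense members x skills matrix as the product of the factors."""
    n_latent = len(skills)
    n_skills = len(skills[0])
    return [
        [
            # Dot product of one member row with one skill column,
            # rounded to two decimals to match the slide's precision.
            round(sum(row[k] * skills[k][j] for k in range(n_latent)), 2)
            for j in range(n_skills)
        ]
        for row in members
    ]


scores = reconstruct(member_factors, skill_factors)
for row in scores:
    print(row)
```

Every cell of the product is defined, including the ones marked "?" in the sparse matrix, which is how the unknown member-skill scores get estimated.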

Page 16: Machine Learning for Search at LinkedIn

Features

▪ Reputation feature

▪ Social connection

▪ Homophily
  – Geo
  – Industry

▪ Textual features

Page 17: Machine Learning for Search at LinkedIn

Learning to Rank

▪ Listwise
  – Relevance is considered relative to each query
  – Allows optimizing the quality metric directly

▪ Objective function
  – Normalized Discounted Cumulative Gain (NDCG@K)
  – Graded relevance labels
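For reference, NDCG@K over graded labels can be computed as in the sketch below. The slide does not give the gain or discount functions; the exponential gain `2**label - 1` and the `log2` position discount used here are one common convention and are an assumption, as are the function names.

```python
import math


def dcg_at_k(labels, k):
    """Discounted cumulative gain over the top-k graded labels, in ranked order."""
    return sum(
        # Assumed convention: exponential gain, log2 position discount.
        (2 ** label - 1) / math.log2(position + 2)
        for position, label in enumerate(labels[:k])
    )


def ndcg_at_k(labels, k):
    """DCG@k divided by the DCG@k of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


# A ranking that already lists results in decreasing label order is ideal.
print(ndcg_at_k([3, 1, 0], k=3))
# Putting the best result last lowers the score.
print(ndcg_at_k([0, 1, 3], k=3))
```

Because the metric is normalized per query, scores are comparable across queries with different numbers of relevant results.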


Page 18: Machine Learning for Search at LinkedIn

Labeling Strategy


▪ Logs + Top-K randomization

▪ Labels
  – InMail: Perfect, label = 3
  – Click: Good, label = 1
  – Bad: label = 0
  – Uncertain: removed
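A minimal sketch of turning logged actions into graded labels follows. The label values (3, 1, 0) come from the slide. The rule for deciding which results are "Bad" and which are "Uncertain" is not stated there; the sketch assumes that un-interacted results above the last interaction are Bad and that results below it are Uncertain. The function name and the action strings are hypothetical.

```python
def label_results(actions):
    """Map per-result actions (in ranked order) to graded labels.

    actions: list of "inmail", "click", or None (no interaction).
    Returns (position, label) pairs; uncertain results are dropped.
    """
    interacted = [i for i, action in enumerate(actions) if action is not None]
    if not interacted:
        # Assumption: a search with no interaction yields no training examples.
        return []
    last = interacted[-1]
    labeled = []
    # Assumption: everything below the last interaction is uncertain and removed.
    for position, action in enumerate(actions[: last + 1]):
        if action == "inmail":
            labeled.append((position, 3))  # Perfect
        elif action == "click":
            labeled.append((position, 1))  # Good
        else:
            labeled.append((position, 0))  # Bad: assumed seen but skipped
    return labeled


print(label_results([None, "click", None, "inmail", None, None]))
```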

Page 19: Machine Learning for Search at LinkedIn

Experiments

▪ Query Tagging

▪ Target Segment: skill and no-name

▪ Baseline
  – No skill reputation feature
  – Hand-tuned

            CTR@10   # Messages per Search
Flagship    +11%     +20%
Premium     +18%     +37%

Page 20: Machine Learning for Search at LinkedIn

Agenda

▪ Introduction

▪ Vertical Ranking
  – People Search by Skills [BigData’15, SIGIR’16]
  – Job Search [KDD’16]

▪ Federation [CIKM’15]

▪ Lessons

Page 21: Machine Learning for Search at LinkedIn

Challenges of Job Search

▪ “Hidden” structures

▪ Query only represents a small fraction of information need
  – “San Francisco”, “software engineer”, “java”

▪ Job attractiveness varies on many aspects
  – “Hot” titles: “data scientist”
  – Top companies: Google, Facebook, etc.
  – Trending skills: machine learning, big data, etc.
  – Location

Page 22: Machine Learning for Search at LinkedIn

Entity-Aware Matching


Page 23: Machine Learning for Search at LinkedIn

Expertise Homophily

▪ “Classic” homophily in social networks
  – People tend to interact with similar ones

▪ Expertise homophily in job search
  – Searcher tends to apply for jobs with similar expertise
  – Apply rate of job results with overlapping skills is 2x higher

▪ Expertise: skill reputation scores
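One way to turn expertise homophily into a ranking feature is sketched below: the searcher's skill reputation scores are summed over the skills a job lists and normalized by the number of job skills. The slides do not specify the feature's formula, so this weighting, the function name and the toy scores are all assumptions.

```python
def expertise_overlap(searcher_skill_scores, job_skills):
    """Average of the searcher's reputation scores over the job's skills.

    searcher_skill_scores: dict mapping skill name to a score in [0, 1].
    job_skills: list of skill names attached to the job posting.
    Skills the searcher has no score for contribute 0.
    """
    if not job_skills:
        return 0.0
    total = sum(searcher_skill_scores.get(skill, 0.0) for skill in job_skills)
    return total / len(job_skills)


# Toy data: the searcher is strong in "java" and has no score for "sql".
searcher = {"java": 0.8, "machine learning": 0.4}
print(expertise_overlap(searcher, ["java", "sql"]))
```

Using reputation scores instead of a binary skill match lets inferred skills count even when they are missing from the profile.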


Page 24: Machine Learning for Search at LinkedIn

Entity-faceted CTRs

▪ Job attractiveness
  – Historical CTRs for individual jobs
  – Challenge: job lifetime is short -> unreliable estimation

▪ Entity-faceted historical CTRs
  – CTRs of jobs with standardized title “data scientist”
  – CTRs of jobs from company IBM
  – CTRs of jobs requiring trending skills: machine learning, big data, etc.

▪ Advantages
  – Alleviate data sparseness by grouping jobs by facets
  – Resolve cold start problem
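The grouping idea can be sketched as follows: instead of one CTR per job, impressions are pooled over all jobs that share a facet value such as title or company. The log format, field names and toy records are invented for illustration, and no smoothing is applied.

```python
from collections import defaultdict


def faceted_ctr(impressions, facet):
    """Clicks / impressions pooled over all jobs sharing the same facet value.

    impressions: iterable of dicts holding the facet key and a boolean "clicked".
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for record in impressions:
        key = record[facet]
        shown[key] += 1
        clicked[key] += int(record["clicked"])
    return {key: clicked[key] / shown[key] for key in shown}


# Toy impression log (hypothetical data).
log = [
    {"job_id": 1, "title": "data scientist", "company": "IBM", "clicked": True},
    {"job_id": 2, "title": "data scientist", "company": "Acme", "clicked": False},
    {"job_id": 3, "title": "accountant", "company": "IBM", "clicked": False},
    {"job_id": 4, "title": "data scientist", "company": "IBM", "clicked": True},
]
title_ctr = faceted_ctr(log, "title")
company_ctr = faceted_ctr(log, "company")
print(title_ctr)
print(company_ctr)
```

A newly posted job with no history of its own can still be scored through the facet values it shares with older jobs, which is the cold-start benefit the slide mentions.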


Page 25: Machine Learning for Search at LinkedIn

Experiment Results

▪ Baseline
  – All of the existing features except entity-aware ones
  – Machine learned
  – Optimized for the same objective function

               CTR      Apply Rate
Improvement    +11.3%   +5.3%

Page 26: Machine Learning for Search at LinkedIn

Agenda

▪ Introduction

▪ Vertical Ranking
  – People Search by Skills [BigData’15, SIGIR’16]
  – Job Search [KDD’16]

▪ Federation [CIKM’15]

▪ Lessons

Page 27: Machine Learning for Search at LinkedIn

Personalized Blending

Page 28: Machine Learning for Search at LinkedIn

Personalized Blending

▪ Why do we need this?
  – Not to overwhelm the user with too much information
  – Make results personally relevant

Page 29: Machine Learning for Search at LinkedIn

Blending Flow

Page 30: Machine Learning for Search at LinkedIn

Learning Model

▪ Training data: click logs

▪ Features
  – Relevance scores from base rankers
  – Searcher intent
  – Query intent
  – Prior scores

Page 31: Machine Learning for Search at LinkedIn

Calibrate Scores across Verticals

▪ Relevance scores from vertical rankers are incomparable

Page 32: Machine Learning for Search at LinkedIn

Calibrate Scores across Verticals

▪ Relevance scores from vertical rankers are incomparable

▪ Construct composite features

$$
f_1 =
\begin{cases}
\text{People relevance score of searcher}, & \text{if result is People} \\
0, & \text{otherwise}
\end{cases}
$$
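A minimal sketch of the composite-feature construction: each vertical gets its own feature slot, which carries the base ranker's score only when the result comes from that vertical and is 0 otherwise. The vertical list, feature names and function name are assumptions made for illustration.

```python
# Hypothetical set of verticals being blended.
VERTICALS = ("people", "jobs", "companies")


def composite_score_features(result_vertical, relevance_score):
    """One feature per vertical: the score if the result is from it, else 0."""
    return {
        f"{vertical}_relevance": relevance_score if vertical == result_vertical else 0.0
        for vertical in VERTICALS
    }


# A People result with base-ranker score 0.7 only fills the people slot.
print(composite_score_features("people", 0.7))
```

Because each vertical's score lands in a separate feature, the blending model can learn a separate weight per vertical rather than comparing raw scores directly.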

Page 33: Machine Learning for Search at LinkedIn

Searcher Intent

▪ Searcher’s job seeking intent if result is job vertical cluster

▪ Searcher’s job seeking intent if result is individual job

▪ Searcher’s recruiting intent if result is people vertical cluster

▪ Searcher’s recruiting intent if result is individual people

...

Page 34: Machine Learning for Search at LinkedIn

Take-Aways

▪ Text match is still important but not enough

▪ Advanced features based on semi-structured data
  – People search: skill reputation scores
  – Job search: expertise homophily

▪ Personalized Learning-to-Rank is crucial

Page 35: Machine Learning for Search at LinkedIn


Email: [email protected]

Page 36: Machine Learning for Search at LinkedIn

References

▪ “Personalized Expertise Search at LinkedIn”, Ha-Thuc, Venkataraman, Rodriguez, Sinha, Sundaram and Guo, BigData, 2015

▪ “Personalized Federated Search at LinkedIn”, Arya, Ha-Thuc and Sinha, CIKM, 2015

▪ “Learning to Rank Personalized Search Results in Professional Networks”, Ha-Thuc and Sinha, SIGIR, 2016

▪ “How to Get Them a Dream Job?”, Li, Arya, Ha-Thuc and Sinha, KDD, 2016