Query Suggestion Using Hitting Time Qiaozhu Mei , Dengyong Zhou , Kenneth Church University of Illinois at Urbana-Champaign Microsoft Research, Redmond

Mar 29, 2015

Page 1:

Query Suggestion Using Hitting Time

Qiaozhu Mei †, Dengyong Zhou ‡, Kenneth Church ‡

† University of Illinois at Urbana-Champaign
‡ Microsoft Research, Redmond

Page 2:

Motivating Examples

Query: MSG (Sports center? Food Additive?)

1. Difficult for a user to express an information need
2. Difficult for a search engine to infer the information need

Query Suggestions: accurately express the information need; make the information need easy to infer

Page 3:

Motivating Examples (Cont.)

Welcome to the hotel california

Suggestions

hotel california

eagles hotel california

hotel california band

hotel california by the eagles

hotel california song

lyrics of hotel california

listen hotel california eagle

Page 4:

Motivating Examples: Personalization

Query: MSR

• Mountain safety research
• Metropolis Street Racer
• Molten salt reactor
• Mars Sample Return
• Magnetic Stripe Reader

Actually Looking for Microsoft Research…

Page 5:

Research Questions

• How can we generate query suggestions in a principled way?

• Can we generate personalized query suggestions using the same method?

• Can this method be generalized to other search related tasks?

Page 6:

Rest of This Talk

• Random Walk, Hitting Time, and Bipartite Graph
• Generating Query Suggestion
• Personalized Query Suggestion
• Experiments
• Discussion and Summary

Page 7:

Random Walk and Hitting Time

[Figure: a random walk on a graph; from vertex i the walk moves to vertex j with P = 0.7 and to vertex k with P = 0.3; A is the target set of vertices]

• Hitting Time
  – $T^A$: the first time that the random walk is at a vertex in A
• Mean Hitting Time
  – $h_i^A$: expectation of $T^A$ given that the walk starts from vertex i

Page 8:

Computing Hitting Time

$T^A$: the first time that the random walk is at a vertex in A

$$T^A = \min\{t : X_t \in A,\ t \ge 0\}$$

$h_i^A$: expectation of $T^A$ given that the walk starts from vertex i

$$h_i^A = \begin{cases} \sum_{j \in V} p(i \to j)\, h_j^A + 1, & \text{for } i \notin A \\ 0, & \text{for } i \in A \end{cases}$$

Iterative Computation

[Figure: vertex i moves to j with probability 0.7 and to k with probability 0.3; vertices in A have h = 0]

$$h_i^A = 0.7\, h_j^A + 0.3\, h_k^A + 1$$

Apparently, $h_i^A = 0$ for those $i \in A$.
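
The recursion on this slide can be iterated directly. The following is a minimal Python sketch, assuming the walk is given as a dictionary of transition probabilities. The function name `mean_hitting_time`, the toy graph, and the iteration count are illustrative assumptions and do not come from the talk.

```python
def mean_hitting_time(p, targets, iterations=1000):
    """Iterate h_i = 1 + sum_j p(i -> j) * h_j for i outside A, with h_i = 0 inside A.

    p: dict mapping each vertex to a dict {neighbor: transition probability}.
    targets: set of vertices forming the target set A.
    """
    h = {v: 0.0 for v in p}
    for _ in range(iterations):
        # The comprehension reads the previous h and builds a fresh one,
        # so every vertex is updated from the same earlier iterate.
        h = {
            i: 0.0 if i in targets
            else 1.0 + sum(prob * h[j] for j, prob in p[i].items())
            for i in p
        }
    return h


# Hypothetical toy walk that echoes the 0.7 / 0.3 split on the slide.
p = {
    "i": {"j": 0.7, "k": 0.3},
    "j": {"A": 1.0},
    "k": {"i": 1.0},
    "A": {"A": 1.0},
}
h = mean_hitting_time(p, targets={"A"})
print(h)
```

Starting from all zeros, the t-th iterate is the expected value of min(T^A, t), so stopping after a few iterations gives a truncated hitting time rather than the exact one.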

Page 9:

Bipartite Graph and Hitting Time

Bipartite Graph:
- Edges between V1 and V2
- No edge inside V1 or V2
- Edges are weighted
- e.g., V1 = query; V2 = Url

Expected proximity of query i to the query A: hitting time of i → A, $h_i^A$

[Figure: bipartite graph with V1 on the left and V2 on the right; target query A and queries i, k in V1, url j in V2; edge weight w(i, j) = 3, with other edge weights 7, 1, 4, 5]

$$p(i \to j) = \frac{w(i, j)}{d_i} = \frac{3}{3 + 7}$$

$$p(j \to i) = \frac{w(i, j)}{d_j} = \frac{3}{3 + 1}$$

$$p(i \to k) = \sum_{j \in V_2} \frac{w(i, j)}{d_i} \cdot \frac{w(k, j)}{d_j}$$

• Convert to a directed graph; one group can even be collapsed
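
As a rough illustration of collapsing one side of the bipartite graph, the Python sketch below computes the two-step query → url → query transition probabilities from edge weights. The function name `collapse_bipartite`, the vertex labels, and the assumption that the weight-1 edge at j goes to query k are hypothetical; only the weights 3, 7, and 1 echo the slide.

```python
from collections import defaultdict


def collapse_bipartite(weights):
    """Return p[i][k] = sum over urls j of w(i, j)/d_i * w(k, j)/d_j.

    weights: dict {(query, url): edge weight}.
    """
    d_query = defaultdict(float)   # weighted degree of each query
    d_url = defaultdict(float)     # weighted degree of each url
    by_url = defaultdict(list)     # url -> [(query, weight), ...]
    for (q, u), w in weights.items():
        d_query[q] += w
        d_url[u] += w
        by_url[u].append((q, w))

    p = defaultdict(lambda: defaultdict(float))
    for (i, j), w_ij in weights.items():
        for k, w_kj in by_url[j]:
            # One step i -> j, then one step j -> k.
            p[i][k] += (w_ij / d_query[i]) * (w_kj / d_url[j])
    return {i: dict(row) for i, row in p.items()}


# Hypothetical toy graph: d_i = 3 + 7 and d_j = 3 + 1.
weights = {("i", "j"): 3, ("i", "u2"): 7, ("k", "j"): 1}
p = collapse_bipartite(weights)
print(p["i"])
```

Each row of the collapsed matrix still sums to 1, so it can be fed straight into an iterative hitting-time computation over queries only.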

Page 10:

Generate Query Suggestion

[Figure: Query-Url bipartite graph. Queries: aa (marked T), american airline, mexicana. Urls: www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Mexicana. Edge weights shown: 300, 15]

• Construct a (kNN) subgraph from the query log data (of a predefined number of queries/urls)
• Compute transition probabilities p(i → j)
• Compute hitting time $h_i^A$
• Rank candidate queries using $h_i^A$
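
The steps on this slide can be strung together as in the Python sketch below. It is only a sketch under stated assumptions: the function name `suggest`, every click count except the values 300 and 15 (whose edges are guessed), the fixed iteration count, and the choice to rank by ascending hitting time are illustrative and are not confirmed by the talk. The kNN subgraph construction is skipped, and the toy graph stands in for the subgraph.

```python
from collections import defaultdict


def suggest(clicks, target, top_n=3, iterations=50):
    """Rank queries by (truncated) mean hitting time to `target`.

    clicks: dict {(query, url): click count}, standing in for the subgraph.
    """
    d_q, d_u, by_url = defaultdict(float), defaultdict(float), defaultdict(list)
    for (q, u), w in clicks.items():
        d_q[q] += w
        d_u[u] += w
        by_url[u].append((q, w))

    # Two-step query -> url -> query transition probabilities.
    p = {q: defaultdict(float) for q in d_q}
    for (i, j), w_ij in clicks.items():
        for k, w_kj in by_url[j]:
            p[i][k] += (w_ij / d_q[i]) * (w_kj / d_u[j])

    # Iterative hitting time; the target keeps h = 0 throughout.
    h = {q: 0.0 for q in d_q}
    for _ in range(iterations):
        h = {
            i: 0.0 if i == target
            else 1.0 + sum(prob * h[k] for k, prob in p[i].items())
            for i in p
        }

    # Assumption: a smaller hitting time means a closer query.
    ranked = sorted((q for q in h if q != target), key=lambda q: h[q])
    return ranked[:top_n], h


# Hypothetical click counts on the slide's example graph.
clicks = {
    ("aa", "www.aa.com"): 300,
    ("aa", "www.theaa.com/travelwatch/planner_main.jsp"): 15,
    ("american airline", "www.aa.com"): 100,
    ("mexicana", "en.wikipedia.org/wiki/Mexicana"): 20,
    ("mexicana", "www.aa.com"): 5,
}
top, h = suggest(clicks, target="aa")
print(top)
```

Queries that cannot reach the target simply accumulate one step per iteration, so they end up at the bottom of the ranking.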

Page 11:

Intuition

• Why does it work?
  – A url is close to a query if freq(q, url) dominates the number of clicks on this url (most people use q to access url)
  – A query is close to the target query if it is close to many urls that are close to the target query

Page 12:

Personalized Query Suggestion

• Queries are ambiguous
• Different users → different information needs → different query suggestions
• Simple approach: build the graph and compute hitting time solely based on the user's history
• Data Sparseness
  – E.g., you cannot see a query if you never used it
• Alternative: modify the bipartite graph instead of rebuilding it all

Page 13:

Personalize the Bipartite Graph

[Figure: Query-Url bipartite graph. Queries: aa (marked T), american airline, alcoholics anonymous, plus a pseudo query P: "aa" + user. Urls: www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, www.alcoholics-anonymous.org, en.wikipedia.org/wiki/Alcoholics_Anonymous. Edges of the pseudo query are labeled p(Url | User, Query) and p(Query | User, Url)]

• Introduce a pseudo (personalized) query
• Reweight edges using personalized probabilities
• Key: How to compute p(Url | User, Query)
  – From w(url, user, query)
  – Sparse data!
  – Compute a smoothed p(Url | User, Query)

Page 14:

Personalization with Backoff (Mei and Church 08)

$$P(\text{Url} \mid IP, Q) = \lambda_0 P(\text{Url} \mid IP_0, Q) + \lambda_1 P(\text{Url} \mid IP_1, Q) + \lambda_2 P(\text{Url} \mid IP_2, Q) + \lambda_3 P(\text{Url} \mid IP_3, Q) + \lambda_4 P(\text{Url} \mid IP_4, Q)$$

Backoff levels:
• $IP_4$ = 156.111.188.243 (full personalization: sparse data!)
• $IP_3$ = 156.111.188.*
• $IP_2$ = 156.111.*.*
• $IP_1$ = 156.*.*.*
• $IP_0$ = *.*.*.* (no personalization: lose the opportunity)

Personalization with backoff: We don't have enough data for everyone!
- Backoff to classes of users (e.g., IP)
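
To make the backoff idea concrete, here is a minimal Python sketch that interpolates click-based estimates over progressively shorter IP prefixes. The helper names (`backoff_levels`, `build_counts`, `smoothed_p_url`), the toy log, the placeholder URL labels, and the uniform interpolation weights are all assumptions for illustration. The slide does not say how the weights are chosen.

```python
from collections import defaultdict


def backoff_levels(ip):
    """Return prefixes from the full address down to all wildcards."""
    parts = ip.split(".")
    return [".".join(parts[:n] + ["*"] * (4 - n)) for n in range(4, -1, -1)]


def build_counts(log):
    """Aggregate (ip, query, url) click records at every backoff level."""
    counts = defaultdict(lambda: defaultdict(int))
    for ip, query, url in log:
        for prefix in backoff_levels(ip):
            counts[(prefix, query)][url] += 1
    return counts


def smoothed_p_url(url, ip, query, counts, lambdas):
    """Interpolate p(url | prefix, query) over prefixes of `ip`.

    lambdas[n] weights the level that keeps the first n bytes of the address.
    Levels with no data contribute nothing (no renormalization in this sketch).
    """
    total = 0.0
    for n, prefix in zip(range(4, -1, -1), backoff_levels(ip)):
        clicks = counts.get((prefix, query), {})
        n_clicks = sum(clicks.values())
        if n_clicks:
            total += lambdas[n] * clicks.get(url, 0) / n_clicks
    return total


# Hypothetical log; "url_research" and "url_outdoor" are placeholder labels.
log = [
    ("156.111.188.243", "msr", "url_research"),
    ("156.111.188.7", "msr", "url_research"),
    ("156.111.20.5", "msr", "url_outdoor"),
    ("10.0.0.1", "msr", "url_outdoor"),
    ("10.0.0.2", "msr", "url_outdoor"),
]
counts = build_counts(log)
lambdas = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumed uniform weights, summing to 1
p_personal = smoothed_p_url("url_research", "156.111.188.243", "msr", counts, lambdas)
p_global = counts[("*.*.*.*", "msr")]["url_research"] / sum(counts[("*.*.*.*", "msr")].values())
print(p_personal, p_global)
```

In this toy log the smoothed estimate for the user's own address is well above the unpersonalized one, because nearby addresses mostly clicked the same URL.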

Page 15:

Experiments

• Query Suggestion using Query Logs
  – commercial search engine log (1.5 years)
  – 637 million queries; 585 million urls
  – Query-click bipartite graph
• Author/keyword suggestion using DBLP
  – titles and authors from DBLP
  – 110k papers, 580k authors
  – Coauthor graph, keyword graph, author-keyword bipartite graph
• Baselines: nearest neighbor; personalized pagerank

Page 16:

Result: Query Suggestion

Query = friends

Hitting time:
wikipedia friends
friends tv show wikipedia
friends home page
friends warner bros
the friends series
friends official site
friends(1994)

Google:
friendship
friends poem
friendster
friends episode guide
friends scripts
how to make friends
true friends

Yahoo:
secret friends
friends reunited
hide friends
hi 5 friends
find friends
poems for friends
friends quotes

Page 17:

Result: Query Suggestion (II)

Query = aa

Yahoo:
aa route planner
aa route finder
aa airlines
aa meetings
aa autoroute
aa road map

Live:
aa route finder
aa route planner
aa airlines
american airlines
aa meeting
aa road map

Hitting time:
alcoholics anonymous
automobile association
theaa
american airlines
american air
american airline ticket reservation

Query = ranknet

Hitting time:
learning to rank
ndcg measure ir
ndcg
lambdarank
Chris burges
pairwise test

Page 18:

Results: Personalized Query Suggestion

Query = msr

No personalization

mountian safety research

msrcorp

msr outdoor equipment

msr camp stoves

msr snowshoes

msr racing

Personalized

Microsoft research

research

what is research

research website

microsoft research and development

yahoo research labs

Page 19:

Result: Author Suggestion

Query = Jon Kleinberg

Hitting time (favors students, especially current students):
Aleksandrs Slivkins
Mark Sandler
Tom Wexler
Lars Backstrom
Elliot Anshelevich
Xiangyang Lan

Nearest Neighbor (personalized Pagerank is similar; famous researchers + former students):
Prabhakar Raghavan
Eva Tardos
Daniel P. Huttenlocher
David Kempe
Amit Kumar
Andrew Tomkins

Page 20:

Result: Keyword Suggestion

Query = olap
Dimension updates
OLAP data
OLAP cubes
OLAP queries
View size
Hierarchical cluster

Query = social network
Knowledge collaboration
Community structure
Resource organization
Information kiosks
Efficient searching
Network extraction

Query = pagerank
Pagerank computation
Ranking systems
Pagerank approximation
Incremental computations
Web spam
Iterative computation

Page 21:

Result: Keyword Suggestion for Author

Query = Jiawei Han

Baselines:
mining
data
frequent
Efficient
pattern
data mining

Hitting time:
large databases
frequent pattern
sequential pattern
pattern mining
frequent
multi dimensional

Query = Michael I. Jordan

Baselines:
learning
statistical
kernel
markov
inference
model

Hitting time:
Dirichlet process
approximate inference
dirichlet
mean field
supervised learning
graphic models

Page 22:

Discussions

• Hitting time effectively boosts infrequent queries
  – Nearest Neighbor & personalized pagerank favor frequent queries
• Fast convergence: a few iterations and a subgraph get most of the value
• No parameter to tune
• Can be generalized to many other tasks (on different graphs)

Page 23:

Ranking on Query log Graph and Search Tasks

• Query → Query: query suggestion
• Url → Url: finding related pages
  – www.cs.jhu.edu/~brill; "research.microsoft.com/users/brill"
• IP → IP: finding similar users
• Url → Query: Annotation, Summarization, ads term
• Query → Url: Search
• IP, Query → Url: Personalized Search
• IP, Query → Query: Personalized Query Suggestion
• Many other opportunities!

Page 24:

Summary

• Generate query suggestions using hitting time on query-click graph
• Personalized query suggestion
• Generalizable to other search tasks
• Future work:
  – Different types of graphs: e.g., query sessions
  – Combine with other features
  – Large scale evaluation

Page 25:

Thanks!