Seesaw: Personalized Web Search
Jaime Teevan, MIT, with Susan T. Dumais and Eric Horvitz, MSR


Dec 19, 2015

Transcript
Page 1:

Seesaw: Personalized Web Search

Jaime Teevan, MIT
with Susan T. Dumais and Eric Horvitz, MSR

Page 2:

Page 3:

Personalization Algorithms

[Diagram: the space of personalization approaches, spanning standard IR v. query expansion, involving the document, the query, and the user, split across server and client.]

Page 4:

Personalization Algorithms

[Diagram: the same space of approaches, contrasted with result re-ranking.]

Page 5:

Result Re-Ranking

Ensures privacy
Good evaluation framework
Can look at rich user profile

v. lightweight user models, collected on the server side and sent as query expansion
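The client-side re-ranking idea above can be sketched in a few lines. This is a hypothetical illustration, not Seesaw's implementation: the result list, the user-model terms, and the scoring function are all made up.

```python
def rerank(results, personal_score):
    """Re-order server results by a personalized score computed on the
    client, so the user profile never leaves the machine."""
    return sorted(results, key=personal_score, reverse=True)

# Hypothetical user model: term -> count (the profile stays client-side).
user_terms = {"search": 93, "mit": 4, "cat": 10}

results = [
    {"title": "cat pictures"},
    {"title": "mit search research"},
]

# Score each result by the user-model weight of its title terms.
ranked = rerank(
    results,
    lambda r: sum(user_terms.get(w, 0) for w in r["title"].split()),
)
```

Because only the final ordering depends on the profile, the server never sees anything beyond the original query.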

Page 6:

Seesaw Search Engine

[Diagram: the user model is a list of term counts: dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1.]

Page 7:

Seesaw Search Engine

[Diagram: a query is issued against the user model of term counts (dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1).]

Page 8:

Seesaw Search Engine

[Diagram: the query is matched against term clusters related to the user model, e.g. "dog cat monkey banana food", "baby infant child boy girl", "forest hiking walking gorp", "csail mit artificial research robot web", "search retrieval ir hunt".]

Page 9:

Seesaw Search Engine

[Diagram: candidate documents are scored (1.6, 0.2, 6.0, 0.2, 2.7, 1.3) and assembled into a re-ranked search results page; the example document "web search retrieval ir hunt" scores 1.3.]

Page 10:

Calculating a Document's Score

Based on standard tf.idf

[Diagram: example document "web search retrieval ir hunt", score 1.3.]

Page 11:

Calculating a Document's Score

Based on standard tf.idf:

wi = log [ (ri+0.5)(N-ni-R+ri+0.5) / ((ni-ri+0.5)(R-ri+0.5)) ]

[Diagram: example per-term contributions 0.1, 0.5, 0.05, 0.35, 0.3 summing to the document score 1.3.]

User as relevance feedback
Stuff I've Seen index
More is better
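The weight above is the standard BM25 relevance-feedback term weight, and it translates directly into code. A minimal sketch; the example counts are made up:

```python
import math

def bm25_rf_weight(N, n_i, R, r_i):
    """BM25 term weight with relevance feedback, as on the slide:
    wi = log[(ri+0.5)(N-ni-R+ri+0.5) / ((ni-ri+0.5)(R-ri+0.5))]
    N   : documents in the corpus
    n_i : documents containing term i
    R   : known relevant documents
    r_i : relevant documents containing term i
    """
    num = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    den = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(num / den)

def document_score(tf_and_weights):
    """Score = sum over terms of tf_i * w_i."""
    return sum(tf * w for tf, w in tf_and_weights)

# A term common in the relevant set but rare overall gets a large weight.
w = bm25_rf_weight(N=10000, n_i=50, R=20, r_i=15)
```

The 0.5 terms smooth the counts so the weight stays finite when a term appears in none (or all) of the relevant documents.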

Page 12:

Finding the Score Efficiently

Corpus representation (N, ni): Web statistics; result set
Document representation: download the document; use the result-set snippet

Efficiency hacks generally OK!

Page 13:

Evaluating Personalized Search

15 evaluators
Evaluate 50 results for a query: Highly Relevant, Relevant, or Irrelevant

Measure algorithm quality with DCG:

DCG(i) = Gain(i), if i = 1
DCG(i) = DCG(i-1) + Gain(i)/log(i), otherwise
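The DCG recurrence above can be computed directly. A minimal sketch: the base of the log is not stated on the slide, so base 2 is an assumption here, as is the gain mapping in the example.

```python
import math

def dcg(gains):
    """DCG(i) = Gain(i) if i = 1, else DCG(i-1) + Gain(i)/log(i).
    Log base 2 is assumed; the slide leaves the base unspecified."""
    total = 0.0
    for i, gain in enumerate(gains, start=1):
        if i == 1:
            total += gain
        else:
            total += gain / math.log2(i)
    return total

# Assumed gains: Highly Relevant = 2, Relevant = 1, Irrelevant = 0.
quality = dcg([2, 1, 0, 1])
```

The log discount means a relevant result at rank 1 counts far more than the same result at rank 49, which is exactly why re-ranking can improve a result set without changing its contents.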

Page 14:

Evaluating Personalized Search

Query selection: chose from 10 pre-selected queries, or a previously issued query

Pre-selected: cancer, Microsoft, traffic, …
Joe: bison frise, Red Sox, airlines, …
Mary: Las Vegas, rice, McDonalds, …

53 pre-selected (2-9 per query); total: 137

Page 15:

Seesaw Improves Text Retrieval

[Chart: DCG (0-0.6) for Random (Rand), Relevance Feedback (RF), Seesaw (SS), Web, and Combo rankings.]

Page 16:

Text Features Not Enough

[Chart: DCG (0-0.6) for Rand, RF, SS, Web, and Combo rankings.]

Page 17:

Take Advantage of Web Ranking

[Chart: DCG (0-0.6) for Rand, RF, SS, Web, and Combo rankings.]

Page 18:

Further Exploration

Explore larger parameter space
Learn parameters: based on individual; based on query; based on results
Give user control?

Page 19:

Making Seesaw Practical

Learn most about personalization by deploying a system

Best algorithm reasonably efficient
Merging server and client
Query expansion: get more relevant results into the set to be re-ranked
Design snippets for personalization

Page 20:

User Interface Issues

Make personalization transparent
Give user control over personalization: slider between Web and personalized results; allows for background computation

Creates problem with re-finding: results change as the user model changes
Thesis research: Re:Search Engine

Page 21:

Thank you! [email protected]

Page 22:

END

Page 23:

Personalizing Web Search

Motivation, Algorithms, Results, Future Work


Page 25:

Study of Personal Relevancy

15 participants: Microsoft employees (managers, support staff, programmers, …)
Evaluate 50 results for a query: Highly Relevant, Relevant, or Irrelevant
~10 queries per person

Page 26:

Study of Personal Relevancy

Query selection: chose from 10 pre-selected queries, or a previously issued query

Pre-selected: cancer, Microsoft, traffic, …
Joe: bison frise, Red Sox, airlines, …
Mary: Las Vegas, rice, McDonalds, …

53 pre-selected (2-9 per query); total: 137

Page 27:

Relevant Results Have Low Rank

[Chart: distribution of Highly Relevant, Relevant, and Irrelevant results across ranks 1-49.]

Page 28:

Relevant Results Have Low Rank

[Chart: the same distribution across ranks 1-49, shown separately for Rater 1 and Rater 2.]

Page 29:

Same Results Rated Differently

Average inter-rater reliability: 56%
Different from previous research: Belkin reports 94% IRR in TREC; Eastman reports 85% IRR on the Web
Asked for personal relevance judgments
Some queries more correlated than others

Page 30:

Same Query, Different Intent

Different meanings: "Information about the astronomical/astrological sign of cancer" v. "information about cancer treatments"
Different intents: "is there any new tests for cancer?" v. "information about cancer treatments"

Page 31:

Same Intent, Different Evaluation

Query: Microsoft. Stated intents: "information about microsoft, the company"; "Things related to the Microsoft corporation"; "Information on Microsoft Corp"

31/50 results rated as not irrelevant; on only 6/31 does more than one rater agree; all three agree only on www.microsoft.com. Inter-rater reliability: 56%

Page 32:

Search Engines are for the Masses

[Diagram: Joe and Mary both receive the same ranked list.]

Page 33:

Much Room for Improvement

Group ranking: best improves on Web by 38%; with more people, less improvement

[Chart: group-ranking DCG (1.2-1.4) v. number of people (1-6).]

Page 34:

Much Room for Improvement

Group ranking: best improves on Web by 38%; with more people, less improvement
Personal ranking: best improves on Web by 55%; remains constant

[Chart: personalized and group DCG (1.2-1.4) v. number of people (1-6).]

Page 35:

Personalizing Web Search

Motivation, Algorithms, Results, Future Work

Page 36:

BM25 with Relevance Feedback

[Diagram: the term weight wi = log(…) is built from four counts: N (documents in the corpus), ni (documents containing term i), R (known relevant documents), and ri (relevant documents containing term i).]

Score = Σ tfi * wi

Page 37:

BM25 with Relevance Feedback

wi = log [ (ri+0.5)(N-ni-R+ri+0.5) / ((ni-ri+0.5)(R-ri+0.5)) ]

Score = Σ tfi * wi

Page 38:

User Model as Relevance Feedback

wi = log [ (ri+0.5)(N'-ni'-R+ri+0.5) / ((ni'-ri+0.5)(R-ri+0.5)) ]

where N' = N+R and ni' = ni+ri

Score = Σ tfi * wi

Page 39:

User Model as Relevance Feedback

[Diagram: corpus counts N and ni come from the World; feedback counts R and ri come from the User.]

Score = Σ tfi * wi

Page 40:

User Model as Relevance Feedback

[Diagram: the World counts (N, ni) are restricted to the part of the World related to the query.]

Score = Σ tfi * wi

Page 41:

User Model as Relevance Feedback

Query Focused Matching

[Diagram: both the World counts (N, ni) and the User counts (R, ri) are restricted to the parts related to the query.]

Score = Σ tfi * wi

Page 42:

User Model as Relevance Feedback

Query Focused Matching v. World Focused Matching

[Diagram: query-focused matching restricts the counts to the parts of the World and the User related to the query; world-focused matching draws ni from the full World.]

Score = Σ tfi * wi

Page 43:

Parameters

Matching

User representation

World representation

Query expansion

Page 44:

Parameters

Matching: query focused; world focused
User representation
World representation
Query expansion


Page 46:

User Representation

Stuff I've Seen (SIS) index: an MSR research project [Dumais, et al.], an index of everything a user has seen

Options: all SIS; recently indexed documents; Web documents in the SIS index; query history; none

Page 47:

Parameters

Matching: query focused; world focused
User representation: all SIS; recent SIS; Web SIS; query history; none
World representation
Query expansion


Page 49:

World Representation

Document representation: full text; title and snippet
Corpus representation: Web; result set (title and snippet); result set (full text)

Page 50:

Parameters

Matching: query focused; world focused
User representation: all SIS; recent SIS; Web SIS; query history; none
World representation: document as full text or title and snippet; corpus as Web, result set (full text), or result set (title and snippet)
Query expansion


Page 52:

Query Expansion

All words in document v. query focused

Example snippet: "The American Cancer Society is dedicated to eliminating cancer as a major health problem by preventing cancer, saving lives, and diminishing suffering through ..."

Page 53:

Parameters

Matching: query focused; world focused
User representation: all SIS; recent SIS; Web SIS; query history; none
World representation: document as full text or title and snippet; corpus as Web, result set (full text), or result set (title and snippet)
Query expansion: all words; query focused


Page 55:

Personalizing Web Search

Motivation, Algorithms, Results, Future Work

Page 56:

Best Parameter Settings

Matching: query focused
User representation: all SIS
World representation: document as title and snippet; corpus as result set (title and snippet)
Query expansion: query focused

Page 57:

Seesaw Improves Retrieval

[Chart: DCG (0-0.6) for No user model (None), Random (Rand), Relevance Feedback (RF), Seesaw (SS), Web, and Combo rankings.]

Page 58:

Text Alone Not Enough

[Chart: DCG (0-0.6) for None, Rand, RF, SS, Web, and Combo rankings.]

Page 59:

Incorporate Non-text Features

[Chart: DCG (0-0.6) for None, Rand, RF, SS, Web, and Combo rankings.]

Page 60:

Summary

[Chart: DCG (0-1) for None, SS, Web, Group, and a future "?" ranking.]

Rich user model important for search personalization
Seesaw improves text-based retrieval
Need other features to improve on the Web
Lots of room for improvement

Page 61:

Personalizing Web Search

Motivation, Algorithms, Results, Future Work

Future work: further exploration; making Seesaw practical; user interface issues

Page 62:

Further Exploration

Explore larger parameter space
Learn parameters: based on individual; based on query; based on results
Give user control?

Page 63:

Making Seesaw Practical

Learn most about personalization by deploying a system

Best algorithm reasonably efficient
Merging server and client
Query expansion: get more relevant results into the set to be re-ranked
Design snippets for personalization

Page 64:

User Interface Issues

Make personalization transparent
Give user control over personalization: slider between Web and personalized results; allows for background computation

Creates problem with re-finding: results change as the user model changes
Thesis research: Re:Search Engine

Page 65:

Thank you!

Page 66:

Search Engines are for the Masses

Best common ranking: sort results by the number of raters marking them highly relevant, then by relevant

Measure distance with Kendall-Tau; the Web ranking is more similar to the common ranking
Individual's ranking distance: 0.469
Common ranking distance: 0.445

DCG(i) = Gain(i), if i = 1
DCG(i) = DCG(i-1) + Gain(i)/log(i), otherwise
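The Kendall-Tau distance used above counts pairwise ordering disagreements between two rankings. A minimal normalized version; the normalization to [0, 1] and the example rankings are assumptions, not from the slide:

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs the two rankings order differently:
    0 means identical order, 1 means exactly reversed."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    # A pair is discordant when its relative order flips between rankings.
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)
```

A lower distance to the common ranking than to each individual's ranking is exactly the slide's point: the Web ranking serves the aggregate better than it serves any one person.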