Modeling and Detecting Changes in User Satisfaction Julia Kiseleva*, Eric Crestan, Riccardo Brigo, Roland Dittel *Eindhoven University of Technology Microsoft Bing

Cikm 2014 v2

Jul 15, 2015


Data & Analytics

Julia Kiseleva
Transcript
Page 1: Cikm 2014 v2

Modeling and Detecting Changes in User Satisfaction

Julia Kiseleva*, Eric Crestan, Riccardo Brigo, Roland Dittel

*Eindhoven University of Technology

Microsoft Bing

Page 2: Cikm 2014 v2

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

Page 5: Cikm 2014 v2

What is User Satisfaction?

<QUERY, SERP>, Pr(Ref.)

Assumption: If a “significant” number of users reformulate a query for a particular SERP, it is an indication of a change in user preferences

Page 6: Cikm 2014 v2

World May Change User Preferences

Page 7: Cikm 2014 v2

[Figure: timeline showing <QUERY, SERP> at time ti with reformulation probability Pr_ti, and <QUERY, SERP> at time ti+1 with reformulation probability Pr_ti+1]

How Can We Detect the Changes?

Page 8: Cikm 2014 v2

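[Figure: timeline showing <QUERY, SERP> at time ti with reformulation probability Pr_ti, and <QUERY, SERP> at time ti+1 with reformulation probability Pr_ti+1; the two are compared through | Pr_ti - Pr_ti+1 |]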

How Can We Detect the Changes?
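The comparison | Pr_ti - Pr_ti+1 | can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the log layout `(query, serp_id, was_reformulated)`, the function names, and the sample counts are all hypothetical.

```python
from collections import Counter


def reformulation_probability(impressions):
    """Return Pr(Ref.) for every (query, serp_id) pair.

    `impressions` is an iterable of (query, serp_id, was_reformulated)
    tuples. This log layout is an assumption made for the sketch.
    """
    shown = Counter()
    reformulated = Counter()
    for query, serp_id, was_reformulated in impressions:
        shown[(query, serp_id)] += 1
        if was_reformulated:
            reformulated[(query, serp_id)] += 1
    return {pair: reformulated[pair] / shown[pair] for pair in shown}


def probability_change(pr_before, pr_after, pair):
    """|Pr_ti - Pr_ti+1| for one pair; a pair unseen in a period counts as 0.0."""
    return abs(pr_before.get(pair, 0.0) - pr_after.get(pair, 0.0))


# Made-up sample data: 1 of 10 users reformulates at ti, 7 of 10 at ti+1.
pair = ("cikm conference", "serp-a")
period_i = [(*pair, False)] * 9 + [(*pair, True)]
period_next = [(*pair, False)] * 3 + [(*pair, True)] * 7

pr_i = reformulation_probability(period_i)
pr_next = reformulation_probability(period_next)
change = probability_change(pr_i, pr_next, pair)
```

Whether `change` counts as "significant" still needs a threshold, which the later slides address.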

Page 9: Cikm 2014 v2

• There are many definitions of query reformulation in the literature

• We use query expansion

o new years wallpaper IS REFORMULATED WITH 2014

o medals Olympics IS REFORMULATED WITH 2014

o ct 40ez IS REFORMULATED WITH 2013

o march 31 holiday IS REFORMULATED WITH 2014

o …

Detecting Query Reformulation
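One way to read "query expansion" is that the follow-up query keeps every term of the original query and adds new ones. The sketch below encodes that reading as an assumption; the function name `expansion_terms` and the whitespace tokenization are hypothetical choices, not taken from the talk.

```python
def expansion_terms(query, next_query):
    """Return the terms `next_query` adds to `query`, or None if it is not an expansion.

    Assumption: an expansion keeps all original terms (case-insensitive,
    whitespace-tokenized) and adds at least one new term.
    """
    original = query.lower().split()
    remaining = next_query.lower().split()
    for term in original:
        if term not in remaining:
            return None  # an original term was dropped or changed
        remaining.remove(term)
    return remaining or None  # an identical query is not a reformulation


# Examples taken from the slide.
added = expansion_terms("medals Olympics", "medals Olympics 2014")
```

With this definition, "ct 40ez" followed by "ct 40ez 2013" yields the reformulation term "2013", while a rewritten query that drops a term yields nothing.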

Page 10: Cikm 2014 v2

An Example of the Drift in Reformulation Signal

Page 11: Cikm 2014 v2

The Explanation of the Drift

Before November 2013 After November 2013

The Question: “How to detect this kind of change?”

Page 12: Cikm 2014 v2

• Change detection techniques

o In dynamically changing and non-stationary environments, the data distribution can change over time, yielding the phenomenon of concept drift

o Real concept drift refers to changes in the conditional distribution of the output (i.e., target variable) given the input (input features)

• Concept drift types:

Change Detection Techniques

Page 13: Cikm 2014 v2

• Concept drift types:

o Sudden/abrupt [Figure: data mean over time]

Disambiguation, such as “flawless Beyoncé”

Change Detection Techniques

Page 14: Cikm 2014 v2

• Concept drift types:

o Incremental [Figure: data mean over time]

Disambiguation, such as “cikm conference 2014”

Change Detection Techniques

Page 15: Cikm 2014 v2

• Concept drift types:

o Gradual [Figure: data mean over time]

Breaking news, such as “idaho bus crash investigation”

Change Detection Techniques

Page 16: Cikm 2014 v2

• Concept drift types:

o Reoccurring [Figure: data mean over time]

Seasonal change, such as “black Friday 2014”

Change Detection Techniques

Page 17: Cikm 2014 v2

• Concept drift types:

[Figure: data mean over time]

Change Detection Techniques

Page 18: Cikm 2014 v2

• Concept drift types:

[Figure: data mean over time, one panel per type]

o Sudden/abrupt

o Incremental

o Gradual

o Reoccurring concepts

o Outlier (not concept drift)

• Examples shown on the slide:

o Disambiguation, such as “medal olympics 2014”

o Seasonal change, such as “black Friday 2014”

o Breaking news, such as “idaho bus crash investigation”

o Disambiguation, such as “cikm conference 2014”

Change Detection Techniques
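The drift shapes named on this slide can be mimicked with synthetic data. The sketch below is illustrative only: `drift_series`, the levels 0.1 and 0.8, and the transition points are all hypothetical, and the behavior per type follows the common textbook description of each drift type rather than anything stated in the talk.

```python
import random


def drift_series(kind, length=100, old=0.1, new=0.8, seed=0):
    """Return a synthetic "data mean over time" series for one drift kind."""
    rng = random.Random(seed)
    start, end = length // 3, 2 * length // 3  # arbitrary transition region
    series = []
    for t in range(length):
        progress = min(1.0, max(0.0, (t - start) / (end - start)))
        if t < start:
            value = old
        elif kind == "sudden":
            value = new  # switches at once and stays
        elif kind == "incremental":
            value = old + progress * (new - old)  # passes through intermediate values
        elif kind == "gradual":
            # alternates between old and new; new becomes more likely over time
            value = new if rng.random() < progress else old
        elif kind == "reoccurring":
            value = new if t < end else old  # the old concept comes back
        elif kind == "outlier":
            value = new if t == start else old  # one-off spike, not a drift
        else:
            raise ValueError(f"unknown drift kind: {kind}")
        series.append(value)
    return series


sudden = drift_series("sudden")
incremental = drift_series("incremental")
gradual = drift_series("gradual")
reoccurring = drift_series("reoccurring")
outlier = drift_series("outlier")
```

Plotting these five lists against their index gives curves of the same general shape as the panels described above.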

Page 19: Cikm 2014 v2

Detecting Drifts in Reformulation Signal

Query: “cikm conference”

Reformulation: “2014”

[Figure: timeline from t0 to ti covered by Window W0, with reformulation signal values 0.1 0.1 0.2 0.2 0.3]

Page 20: Cikm 2014 v2

Detecting Drifts in Reformulation Signal

Query: “cikm conference”

Reformulation: “2014”

[Figure: timeline from t0 through ti to ti+Δt, with reformulation signal values 0.1 0.1 0.2 0.2 0.3 0.7 0.8 0.8. Window W0 (size n0) ends at ti and has mean E(W0); Window W1 (size n1) follows it and has mean E(W1). The rise in the signal is annotated “The upcoming conference event”.]

If |E(W0) - E(W1)| > eout

Then Drift Detected
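The rule above compares the means of two adjacent windows. A minimal sketch follows; the function name `drift_detected` is hypothetical, the split of the slide's values into five for W0 and three for W1 is an assumption, and the threshold 0.3 is an arbitrary placeholder rather than a value from the talk.

```python
from statistics import fmean


def drift_detected(w0, w1, e_out):
    """Return True when |E(W0) - E(W1)| > e_out."""
    if not w0 or not w1:
        raise ValueError("both windows must be non-empty")
    return abs(fmean(w0) - fmean(w1)) > e_out


# Signal values from the slide; the 5/3 split between windows is assumed.
w0 = [0.1, 0.1, 0.2, 0.2, 0.3]
w1 = [0.7, 0.8, 0.8]

# 0.3 is a placeholder threshold chosen only for this example.
alarm = drift_detected(w0, w1, e_out=0.3)
```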

Page 21: Cikm 2014 v2

Calculating Threshold eout

[Formula: eout computed from the confidence, the variance at W = W0 ∪ W1, and m = 1/(1/n0 + 1/n1)]
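The transcript keeps only the labels of this formula. Those labels (a confidence parameter, the variance of the combined window, and m = 1/(1/n0 + 1/n1)) coincide with the inputs of the ADWIN cut threshold of Bifet and Gavaldà, so the sketch below assumes that form. That this is the exact expression used in the talk is an assumption, and `adwin_style_threshold` and its `delta` default are hypothetical.

```python
import math
from statistics import pvariance


def adwin_style_threshold(w0, w1, delta=0.05):
    """ADWIN-style cut threshold for two adjacent windows (assumed form).

    eps = sqrt((2 / m) * var_W * ln(2 / delta')) + (2 / (3 * m)) * ln(2 / delta')
    with m = 1 / (1/n0 + 1/n1), var_W the variance of W = W0 U W1,
    and delta' = delta / (n0 + n1).
    """
    n0, n1 = len(w0), len(w1)
    if n0 == 0 or n1 == 0:
        raise ValueError("both windows must be non-empty")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    m = 1.0 / (1.0 / n0 + 1.0 / n1)
    variance = pvariance(list(w0) + list(w1))  # population variance of W
    log_term = math.log(2.0 / (delta / (n0 + n1)))
    return math.sqrt(2.0 / m * variance * log_term) + 2.0 / (3.0 * m) * log_term


small = adwin_style_threshold([0.1, 0.1, 0.2, 0.2, 0.3], [0.7, 0.8, 0.8])
large = adwin_style_threshold([0.1, 0.1, 0.2, 0.2, 0.3] * 100, [0.7, 0.8, 0.8] * 100)
```

Under this form the threshold shrinks as the windows grow, so a difference in means that is inconclusive for a handful of observations can become a detected drift once more data has accumulated.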

Page 22: Cikm 2014 v2

[Diagram: User Behavior Logs → Learn reformulation model M. Timeline marks: t0, ti+]

Page 23: Cikm 2014 v2

[Diagram: User Behavior Logs → Learn reformulation model M; Incoming User Behavior Logs → Detect changes in model M. Timeline marks: t0, ti, ti+Δt]

If change detected

else Do Nothing

Page 24: Cikm 2014 v2

[Diagram: User Behavior Logs → Learn reformulation model M; Incoming User Behavior Logs → Detect changes in model M. Timeline marks: ti, ti+w1, ti+w1+w2]

If change detected: Alarm: Change of user satisfaction detected for pairs {<Qi, SERPi>}, 1<i<n

else Do Nothing

Page 25: Cikm 2014 v2

[Diagram: User Behavior Logs → Learn reformulation model M; Incoming User Behavior Logs → Detect changes in model M. Timeline marks: t0, ti, ti+Δt]

1) List of reformulation terms per query

2) List of URLs per reformulation

If change detected: Alarm: Change of user satisfaction detected for pairs {<Qi, SERPi>}, 1<i<n

else Do Nothing
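The learn, detect, and alarm steps of this diagram can be sketched as two functions. Everything here is a simplification under stated assumptions: the record layout `(query, serp_id, term_or_None)`, the names `learn_model` and `detect_changes`, and the fixed threshold `e_out` are hypothetical, and the model is reduced to a per-term reformulation probability.

```python
from collections import defaultdict


def learn_model(log):
    """Learn M: Pr(reformulated with term) for each (query, serp_id, term).

    `log` holds (query, serp_id, term) records, where term is None when
    the user did not reformulate. This layout is an assumption.
    """
    shown = defaultdict(int)
    reformulated = defaultdict(int)
    for query, serp_id, term in log:
        shown[(query, serp_id)] += 1
        if term is not None:
            reformulated[(query, serp_id, term)] += 1
    return {key: n / shown[key[:2]] for key, n in reformulated.items()}


def detect_changes(model, incoming_log, e_out):
    """Return the (query, serp_id, term) keys whose probability moved by more than e_out."""
    current = learn_model(incoming_log)
    alarms = []
    for key in sorted(set(model) | set(current)):
        if abs(model.get(key, 0.0) - current.get(key, 0.0)) > e_out:
            alarms.append(key)
    return alarms  # an empty list corresponds to "Do Nothing"


# Made-up logs: 1 of 10 users adds "2014" in training, 8 of 10 afterwards.
pair = ("cikm conference", "serp-a")
training_log = [(*pair, None)] * 9 + [(*pair, "2014")]
incoming_log = [(*pair, None)] * 2 + [(*pair, "2014")] * 8

model = learn_model(training_log)
alarms = detect_changes(model, incoming_log, e_out=0.3)
```

Each returned key names the <Q, SERP> pair to flag together with the reformulation term that triggered the alarm.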

Page 26: Cikm 2014 v2

o Dataset consists of 6 months of behavioral log data from a commercial search engine

o The training window size is one month

o The test window size is two weeks

Experimentation

Page 27: Cikm 2014 v2

Evaluation

Page 28: Cikm 2014 v2

Results

Page 29: Cikm 2014 v2

o We successfully leveraged concept drift detection techniques to detect changes in user satisfaction

o The proposed technique works in an unsupervised way

o A large-scale evaluation has been performed

o Classification of the drift type is needed

o Prediction of the lifetime of the drift would help

Conclusion and Future Work

Page 30: Cikm 2014 v2

Questions?
