Page 1
On serendipity in
recommender systems
Giovanni Semeraro
University of Bari Aldo Moro, Italy
Advances in Recommender Systems
Social and Semantic Aspects
RecSoc 2015
Haifa, June 16-17, 2015
Semantic Web Access and Personalization research group http://www.di.uniba.it/~swap
Page 2
Focus: Emotions as implicit feedback for
assessing serendipity of recommendations
Page 3
Acknowledgments
Marco de Gemmis
Pasquale Lops
Cataldo Musto
Marko Tkalcic
Page 4
Serendipity
Serendip = “Simhala dvipa” (Sanskrit) the old name of the island
of Ceylon, now Sri Lanka
Page 5
Outline
Serendipity and Evaluation
Research questions
Operationally induced serendipity:
Knowledge Infusion (KI) process
Item-to-Item correlation matrix
Random Walk with Restart boosted by KI
Experimental evaluation
Noldus FaceReader™
Dataset
Design of the experiment
Metrics
Questionnaire analysis
Analysis of user emotions
Conclusions
Page 6
Serendipity in Information Seeking
Information seeking metaphor investigated in the literature
(Toms 2000, André et al. 2009, Bordino et al. 2013)
Toms suggests 4 strategies:
– Blind luck, or the “role of chance”: random suggestions
– Pasteur principle, or “chance favors only the prepared mind”: flashes of insight don’t just happen; they are the products of a “prepared mind”
– Anomalies and exceptions, or “searching for dissimilarities”: identification of items dissimilar to those the user liked in the past
– Reasoning by analogy: an abstraction mechanism that allows the system to discover the applicability of an existing schema to a new situation
(Toms 2000) E. Toms. Serendipitous Information Retrieval. Proc. 1st DELOS NoE Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: ERCIM, 2000.
(André et al. 2009) P. André, J. Teevan, S.T. Dumais. From x-rays to silly putty via Uranus: serendipity and its role in web search. Proc. ACM CHI 2009, ACM, New York, NY, USA, 2009.
(Bordino et al. 2013) I. Bordino, Y. Mejova, M. Lalmas. Penguins in sweaters, or serendipitous entity search on user-generated content. Proc. 22nd ACM CIKM 2013, ACM, New York, NY, USA, 2013, pp. 109–118.
Page 7
Serendipitous recommendations
“Suggestions which help the user to find surprisingly
interesting items she might not have discovered by herself”
(Herlocker et al. 2004)
Both attractive and unexpected
“The experience of receiving an unexpected and fortuitous
item recommendation” (McNee et al. 2006)
“Serendipity involves a positive emotional response of the
user about novel items” (Shani and Gunawardana 2011)
(Herlocker et al. 2004) J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems 22(1): 5–53, 2004.
(McNee et al. 2006) S.M. McNee, J. Riedl, and J. A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender
systems. In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’06, 1097–1101, ACM, New York, NY, USA, 2006.
(Shani and Gunawardana 2011) G. Shani, A. Gunawardana, Evaluating Recommendation Systems. In F. Ricci, L. Rokach, B. Shapira, P.B.
Kantor (Eds.), Recommender Systems Handbook, Springer, 2011, pp. 257–297.
Page 8
Serendipitous recommendations
A response to the overspecialization problem and the filter bubble (Pariser 2011)
Overspecialization: the tendency to provide the user with items within her existing range of interests
Example: suggesting “STAR TREK” to a science-fiction fan is accurate but obvious, and thus not actually useful
Users don’t want algorithms that produce better ratings, but sensible recommendations
(Pariser 2011) E. Pariser. The Filter Bubble: What the Internet Is Hiding from You. Penguin Group, May 2011.
Page 9
Obviousness in recommendations: homophily
The tendency to surround ourselves with like-minded people
[E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008. www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/]
Opinions taken to extremes, cultural impoverishment
A threat to biodiversity?
Page 10
Homophily in the digital world
in the physical world, one of the strongest sources of homophily is
locality, due to geographic proximity, family ties, and
organizational factors (school, work, etc.)
in the digital world, physical locality is less important. Other
factors, such as common interests, might play a central role
2 main questions:
Are two users more likely to be friends if they share common
interests?
Are two users more likely to share common interests if they are
friends?
In (Lauw et al. 2010), the answer to both questions is
YES
(Lauw et al. 2010) Lauw, H.W., Schafer, J.C., Agrawal, R., & A. Ntoulas. Homophily in the Digital World: A
LiveJournal Case Study. IEEE Internet Computing 14(2):15-23, March-April 2010.
Page 11
The homophily trap
Does homophily hurt RecSys?
try to tell Amazon that you liked the movie “War
Games”…
Page 12
The homophily trap
Recommendations by other GEEKS!
Page 13
“Item-to-Item” homophily…
…Harry Potter forever?
Page 14
Serendipity & Search Engines
Poll
Is Personalization A Form Of Censorship?
Yes: 73%
No: 23%
Other: 4%
L. Carr and S. Harnad. Offload Cognition onto the Web. IEEE Intelligent
Systems 26(1): 33-39, 2011.
Page 15
Evaluation of Serendipity: research questions
Is the user’s emotional response useful for assessing serendipity?
Can emotions observed in facial expressions be considered trustworthy implicit feedback for assessing the pleasant surprise that serendipity should convey?
Page 17
Operationally induced Serendipity: A Quick Look
at the Recommendation Algorithm
Novel method for computing item similarity
– tries to find “hidden associations” instead of computing attribute similarity
– knowledge-intensive process that allows a deeper understanding of item descriptions
Knowledge Infusion (KI)
– provides the RecSys with background knowledge built from external sources
Content-Based (CB) approach that exploits the knowledge base to compute a correlation index between items
Page 18
Operationally induced Serendipity:
Knowledge Infusion (KI)
Which “words”?
Words that induce positive emotions
Relevant/attractive words able to surprise
the conversation partner
A form of nudging?
“Language is the Skin of my Thought”
Arundhati Roy. Power Politics. South End Press, January 2001.
“Words” Recommender System
Page 19
Recommending Words:
the Architecture of the KI process
[Figure: architecture of the KI process; example labels: sci-fi, conflicts/fights]
Page 20
KI@Work
[Figure: KI at work. Clues (CLUE#1 to CLUE#5) are processed against the BACKGROUND KNOWLEDGE built from Knowledge Sources #1 to #n; a SPREADING ACTIVATION NETWORK then produces NEW KEYWORDS ASSOCIATED WITH CLUES (KEYWORD1, KEYWORD2, …)]
G. Semeraro, M. de Gemmis, P. Lops, P. Basile. An Artificial Player for a Language Game. IEEE Intelligent Systems
27(5): 36-43, 2012.
P. Basile, M. de Gemmis, P. Lops, G. Semeraro. Solving a Complex Language Game by using Knowledge-based Word
Associations Discovery. IEEE Transactions on Computational Intelligence and AI in Games, 2015 (in press). DOI:
10.1109/TCIAIG.2014.2355859.
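The spreading activation step named on this slide can be sketched as follows. This is only an illustration under stated assumptions: the toy association graph, the decay factor, the number of steps, and the function name `spread_activation` are hypothetical and are not taken from the KI implementation described in the cited papers.

```python
from collections import defaultdict

def spread_activation(graph, clues, decay=0.5, steps=2):
    """Propagate activation from clue words through a weighted association graph.

    graph: dict mapping word -> {neighbor: association weight in [0, 1]}
    clues: list of seed words, each starting with activation 1.0
    Returns (word, activation) pairs for non-clue words, highest activation first.
    """
    frontier = dict.fromkeys(clues, 1.0)
    activation = defaultdict(float, frontier)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for word, value in frontier.items():
            for neighbor, weight in graph.get(word, {}).items():
                # Each hop attenuates the signal by the edge weight and the decay.
                next_frontier[neighbor] += value * weight * decay
        for word, value in next_frontier.items():
            activation[word] += value
        frontier = next_frontier
    candidates = [(w, a) for w, a in activation.items() if w not in clues]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Hypothetical association graph (assumption, for illustration only).
toy_graph = {
    "sci-fi": {"space": 0.9, "alien": 0.8},
    "fights": {"war": 0.9, "alien": 0.4},
    "space": {"star": 0.7},
    "war": {"star": 0.6},
}
new_keywords = spread_activation(toy_graph, ["sci-fi", "fights"])
```

In this toy run, "alien" ranks first because both clues activate it, which is the kind of shared association the network is meant to surface.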
Page 21
KI as a novel method for computing
associations between items
BM25 retrieval score
clues
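The slide names the BM25 retrieval score without giving its formula. For reference, a minimal sketch of the standard Okapi BM25 score is shown here, assuming the clues act as query terms against tokenized documents from a knowledge source. That usage, the parameter values `k1=1.2` and `b=0.75`, and the function name `bm25_scores` are assumptions, not details confirmed by the slides.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Longer-than-average documents are penalized through b.
            length_norm = 1.0 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores

# Hypothetical clues and documents (assumption, for illustration only).
docs = [
    ["alien", "invasion", "war", "earth"],
    ["romantic", "comedy", "paris"],
    ["war", "drama", "history", "soldier"],
]
scores = bm25_scores(["alien", "war"], docs)
```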
Page 22
KI as a Serendipity Engine: Item-to-Item similarity matrix
Item-to-Item correlation matrix
w_ij computed in different ways:
– number of users who co-rated items I_i and I_j
– cosine similarity between descriptions of items I_i and I_j
– Knowledge Infusion correlation index
Recommendation list computed by Random Walk with Restart (Lovasz 1996) augmented with KI (RWR-KI)
(Lovasz 1996) L. Lovasz. Random Walks on Graphs: a Survey. Combinatorics 2:1–46, 1996.
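A Random Walk with Restart over an item-to-item matrix can be sketched as below. This is a generic RWR under stated assumptions, not the authors' RWR-KI code: the restart probability of 0.15, the row normalization of w_ij, the handling of isolated items, and the function names are all assumptions. In RWR-KI the weights would come from the KI correlation index; here they are made-up numbers.

```python
def random_walk_with_restart(weights, seeds, restart=0.15, max_iter=200, tol=1e-12):
    """Return the stationary visiting probability of every item.

    weights: square list of lists, weights[i][j] = w_ij >= 0
    seeds:   indices of items the user liked (the walk restarts from these)
    """
    n = len(weights)
    seed_set = set(seeds)
    restart_vec = [1.0 / len(seed_set) if i in seed_set else 0.0 for i in range(n)]
    # Row-normalize w_ij into transition probabilities.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            # Items with no outgoing weight send the walker back to the seeds.
            transition.append(restart_vec[:])
    prob = restart_vec[:]
    for _ in range(max_iter):
        new_prob = [restart * r for r in restart_vec]
        for i in range(n):
            for j in range(n):
                new_prob[j] += (1.0 - restart) * prob[i] * transition[i][j]
        change = sum(abs(a - b) for a, b in zip(new_prob, prob))
        prob = new_prob
        if change < tol:
            break
    return prob

def recommend(weights, seeds, top_n=5):
    """Rank non-seed items by visiting probability, highest first."""
    scores = random_walk_with_restart(weights, seeds)
    candidates = [i for i in range(len(weights)) if i not in set(seeds)]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_n]

# Hypothetical 4-item correlation matrix (assumption, for illustration only).
toy_weights = [
    [0, 3, 1, 0],
    [3, 0, 0, 1],
    [1, 0, 0, 2],
    [0, 1, 2, 0],
]
rwr_scores = random_walk_with_restart(toy_weights, [0])
ranking = recommend(toy_weights, [0])
```

Item 3 has no direct link to the seed item 0 but still receives a score through its neighbors, which is how a walk on the graph can reach items that direct similarity would miss.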
Page 25
Experimental Evaluation: Goal
Validation of the hypothesis that recommendations
produced by RWR-KI are serendipitous
(relevant/attractive & unexpected/surprising)
Not only an issue of metrics!
Difficulty of detecting and providing an objective
assessment of the emotional response conveyed by
serendipitous recommendations
Difficulty of assessing the user perception of
serendipity of recommendations and their acceptance
(in terms of relevance and unexpectedness)
Difficulty of assessing unexpectedness
M. de Gemmis, P. Lops, G. Semeraro, C. Musto. An Investigation on the Serendipity Problem in
Recommender Systems. Information Processing and Management, 2015 (in press) DOI:
10.1016/j.ipm.2015.06.008
Page 26
Experimental Evaluation
2 experiments
In-vitro
User study
In-vitro experiment
Unexpectedness measured as deviation from a
standard prediction criterion (Murakami et al. 2008)
Standard prediction criterion: (non-personalized)
popularity
User study
Analysis performed using Noldus FaceReader™, which analyzes users’ facial expressions to gather implicit feedback about their reactions
(Murakami et al. 2008) T. Murakami, K. Mori, R. Orihara, Metrics for Evaluating the Serendipity of
Recommendation Lists, in K. Satoh, A. Inokuchi, K. Nagao, T. Kawamura (Eds.), New Frontiers in Artificial
Intelligence, Lecture Notes in Computer Science 4914, pp. 40–46, Springer, 2008.
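One simple way to read "deviation from a standard prediction criterion" in the in-vitro experiment is set-based: an item counts as unexpected if a popularity baseline would not have suggested it. The sketch below follows that reading. It is an assumption and not the exact formula of Murakami et al.; the cutoff `k`, the data, and the function names are hypothetical.

```python
from collections import Counter

def popularity_baseline(ratings, k):
    """Return the k most-rated item ids (a non-personalized prediction criterion)."""
    counts = Counter(item for _, item in ratings)
    return [item for item, _ in counts.most_common(k)]

def unexpectedness(recommended, expected):
    """Fraction of recommended items that the baseline would not have predicted."""
    if not recommended:
        return 0.0
    expected_set = set(expected)
    return sum(1 for item in recommended if item not in expected_set) / len(recommended)

# Hypothetical (user, item) rating log (assumption, for illustration only).
ratings = [("u1", "A"), ("u2", "A"), ("u3", "A"), ("u1", "B"), ("u2", "B"), ("u3", "C")]
popular = popularity_baseline(ratings, k=2)
score = unexpectedness(["A", "C", "D", "E"], popular)
```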
Page 27
Noldus FaceReader™
Recognizes basic emotions: 6 categories of emotions, proposed by Ekman (1999)
happiness
anger
sadness
fear
disgust
surprise
(Ekman 1999) P. Ekman, Basic Emotions, in T. Dalgleish, M.J. Power (Eds.), Handbook of Cognition and Emotion, 45–60, John Wiley & Sons, 1999.
Page 28
Basic emotions (Ekman, 1999)
Discrete classes model
Different sets
Darwin (1872) The expression of the emotions in man and
animals
Ekman definition (6 + neutral)
Happiness
Sadness
Fear
Anger
Surprise
Disgust
Page 29
The problem
• Classification accuracy
~ 90% on Radboud Faces Database (RaFD) (Langner et al.
2010)
(Langner et al. 2010) O. Langner, R. Doetsch, G. Bijlstra, D.H.J. Wigboldus, S.T. Hawk, A. van Knippenberg.
Presentation and Validation of the Radboud Faces Database, Cognition and Emotion 24(8), 1377-1388, 2010.
Page 30
Experimental Evaluation: Noldus FaceReader™
Page 31
Experimental Evaluation (user study): Dataset
Experimental units: 40 master’s students (engineering, architecture, economics, computer science and humanities)
26 male (65%), 14 female (35%)
Age distribution: from 20 to 35
Dataset
2,135 movies released between 2006 and 2011
Movie content – title, poster, plot keywords, cast, director, summary – crawled from the Internet Movie Database (IMDb)
Vocabulary of 32,583 plot keywords
Average: 12.33 keywords/item
Page 32
Experimental Evaluation (user study): Design of
the experiment
Between-subjects controlled experiment
20 users randomly assigned to test RWR-KI
20 users randomly assigned to test RANDOM (control group), a baseline inspired by the blind luck principle; it produces random suggestions and showed surprisingly good performance in the 1st (in-vitro) experiment
Procedure
Users interact with a web application that
– shows details of movies
– displays 5 recommendations (movie poster & title) per user
Recommended items displayed 1 at a time
Page 33
Web application
Page 34
Experimental Evaluation (user study): Design of
the experiment
Procedure
2 binary questions to assess user acceptance
– “Did you know this movie?” / “Have you ever heard about this movie?” (unexpectedness)
– “Do you like this movie?” (relevance)
– (NO, YES) answers indicate a serendipitous recommendation
Video recording starts when a movie is recommended to the user and stops when the answers to the 2 questions are collected
5 videos per user
Noldus FaceReader™ used to analyze the videos and assess the user’s emotional response when exposed to recommendations
Page 35
Experimental Evaluation (user study):
Design of the experiment
Questionnaire analysis
Quality of RWR-KI and RANDOM
Metrics
Relevance@N = #relevant_items / N
Unexpectedness@N = #unexpected_items / N
Serendipity@N = #serendipitous_items / N = #(relevant_items ∩ unexpected_items) / N
N = size of the recommendation list
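The three metrics follow directly from the two questionnaire answers collected per item. A minimal sketch, assuming each recommended item is stored as a `(known, liked)` pair of booleans; that representation and the function name `serendipity_metrics` are assumptions, and the sample answers are made up.

```python
def serendipity_metrics(answers):
    """Compute Relevance@N, Unexpectedness@N and Serendipity@N for one list.

    answers: list of (known, liked) booleans, one pair per recommended item,
             where known answers "Did you know this movie?" and
             liked answers "Do you like this movie?".
    """
    n = len(answers)
    relevant = sum(1 for known, liked in answers if liked)
    unexpected = sum(1 for known, liked in answers if not known)
    # Serendipitous = relevant AND unexpected, i.e. the (NO, YES) answers.
    serendipitous = sum(1 for known, liked in answers if liked and not known)
    return {
        "relevance": relevant / n,
        "unexpectedness": unexpected / n,
        "serendipity": serendipitous / n,
    }

# Hypothetical answers for one Top-5 list (assumption, for illustration only).
top5 = [(False, True), (True, True), (False, False), (False, True), (True, False)]
metrics = serendipity_metrics(top5)
```

Note that Serendipity@N can never exceed either Relevance@N or Unexpectedness@N, since it counts the intersection of the two sets.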
Page 36
Experimental Evaluation (user study): Design of
the experiment
Questionnaire analysis
ResQue model (Chen et al. 2010)
– category: Perceived System Qualities
– sub-category: Quality of Recommended Items
– Relevance = perceived accuracy
– Unexpectedness = novelty
(Chen et al. 2010) L. Chen, P. Pu, A User-Centric Evaluation Framework of Recommender Systems, in: B.P. Knijnenburg, L. Schmidt-Thieme, D. Bollen (Eds.), Proceedings of the ACM RecSys 2010 Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI), CEUR Workshop Proceedings 612, 14-21, CEUR-WS.org, 2010.
Page 37
Experimental Evaluation (user study): Results
Questionnaire analysis
Serendipity: RWR-KI outperforms RANDOM
Statistically significant differences (Mann-Whitney U test,
p<0.05)
~ Half of the recommendations are deemed
serendipitous!
RWR-KI: a better Relevance-Unexpectedness trade-off
RANDOM: more unbalanced towards Unexpectedness
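For readers unfamiliar with the Mann-Whitney U test, the sketch below computes the U statistic and a two-sided p-value with the tie-corrected normal approximation (no continuity correction). It is a plain-Python illustration, not the analysis script used in the study, and the per-user counts are invented; the study's raw data are not given in these slides.

```python
import math

def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, approximate two-sided p-value)."""
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and all receive the average rank.
        avg_rank = (i + 1 + j) / 2.0
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p_value

# Hypothetical numbers of serendipitous items per user (assumption, not study data).
group_rwr_ki = [3, 2, 3, 4, 2, 3]
group_random = [1, 2, 1, 0, 2, 1]
u_stat, p_value = mann_whitney_u(group_rwr_ki, group_random)
```

The test suits this design because the two groups are independent (between-subjects) and the per-user counts are small integers with many ties, so a rank-based test is a safer choice than one that assumes normality.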
Page 38
Experimental Evaluation (user study): Results
Questionnaire analysis: distribution of serendipitous
items within Top-5 lists
Almost all users (19 out of 20) received at least 1 serendipitous suggestion
Most of RWR-KI lists: 2-3 serendipitous items
Most of RANDOM lists: 1-2 serendipitous items
Page 39
Experimental Evaluation (user study): Results
Analysis of user emotions
Hypothesis: users’ facial expressions convey a mixture of emotions that helps to measure the perception of serendipity of recommendations
Serendipity associated with surprise and happiness
ResQue model: attractiveness
200 videos (40 users x 5 recommendations)
41 videos filtered out (< 5 seconds)
For the remaining 159 videos, FaceReader™ computed the distribution of detected emotions and their duration (emotions lasting < 1 sec. filtered out)
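The two filtering thresholds on this slide can be expressed as a small post-processing step. The sketch below is hypothetical throughout: the record layout (a `duration` plus `(emotion, seconds)` segments) is an assumed export format rather than FaceReader's actual output, and measuring the distribution as a share of retained time is one possible reading of "distribution of detected emotions".

```python
from collections import defaultdict

MIN_VIDEO_SECONDS = 5.0    # videos shorter than this are discarded
MIN_EMOTION_SECONDS = 1.0  # emotions lasting less than this are discarded

def emotion_distribution(videos):
    """Return (number of videos kept, share of retained time per emotion)."""
    totals = defaultdict(float)
    kept = 0
    for video in videos:
        if video["duration"] < MIN_VIDEO_SECONDS:
            continue
        kept += 1
        for emotion, seconds in video["segments"]:
            if seconds >= MIN_EMOTION_SECONDS:
                totals[emotion] += seconds
    grand_total = sum(totals.values())
    shares = {e: s / grand_total for e, s in totals.items()} if grand_total else {}
    return kept, shares

# Hypothetical recordings (assumption, for illustration only).
sample_videos = [
    {"duration": 8.0, "segments": [("surprise", 2.0), ("neutral", 5.0), ("happiness", 0.5)]},
    {"duration": 3.0, "segments": [("anger", 2.0)]},
    {"duration": 6.0, "segments": [("happiness", 3.0)]},
]
kept_count, shares = emotion_distribution(sample_videos)
```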
Page 40
Circumplex model
Maps basic emotions onto a dimensional model
[Figure: circumplex model with axes Valence (negative to positive) and Arousal (low to high); plotted emotions: neutral, sadness, fear, disgust, surprise, joy, anger]
Russell, James (1980). "A circumplex model of affect". Journal of Personality and Social Psychology 39:
1161–1178. doi:10.1037/h0077714
Page 41
Experimental Evaluation (user study): Results
Frequency analysis of user emotions associated with serendipitous suggestions (69 videos = 81 − 12)
Surprise: 17% RWR-KI vs 9% RANDOM
Happiness: 14% RWR-KI vs 9% RANDOM
RWR-KI produces more serendipitous suggestions than RANDOM! (confirms questionnaire results)
High values of negative emotions (sadness and anger); why?
[Figure: emotion frequency charts; panel labels: 39 videos, 30 videos]
Page 42
Experimental Evaluation (user study): Results
Frequency analysis of user emotions associated with non-serendipitous suggestions (90 videos = 119 − 29)
General decrease of surprise and happiness
High values of negative emotions (sadness and anger), also in this case
Explanation: users assumed troubled expressions because they were concentrating hard on the task
[Figure: emotion frequency charts; panel labels: 39 videos, 51 videos]
Page 44
Experimental Evaluation (user study):
Conclusions
Positive emotions:
marked difference between RWR-KI and RANDOM
Positive emotions:
marked difference between serendipitous and
non-serendipitous recommendations
Agreement between
questionnaires (explicit feedback) &
facial expressions/emotions (implicit feedback)
Emotions can help to assess the actual perception of
serendipity
A step forward to the creation of a ground truth for
evaluation purposes
Page 45
Thanks…Questions?
Semantic Web Access and Personalization research group http://www.di.uniba.it/~swap
Pierpaolo Basile
Marco de Gemmis
Pasquale Lops
Fedelucio Narducci
Annalina Caputo
Leo Iaquinta
Cataldo Musto
Marco Polignano
Giovanni Semeraro
Page 46
See you in Vienna!
9th ACM Conference on Recommender Systems
Vienna, Austria
16th-20th September 2015
Page 47
References
(André et al. 2009) P. André, J. Teevan, S.T. Dumais. From x-rays to silly putty via Uranus: serendipity
and its role in web search. Proc. ACM CHI 2009, ACM, New York, NY, USA, 2009.
(Bordino et al. 2013) I. Bordino, Y. Mejova, M. Lalmas, Penguins in sweaters, or serendipitous entity
search on user-generated content. Proc. 22nd ACM CIKM 2013, ACM, New York, NY, USA,
2013, pp. 109–118.
(Basile et al. 2014) P. Basile, M. de Gemmis, P. Lops, G. Semeraro. Solving a Complex Language
Game by using Knowledge-based Word Associations Discovery. IEEE Transactions on
Computational Intelligence and AI in Games, 2015 (in press). DOI:
10.1109/TCIAIG.2014.2355859.
(Chen et al. 2010) L. Chen, P. Pu, A User-Centric Evaluation Framework of Recommender Systems,
in: B.P. Knijnenburg, L. Schmidt-Thieme, D. Bollen (Eds.), Proceedings of the ACM RecSys 2010
Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI),
CEUR Workshop Proceedings 612, 14-21, CEUR-WS.org, 2010.
(de Gemmis et al. 2014) M. de Gemmis, P. Lops, G. Semeraro, C. Musto. An Investigation on the
Serendipity Problem in Recommender Systems. Information Processing and Management (in
press). DOI: 10.1016/j.ipm.2015.06.008.
(Ekman 1999) P. Ekman, Basic Emotions, in T. Dalgleish, M.J. Power (Eds.), Handbook of Cognition
and Emotion, 45–60, John Wiley & Sons, 1999.
(Herlocker et al. 2004) J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl. Evaluating
Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems
22(1): 5–53, 2004.
(Kramer et al. 2014) A.D.I. Kramer, J.E. Guillory, J.T. Hancock. Experimental
evidence of massive-scale emotional contagion through social networks. Proceedings of the
National Academy of Sciences of the United States of America 111(24): 8788–8790,
2014.
(Langner et al. 2010) O. Langner, R. Doetsch, G. Bijlstra, D.H.J. Wigboldus, S.T. Hawk, A. van
Knippenberg. Presentation and Validation of the Radboud Faces Database, Cognition and
Emotion 24(8), 1377-1388, 2010.
Page 48
References
(Lauw et al. 2010) Lauw, H.W., Schafer, J.C., Agrawal, R., & A. Ntoulas. Homophily in the Digital
World: A LiveJournal Case Study. IEEE Internet Computing 14(2):15-23, March-April 2010.
(Lovasz 1996) L. Lovasz. Random Walks on Graphs: a Survey. Combinatorics 2:1–46, 1996.
(McNee et al. 2006) S. M. McNee, J. Riedl, and J. A. Konstan. Being accurate is not enough: How
accuracy metrics have hurt recommender systems. In CHI ’06 Extended Abstracts on Human
Factors in Computing Systems, CHI EA ’06, pages 1097–1101, ACM, New York, NY, USA,
2006.
(Murakami et al. 2008) T. Murakami, K. Mori, R. Orihara, Metrics for Evaluating the Serendipity of
Recommendation Lists, in K. Satoh, A. Inokuchi, K. Nagao, T. Kawamura (Eds.), New Frontiers
in Artificial Intelligence, Lecture Notes in Computer Science 4914, pp. 40–46, Springer, 2008.
(Pariser 2011) E. Pariser. The Filter Bubble: What the Internet Is Hiding from You. Penguin Group,
May 2011.
(Roy 2001) Arundhati Roy. Power Politics. South End Press, January 2001.
(Russell 1980) Russell, James. A circumplex model of affect. Journal of Personality and Social
Psychology 39: 1161–1178, 1980. doi:10.1037/h0077714
(Semeraro et al. 2012) G. Semeraro, M. de Gemmis, P. Lops, P. Basile. An Artificial Player for a
Language Game. IEEE Intelligent Systems 27(5): 36-43, 2012.
(Shani and Gunawardana 2011) G. Shani, A. Gunawardana, Evaluating Recommendation Systems.
In F. Ricci, L. Rokach, B. Shapira, P.B. Kantor (Eds.), Recommender Systems Handbook,
Springer, 2011, pp. 257–297.
(Toms 2000) E. Toms. Serendipitous Information Retrieval. Proc. 1st DELOS NoE Workshop on
Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: ERCIM,
2000.
(Zuckerman 2008) E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008.
www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/