Jan 12, 2016

A Model of Information Foraging via Ant Colony Simulation

Matthew Kusner

Information Foraging

Theory Background

– People search for information in roughly the same way that animals search for food in their surroundings.

Information Scent

– Ex: “the text associated with Web links” (Fu & Pirolli, 2007)

– Background knowledge

– Recommendations

Ant Colony Simulation

Pheromone trails

– Laid by ants who've found food.

– Followed by other ants with probability p.

Path evaporation

Path optimization

Simulation specifics
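The pheromone mechanism outlined above can be sketched roughly as follows. The slides do not give the simulation specifics, so everything here is an assumption made for illustration: the function names `choose_path` and `simulate`, the parameters `p_follow` and `evaporation`, the rule that closer food earns a larger deposit, and the simplification that every ant reaches some food source.

```python
import random

def choose_path(pheromone, p_follow, rng):
    """Return a path index: follow a trail with probability p_follow, else explore."""
    if sum(pheromone) > 0 and rng.random() < p_follow:
        # Follow an existing trail, weighted by pheromone strength.
        return rng.choices(range(len(pheromone)), weights=pheromone, k=1)[0]
    # No trail yet, or the ant ignores it: pick a path uniformly at random.
    return rng.randrange(len(pheromone))

def simulate(distances, n_ants=200, n_steps=50, p_follow=0.8,
             evaporation=0.1, seed=0):
    """Toy colony: one path per food source, distances[i] is its length (assumed > 0)."""
    rng = random.Random(seed)
    pheromone = [0.0] * len(distances)
    visits = [0] * len(distances)
    for _ in range(n_steps):
        deposits = [0.0] * len(distances)
        for _ in range(n_ants):
            i = choose_path(pheromone, p_follow, rng)
            visits[i] += 1
            # Assumption: closer food yields a larger deposit per trip.
            deposits[i] += 1.0 / distances[i]
        # Evaporation: every trail decays before the new deposits are added.
        pheromone = [(1.0 - evaporation) * ph + d
                     for ph, d in zip(pheromone, deposits)]
    return visits, pheromone

visits, pheromone = simulate([1.0, 5.0, 10.0])
print(visits)
```

Evaporation is what keeps early random choices from locking in: a trail that stops being reinforced fades, so traffic drifts toward the shorter paths.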

AOL Data Set

– 21 million queries (March 1 – May 31, 2006)

– 650k users

– 19 million click-through events

– Quantities: query, time of query, click URL, user ID, clicked link rank
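As a rough illustration of how the listed quantities could be turned into per-website visit counts for one query, here is a minimal sketch. It assumes tab-separated lines with the five fields in the order user ID, query, time, rank, URL, and a `YYYY-MM-DD HH:MM:SS` timestamp; the slides do not state the file layout, and the sample rows and `example` URLs below are invented for the demonstration.

```python
from collections import Counter
from datetime import datetime
from typing import NamedTuple, Optional

class QueryEvent(NamedTuple):
    user_id: str
    query: str
    time: datetime
    rank: Optional[int]   # rank of the clicked link, None if nothing was clicked
    url: Optional[str]    # clicked URL, None if nothing was clicked

def parse_line(line: str) -> QueryEvent:
    """Parse one log line (assumed layout: user, query, time, rank, URL)."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (5 - len(fields))  # queries without a click lack rank and URL
    user_id, query, time_s, rank_s, url = fields[:5]
    return QueryEvent(
        user_id,
        query,
        datetime.strptime(time_s, "%Y-%m-%d %H:%M:%S"),
        int(rank_s) if rank_s else None,
        url or None,
    )

def visits_per_website(events, query):
    """Count click-throughs per website for a single query string."""
    return Counter(e.url for e in events if e.query == query and e.url is not None)

# Hypothetical sample rows, not taken from the real data set.
SAMPLE = [
    "101\tzebra\t2006-03-01 10:00:00\t1\thttp://a.example",
    "102\tzebra\t2006-03-02 11:30:00\t2\thttp://b.example",
    "103\tzebra\t2006-03-02 12:00:00\t1\thttp://a.example",
    "103\tzebra\t2006-03-02 12:05:00",
]
events = [parse_line(line) for line in SAMPLE]
counts = visits_per_website(events, "zebra")
print(counts)
```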

Information Foraging → Ant Colony

– user → ant

– clicked link → food

– information scent → pheromone path

– website importance → food distance

where website importance is defined by:

– 1. Rank

– 2. Popularity of website

– 3. Combination of above methods

Distancing Methods

• Ranking

• Popularity

• Combination

[based on data in Joachims et al., 2005]
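One way to picture the three distancing methods is as functions that map a website's importance to a food distance, with more important sites placed closer to the nest. The slides do not give the formulas actually used, so the functional forms below (distance equal to rank, distance inversely proportional to visit count, and a weighted average of the two) are purely illustrative assumptions, as are the names `rank_distance`, `popularity_distance`, `combined_distance`, and the `weight` parameter.

```python
def rank_distance(rank: int) -> float:
    """Assumed form: the link ranked first is closest (distance 1)."""
    return float(rank)

def popularity_distance(visits: int, max_visits: int) -> float:
    """Assumed form: the most-visited site is at distance 1, rarer sites farther."""
    return max_visits / visits

def combined_distance(rank: int, visits: int, max_visits: int,
                      weight: float = 0.5) -> float:
    """Assumed form: weighted average of the rank and popularity distances."""
    return (weight * rank_distance(rank)
            + (1.0 - weight) * popularity_distance(visits, max_visits))

# Hypothetical site: ranked 3rd, visited 10 times, where the top site has 40 visits.
print(rank_distance(3), popularity_distance(10, 40), combined_distance(3, 10, 40))
```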

Results

• AOL user-visit per website vector

– [numWvisits_1, numWvisits_2, ..., numWvisits_n]

• Simulation ant-visit per food vector

– [numAvisits_1, numAvisits_2, ..., numAvisits_n]

• Pearson Correlation Score (PCS)

• Bootstrapping → 95% confidence interval

– Select (AOL_data_i, simulation_data_i) pairs with replacement

• Permutation test → p-value

– Shuffle the AOL vector
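The comparison above can be sketched in a few lines of standard-library Python: a Pearson correlation between the two visit vectors, a percentile interval obtained by resampling (AOL, simulation) pairs with replacement, and a p-value obtained by shuffling the AOL vector. This is a minimal sketch under stated assumptions, not the analysis actually run for the slides: the resample counts, the one-sided test, the percentile method, and the two example vectors are all choices made here for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def resampled_interval(x, y, n_resamples=2000, level=0.95, seed=0):
    """Percentile interval from resampling (x_i, y_i) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if not math.isnan(r):  # skip degenerate resamples with a constant vector
            scores.append(r)
    if not scores:
        return float("nan"), float("nan")
    scores.sort()
    lo = scores[int((1 - level) / 2 * len(scores))]
    hi = scores[min(len(scores) - 1, int((1 + level) / 2 * len(scores)))]
    return lo, hi

def shuffle_p_value(x, y, n_shuffles=2000, seed=0):
    """One-sided p-value: how often a shuffled x correlates at least as well with y."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(x)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if pearson(shuffled, y) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)  # add-one correction avoids p = 0

# Hypothetical visit vectors, invented for this example.
aol = [40, 25, 12, 8, 5, 3, 2]
sim = [38, 27, 10, 9, 4, 4, 1]
print(pearson(aol, sim), resampled_interval(aol, sim), shuffle_p_value(aol, sim))
```

With only a handful of distinct websites per query, resampled intervals tend to be wide, which fits the broad 95% intervals reported in the results.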

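| Query | Type of distancing | # of users | # of clicked links | # of distinct websites visited | Average PCS | Average 95% CI start | Average 95% CI end | Significant p-value? |
|---|---|---|---|---|---|---|---|---|
| vacation | ranking | 125 | 59 | 19 | 0.8182 | 0.3203 | 0.9364 | Yes |
| vacation | popularity | 125 | 59 | 19 | 0.1296 | -0.1768 | 0.6624 | |
| vacation | combination | 125 | 59 | 19 | 0.1488 | -0.3819 | 0.3920 | |
| rhino | ranking | 39 | 25 | 6 | 0.7631 | -0.4781 | 0.9854 | |
| rhino | popularity | 39 | 25 | 6 | 0.3906 | -0.2484 | 0.9919 | |
| rhino | combination | 39 | 25 | 6 | 0.2013 | -0.7389 | 0.9657 | |
| zebra | ranking | 53 | 61 | 12 | -0.1825 | -0.5426 | 0.4706 | |
| zebra | popularity | 53 | 61 | 12 | -0.0110 | -0.4667 | 0.5079 | |
| zebra | combination | 53 | 61 | 12 | 0.1558 | -0.3655 | 0.6754 | |
| lion | ranking | 52 | 39 | 9 | 0.6118 | -0.1797 | 0.9214 | |
| lion | popularity | 52 | 39 | 9 | 0.0699 | -0.5776 | 0.7296 | |
| lion | combination | 52 | 39 | 9 | 0.0304 | -0.6170 | 0.6609 | |
| football | ranking | 194 | 56 | 21 | 0.5358 | -0.0952 | 0.9301 | |
| football | popularity | 194 | 56 | 21 | 0.2693 | -0.1583 | 0.6722 | |
| football | combination | 194 | 56 | 21 | 0.4149 | -0.0223 | 0.7612 | |
| basketball | ranking | 220 | 74 | 16 | 0.7137 | -0.4225 | 0.9529 | |
| basketball | popularity | 220 | 74 | 16 | 0.2228 | -0.1755 | 0.6455 | |
| basketball | combination | 220 | 74 | 16 | 0.1415 | -0.3470 | 0.6661 | |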

Results

• Queries with significant p-values:

– “vacation” (ranking), “baseball” (ranking), “reebok” (ranking), “adidas” (ranking), “marbles” (ranking), “helicopter” (ranking), “car” (ranking), “potatoes” (ranking), “coffee” (ranking), “farming” (ranking), “rock” (popularity), “shirts” (ranking), “playstation” (ranking), “sega” (popularity), “tom cruise” (ranking), “mel gibson” (ranking), “burger king” (ranking), “chicago” (ranking), “los angeles” (ranking), and “paris” (ranking)

• Distancing methods without 95% CI overlap:

– Ranking:

• “potatoes”: overlaps with neither popularity nor combination

• “shirts”: no overlap with popularity

• “playstation”: no overlap with popularity

• “burger king”: no overlap with combination
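The overlap check behind this list is simple to state in code: two intervals overlap when each one starts before the other ends. The sketch below applies it to the average 95% intervals reported for “vacation” in the results table; the helper name `intervals_overlap` is hypothetical, and treating touching endpoints as overlapping is an assumption.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Average 95% CI (start, end) for "vacation", taken from the results table.
vacation = {
    "ranking": (0.3203, 0.9364),
    "popularity": (-0.1768, 0.6624),
    "combination": (-0.3819, 0.3920),
}

for other in ("popularity", "combination"):
    print(other, intervals_overlap(vacation["ranking"], vacation[other]))
# prints "popularity True" then "combination True"
```

Both comparisons overlap, which is consistent with “vacation” not appearing in the list above even though its ranking p-value is significant.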

Discussion

• Disadvantages of popularity and combination methods

– “vacation” example

• Possible reasons for 95% CI overlap

– Randomness

– Disregard of structure

• Significance of queries with low p-values

– Search engine matching

• Future directions

– Different Simulation

– Other similarity metrics

– Random beginnings

References

• Fu, W., & Pirolli, P. (2007). SNIF-ACT: A cognitive model of user navigation on the World Wide Web. Human-Computer Interaction, 22(4), 355-412.

• Joachims, T., Granka, L., Pan, B., Hembrooke, H., & Gay, G. (2005). Accurately interpreting clickthrough data as implicit feedback. Proceedings of the ACM Conference on Research and Development in Information Retrieval (SIGIR).