Page 1
Mark Levene, An Introduction to Search Engines and Web Navigation © Pearson Education Limited 2005
Slide 3.1
Chapter 3 : The Problem of Web Navigation
• Users often get “lost in hyperspace” when
  – following links on web pages, or
  – jumping to and from search engine results.
• Machine learning can provide a sound basis for improving web interaction.
Page 2
Slide 3.2
Getting lost in hyperspace
Figure 3.1: The navigation problem
Page 3
Slide 3.3
Getting lost in hyperspace
Figure 3.2: Being lost in hyperspace
Page 4
Slide 3.4
The Naïve Bayes Classifier
Automatic classification of web pages can widen the scope and size of web directories
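As an illustration of how such automatic classification might work, the sketch below trains a multinomial Naïve Bayes classifier on a few labelled pages and assigns a new page to a directory category. The training texts, the category names and the function names `train_naive_bayes` and `classify` are all hypothetical, and the add-one (Laplace) smoothing is an assumption rather than something stated on the slide.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_pages):
    """Count pages per category and words per category.

    labelled_pages is a list of (text, category) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, category in labelled_pages:
        words = text.lower().split()
        class_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_pages = sum(class_counts.values())
    best_category, best_score = None, -math.inf
    for category, n_pages in class_counts.items():
        # Log prior: fraction of training pages in this category.
        score = math.log(n_pages / total_pages)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities (an assumption here).
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training data: page text reduced to a few keywords.
training_pages = [
    ("football league match goal", "Sports"),
    ("tennis match player tournament", "Sports"),
    ("python compiler software code", "Computers"),
    ("software search engine code", "Computers"),
]
model = train_naive_bayes(training_pages)
predicted = classify("search engine software", model)
print(predicted)
```

The classifier is called "naïve" because it treats the words on a page as independent given the category, which is why the per-word log probabilities are simply added.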
Page 5
Slide 3.5
Trails should be First-Class Objects
Figure 3.3: Example web site
Page 6
Slide 3.6
Trails should be First-Class Objects
Figure 3.4: Four trails within a web site
Page 7
Slide 3.7
Trails should be First-Class Objects
Figure 3.5: Query results for “mark research”
Page 8
Slide 3.8
Trails should be First-Class Objects
Figure 3.6: Relevant trail for “mark research”
Page 9
Slide 3.9
Markov chains
• Markov chains have been extensively studied by statisticians and have been applied in a wide variety of areas.
Page 10
Slide 3.10
The probabilities of following links
Figure 3.7: Markov chain for example web site
Page 11
Slide 3.11
The probabilities of following links
Figure 3.8: Two trails in the Markov chain
Page 12
Slide 3.12
The probabilities of following links
Figure 3.9: Probabilities of the four trails
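The figure itself is not reproduced here, so the following is only a sketch of how a trail's probability could be computed from a Markov chain over pages: multiply the probabilities of the links followed along the trail. The page names and transition probabilities are hypothetical, not the values from the figures, and the trail is assumed to start at its first page with probability 1.

```python
# Hypothetical Markov chain: for each page, the probability of following
# each outgoing link. Each row sums to 1.
transitions = {
    "home": {"research": 0.5, "teaching": 0.5},
    "research": {"papers": 0.75, "home": 0.25},
    "teaching": {"home": 1.0},
    "papers": {"research": 1.0},
}


def trail_probability(trail, transitions):
    """Product of link probabilities along a trail (a sequence of pages)."""
    probability = 1.0
    for source, target in zip(trail, trail[1:]):
        # A missing link contributes probability 0 to the whole trail.
        probability *= transitions.get(source, {}).get(target, 0.0)
    return probability


p = trail_probability(["home", "research", "papers"], transitions)
print(p)
```

Under this model longer trails can never be more probable than their prefixes, since every extra link multiplies in a factor of at most 1.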
Page 13
Slide 3.13
The relevance of links
Figure 3.10: Scoring web pages
Page 14
Slide 3.14
The relevance of links
Figure 3.11: Constructing a chain from scores
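One plausible way to construct a chain from page scores is sketched below: the probability of following a link is taken to be proportional to the relevance score of the page it leads to. This proportionality rule, the link structure, the scores and the function name `chain_from_scores` are all assumptions made for illustration; the figure may use a different construction.

```python
def chain_from_scores(links, scores):
    """Build transition probabilities from the scores of linked-to pages.

    links maps each page to the list of pages it links to;
    scores maps each page to a non-negative relevance score.
    """
    chain = {}
    for page, targets in links.items():
        total = sum(scores[target] for target in targets)
        if total == 0:
            # No relevant target (or no links): fall back to a uniform choice.
            chain[page] = {t: 1.0 / len(targets) for t in targets}
        else:
            # Normalise so the probabilities leaving each page sum to 1.
            chain[page] = {t: scores[t] / total for t in targets}
    return chain


# Hypothetical web site and query-dependent scores.
links = {
    "home": ["research", "teaching"],
    "research": ["papers", "home"],
    "teaching": ["home"],
    "papers": ["research"],
}
scores = {"home": 1, "research": 3, "teaching": 1, "papers": 3}

chain = chain_from_scores(links, scores)
print(chain["home"])
```

With these scores a surfer on the home page follows the link to the higher-scoring research page three times as often as the link to the teaching page.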
Page 15
Slide 3.15
Conflict Between Web Site Owner and Visitor
• The web site owner has objectives related to the business model of the site, e.g. selling products in an e-commerce site.
• The objectives of visitors are related to their information needs, e.g. gathering information in an e-commerce site.
• Web site owners would like to identify their visitors (e.g. via cookies), while visitors may prefer to remain anonymous.
Page 16
Slide 3.16
Conflict Between Semantics of Web Site and Business Model
• E.g. the objective of an e-commerce site is to convert visitors into customers.
• But to keep visitors satisfied a web site must provide solutions to users’ information needs.
• There must be a balance between web site navigability and the business objectives of the site.