LinkSelector: Select Hyperlinks for Web Portals Prof. Olivia Sheng Xiao Fang School of Accounting and Information Systems University of Utah

Dec 20, 2015

Page 1:

LinkSelector: Select Hyperlinks for Web Portals

Prof. Olivia Sheng
Xiao Fang

School of Accounting and Information Systems
University of Utah

Page 2:

Agenda

Introduction
Problem definition -- Hyperlink Selection
Solution -- LinkSelector
Evaluation
Collaboration

Page 3:

Introduction

Size of the WWW
  More than 3 billion web pages (Google.com, 2001)
  1 million pages added daily (Lawrence and Giles, 1999)

How to find information on the Web
  Using search engines (best coverage 38.3%) (Lawrence and Giles, 1999)
  Clicking through hyperlinks

Page 4:

Introduction

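[Figure: clicking through hyperlinks. Web Page 1 shows a Product Category List (A, B, C, D, E, F). Clicking on A leads to Web Page 2, Product Category A, with the Product List A1, A2, A3, A4, A5. Clicking on A2 leads to Web Page 3, Product A2, with its price (1000) and detailed description.]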

Page 5:

Introduction

Portal page: a specific web page that serves as the entrance to a website.

Portal page
  Important
  Mainly consists of hyperlinks

Page 6:

Introduction

A web portal is a personalized entrance to a website (e.g., My Yahoo!).

Default Web Portal/Portal Page

Most My Yahoo! users never customize their default web portals (Manber et al., 2000).

Page 7:

Introduction

Homepage of a Website/Portal Page

Page 8:

Introduction

Not all hyperlinks in a website can be placed in the portal page of the website.

Hyperlinks in a portal page are selected from a hyperlink pool, which is a set of hyperlinks pointing to top-level web pages (e.g., the hyperlinks in a site index page).

Page 9:

Portal page

Page 10:

Hyperlink pool

Page 11:

Portal page

Page 12:

Hyperlink pool

Page 13:

Introduction

Number of hyperlinks in a portal page: one to several dozen (e.g., 14 in My Yahoo!) (Neilson, 1999).

Number of hyperlinks in a hyperlink pool: one to several hundred (e.g., 102 in My Yahoo!).

Page 14:

Introduction

It is too computationally expensive to do an exhaustive search (e.g., $C_{102}^{14} \approx 5.95 \times 10^{16}$ ways to select 14 of the 102 hyperlinks in My Yahoo!).

Current practice of hyperlink selection: expert selection
  Based on domain experts’ experiences
  Subjective and slower to adapt
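The size of the exhaustive search space can be checked directly. The short sketch below counts the ways to pick 14 hyperlinks out of a pool of 102 (the My Yahoo! figures from the previous slide); the function name is only illustrative.

```python
import math


def count_candidate_portal_pages(pool_size: int, n_selected: int) -> int:
    """Number of distinct ways to choose n_selected hyperlinks from the pool."""
    return math.comb(pool_size, n_selected)


# My Yahoo! example: 14 portal hyperlinks chosen from a pool of 102.
candidates = count_candidate_portal_pages(102, 14)
print(f"{candidates:.2e}")  # on the order of 5.95E16
```

Even at a billion candidate portal pages evaluated per second, a space of this size would take well over a year to enumerate.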

Page 15:

Introduction

Our approach is based on

  Web access patterns extracted from a web log – objective (web surfers’ actual visiting behaviors)

  Web structural patterns extracted from an existing website – objective and dynamically adaptive

Page 16:

Hyperlink Selection

Metrics to measure the quality of a portal page
  Effectiveness
  Efficiency
  Usage

The quality of a portal page is measured using a web log.

A web log can be divided into sessions.

Page 17:

Hyperlink Selection

Effectiveness: the percentage of the user-sought top-level web pages that can be easily accessed from a portal page.

Efficiency: the usefulness of the hyperlinks placed in a portal page.

Usage: how often a portal page is visited.

Page 18:

Hyperlink Selection

Given
  the hyperlink pool of a website, HP,
  the number of hyperlinks to be placed in the portal page of the website, N, where N < |HP|;

Construct the portal page by selecting N hyperlinks from the hyperlink pool HP

Objective: optimize the effectiveness, efficiency and usage of the resulting portal page
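To make the problem statement concrete, here is a minimal sketch of the naive baseline: try every N-subset of HP and keep the best one. The scoring function is a stand-in chosen for this example (the share of sessions in which at least one sought hyperlink is on the portal page); it is an assumption, not the slides' definition of effectiveness, efficiency or usage, and the exhaustive loop is only feasible for tiny pools.

```python
from itertools import combinations


def fraction_of_sessions_served(portal, sessions):
    """Toy objective (an assumption, not the slides' metric): share of
    sessions in which at least one sought hyperlink is on the portal page."""
    portal = set(portal)
    served = sum(1 for sought in sessions if portal & set(sought))
    return served / len(sessions)


def select_by_exhaustive_search(hyperlink_pool, n, sessions):
    """Score every N-subset of the pool and return the best-scoring one."""
    if not 0 < n < len(hyperlink_pool):
        raise ValueError("N must satisfy 0 < N < |HP|")
    return max(
        combinations(sorted(hyperlink_pool), n),
        key=lambda portal: fraction_of_sessions_served(portal, sessions),
    )


# Made-up pool and sessions, for illustration only.
HP = {"L1", "L2", "L3", "L4"}
sessions = [{"L1", "L2"}, {"L2"}, {"L3"}, {"L2", "L4"}]
best = select_by_exhaustive_search(HP, 2, sessions)
print(best)  # ('L2', 'L3')
```

With 4 hyperlinks and N = 2 there are only 6 candidates; the count grows combinatorially with the pool size, which is why an exhaustive search does not scale to real hyperlink pools.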

Page 19:

LinkSelector

LinkSelector is based on relationships between hyperlinks in a hyperlink pool.

Structure Relationship

Access Relationship

Page 20:

LinkSelector

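Structure Relationship

[Figure: Web page 1 contains hyperlinks L1 and L3. L1 points to Web page 2, which contains L2, L4, L6 and L8. L3 points to Web page 3, which contains L5 and L7.]

Structure relationship: L1 → L2
  L1: initial hyperlink
  L2: terminal hyperlink

Other structure relationships: L1 → L4, L1 → L6, L1 → L8, L3 → L5, L3 → L7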
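The diagram can be read as follows: L1 and L3 sit on Web page 1; L1 points to Web page 2, which holds L2, L4, L6 and L8; L3 points to Web page 3, which holds L5 and L7. Under that reading, and assuming a structure relationship runs from an initial hyperlink to every hyperlink on the page it points to, the relationships can be derived from a parsed site as sketched below. The two dictionaries are a hypothetical representation of the parsed website.

```python
def structure_relationships(link_target, page_links):
    """Return (initial, terminal) pairs where the page that the initial
    hyperlink points to contains the terminal hyperlink."""
    pairs = set()
    for initial, target_page in link_target.items():
        for terminal in page_links.get(target_page, ()):
            if terminal != initial:
                pairs.add((initial, terminal))
    return pairs


# Hypothetical encoding of the three-page diagram.
link_target = {"L1": "Web page 2", "L3": "Web page 3"}
page_links = {
    "Web page 1": ["L1", "L3"],
    "Web page 2": ["L2", "L4", "L6", "L8"],
    "Web page 3": ["L5", "L7"],
}
relationships = structure_relationships(link_target, page_links)
print(sorted(relationships))
```

For this input the function yields the six relationships listed on the slide: L1 to L2, L4, L6 and L8, and L3 to L5 and L7.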

Page 21:

LinkSelector

Access Relationship

A hyperlink set with k hyperlinks is denoted as a k-HS, e.g., {L1, L2} is a 2-HS.

The support of a k-HS is the percentage of sessions in which the hyperlinks in the k-HS are accessed together.

Example: if L1 and L2 are accessed together in 20 out of a total of 100 sessions, then the support of the 2-HS {L1, L2} is 20%.
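The support computation can be sketched in a few lines, assuming each session is represented as the set of hyperlinks accessed in it (a representation chosen for this example). The data reproduces the 20-out-of-100 case.

```python
def support(hyperlink_set, sessions):
    """Fraction of sessions in which every hyperlink of the set is accessed."""
    wanted = set(hyperlink_set)
    hits = sum(1 for accessed in sessions if wanted <= set(accessed))
    return hits / len(sessions)


# 100 made-up sessions: L1 and L2 together in 20, L1 alone in 30, L3 alone in 50.
sessions = [{"L1", "L2"}] * 20 + [{"L1"}] * 30 + [{"L3"}] * 50
print(support({"L1", "L2"}, sessions))  # 0.2
```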

Page 22:

LinkSelector

Access Relationship

Definition: For a k-HS, where $k \ge 2$, there exists an access relationship among hyperlinks in the k-HS if and only if its support is greater than a pre-defined threshold.

Example: If threshold = 0.15 and the support of the 2-HS {L1, L2} is 0.2, then there exists an access relationship between hyperlinks L1 and L2, and the support of the relationship is 0.2.
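The definition translates into a simple filter: compute the support of each candidate hyperlink set with at least two members and keep those above the threshold. The sketch below enumerates candidates by brute force, which is only practical for small pools; the `max_k` cap and the session representation (one set of hyperlinks per session) are assumptions made for the example.

```python
from itertools import combinations


def access_relationships(sessions, threshold, max_k=3):
    """Map each k-HS (2 <= k <= max_k) whose support exceeds the threshold
    to its support. Brute-force enumeration, for illustration only."""
    sessions = [frozenset(s) for s in sessions]
    links = sorted(set().union(*sessions))
    found = {}
    for k in range(2, max_k + 1):
        for candidate in combinations(links, k):
            hits = sum(1 for s in sessions if s.issuperset(candidate))
            candidate_support = hits / len(sessions)
            if candidate_support > threshold:
                found[frozenset(candidate)] = candidate_support
    return found


# Same made-up data as the slide's example: {L1, L2} has support 0.2.
sessions = [{"L1", "L2"}] * 20 + [{"L1"}] * 30 + [{"L3"}] * 50
relationships = access_relationships(sessions, threshold=0.15)
print(relationships)  # {frozenset({'L1', 'L2'}): 0.2}
```

Note the strict comparison: a hyperlink set whose support equals the threshold does not qualify, matching "greater than" in the definition.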

Page 23:

LinkSelector

Discover structure relationships
  Parse the existing website

Discover access relationships
  Data preprocessing
    Web log cleaning
    Session identification
  Association rule mining (Agrawal and Srikant, 1994)
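The preprocessing steps can be sketched as follows. The slides do not say how cleaning or session identification was done, so everything specific here is an assumption: the `(visitor, timestamp, url)` record layout, the list of ignored file suffixes, and the 30-minute inactivity timeout are common heuristics used for illustration only.

```python
from datetime import datetime, timedelta

IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # assumed noise
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cutoff


def clean_log(records):
    """Drop requests for embedded resources; keep page requests."""
    return [r for r in records if not r[2].lower().endswith(IGNORED_SUFFIXES)]


def identify_sessions(records):
    """Group (visitor, timestamp, url) records into per-visitor sessions,
    starting a new session after SESSION_TIMEOUT of inactivity."""
    sessions = []
    last_seen = {}  # visitor -> (last timestamp, urls of the open session)
    for visitor, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        previous = last_seen.get(visitor)
        if previous is None or ts - previous[0] > SESSION_TIMEOUT:
            urls = []
            sessions.append(urls)
        else:
            urls = previous[1]
        urls.append(url)
        last_seen[visitor] = (ts, urls)
    return sessions


# Made-up log records; only the URL style mirrors the slides.
records = [
    ("10.0.0.1", datetime(2001, 9, 1, 9, 0), "/index.shtml"),
    ("10.0.0.1", datetime(2001, 9, 1, 9, 1), "/logo.gif"),
    ("10.0.0.1", datetime(2001, 9, 1, 9, 5), "/shared/athletics.shtml"),
    ("10.0.0.1", datetime(2001, 9, 1, 11, 0), "/index.shtml"),
    ("10.0.0.2", datetime(2001, 9, 1, 9, 2), "/index.shtml"),
]
sessions = identify_sessions(clean_log(records))
print(sessions)
```

The resulting sessions are the input that association rule mining would then operate on.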

Page 24:

LinkSelector

Page 25:

Evaluation

Summary of Data

Hyperlink pool: site-index page of the UA web site

110 links

Page 26:

Evaluation

Summary of Data

Web log: collected from the UA web server in Sep. 2001

10 M records (raw)
4.2 M records (clean)

Total: 344 K sessions
  Training data (23 days): 262 K sessions
  Testing data (7 days): 82 K sessions
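The figures on this slide are consistent with one another and with the average of 11.5k sessions per day quoted in the evaluation, as the short check below shows. The variable names are illustrative; the numbers are the rounded values from the slide.

```python
training_sessions = 262_000  # 23 days
testing_sessions = 82_000    # 7 days
training_days, testing_days = 23, 7

total_sessions = training_sessions + testing_sessions
sessions_per_day = total_sessions / (training_days + testing_days)
clean_share = 4.2 / 10  # clean records as a share of raw records

print(total_sessions)               # 344000
print(round(sessions_per_day))      # about 11.5k per day
print(f"{clean_share:.0%}")         # share of log records kept after cleaning
```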

Page 27:

Evaluation

Average improvement: 12.7%

Improvement decreases from 22.1% to 8.4%

Average number of sessions per day: 11.5k

[Figure: Effectiveness vs. Number of Selected Hyperlinks (N), N = 2 to 10, for LinkSelector, Expert Selection and Top-Link Selection]

Page 28:

Evaluation

Group II relationship: 0.2% of the training sessions

Group I relationship

/shared/sports-entertain.shtml /shared/athletics.shtml

Page 29:

Evaluation

Average improvement: 17.0%

Improvement decreases from 30.2% to 9.4%

605 more user-sought top-level web pages per day can be easily accessed from the portal page constructed using LinkSelector than from those constructed using the other two approaches.

[Figure: Usage vs. Number of Selected Hyperlinks (N), N = 2 to 10, for LinkSelector, Top-Link Selection and Expert Selection]

Page 30:

Evaluation

Average improvement: 16.9%

Improvement decreases from 30.2% to 9.3%

[Figure: Efficiency vs. Number of Selected Hyperlinks (N), N = 2 to 10, for LinkSelector, Top-Link Selection and Expert Selection]